Data, Algorithms, and Ethics: Calculating instead of deciding

Pre-Print Version
In: IPPI Backgrounder on Digital Transformation, 2022

»Under the modern conditions of data processing, the free development of personality presupposes the protection of the individual against the unlimited collection, storage, use and disclosure of his or her personal data«. This was formulated by the German Federal Constitutional Court (BVerfG) as early as 1983 – ten years before the World Wide Web, twenty years before Facebook, thirty years before the revelations of Edward Snowden and almost forty years before the infamous leak of Israel’s full voter register in 2020. This universal right to informational self-determination has lost none of its significance – on the contrary. To exercise this right, we as responsible and educated citizens of the information society have to handle technical terms like data and algorithms. We do that easily nowadays, but we should also understand them deeply in order to see the opportunities and assess the risks of their use. What are data, what are algorithms, and what is this artificial intelligence everyone is writing into research grant applications right now? The topic is so important because data now determines the weal and woe of everyone on this planet. More precisely, it is a privileged few who have the informational power to use this data to their own advantage – at the expense of others. Data and algorithms have left the innocent sphere of science and now determine human actions and decisions. They have thus become subjects of ethics, a subject examined in more depth elsewhere (Ullrich 2019). Prior knowledge of what data are and how they are created is not necessary for understanding this text; on the contrary, it is technical experts who have striking gaps in their knowledge of the socio-technical context. We want to follow the flow of data together, from the emergence of the term »data« through the development of data processing technology to today’s ethical considerations in a global data-driven business ecosystem.

The word »data« comes from the Latin for »that which is given«. One of the founders of modern science, Francis Bacon (1561-1626), called his written observations of nature »data«. That is what data is in a nutshell: the scientific mind observes or measures its surroundings and writes that down. Of course, in the computer age we strive for machine-readable data, coded in a way that a mindless machine can process. The first step therefore is to transform the observed phenomena into a discrete and automatically processable form. Let’s take for example a sung or played note. An acoustic wave is continuous, like a hand-drawn curved line on paper. So, if we want to record that continuous wave, we have two possibilities: we can do it in an analogue way (audio cassettes, magnetic tape, vinyl) or we can use a digital recording device. Inside the latter there is a so-called analogue-to-digital converter that works pretty much like us trying to create an image with ironing beads. Ironing beads are small cylinders of plastic that are placed on a grid to form a picture and then ironed so that the individual plastic pieces fuse into a great gift for, say, grandparents. Digitization is a similar process in that we also need a grid as an overlay for an analogue figure. Imagine a piece of squared paper from school on which you draw a wave. Now, in your mind or with the help of actual squared paper and a pen, color in all the boxes through which this wave passes. You can then save the discrete version of that wave digitally and in binary: write a zero for each white box and a one for each colored box. Of course, the digital data is only an approximation of the original wave, but you can use a finer grid if you want better precision. Don’t bother too much, though: we humans have brilliant perceptual processing and can recognize, or more precisely reconstruct, the original wave – which is what happens when we listen to compressed audio files.
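For readers who like to experiment, here is a minimal sketch of this »squared paper« digitization in Python (the grid size and the sine wave are my illustrative choices, not part of the original example):

import math

def rasterize_wave(width=40, height=9):
    """Sample one period of a sine wave and color in the grid boxes it passes through."""
    grid = [[0] * width for _ in range(height)]
    for x in range(width):
        t = 2 * math.pi * x / (width - 1)        # discretized time axis
        y = math.sin(t)                          # continuous amplitude in [-1, 1]
        row = round((1 - y) / 2 * (height - 1))  # quantize the amplitude to a grid row
        grid[row][x] = 1                         # a one for each colored box
    return grid

for row in rasterize_wave():
    print("".join("#" if box else "." for box in row))

A finer grid (larger width and height) gives a better approximation of the wave, exactly as described above.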

Now that we have data – what do we do with it? We can interpret them and write down our interpretation as facts, in the original Latin sense of »that which is made«. If we use scientific methods, we write down our intention, our observation setting, our data, and the methods we apply in order to come to conclusions, or scientific facts, which we also write down. Somewhat reluctantly, we then invite our fellow colleagues to take a critical look at it all. So, yes, there are »alternative facts«, but not all of them are scientifically derived. Francis Bacon insisted that there is an essential difference between data and facts. What both have in common is that they can be symbolically noted in pictures, writing, and numbers. The number, however, is something special: it can be used not only for counting, i.e. for recording, but also for calculating. Focusing only on the European Renaissance (a bias I share with many of my colleagues from Europe), it is striking to see that in modern times everything became a calculable number. Renaissance merchants discovered this added value of calculability when they used the then-new, then-modern Arabic numerals, including the incredible zero, in place of the Roman numerals that were common at the time. In addition to Arabic numerals, they used a simple and ingenious scheme: the table. They were then able to save what they had thus put into form and share this in-formation with their colleagues. Science (at that time less dependent on third-party funding and the spreadsheets that come with it) also embraced the table enthusiastically. The first line usually contains designations, such as measured quantities and units; the other lines contain symbols, noted in picture, writing and – above all – numbers.

Gottfried Wilhelm Leibniz (1646-1716) described the power of the table to his sovereign in flowery words: the busy mind of the ruling person could not possibly know how much woolen cloth is manufactured in which factories and what quantity is demanded by whom in the population. Since knowledge of this »connexion of things« is essential for good government, he proposed so-called »government tables« (»Staatstafeln«), which make complex facts comprehensible at a glance and thus governable and controllable (Leibniz 1685). Leibniz went even one step further: wouldn’t it be great, he mused, if we actually calculated the result of a debate instead of exchanging arguments? Calculemus – let us calculate! What was meant as a tongue-in-cheek suggestion resonated more and more with the enlightened spirit looking at a complex world. Data soothes its confusion by providing a toolbox for answering pressing questions about life, the universe, and all the rest. Will there be sun after the night? Will there be another spring after this winter? Collect data over the course of several years and a pattern emerges. Such data leads to hypotheses by presenting correlations that may show the way towards an unknown cause. It is a powerful tool, but like all tools it must be handled with expertise and care. Feeding data to a data mining or machine learning system yields a clue about where to look further – it is not a fact; it is another type of data: algorithmically generated data. Before we continue, we have to explain what an algorithm is.

Algorithms are codified prescriptions for action to solve a codifiable problem, prescriptions that can in principle be carried out by a human mind. Algorithms are thus primarily techniques of instrumental reason. The oldest example of an algorithm you all learned in school: Euclid’s algorithm for determining the greatest common divisor. Given two positive integers a and b such that a is greater than b, we calculate the greatest common divisor gcd(a, b) by calculating gcd(a−b, b), i.e. replacing the larger number by the difference of the two numbers, and repeating this until the two numbers are equal – that common value is the greatest common divisor.
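As a direct transcription of this prescription into code (a minimal sketch; the sample numbers are arbitrary):

def gcd(a, b):
    """Euclid's algorithm in its original form: repeated subtraction."""
    while a != b:
        if a > b:
            a = a - b   # replace the larger number by the difference
        else:
            b = b - a
    return a            # the two equal numbers are the greatest common divisor

print(gcd(544, 119))    # -> 17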

Algorithmic data processing provides results that are not yet directly visible in the data itself, like a common divisor hidden in a pair of numbers. An algorithm cannot tell the Minotaur where the exit is, but it can tell it how to find the exit, guaranteed. A possible algorithm would be: »Always feel your way along the right wall and follow every corridor that leads off to the right. If you meet a wall head-on, turn left so that that wall is now on your right. Keep going until you come to an exit«. »Where is the exit?« and »How do I find the exit?« are fundamentally different questions (cf. Ullrich 2019). The power of algorithms is nowadays most evident thanks to the availability of huge amounts of data (»Big Data«) and a new tool called Machine Learning.
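The wall-follower prescription can be written down just as mechanically as Euclid’s (a minimal sketch; the maze, the start position, and the step limit are my own illustrative choices):

# directions: 0 = north, 1 = east, 2 = south, 3 = west
DIRS = [(-1, 0), (0, 1), (1, 0), (0, -1)]

def follow_right_wall(maze, pos, heading, max_steps=1000):
    """Keep the right hand on the wall until an exit ('E') is reached."""
    for _ in range(max_steps):
        r, c = pos
        if maze[r][c] == "E":
            return pos
        for turn in (1, 0, 3, 2):  # prefer right, then straight, then left, then back
            h = (heading + turn) % 4
            dr, dc = DIRS[h]
            if maze[r + dr][c + dc] != "#":
                pos, heading = (r + dr, c + dc), h
                break
    return None  # step limit reached without finding an exit

maze = [
    "#########",
    "#   #   #",
    "# # # # #",
    "# #   # E",
    "#########",
]
print(follow_right_wall(maze, pos=(1, 1), heading=1))  # -> (3, 8)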

Machine Learning

»Definition: A computer program is said to learn [original emphasis] from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.« (Mitchell 1997, p.2)

This canonical definition of machine learning is found right on the second page of Tom Mitchell’s widely cited standard work on the subject. It is very easy to read, but it is not easy to understand what »experience« means when applied to a machine. I have been pondering for a long time with my colleagues at the Weizenbaum Institute about how we can explain machine learning to a broad audience without too many formulas, yet without over-simplifying or anthropomorphizing. During our research on this, we came across an article from 1961 that meets this challenge successfully.

MENACE was the name of a machine built by Donald Michie in the 1960s that could play Noughts and Crosses (also known as Tic-Tac-Toe, Three in a Line, or Tatetí) against a human player (Michie 1961). The Machine Educable Noughts And Crosses Engine was a machine learning system, but with a twist: the machine was made of matchboxes filled with colored beads. The setup was quite impressive: no fewer than 304 boxes are needed, one box for each possible configuration during the game. The operator of MENACE makes the first move by picking the matchbox labelled with an empty playing field, shaking it, and drawing a colored bead. Each color represents one of the nine possible positions an X or an O can take on the playing field. In the course of the first games the machine will likely lose, because the beads are drawn randomly. Enter the machine learning part: if MENACE loses, the operator removes all the drawn beads that led to defeat. If MENACE wins, the operator adds three beads of the drawn color to each picked box. That means the chance of losing again is reduced, while good moves are rewarded considerably. Trained long enough, MENACE will »learn« a winning strategy (by improving the chances of good moves) and therefore will »play« pretty well.
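MENACE fits in a few lines of code, too (a minimal sketch with simplifying assumptions: the opponent plays uniformly at random, boxes are created lazily with three beads per free square instead of enumerating all 304 configurations, and a losing box is never emptied completely):

import random

WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
        (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WINS:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

boxes = {}  # one »matchbox« per board state: bead counts per free square

def menace_move(board):
    state = "".join(board)
    box = boxes.setdefault(state, {i: 3 for i, c in enumerate(board) if c == " "})
    move = random.choices(list(box), weights=list(box.values()))[0]
    return state, move

def play_game():
    board, player, drawn = [" "] * 9, "X", []  # MENACE plays X and starts
    while winner(board) is None and " " in board:
        if player == "X":
            state, move = menace_move(board)
            drawn.append((state, move))
        else:
            move = random.choice([i for i, c in enumerate(board) if c == " "])
        board[move] = player
        player = "O" if player == "X" else "X"
    result = winner(board)
    for state, move in drawn:  # the learning step: adjust the bead counts
        if result == "X":
            boxes[state][move] += 3           # reward every drawn bead
        elif result == "O" and boxes[state][move] > 1:
            boxes[state][move] -= 1           # punish, but keep the box playable
    return result

random.seed(1)
for phase in ("first", "next"):
    wins = sum(play_game() == "X" for _ in range(1000))
    print(f"MENACE won {wins} of the {phase} 1000 games")

Run it and the win rate of the second batch should be noticeably higher than that of the first: the »experience« from Mitchell’s definition is nothing more than shifting bead counts.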

The interesting part is that no human player would ascribe any intention or other cognitive ability to a pile of boxes – in contrast to machine learning systems implemented in software on computer hardware. And even when no machine learning or other system labelled »artificial intelligence« is involved, a critical observer of the information society is still puzzled by the »enormously exaggerated attributions an even well-educated audience is capable of making, even strives to make, to a technology it does not understand.« (Weizenbaum 1976, p.7)

MENACE is not a metaphor for a machine learning system, it is a machine learning system. Of course, there are much more sophisticated machine learning systems out there, but all have one thing in common: they were designed and created by human beings with a purpose and are applied in a specific domain. Outside that domain, or used for any other purpose, these systems are useless (in non-critical contexts) or harmful (if used for sovereign tasks). So, MENACE is a good way of thinking about the use of decision-making systems: Do you want a pile of matchboxes to »decide« what to wear today? Probably, I mean, why not. But do you want a pile of matchboxes to »decide« whether you will give a job to someone? Definitely not.

Assisted Decisions

We all know situations in which we cannot make up our minds. In 19th-century France, people in that situation used to pick a daisy and pull off the petals one by one: »elle/il m’aime, un peu, beaucoup, passionnément, à la folie, plus que tout, pas du tout«. It is, of course, complete superstition that a flower can tell us anything about another person’s state of mind or about anything in the world. Nevertheless, the very process of dealing with the issue helps us to become clear about ourselves. »If you want to know something and cannot find it through meditation«, Heinrich von Kleist advised around 1805 (in an essay first published in 1878), »you, my dear, sensible friend, must talk about it with the next acquaintance who bumps into you. It need not be a sharp-thinking head, nor do I mean it as if you should ask him about it, no! Rather, you should tell him yourself first«. A flower is not known to be a sharp thinker, but it serves the purpose of the decision-making process. As a computer scientist, I am irritated that the computer is thought to have more thinking capabilities than the daisy. This is not the case; on the contrary, in a direct comparison the plant would win hands down. It is the lack of knowledge about the nature of daisies and computers that makes outsourced decisions so problematic. A daisy usually has 34 petals, so the person asking this oracle always concludes that they are loved beyond measure (»plus que tout«). Even with binary questions that are answered with yes and no, the answer is »yes« for most plants: snowdrops have 3 petals, buttercups 5, Calendula 13, asters 21 – only larkspur (Delphinium) gives us a »no«, but in that case the answer doesn’t matter anyway (it is very poisonous).
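The oracle’s arithmetic is easily checked (a minimal sketch; I assume the seven phrases are cycled through as listed, one per petal, and that a binary oracle alternates starting with »yes«):

PHRASES = ["elle/il m'aime", "un peu", "beaucoup", "passionnément",
           "à la folie", "plus que tout", "pas du tout"]

def daisy_oracle(petals):
    """The phrase spoken on the last petal is the oracle's answer."""
    return PHRASES[(petals - 1) % len(PHRASES)]

print(daisy_oracle(34))  # a daisy with 34 petals -> 'plus que tout'

for plant, petals in [("snowdrop", 3), ("buttercup", 5),
                      ("calendula", 13), ("aster", 21)]:
    print(plant, ["no", "yes"][petals % 2])  # odd petal counts answer »yes«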

Whether with plants or computers: the structure must be deeply understood. To be honest, I have no idea whether the petal counts of the plants described above really follow the mathematical Fibonacci sequence – but I know why that should be the case: the highest density of primordia on the stem is obtained when the angle between two successive primordia is 137.5°. This so-called golden angle results when the plant forms its structures exactly according to the Fibonacci sequence. The plant does not roll the dice, it optimizes its growth. (Empirically, the reader will perhaps find out at the festival of trees next Tu Bishvat whether that is the case.)
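The connection between the golden angle and the Fibonacci sequence can be computed in a few lines (a short sketch; the specific Fibonacci pair is my arbitrary choice):

import math

phi = (1 + math.sqrt(5)) / 2          # the golden ratio
print(round(360 * (1 - 1 / phi), 1))  # the golden angle -> 137.5

a, b = 13, 21  # successive Fibonacci numbers (calendula and aster petals)
print(round(360 * (1 - a / b), 1))    # -> 137.1, already close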

Let us leave the field of nature oracles and move towards its modern variant: computer-aided decisions.

Calculating instead of deciding – automated edition

Several centuries after Leibniz proposed his idea of calculating the result of an argument, software companies used a similar pitch to sell their products. Wouldn’t it be nice if we could calculate the best outcome of a decision we have to face? With such a universal decision-making system, we would not only be able to calculate which of us is right in the event of a disagreement, we could also calculate who is the best fit for our company; we would no longer decide, we would simply calculate. What’s more: even for things we don’t know, we simply calculate the right answer. In the more than three hundred years since Leibniz, this mindset has become firmly embedded in our scientific methods. Calculating takes decisions away from us; it is faster and cheaper. But there is a categorical difference between calculating and deciding, and ignoring it is a problem.

An algorithmic decision system is not only a technical but also, and foremost, a socio-technical system. It consists of at least two subsystems: the information technology (IT) system and the human beings. IT systems cannot decide, they can only calculate. The human beings operating the systems decide. Admittedly, they do so in the midst of a complex process in which data has been processed and classified. At the end of this process, in most cases, there is a number that can be read off an interface. But it is not a decision.

This number provides security, safety, control. It suggests a certainty that no IT system has. People with their tools discretize their continuous environment and note their observations in symbolic form (as described above in detail). This data now also enters the digital computer. The process of discretizing, and its problems, can be demonstrated very well using a famous painting. Perhaps you know Georges Seurat’s »Un dimanche après-midi à l’Île de la Grande Jatte«, a work of Pointillism. In this technique of painting, the transitions between the areas of color appear to us to be fluid even though they are the result of very precisely placed dots on a canvas. When I look at a painted tree and go to the edges: does the dot right under my index finger still belong to the tree or already to the sky? Our brain draws a symbolic line where there is none in the physical world. That is one of the many messages of Pointillism: there are no lines in nature, we draw them.

We also draw such lines in software systems that are supposed to classify something. Given a point/dot/pixel, the software »decides« whether it belongs to tree or sky. Maybe it measures the frequency of the color and spits out a number, 490 nanometers for instance. Is that still blue? Or already green? Instead of the result »blue« or »green«, the software should print out a confidence interval. Could be blue, could be green, you decide! This points to another basic assumption that is rarely talked about: the IT system does not classify tree/sky, but green/blue, or 480–490 nanometers. A tree at sunset or in autumn is nevertheless recognized by humans as a tree, even without looking, only feeling, because we do not pay attention (only) to the colors of the leaves but recognize its arboreality (another homework assignment for Tu Bishvat: hug a tree). This ability is innate to us, but we cannot fully explain it, let alone reproduce it.
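Returning to the 490-nanometer example: a classifier that reports its uncertainty instead of a hard label could look like this minimal sketch (the crossover wavelength and the transition width are invented illustrative values, not calibrated color science):

import math

def p_green(wavelength_nm, crossover=490.0, width=10.0):
    """Probability that a measured wavelength is green rather than blue."""
    return 1 / (1 + math.exp(-(wavelength_nm - crossover) / width))

for nm in (470, 480, 490, 500, 510):
    p = p_green(nm)
    print(f"{nm} nm: green with p = {p:.2f}, blue with p = {1 - p:.2f}")  # at 490 nm: you decide!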

If we give up this absolute idea of true/false and only work with probabilities and statistics, classification works surprisingly well in most cases. Is that a tree? It’s big, brown on the bottom, green on top, and all the people and bots on Twitter say it’s a tree – then it probably is a tree. This approach is sufficient for the reality of our lives, and this is precisely the strength of heuristic as opposed to algorithmic computer science systems: They can deal with little data, with a lot of data, with accurate data, with inaccurate or even contradictory data – and deliver a result that is approximately true.
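As a toy illustration of such a heuristic (the features, weights, and evidence values are all invented for this sketch; real systems learn them from data):

def tree_probability(evidence):
    """Combine weak, possibly contradictory evidence into one score."""
    weights = {"is_big": 1.0, "brown_bottom": 1.0,
               "green_top": 1.0, "twitter_says_tree": 0.5}
    score = sum(w for key, w in weights.items() if evidence.get(key))
    return score / sum(weights.values())

# Contradictory data (no green top in autumn) still yields a usable answer:
print(round(tree_probability({"is_big": True, "brown_bottom": True,
                              "green_top": False, "twitter_says_tree": True}), 2))  # -> 0.71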

We are currently experiencing nothing less than a heuristic revolution. We no longer compute using algorithms, we train a system that works heuristically. Old and busted: Truth. New hotness: Sufficiency.

Last year, the German federal government commissioned an expert opinion for the Third Report on Gender Equality, which was to examine the functioning of recruiting systems in more detail. But the products on the market are proprietary and not open source, and a detailed investigation regarding gender equality and non-discrimination has yet to be conducted. To understand how automated decision systems work, however, it suffices to look at a related class of heuristic systems, biometrics, and examine it in more detail.

Biometrics, i.e. the measurement of life, is an instrument of statistics. Mortality tables, the age structure of the population, and average life expectancy are interesting to state leaders when it comes to taxes, participation, and distributive justice. In one of the first scientific works on biometry, the Swiss natural scientist Christoph Bernoulli describes how a table of life expectancy should be structured and what advantages arise from this clear connexion of things, before pointing out in a somewhat hidden insertion that it was life insurance institutions that made the collection of this data »a necessity« (Bernoulli 1841, pp.398-399). Once the transdisciplinary researcher of cultural techniques has picked up this techno-historical trail, she discovers the true motivations behind biometric systems everywhere (following Knaut 2017). Since Francis Galton, dactyloscopy has not only served law enforcement purposes but has also – like all other biometric surveying systems to this day, wittingly or unwittingly – fed racist thinking and practices.

Biometric recognition systems are used for verification and identification and are usually marketed as access systems (verification) or, more generally, as official security technology (identification). The introduction of biometric passports and ID cards in Germany (in 2003) and Israel (in 2013) was also presented from this point of view. In background discussions and when asked directly, however, it is clear to all involved that this is also business promotion, as the corresponding reading devices have to be licensed. However, the data-based business models of biometric recognition systems have a catch: they technically fall under the General Data Protection Regulation (Article 9 (1) GDPR), which makes their exploitation challenging. Biometric data is also the most intimate and at the same time most visible data: unless there is a pandemic, we are constantly showing our face. And even in Corona times, our gait in a crowd of people can be quite unique. Finally, there are our fingerprints, which are emblematic of identity, although technicians and scientists have been pointing out for decades that it is less about identity and more about identity constructions and attributions.

Attribution is the exercise of informational power. A privileged few can impose an assessment on many data subjects. This can be shown most vividly in the debate about parole board decisions. In most countries, prisons serve not only to protect the general public from convicted criminals but also to reintegrate these same criminals into society after they have served their sentences. In the case of suspended sentences, a review is carried out during the time to be served to determine whether a convicted person can be released on parole. In Germany, the decision on this is made by a court, which has to make a prognosis as to whether it can be assumed that the offender will not commit any more crimes in the future even without serving the full prison sentence. The suspended sentence is based on the convicted person’s constitutional right to resocialization. In U.S. law, the decision is made by a parole board within the state government, which considers the conduct in prison, the prospect of rehabilitation, and any continuing danger posed by the convict. Parole board members must meet high standards to adequately address both the fundamental rights of individuals and the rights of the community. They are usually judges, psychologists, or criminologists and are also specially trained in moral issues. It is an immensely demanding mental activity, and any help is therefore gladly accepted. As with Leibniz, experts are asking themselves: wouldn’t it be wonderful if there were a number that gave the risk score of a potential new crime committed by the person under consideration? Why yes, there is, and it is calculated as follows:

Violent Recidivism Risk Score = (age × −w) + (age-at-first-arrest × −w) + (history-of-violence × w) + (vocation-education × w) + (history-of-noncompliance × w)
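Mechanically, this is nothing more than a weighted sum, as this minimal sketch shows (the weights and input values here are invented placeholders – the real weights are trade secrets, as discussed below):

def violent_recidivism_score(items, weights):
    """A linear score: every item is multiplied by its weight and summed."""
    return sum(items[k] * weights[k] for k in items)

# All inputs describe the past; none of them can observe future intentions.
items = {"age": 30, "age_at_first_arrest": 18, "history_of_violence": 2,
         "vocation_education": 1, "history_of_noncompliance": 3}
weights = {"age": -0.1, "age_at_first_arrest": -0.2, "history_of_violence": 0.5,
           "vocation_education": 0.3, "history_of_noncompliance": 0.4}  # hypothetical
print(round(violent_recidivism_score(items, weights), 2))  # -> -4.1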

The Practitioner’s Guide to the decision support tool named »Correctional Offender Management Profiling for Alternative Sanctions« (COMPAS) from 2015 explains that each item is multiplied by a weight (w) which is determined by the »strength of the item’s relationship to person offense recidivism that we observed in our study data« (Northpointe 2015, p.29). Even if you do not understand the underlying formula (which I doubt, because it really is as simple as it looks), you see immediately that the whole score depends on history: it depends on the past, and it depends on a mysterious correction factor called »weight« that is obtained in an obscure process protected by trade secrets. A parole board decision is a prognosis – yet here it is achieved by means of the past, so it should rather be called a »postgnosis«. One of its biggest problems is that such systems are necessarily biased, simply because their underlying data are biased. Data are biased in a fundamental way: you can only measure what is measurable. You cannot measure the future intentions of criminal minds (or any mind, for that matter), but maybe there is a proxy datum that correlates very well with the unobservable. In statistics, there is the famous example of the positive correlation between the appearance of storks and the number of new-born children. What’s more, people in Europe have all heard the Slavic myth that storks bring children into the world. Ultimately, we treat phenomena that correlate with existing data and fit the narrative as if there were a causal relationship. When we have heard often enough that women and children are rescued first in a shipwreck and then look at the data of the Titanic disaster, we see the maxim at work. Admittedly, with a different narrative, the data can be interpreted even better: it was people on the upper decks who were saved in greater numbers compared to people from the lower decks, for the simple reason that there were many more lifeboats in the vicinity of the rich and powerful passengers. So, it is more accurate to say that the rich and powerful were rescued first – but that is an unpopular narrative.

Data sets in general are created to benefit groups that already have economic power over other groups, for the simple reason that good data are expensive. You have to invest in data analysis and processing, and therefore you expect a return on that investment: data as a means of payment. There are incentives other than money, for example political and informational power. The main incentive of Luftdaten.info, to pick a well-known civic tech example from my country, was to collect environmental data to shape the public discourse about fine dust in Stuttgart (Germany). Before this project started, there were no data available to the public regarding this important environmental issue. That is no coincidence: the car manufacturers in Stuttgart have a combined annual revenue of 200 billion EUR – compare that to the .000002 billion EUR budget of Luftdaten. Nevertheless, we revealed in a recent study how this civic IoT project, though organized within limits of technical equipment, resources, and academic knowledge, contributed in multiple ways to more sustainable cities and communities (Hamm et al. 2021). That brings us to my final point.

Data Literacy and Empowerment

Data serve the control of people. Initially, this is only to be understood as genitivus subjectivus: people use data to measure and therefore control their environment. Recently, however, the meaning in which people are affected by data (genitivus objectivus) has also been debated: data is used to control people. I would like to conclude with a positive and constructive note regarding the use of data. Data are the key to knowledge; they are the basis of the empirical sciences and offer a view of the world not only to quantitative but also to qualitative researchers. Data are not facts, as we learnt from Francis Bacon, and we should always keep this in mind. Data can generate, confirm, or question facts in the scientific working mind. Data can also obscure facts. Data science is slowly maturing into a basic cultural technique of the responsible member of the networked society. Data scientist Hans Rosling demonstrated to a large audience (and, thanks to audiovisual data, also on YouTube, Vimeo, and co.) how data can be used to bridge cultural differences, break down prejudices, and ensure common understanding. In a very humorous and revealing way, Rosling holds a mirror up to us: we rely on obsolete data, wrong numbers, and biased facts that we learned in school and that are now reproduced on all media channels. Our conceptions of countries in the global South, for example, are closer to myth than to the present (Rosling 2006). Demystifying false and even harmful assumptions with data was the main drive of Rosling the humanist. But for this to happen, the data must be available. This depends on instruments and tools, but it also depends on culture and customs. It is not only because of a lack of tools that Caroline Criado Perez (2019) observed a Gender Data Gap, but also because of the data culture of the majority society. Data is collected for a purpose, and the more effort is put into data collection, the more likely people are to expect dividends.

Data are part of both the Old World of automated data processing and the Brave New World of heuristic data techniques such as Machine Learning, Big Data and Artificial Intelligence. It is therefore not surprising that the development of data literacy is repeatedly insisted on, without, of course, saying what exactly this should look like. Demystification also includes a sobering look at current data processing practices. The majority of people simply have no desire to deal with data, and in a society based on the division of labor, we should accept this and hold computer scientists and companies with data-based business models more accountable, for example by demanding that data-based business models not be subject to any secrecy obligation or that the data-processing systems be precisely labeled.

We need data literacy for all people, but that is easier said than done. Now, in the second age of the Turing Galaxy, data literacy is more important than ever to maintain informational sovereignty. What was true 30 years ago has become even more pressing: there is no such thing as harmless data. However, when we discover inequities or biases in data-driven algorithmic systems that support decision making, that can be a very good thing, given the appropriate transparency: after all, the abuse of informational power becomes visible and can be addressed.

Everything starts with the will to understand, in order to be able to use the power of data for the benefit of the general public and the common good. As international policy experts, researchers, and scholars, we are aware of the special responsibility that comes with technological and scientific progress, and we are therefore committed to regaining informational sovereignty as a networked society. This text is intended as a contribution to that end.

Acknowledgements

Like all my texts published in recent years, this one was written after inspiring discussions with colleagues and friends from the Weizenbaum Institute for the Networked Society in Berlin and the German Informatics Society. Some of the basic ideas have already been published in German, and I keep reusing favorite phrases and examples; plagiarism hunters with appropriate translation software will find what they are looking for on the author’s homepage cytizen.de. The example with the petals, however, is new, and I would be glad to receive feedback on whether readers have checked it empirically with first-hand data!

Literature

Bernoulli 1841: Bernoulli, Christoph (1841): Handbuch der Populationistik. Stettin.

BVerfG 1983: BVerfG (Bundesverfassungsgericht) (1983): Volkszählungsurteil vom 15.12.1983 – 1 BvR 209/83, 1 BvR 269/83, 1 BvR 362/83, 1 BvR 420/83, 1 BvR 440/83, 1 BvR 484/83.

Hamm et al. 2021: Hamm, Andrea; Shibuya, Yuya; Ullrich, Stefan; Cerratto Pargman, Teresa (2021): What Makes Civic Tech Initiatives To Last Over Time? Dissecting Two Global Cases. In: Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’21). Yokohama, Japan: ACM. https://doi.org/10.1145/3411764.3445667

Kleist 1878: Kleist, Heinrich von (1878): Über die allmählige Verfertigung der Gedanken beim Reden. Written ca. 1805, first published 1878; edition used: Dielmann, 1999 (with Vera F. Birkenbihl).

Knaut 2017: Knaut, Andrea (2017): Fehler von Fingerabdruckerkennungssystemen im Kontext. Dissertation an der Humboldt-Universität zu Berlin. https://edoc.hu-berlin.de/handle/18452/19001 (1.4.2021)

Leibniz 1685: Leibniz, Gottfried Wilhelm (1685): Entwurf gewisser Staatstafeln. In: Politische Schriften I, edited by Hans Heinz Holz, Frankfurt am Main, 1966, S. 80–89.

Michie 1961: Michie, Donald (1961): Trial and Error. In: Barnett, S.A., & McLaren, A.: Science Survey, Part 2, 129–145. Harmondsworth.

Mitchell 1997: Mitchell, Tom M. (1997): Machine Learning. McGraw-Hill.

Northpointe 2015: Northpointe Inc. (2015): A Practitioner’s Guide to COMPAS Core. http://www.northpointeinc.com/downloads/compas/Practitioners-Guide-COMPAS-Core-_031915.pdf (15.11.2021)

Perez 2019: Criado Perez, Caroline (2019): Invisible Women: Exposing Data Bias in a World Designed for Men. Random House.

Rosling 2006: Rosling, Hans (2006): Debunking myths about the “third world”. TED Talk, Monterey. https://www.gapminder.org/videos/hans-rosling-ted-2006-debunking-myths-about-the-third-world/

Ullrich 2019: Ullrich, Stefan (2019): Algorithmen, Daten und Ethik. In: Bendel, Oliver (Ed.), Handbuch Maschinenethik. Wiesbaden. S. 119-144.

Weizenbaum 1976: Weizenbaum, Joseph (1976): Computer Power and Human Reason: From Judgment to Calculation. W. H. Freeman.