“We need to engage more deliberately, and be part of influencing how these technologies develop.”
When we buy something on Amazon or watch something on Netflix, we think it’s our own choice. Well, it turns out that algorithms influence one-third of our decisions on Amazon and more than 80% on Netflix. What’s more, algorithms have their own biases. They can even go rogue.
In his recent book, A Human’s Guide to Machine Intelligence: How Algorithms Are Shaping Our Lives and How We Can Stay in Control, Kartik Hosanagar, a professor of operations, information and decisions at Wharton, discusses how algorithmic decisions can go wrong, and how we can control the way technology impacts decisions that are made for us or about us.
In a recent conversation with Knowledge@Wharton, Hosanagar notes that we must stay engaged, and be part of the process of developing these new technologies. An edited version of the conversation appears below.
Knowledge@Wharton: There’s a growing buzz about artificial intelligence (AI) and machine learning. In all the conversations that are going on, what are some points that are being overlooked?
Kartik Hosanagar: Yes, there’s a lot of buzz around AI and machine learning, which is a sub-field of AI. The conversation tends to either glorify the technology or, in many instances, engage in fear-mongering about it. I don’t think the conversation has focused on the solution, i.e., how we are going to work with AI, especially in the context of making decisions. My book is focused on making decisions through intelligent algorithms.
One of the core questions when it comes to AI is: Are we going to use AI to make decisions? If so, are we going to use it to support [human] decision-making? Are we going to have the AI make decisions autonomously? If so, what can go wrong? What can go well? And how do we manage this? We know AI has a lot of potential, but I think there will be some growing pains on our way there. How can algorithmic decisions go wrong? How do we control how technology impacts the decisions that are made for us or about us?
Knowledge@Wharton: The book begins with some striking examples about chatbots and how they interact with humans. Could you use those to talk about how human beings interact with algorithms?
Hosanagar: I began the book with a description of Microsoft’s experience with a chatbot called “Xiaobing” or “Xiaoice.” This was a chatbot created in the avatar of a teenage girl. It’s meant to engage in fun, playful conversations with young adults and teenagers. This chatbot has about 40 million followers in China, and reports say that roughly a quarter of those followers have said, “I love you” to Xiaoice. That’s the kind of affection and following it has.
Inspired by the success of Xiaoice in China, Microsoft decided to test a similar chatbot in the U.S. They created a chatbot in English, which would engage in fun, playful conversations with young adults and teenagers. They launched it on Twitter under the name “Tay,” but this chatbot’s experience was very different and short-lived. Within an hour of launching, the chatbot turned sexist, racist, and fascist. It tweeted very offensively. It said things like, “Hitler was right.” Microsoft shut it down within 24 hours. Later that year, MIT’s Technology Review rated Microsoft’s Tay as the “Worst Technology of the Year.”
That incident made me question how two similar chatbots or pieces of AI built by the same company could produce such different results. What does that mean for us in terms of using these systems, these algorithms, for a lot of our decisions in our personal and professional lives?
Knowledge@Wharton: Why did the experiences differ so dramatically? Is there anything that can be done about that?
Hosanagar: One of the insights that I got as I was writing this book, trying to explain the differences in behavior of these two chatbots, was from human psychology. Psychologists describe human behavior in terms of nature and nurture. Our nature is our genetic core, and nurture is our environment. Psychologists attribute problematic issues like alcoholism, for instance, partly to nature and partly to nurture. I realized algorithms, too, have nature and nurture. Nature, for algorithms, is not a genetic core, but the code that the engineer actually writes. That’s the logic of the algorithm. Nurture is the data from which the algorithm learns.
As we move towards machine learning, we’re heading away from a world where engineers program the end-to-end logic of an algorithm, where they specify what happens in any given situation. It used to be all about nature, because the programmer gave very minute specifications telling the algorithm how to work. But as we move towards machine learning, we’re telling algorithms: “Here’s data. Learn from it.” So nature starts to become less important, and nurture starts to dominate.
If you look at what happened between Tay and Xiaoice, in some ways the difference is in terms of their training data. Xiaoice was created to mimic how people converse, while Tay picked up how people were talking to it, and then reflected that. That’s the nurture aspect, but part of it was nature, as well. The code could have specified certain rules like: “Do not say the following kinds of things,” or “Do not get into discussions of these topics,” and so on. So it’s a bit of both nature and nurture, and I think that’s what, in general, rogue algorithmic behavior comes down to.
Knowledge@Wharton: There was a time when algorithmic decision-making seemed to be about Amazon suggesting what books to read, or Netflix recommending which movies you should watch. But algorithmic decision-making has become a lot more complex. Could you give some examples of this?
Hosanagar: Yes, algorithms pervade our lives. Sometimes we see it—like Amazon’s recommendations—and sometimes we don’t. But they have a huge impact on the decisions we make. On Amazon, for example, more than a third of the choices that we make are influenced by algorithmic recommendations like: “People who bought this also bought this. People who viewed this eventually bought that.” On Netflix, they drive more than 80% of the viewing activity. Algorithmic recommendations also influence decisions such as whom we date and marry. In apps like Tinder, algorithms create most of the matches.
Algorithms also drive decisions in the workplace. For example, when you apply for a loan, algorithms increasingly make mortgage approval decisions. If you apply for a job, résumé-screening algorithms decide whom to invite for an interview. In U.S. courtrooms, there are algorithms that predict the likelihood that the defendant will re-offend, so that judges can make sentencing decisions. In medicine, we’re moving towards personalized medicine, in which two people with the same symptoms might not get the same treatment. It might be customized based on their DNA profile. Algorithms guide doctors on those sometimes life-or-death decisions.
We’re moving to a point where the algorithms don’t merely offer decision support—they can function autonomously, as well. Driverless cars are a great example of that.
Knowledge@Wharton: You write that design choices can have unintended consequences. Could you explain that?
Hosanagar: By unintended consequences, I’m referring to situations where perhaps you optimize some aspect of a decision, but then something else goes wrong. For example, when Facebook was manually curating its trending stories through human editors, it was accused of having a left-leaning bias: these editors supposedly were choosing left-leaning stories and curating those more often. So Facebook used an algorithm for this curation and then tested it for political bias. It did not have any political bias, but it had another problem that they hadn’t explicitly tested for: fake news. The algorithm curated fake news stories and circulated them. That’s an example of unintended consequences, and algorithm design can drive that.
I’ve done a lot of work on recommendation systems and how they influence the products and media we consume. I’ve specifically studied two kinds of recommendation algorithms—one kind is like what Amazon does: “People who bought this also bought this.” It’s based on social curation. The other kind of algorithm attempts to understand at a deeper level—it tries to find items that are similar to the user’s interests. An example of that would be Pandora. Its music recommendations are not [based on social curation]. Pandora has very detailed information—more than 150 musical attributes for each song. For instance, how rhythmic is the song? How much instrumentation is there in the music? And every time you say you like a song or you don’t like it, they look at the musical qualities of the song, and then they adjust their recommendations based on other songs which have attributes similar to what you have liked or not liked.
I looked at both these designs, and the conventional wisdom was that all these algorithms help in pushing niche, novel items or indie songs that nobody has heard of. But what I found was that these designs were very different—the algorithm that looks at what others are consuming has a popularity bias. It’s trying to recommend stuff that others are consuming, so it tends to lean towards popular items. It cannot truly recommend the hidden gems.
But an algorithm like Pandora’s doesn’t have popularity as a basis for recommendation, so it tends to do better. That’s why companies like Spotify and Netflix and many others have changed the design of their algorithms. They’ve combined the two approaches. They’ve combined the social appeal of a system that looks at what others are consuming, and the ability of the other design to bring hidden gems to the surface.
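As a rough illustration of the two designs described above, the following Python sketch contrasts a co-purchase recommender with an attribute-similarity recommender. It is a minimal sketch, not the method used by Amazon, Pandora, or any other company named here. The item names, the purchase baskets, the two attribute scores per item, and the function names are all hypothetical and chosen only to make the popularity bias visible.

```python
from collections import Counter
from math import sqrt

# Hypothetical purchase baskets (made-up items, for illustration only).
baskets = [
    {"hit_album", "pop_single"},
    {"hit_album", "pop_single"},
    {"hit_album", "indie_gem"},
    {"hit_album", "pop_single", "dance_track"},
]

# Hypothetical attribute scores per item: (rhythm, instrumentation), each in [0, 1].
attributes = {
    "hit_album": (0.9, 0.2),
    "pop_single": (0.2, 0.9),
    "indie_gem": (0.88, 0.22),
    "dance_track": (0.5, 0.5),
}


def co_purchase_recommend(item, baskets):
    """Social curation: rank other items by how often they share a basket with `item`."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            for other in basket - {item}:
                counts[other] += 1
    return [name for name, _ in counts.most_common()]


def cosine(u, v):
    """Cosine similarity between two attribute vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def attribute_recommend(item, attributes):
    """Attribute-based: rank other items by similarity of their attributes to `item`."""
    target = attributes[item]
    others = [name for name in attributes if name != item]
    return sorted(others, key=lambda name: cosine(target, attributes[name]), reverse=True)


print(co_purchase_recommend("hit_album", baskets)[0])   # most co-purchased item
print(attribute_recommend("hit_album", attributes)[0])  # most similar item by attributes
```

With this toy data, the co-purchase ranking puts the widely bought `pop_single` first even though its attributes differ sharply from the seed item, while the attribute ranking puts the rarely bought `indie_gem` first. A combined design could blend the two scores, which is one plausible reading of the hybrid approach mentioned above.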
Knowledge@Wharton: Let’s go back to the point you brought up earlier about algorithms going rogue. Why does that happen and what can be done about it?
Hosanagar: Let me point to a couple of examples of algorithms going rogue, and then we’ll talk about why this happens. I mentioned algorithms are used in courtrooms in the U.S., in the criminal justice system. In 2016, there was a report by ProPublica that looked at algorithms used in courtrooms, and they found that these algorithms have a race bias: they were twice as likely to falsely predict future criminality in a black defendant as in a white defendant.
“We think we see the recommendations, and then we do what we want. But the algorithms are actually nudging us in interesting ways.”
Late last year, Reuters carried a story about Amazon trying to use algorithms to screen job applications. They found that the algorithms tended to have a gender bias, tending to reject female applicants more often, even when the qualifications were similar. [Editor’s note: Amazon later discontinued the use of this recruiting tool.] There are probably many other companies that are using algorithms to screen résumés, and they might be prone to race bias, gender bias, and so on.
In terms of why algorithms go rogue, there are a couple of reasons. One is that we have moved away from the old, traditional algorithms where the programmer wrote up the algorithm end-to-end, and we have moved towards machine learning. In this process, we have created algorithms that are more resilient and perform much better, but they’re prone to biases that exist in the data. [Say] you tell a résumé-screening algorithm: “Here’s data on all the people who applied to our job, and here are the people we actually hired, and here are the people we promoted. Now figure out whom to invite for job interviews based on this data.” The algorithm will observe that in the past, you were rejecting more female applications, or you were not promoting women in the workplace, and it will pick up that behavior.
The other piece is that engineers tend to focus narrowly on one or two metrics. With a résumé-screening application, you will tend to measure the accuracy of your model, and if it’s highly accurate, you’ll roll it out. But you don’t necessarily look at fairness and bias.
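To make the gap between accuracy and fairness concrete, here is a minimal Python sketch of an audit that reports overall accuracy alongside the false positive rate for each group. It is an assumption-laden illustration, not the analysis performed by ProPublica or any company mentioned here. The records, the group labels "A" and "B", and the function names are all hypothetical.

```python
# Hypothetical audit records: (group, actually_reoffended, predicted_high_risk).
records = [
    ("A", False, True), ("A", False, True), ("A", False, False), ("A", False, False),
    ("A", True, True), ("A", True, True),
    ("B", False, True), ("B", False, False), ("B", False, False), ("B", False, False),
    ("B", True, True), ("B", True, True),
]


def accuracy(rows):
    """Share of rows where the prediction matches the actual outcome."""
    return sum(actual == predicted for _, actual, predicted in rows) / len(rows)


def false_positive_rate(rows):
    """Share of actual negatives that were wrongly flagged as high risk.

    Assumes the rows contain at least one actual negative.
    """
    flags_for_negatives = [predicted for _, actual, predicted in rows if not actual]
    return sum(flags_for_negatives) / len(flags_for_negatives)


def audit(records):
    """Report overall accuracy plus per-group accuracy and false positive rate."""
    report = {"overall_accuracy": accuracy(records)}
    for group in sorted({g for g, _, _ in records}):
        rows = [r for r in records if r[0] == group]
        report[group] = {
            "accuracy": accuracy(rows),
            "false_positive_rate": false_positive_rate(rows),
        }
    return report


report = audit(records)
print(report)
```

In this made-up data the model is right three times out of four overall, yet people in group A who did not re-offend are wrongly flagged twice as often as those in group B. A team that checked only the single accuracy number would not see that disparity.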
Knowledge@Wharton: You tell a fascinating story in the book about a patient who gets diagnosed with tapanuli fever. What implications does that have for how far algorithms can be trusted?
Hosanagar: The story is that of a patient walking into a doctor’s office feeling fine. The patient and doctor joke around for a while, but the doctor eventually picks up the pathology report and suddenly looks very serious. He informs the patient: “I’m sorry to let you know that you have tapanuli fever.” The patient hasn’t heard of tapanuli fever, so he asks what exactly it is. The doctor says it’s a very rare disease, and it’s known to be fatal. He suggests that if the patient takes a particular tablet, it will reduce the chance that he will have any problems. The doctor says: “Here, take this tablet three times a day, and go about your life.”
If you were the patient, would you feel comfortable in that situation? Here’s a disease you know nothing about, and a solution you know nothing about. If an algorithm were to make this recommendation—that you have this rare disease, and you should take this medication—without any additional information, would you [take the medication]?
Tapanuli fever is not a real disease. It appears in one of the Sherlock Holmes stories, and even in the original story, it turns out that the person who is supposed to have tapanuli fever doesn’t actually have it. But [the story] brings up the question of transparency: Are we willing to trust decisions when we don’t have information about why they were made?
Sometimes we seek more transparency from algorithms than from humans, but lots of companies are imposing algorithmic decisions on us without [providing] any information about why these decisions are being made. Research shows that we’re not fine with that. For example, a PhD student at Stanford looked at an algorithm that would compute grades for students, [sometimes providing] just their score, [and other times providing] their score with an explanation. As expected, when the students had an explanation, they trusted it more.
So why is it that in the real world, there are so many algorithms making decisions for us—or about us—and we have no transparency about those decisions? I advocate that we need a certain level of transparency with regard to, say, what kinds of data were used to make the decision. For example, if you applied for a loan, and the loan was rejected, we would like to know why that was the case. If you applied for a job, and you were rejected, it would be helpful to know that the algorithm not only evaluated what you submitted as part of your job application, but also looked at your social media posts. Transparency regarding what data was considered, what the key factors were that drove a decision, is important.
Knowledge@Wharton: You recommend an Algorithmic Bill of Rights. What exactly is that, and why is it necessary?
Hosanagar: The Algorithmic Bill of Rights is a concept that I borrowed from the Bill of Rights in the U.S. Constitution. When the Founding Fathers were drafting the Constitution, some people were worried that we were creating a powerful government, so the Bill of Rights was created to protect citizens.
Today, there is a lot of talk about powerful tech companies, and there’s a feeling that consumers need certain protections. The Algorithmic Bill of Rights is targeted at that. A lot of consumers feel that they’re helpless against big tech and their algorithms, but I feel that consumers do have some power, and that power is in terms of our knowledge, our votes, and our dollars.
We shouldn’t be passive users of technology—we should be active and deliberate about it. We should know how it’s changing decisions we are making or others are making about us. Look at how Facebook is changing its product design today. That change—support for encryption and so on—is because of a push from users. It shows that when users complain, changes do happen.
Votes are another aspect of that. They involve our being aware of which elected representatives understand the nuances of algorithms and how to regulate them. The question is: How are these regulators going to protect us?
That’s where the Bill of Rights comes in. The Bill of Rights I propose has a few key pillars. One pillar is transparency—transparency with regard to the data used to make decisions, and with regard to the underlying decision itself. What were the most important factors that led to a certain decision? Europe’s GDPR [General Data Protection Regulation] has certain provisions, like right to explanations and information on the data that companies are using. I think some of that transparency is needed, and companies should provide that.
Another pillar in my Bill of Rights is the idea of some user control, that we cannot be in an environment where we have no control over the technology. We should, for example, be able to tell Alexa: “You’re not listening to any conversation in the house until I instruct you that it’s allowed.” There’s no such provision at present. We are told that the system is not listening, but we’re also hearing that there are instances where it does listen, even when you’re not giving it instructions.
This control is very important. Two years ago there was no way for users to alert Facebook’s algorithm and say: “This post in my newsfeed is false news.” Today, with just two clicks, you can let Facebook know that a certain news post in your feed is either offensive or false. That feedback is very important for the algorithm to correct itself.
Lastly, I have been advocating the idea that companies should formally audit algorithms before they deploy them, especially in socially consequential settings like recruiting. The audit process must be done by a team that is independent of the one that developed the algorithm. The audit process is important because it will help ensure that somebody has looked at things beyond, say, the prediction accuracy of the model, looking at things like privacy, bias, and fairness. That will help curb some of these problems with algorithmic decisions.
Knowledge@Wharton: Any final points that you would like to emphasize?
Hosanagar: Even though I talk about many of the challenges with algorithms in my book, I’m not a skeptic—I’m actually a believer in algorithms. The message I want to share is not “Be wary,” but “Engage more actively and more deliberately, and be part of the process of influencing how these technologies develop.” Studies show that algorithms, on average, are less biased than human beings, and my contention is that it is easier to fix algorithmic bias than it is to fix human bias.
The challenge with algorithmic bias is in the way it scales. A prejudiced judge can impact the lives of maybe 200 or 300 people, but an algorithm used in all the courtrooms in a country or across the world can influence the lives of hundreds of thousands, or even millions, of people. Similarly, a biased recruiter can affect the lives of hundreds of people, but a biased recruiting algorithm can affect the lives of millions of people. It’s the scale that we have to worry about, and that’s why we need to take the issue seriously.
The key message is that we are going into a world where these algorithms will help us make better decisions. We’ll have growing pains along the way, so we should actively engage now to minimize those incidents.
Republished with permission from Knowledge@Wharton, the online research and business analysis journal of the Wharton School of the University of Pennsylvania.