[2016] Weapons of math destruction

@book{o2016weapons,
  title={Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy},
  author={O'Neil, C.},
  isbn={9780553418828},
  lccn={2016016487},
  url={https://books.google.co.cr/books?id=NgEwCwAAQBAJ},
  year={2016},
  publisher={Crown}
}

Available on Open Library.

My highlights:

Mathematics, once my refuge, was not only deeply entangled in the world’s problems but also fueling many of them. Math was able to combine with technology to multiply the chaos and misfortune, adding efficiency and scale to systems that I now recognized as flawed. Increasingly they focused not on the movements of global financial markets but on human beings, on us. Mathematicians and statisticians were studying our desires, movements, and spending power. Many of these models encoded human prejudice, misunderstanding, and bias into the software systems that increasingly managed our lives. Their verdicts, even when wrong or harmful, were beyond dispute or appeal. And they tended to punish the poor and the oppressed while making the rich richer. They define their own reality and use it to justify their results. Many poisonous assumptions are camouflaged by math and go largely untested and unquestioned. The privileged are processed more by people, the masses by machines.

The human victims of WMDs are held to a far higher standard of evidence than the algorithms themselves. Collateral damage, unworthy and expendable. Think of the astounding scale, and ignore the imperfections.

Ill-conceived mathematical models now micromanage the economy, from advertising to prisons. Money pouring in seems to prove that their models are working. Profits end up serving as a stand-in, or proxy, for truth. Their feedback is money, which is also their incentive.

There would always be mistakes, however, because models are, by their very nature, simplifications. No model can include all of the real world’s complexity or the nuance of human communication. Inevitably, some important information gets left out. A model’s blind spots reflect the judgments and priorities of its creators. Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics. Whether or not a model works is also a matter of opinion.

Racists don’t spend a lot of time hunting down reliable data to train their twisted models. And once their model morphs into a belief, it becomes hardwired. It generates poisonous assumptions, yet rarely tests them, settling instead for data that seems to confirm and fortify them.

The three elements of a WMD: Opacity, Scale, and Damage.

Unlike the numbers in my academic models, the figures in my models at the hedge fund stood for something. They were people’s retirement funds and mortgages. This wealth was coming out of people’s pockets. When the market crashed, rich opportunities would emerge from the wreckage. The algorithms would make sure that those deemed losers would remain that way. A lucky minority would gain ever more control over the data economy, raking in outrageous fortunes and convincing themselves all the while that they deserved it.

When you create a model from proxies, it is far simpler for people to game it. Proxies are easier to manipulate than the complicated reality they represent. As people game the system, the proxy loses its effectiveness.
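This is Goodhart's law in action. A toy simulation of the dynamic (mine, not the book's): a selector ranks people by a proxy that starts out as an honest, noisy signal of true quality; once candidates learn that decisions hinge on the proxy, they inflate it directly, and the top-ranked group stops matching the truly qualified.

import random

random.seed(0)

# Toy population: an unobserved true quality plus a measurable proxy
# (say, a test score) that starts out as an honest, noisy signal.
people = [{"quality": random.gauss(0, 1)} for _ in range(1000)]
for p in people:
    p["proxy"] = p["quality"] + random.gauss(0, 0.3)

def top_decile_overlap(pop):
    # How many of the 100 best-by-proxy are also in the true top 100?
    by_proxy = sorted(pop, key=lambda p: p["proxy"], reverse=True)[:100]
    cutoff = sorted((p["quality"] for p in pop), reverse=True)[99]
    return sum(p["quality"] >= cutoff for p in by_proxy) / 100

print("before gaming:", top_decile_overlap(people))

# Gaming: people invest in inflating the proxy itself (coaching,
# cramming, keyword stuffing), independent of underlying quality.
for p in people:
    p["proxy"] += random.uniform(0, 2)

print("after gaming: ", top_decile_overlap(people))

The overlap drops sharply after gaming even though nobody's underlying quality changed, which is exactly the sense in which the proxy "loses its effectiveness."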

We are ranked, categorized, and scored in hundreds of models, on the basis of our revealed preferences and patterns. Ads that pinpoint people in great need and sell them false or overpriced promises. They find inequality and feast on it. They zero in on the most desperate among us at enormous scale. The targets have little idea how they were scammed. Corinthian Colleges targeted “isolated,” “impatient” individuals with “low self-esteem” who have “few people in their lives who care about them” and who are “stuck” and “unable to see and plan well for future.” Vulnerability is worth gold. Many people unwittingly disclose their pain points when they look for answers on Google or, later, when they fill out college questionnaires.

With machine learning the computer dives into the data, following only basic instructions. The algorithm finds patterns on its own, and then, through time, connects them with outcomes. People across the earth have produced quadrillions of words about our lives and work, our shopping, and our friendships. By doing this, we have unwittingly built the greatest-ever training corpus for natural-language machines. If the program is predatory, it gauges its targets’ weaknesses and vulnerabilities and pursues the most efficient path to exploit them.
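A minimal sketch of the loop described here, assuming scikit-learn is available (the four-document "corpus" and its labels are invented for illustration): the only "basic instructions" are the choice of model family and the labeled outcomes; the word-to-outcome patterns are found by the fit itself.

# Minimal pattern-finding loop: labeled text in, learned associations out.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "struggling with debt, need a way out fast",
    "looking for flexible classes while working nights",
    "comparing graduate programs and scholarship options",
    "researching faculty publications before applying",
]
labels = [1, 1, 0, 0]  # 1 = flagged as a "vulnerable" lead, 0 = not

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)            # words -> feature counts
model = LogisticRegression().fit(X, labels)    # patterns -> outcomes

query = ["need fast classes, drowning in debt"]
print(model.predict_proba(vectorizer.transform(query)))

Nothing in the code says what "vulnerable" means; the model simply learns whichever word patterns co-occur with the label, which is how a predatory labeling scheme gets automated.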

University of Phoenix spent $2,225 per student on marketing and only $892 per student on instruction. If students could scrape together a few thousand dollars, either from savings or bank loans, the universities could line them up for nine times that sum in government loans, making each student incredibly profitable. To many of the students, the loans sound like free money, and the school doesn’t take pains to correct this misconception. What these people need is money. And the key to earning more money, they hear again and again, is education.
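Working the passage's arithmetic through in a quick sketch (the $3,000 up-front figure is my stand-in for "a few thousand dollars"; the book gives only the per-student spends and the nine-times multiple):

# Per-student figures from the passage (annual, USD).
marketing = 2225       # University of Phoenix marketing spend
instruction = 892      # instruction spend
print(f"marketing / instruction = {marketing / instruction:.1f}x")  # ~2.5x

upfront = 3_000        # assumed "few thousand dollars" from the student
federal_loans = 9 * upfront
print(f"${upfront:,} up front unlocks about ${federal_loans:,} in government loans")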

Most crimes aren’t as serious as burglary and grand theft auto, and that is where serious problems emerge. Vagrancy, aggressive panhandling, and selling and consuming small quantities of drugs are endemic to many impoverished neighborhoods. Once the nuisance data flows into a predictive model, more police are drawn into those neighborhoods, where they’re more likely to arrest more people. The policing itself spawns new data, which justifies more policing. Prisons fill up with hundreds of thousands of people found guilty of victimless crimes. Even if a model is color-blind, its results are anything but: geography is a highly effective proxy for race. What about crimes far removed from the boxes on the PredPol maps, the ones carried out by the rich? Thanks largely to the industry’s wealth and powerful lobbies, finance is underpoliced. Imagine if police enforced their zero-tolerance strategy in finance. Bankers are virtually invulnerable: they spend heavily on our politicians, which always helps, and are also viewed as crucial to our economy. Stop and frisk, meanwhile, ensnared thousands of black and Latino men, many of them for committing the petty crimes and misdemeanors that go on in college frats, unpunished, every Saturday night. Officers routinely “stopped blacks and Hispanics who would not have been stopped if they were white.”
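A toy simulation of this feedback loop (mine, not the book's): two areas with identical underlying crime, where recorded incidents scale with how many officers are present to observe them, and patrols are reallocated each period toward whichever area shows more recorded crime. A small initial imbalance snowballs.

# Two neighborhoods with IDENTICAL underlying crime. Recorded incidents
# grow with patrol presence; each period five officers shift toward the
# "hotter" area, so the data the patrols produce justifies more patrols.
true_crime = [100, 100]   # actual offenses per period, both areas
patrols = [55, 45]        # officers; area 0 starts slightly over-policed

for period in range(6):
    recorded = [c * p / 100 for c, p in zip(true_crime, patrols)]
    hot = 0 if recorded[0] >= recorded[1] else 1
    patrols[hot] += 5
    patrols[1 - hot] -= 5
    print(f"period {period}: recorded={recorded}, patrols={patrols}")

By the last period nearly all officers are in area 0, even though the two areas never differed in actual crime. That runaway concentration is the "new data justifies more policing" loop in miniature.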

We’re often faced with a choice between fairness and efficacy. Our legal traditions lean strongly toward fairness. WMDs, by contrast, tend to favor efficiency. They feed on data that can be measured and counted. Fairness is squishy and hard to quantify. It is a concept. Programmers don’t know how to code for it, and few of their bosses ask them to. Justice cannot just be something that one part of society inflicts upon the other.

The companies hiring minimum-wage workers are managing herds. They slash expenses by replacing human resources professionals with machines, and those machines filter large populations into more manageable groups. Our livelihoods increasingly depend on our ability to make our case to machines. The key is to learn what the machines are looking for. The insiders find a way to gain a crucial edge. The computer learned from the humans how to discriminate, and it carried out this work with breathtaking efficiency. Mathematical models can sift through data to locate people who are likely to face great challenges, whether from crime, poverty, or education. It’s up to society whether to use that intelligence to reject and punish them—or to reach out to them with the resources they need.

Scheduling software can be seen as an extension of the just-in-time economy. But instead of lawn mower blades or cell phone screens showing up right on cue, it’s people, usually people who badly need money. Companies take steps not to make people’s lives too miserable. They all know to the penny how much it costs to replace a frazzled worker who finally quits. It’s as if the software were designed expressly to punish low-wage workers and to keep them down. Managers assume that the scores are true enough to be useful, and the algorithm makes tough decisions easy.

That’s why society needs countervailing forces, such as vigorous press coverage that highlights the abuses of efficiency and shames companies into doing the right thing. And when companies come up short, as Starbucks did, the press must expose them again and again. Society also needs regulators to keep them in line, strong unions to organize workers and amplify their needs and complaints, and politicians willing to pass laws to restrain corporations’ worst excesses.

Creditworthiness has become an all-too-easy stand-in for other virtues. Bad credit has grown to signal a host of sins and shortcomings that have nothing to do with paying bills. Before companies carry out these checks, they must first ask for permission. But that’s usually little more than a formality; at many companies, those refusing to surrender their credit data won’t even be considered for jobs. Framing debt as a moral issue is a mistake. Plenty of hardworking and trustworthy people lose jobs every day as companies fail, cut costs, or move jobs offshore. Many of the newly unemployed find themselves without health insurance; all it takes is an accident or an illness for them to miss a payment on a loan. A sterling credit rating is not just a proxy for responsibility and smart decisions. It is also a proxy for wealth. And wealth is highly correlated with race. “The more data, the better” is the guiding principle of the Information Age. Yet in the name of fairness, some of this data should remain uncrunched. They naturally project the past into the future. The poor are expected to remain poor forever and are treated accordingly—denied opportunities, jailed more often, and gouged for services and loans.

Any operation that attempts to profile hundreds of millions of people from thousands of different sources is going to get a lot of the facts wrong. Such mistakes are learning opportunities—as long as the system receives feedback on the error.

As insurance companies learn more about us, they’ll be able to pinpoint those who appear to be the riskiest customers and then either drive their rates to the stratosphere or, where legal, deny them coverage. This is a far cry from insurance’s original purpose, which is to help society balance its risk.

These automatic programs will increasingly determine how we are treated by the other machines, the ones that choose the ads we see, set prices for us, line us up for a dermatologist appointment, or map our routes. They will be highly efficient, seemingly arbitrary, and utterly unaccountable. No one will understand their logic or be able to explain it.

Modern consumer marketing, however, provides politicians with new pathways to specific voters so that they can tell them what they know they want to hear. Once they do, those voters are likely to accept the information at face value because it confirms their previous beliefs. The appetite for fresh and relevant data, as you might imagine, is intense. And some of the methods used to gather it are unsavory, not to mention intrusive.

It will become harder to access the political messages our neighbors are seeing—and as a result, to understand why they believe what they do, often passionately. This asymmetry of information prevents the various parties from joining forces—which is precisely the point of a democratic government. Political microtargeting harms voters of every economic class. Rich and poor alike find themselves disenfranchised (though the truly affluent, of course, can more than compensate for this with campaign contributions).

Promising efficiency and fairness, they distort higher education, drive up debt, spur mass incarceration, pummel the poor at nearly every juncture, and undermine democracy. For many of them, it can feel as though the world is getting smarter and easier. The quiet and personal nature of this targeting keeps society’s winners from seeing how the very same models are destroying lives.

How do we start to regulate the mathematical models that run more and more of our lives? I would suggest that the process begin with the modelers themselves.

  • I will remember that I didn’t make the world, and it doesn’t satisfy my equations.
  • Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.
  • I will never sacrifice reality for elegance without explaining why I have done so.
  • Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.
  • I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.

We must reevaluate our metric of success. Measure hidden costs, while also incorporating a host of non-numerical values. They’re concepts that reside only in the human mind, and they resist quantification. Make sure that various ethnicities or income levels are represented within groups of voters or consumers. Get a grip on our techno-utopia, that unbounded and unwarranted hope in what algorithms and technology can accomplish. Admit they can’t do everything. Conduct algorithmic audits: piece together the assumptions behind the model and score them for fairness. Do we have to dumb down our algorithms? In some cases, yes.
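One concrete audit check, sketched under my own assumptions with invented decision data: compare a model's positive-outcome rates across groups and flag ratios below the four-fifths ("80 percent") rule used in US employment law.

# Group-rate audit: a model's approval rates by group, plus the
# disparate-impact ratio (min rate / max rate). Data is invented.
from collections import defaultdict

decisions = [  # (group, model_approved)
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
for group, approved in decisions:
    counts[group][0] += approved
    counts[group][1] += 1

rates = {g: a / n for g, (a, n) in counts.items()}
ratio = min(rates.values()) / max(rates.values())
print(rates, f"disparate impact ratio = {ratio:.2f}")  # flag if < 0.8

This checks only one narrow notion of fairness (demographic parity); an audit worth the name would score the model's assumptions and error rates across groups too.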

First, we need to demand transparency. Each of us should have the right to receive an alert when a credit score is being used to judge or vet us. And each of us should have access to the information being used to compute that score. If it is incorrect, we should have the right to challenge and correct it. Any data collected must be approved by the user, as an opt-in.

Data is not going away. Nor are computers—much less mathematics. Predictive models are, increasingly, the tools we will be relying on to run our institutions, deploy our resources, and manage our lives. These models are constructed not just from data but from the choices we make about which data to pay attention to—and which to leave out. Those choices are not just about logistics, profits, and efficiency. They are fundamentally moral. If we back away from them and treat mathematical models as a neutral and inevitable force, like the weather or the tides, we abdicate our responsibility. And the result, as we’ve seen, is WMDs that treat us like machine parts in the workplace, that blackball employees and feast on inequities.