Review of Madhumita Murgia’s Code Dependent: Living in the Shadow of AI — Being human in an AI world

Madhumita Murgia’s debut book, shortlisted for the inaugural Women’s Prize for Non-Fiction, maps the growth of artificial intelligence through the lens of its human actors.

Updated - April 05, 2024 05:05 pm IST

Published - April 05, 2024 09:02 am IST

If you have ever asked Alexa to play a song, Siri to make a call, or Google to map a route, then you have engaged with an artificial intelligence (AI) system. These are just a small subset of the everyday interactions people have with AI-powered products.

Madhumita Murgia’s Code Dependent lifts the veil on the humans building the base on which AI’s superstructure stands. Most of them are unaware that the very systems they are helping build could soon gobble up their livelihoods.

Her book offers a cross-sectional view of tech’s bedrock: labelled data, the humans who build it, and the influence of automated systems on people. She throws light on the real AI stack, which has humans right at the bottom of the pyramid; without their inputs, the current crop of AI technology would not stand.

Over the past decade, tech devices and software have become more intuitive, creative and powerful. Underpinning their advance is a confluence of four major forces: Big Data, algorithmic recommendation systems, innovation in chip design, and cash-rich Big Tech firms.

This potent mix is redefining the way people interact with technology and is bringing humans closer to machines than ever before in recorded history. Of these four forces, Big Data is the most crucial ingredient in concocting a powerful AI system.

Making sense of numbers

Lumps of data mean nothing unless someone slices and dices them into manageable parts. And that cutting of datasets into specific parts can be done only after the content has been categorised and labelled.

If a self-driving car adjusts its steering wheel after noticing a signpost, that means it was trained on a dataset containing labelled information about roads and signposts. This labelling makes the car’s advanced driver-assistance system (ADAS) adept at manoeuvring across diverse terrains. The process of labelling parts of a dataset is called data annotation.
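To make the idea of data annotation concrete, a single labelled record might look roughly like the sketch below. The field names, label classes, and coordinates are illustrative assumptions, not drawn from any real driving dataset:

```python
# A toy sketch of data annotation for a driving dataset.
# Field names, classes, and coordinates are illustrative only.

def annotate(image_id, boxes):
    """Attach human-written labels (class + bounding box) to an image."""
    return {"image": image_id, "annotations": boxes}

record = annotate(
    "road_0001.jpg",
    [
        {"label": "stop_sign", "bbox": [412, 120, 460, 168]},   # x1, y1, x2, y2
        {"label": "lane_marking", "bbox": [0, 300, 640, 480]},
    ],
)

# A model trained on many such records learns to map raw pixels to the
# labels that human annotators supplied.
labels = {a["label"] for a in record["annotations"]}
```

Each such record is the product of human judgement; multiply it by millions and the scale of the labelling labour behind ADAS systems becomes apparent.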

In some ways, this job resembles the data-entry jobs of the outsourcing-and-offshoring era that began in the early 2000s, which enriched corporations in the developed world by exploiting cheap labour in the Global South.

Two decades later, large tech firms in the developed world are exploiting cheap human labour in lower-income countries to enrich themselves and build massive AI models. They are outsourcing the data-labelling task to firms based in developing nations. ChatGPT-maker OpenAI is a client of one such firm based in Nigeria.

The Microsoft-backed company hired Sama, a digital outsourcing firm, for labelling support. OpenAI’s chatbots are fast, hammering out replies within seconds of a user hitting send, because the underlying transformer efficiently arranges tokenised words and phrases into a meaningful, coherent response.

But if those responses exclude toxic or graphic text, that is because the Nigeria-based company’s employees labelled such content to ensure OpenAI’s algorithm does not reproduce it in its responses.

While most people may have heard of the billions of parameters that large language models (LLMs) contain, very few know how the training happens and what sorts of inputs these models receive. A large proportion of people, including those developing such systems, see AI’s decision-making as a black box.

For example, when a group of researchers began developing COVID-19 diagnostic software, they used pneumonia chest X-ray data in the control group, which included data from children aged one to five. Instead of distinguishing pneumonia from COVID-19 based on the X-rays, the machine-learning model wrongly learned to distinguish children from adults.

The AI black box

This is just one of many instances of AI exposing its black-box nature, confounding scientists and researchers. Take the case of algorithmic profiling, a machine learning-based system that helps law enforcement agencies quantify a citizen’s inclination to commit a crime.

In the book, Murgia documents how ProKid, an algorithmic profiling software, was used by the Dutch police to predict a youth’s propensity to commit crime simply based on data from their “previous contacts with the police, their addresses, their relationships and their roles as a witness or victim.”

Such uses of AI show how an individual’s agency is shrinking, disempowering them and eroding their sense of free will and their ability to change themselves.

Murgia’s book is an essential read, particularly at a time when lawmakers around the world are drafting legislation around AI. While the book does not offer solutions to the problems manifest in the system, it offers a new perspective from which to visualise the AI stack through the lens of human actors. And the most important part to start with is ‘data’.

Code Dependent: Living in the Shadow of AI; Madhumita Murgia, Macmillan, ₹699.
