The Suspect Science of Political Targeting

To anyone privy to how social media platforms work, the use of our digital data as a political tool against us was inevitable. But it was the events surrounding the British political consulting firm Cambridge Analytica that finally brought the issue front and centre globally. In March 2018, the British newspapers The Guardian and The Observer broke stories about how Cambridge Analytica had acquired the Facebook data of 87 million users, and then used it for political targeting during the 2016 presidential election in the United States. Suggestions that this was all done with Russian involvement made the incident even more scandalous. What made matters worse, or at least puzzling, was Facebook's response: it claimed that no "data breach" had occurred. Facebook's first statement following the scandal said that there had been 'access to information from users who chose to sign up to this app, and everyone involved gave their consent. People knowingly provided their information, no systems were infiltrated, and no passwords or sensitive pieces of information were stolen or hacked.' In India, where the subject of privacy had recently entered popular discourse after the Supreme Court upheld a right to privacy in 2017, the country's many Facebook users wondered how all this may have happened without a breach.

The Cambridge Analytica-Facebook scandal

In 2013, Aleksandr Kogan, a psychology researcher based at the University of Cambridge, created a Facebook app featuring a personality test called "thisisyourdigitallife", purportedly to collect user data for academic purposes. Facebook's API allowed Kogan to collect data such as details about users' identities, their friend networks and "likes", along with their answers to the personality test.

Kogan's personality test was taken by 270,000 people. Any reasonable person would expect that the data of only these 270,000 people would have been collected. But news reports famously quoted Christopher Wylie, a former Cambridge Analytica employee turned whistle-blower, who revealed that Kogan was able to collect the data of about 87 million people: everyone who took the test, and all their Facebook "friends". Kogan, or anyone else, could collect this data while fully honouring Facebook's terms and conditions for its developers. At the time, Facebook had a feature called "friends permission", designed to facilitate the collection of users' personal data without their permission (it was rolled back in 2015). It allowed developers to access the profiles of not just the person who installed their application, but those of all their "friends" as well. Such was Facebook's disregard for the privacy of its users that this feature was enabled by default. This means that unless you went through your Facebook settings and opted out, every time one of your "friends" took such a test or played a game on the platform, your data could be collected without your permission or knowledge.
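To make the mechanics concrete, here is a rough sketch, in Python, of the kind of request flow the since-retired "friends permission" made possible. The endpoints, permission model and field names approximate Facebook's old Graph API (v1.0) and are included purely for illustration; the access token is a placeholder.

```python
# Illustrative sketch only: approximates the retired Graph API v1.0 flow in which
# an app granted the old "friends permission" could read data about a user's
# friends. Endpoints, permissions and fields are approximations; the token is a placeholder.
import requests

GRAPH = "https://graph.facebook.com"     # base URL of the Graph API
TOKEN = "APP_USER_ACCESS_TOKEN"          # token of the one user who installed the app

def get(path, **params):
    """Helper: issue a GET request against the Graph API and return parsed JSON."""
    params["access_token"] = TOKEN
    return requests.get(f"{GRAPH}/{path}", params=params).json()

# 1. One consenting user installs the app: read their own profile and likes.
me = get("me", fields="id,name,likes")

# 2. The same token could then enumerate that user's friends...
friends = get("me/friends").get("data", [])

# 3. ...and, under the old friends_* permissions, pull each friend's details
#    and likes, even though the friends never installed the app themselves.
harvested = {}
for friend in friends:
    harvested[friend["id"]] = get(friend["id"], fields="id,name,likes")

print(f"Data collected for 1 consenting user and {len(harvested)} non-consenting friends")
```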

The final nail in the coffin was when Kogan sold the data he had collected to Cambridge Analytica. Bear in mind that this was the first act that went against Facebook's contractual terms and policies; the first and only action that it deemed wrong. When Facebook was informed of this unauthorized sharing, all it did was send emails to Kogan and Cambridge Analytica requesting them to delete the data, with little or no follow-up. Kogan and Cambridge Analytica were not suspended or banned from Facebook's platform, and they continued to enjoy developer privileges. Talk about not even shutting the stable doors after the horse has bolted! Roger McNamee, an early investor in Facebook who had known Zuckerberg from the company's early years, claimed that Zuckerberg 'did not believe in data privacy and did everything he could to maximize disclosure and sharing' and 'embraced invasive surveillance, careless sharing of private data, and behaviour modification in pursuit of unprecedented scale and influence.'

Cambridge Analytica was a subsidiary of a larger British behavioural research firm called Strategic Communication Laboratories (SCL). SCL describes one of its aims as the creation of 'behaviour change through research, data, analytics, and strategy for both domestic and international government clients.' To achieve this, SCL builds psychographic profiles of people, which it uses to target advertising and content. According to Wylie, the data bought from Kogan became the basis for the voter profiles created by Cambridge Analytica for the 2016 US elections. It also appeared that Cambridge Analytica's activities were connected with companies and executives linked to Russian intelligence agencies. The notion of Russian involvement in manipulating the American public to help secure the Trump presidency soon made the whole world sit up.

In the aftermath of the news reports, Mark Zuckerberg, founder and CEO of Facebook, accepted in a public statement that Facebook had enabled people to log into apps and share who their friends were and some information about them, but maintained that there had been no data breach. We understand now that the collection of data by Kogan without legitimate consent was not a bug, but a feature of Facebook's platform. When Kogan collected the data of 87 million people, he was only doing what Facebook had always intended for developers like him to do. There had been no security lapse or breach of contractual terms. Yet there was a breach, and a much graver one: the breach of Facebook's relationship of trust with its users. While ostensibly claiming to be concerned with user privacy, Facebook had designed its platform to ensure that there was no way users could exert any meaningful control over their data. It had effectively implemented a new form of social contract for data, where consent was assumed from the barest forms of participation.

Cambridge Analytica in India

Though its focus was American, the ripples of the Cambridge Analytica scandal were felt in India. The Bharatiya Janata Party and the Indian National Congress traded allegations in the aftermath of the incident, each claiming that the other was a client of Cambridge Analytica and had used political targeting and manipulation in its election campaigns. With 241 million Facebook users in India, it does not take a leap of imagination to see that social media would be fertile ground for voter profiling. After testifying before the British parliamentary committee on the role of SCL in India, Wylie also claimed publicly that SCL had worked on state elections in India in Uttar Pradesh (2012, 2011, 2007), Bihar (2007), Kerala (2007), West Bengal (2007), Assam (2007), Jharkhand (2007), Madhya Pradesh (2003), Rajasthan (2003), and also on the national elections in 2009. The Ministry of Electronics and Information Technology (MeitY) sent notices to both Facebook and Cambridge Analytica seeking information about how Indian citizens may have been affected. Facebook responded that 562,455 Indians may have been put at risk. Cambridge Analytica, on the other hand, assured the Indian government that it had not used the personal data of Indians it had obtained from Kogan. While the incident involving Kogan may have had only minimal impact on Indian users, its bigger impact has been the journalistic investigations it prompted into the activities of Cambridge Analytica and SCL in India. Unsurprisingly, they had been involved in political consulting in India for several years.

In 2009, BJP leader Mahesh Sharma (later Union Minister of Culture) ran his first Lok Sabha campaign. He was, at the time, a respected doctor and businessman in Noida, and was contesting the Lok Sabha seat from that district. Freelance political consultant Avneesh Rai helped him with his campaign. Rai, a seasoned operator with two-and-a-half decades of experience, had very high expectations for Sharma, but Sharma lost. In a detailed investigative story for the online news portal The Print, Shivam Vij explained that, puzzled by this result, Rai reached out to other experts to probe the reasons for the loss. Through mutual acquaintances, Sharma was put in touch with Dan Muresan, who hailed from a political family in Romania and had recently taken over as Head of Elections at SCL in the UK. Muresan and his colleagues from the Behavioural Dynamics Institute (yet another company belonging to the SCL group) conducted interviews in Noida to understand people's perception of Mahesh Sharma. They put together detailed insights into the reasons behind Sharma's loss, and that left a strong impression on Rai.

Rai and Muresan kept in touch, and discussed working together. Rai had access to databases of households in many states. He had already created voter profiles using demographic data (including details such as caste) and the political preferences of these households. He saw an advantage in collaborating with SCL to get behavioural insights and access to alternative sources of data. SCL, too, was looking to expand its global operations, having just worked on elections in Ghana. In 2011, SCL incorporated an Indian entity, Strategic Communication Laboratories Private Limited, with Rai as one of its directors. The company's other directors were SCL's Alexander Nix and Alexander Oakes, and Rai's friend Amrish Tyagi. Tyagi, whose father KC Tyagi is a leader in the Janata Dal (United) party, also ran a company called Ovleno Business Intelligence that provided services to pharmaceutical companies. According to news reports, this Indian entity set up offices in ten cities: Ahmedabad, Bengaluru, Cuttack, Ghaziabad, Guwahati, Hyderabad, Indore, Kolkata, Patna and Pune; and worked on at least eight different contracts. Rai and Tyagi have denied these reports. They say that these were projects they worked on in their personal capacities, and that SCL India claimed credit for this work in presentations to prospective clients in order to attract business.

Unfortunately for SCL India, these denials by Rai and Tyagi did little to stop the media from questioning them about their involvement with elections in India. Whistle-blower Christopher Wylie has claimed that SCL India boasted a database of 'over 600 districts and 7 lakh villages,' and that the Indian National Congress was one of their clients. At the same time, Tyagi's company Ovleno Business Intelligence, which pivoted its business to political strategy, listed both the Bharatiya Janata Party (BJP) and the Congress as its clients on its website. On his LinkedIn profile, Ovleno Business Intelligence's director, Himanshu Sharma, listed managing four election campaigns for the BJP as one of his achievements.

After the Cambridge Analytica scandal broke in March 2018, the Ovleno Business Intelligence website was taken down. According to the investigative story by Shivam Vij, SCL India first tried to woo the Congress. They worked on databases of voters in four Lok Sabha constituencies (Amethi, Rae Bareli, Jaipur Rural and Madhubani) and gifted their findings to the Congress prior to the 2014 general elections. Later, without the knowledge of Rai or Tyagi, Alexander Nix decided to take on an Indian-American client who wanted SCL India to work for the BJP. This eventually led to a breakdown of their relationship, though Tyagi's Ovleno appears to have continued to work with SCL.

Clear answers about SCL's involvement in elections in India still elude us, but this much is certain: there is considerable interest in using social media for the political targeting of content in India. With Facebook's largest user base located in India, the country has been central to the company's strategies. Up to 133 million new voters became eligible to vote in the 2019 general elections, offering an even bigger, younger and more connected user base for political targeting. Facebook's WhatsApp messenger is already the primary conduit for misinformation in India and, being an encrypted messaging service, is largely immune to oversight mechanisms. Following the severe backlash it received after the Cambridge Analytica scandal, Facebook announced in 2018 that it would hire thousands more people to verify pages and advertisers before the next elections in the US, Mexico, Brazil, India and Pakistan.

How social media targeting works

The availability of cheap and easily accessible personal data offers new opportunities for political profiling. Collecting data and analysing it to create voter profiles has suddenly become a very lucrative business. It is not as if the methods for creating such profiles are new; they have been around for the last century in some form or another, but the deluge of data, and its new purveyors, now promise more detailed and granular insights.

In 2017, Michal Kosinski, a researcher affiliated with Stanford University, co-authored a paper claiming that facial recognition technology, along with deep neural networks, could be used on profile pictures uploaded to social media to predict sexual orientation. Predictably, the paper generated a lot of controversy. It was an audacious claim, which critics asserted was based on a faulty premise. Jim Halloran, the Chief Digital Officer of GLAAD, the world's largest LGBTQ media advocacy organization, called the paper reckless and without basis, saying that technology cannot identify someone's sexual orientation. What Kosinski's paper actually showed was that algorithms could detect a pattern in the appearance of a small subset of white gay and lesbian people on dating sites. The algorithm detected differences and similarities in facial structure, and tried to predict sexual orientation on the assumption that gay men's faces were more feminine than those of straight men, and lesbian women's faces more masculine than those of straight women. This assumption drew on the prenatal hormone theory of sexual orientation, which suggests that our sexuality is, in part, determined by hormone exposure in the womb. Kosinski's critics pointed out that factors such as less facial hair among the gay male subjects may as easily be a consequence of fashion trends and cultural norms as of prenatal hormonal exposure. More importantly, critics felt that the paper was dangerous and irresponsible because it could be used to support an authoritarian and brutal regime's efforts to identify and persecute people it believed to be homosexual. After the paper was published, Kosinski went on to claim that similar algorithms could help measure the intelligence quotients, political orientations and criminal inclinations of people from their facial images alone. Soon, Kosinski faced so much flak, including death threats, that a campus police officer had to be stationed outside his door.

While inferring intimate details from facial traits may seem audacious, using digital traces from social networks to do the same has gained more acceptance, and has become standard practice. Social media data is turned into actionable information for advertising and targeting by building psychometric profiles. Psychometric profiling is a process of measuring and assessing personality and psychology against a small number of set parameters. Most of us have taken some form of personality test, often based on the Big Five Personality Model or the Myers-Briggs questionnaire. The traditional method of psychometric profiling was to carry out surveys asking questions that can reveal aspects of the participants' psychological composition. The answers would then be analysed to create a psychometric profile of the individual or group. Recently, however, researchers such as Kosinski have found that instead of conducting surveys, which are expensive and require individuals to actively participate, digital traces from social media platforms can be used to predict psychological profiles more easily and cheaply. Kosinski started out as a traditional social psychologist trained in small-sample and questionnaire research, but was drawn to the new reality of digital data collection. The use of digital footprints as indicators of user attributes and preferences had been at the centre of Kosinski's research for some years. Back in 2013, Kosinski wrote a paper in which he analysed the Facebook "likes" of 58,000 people, and inferred sexual orientation, race and political leanings with an accuracy of 85 to 95 per cent. Six years before that, Kosinski and his frequent collaborator David Stillwell spearheaded the building of a Facebook app featuring a personality test, prosaically named "myPersonality". This app was, in fact, the precursor to Kogan's application that led to the Cambridge Analytica scandal.
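The pipeline behind such predictions can be sketched in a few lines of code. The example below is an illustrative simplification of the general approach: a sparse user-by-like matrix is compressed into a small number of dimensions, which then feed a simple classifier. The synthetic data and the specific scikit-learn components used here are illustrative assumptions, not a reconstruction of the published models.

```python
# Minimal sketch of predicting a trait (e.g. a political-leaning label) from
# Facebook-style "likes". The data is synthetic; the SVD-plus-classifier
# pipeline is an illustrative stand-in for the kind of model described.
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_users, n_pages = 5000, 2000

# Rows are users, columns are pages; a 1 means the user "liked" that page.
likes = sparse_random(n_users, n_pages, density=0.01, random_state=0,
                      data_rvs=lambda n: np.ones(n))
labels = rng.integers(0, 2, size=n_users)   # synthetic binary trait labels

X_train, X_test, y_train, y_test = train_test_split(likes.tocsr(), labels, random_state=0)

model = make_pipeline(
    TruncatedSVD(n_components=100, random_state=0),  # compress thousands of likes into 100 dimensions
    LogisticRegression(max_iter=1000),               # predict the trait from those dimensions
)
model.fit(X_train, y_train)

# With random synthetic labels the score hovers around chance (0.5);
# real "likes" carry the signal that the published work exploited.
print("held-out accuracy:", model.score(X_test, y_test))
```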

In 2014, SCL courted Kosinski and Stillwell, expressing interest in buying the dataset of the "myPersonality" app. They declined on the grounds that the data had been collected for academic purposes only. SCL then explored the possibility of hiring Kosinski and Stillwell to do psychometric modelling, but the deal fell through on monetary grounds. Eventually, Kogan deployed his own app with the understanding that he would sell the data he collected to Cambridge Analytica. This app, "thisisyourdigitallife", is said to have been inspired by Kosinski and Stillwell's app.

Psychometric Profiling and Persuasion

The theories of psychometrics that guide the apps assessing personality have remained unchanged from the days of survey-based research. Most personality tests and apps are based on a psychometric model called the Big Five Personality Factors. In this model, every personality is mapped across five factors: Extraversion, Neuroticism, Conscientiousness, Agreeableness and Openness. Over the past century and a half, a school of psychologists has followed the "lexical hypothesis", according to which all personality traits are encoded in natural language. This means that the basis for personality types is not a theoretical model but the analysis of the language terms people use to describe themselves. A pioneer in this field was Sir Francis Galton, who, in the late nineteenth century, picked up an authoritative dictionary and began noting down words he felt were expressive of character. His exercise yielded about a thousand such words. Galton's technique was refined by others in the early twentieth century, and Raymond Cattell brought the count of trait-descriptive terms up to 4,500. He later distilled these terms into 35 variables. It is these variables that were repeatedly studied, and factor-analysed into the Big Five.
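The statistical step that turns hundreds of correlated trait ratings into a handful of factors is, at its core, factor analysis. The sketch below illustrates the idea on synthetic survey data; the sample, the number of adjectives and the use of scikit-learn's FactorAnalysis are illustrative assumptions, not a reconstruction of Cattell's or later researchers' actual analyses.

```python
# Illustrative sketch of the lexical-hypothesis workflow: people rate themselves
# on many trait-descriptive adjectives, and factor analysis compresses the
# correlated ratings into a small number of underlying factors (five, in the
# Big Five tradition). All data here is synthetic.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n_people, n_adjectives, n_factors = 1000, 60, 5

# Simulate ratings driven by five hidden factors plus noise, mimicking how
# clusters of adjectives ("talkative", "outgoing", ...) tend to move together.
hidden = rng.normal(size=(n_people, n_factors))
loadings = rng.normal(size=(n_factors, n_adjectives))
ratings = hidden @ loadings + rng.normal(scale=0.5, size=(n_people, n_adjectives))

fa = FactorAnalysis(n_components=n_factors, random_state=0)
scores = fa.fit_transform(ratings)                  # each person's position on the 5 factors
print("per-person factor scores:", scores.shape)    # (1000, 5)
print("adjective loadings per factor:", fa.components_.shape)  # (5, 60)
```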

The second dominant personality model is the Myers-Briggs Type Indicator (MBTI). Unlike the Big Five, it draws from cognitive theories, seeing personality traits as arising from differences in how we receive and process information. Based largely on the work of psychologist Carl Jung, MBTI divides cognitive functions into eight types depending on how we perceive and judge information. According to Jung, people can be classified along three distinct dichotomies. The first relates to the source and direction of their energy expression, with most people falling under the categories of extraverted or introverted. The second dichotomy concerns how information is perceived, and classifies people as either sensing or intuitive. Those who fall under sensing mainly trust information they receive directly from external sources, while those who are intuitive mainly trust information from their internal or imaginative world. Finally, people can be understood in terms of how they process information, falling under either thinking or feeling. Thinking individuals are likely to make decisions through logic, while feeling individuals are likely to make decisions based on emotions.

This scheme was expanded further by Isabel Briggs Myers and Katharine Cook Briggs (after whom the model is named) to include a fourth dichotomy, around how people implement the information they process. Here, people are classified as judging (a more structured and decided lifestyle) or perceiving (a more flexible and adaptable one). Essentially, MBTI classifies people into types, whereas the Big Five measures traits on a dimensional scale.
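Because MBTI assigns people to discrete types rather than positions on continuous scales, a questionnaire's output is typically collapsed into a four-letter code, one letter per dichotomy. The sketch below shows that mapping in Python; the score format and the 0.5 threshold are illustrative assumptions rather than the official instrument's scoring.

```python
# Illustrative sketch: collapsing four dichotomy scores into an MBTI-style
# four-letter type code. Scores and thresholds are hypothetical; the real
# instrument uses its own scored questionnaire.
DICHOTOMIES = [
    ("E", "I"),  # Extraversion vs Introversion (source/direction of energy)
    ("S", "N"),  # Sensing vs Intuition (how information is perceived)
    ("T", "F"),  # Thinking vs Feeling (how decisions are made)
    ("J", "P"),  # Judging vs Perceiving (how information is implemented)
]

def mbti_type(scores):
    """scores: four numbers in [0, 1]; above 0.5 picks the first pole of each pair."""
    return "".join(first if s > 0.5 else second
                   for (first, second), s in zip(DICHOTOMIES, scores))

print(mbti_type([0.8, 0.3, 0.7, 0.2]))   # -> "ENTP"
```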

In both these models, the profile is intended to show how an individual may make decisions, and consequently, how they may be influenced. Even though these profiling methods are broad-brush, machine learning promises to find correlations between "likes" and demographic details, surfacing patterns that amount to a detailed and nuanced psychographic sketch of the individual.

The effectiveness of these methods for political micro-targeting is not a proven fact. In another paper from 2017, Michal Kosinski, David Stillwell and others argue that tailoring advertising to the psychological traits of people (again derived simply from Facebook "likes") can be very effective in influencing their behaviour. Sandra Matz, one of the co-authors of the paper, said, 'We wanted to provide some scientific evidence that psychological targeting works, to show policymakers that it works, to show people on the street that it works, and say this is what we can do simply by looking at your Facebook likes. This is the way we can influence behaviour.' This research used the data previously collected by Kosinski and Stillwell, in which they had inferred personality traits such as extroversion and introversion from Facebook "likes". They used this insight to target female Facebook users with advertisements for beauty products of a particular brand, choosing the ad simply on the basis of whether the target was introverted or extroverted. They demonstrated that tailoring ads to match users' personality traits made a considerable difference in purchasing, compared to users who were shown mismatched ads. Matz claimed that this was proof that consumers respond to personalized targeting, if there is a granular psychographic profile of them to guide the content. Even though this is some of the strongest available evidence that personality traits predicted from Facebook usage can be used to design advertising that affects behaviour, the claim has limited acceptance in the academic community. Other studies suggest that personalized targeting yields meagre results, especially when used as a political tool of persuasion. While a fairly accurate profile of individuals can be built using digital traces, how effective this profile is in actually changing minds remains questionable.
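In operational terms, the targeting described in the paper amounts to choosing an ad variant based on a predicted trait and then comparing conversion rates between matched and mismatched audiences. The sketch below illustrates that logic; the ad copy and all the numbers are hypothetical placeholders, not material from the study.

```python
# Illustrative sketch of personality-matched ad targeting and its evaluation.
# Ad copy and all counts below are hypothetical placeholders.
ADS = {
    "extrovert": "Make every party unforgettable.",
    "introvert": "Beauty doesn't have to be loud.",
}

def pick_ad(predicted_trait):
    """Serve the variant written for the predicted trait (the 'matched' condition)."""
    return ADS[predicted_trait]

def conversion_rate(conversions, impressions):
    return conversions / impressions

# Hypothetical outcome counts for matched vs mismatched audiences.
matched = conversion_rate(conversions=55, impressions=100_000)
mismatched = conversion_rate(conversions=35, impressions=100_000)
print(f"matched: {matched:.3%}, mismatched: {mismatched:.3%}, "
      f"relative lift: {matched / mismatched - 1:.0%}")
```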

To understand this, let us consider the real nature of Kosinski and Stillwell's findings and compare them to voting behaviour. The 2017 paper was based on three sets of experiments. In the first experiment, they matched ad type to personality type, but this did not translate into a significant increase in buying behaviour: fewer than 400 purchases in an experiment that reached over three million people. The second experiment dealt with a specific aspect of personality, openness, one of the Big Five factors discussed above. The results were much stronger for users with a low-openness personality type than for those with a high-openness type. Here, the results had a much higher strike rate: about 500 app installs out of 84,000 users subjected to the manipulation. The third experiment compared a standard one-size-fits-all marketing message with a personality-led marketing message, and found a 0.05 per cent improvement in the conversion rate for app installations. In the second and third experiments, users only had to install a free app; these saw stronger results than the first experiment, in which users had to make purchases. Influencing someone's voting behaviour is perhaps more comparable to influencing buying decisions, which carry bigger stakes than installing a free app, and this suggests that political micro-targeting may have limited success.
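To put the reported figures on a common footing, the short calculation below works out the implied conversion rates using only the numbers cited above; the resulting percentages are back-of-the-envelope estimates, not figures taken from the paper itself.

```python
# Back-of-the-envelope conversion rates from the figures cited above.
experiments = {
    "Experiment 1: purchases":    (400, 3_000_000),  # "fewer than 400" purchases across ~3 million people
    "Experiment 2: app installs": (500, 84_000),     # ~500 installs out of 84,000 targeted users
}
for name, (conversions, audience) in experiments.items():
    rate = conversions / audience
    print(f"{name}: {conversions} / {audience:,} = {rate:.3%}")
# Experiment 1: 400 / 3,000,000 = 0.013%
# Experiment 2: 500 / 84,000 = 0.595%
```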

There are also fundamental questions that remain about such broad-sweep classification systems. Typing in this manner is essentially a psychological and systematic classification of people according to a specific category. One can differentiate between individuals based on temperament, character, traits, behaviour patterns, and much more, but a fairly common approach is to look for type: a grouping of behavioural tendencies based on an underlying and supposedly universal model. According to the typology model in MBTI, then, introverts and extroverts are two fundamentally different categories of people.

At this point in time, neither researchers nor political campaigns know very much about how well targeting works at persuading voters. There is some evidence to suggest that voters rarely prefer targeted pandering to general messages. And any form of targeted messaging runs the risk of being shown to "mistargeted" voters, which could actually harm the candidate and negate any positive returns from targeting. The big issue for behavioural scientists is that even if detailed profiles of voters can be created, having some understanding of a voter does not mean that you will be able to craft a persuasive message to change their views. However, even if campaigns are unable to use targeted advertising to persuade voters to shift their loyalties, it can still be a powerful tool to contain voters within the echo chambers of their ideology. Already, the feed algorithms of platforms like Facebook show us content they think we would like; in this case, that could be political posts, advertisements and sponsored messages by the political parties to which the algorithms think we belong. Because platforms prioritize sensational content, political agents have a greater opportunity to push voters to the far end of their ideological spectrum by trapping them in a virtual world that shows them messaging from only one, often extreme, point of view.

Sinha, Amber. 2019. 'The Suspect Science of Political Targeting', in The Networked Public: How Social Media is Changing Democracy. New Delhi: Rupa Publications.