How an Election Propaganda Campaign on Social Media Might Work / by Chris Shaffer

I’m writing this on the heels of watching a documentary about Cambridge Analytica and their role in “hacking” the Trump campaign, the Brexit campaign, and others - one that was (for me, at least) disappointingly light on operational detail.

I don’t have any experience rigging elections, of course, nor do I have any insider information on the topic. But a rough outline can be pieced together from information that has entered the public domain, combined with a knowledge of how machine learning and social media advertising work.

The idea is (obviously) not to produce a how-to guide. Too often this conversation devolves into defeatism - “do we still have free will?” and “can we still have democracy?” Demystifying how this works isn’t just a matter of curiosity - anyone who understands how misinformation works is bound to be less susceptible to it.

With that, let’s get ourselves inside the mind of a villain.

Your first step is to figure out how you want to categorize people. There’s an array of personality tests used by psychologists, employers, marketers, and others. Pick one of these, or even craft your own.

The goal is to be able to put people into groups that define their behavior in some way. For example, a Myers-Briggs type will tell you, among other things - Is this person more introverted or extroverted? Are they more likely to respond to emotional appeals or statistics? For political purposes, you might want to group people by their party affiliation, likelihood to vote, etc.

Second is to collect some data on people - whatever you can get your hands on. Publicly available on social media? For sale from a marketing company? Obtained illegally? All fair game, if you’re a bad guy.

The text of Tweets and status updates? Addresses you’ve lived at? Job titles you’ve had? Companies you’ve worked for? Bands, books, movies you enjoy? Demographic data (age, gender, race, income level)? It’s all useful - anything you can put into a profile might end up having value.

The third step is to train a machine learning model. Get whoever you can to take the personality test (from step 1). Then, match those people up with their data (from step 2). Feed all of that into some machine-learning code.

Getting the machine-learning bit to work well is where you need to hire a bunch of smart people…

Without going into a PhD-long sidebar, the gist of machine learning is that it finds patterns in large sets of data. If you feed it a lot of problems and their solutions, and if there is a correlation between the problem and the solution, and if your smart people are doing their jobs well, your machine will soon be able to find that correlation (even if it’s too nuanced to express) and use it to solve the next problem and the next one … at super-human speed.

Feed a machine learning model 100,000 pictures labeled “hot dog” or “not hot dog” and it can tell you whether or not picture 100,001 is a hot dog. It’s not behaving according to a set of pre-defined rules; it’s just looking at the data you put in, and saying, “this data looks similar to the previous data, when you told me the answer was X, more strongly than it looks similar to the data you gave me when you told me the answer was Y, so I’ll answer X”.
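The “looks more similar to X than to Y” idea above can be sketched as a nearest-neighbor classifier. This is a deliberately minimal stand-in for a real image model - the two-number feature vectors below are invented, not actual image features:

```python
# A minimal sketch of "this data looks more similar to the examples
# labeled X than to the examples labeled Y, so I'll answer X".
# Feature vectors are invented stand-ins for learned image features.

def distance(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(example, training_data):
    # Answer with the label of the single closest training example.
    label, _ = min(
        ((lbl, distance(example, feats)) for feats, lbl in training_data),
        key=lambda pair: pair[1],
    )
    return label

training = [
    ([0.9, 0.1], "hot dog"),
    ([0.8, 0.2], "hot dog"),
    ([0.1, 0.9], "not hot dog"),
    ([0.2, 0.7], "not hot dog"),
]

print(classify([0.85, 0.15], training))  # → hot dog
```

Note that the classifier has no concept of what a hot dog *is* - it only measures similarity to what it has already seen, which is exactly why the failure modes in the list below follow.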

The problems with machine learning stem directly from this methodology:

  • If it sees a hot dog that looks very different from the hot dogs it was trained on, or from an angle that’s too different, it won’t detect it. How different is too different? It’s pretty hard to tell until you hit that case…

  • If the training data is labeled wrong, it simply won’t work - feed it wrong answers and it’ll spit out wrong answers.

  • While machine-learning approaches work very well for “I know it when I see it” applications, such as image recognition, they don’t really work when you want to follow a strict set of rules.

  • It’s never going to ask itself whether it should be looking for hamburgers instead of hot dogs. In fact, it’s never even heard of a hamburger.

So, you feed 100,000 profiles (lists of favorite bands, job history, etc.), along with each person’s personality category, into the machine. You’ve now got a machine that can look at a new profile and tell you that person’s personality category.

It’s not always right, of course - there’s only so much you can predict about how someone will behave based on the information you’ve fed in - but you don’t have to always be right when you’re doing something at scale. You just have to be right a little bit more often than random chance …

We have a person who likes these bands and works at this job? 74% chance they’re an introvert. Drives this kind of car and uses these words a lot in their posts? 59% chance they’re more emotional than rational.
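Steps 1 through 3 can be sketched in miniature with a naive-Bayes-style scorer: profiles are bags of attributes, each labeled with a personality category, and a new profile gets a category plus a rough probability. Every attribute name and category here is invented for illustration, and a real system would use far richer features and models:

```python
from collections import Counter, defaultdict

# Toy sketch: labeled profiles in, (category, probability) out.
# All attributes and labels are invented for illustration.

training = [
    ({"band:radiohead", "job:engineer"}, "introvert"),
    ({"band:radiohead", "job:librarian"}, "introvert"),
    ({"band:kiss", "job:sales"}, "extrovert"),
    ({"band:kiss", "job:engineer"}, "extrovert"),
]

# Count how often each attribute appears under each label.
counts = defaultdict(Counter)
label_totals = Counter()
for attrs, label in training:
    label_totals[label] += 1
    for a in attrs:
        counts[label][a] += 1

def score(profile):
    # Naive-Bayes-style scoring with add-one smoothing; returns the
    # best label and its probability relative to the alternatives.
    scores = {}
    for label, total in label_totals.items():
        p = total / sum(label_totals.values())
        for a in profile:
            p *= (counts[label][a] + 1) / (total + 2)
        scores[label] = p
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())

print(score({"band:radiohead", "job:engineer"}))  # → ('introvert', 0.75)
```

The output is exactly the kind of hedged prediction described above: not a certainty, just odds better than a coin flip - which is all you need at scale.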

Next, craft some content for each group. Is this personality type afraid of change? Tell them that the opposition candidate is a dangerous radical. Have an angry person who’s not afraid of change? Tell them that your candidate is a revolutionary. Got someone who isn’t likely to be won over? Tell them that the candidates are basically the same and the election is rigged anyway, so they might as well not vote at all.
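In code, this step is not much more than a lookup table from predicted segment to message template. The segments and copy below are invented examples, but the shape of the logic is the point - including the cynical fallback for people you can’t win over:

```python
# Sketch of matching message templates to predicted segments.
# Segment names and ad copy are invented for illustration.

MESSAGES = {
    ("fears_change", "persuadable"): "The opposition is a dangerous radical.",
    ("angry", "persuadable"): "Our candidate will shake up the system.",
    ("any", "unpersuadable"): "They're all the same. Why even vote?",
}

def pick_message(traits, persuadable):
    # Route each predicted profile to the template built for it;
    # unpersuadable voters get the suppression message.
    if not persuadable:
        return MESSAGES[("any", "unpersuadable")]
    for trait in ("fears_change", "angry"):
        if trait in traits:
            return MESSAGES[(trait, "persuadable")]
    return MESSAGES[("any", "unpersuadable")]

print(pick_message({"fears_change"}, persuadable=True))
```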

Test and refine that content. Generate dozens of slightly different versions of each message. Most advertising platforms already have built-in tools that help you figure out which posts are getting the most “engagement,” so you can run more of those and fewer of the ones that don’t work.

Having people engage - like, share, comment - is good, because it gives you more attention at less cost. But it’s not strictly necessary - if people are seeing your ads enough, then it can still lead to an impression of “all I hear about the opposition candidate is ‘scandal scandal scandal’” even if they never explicitly buy into any individual story.
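The “run more of what works” loop is, in essence, a multi-armed bandit. Here is a minimal epsilon-greedy sketch against simulated engagement rates - the variant names and rates are made up, and real ad platforms do this for you behind the scenes:

```python
import random

# Epsilon-greedy sketch of "run more of the ads that engage".
# Variant names and engagement rates are invented for the simulation.

VARIANTS = {"scandal-v1": 0.030, "scandal-v2": 0.045, "scandal-v3": 0.020}

shown = {v: 0 for v in VARIANTS}
engaged = {v: 0 for v in VARIANTS}

def choose(epsilon=0.1):
    # Mostly exploit the best-performing variant; occasionally explore.
    if random.random() < epsilon or not any(shown.values()):
        return random.choice(list(VARIANTS))
    return max(VARIANTS, key=lambda v: engaged[v] / max(shown[v], 1))

random.seed(0)
for _ in range(10_000):
    v = choose()
    shown[v] += 1
    engaged[v] += random.random() < VARIANTS[v]  # simulated engagement

print(shown)  # the best-engaging variant ends up with most impressions
```

After a few thousand impressions, spend concentrates on whichever variant engages best - no human ever has to understand *why* that version works.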

Last, micro-target. Once you find the personality types most susceptible to the message you’ve got, cut down your spending elsewhere. Working in a regional electoral system? Focus on the places that most frequently swing between parties.

If you ignore 95% of the population, you can show the ones you’re targeting 20x the number of ads for the same amount of money. Facebook, for example, makes roughly $2 in revenue per user per month. Spend double that - $4 per targeted user - on 100,000 people, and for $400,000 a month you completely dominate the sponsored content in their news feeds. (The numbers are higher for US users than the global average, but still well within the reach of a national campaign budget.)
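The back-of-the-envelope math above is worth making explicit (the $2 figure is the article’s rough average revenue per user, not an exact number):

```python
# Back-of-the-envelope micro-targeting budget, using the article's
# rough figures. $2/user/month approximates the platform's average
# revenue per user; spending double that buys a dominant share of a
# targeted user's sponsored content.

revenue_per_user_per_month = 2.00                       # rough ARPU
spend_per_user_per_month = 2 * revenue_per_user_per_month  # "double that"
targeted_users = 100_000

monthly_budget = spend_per_user_per_month * targeted_users
print(monthly_budget)  # → 400000.0
```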

Now that we understand how the game is played, how do we protect ourselves?

  1. Propaganda loses a great deal of its power over you once you understand how it works. Step one complete! These tools are extremely powerful on a population level, when you can target millions of people and only have to succeed in (slightly) changing the behavior of a small percentage of them, but they’re not “mind control.”

  2. Understand that it’s not going away. Online ad networks might get better at policing illegal and/or unmarked political ads, but they’re basically designed to sell the attention of a specific set of users to someone with a very tailored message. When you turn on the TV, you’re seeing the same political ads as everyone else - it’s hard to imagine one candidate buying every single ad slot (and if they did, lots of people would notice). With social media, you’re seeing a set of ads unique to just you, and one candidate might have bought every single ad slot.

  3. Don’t rely on social media as your main source of news. You might think you’re reading what your friends share with you, but you aren’t. You’re not really seeing “what Facebook wants you to see,” either - you’re seeing what an algorithm decides is going to generate the most ad revenue, and that algorithm is very easy to hijack. At least when you’re reading a newspaper, you know whose voice you’re hearing. If you don’t trust the nearest liberal rag, then pick up a conservative rag, as well.