Twitter reports that fewer than 5% of accounts are fake or spammers, commonly referred to as ‘bots’. Since his offer to buy Twitter was accepted, Elon Musk has repeatedly questioned these estimates, even dismissing Chief Executive Officer Parag Agrawal’s public response.
Later, Musk put the deal on hold and demanded more proof.
So why are people arguing about the percentage of bot accounts on Twitter?
As the creators of Botometer, a widely used bot detection tool, our group at the Indiana University Observatory on Social Media has been studying inauthentic accounts and manipulation on social media for over a decade. We brought the concept of the “social bot” to the foreground and first estimated their prevalence on Twitter in 2017.
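For readers curious how such a tool is queried in practice, here is a minimal sketch using the botometer Python package. The keys and the account handle are placeholders, and the exact response fields can differ across Botometer API versions, so treat this as an illustration rather than a reference.

```python
import botometer

# Placeholder credentials; real values come from RapidAPI and a
# Twitter developer account.
rapidapi_key = "YOUR_RAPIDAPI_KEY"
twitter_app_auth = {
    "consumer_key": "YOUR_CONSUMER_KEY",
    "consumer_secret": "YOUR_CONSUMER_SECRET",
}

bom = botometer.Botometer(
    wait_on_ratelimit=True,
    rapidapi_key=rapidapi_key,
    **twitter_app_auth,
)

# Score a single account; higher scores indicate more bot-like behavior.
result = bom.check_account("@example_handle")
print(result["display_scores"]["english"]["overall"])
```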
Based on our knowledge and experience, we believe that estimating the percentage of bots on Twitter has become a very difficult task, and debating the accuracy of the estimate would miss the point. Here’s why.
What exactly is a bot?
To measure the prevalence of problematic accounts on Twitter, a clear definition of the targets is needed. Common terms such as “fake accounts,” “spam accounts” and “bots” are used interchangeably, but they have different meanings. Fake or impostor accounts are those that impersonate people. Accounts that mass-produce unsolicited promotional content are defined as spammers. Bots, on the other hand, are accounts controlled in part by software; they can post content or perform simple interactions, such as retweeting, automatically.
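To make “controlled in part by software” concrete, the sketch below shows how an automated account might post and retweet through Twitter’s API, using the third-party tweepy library. The credentials and tweet ID are placeholders; this only illustrates the mechanics, not any particular bot.

```python
import tweepy

# Placeholder credentials from a hypothetical Twitter developer account.
client = tweepy.Client(
    consumer_key="KEY",
    consumer_secret="SECRET",
    access_token="TOKEN",
    access_token_secret="TOKEN_SECRET",
)

# Post content automatically, with no human in the loop...
client.create_tweet(text="Automated status update")

# ...or perform a simple interaction, such as retweeting a given tweet.
client.retweet(tweet_id=1234567890)
```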
These types of accounts often overlap. For example, you can create a bot that pretends to be a human to automatically post spam. Such an account is a bot, a spammer and a fake at the same time. But not every fake account is a bot or a spammer, and vice versa. Making an estimate without a clear definition will only produce misleading results.
Defining and distinguishing account types can also inform appropriate interventions. Fake and spam accounts degrade the online environment and violate platform policy. Malicious bots are used to spread misinformation, inflate popularity, exacerbate conflict through negative and inflammatory content, manipulate opinions, influence elections, commit financial fraud and disrupt communication. However, some bots can be harmless or even useful, for example by helping disseminate news, delivering disaster alerts and conducting research.
Simply banning all bots is not in the best interest of social media users.
For simplicity, researchers use the term “inauthentic accounts” to refer to the collection of fake accounts, spammers, and malicious bots. This is also the definition Twitter seems to use. However, it is unclear what Musk has in mind.
Hard to count
Even when a consensus is reached on a definition, there are still technical challenges in estimating prevalence.
External researchers do not have access to the same data as Twitter, such as IP addresses and phone numbers. This hinders the public’s ability to identify inauthentic accounts. But even Twitter acknowledges that the actual number of inauthentic accounts could be higher than it has estimated, because detection is challenging.
Inauthentic accounts evolve and develop new tactics to evade detection. For example, some fake accounts use AI-generated faces as their profile pictures. These faces can be nearly indistinguishable from real ones, even to people. Identifying such accounts is difficult and requires new technologies.
Another difficulty is posed by coordinated accounts that appear normal individually but act so similarly to one another that they are almost certainly controlled by a single entity. Yet they are like needles in a haystack of hundreds of millions of daily tweets.
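One common way to surface such coordination, sketched below on toy data, is to compare accounts pairwise on the content they share. The account names, URL sets and similarity threshold here are all invented for illustration, not drawn from an actual detection pipeline.

```python
from itertools import combinations

# Toy data: the set of URLs each account has shared (invented).
accounts = {
    "account_a": {"url1", "url2", "url3", "url4"},
    "account_b": {"url1", "url2", "url3", "url5"},  # near-duplicate of account_a
    "account_c": {"url6", "url7"},                  # behaves independently
}

def jaccard(x, y):
    """Overlap of two sets relative to their union (0 = disjoint, 1 = identical)."""
    return len(x & y) / len(x | y)

THRESHOLD = 0.5  # arbitrary cutoff for this sketch

# Flag account pairs whose shared content is suspiciously similar.
for (a, set_a), (b, set_b) in combinations(accounts.items(), 2):
    sim = jaccard(set_a, set_b)
    if sim >= THRESHOLD:
        print(f"{a} and {b} look coordinated (similarity {sim:.2f})")
```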
The distinction between inauthentic and genuine accounts is becoming increasingly blurred. Accounts can be hacked, bought or rented, and some users “donate” their credentials to organizations that post on their behalf. As a result, so-called “cyborg” accounts are controlled by both algorithms and humans. Similarly, spammers sometimes post legitimate content to disguise their activity.
We observed a broad spectrum of behaviors that mix the characteristics of bots and humans. Estimating the prevalence of inauthentic accounts requires applying a simplistic binary classification: authentic or inauthentic account. No matter where the line is drawn, mistakes are inevitable.
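The sketch below illustrates the problem with invented bot scores on a 0-to-1 scale: accounts in the middle of the spectrum, such as “cyborgs,” flip between classes as the cutoff moves, so any single threshold misclassifies someone.

```python
# Invented bot scores on a 0-1 scale, just for illustration.
scores = {
    "clear_human": 0.05,
    "obvious_bot": 0.95,
    "cyborg_1": 0.45,  # mixed human/automated behavior
    "cyborg_2": 0.55,
}

def classify(score, threshold):
    return "inauthentic" if score >= threshold else "authentic"

# Borderline accounts change labels as the line moves.
for threshold in (0.4, 0.5, 0.6):
    labels = {name: classify(s, threshold) for name, s in scores.items()}
    print(threshold, labels)
```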
Missing the big picture
The focus of the recent debate on estimating the number of Twitter bots oversimplifies the problem and misses the point of quantifying the harm of online abuse and manipulation by inauthentic accounts.
Recent evidence suggests that inauthentic accounts may not be the only culprits responsible for the spread of misinformation, hate speech, polarization and radicalization. These issues typically involve many human users. For example, our analysis shows that misinformation about COVID-19 was spread openly on both Twitter and Facebook by verified, high-profile accounts.

Using BotAmp, a new tool in the Botometer family that anyone with a Twitter account can use, we found that the presence of automated activity is not evenly distributed. For example, the discussion about cryptocurrencies shows more bot activity than the discussion about cats. Therefore, whether the overall prevalence is 5% or 20% makes little difference to individual users; their experiences with these accounts depend on whom they follow and the topics they care about.
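A BotAmp-style comparison like the one above amounts to contrasting the bot-score distributions of accounts tweeting about two topics. The sketch below uses invented numbers purely to mirror the crypto-versus-cats example; it is not BotAmp’s actual method or output.

```python
from statistics import mean

# Invented bot scores for accounts sampled from two topic streams.
crypto_scores = [0.8, 0.7, 0.9, 0.3, 0.6]
cat_scores = [0.1, 0.2, 0.1, 0.4, 0.2]

# Which conversation shows more automated activity on average?
print(f"crypto: mean bot score {mean(crypto_scores):.2f}")
print(f"cats:   mean bot score {mean(cat_scores):.2f}")
```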
Even if it were possible to precisely estimate the prevalence of inauthentic accounts, it would do little to solve these problems. A meaningful first step would be to acknowledge their complex nature. This would help social media platforms and policymakers develop meaningful responses.