Following a number of highly publicized leaks of citizens’ sensitive personal information, Ukrainians are slowly waking up to the importance of data protection. Yet, with the local elections approaching on October 25, a trove of voters’ personal data already available online creates ripe conditions for targeted dissemination of disinformation or any other malicious content by both domestic and foreign political actors.
In May 2020, Ukrainian activist Volodymyr Flonts alerted the public to an anonymous chatbot, @UA_baza, on the popular messaging app Telegram that was selling personal data of Ukrainian citizens. Having appeared only a few months earlier, the bot claimed to have aggregated 900 GB of records including passport numbers, personal identification codes, declared places of residence, driver’s licenses, social media passwords, and even bank details of millions of Ukrainians. It offered access to five entries and the sale of the whole database for 500 USD. The list of available data makes it clear that it could not have been accumulated solely from sources openly available online. Where, then, could this data have come from?
Ukraine has a large number of official and unofficial databases containing citizens’ personal data. Among them are state registries compiled and administered by various agencies for the purpose of providing public services, consumer databases, and other datasets of a likely commercial nature whose origin is difficult to pinpoint. For example, the State Voter Registry, maintained by a special body within the Central Election Commission (CEC), is one of the major state databases containing sensitive information on millions of citizens. Access to the Registry is regulated so strictly that one 2019 presidential candidate famously complained that it would take him 6,000 years to properly scrutinize it for irregularities. However, despite apparently stringent security procedures, Ukrainian hacktivists have previously pointed out vulnerabilities in the Registry’s website, while CEC members have publicly admitted to a lack of qualified IT and cyber security personnel among civil servants due to a large pay gap between the public and commercial sectors.
In turn, commercial entities such as telecommunication companies, online retailers, banks, and logistics operators maintain their own datasets of customer data. Additionally, consumer data is pooled and shared within several nationwide loyalty programs, some of which include a network of over 90 online stores and millions of clients across Ukraine. Smaller businesses typically operate their own client databases. While large companies usually give third parties access to their data only for advertising purposes, it is much more common for smaller businesses to sell their client databases online. And while the Law on Protection of Personal Data prohibits selling consumer data without the informed consent of its subjects, Ukraine lacks effective regulations and mechanisms to investigate all such instances or hold violators accountable.
The sheer volume and nature of the data made available for purchase through the @UA_baza chatbot prompted a public outcry and an official investigation. The bot itself promptly went offline, although it is unclear whether it was removed by Telegram or deleted by its creators. However, several accounts with a similar name later reappeared on Telegram. A journalistic inquiry into the incident suggested that the leaked dataset combined data from government registries, including older versions of the State Voter Registry and the Unified Demographic Registry, with data from commercial databases and social media.
Although no single source had ever aggregated so much data before, this was not the first time Ukrainian citizens’ personal data had been leaked online. In 2018, what seemed to be a database of 18 million clients of the country’s largest logistics company, “Nova poshta”, was leaked. In 2019, law enforcement apprehended someone selling a database of Ukraine’s Customs Service. And in June 2020, journalists confirmed the leak of a client database from PrivatBank, one of the largest banks in Ukraine. Before the appearance of the infamous Telegram bot, such large datasets were sold online on obscure bulletin boards, while a simple internet search would reveal smaller data traders offering to compile custom databases containing full names, telephone numbers, gender, and email addresses upon request.
As the October 25 local elections approach, the online trade in Ukrainians’ personal data continues. Just recently, a Telegram account mimicking the name of the disabled @UA_baza bot announced it was offering voter databases for sale. And while this account appears to be fraudulent, it is followed by over 16,000 users. Other Telegram bots trading smaller subsets of personal data have been in operation since at least 2018. One such bot (see screenshot number 2) ties a telephone number to a name and searches for other associated pieces of data, such as an email address, a photo, social media accounts, a registered business, or a car plate number. Individual requests are fulfilled for free or in exchange for telephone numbers from a user’s phone book, which encourages users to submit other people’s data without their knowledge or consent, while a larger dataset is sold for a modest payment of 50 USD. The creators of the bot remain anonymous and claim to have amassed their datasets from “open sources”, such as job seeker websites. However, some users have recognized data they previously provided to private entities, indicating that the bot may also be using leaked consumer databases.
And although the chatbot selling the biggest dataset has gone offline and the initial public outrage has subsided, the volume of personal data still available online allows a vast number of citizens to be targeted with potentially malicious content. For instance, a recent incident in the town of Novi Sanzhary, where unrest connected with the arrival of Ukrainian citizens from the Coronavirus-struck Chinese city of Wuhan was instigated via Viber groups and Instagram and Facebook posts, demonstrated how unauthorized access to thousands of telephone numbers can be used to disseminate disinformation, provoke mass panic, and incite unrest in a particular territory using popular messenger applications and social media. With many of the leaked datasets containing sensitive personal and commercial information, it is difficult to predict where they will resurface next. However, it is clear that such data could easily be exploited by both domestic and external actors in a variety of contexts, including politics.