Forget Russian Bots: Fake Native Americans Are Using Russian Characters To Avoid Fake News And Plagiarism Detectors

Fact Check

  • by: Maarten Schenk

STORY UPDATED: check for updates below.

Compare the following two headlines:

Thе rеаsоn аrmy hеlіcоptеrs аrе nаmеd аftеr nаtіvе trіbеs wіll mаkе yоu smіlе
The reason Army helicopters are named after native tribes will make you smile

Notice anything different about them? OK, the "A" in "army" is capitalized in one and not in the other, but other than that they seem identical, no? That 'a' vs. 'A' shouldn't matter to Google if you copy and paste these titles into the search field and run a search, right?

Let's try it out, here is the first headline:

google_russian_characters.jpg

OK, so apparently nobody has written anything with that headline in it yet, at least not to Google's knowledge. Nothing strange going on: there are millions of hypothetical headlines that won't return a single Google search result.

Let's try the second one by simply typing it into Google (including the lowercase 'a' so we can properly compare results):

google_latin_characters.jpg

Wait, what? 489,000 results? And why wasn't Google able to find that article on wearethemighty.com only seconds ago? What changed?

On Google's side, nothing did. But the first headline is not as identical to the second one as you might think. Most of the vowels (a, e, i, o) from the Latin alphabet in it have been replaced with their visual equivalents in the Cyrillic alphabet (а, е, і, о). In most fonts there is no visual difference between these characters (i.e. they are homoglyphs), but to a computer (or Google...) they are entirely different characters with different code points. Cyrillic characters are mainly used in Eastern European countries like Russia and Ukraine but also in places like Macedonia or Kosovo.
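To make the difference concrete, here is a minimal Python sketch that prints the code point and Unicode name of each Latin vowel next to its Cyrillic look-alike. The helper names `describe` and `cyrillic_positions` are hypothetical and chosen for illustration; the Cyrillic letters are written as escape sequences so they cannot be confused with the Latin ones in the source.

```python
import unicodedata
from typing import List

# Latin vowels paired with their Cyrillic look-alikes. The Cyrillic letters
# are written as \u escapes so the difference is visible in the source.
PAIRS = [("a", "\u0430"), ("e", "\u0435"), ("i", "\u0456"), ("o", "\u043e")]


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def cyrillic_positions(text: str) -> List[int]:
    """Return the indexes of all characters whose Unicode name starts with CYRILLIC."""
    return [
        i
        for i, ch in enumerate(text)
        if unicodedata.name(ch, "").startswith("CYRILLIC")
    ]


if __name__ == "__main__":
    for latin, cyrillic in PAIRS:
        # The two characters render alike but never compare equal.
        print(f"{describe(latin):30} | {describe(cyrillic)} | equal: {latin == cyrillic}")

    # "The" with a Cyrillic "e" in the last position (index 2).
    print(cyrillic_positions("Th\u0435"))
```

The first row pairs U+0061 (Latin small letter a) with U+0430 (Cyrillic small letter a). A search engine that matches on code points treats a word containing U+0430 as a different word.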

Check what happens when we paste both headlines into a document on Google Docs and switch the font to "Syncopate":

googledocs.jpg

Not only do the letters look different now, but because Google doesn't recognize these weird mixed strings of Latin and Cyrillic characters as words anymore, the built-in spellchecker goes crazy with the red wavy underlines (seemingly for no reason as long as you are using one of the standard fonts like Arial or Times New Roman).
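The same signal the spellchecker stumbles over, words that mix two alphabets, can be checked for directly. The following is a rough Python sketch, not a description of how Google Docs works: it assumes that the prefix of a character's Unicode name ("LATIN" or "CYRILLIC") is a good enough stand-in for its script, and the function names `script_of` and `mixed_script_words` are hypothetical.

```python
import unicodedata


def script_of(ch):
    """Classify a character as 'Latin', 'Cyrillic' or None (anything else).

    Assumption: the prefix of the Unicode character name is used as a
    simple approximation of the character's script.
    """
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "Latin"
    if name.startswith("CYRILLIC"):
        return "Cyrillic"
    return None


def mixed_script_words(text):
    """Return the whitespace-separated words that contain both Latin and Cyrillic letters."""
    flagged = []
    for word in text.split():
        scripts = {script_of(ch) for ch in word} - {None}
        if len(scripts) > 1:
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    # Start of the obfuscated headline, with the Cyrillic letters as escapes.
    obfuscated = "Th\u0435 r\u0435\u0430s\u043en \u0430rmy"
    print(mixed_script_words(obfuscated))
    print(mixed_script_words("The reason army"))
```

A word made up entirely of Cyrillic look-alikes would not be flagged by this check, so it is only a first filter.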

So why would anyone want to do this? Doesn't using headlines like these make it harder or even impossible to be found in Google? Compare these two posts we found on Facebook:

wearethemighty_latin_characters.jpg

nativeallnews_russian_characters.jpg

The first one links to the original article on We Are The Mighty, published in December 2017 (archived here); the second one points to a plagiarized copy on Native Animals, which was published in March of 2018 (archived here). Crucially, the second headline has the Cyrillic characters in it (although you couldn't tell by just looking at it).

"We Are The Mighty" is a website for the military community run by David M. Gale from Los Angeles. "Native Animals" on the other hand appears to be a site about Native American issues run out of... Kosovo by Mirsad Rexhepi (archived WHOIS data here).

The site is part of a growing list of fake Native American pages run out of places like Macedonia, Kosovo or Vietnam. Running such pages is seen as an easy way to make a little money with advertising. They get most of their traffic by reposting content stolen from elsewhere. Craig Silverman of BuzzFeed and Alex Kaplan of Media Matters have already written about this phenomenon if you want to learn more about it.

But why would a site go through the trouble of obfuscating its headline and making it harder to find in Google? Don't they want the traffic? We don't think so: these sites tend to get most of their visitors from Facebook and other social networks, and their stolen content will rank much lower than the originals in Google anyway. But being visible in Google carries a risk for them: the original copyright owner might spot their articles being plagiarized, leading to angry emails and DMCA requests to advertisers, Facebook's abuse department and website hosting providers.

Also: fact checkers like ourselves might be looking for new sites copying articles from known fake news websites by plugging the headlines into tools like our own Trendolizer or BuzzSumo in order to add these sites to various fake news watchlists. (Note: in this case we haven't fact checked the article, so we are not claiming it is true or false here.) Plugging the original headline into BuzzSumo reveals several copycats:

buzzsumo_latin_characters.jpg

When operating a fake news or plagiarism website, you would much prefer it if copying and pasting your headline returned zero results, like this:

buzzsumoe_russian_characters.jpg

It keeps those pesky fact checkers and copyright lawyers off your back. At least, that was the theory. As you can see, we found out eventually...
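One way to blunt this trick is to fold the look-alike characters back to Latin before plugging a headline into a search tool. The Python sketch below is only an illustration under stated assumptions: the mapping covers just the four vowels mentioned in this article (a real tool would need a much larger table), and the names `CYRILLIC_TO_LATIN`, `fold_homoglyphs` and `same_headline` are hypothetical.

```python
# Assumption: only the four Cyrillic vowels discussed in this article are
# mapped. A real homoglyph table would contain many more characters.
CYRILLIC_TO_LATIN = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
})


def fold_homoglyphs(headline):
    """Replace the mapped Cyrillic look-alikes with their Latin counterparts."""
    return headline.translate(CYRILLIC_TO_LATIN)


def same_headline(first, second):
    """Compare two headlines after folding homoglyphs and ignoring case."""
    return fold_homoglyphs(first).casefold() == fold_homoglyphs(second).casefold()


if __name__ == "__main__":
    # Start of the obfuscated headline, with the Cyrillic letters as escapes.
    obfuscated = "Th\u0435 r\u0435\u0430s\u043en \u0430rmy"
    print(fold_homoglyphs(obfuscated))
    print(same_headline(obfuscated, "The reason Army"))
```

Searching for the folded headline instead of the pasted one would surface the original article and its other copies again.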

Updates:

  • 2018-03-09T14:03:14Z
    A previous version of this article erroneously mentioned Albania as the location in Mirsad Rexhepi's registration of the nativeallnews.com domain name.

  Maarten Schenk

Lead Stories co-founder Maarten Schenk is our resident expert on fake news and hoax websites. He likes to go beyond just debunking trending fake news stories and is endlessly fascinated by the dazzling variety of psychological and technical tricks used by the people and networks who intentionally spread made-up things on the internet.  He can often be found at conferences and events about fake news, disinformation and fact checking when he is not in his office in Belgium monitoring and tracking the latest fake article to go viral.
