The Curious Case of BNN

spoiler: no evidence of an “extensive network of reporters across the globe” was found

BNN (the “Breaking News Network”, a news website operated by tech entrepreneur and convicted domestic abuser Gurbaksh Chahal) allegedly offers independent news coverage from an extensive worldwide network of on-the-ground reporters. As is often the case, things are not as they seem. A few minutes of perfunctory Googling reveals that much of BNN’s “coverage” appears to be mildly reworded articles copied from mainstream news sites. For science, here’s a simple technique for algorithmically detecting this form of copying.

many of BNN’s “articles” are weirdly similar to articles published by mainstream media outlets

Many of the articles on the BNN website appear to have been created by copying and rewording individual paragraphs from articles published on major media sites. Much of the original language is preserved, however, and this can be detected in a variety of ways. One (extremely simple but reasonably effective) algorithm is as follows:

1. Split each article (both the BNN articles and the articles being compared to) into paragraphs, convert to lowercase, and eliminate all punctuation so that each paragraph becomes a list of consecutive words.
2. Convert each paragraph into the set of trigrams (sequences of three consecutive words) it contains. (N-grams of other lengths work as well.)
3. To compare two articles, compare the set of trigrams in every paragraph in the first article with the set of trigrams in every paragraph in the second, calculating the score for that pair of paragraphs as the percentage of trigrams in the first…
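For the curious, the steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the code actually used for the analysis. The function names (`paragraphs`, `ngrams`, `pair_score`, `compare_articles`) are made up for this example, paragraphs are assumed to be separated by blank lines, and because the description of the scoring step is cut off, the sketch assumes the score is the percentage of trigrams in the first paragraph that also appear in the second.

```python
import re


def paragraphs(article: str) -> list[list[str]]:
    """Split an article into paragraphs, each a list of lowercase words.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    result = []
    for block in re.split(r"\n\s*\n", article):
        # Lowercase, then drop every character that is neither a word
        # character nor whitespace (i.e. strip punctuation).
        words = re.sub(r"[^\w\s]", "", block.lower()).split()
        if words:
            result.append(words)
    return result


def ngrams(words: list[str], n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of n-grams (trigrams by default) in a word list."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def pair_score(first: set, second: set) -> float:
    """Percentage of n-grams in `first` that also occur in `second`.

    Assumption: this is one plausible reading of the scoring step,
    which is truncated in the text above.
    """
    if not first:
        return 0.0
    return 100.0 * len(first & second) / len(first)


def compare_articles(article_a: str, article_b: str, n: int = 3) -> list[list[float]]:
    """Score every paragraph of article_a against every paragraph of article_b.

    Returns a matrix where entry [i][j] is the score for paragraph i of
    article_a compared with paragraph j of article_b.
    """
    grams_a = [ngrams(p, n) for p in paragraphs(article_a)]
    grams_b = [ngrams(p, n) for p in paragraphs(article_b)]
    return [[pair_score(a, b) for b in grams_b] for a in grams_a]
```

Calling `compare_articles(suspect_text, original_text)` gives one row of scores per paragraph in the suspect article. How those per-paragraph scores are combined into a verdict about the whole article is left open here, since that part of the description is not shown.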
