Many of you are aware that certain pages and topics are more controversial on Wikipedia than others. Politics certainly stirs up edit wars much more often than mathematics, and articles related to current events are often locked while the events are ongoing, to prevent defacement. But can we quantify the topics that are most controversial on Wikipedia?
Over on the arXiv there's a draft of a forthcoming book chapter entitled "The most controversial topics in Wikipedia: A multilingual and geographical analysis" which aims to answer this question. The researchers looked at several language editions of Wikipedia to see whether controversy has anything in common across them. Using a metric based on "reverts" (when an editor completely undoes the work of another editor), they measured an article's "controversiality."
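To make the idea of a revert concrete, here is a minimal sketch in Python that counts revisions restoring the exact text of an earlier version. This is not the authors' controversiality metric, which the chapter itself defines. The `count_reverts` function, the `(editor, text)` input format, and the sample history are all assumptions made up for illustration.

```python
import hashlib


def count_reverts(revisions):
    """Count revisions that restore the exact text of an earlier revision.

    `revisions` is a chronological list of (editor, text) pairs.
    This input format is hypothetical, chosen only for illustration.
    """
    seen = set()      # hashes of every version seen so far
    previous = None   # hash of the immediately preceding version
    reverts = 0
    for _editor, text in revisions:
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        # A revert brings back an earlier version, so its hash was seen before.
        # Skip null edits, where the text matches the preceding revision.
        if digest in seen and digest != previous:
            reverts += 1
        seen.add(digest)
        previous = digest
    return reverts


# Made-up edit history: two editors undoing each other.
history = [
    ("alice", "The sky is blue."),
    ("bob", "The sky is green."),
    ("alice", "The sky is blue."),   # undoes bob's edit
    ("bob", "The sky is green."),    # undoes alice's revert
    ("carol", "The sky is blue, usually."),
]
print(count_reverts(history))  # 2
```

A raw count like this only says how often work gets undone. A controversy measure would presumably also need to account for who is reverting whom, which this sketch ignores.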
Below is a word cloud derived from the titles of the 1,000 most controversial articles:
And here's a table showing the most controversial articles in each of the language editions:
The authors also examined the geography of controversy for each language edition (the locations of the controversial topics) as well as the general topics of the controversial articles (politics tops the list). And if you want to see more about the topics, the authors have even constructed an interactive tool to see the differences across languages. Check it out!
The lead author, Taha Yasseri, also wrote a bit on his blog about the article, with the following conclusions (or as he wrote, the answer to "So what?"). Number two is relevant for many aspects of social science research:
Check out the original paper here.
Thanks to Paul Kedrosky