Agents of W.I.K.I.M.E.D.I.A.

The common user agents of Wikimedia readers and editors

About this data

This data contains the most common user agents for Wikimedia readers and editors. The agents are provided in parsed form, using the open-source ua-parser project.

Details

The intent behind releasing the parsed agents is to make it easier for Wikimedia developers to understand how to best test their software for the group they're targeting.

The data collection and anonymisation process differed between readers and editors. For readers, a 1:1000 sampled log of pageviews in February 2014 was taken. Any user agent that had more than 500 sampled requests (in other words, roughly 500,000 actual requests) in a 24-hour period, from no fewer than 500 sampled (500,000 actual) distinct IP addresses, was extracted, along with a count of how many times the agent appeared. For editors, a 90-day sample (December 2014 to February 2015) of user agents was taken globally; any user agent used by 50 or more distinct users was extracted, along with a count of the edits associated with it.
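The thresholds described above can be sketched roughly as follows. This is a minimal illustration, not the code actually used: the record layouts, the function names, and the reading that an agent qualifies if any single day meets both reader thresholds are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def common_reader_agents(records, min_requests=500, min_ips=500):
    """Filter reader agents from a sampled log.

    records: iterable of (day, user_agent, ip) tuples (hypothetical layout).
    An agent is kept if, on at least one day, it has MORE than min_requests
    requests from AT LEAST min_ips distinct IPs (assumed interpretation).
    Returns {agent: total appearances across the whole sample}.
    """
    requests_per_day = Counter()
    ips_per_day = defaultdict(set)
    totals = Counter()
    for day, agent, ip in records:
        requests_per_day[(day, agent)] += 1
        ips_per_day[(day, agent)].add(ip)
        totals[agent] += 1
    qualifying = {
        agent
        for (day, agent), n in requests_per_day.items()
        if n > min_requests and len(ips_per_day[(day, agent)]) >= min_ips
    }
    return {agent: totals[agent] for agent in qualifying}


def common_editor_agents(edits, min_users=50):
    """Filter editor agents.

    edits: iterable of (user, user_agent) tuples, one per edit (hypothetical).
    An agent is kept if it was used by at least min_users distinct users.
    Returns {agent: number of edits made with it}.
    """
    users = defaultdict(set)
    edit_counts = Counter()
    for user, agent in edits:
        users[agent].add(user)
        edit_counts[agent] += 1
    return {a: edit_counts[a] for a in users if len(users[a]) >= min_users}


# Tiny demonstration with lowered thresholds so the effect is visible.
sample_reads = [
    ("2014-02-01", "AgentA", "10.0.0.1"),
    ("2014-02-01", "AgentA", "10.0.0.2"),
    ("2014-02-01", "AgentA", "10.0.0.3"),
    ("2014-02-02", "AgentA", "10.0.0.1"),
    ("2014-02-01", "AgentB", "10.0.0.9"),
    ("2014-02-01", "AgentB", "10.0.0.9"),
    ("2014-02-01", "AgentB", "10.0.0.9"),
]
reader_result = common_reader_agents(sample_reads, min_requests=2, min_ips=2)

sample_edits = [("alice", "AgentA"), ("bob", "AgentA"), ("alice", "AgentB")]
editor_result = common_editor_agents(sample_edits, min_users=2)
```

In the demonstration, AgentB is dropped from the reader set because all of its requests come from one IP address, which is the kind of agent the distinct-IP threshold is there to exclude.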

For both sets, the agents were then split by 'site used' - whether they were requests to/edits through the desktop or mobile versions of the site - and then parsed using ua-parser. The results of that parsing were themselves aggregated, resulting in the datasets you see here.
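As a rough sketch of the split, parse, and aggregate steps: the example below groups rows by site, parses each agent, and sums the counts for agents that parse to the same result. The row layout, the function names, and in particular `toy_parse` are assumptions for illustration; `toy_parse` is a deliberately crude stand-in and is not ua-parser, which would be passed in as the `parse` argument in real use.

```python
from collections import Counter


def toy_parse(agent):
    """Hypothetical stand-in parser: returns a (browser, os) pair.

    A real parser such as ua-parser recognises far more agents;
    this exists only so the example runs without dependencies.
    """
    if "Firefox/" in agent:
        browser = "Firefox"
    elif "Chrome/" in agent:
        browser = "Chrome"
    else:
        browser = "Other"
    if "Android" in agent:
        os_name = "Android"
    elif "Windows" in agent:
        os_name = "Windows"
    else:
        os_name = "Other"
    return browser, os_name


def aggregate_parsed(rows, parse=toy_parse):
    """Sum counts per (site, browser, os).

    rows: iterable of (site, user_agent, count), where site is
    "desktop" or "mobile" (hypothetical layout).
    """
    totals = Counter()
    for site, agent, count in rows:
        browser, os_name = parse(agent)
        totals[(site, browser, os_name)] += count
    return dict(totals)


rows = [
    ("desktop", "Mozilla/5.0 (Windows NT 6.1; rv:35.0) Gecko/20100101 Firefox/35.0", 120),
    ("desktop", "Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0", 80),
    ("mobile", "Mozilla/5.0 (Linux; Android 4.4.2) AppleWebKit/537.36 Chrome/40.0 Mobile Safari/537.36", 50),
]
aggregated = aggregate_parsed(rows)
```

Aggregating after parsing means that distinct raw agent strings (here, two Firefox builds on Windows) collapse into a single row.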

Reusing this data

The data is released into the public domain under the CC0 public domain dedication, and can be freely reused by all and sundry. If you decide you want to credit it to people, though, the appropriate citation is:

Keyes, Oliver (2015) Browser Choices of Wikimedia Readers and Editors

Agents of W.I.K.I.M.E.D.I.A. is built using Shiny and has source code on GitHub.