Wikipedia access traces

This directory contains a trace of 10% of all user requests issued to Wikipedia (in all languages) between September 19th, 2007 and January 2nd, 2008.

If you plan to use this trace, please send me a note and cite the trace as follows in your articles:

@Article{,
  author =  {Urdaneta, Guido and Pierre, Guillaume and van Steen, Maarten},
  title =   {Wikipedia Workload Analysis for Decentralized Hosting},
  volume =  {53},
  number =  {11},
  pages =   {1830--1845},
  month =   {July},
  year =    {2009},
  journal = {Elsevier Computer Networks},
  note =    {\url{http://www.globule.org/publi/WWADH_comnet2009.html}}
}

We publish this data with authorization from the Wikimedia Foundation in the hope that it will be useful to the scientific community. Special thanks to them for making this trace available to us and allowing us to publish it! In particular, we owe a great deal to Gerard Meijssen and Tim Starling, without whom none of this would have been possible.
