I want to get the number of Wikipedia revision edits for many entities, for example the number of deletions or additions made in each revision.

Example link:

https://en.wikipedia.org/w/index.php?title=Help%3APage_history&type=revision&diff=%22+696846306+%22&oldid=%22+688747750

The link above shows the difference between two revisions. But each entity has many revisions, from early on (e.g. 2006) until now: around 10,000 revision IDs per entity. Retrieving the difference pages one by one would be very slow. Is there a more efficient way to do this?

PySerial Killer
Cocoa3338
  • Have you looked at the wikipedia api? – pvg Sep 04 '17 at 22:37
  • You could fetch a [Wikimedia database dump](https://dumps.wikimedia.org/) and do all the processing offline. – ephemient Sep 04 '17 at 22:50
  • Possible duplicate of [How to get full Wikipedia revision-history list from some article?](https://stackoverflow.com/questions/34411896/how-to-get-full-wikipedia-revision-history-list-from-some-article) – chickity china chinese chicken Sep 05 '17 at 03:33
  • @pvg I already had a look, but found nothing. – Cocoa3338 Sep 05 '17 at 06:39
  • @ephemient good idea! I will have a look when I go home. – Cocoa3338 Sep 05 '17 at 06:41
  • @downshift Hi, I already have the Wikipedia revision history, but it's not the final result I want. I need to compare the old and new revision pages and count how many deletions or additions were made. I can get this result, but it's very slow. And if I made it faster, I think Wikipedia wouldn't want me to do that either. – Cocoa3338 Sep 05 '17 at 06:44
  • Possible duplicate of [API for getting edits on Wikipedia](http://stackoverflow.com/questions/40313820/api-for-getting-edits-on-wikipedia) – chickity china chinese chicken Sep 05 '17 at 16:46
  • I found my own way to make it a little faster. Thanks. – Cocoa3338 Sep 07 '17 at 19:11
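
One way to follow the API suggestion from the comments, as a rough sketch rather than a confirmed solution: fetch revision texts in batches through the MediaWiki Action API (`action=query`, `prop=revisions`) and count added and deleted lines locally with Python's `difflib`, instead of loading one diff page per revision pair. The function names, the User-Agent string, and the choice to count whole lines are assumptions made for illustration, and the request parameters should be checked against the current API documentation before relying on them.

```python
import difflib
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def count_line_changes(old_text, new_text):
    """Return (added, deleted) line counts between two revision texts."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = deleted = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return added, deleted


def changes_per_revision(revisions):
    """Yield (revid, added, deleted) for each revision after the first.

    `revisions` is an iterable of (revid, text) pairs, oldest first.
    """
    previous_text = None
    for revid, text in revisions:
        if previous_text is not None:
            added, deleted = count_line_changes(previous_text, text)
            yield revid, added, deleted
        previous_text = text


def fetch_revisions(title, batch_size=50):
    """Yield (revid, text) pairs for a page, oldest first (hypothetical helper).

    Assumes the API returns revision text under slots -> main -> content
    when rvslots=main and formatversion=2 are requested.
    """
    params = {
        "action": "query", "format": "json", "formatversion": "2",
        "prop": "revisions", "titles": title,
        "rvprop": "ids|timestamp|content", "rvslots": "main",
        "rvlimit": str(batch_size), "rvdir": "newer",
    }
    while True:
        url = API_URL + "?" + urllib.parse.urlencode(params)
        # Placeholder User-Agent; replace with one that identifies your tool.
        request = urllib.request.Request(
            url, headers={"User-Agent": "revision-diff-sketch/0.1 (example)"}
        )
        with urllib.request.urlopen(request) as response:
            data = json.load(response)
        for rev in data["query"]["pages"][0].get("revisions", []):
            # Hidden or suppressed revisions may have no content field.
            text = rev.get("slots", {}).get("main", {}).get("content", "")
            yield rev["revid"], text
        if "continue" not in data:
            break
        params.update(data["continue"])
```

Calling `list(changes_per_revision(fetch_revisions("Help:Page history")))` would then give one `(revid, added, deleted)` tuple per revision. For thousands of revisions across many entities, the database dumps mentioned in the comments are probably kinder to Wikipedia's servers; `changes_per_revision` works unchanged on any source of `(revid, text)` pairs.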

0 Answers