I want to count the entities (article topics) and categories in a wiki dump for a particular language, say English. The official documentation is hard to find and hard to follow for a beginner. What I have understood so far is that I can download an XML dump (though I don't know which of the many available files I need) and then parse it somehow to count the articles and categories.
If this information is available, I have not been able to find it. Please help with some instructions on how to work with the dumps, or point me to resources where I can learn about them.
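To make the question concrete, here is a rough sketch of the parsing step as I currently imagine it. It is only a guess under several assumptions: that the dump follows the MediaWiki XML export layout with `<page>` elements containing an `<ns>` child, that namespace `0` means articles and `14` means categories, and that redirects carry a `<redirect>` element and should not be counted as articles. The function name `count_pages` and the tiny in-memory sample are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Drop the '{namespace-uri}' prefix that ElementTree adds to tags."""
    return tag.rsplit("}", 1)[-1]


def count_pages(stream):
    """Stream through a dump and count articles and category pages.

    Assumption: ns "0" = article, ns "14" = category, and a page with a
    <redirect> child is a redirect rather than a real article.
    """
    counts = Counter()
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event != "end" or local_name(elem.tag) != "page":
            continue
        ns = None
        is_redirect = False
        for child in elem:
            name = local_name(child.tag)
            if name == "ns":
                ns = (child.text or "").strip()
            elif name == "redirect":
                is_redirect = True
        if ns == "0" and not is_redirect:
            counts["articles"] += 1
        elif ns == "14":
            counts["categories"] += 1
        root.clear()  # discard finished pages so memory use stays flat
    return counts


# Tiny hand-written sample standing in for a real dump (hypothetical data).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
<page><title>Apple</title><ns>0</ns><id>1</id></page>
<page><title>Fruit</title><ns>0</ns><id>2</id><redirect title="Apple" /></page>
<page><title>Category:Fruits</title><ns>14</ns><id>3</id></page>
<page><title>Talk:Apple</title><ns>1</ns><id>4</id></page>
</mediawiki>"""

if __name__ == "__main__":
    result = count_pages(io.BytesIO(SAMPLE))
    print(result["articles"], result["categories"])  # prints: 1 1
```

For a real dump I assume the compressed file could be opened with `bz2.open(path, "rb")` and passed to `count_pages` directly, for example with something like `enwiki-latest-pages-articles.xml.bz2` if that is the right file. Whether that file actually contains the category pages is one of the things I am unsure about.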
Thanks!