Using PHP, how do I get the first paragraph of a Wikipedia article with the MediaWiki API?

Question

How do I use PHP to get the first paragraph of any article from Wikipedia through their MediaWiki API?

I am open to all suggestions. Most likely cURL or XML parsing will come in handy.

What makes you think this is trivially possible? As far as I'm aware, there's nothing in the API about first paragraphs... — lonesomeday, Feb 21 '12 at 16:31
The problem you have isn't an issue with Wikipedia, but working with the result you get back. You should create a new question with the example page text/data, asking how to parse out just the first paragraph. — Brad, Feb 21 '12 at 16:40

score 2 · Answer 1 · answered Jun 29 '13 at 11:22

You can use the API like so:

http://en.wikipedia.org/w/api.php?action=parse&page=Stack_overflow&format=xml&prop=text&section=0

This returns an XML document with the following structure:

<?xml version="1.0"?>
<api>
  <parse title="Article Title">
    <text xml:space="preserve">Text you wanted goes here</text>
  </parse>
</api>

Note the parameters: page=Article_Title_Goes_Here, format=xml, prop=text, and section=0 (the lead section, before the first heading).
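As a rough PHP sketch of the request above: the helper names buildParseUrl and extractSectionHtml are made up for illustration, and the https scheme and the User-Agent string in the usage comment are assumptions rather than part of the original answer. It builds the same query string and pulls the rendered HTML out of the XML response with SimpleXML.

```php
<?php
// Hypothetical helper: build the action=parse URL for a page title.
function buildParseUrl(string $title): string
{
    $query = http_build_query([
        'action'  => 'parse',
        'page'    => $title,
        'format'  => 'xml',
        'prop'    => 'text',
        'section' => 0, // lead section only
    ], '', '&');

    return 'https://en.wikipedia.org/w/api.php?' . $query;
}

// Hypothetical helper: return the HTML held in <api><parse><text>,
// or null if the response does not have that shape.
function extractSectionHtml(string $xml): ?string
{
    $doc = simplexml_load_string($xml);
    if ($doc === false || !isset($doc->parse->text)) {
        return null;
    }

    return (string) $doc->parse->text;
}

// Usage (needs network access and allow_url_fopen; the User-Agent value is a placeholder):
// $context = stream_context_create(['http' => ['user_agent' => 'ExampleScript/0.1']]);
// $xml     = file_get_contents(buildParseUrl('Stack_overflow'), false, $context);
// $html    = $xml === false ? null : extractSectionHtml($xml);
```

The returned string is still HTML for the whole lead section, so it has to be parsed further to isolate a single paragraph.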

Yotam Omer

Is there a way to skip all the extra content and just get the first intro paragraph of the page? I seem to be getting the image, the right-side tabular details, etc. – Harsha M V Jul 01 '14 at 19:30

score -3 · Answer 2 · answered Feb 21 '12 at 17:28

I would use file_get_contents('http://wikipedia.com/'.$rest_of_url)

Then just use string parsing to select everything from the first `<p>` to the following `</p>`.
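A minimal sketch of that string-parsing idea, assuming the fetched HTML is already in a string: the function name firstParagraph is hypothetical, and the strip_tags call is an optional extra that is not part of the original suggestion. It only matches a bare `<p>` tag, so a paragraph opened with attributes (for example `<p class="...">`) is skipped.

```php
<?php
// Hypothetical helper: return the text between the first "<p>" and the
// next "</p>", with inner markup removed, or null if either tag is missing.
function firstParagraph(string $html): ?string
{
    $start = strpos($html, '<p>');
    if ($start === false) {
        return null;
    }
    $start += strlen('<p>');

    $end = strpos($html, '</p>', $start);
    if ($end === false) {
        return null;
    }

    // substr() selects the paragraph body; strip_tags() drops links, bold, etc.
    return trim(strip_tags(substr($html, $start, $end - $start)));
}

// Usage with a small made-up fragment:
// echo firstParagraph('<table>infobox</table><p><b>Foo</b> is a term.</p><p>More</p>');
```

Plain string searching like this is fragile against real page markup; an HTML parser such as PHP's DOMDocument would be a sturdier choice if the structure varies.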

cyrusv

It escaped: use string parsing to select everything between the first `<p>` and `</p>` using `substr`. – cyrusv Feb 21 '12 at 17:29
