
I want to build a system where I type a search term into the terminal of a Raspberry Pi and the Pi responds with voice output.

I've solved the text-to-speech conversion problem using pico TTS. What I want to do now is fetch the Wikipedia page for the search term and store the first paragraph of that page in a text file.

For example, searching for Tiger on Simple English Wikipedia should produce a text file containing:

The tiger (Panthera tigris) is a carnivorous mammal. It is the largest living member of the cat family, the Felidae. It lives in Asia, mainly India, Bhutan, China and Siberia.

I tried using this but it didn't seem to work.

Error message for

$ pip install wikipedia
...
Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build-qdTIZY/wikipedia/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-9CPD6D-record/install-record.txt --single-version-externally-managed --compile
failed with error code 1 in /tmp/pip-build-qdTIZY/wikipedia
Storing debug log for failure in /home/pi/.pip/pip.log
Souvik Saha

1 Answer


this seems to work:

title=Tiger
n_sentences=2
# The URL must be quoted: an unquoted & makes the shell background curl and cut the URL short.
curl -s "http://simple.wikipedia.org/w/api.php?action=query&prop=extracts&titles=$title&exsentences=$n_sentences&explaintext=&format=json" |
  sed 's/.*"extract":"\|"}}}}$//g'

it correctly yields:

The tiger (Panthera tigris) is a carnivorous mammal. It is the largest living member of the cat family, the Felidae.

Also tested with title=Albert_Einstein:

Albert Einstein (14 March 1879 \u2013 18 April 1955) was a German-born theoretical physicist who developed the general theory of relativity, one of the two pillars of modern physics (alongside quantum mechanics).\nHe received the Nobel Prize in Physics in 1921, but not for relativity.

(Note that title="Albert Einstein", title=albert_einstein, and title=albert%20einstein all don't work, so you'll eventually want another command to find the best matching real simple.wikipedia article title.)
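As a rough sketch of that extra step: spaces in a title can be turned into underscores before the request, and a search query can be sent to the MediaWiki `opensearch` action to look up a likely article title. The function names `normalize_title` and `search_url` are made up for illustration, and whether the first search hit is really the article you want is an assumption you would need to check.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not part of any existing tool.

# Replace spaces with underscores, e.g. "Albert Einstein" -> "Albert_Einstein".
# This does not fix capitalisation, so "albert einstein" still needs a search.
normalize_title() {
  printf '%s' "$1" | tr ' ' '_'
}

# Build an opensearch URL that asks for the single best-matching title.
# Spaces are percent-encoded; other special characters are not handled here.
search_url() {
  query=$(printf '%s' "$1" | sed 's/ /%20/g')
  printf 'http://simple.wikipedia.org/w/api.php?action=opensearch&search=%s&limit=1&format=json' "$query"
}
```

For example, `curl -s "$(search_url 'albert einstein')"` should return a small JSON array whose second element lists the matching titles, which can then be passed to the extract request.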

the curl command makes an http request to simple.wikipedia.org. to see this in action, try this:

curl "http://simple.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Tiger&exsentences=2&explaintext=&format=json"

the sed command then extracts the desired part of the response.
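The two steps (the curl request and the sed extraction) can be wrapped in a small script that also writes the result to a text file, which is what the question asks for. This is only a sketch: the function names and the `<title>.txt` output file name are assumptions, the sed expression is split into two substitutions so it does not rely on `\|`, and `-L` is added in case the server redirects http to https.

```shell
#!/bin/sh
# Sketch: save the opening sentences of a Simple English Wikipedia article
# to a text file. All function and file names here are hypothetical.

# Build the API URL for a given title ($1) and sentence count ($2).
build_url() {
  printf 'http://simple.wikipedia.org/w/api.php?action=query&prop=extracts&titles=%s&exsentences=%s&explaintext=&format=json' "$1" "$2"
}

# Read the JSON response on stdin and print only the "extract" value:
# drop everything up to and including "extract":" and the closing "}}}}.
extract_text() {
  sed 's/.*"extract":"//; s/"}}}}$//'
}

# Fetch the article and write the extract to <title>.txt.
fetch_intro() {
  curl -sL "$(build_url "$1" "$2")" | extract_text > "$1.txt"
}
```

After sourcing the file (or pasting the functions into a script), `fetch_intro Tiger 2` should leave the text in `Tiger.txt`, ready to hand to the text-to-speech step. JSON escapes such as `\n` and `\u2013` are left as they are.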

updated to increase the chance of this working with the Raspberry Pi's curl and sed: changed https to http and rewrote the sed command without -e.

ref:

MediaWiki API?

webb
  • Can you please tell me how to use this exactly? I tried running it as a bash script and it doesn't seem to be giving an output @webb – Souvik Saha Jun 11 '16 at 12:09