Regular expression to extract data from news page

Question

Hi, I'm using a Python regular expression to extract some data from news pages. However, the output contains brackets and apostrophes. For example, this is my code:

from re import findall
description_title = findall(r'<item>\s*<title[^>]*>(.*?)</title>\s*<description>', html_source)[:1]
news_file.write('<h3 align="Center">' + str(description_title) + ": " + '</h3>\n')

This code produces output such as ['Technology']: and ['Finance']:, but I want Technology: and Finance: without the [''] around the text.

Possible Duplicate of http://stackoverflow.com/questions/11178061/print-list-without-brackets-in-a-single-row — Bhargav Rao, Oct 08 '16 at 13:20

score 1 · Answer 1 · answered Oct 08 '16 at 13:20

description_title is a list of length 1, so calling str on it gives you the string representation of the whole list, brackets and quotes included. Drop the str and index the single element instead:

'<h3 align="Center">' + description_title[0] + ": " + '</h3>\n'
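
To make the difference concrete, here is a minimal sketch. The html_source string below is made up for illustration and stands in for the real page. The empty-list guard is an addition of mine, since indexing [0] raises IndexError when the pattern matches nothing.

```python
import re

# Hypothetical RSS-like snippet standing in for the real html_source.
html_source = """
<item>
  <title>Technology</title>
  <description>Latest gadgets</description>
</item>
"""

pattern = r'<item>\s*<title[^>]*>(.*?)</title>\s*<description>'
matches = re.findall(pattern, html_source)[:1]  # a list with at most one string

# str() on the list keeps the brackets and quotes.
as_list_text = str(matches)

# Indexing gives the plain string; fall back to "" when nothing matched.
first_title = matches[0] if matches else ""

heading = '<h3 align="Center">' + first_title + ": " + '</h3>\n'
print(as_list_text)
print(heading)
```

If you would rather skip the heading entirely when nothing matches, test `if matches:` before writing to the file.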

answered Oct 08 '16 at 13:20

wildwilhelm

In fact, `list` does not define its own `__str__`, so in this particular case `str` falls back to `repr`, which returns the string representation of a Python `list`. – Laurent LAPORTE Oct 08 '16 at 13:26
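
A short sketch of that point, using made-up title lists. It also shows `str.join` as one way to get comma-separated text when there are several titles rather than one.

```python
titles = ['Technology']

# For a list, str() and repr() give the same text, brackets and quotes included.
same_text = str(titles) == repr(titles)

# str.join builds a plain string from the elements, with no list syntax.
joined = ", ".join(['Technology', 'Finance'])

print(same_text)
print(joined)
```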
