2

I am receiving data from a web service, and it replies with data in HTML form. The response I am getting is this dropdown:

<span>

        <select name="country" id="country" class="text " style="width:170px;">
                        <option value="">-Select country-</option>
                                <option value="Russia" >Russia</option>
                                <option value="America" >America</option>
                                <option value="Spain" >Spain</option>
                                <option value="France" >France</option>
                                <option value="X - 15" >X - 15</option>


        </select>
</span>

I need to process this data further and get the option values into a Python list. How can I select all the country names and collect them into a Python list?

user1170793
  • possible duplicate of [Parsing HTML in Python](http://stackoverflow.com/questions/717541/parsing-html-in-python) – DrTyrsa Jan 26 '12 at 08:09
  • If you're getting an html response, you don't need regexps but an xml/html parser. – Rik Poggi Jan 26 '12 at 08:17
  • If you plan to use regex for parsing HTML, *please* read this: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 –  Jan 26 '12 at 08:33

2 Answers

3

Check out Beautiful Soup.

In this case, assuming you have your HTML block as a string in the variable html, you could do the following:

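 >>> import BeautifulSoup as bs  # BeautifulSoup 3; with bs4, use "from bs4 import BeautifulSoup"
 >>> soup = bs.BeautifulSoup(html)
 >>> options = soup.findAll('option')
 >>> [option['value'] for option in options if option['value']]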

For even more syntactic sugar, check out soupselect.
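As a fuller illustration of the parser approach, here is a minimal sketch that turns the dropdown from the question into a plain list of country names. It assumes the newer bs4 package (installed with pip install beautifulsoup4) rather than the BeautifulSoup 3 module, and the variable names html, soup and countries are only illustrative. The select() call is bs4's built-in CSS selector support, shown here as an alternative to soupselect, not as soupselect itself.

```python
# Assumption: the third-party bs4 package is installed (pip install beautifulsoup4).
from bs4 import BeautifulSoup

# Sample response taken from the question (indentation shortened).
html = """
<span>
  <select name="country" id="country" class="text " style="width:170px;">
    <option value="">-Select country-</option>
    <option value="Russia" >Russia</option>
    <option value="America" >America</option>
    <option value="Spain" >Spain</option>
    <option value="France" >France</option>
    <option value="X - 15" >X - 15</option>
  </select>
</span>
"""

# "html.parser" is the parser bundled with Python, so no extra dependency is needed.
soup = BeautifulSoup(html, "html.parser")

# Collect each option's value attribute; the empty placeholder value is skipped.
countries = [opt["value"] for opt in soup.find_all("option") if opt.get("value")]
print(countries)

# Same result with a CSS selector, restricted to the dropdown with id="country".
via_css = [opt["value"] for opt in soup.select("select#country option") if opt.get("value")]
```

Filtering on a non-empty value drops the "-Select country-" entry. If the visible labels are wanted instead of the value attributes, opt.get_text() could be used in place of opt["value"].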

mvanveen
0
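import re

# Capture each option's value attribute. [^"]+ stops at the closing quote
# and skips the empty placeholder value; \s* tolerates the space before ">".
pattern = r'<option value="([^"]+)"\s*>'
val = re.findall(pattern, htmlCode)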

val will then contain a list of the country values.

Based on your example HTML, the re.findall call above should do the job. However, if you are doing a lot of extensive HTML parsing, regexes are usually not a good option. For a simple case like yours, a regex is sufficient.
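For the heavier parsing case mentioned above, one possible alternative that needs no third-party package is the html.parser module from the standard library. The following is only a sketch under that assumption; the class name OptionValueParser and the variable html_code are made up for illustration.

```python
from html.parser import HTMLParser


class OptionValueParser(HTMLParser):
    """Collects the value attribute of every <option> start tag."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "option":
            value = dict(attrs).get("value")
            if value:  # skip the empty "-Select country-" placeholder
                self.values.append(value)


# Sample response taken from the question (indentation shortened).
html_code = """
<span>
  <select name="country" id="country" class="text " style="width:170px;">
    <option value="">-Select country-</option>
    <option value="Russia" >Russia</option>
    <option value="America" >America</option>
    <option value="Spain" >Spain</option>
    <option value="France" >France</option>
    <option value="X - 15" >X - 15</option>
  </select>
</span>
"""

parser = OptionValueParser()
parser.feed(html_code)
parser.close()
print(parser.values)
```

Unlike the regex, this does not depend on the exact spacing or attribute order inside each option tag.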

Nitin Kumar