I am new to Python and am trying to get the alt text and image source of each image on a website, but I am having trouble with the quote characters ' and ".

import requests,urllib,urllib2,re

rule = re.compile(r'^[^*$<,>?!\']*$')

r = requests.get('http://www.hotstar.com/channels/star-plus')
match = re.compile('<img alt="(.*?)" ng-mouseleave="mouseLeaveCard()" ng-mouseenter="mouseEnterCard()" ng-click="mouseEnterCard(true)" ng-class="{\'dull-img\': isThumbnailTitleVisible || isRegionalLanguageVisible}" class="show-card imgtag card-minheight-hc ng-scope ng-isolate-scope" placeholder-img="{\'realUrl\' :  \'(.*?)\', \'placeholderUrl\' : \'./img/placeholder/hs.jpg\'}" ng-if="record.urlPictures" src="(.*?)" style="display: block;">',re.DOTALL).findall(r.content)
for name,img,image in match:
    print(name, img, image)

I can only use the standard Python library.

I read about defining a rule in the question "Regex Apostrophe how to match?" and tried to apply it here.

Honestly, I don't know how to use it.

Thanks in advance

Mark Pole

2 Answers

Use a parser instead:

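import requests
from bs4 import BeautifulSoup

r = requests.get('http://www.hotstar.com/channels/star-plus')
# Parse the page once; the parser handles ' and " inside attributes.
soup = BeautifulSoup(r.text, "lxml")
for img in soup.find_all('img'):
    # .get() returns None instead of raising KeyError when an attribute is missing.
    print(img.get("alt"), img.get("src"))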
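
Since the question says only the standard library is available, the same idea can be sketched with html.parser. This is a minimal sketch assuming Python 3; the class name ImgCollector and the sample markup are made up for illustration, and the real page may need r.text fed in instead.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect (alt, src) pairs from every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with the quotes already stripped.
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append((attr_map.get("alt"), attr_map.get("src")))


# Hypothetical sample markup standing in for the downloaded page.
sample_html = """
<img alt="Show One" placeholder-img="{'realUrl' : 'http://example.com/a.jpg'}" src="http://example.com/a.jpg">
<img alt='Show Two' src='http://example.com/b.jpg'>
"""

collector = ImgCollector()
collector.feed(sample_html)
for alt, src in collector.images:
    print(alt, src)
```

The parser deals with both quote styles itself, so no quote escaping is needed in your own code.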
Jan

I took a quick look at this problem and found a few different ways to approach it. Other people seem to have run into the same issue, so the pages below may help:

Possibly similar posts:

You could also look at Python's regular expression documentation.
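
For the quoting itself, here is a rough sketch of two techniques from the re module that might help: a triple-quoted raw string so both ' and " can appear in the pattern, and re.escape for literal markup. The sample tag and the data-info attribute are made up for illustration, not taken from the real page.

```python
import re

# Hypothetical sample tag; the real page markup may differ.
tag = """<img alt="Show One" data-info="{'realUrl' : 'http://example.com/a.jpg'}" src="http://example.com/a.jpg">"""

# Triple-quoted raw string, so both ' and " can appear unescaped in the pattern.
# (["']) captures the opening quote and \2 requires the same closing quote.
attr = re.compile(r"""\b(alt|src)\s*=\s*(["'])(.*?)\2""")

found = {m.group(1): m.group(3) for m in attr.finditer(tag)}
print(found)

# re.escape makes literal markup safe to embed in a pattern:
# characters such as ( ) | { . lose their special meaning.
literal = re.escape('ng-mouseleave="mouseLeaveCard()"')
print(re.search(literal, 'x ng-mouseleave="mouseLeaveCard()" y') is not None)
```

Without escaping, the () and || in a long literal pattern are read as groups and alternation, which changes what findall returns.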

B. Cratty