
I'm very new to coding, and I've tried to write a script that fetches the current price of Litecoin from coinmarketcap. However, I can't get it to work: it prints an empty list.

import urllib
import re

htmlfile = urllib.urlopen('https://coinmarketcap.com/currencies/litecoin/')

htmltext = htmlfile.read()

regex = 'span class="text-large2" data-currency-value="">$304.08</span>'

pattern = re.compile(regex)

price = re.findall(pattern, htmltext)

print(price)

The output is "[]". The problem is probably minor, but I would appreciate any help.

Mark Skelton
User2245
  • I did use single quotation marks in my code, but Stack Overflow converted "span class="text-large2" data-currency-value="">$304.08" to $304.08 straight away. – User2245 Dec 15 '17 at 23:38
  • Regular expressions are generally not the best tool for processing HTML. I suggest looking at something like [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/). That aside, your `regex` pattern probably doesn't do what you think it should. Review the [documentation](https://docs.python.org/3.4/library/re.html). – Galen Dec 15 '17 at 23:49
  • It's also much easier than re – Xantium Dec 15 '17 at 23:51

2 Answers

Regular expressions are generally not the best tool for processing HTML. I suggest looking at something like BeautifulSoup.

For example:

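import urllib.request  # Python 3; on Python 2 this was urllib.urlopen
import bs4

f = urllib.request.urlopen("https://coinmarketcap.com/currencies/litecoin/")
# Name the parser explicitly so bs4 does not have to guess one.
soup = bs4.BeautifulSoup(f, "html.parser")
# Find the first tag that carries a data-currency-value attribute.
print(soup.find(attrs={"data-currency-value": True}).text)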

This currently prints "299.97".

This probably does not perform as well as a regular expression for this simple case. However, see "Using regular expressions to parse HTML: why not?" for the reasons a parser is usually still the better choice.
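If installing BeautifulSoup is not an option, the same idea (let a parser find the tag instead of matching raw text) can be sketched with the standard library's html.parser. This is only a rough illustration: the `PriceParser` class and the `SAMPLE_HTML` string are hypothetical, the sample is modelled on the snippet quoted in the question, and the live page may be structured differently.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the text of the first tag that has a data-currency-value attribute."""

    def __init__(self):
        super().__init__()
        self._capturing = False  # True while we wait for the target tag's text
        self.price = None        # text of the first match, once found

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if self.price is None and any(name == "data-currency-value" for name, _ in attrs):
            self._capturing = True

    def handle_data(self, data):
        # The first text node after the target start tag is taken as the price.
        if self._capturing:
            self.price = data.strip()
            self._capturing = False


# Hypothetical sample markup, modelled on the snippet quoted in the question.
SAMPLE_HTML = '<div><span class="text-large2" data-currency-value>299.97</span></div>'

parser = PriceParser()
parser.feed(SAMPLE_HTML)
print(parser.price)  # prints 299.97
```

To try it against the real page, you would feed it the decoded response body instead of `SAMPLE_HTML`.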

Galen

You need to change your RegEx and add a group in parentheses to capture the value.

To match something like <span class="text-large2" data-currency-value>300.59</span>, you need this RegEx:

regex = 'span class="text-large2" data-currency-value>(.*?)</span>'

The (.*?) group is used to capture the number.

You get:

['300.59']
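
Here is a small offline sketch of that pattern, plus a likely reason the original attempt returned an empty list. `SAMPLE_HTML` and `extract_prices` are hypothetical names used only for illustration, and the sample markup is an assumption modelled on the snippet above, not the live page.

```python
import re

# Hypothetical sample markup, modelled on the snippet quoted above.
SAMPLE_HTML = '<span class="text-large2" data-currency-value>300.59</span>'

# The capture group (.*?) lazily grabs whatever sits between the tags.
PRICE_RE = re.compile(r'span class="text-large2" data-currency-value>(.*?)</span>')


def extract_prices(html):
    """Return every captured price string found in html."""
    # On Python 3, urlopen(...).read() returns bytes, so decode before matching.
    if isinstance(html, bytes):
        html = html.decode("utf-8", errors="replace")
    return PRICE_RE.findall(html)


print(extract_prices(SAMPLE_HTML))  # ['300.59']

# An unescaped $ is an end-of-string anchor, not a literal dollar sign,
# so a pattern containing "$304.08" cannot match that text.
print(re.findall(r'>$304.08<', '>$304.08<'))  # []

# re.escape turns the metacharacters into literals.
print(re.findall(re.escape('>$304.08<'), '>$304.08<'))  # ['>$304.08<']
```

Hard-coding the price in the pattern also means it stops matching as soon as the price changes, which is why the capture group is the better approach.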
Laurent LAPORTE