3

I am using beautifulsoup4 to extract price tags from a website. The code I'm using is this:

        # price
        try:
            price = soup.find('span',{'id':'actualprice'})
            price_result= str(price.get_text())
            print "Price: ",price_result
        except StandardError as e:
            price_result="Error was {0}".format(e)
            print price_result

The output I'm getting is a string with commas in it, e.g. 82,000,00

What I want:

Convert the price from a string to an integer without commas, so that I can use the values as numbers instead of strings in Excel.

Panetta

4 Answers

7

You can do this:

>>> price_result = '82,000,00'
>>> int(price_result.replace(',', ''))
8200000
3kt
1

Check out https://docs.python.org/2/library/string.html or https://docs.python.org/3/library/string.html, depending on the Python version you are using, and use the "replace()" method:

int_price = int(price_result.replace(',',''))

This removes all commas from the string and then converts the result to an int:

>>> price = "1,000,000"
>>> type(price)
<type 'str'>
>>> int_price = int(price.replace(',',''))
>>> type(int_price)
<type 'int'>
>>> 
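
The same idea can be wrapped in a small function so that a malformed price is reported clearly instead of surfacing as a bare conversion error. This is only a sketch: `price_to_int` is a hypothetical helper name, and it assumes the scraped text contains nothing but digits, commas, and surrounding whitespace.

```python
def price_to_int(text):
    """Remove comma separators from a price string and return an int.

    Hypothetical helper: assumes only digits, commas and outer whitespace.
    """
    cleaned = text.strip().replace(',', '')
    try:
        return int(cleaned)
    except ValueError:
        # Re-raise with the offending input so the bad row is easy to find.
        raise ValueError("not a plain comma-separated number: {0!r}".format(text))


print(price_to_int('82,000,00'))
print(price_to_int(' 1,000,000 '))
```

If the site also emits a currency symbol or a decimal part, this function raises `ValueError` rather than guessing, which is usually safer when the values end up in a spreadsheet.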
ProfFalken
1

If the last part is a fractional part, you could do something like this:

import re
r = re.compile(r'((?:\d{1,3},?)+)(,\d{2})')
m = r.match('82,000,00')
v = m.group(1).replace(',', '') + m.group(2).replace(',', '.')
print(float(v))

Output:

82000.0
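
If the value is money, `float` can introduce rounding noise, so a variant using `decimal.Decimal` may be preferable. The sketch below drops the regular expression and simply splits on the last comma. It makes the same assumption as the code above (that the final comma-separated group is the fractional part), and `price_to_decimal` is a hypothetical name chosen for illustration.

```python
from decimal import Decimal


def price_to_decimal(text):
    """Treat the group after the last comma as the fractional part.

    Hypothetical helper; assumes the last comma acts as a decimal separator.
    """
    whole, sep, fraction = text.strip().rpartition(',')
    if not sep:
        # No comma at all: the text is already a plain number.
        return Decimal(text.strip())
    return Decimal(whole.replace(',', '') + '.' + fraction)


print(price_to_decimal('82,000,00'))
```

Whether the last group really is a fractional part depends on the site's number format, so it is worth checking a few known prices before relying on this.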
totoro
1
import re

int(''.join(re.findall(r'\d+', '82,000,00')))

Another method (Python 2 only, where filter() on a str returns a str):

int(filter(str.isdigit, '82,000,00'))
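
On Python 3, `filter()` returns an iterator rather than a string, so passing its result straight to `int()` raises a `TypeError`. A minimal sketch of the same approach for Python 3 joins the filtered characters back into a string first (the variable names here are illustrative):

```python
price_result = '82,000,00'

# Keep only the digit characters, then rebuild a string for int().
digits = ''.join(filter(str.isdigit, price_result))
price = int(digits)
print(price)
```

Note that this discards every non-digit character, including a decimal point or minus sign, so it only fits prices that are whole numbers.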
SuperNova