-2

I was web scraping the Foot Locker website, and some of the prices I extract come back with more than one decimal point.

I want to round each price to 2 digits after the decimal point. How can I do that?

My price list:

90.00
170.00
198.00
137.99137.99158.00

When I try the float function I get an error. Can someone please help? :)

print(float(Price))

90.0
170.0
198.0

ValueError: could not convert string to float: '137.99137.99158.00'

I also want to round to two decimal places, so 90.0 becomes 90.00 :)

5 Answers

1

After a second look at your prices it seems to me that the problem with the multiple decimal points is due to missing spaces between the prices. Maybe the webscraper needs a fix? If you want to go on with what you have, you can do it with regular expressions. But my fix only works if prices are always given with two decimal digits.

import re

list_prices = [ '90.00', '170.00', '198.00',  '137.99137.99158.00' ]

pattern_price = re.compile(r'[0-9]+\.[0-9]{2}')
list_prices_clean = pattern_price.findall('\n'.join(list_prices))
print(list_prices_clean)

# ['90.00', '170.00', '198.00', '137.99', '137.99', '158.00']
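To go from the extracted strings to numbers, and back to text with exactly two decimals, a minimal sketch could look like the following. The names prices_as_float and prices_formatted are illustrative and not part of the answer above; the sketch assumes every price carries two decimal digits, as the regex requires.

```python
import re

list_prices = ['90.00', '170.00', '198.00', '137.99137.99158.00']

# Same pattern as above: digits, a literal dot, exactly two digits
pattern_price = re.compile(r'[0-9]+\.[0-9]{2}')
list_prices_clean = pattern_price.findall('\n'.join(list_prices))

# Convert to float for arithmetic
prices_as_float = [float(p) for p in list_prices_clean]

# Format back to strings with exactly two decimals for display
prices_formatted = [f'{p:.2f}' for p in prices_as_float]
print(prices_formatted)
```

The floats are what you would sum or compare; the formatted strings are only for output.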
Durtal
0

You're getting that error because the input 137.99137.99158.00 is not a valid input for the float function. I have written the below function to clean your inputs.

def clean_invalid_number(num):
    split_num = num.split('.')
    num_len = len(split_num)
    if len(split_num) > 1:
        temp = split_num[0] + '.'
        for i in range(1,num_len):
            temp += split_num[i]
        return temp
    else:
        return num

To explain the above: I used the split function, which returns a list. If the list length is greater than 1, the input contains at least one full stop, so the function keeps the first one and joins the remaining pieces without dots. The list does not contain the character you split on.

As for returning 2 decimal points simply use

Price = round(Price,2)

Returning 90.00 instead of 90.0 does not make sense if you are casting to float, because a float does not keep trailing zeros.

Here is the full code as a demo:

prices = ['90.00', '170.00', '198.00', '137.99137.99158.00']

prices = [round(float(clean_invalid_number(p)),2 ) for p in prices]

print(prices)

[90.0, 170.0, 198.0, 137.99]
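On the 90.0 versus 90.00 point, a small sketch may help: the trailing zero only exists in a string representation, so it has to be produced at display time. The decimal module is shown as one possible alternative when the two-place form needs to survive as a value; this is an illustration, not something the original answer uses.

```python
from decimal import Decimal, ROUND_HALF_UP

price = round(float('90.00'), 2)
print(price)           # the float prints without the trailing zero
print(f'{price:.2f}')  # formatting yields a string with two decimals

# Decimal remembers its exponent, so quantize keeps two places
exact = Decimal('90').quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
print(exact)
```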


The_flash
0
  1. replace first dot by a temporary delimiter
  2. delete all other dots
  3. replace temporary delimiter with dot
  4. round
  5. print with two decimals

like this:

list_prices = [ '90.00', '170.00', '198.00',  '137.99137.99158.00']

def clean_price(price, sep='.'):
    price = str(price)
    price = price.replace(sep, 'DOT', 1)
    price = price.replace(sep, '')
    price = price.replace('DOT', '.')
    rounded = round(float(price),2)
    return f'{rounded:.2f}'

list_prices_clean = [clean_price(price) for price in list_prices]

print(list_prices_clean)

# ['90.00', '170.00', '198.00', '137.99']

EDIT:

In case you mean rounding after the last decimal point:

def clean_price(price, sep='.'):
    price = str(price)
    num_seps = price.count(sep)
    price = price.replace(sep, '', num_seps-1)
    rounded = round(float(price),2)
    return f'{rounded:.2f}'

list_prices_clean = [clean_price(price) for price in list_prices]

print(list_prices_clean)

# ['90.00', '170.00', '198.00', '1379913799158.00']
Durtal
0

No need to write custom methods; use regular expressions (regex) to extract patterns from strings. Your problem is that the long string (137.99137.99158.00) is three prices without spaces in between. The regex "[0-9]+\.[0-9]{0,2}" matches one or more digits before a literal "." and up to two digits after it, and search returns the first such match.

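import re
reg = r"[0-9]+\.[0-9]{0,2}"  # raw string, so \. reaches the regex engine as a literal dot
test = "137.99137.99158.00"
p = re.compile(reg)
result = p.search(test)  # search returns only the first match
print(result.group(0))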

Output:

137.99

Short explanation:

  • '[0-9]' "numbers"
  • '+' "one or more"
  • '\.' "a literal dot (the backslash is needed because a bare '.' matches any character)"

Regex seems to be quite weird at the start, but it is an essential skill. Especially when you want to mine text.
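Since search stops at the first match, here is a hedged sketch of how all three glued prices could be pulled out with findall instead. It uses the stricter {2} quantifier and assumes every price has exactly two decimal digits; the variable names are illustrative.

```python
import re

text = "137.99137.99158.00"

# Digits, a literal dot, then exactly two digits
pattern = re.compile(r"[0-9]+\.[0-9]{2}")

first_price = pattern.search(text).group(0)  # first match only
all_prices = pattern.findall(text)           # every non-overlapping match
print(first_price)
print(all_prices)
```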

TomCV
  • This is a Python question, Tom. – Pranav Hosangadi Mar 18 '21 at 16:06
  • @PranavHosangadi ups, sorry corrected to Python. But exactly the same thing. – TomCV Mar 18 '21 at 16:16
  • 1
    Be aware that you should escape the dot as I did in my second answer. Your pattern unintentionally also matches "137991379915800" – Durtal Mar 18 '21 at 16:34
  • @Durtal True, thank you. For the problem with the 2 decimals after the dot (in mine and your post), consider [0-9]{0,2} to allow 0 or 2 decimals after the dot ... not really ideal but it will help with some prices – TomCV Mar 18 '21 at 16:54
  • In case of 0 decimals there shouldn't be a dot. So "19" wont be matched. Maybe we should declare the dot optional with '[0-9]+\.?[0-9]{0,2}'. But this is getting way beyond the original question. – Durtal Mar 18 '21 at 17:03
  • @Durtal To make the decimal optional is a good idea. Yeah, there is no clean solution to separate prices when there is no defined pattern. Imagine prices without a decimal are joined together, it is impossible to separate them. The OP asked to extract the first price when the String has more than two decimals in it, which we definitely answered. – TomCV Mar 18 '21 at 17:20
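The points raised in these comments can be checked with a short sketch. The test strings "19" and "9017" are made up for illustration; the patterns are the ones quoted in the comments.

```python
import re

# Unescaped dot: '.' matches any character, including a digit
unescaped = re.compile(r"[0-9]+.[0-9][0-9]")
# Optional dot and up to two decimals, as suggested in the comments
optional = re.compile(r"[0-9]+\.?[0-9]{0,2}")

# A string with no dot at all is still accepted by the unescaped pattern
unescaped_match = unescaped.fullmatch("137991379915800")

whole_number = optional.findall("19")                    # price without decimals
glued_decimals = optional.findall("137.99137.99158.00")  # separable thanks to the dots
glued_integers = optional.findall("9017")                # hypothetical "90" + "17" glued together

print(unescaped_match is not None)
print(whole_number, glued_decimals, glued_integers)
```

The last case comes back as a single match, which illustrates why prices without decimals cannot be separated once they are joined.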
-3

OK, I have finally found a solution to my problem, and thank you everyone for helping out as well.

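def Price(s):
    # Strip the currency symbol and surrounding whitespace from the price text
    text = s.find("div",class_="ProductPrice").text.replace("$","").strip()
    try:
        # Price range ("... to ..."): use the part after "to"
        P = text.split("to")[1].split(".")
        return round(float(".".join(P[0:2])),2)
    except (IndexError, ValueError):
        # No usable part after "to": fall back to the first part
        P = text.split("to")[0].split(".")
        return round(float(".".join(P[0:2])),2)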
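To try the same idea without a live page, here is a minimal sketch that applies it to plain strings. The function name price_from_text and the sample texts are hypothetical; the real text inside the ProductPrice element may look different.

```python
def price_from_text(text):
    """Parse a price from raw text, taking the part after 'to' if present."""
    cleaned = text.replace("$", "").strip()
    parts = cleaned.split("to")
    # Use the part after "to" for a range, otherwise the only part
    chosen = parts[1] if len(parts) > 1 else parts[0]
    pieces = chosen.strip().split(".")
    # Keep the integer part and the first fractional chunk, then round
    return round(float(".".join(pieces[:2])), 2)


single = price_from_text("$90.00")
ranged = price_from_text("$137.99 to $158.00")
glued = price_from_text("137.99137.99158.00")
print(single, ranged, glued)
```

Note that for the glued string this keeps only the first price and discards the other two, which is the same trade-off as in the function above.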