I am scraping numbers from a webpage and appending them onto a Python list. The scraped strings take on the following forms:
Millions:
- 1,000,000
- 1,000,000.9
- 1,000,000.99
- 1,000,000.999
Hundreds of thousands (same applies for tens of thousands and thousands):
- 100,000
- 100,000.9
- 100,000.99
- 100,000.999
In other words, trailing zeros in the decimal places are not displayed.
My list has the following composition:
list = ['1,000,000', '1,000,000.9', '1,000,000.99', '1,000,000.999', '100,000', '100,000.9', '100,000.99', '100,000.999']
I want to convert every string that has a decimal part into a float, keeping its decimal places, and every string without one into an integer (or a float ending in .0), while handling the comma and period separators correctly.
My current process is simply to eliminate all non-numeric characters:
import re
list = [re.sub(r"[^0-9]", "", i) for i in list]  # remove non-numeric characters (this also removes the decimal point)
list = [int(i) for i in list]  # turn strings into integers
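To illustrate what these two steps do to a value with decimals, here is a minimal sketch. The name `samples` and the three values are just picked from the forms listed above for illustration:

```python
import re

# A few sample strings taken from the forms listed above
samples = ["1,000,000", "1,000,000.99", "100,000.9"]

# Same steps as the current process: strip every non-digit, then convert to int
stripped = [re.sub(r"[^0-9]", "", s) for s in samples]
as_ints = [int(s) for s in stripped]

print(as_ints)  # [1000000, 100000099, 1000009]
```

Because the period is removed along with the commas, the decimal digits get merged into the integer part.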
I don't know what to do next because I don't know how to account for the different formatting within a single list.
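For reference, the result I am after could be sketched roughly like this. It assumes that a comma is only ever a thousands separator and a period is only ever the decimal mark, which may not hold for every locale. `parse_number`, `scraped`, and `numbers` are hypothetical names used only for this illustration:

```python
def parse_number(text):
    """Convert a scraped string such as '1,000,000.99' to an int or a float.

    Assumption: ',' is only a thousands separator and '.' is the decimal mark.
    """
    cleaned = text.strip().replace(",", "")  # drop thousands separators only
    # Keep the decimal point so fractional values survive the conversion
    return float(cleaned) if "." in cleaned else int(cleaned)


scraped = ["1,000,000", "1,000,000.9", "100,000.99", "100,000.999"]
numbers = [parse_number(s) for s in scraped]

print(numbers)  # [1000000, 1000000.9, 100000.99, 100000.999]
```

Only the commas are removed here, so strings without a period become integers and the rest become floats with their original decimal places.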