
I have a huge txt file with the following values (first 5):

$42,198.98  
$1,305.04  
$1,821.91  
$105,747.79  
$100,931.55

How can this list of strings be converted into a list of numbers (that is, dropping the "$" and ",")?

with open('sample.txt', 'r') as infile:
    list_2016 = [line.rstrip() for line in infile]  # strip trailing spaces and newlines
list_2016 = [i[1:] for i in list_2016]  # drop the leading '$'
list_2016 = [i.replace(',', '') for i in list_2016]  # drop the ',' separators (call replace on each item, not on the list)
list_2016 = [float(x) for x in list_2016]
  • Do you want a separate list for each line in your file (i.e. a list of lists), or everything in one list of floats? – Alexander Oct 12 '18 at 03:00
  • No, just one giant list, every line in the file is in the following format: $000,000.00. Then I will need to calculate an average of the whole file. The file is large (couple million of rows) – user10492782 Oct 12 '18 at 03:15

1 Answer


Not the most elegant, but:

s = "$42,198.98 $1,305.04 $1,821.91 $105,747.79 $100,931.55"
f = [float(x) for x in s.replace("$",'').replace(',','').split()]

print(f)    # [42198.98, 1305.04, 1821.91, 105747.79, 100931.55]

The idea:

  • Take the line, replace both dollar signs and commas with nothing (remove them from the string)
  • Call the string's .split() method which, by default, splits over whitespace
  • Use a list comprehension to convert that now-list-of-strings to floats
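Since the asker mentions a file with a couple million rows and wanting an average, the same idea can be applied line by line instead of to one big string. This is only a minimal sketch: `parse_amount` and `average_amount` are hypothetical helper names, the sample lines are copied from the question, and it assumes every non-blank line holds exactly one value in the `$000,000.00` format.

```python
def parse_amount(text):
    """Convert a string such as '$42,198.98' to a float."""
    # Same trick as above: remove '$' and ',' before calling float().
    return float(text.replace("$", "").replace(",", ""))


def average_amount(lines):
    """Average the amounts in an iterable of lines, skipping blank lines."""
    total = 0.0
    count = 0
    for line in lines:
        line = line.strip()  # drop trailing spaces and the newline
        if not line:
            continue
        total += parse_amount(line)
        count += 1
    return total / count if count else 0.0


# Sample lines from the question (trailing spaces and newlines included).
sample = ["$42,198.98  \n", "$1,305.04  \n", "$1,821.91  \n",
          "$105,747.79  \n", "$100,931.55\n"]

amounts = [parse_amount(line.strip()) for line in sample]
print(amounts)
print(average_amount(sample))

# With the real file (name taken from the question), the file object can be
# passed directly, so the whole file never has to sit in memory as one list:
# with open('sample.txt') as infile:
#     print(average_amount(infile))
```

Keeping a running total and count avoids building a multi-million-element list when only the average is needed; if the full list is required anyway, the list comprehension form works on the file object too.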
jedwards