In Python, removing thousands comma from numbers in a list where the numbers are separated by commas

Question

I have a list of data similar to that below:

a = ['"105', '424"', '"102', '629"', '"104', '307"']

I want this data to be in a form similar to that of below:

a = ['105424', '102629', '104307']

I am unsure of how to proceed. I thought perhaps removing all the commas then inserting commas only where they should be and then removing the quotations. I am finding this to be quite challenging.

1) Search for `', '` and delete it. 2) Replace `'"` with `'` and `"'` with `'`. — Kerrek SB, Jul 05 '11 at 21:49
Are you sure you're nesting the single and double quotes the way you want? The first `a` is a list of 6 strings. Did you want it to be a list of 3 strings? — totowtwo, Jul 05 '11 at 21:50
Where did this data come from? A CSV file? if so, why aren't you using the `csv` module? — S.Lott, Jul 05 '11 at 21:52
Thanks a lot everyone. All of your advice was very helpful. For those of you who were interested, the data did come from a csv file where commas were separating both the column entries and the thousands. Thanks — , Jul 06 '11 at 03:06

Steven Rumbalski · Answer 1 · 2011-07-05T22:29:20.633

4

I'm assuming this data was originally in a csv file where data that contains commas is quoted ("105,424","102,629","104,307") and then you are splitting on comma:

>>> '"105,424","102,629","104,307"'.split(',')
['"105', '424"', '"102', '629"', '"104', '307"']

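Instead, let the csv module do the work, since it handles the double quotes:

import csv

# Python 3: open in text mode with newline='' so csv handles line endings itself
with open('u:\\foobar.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        # The quotes are already gone here; only the thousands commas remain
        print([x.replace(',', '') for x in row])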

This prints: ['105424', '102629', '104307']
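To try the same idea without a file on disk, here is a minimal Python 3 sketch. The `sample` string is a hypothetical stand-in for one line of the CSV file, wrapped in `io.StringIO` so that `csv.reader` can iterate over it.

```python
import csv
import io

# Hypothetical in-memory stand-in for one line of the CSV file
sample = io.StringIO('"105,424","102,629","104,307"\n')

rows = []
for row in csv.reader(sample):
    # csv.reader has already removed the double quotes;
    # only the thousands commas inside each field are left
    rows.append([field.replace(',', '') for field in row])

print(rows)
```

Each parsed row is a list of strings, so `rows` ends up as a list of lists, one inner list per line of input.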

edited Jul 05 '11 at 22:29

answered Jul 05 '11 at 22:11


If this doesn't work for you, paste a line of your source data in a comment. – Steven Rumbalski Jul 05 '11 at 22:31

score 1 · Answer 2 · edited May 23 '17 at 11:52

If the source data is CSV, you should use @steven's answer.

Regardless, here's how you could process what you pasted.

As @troutwine stated, this will only work if the number parts are always in pairs.

a = ['"105', '424"', '"102', '629"', '"104', '307"']

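def pairwise(iterable):
    "s -> (s0,s1), (s2,s3), (s4, s5), ..."
    it = iter(iterable)
    # zip pulls two items per step from the same iterator
    # (in Python 2, use itertools.izip instead)
    return zip(it, it)

result = []

for x, y in pairwise(a):
    # Join the two halves, then strip the surrounding double quotes
    result.append((x + y).strip('"'))

print(result)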

Gives:

['105424', '102629', '104307']

Pairwise snippet from here: Iterating over every two elements in a list

Dunes · Answer 3 · 2011-07-05T22:24:35.437

Does your data look something like:

"123", "123,456", "123,456,789"

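If so, then try this:

text = '"123", "123,456", "123,456,789"'

import re

# Raw string so the backslashes reach the regex engine unchanged
reg = re.compile(r'"(\d{1,3}(,\d{3})*)"')

# findall returns one tuple per match: (whole number, last repeated group)
stringValues = [wholematch.replace(',', '') for wholematch, _endmatch
                in reg.findall(text)]

The regex below should also match numbers with thousands separators and decimal places. It has three groups, so findall returns 3-tuples and the unpacking above needs a third name.

re.compile(r'"(\d{1,3}(,\d{3})*(\.\d*)?)"')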
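One way to use the decimal-aware pattern without worrying about how many groups it has is `re.finditer` with `group(1)`. This is a sketch in Python 3; the `text` sample is made up for illustration.

```python
import re

# Pattern from the answer, written as a raw string
pattern = re.compile(r'"(\d{1,3}(,\d{3})*(\.\d*)?)"')

# Made-up sample input for illustration
text = '"123", "123,456.78", "1,234,567"'

# group(1) is everything between the quotes, however many
# inner groups the pattern contains
values = [m.group(1).replace(',', '') for m in pattern.finditer(text)]

print(values)
```

The results are still strings; converting them with `int` or `float` is a separate step.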

score 0 · Answer 4 · answered Jul 05 '11 at 21:50


If you'll never have an unmatched pair, loop over a range half the size of the input list: join the item at the current index with the next one, remove the double quotes with a string substitution, then advance the index by two.
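A minimal Python 3 sketch of that loop, using the list from the question. It assumes the list has an even number of items and that every number is split into exactly two parts.

```python
a = ['"105', '424"', '"102', '629"', '"104', '307"']

result = []
# Step through the list two items at a time
for i in range(0, len(a) - 1, 2):
    # Join the current item with the next one
    joined = a[i] + a[i + 1]
    # Remove the leftover double quotes
    result.append(joined.replace('"', ''))

print(result)
```

A number with more than one thousands separator, such as 1,457,664, would be split into three parts and break the pairing.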


troutwine


Manny D · Answer 5 · 2011-07-05T22:30:10.580

Reduce to the rescue:

from functools import reduce  # reduce is not a builtin in Python 3

l = ['"105', '424"', '"102', '629"', '"104', '307"', '"123', '456', '789"', '"123"']

# Concatenate everything and split on ", keeping the non-empty pieces
l2 = [num for num in reduce(lambda x, y: x + y, l).split('"') if num != '']

# Output:
# ['105424', '102629', '104307', '123456789', '123']
print(l2)

A few caveats, though: this code can handle numbers beyond the thousands (e.g., 1,457,664), but it also assumes that every whole number was double-quoted.
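The quoting assumption can be seen in a small Python 3 sketch. The helper name `merge_quoted` and both sample lists are made up for illustration, and `''.join` stands in for the `reduce` call since it produces the same concatenated string.

```python
def merge_quoted(parts):
    """Concatenate the fragments, then split on the double quotes."""
    return [num for num in ''.join(parts).split('"') if num != '']

# Made-up samples: the same fragments with and without quotes
quoted = ['"1', '457', '664"', '"123"']
unquoted = ['1', '457', '664', '123']

print(merge_quoted(quoted))
print(merge_quoted(unquoted))
```

With the quotes present, the two numbers come out separately. Without them there is nothing to split on, so all the digits run together into a single string.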

As others have said though, you should revisit your data retrieval as there are most likely ways to get the values correctly without dealing with the double-quotes. This was a fun little challenge nonetheless.
