Replace comma only when between quotation marks

Question

I have to replace strings like this one:

-- 1234,BUCARAMANGA,"KM 15 VIA QUE CONDUCE AGUACHICA-BUCARAMANGA, CORREGIMIENTO EL JUNCAL"

and turn them into strings like this one (removing the comma between the quotation marks):

-- 1234,BUCARAMANGA,"KM 15 VIA QUE CONDUCE AGUACHICA-BUCARAMANGA CORREGIMIENTO EL JUNCAL"

I'm using this code:

checkWords = ("KM 15 VIA QUE CONDUCE AGUACHICA-BUCARAMANGA, CORREGIMIENTO EL JUNCAL","COOPERATIVA TECNICOS, TECNOLOGOS PROFESIONALES COOTETECPRO","3 KMS DE LA VÍA FLORENCIA - PAUJIL, DE LA Y A MANO DERECHA DELANTE PTO ARANGO")
repWords = ("KM 15 VIA QUE CONDUCE AGUACHICA-BUCARAMANGA CORREGIMIENTO EL JUNCAL","COOPERATIVA TECNICOS TECNOLOGOS PROFESIONALES COOTETECPRO","3 KMS DE LA VÍA FLORENCIA - PAUJIL DE LA Y A MANO DERECHA DELANTE PTO ARANGO")

# Open both files in one with-statement so they are closed even if an error occurs.
with open('FACMES.txt', 'r', encoding='utf-8') as f1, \
     open('FACMES_2.txt', 'w', encoding='utf-8') as f2:
    for line in f1:
        for check, rep in zip(checkWords, repWords):
            line = line.replace(check, rep)
        f2.write(line)

Is there some way to replace the comma only when the comma is between double quotes and other characters?

Does [Remove All Commas Between Quotes](https://stackoverflow.com/questions/38336518/remove-all-commas-between-quotes) answer your question? — wwii, Mar 16 '22 at 18:26

score 1 · Answer 1 · answered Mar 16 '22 at 18:25

You can use a regular expression to look for a quotation mark, followed by one or more characters ((.+) in the pattern), followed by a comma, followed by one or more characters, followed by another quotation mark.

These groups of one or more characters form capture groups, which we then refer to using \1 and \2 in the call to re.sub().

Doing this with .replace() alone would be complicated, so re.sub() is a better fit:

import re
data = '-- 1234,BUCARAMANGA,"KM 15 VIA QUE CONDUCE AGUACHICA-BUCARAMANGA, CORREGIMIENTO EL JUNCAL"'

pattern = r'"(.+),(.+)"'
result = re.sub(pattern, r'"\1\2"', data)

print(result)

Output:

-- 1234,BUCARAMANGA,"KM 15 VIA QUE CONDUCE AGUACHICA-BUCARAMANGA CORREGIMIENTO EL JUNCAL"
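One caveat with the pattern above: it drops only a single comma per match, and because .+ is greedy it can match across neighbouring quoted fields (for example, '"a","b",c' becomes '"a""b",c'). A minimal sketch of a stricter variant follows. It assumes the quoted fields never contain escaped quotes, and the helper name strip_commas_in_quotes is made up for illustration:

```python
import re

def strip_commas_in_quotes(line: str) -> str:
    """Remove every comma that sits inside a double-quoted segment."""
    # "[^"]*" matches one quoted segment at a time, so separators between
    # fields are never touched. Assumption: no escaped quotes inside fields.
    return re.sub(r'"[^"]*"', lambda m: m.group(0).replace(',', ''), line)

line = '-- 1234,BUCARAMANGA,"KM 15 VIA QUE CONDUCE AGUACHICA-BUCARAMANGA, CORREGIMIENTO EL JUNCAL"'
print(strip_commas_in_quotes(line))
```

In the loop from the question, this would replace the inner for loop: f2.write(strip_commas_in_quotes(line)).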

score 0 · Answer 2 · edited Mar 16 '22 at 18:25

stringData = '"KM 15 VIA QUE CONDUCE AGUACHICA-BUCARAMANGA, CORREGIMIENTO EL JUNCAL","COOPERATIVA TECNICOS, TECNOLOGOS PROFESIONALES COOTETECPRO","3 KMS DE LA VÍA FLORENCIA - PAUJIL, DE LA Y A MANO DERECHA DELANTE PTO ARANGO"'

# Strip the outer quotes, then split on the "," separators between fields.
# (Looking positions up with list.index() breaks when two fields are identical.)
list_of_elements = stringData.strip('"').split('","')

# Drop the commas that remain inside each field.
cleaned_list_of_elements = [element.replace(',', '') for element in list_of_elements]
print(cleaned_list_of_elements)
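Splitting on '","' by hand only works when every field is quoted, which is not true of the lines in the question (the first two fields are bare). As a hedged alternative, the standard csv module already understands quoted fields. The sketch below assumes the file is ordinary comma-separated text with double quotes as the quote character; the helper name clean_rows is hypothetical:

```python
import csv
import io

raw = '-- 1234,BUCARAMANGA,"KM 15 VIA QUE CONDUCE AGUACHICA-BUCARAMANGA, CORREGIMIENTO EL JUNCAL"\n'

def clean_rows(text):
    """Parse CSV text and return rows with commas removed from every field."""
    reader = csv.reader(io.StringIO(text))
    # After parsing, only fields that were quoted can still contain commas.
    return [[field.replace(',', '') for field in row] for row in reader]

rows = clean_rows(raw)
print(rows[0][2])
```

Writing the rows back with csv.writer and its default settings emits the cleaned fields without quotes, since they no longer contain the delimiter; passing quoting=csv.QUOTE_ALL would quote every field instead.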
