0

My CSV file ("challenges.csv") contains about 8000 rows like the ones shown below (the number of columns varies from row to row):

2937 ,58462bc9a559fa7d29819028 ,29 ,57eb63d813fd7c0329bdb01f ,

2938 ,58462bc9a559fa7d29819028 ,30 ,57eb63d713fd7c0329bdafb5 ,57eb63d713fd7c0329bdafb6


I also have a dictionary named mydic, built from "forDic.csv", for example:

{ '58462bc9a559fa7d29819028':'negative chin up', '57eb63d813fd7c0329bdb01f':'knee squeeze squat', '57eb63d713fd7c0329bdafb5':'squat', '57eb63d713fd7c0329bdafb6':'lunge', ... }

I want to replace each value in "challenges.csv" with the corresponding value from mydic whenever that value matches one of mydic's keys.
How can I do this? Please help me.


Expected output: a CSV file whose rows look like this:

2937 ,'negative chin up' ,29 ,'knee squeeze squat' ,

2938 ,'negative chin up' ,30 ,'squat' ,'lunge'

import csv

# Build the ID -> exercise-name lookup from the first two columns of forDic.csv.
with open('./forDic.csv', mode='r', newline='') as infile:
    reader = csv.reader(infile)
    mydic = dict((rows[0].strip(), rows[1].strip()) for rows in reader)
    print(mydic)


def replace_all():
    with open('./challenges.csv', mode='r', newline='') as infile, \
            open('./challenges_new.csv', mode='w', newline='') as outfile:
        r = csv.reader(infile)
        w = csv.writer(outfile)

        for row in r:
            # Swap a cell for its dictionary value only when the cell is a known key;
            # strip() drops the padding around the separators first.
            rl = [mydic.get(cell.strip(), cell.strip()) for cell in row]
            print(', '.join(rl))
            w.writerow(rl)


replace_all()
Martin Evans
yh Hong
  • can you paste data as text instead of image? – saikumarm Mar 21 '17 at 06:02
  • @saikumarm thanks!! this is only a part of "challenges.csv" 224 ,5847aac3d22f9043ad999e64 ,27 , 225 ,57b2fb7c50bc88cb7a168f40 ,1 ,57eb63d713fd7c0329bdafbd ,57eb63d813fd7c0329bdb019 ,57eb63d813fd7c0329bdb007 ,57eb63d813fd7c0329bdb043 ,57eb63d713fd7c0329bdaf62 ,57eb63d713fd7c0329bdaf63 – yh Hong Mar 21 '17 at 06:08
  • Do you want to replace the csv text with the mapping you have in dict or you want to replace the complete row? Could you share your expected output as well – saurabh baid Mar 21 '17 at 06:24
  • I want to replace the specific CSV text using the mapping dictionary I have! – yh Hong Mar 21 '17 at 06:39

2 Answers

0

Assuming the following two input files:

challenges.csv

317, change1, 89, change2, change3
318, change1, 89, change3, change4

fordic.csv

change1, changedto1
change2, changedto2
change3, changedto3
change4, changedto4

The following code (written for Python 2) just prints each line with the replacements applied:

import re, csv

with open('fordic.csv', mode='r')as infile:
    reader = csv.reader(infile)
    mydic = dict((rows[0], rows[1]) for rows in reader)
    print(mydic)


mydic = dict((re.escape(k), v) for k, v in mydic.iteritems())
pattern = re.compile("|".join(mydic.keys()))

with open('./challenges.csv', mode='r') as infile:
    lines = infile.readlines()

    for row in lines:
        print pattern.sub(lambda m: mydic[re.escape(m.group(0))], row)

output

317,  changedto1, 89,  changedto2,  changedto3
318,  changedto1, 89,  changedto3,  changedto4

To understand the multi-string replace, see this SO answer.
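
The code above relies on `dict.iteritems()` and the `print` statement, both of which exist only in Python 2, which would explain the `AttributeError` reported in the comments. Below is a minimal sketch of the same regex idea for Python 3. The helper name `build_replacer` is made up for illustration, the sample lines are typed inline instead of being read from the files, and the mapping is assumed to be non-empty.

```python
import re


def build_replacer(mapping):
    """Return a function that replaces every key of `mapping` found in a string."""
    # Longest keys first, so a short key cannot shadow a longer key it prefixes.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Look up the raw matched text, so the dictionary keeps its unescaped keys.
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)


mydic = {"change1": "changedto1", "change2": "changedto2",
         "change3": "changedto3", "change4": "changedto4"}
replace = build_replacer(mydic)

lines = ["317, change1, 89, change2, change3",
         "318, change1, 89, change3, change4"]
replaced = [replace(line) for line in lines]
for line in replaced:
    print(line)
```

Escaping the keys only while building the pattern avoids having to escape the matched text again at lookup time. Note that this replaces a key wherever it occurs in the line, including inside a longer value, whereas a per-cell dictionary lookup only replaces exact matches.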

Community
saikumarm
  • Thanks but i got an error message...because i use python3.x? : mydic = dict((re.escape(k), v) for k, v in mydic.iteritems()) AttributeError: 'dict' object has no attribute 'iteritems' – yh Hong Mar 21 '17 at 06:48
  • pattern = re.compile("|".join(mydic.keys)) TypeError: can only join an iterable T.T – yh Hong Mar 21 '17 at 06:54
  • there are changes in file names, please check the follow once, before you directly copy and test. – saikumarm Mar 21 '17 at 07:27
  • I changed file names, but still not solved.. anyway i really appreciate for your help :) – yh Hong Mar 21 '17 at 07:30
0

It is best not to try to update the values in place, but rather to write the output to a new temporary file. This script attempts the dictionary substitution on every column value and writes each row to the new temporary file. With this approach the file can be of any size, because it never needs to be loaded completely into memory.

The following approach should work:

import csv
import os

challenges = 'challenges.csv'
temp = '_temp.csv'

with open('forDic.csv', newline='') as f_fordic:
    mydic = {row[0] : row[1] for row in csv.reader(f_fordic)}

with open(challenges, newline='') as f_challenges, open(temp, 'w', newline='') as f_temp:
    csv_temp = csv.writer(f_temp)

    for row in csv.reader(f_challenges):
        csv_temp.writerow([mydic.get(c.strip(), c.strip()) for c in row])

# Rename the temp file back to challenges (optional)
os.remove(challenges)
os.rename(temp, challenges)

Giving you an updated challenges.csv file as follows:

2937,negative chin up,29,knee squeeze squat,
2938,negative chin up,30,squat,lunge
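
If the remove-then-rename step is a concern (a crash between the two calls would leave no challenges.csv at all), one possible variation is to swap the files with `os.replace`, which overwrites the destination in a single call. This is only a sketch: the function name `substitute_csv` and the demo file written to a temporary directory are assumptions made for illustration.

```python
import csv
import os
import tempfile


def substitute_csv(path, mapping):
    """Rewrite the CSV at `path`, swapping any cell that is a key of `mapping`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the swap stays on one filesystem.
    fd, temp_path = tempfile.mkstemp(suffix=".csv", dir=directory)
    try:
        with os.fdopen(fd, "w", newline="") as f_out, \
                open(path, newline="") as f_in:
            writer = csv.writer(f_out)
            for row in csv.reader(f_in):
                writer.writerow([mapping.get(c.strip(), c.strip()) for c in row])
        os.replace(temp_path, path)  # overwrite the original in one step
    except BaseException:
        os.remove(temp_path)  # do not leave a half-written temp file behind
        raise


# Demo on a throwaway copy of one row from the question.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "challenges.csv")
with open(demo_path, "w", newline="") as f:
    f.write("2938 ,58462bc9a559fa7d29819028 ,30 ,"
            "57eb63d713fd7c0329bdafb5 ,57eb63d713fd7c0329bdafb6\n")

substitute_csv(demo_path, {
    "58462bc9a559fa7d29819028": "negative chin up",
    "57eb63d713fd7c0329bdafb5": "squat",
    "57eb63d713fd7c0329bdafb6": "lunge",
})

with open(demo_path, newline="") as f:
    result = f.read()
print(result)
```

One caveat: `tempfile.mkstemp` creates the file readable and writable only by the current user, so on POSIX systems the rewritten file ends up with those permissions.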
Martin Evans