3

I would like to update specific columns in a csv file. I want to update the first column in the function func1 and the second column in function func2:

def func1(x):
    data = 'test1'
    file = open("test.csv","a+")
    file.write(data)
    file.close()
    return data

def func2(x):
    data = 'test2'
    file = open("test.csv","a+")
    file.write(data)
    file.close()
    return data

Expected out

col1,col2
test1,test2

How do I specify the column to be updated? Here, no file is getting created.

4 Answers4

5

You can use the pandas package to update specific columns (If you don't have it, you can install it by running pip install pandas in your terminal):

import pandas as pd

def func1(file):        
    column = 'col1'
    data = 'new_test1'
    file[column] = data
    return data
def func2(file):  
    column = 'col2'      
    data = 'new_test2'
    file[column] = data
    return data

# you should open the file only once, you don't need to open it seperately within each function
df = pd.read_csv('data.csv')

print(df)
#    col1   col2
# 0  test1  test2
# the '0' is an index column, you can remove it when writing to the file (using index=False, see below)

data1 = func1 (df)
data2 = func2 (df)

df.to_csv('data.csv', index=False) # with 'index=False' you won't see an index column in data.csv

print(df)
#        col1       col2
# 0  new_test1  new_test2

This question may also be useful for you

Ali Atiia
  • 139
  • 7
2

You can use the module csv. For example, if you have the following data:

col1,col2
Baked,Beans
Lovely,Spam

You can use the function:

import csv, os
​
def func(file, **values):
    with open(file) as fin, open('temp', 'w') as fout:
        reader = csv.DictReader(fin)

        # get column names
        first_row = next(reader)
        cols = list(first_row.keys())

        writer = csv.DictWriter(fout, fieldnames=cols)
        writer.writeheader()

        # update the first row
        first_row.update(values)
        writer.writerow(first_row)

        # update the rest
        for row in reader:
            row.update(values)
            writer.writerow(row)
        os.replace('temp', file)

func('test.csv', col1='test1', col2='test2')

Result:

col1,col2
test1,test2
test1,test2

The advantage of this solution is that you don't have to load the whole file into memory.

Mykola Zotko
  • 15,583
  • 3
  • 71
  • 73
0

Since you're tagging pandas, I will go for a pandas solution:

import pandas as pd
df = pd.read_csv('data.csv')

def func1(df):
    df['col1'] = 'test1'
    return df

def func2(df):
    df['col2'] = 'test2'
    return df

And now:

df = func1(df)
df = func2(df)
print(df)

Is:

    col1   col2
0  test1  test2

But seriously just use:

import pandas as pd
df = pd.read_csv('data.csv')
df['col1'] = 'test1'
df['col2'] = 'test2'
U13-Forward
  • 69,221
  • 14
  • 89
  • 114
0

I get that you want to use two functions to update the value, but as others have already provided great answers, I am showing an alternative way on how you can use one function to get the flexibility.

CSV file

data.csv

col1,col2
foo1,foo2
bar1,bar2
foobar1,foobar2

Code

import pandas as pd
df = pd.read_csv('data.csv')

def func(rows, col, elem):
    global df
    if isinstance(col, str):     # check if column name is passed
        df.iloc[rows, df.columns.get_loc(col)] = elem
    else:                        # check if column index is passed
        df.iloc[rows, col] = elem

func([0,2], 'col1', 'test1')  # you can specify column by name, or
func(1, 1, 'test2')           # specify column by index

df.to_csv('data.csv', index = False)

print(df)

#     col1     col2
# 0  test1     foo2
# 1   bar1    test2
# 2  test1  foobar2

Advantage is that, you can update multiple rows at a time, and can use either index or name to specify columns.

Note only that, if you want to update multiple rows, with different values, you can do that too:

print(df)

#     col1     col2
# 0  test1     foo2
# 1   bar1    test2
# 2  test1  foobar2

func([0,2], 'col1', ['new_test1', 'new_test2'])

print(df)

#         col1     col2
# 0  new_test1     foo2
# 1       bar1    test2
# 2  new_test2  foobar2
Sayandip Dutta
  • 15,602
  • 4
  • 23
  • 52