
I am stuck on this one, and I am sure the answer is simple. I am new to Python and still learning my way around.

I am working on a small project that needs to look at the contents of all .csv files in a directory. Each file has only one row with two columns (ID and DateTime). I have the following Python script in the same directory:

import glob

path = '*.csv'
files = glob.glob(path)
for file in files:
    # the with block closes each file once it has been read
    with open(file, 'r') as f:
        print(f.readlines())

This prints the data I need to the terminal. From here I would like to write that data to a new .csv file with the same two columns, so that all of the single-row files are consolidated into one file that I can then work with. How can I take the data that is printed to the terminal and create a new .csv file from it?

Many thanks

3 Answers


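Why use print('%s' % f.readlines()) when you can collect all your data in a list with lines.extend(f.readlines())? Use extend rather than append here, so that lines stays a flat list of strings, which is what writelines expects. You can also use f.read(), which reads the entire file as a single string.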

To write the collected lines to a new file, you can do this:

with open('new_file.csv', 'w') as file:
    file.writelines(lines)
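
The collect-then-write idea above can be sketched end to end as follows. This is only an illustration under some assumptions: the function name consolidate_csv and the output name are made up for the example, and it uses the standard csv module instead of raw lines, because an input file that does not end with a newline would otherwise have its row joined onto the next one.

```python
import csv
import glob
import os


def consolidate_csv(pattern, out_path):
    """Copy every row from the files matching `pattern` into `out_path`.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    out_abs = os.path.abspath(out_path)
    rows = []
    for name in sorted(glob.glob(pattern)):
        # Skip the output file in case it matches the pattern on a re-run.
        if os.path.abspath(name) == out_abs:
            continue
        with open(name, newline='') as src:
            # Ignore completely empty lines, keep everything else as-is.
            rows.extend(row for row in csv.reader(src) if row)
    with open(out_path, 'w', newline='') as dst:
        csv.writer(dst).writerows(rows)
    return len(rows)
```

Calling consolidate_csv('*.csv', 'consolidated.csv') from the directory that holds the files would then write one row per input file.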
Pranav Void

If you only need to read and write the contents of files, Pranav Void's approach will work perfectly fine. But if you want to work with csv files in general, you should consider having a look at the pandas library. Some of its functions could be really useful for you, such as pandas.read_csv(filename) and the to_csv(filename) method of the DataFrame class. The program you need could look as follows (using glob and pandas):

Python 3.7.1

from glob import glob
import pandas as pd

path = '*.csv'
filenames = glob(path)

for filename in filenames:
    # df is short for DataFrame, the table type provided by pandas
    df = pd.read_csv(filename)
    # print the first 5 rows of the DataFrame for a quick overview
    print(df.head())

    # write the DataFrame to a new file with 'edited_' prepended to the name,
    # e.g. 'my_csv.csv' is written to 'edited_my_csv.csv'
    df.to_csv('edited_' + filename)

There are plenty of options for modifying the data in a DataFrame, too many to cover in this answer. If you want to keep working with csv files, take a look at the rest of the pandas library; it offers a wide range of tools.

halfer
byTreneib
  • If you have any questions about the answer or the code, feel free to ask! – byTreneib Dec 25 '19 at 12:20
  • This example creates a new .csv file for each existing .csv file. Thanks for pointing out the pandas library, I spent some time reviewing and have a solution. – CharlieTangoAU Dec 29 '19 at 13:13
  • If you have a closer look at the pandas library you will find an easy way to merge the content of all those files together. Happy to help. – byTreneib Jan 01 '20 at 23:00

Here is one more way, using the pandas library. First, read all the csv files with the pandas read_csv function inside a generator. Then pass that generator to pd.concat to consolidate all the DataFrames into a single DataFrame, and finally write that DataFrame to a csv file.

    import pandas as pd
    import glob
    path = '*.csv'
    all_files = glob.glob(path) 
    df_generator = (pd.read_csv(f) for f in all_files)
    consolidated_df = pd.concat(df_generator, ignore_index=True)
    consolidated_df.to_csv('consolidated_file')
Sachin Kharude
  • For some reason this example omitted column 2 of the last file in the data read. On spending some time reviewing the pandas library I have a solution. Thanks for your time and input on this. – CharlieTangoAU Dec 29 '19 at 13:14
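
One possible explanation for the missing data reported in the comment above, assuming the input files contain a single data row and no header row: by default read_csv treats the first line of each file as column names, so the only row never ends up as data. The sketch below passes header=None and supplies the column names explicitly. The function name, the output file name and the default column labels are assumptions made for this example, not part of the original answers.

```python
import glob
import os

import pandas as pd


def consolidate_with_pandas(pattern, out_path, columns=('ID', 'DateTime')):
    """Concatenate header-less csv files into one csv file.

    Hypothetical helper; assumes every input file has exactly the given columns.
    """
    out_abs = os.path.abspath(out_path)
    # Leave out the output file so a re-run does not read its own result.
    inputs = [f for f in sorted(glob.glob(pattern))
              if os.path.abspath(f) != out_abs]
    # header=None tells pandas that the first line is data, not column names.
    frames = [pd.read_csv(f, header=None, names=list(columns)) for f in inputs]
    # pd.concat raises ValueError if no input files were found.
    combined = pd.concat(frames, ignore_index=True)
    # index=False keeps the row index out of the file, so the output
    # has only the two named columns.
    combined.to_csv(out_path, index=False)
    return combined
```

For example, consolidate_with_pandas('*.csv', 'consolidated_file.csv') would write a header line followed by one row per input file. If the files do have a header row, drop header=None and names and let read_csv infer the column names instead.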