output from readlines() into a list

Question

I am stuck on this one, and I am sure the answer is a simple one. I am new to Python and am learning my way around.

I am working on a small project that needs to look at the content of all .csv files in a directory. Each file has only 1 row, with 2 columns (ID and DateTime). I have a Python script in the same directory, as follows:

import glob

path = '*.csv'
files = glob.glob(path)
for file in files:
    # Open each file in a with block so it is closed after reading
    with open(file, 'r') as f:
        print('%s' % f.readlines())

This prints the data I need to the terminal. From here, I would like to take this data and create another .csv file with the same 2 columns, so that all of the single-row data files are consolidated into one file. Once I have this new consolidated .csv file, I can work with it. How can I take all the data printed to the terminal and write it to a new .csv file?

Many thanks

Have you tried the `file` API of Python? https://stackoverflow.com/questions/6159900/correct-way-to-write-line-to-file — panoskarajohn, Dec 25 '19 at 11:53

score 0 · Answer 1 · answered Dec 25 '19 at 11:49

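Why use print('%s' % f.readlines()) when you can save all your data in a list? Create lines = [] before the loop and call lines.extend(f.readlines()) inside it. Using extend keeps lines a flat list of strings, whereas append would add one nested list per file, which writelines cannot write. You can also use f.read(), which reads the entire file as a single string.

If you then want to write the collected lines to a file, you can: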

with open('new_file.csv', 'w') as file:
    file.writelines(lines)
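
Putting the two pieces together, here is a minimal sketch of the whole read-then-write loop. The function name consolidate_lines and its directory and output_name parameters are made up for illustration. It assumes the files have no header row, skips blank lines, and adds a newline to every row so that a file without a trailing newline does not run into the next one.

```python
import glob
import os


def consolidate_lines(directory, output_name="new_file.csv"):
    """Copy every non-blank line of every .csv file in `directory` into one file."""
    output_path = os.path.join(directory, output_name)
    lines = []
    for filename in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        # Skip the output file itself so a second run does not read it back in.
        if os.path.abspath(filename) == os.path.abspath(output_path):
            continue
        with open(filename, "r") as f:
            for line in f:
                if line.strip():
                    # Normalise the line ending so rows never run together.
                    lines.append(line.rstrip("\r\n") + "\n")
    with open(output_path, "w") as out:
        out.writelines(lines)
    return lines
```

Calling consolidate_lines('.') would then write new_file.csv next to the script. Because the output is written into the same directory that is being globbed, the check against output_path matters on every run after the first.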

score 0 · Answer 2 · edited Dec 27 '19 at 19:35

If it is just about reading and writing file content, Pranav Void's way will work perfectly fine. But if you want to work with csv files in general, you should consider having a look at the pandas library. Some of its functions can be really useful for you, such as pandas.read_csv(filename) and the to_csv(filename) method of the DataFrame class. The program you need could look like this (using glob and pandas):

Python 3.7.1

from glob import glob
import pandas as pd

path = '*.csv'
filenames = glob(path)

for filename in filenames:
    df = pd.read_csv(filename)  # df is short for DataFrame, the table type of the pandas library
    print(df.head())  # prints the first 5 rows of the dataframe read from the file, as a quick overview

    # Prefix the filename with 'edited_' and write the dataframe to a csv file with that name,
    # so 'my_csv.csv' is written to 'edited_my_csv.csv'. index=False leaves out the row-index column.
    df.to_csv('edited_' + filename, index=False)

If you want to modify the data in the dataframe, there are plenty of options, but covering them would be too much for this answer. If you want to work further with csv files, take a look at the full functionality of the pandas library and you will find a load of tools to work with.

If you have any questions about the answer or the code, feel free to ask! — byTreneib, Dec 25 '19 at 12:20
This example creates a new .csv file for each existing .csv file. Thanks for pointing out the pandas library, I spent some time reviewing and have a solution. — CharlieTangoAU, Dec 29 '19 at 13:13
If you have a closer look at the pandas library you will find an easy way to merge the content of all those files together. Happy to help. — byTreneib, Jan 01 '20 at 23:00

score 0 · Answer 3 · answered Dec 25 '19 at 14:56

One more way, using the Python pandas library: read all the csv files with the pandas read_csv function through a generator, pass that generator to pd.concat to consolidate all the data frames into a single data frame, and then write that data frame to a csv file.

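    import pandas as pd
    import glob
    path = '*.csv'
    all_files = glob.glob(path)
    # Lazily read each file into its own data frame
    df_generator = (pd.read_csv(f) for f in all_files)
    # Stack the data frames and renumber the rows from 0
    consolidated_df = pd.concat(df_generator, ignore_index=True)
    # index=False keeps the row-index column out of the output file
    consolidated_df.to_csv('consolidated_file', index=False)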
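
One thing to check with the code above: by default pd.read_csv treats the first row of each file as a header, so a single-row file with no header row produces an empty data frame whose column names are the data values. The sketch below assumes the files have no header row and labels the columns ID and DateTime, as described in the question. The function name consolidate_csvs and the output name consolidated_file.csv are made up for illustration.

```python
import glob
import os

import pandas as pd


def consolidate_csvs(directory, output_name="consolidated_file.csv"):
    """Merge all header-less two-column .csv files in `directory` into one file."""
    output_path = os.path.join(directory, output_name)
    # Leave out the output file so a second run does not merge it into itself.
    input_files = [
        f for f in sorted(glob.glob(os.path.join(directory, "*.csv")))
        if os.path.abspath(f) != os.path.abspath(output_path)
    ]
    # header=None tells pandas the first row is data, not column names.
    # dtype=str keeps IDs and timestamps exactly as written in the files.
    frames = (
        pd.read_csv(f, header=None, names=["ID", "DateTime"], dtype=str)
        for f in input_files
    )
    # pd.concat raises ValueError if there are no input files at all.
    consolidated_df = pd.concat(frames, ignore_index=True)
    consolidated_df.to_csv(output_path, index=False)
    return consolidated_df
```

If the files do have a header row, drop header=None and names=... and the default behaviour of read_csv is what you want.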

For some reason this example omitted column 2 of the last file in the data read. On spending some time reviewing the pandas library I have a solution. Thanks for your time and input on this. — CharlieTangoAU, Dec 29 '19 at 13:14
