New file from data from multiple files

Question

I am a Python beginner trying to solve this task: I have multiple (125) .csv files (48 rows and 5 columns each), and I am trying to make a new file that contains the first row and the last row (written in a single row) from every .csv file I have.

Please write the exact formatting of the files. It's important for the parsing. — 098799, Mar 28 '17 at 16:06
There are lots of subquestions here: how to generate the list of files (assuming you're not entering them all by hand); how to read a csv; how to store the first and last row; how to combine rows; and how to write a csv. Each of them has been addressed in different ways on SO before, so I'd recommend breaking your problem up, reading the docs/SO questions, and trying some things yourself. — DSM, Mar 28 '17 at 16:10
You will find all basic information in any site like : https://wiki.python.org/moin/BeginnersGuide then to take the last row : http://stackoverflow.com/questions/38704949/read-the-last-n-lines-of-a-csv-file-in-python-with-numpy-pandas — Dadep, Mar 28 '17 at 16:11
Well this is pretty broad but when you figure out how to generate the list of files you could use Pandas read_csv() to open a Dataframe with just the first and last columns. You then have to figure out how you want them represented in a single row. You can build a new Pandas Dataframe out of these and use Pandas to_csv() to make a new csv out of the results. — John Morrison, Mar 28 '17 at 16:13

score 0 · Answer 1 · answered Mar 28 '17 at 17:55

To get you started, here is how you can generate the list of files and open them using Pandas. The code lists the CSV files in a directory, iterates over the list, and opens each one as a Pandas DataFrame. It then builds a new DataFrame from the first and last rows of each CSV file. I am not sure how you want to create one row out of two, though, so hopefully this is a starting point for you.

import os
import pandas as pd

# List the CSV files in the current directory (or pass another directory to
# os.listdir), skipping the output file so a rerun does not read it back in.
csv_files = sorted(
    f for f in os.listdir(".")
    if f.endswith(".csv") and f != "resulting.csv"
)

# Load each file as a DataFrame, keyed by file name.
# Note: read_csv treats the first line as a header by default; pass
# header=None if the files have no header line.
dataframes = {name: pd.read_csv(name) for name in csv_files}

# Collect the first and last row of each DataFrame (loaded CSV).
# DataFrame.append was removed in pandas 2.0, so gather the rows and
# concatenate them once.
rows = []
for name, df in dataframes.items():
    rows.append(df.iloc[[0]])
    rows.append(df.iloc[[-1]])
result_df = pd.concat(rows)

# Write to a CSV file.
result_df.to_csv("resulting.csv")
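The answer leaves open how to turn the two rows into one. One possible way, offered only as a sketch, is to place the last row's values next to the first row's values under prefixed column names. The function name `combine_first_last`, the `first_`/`last_` prefixes, and the `source_file` column are hypothetical choices for illustration, and `header=None` assumes the files have no header line (the question does not say either way).

```python
import glob
import os

import pandas as pd


def combine_first_last(directory, output_path, header=None):
    """Write one row per CSV file: its first row followed by its last row.

    header=None is an assumption that the files have no header line;
    pass header=0 if the first line holds column names.
    """
    paths = sorted(glob.glob(os.path.join(directory, "*.csv")))
    records = []
    for path in paths:
        # Skip the output file in case it lives in the same directory.
        if os.path.abspath(path) == os.path.abspath(output_path):
            continue
        df = pd.read_csv(path, header=header)
        # Hypothetical layout: file name, then first-row and last-row values.
        record = {"source_file": os.path.basename(path)}
        record.update({f"first_{col}": val for col, val in df.iloc[0].items()})
        record.update({f"last_{col}": val for col, val in df.iloc[-1].items()})
        records.append(record)
    result = pd.DataFrame(records)
    result.to_csv(output_path, index=False)
    return result
```

Calling `combine_first_last(".", "resulting.csv")` would then produce one line per input file. Keeping the file name in each row makes it possible to trace a combined row back to its source, which seems useful with 125 inputs, but the column can be dropped if it is not wanted.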
