I am a Python beginner trying to solve this task: I have 125 .csv files (48 rows and 5 columns each), and I want to create a new file that contains the first row and the last row of every .csv file I have, with each pair written as a single row.
-
Please write the exact formatting of the files. It's important for the parsing. – 098799 Mar 28 '17 at 16:06
-
There are lots of subquestions here: how to generate the list of files (assuming you're not entering them all by hand); how to read a csv; how to store the first and last row; how to combine rows; and how to write a csv. Each of them has been addressed in different ways on SO before, so I'd recommend breaking your problem up, reading the docs/SO questions, and trying some things yourself. – DSM Mar 28 '17 at 16:10
-
You will find all the basic information on a site like https://wiki.python.org/moin/BeginnersGuide. Then, to take the last row, see http://stackoverflow.com/questions/38704949/read-the-last-n-lines-of-a-csv-file-in-python-with-numpy-pandas – Dadep Mar 28 '17 at 16:11
-
Well, this is pretty broad, but once you figure out how to generate the list of files you could use pandas read_csv() to load each file into a DataFrame and take just the first and last rows. You then have to figure out how you want them represented in a single row. You can build a new pandas DataFrame out of these and use to_csv() to write the results to a new csv. – John Morrison Mar 28 '17 at 16:13
1 Answer
To get you started, here is how you can generate the list of files and open them using pandas. The code builds a list of csv files from a directory, iterates over the list, and opens each file as a pandas DataFrame. It then collects the first and last rows of each csv file into a new DataFrame. I am not sure how you want to create one row out of two, though, so hopefully this is a starting point for you.
import os
import pandas as pd

# Get all .csv files in the current directory, or pass another directory to listdir.
# The output file is skipped so that re-running the script does not read it back in.
csv_files = sorted(
    f for f in os.listdir(".") if f.endswith(".csv") and f != "resulting.csv"
)

# Load every file as a DataFrame, keyed by file name.
dataframes = {name: pd.read_csv(name) for name in csv_files}

# Collect the first and last row of each DataFrame (loaded csv).
rows = []
for name, df in dataframes.items():
    rows.append(df.iloc[0])
    rows.append(df.iloc[-1])

# DataFrame.append was removed in pandas 2.0, so build the result in one step.
result_df = pd.DataFrame(rows)

# Write to a csv file.
result_df.to_csv("resulting.csv")
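
The open point above, how to turn the first and last rows into one row, could be handled along the following lines. This is only a sketch under some assumptions: every file has a header row (the pandas read_csv default), all files share the same columns, and placing the last row's values to the right of the first row's values is the layout you want. The function names `first_last_as_one_row` and `combine_first_last` and the `_first`/`_last` column suffixes are made up for illustration.

```python
import os

import pandas as pd


def first_last_as_one_row(df, name):
    """Return one Series holding the first-row values followed by the last-row values."""
    # add_suffix renames the labels, e.g. "x" becomes "x_first" / "x_last",
    # so the two rows do not collide when they are joined.
    first = df.iloc[0].add_suffix("_first")
    last = df.iloc[-1].add_suffix("_last")
    # rename(name) sets the Series name, which becomes the row label later.
    return pd.concat([first, last]).rename(name)


def combine_first_last(paths):
    """Build a DataFrame with one row per csv file in `paths`."""
    rows = [
        first_last_as_one_row(pd.read_csv(path), os.path.basename(path))
        for path in paths
    ]
    return pd.DataFrame(rows)
```

You could then call something like `combine_first_last(sorted(glob.glob("*.csv"))).to_csv("combined.csv")` after `import glob`. Writing the output to a different directory, or excluding it from the file list, avoids reading it back in on the next run.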

John Morrison