I am trying to combine multiple CSV files into one CSV file in a Python script, skipping the first 5 lines of each file. I am new to Python and have tried several examples I found, but they all seem to have trouble with the working directory. Here is my latest attempt:

import pandas as pd
import csv
import glob
import os

path = '//server01/tmp/'
files_in_dir = [f for f in os.listdir(path) if f.endswith('csv')]
count = 0
for filenames in files_in_dir:
    df = pd.read_csv(filenames)
    if count < 6:
            count += 1
            continue
    df.to_csv('out.csv', mode='a')

Any help would be appreciated. Thanks!

Paul
  • Hmm, you want to combine the data frames, right? When you call pd.read_csv it reads in the whole file at once, so the count doesn't work. – StupidWolf Feb 24 '20 at 22:12
  • Does `read_csv()` work if you read the whole thing, or do those first 5 lines break it? If the former, you can just read the CSV and omit those 5 lines when appending to a master dataframe. – elPastor Feb 24 '20 at 23:19

1 Answer

Try this:

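import os

import pandas as pd

path = '//server01/tmp/'
# os.listdir returns bare file names, so join each one with the directory;
# otherwise read_csv looks for the file in the current working directory.
files_in_dir = [os.path.join(path, f) for f in os.listdir(path) if f.endswith('csv')]
for filename in files_in_dir:
    # skiprows=5 drops the first 5 lines of each file before parsing
    df = pd.read_csv(filename, skiprows=5)
    # mode='a' appends to out.csv instead of overwriting it
    df.to_csv('out.csv', mode='a')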

skiprows: number of lines to skip at the start of the file (5 here)

nrows: number of rows of the file to read (optional; not used above)
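
One thing to watch with the loop above: appending with mode='a' writes the column header (and the row index) again for every file. A possible variation, sketched below, collects the frames first and writes them once with pd.concat. The function name combine_csvs and its parameters are made up for illustration, and the sketch assumes that the line right after the 5 skipped lines is the header and that all files share the same columns.

```python
import glob
import os

import pandas as pd


def combine_csvs(src_dir, out_file, skip=5):
    """Combine every *.csv in src_dir into out_file (hypothetical helper).

    Assumes the line after the first `skip` lines of each file is the header
    and that all files have the same columns.
    """
    # Build full paths so the files are found regardless of the working directory.
    paths = sorted(glob.glob(os.path.join(src_dir, '*.csv')))
    # skiprows=skip drops the leading lines; the next line is read as the header.
    frames = [pd.read_csv(p, skiprows=skip) for p in paths]
    # Stack all frames into one and renumber the rows from 0.
    combined = pd.concat(frames, ignore_index=True)
    # index=False leaves the row index out; the header is written only once.
    combined.to_csv(out_file, index=False)
    return combined
```

It could be called as combine_csvs('//server01/tmp/', 'out.csv'). If the output file is written into the same folder as the inputs, it would be picked up as an input on the next run, so writing it elsewhere is probably safer. pd.concat also raises an error when the folder contains no CSV files.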

Riet
ipj