I am running the following Python code, using pandas, to read a number of CSV files.
import pandas as pd

def create_formatted(csv_files):
    for f in csv_files:
        df = pd.read_csv(f)
However, the header can start on a different row in each file. Every file begins with 3-5 rows that contain just a single column of data, and then the header starts, at which point the number of fields increases from 1 to 13. A simplified example would be:
CSV File 1:
| Row 1 |
| Row 2 |
| Row 3 |
| Header A | Header B | Header C |
| -------- | -------- | -------- |
| Cell 1 | Cell 2 | Cell 3 |
| Cell 4 | Cell 5 | Cell 6 |
CSV File 2:
| Row 1 |
| Row 2 |
| Row 3 |
| Row 4 |
| Header A | Header B | Header C |
| -------- | -------- | -------- |
| Cell 1 | Cell 2 | Cell 3 |
| Cell 4 | Cell 5 | Cell 6 |
I have tried the 'skiprows' parameter, as shown below, but a fixed value does not work because the header can start on a different row in each CSV file.
def create_formatted(csv_files):
    for f in csv_files:
        df = pd.read_csv(f, skiprows=3)
Is there another workaround I can try that ignores the first few lines until the header row starts?
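
To make the goal concrete, here is a minimal sketch of the kind of approach I have in mind. It is not a tested solution. It assumes the files are comma-delimited, that the header is the first line with more than one field, and that the preamble lines contain no quoted line breaks. The helper names `find_header_row` and `read_with_detected_header` and the `min_fields` parameter are hypothetical, made up for illustration.

```python
import csv

import pandas as pd


def find_header_row(lines, min_fields=2):
    """Return the 0-based index of the first row with at least min_fields fields.

    Hypothetical helper: `lines` is any iterable of text lines, such as an open file.
    """
    for i, row in enumerate(csv.reader(lines)):
        if len(row) >= min_fields:
            return i
    raise ValueError("no row with enough fields was found")


def read_with_detected_header(path, min_fields=2):
    """Skip the single-column preamble, then let pandas parse from the header on."""
    # First pass: locate the header row (assumes one physical line per row).
    with open(path, newline="") as fh:
        skip = find_header_row(fh, min_fields)
    # Second pass: skip exactly that many lines; the next line becomes the header.
    return pd.read_csv(path, skiprows=skip)


def create_formatted(csv_files):
    # Returns one DataFrame per input file, in the same order.
    return [read_with_detected_header(f) for f in csv_files]
```

Since the real files go from 1 field to 13, passing `min_fields=13` would be a stricter check than the default. The cost of this sketch is that each file is opened twice, once to find the header and once to parse it.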