I have a CSV file that I can read with Python, and I managed to delete the rows in the middle, but I cannot write the remaining data out as a new CSV.

I have tried this code:

import pandas as pd
import csv
df = pd.read_csv('/content/Final_Data.csv',error_bad_lines= False)
df.head()
data = df.drop(columns='-BEGIN HEADER-')

print(data)

with open('example.csv', 'w') as file:
    writer = csv.writer(file)
    writer.writerow(data)

[This is the image link of my data.](https://i.stack.imgur.com/hJPCl.jpg)

pandas treats -BEGIN HEADER- as a single column, but it only reads rows up to the -END HEADER- row. I tried dropping the -BEGIN HEADER- column, but that only removes the values up to the -END HEADER- row. My dataframe only contains values up to the 11th column.

Please help me to access the remaining data using python.

  • Does this answer your question? [Extract Values between two strings in a text file using python](https://stackoverflow.com/questions/18865058/extract-values-between-two-strings-in-a-text-file-using-python) – Ken Y-N Oct 14 '22 at 06:56

1 Answer

For a variable number of header lines, you can:

  1. Read the full file
  2. Remove the header by splitting on a specific string
  3. Feed the remaining content to pandas as a StringIO object

from io import StringIO
import pandas as pd

# read entire file_content
with open('/content/Final_Data.csv') as file:
    file_content = file.read()

# remove header from file_content
header_end = "-END HEADER-"
csv_content = file_content.split(header_end)[-1]


# get dataframe from remaining csv_content
df = pd.read_csv(StringIO(csv_content))
print(df.head())
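
To get the remaining data as a new CSV, something along these lines might work. This is only a sketch: the `sample` string is a made-up stand-in for `Final_Data.csv` (the real layout may differ), `read_after_header` is a hypothetical helper name, and `example.csv` is the output name taken from the question. Unlike the snippet above, it also drops whatever is left on the `-END HEADER-` line itself, on the assumption that the marker line may carry trailing commas.

```python
from io import StringIO
import pandas as pd

# Hypothetical sample standing in for Final_Data.csv; the real layout may differ.
sample = (
    "-BEGIN HEADER-\n"
    "Device,ABC\n"
    "Date,2022-10-14\n"
    "-END HEADER-\n"
    "time,value\n"
    "0,1.5\n"
    "1,2.5\n"
)

def read_after_header(file_content, header_end="-END HEADER-"):
    """Parse only the part of file_content that follows the header_end line."""
    if header_end in file_content:
        # keep everything after the last marker, then drop the rest of that line
        remainder = file_content.split(header_end)[-1]
        _, _, file_content = remainder.partition("\n")
    return pd.read_csv(StringIO(file_content))

df = read_after_header(sample)

# write the remaining data to a new CSV; index=False omits the row index column
df.to_csv("example.csv", index=False)
```

`DataFrame.to_csv` writes the column names and every row in one call. The `writer.writerow(data)` call in the question writes a single row only, because iterating over a DataFrame yields its column labels rather than its rows.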
Christian Karcher