I have a large tab-delimited txt file that I want to read in Python, including the headers. I found this Stack Overflow question, but it doesn't show how to specify the number of rows to read while also setting the delimiter and line break: Read first N lines of a file in python
1 Answer
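pandas can do this for you: `read_csv` takes the delimiter through `sep` and, by default, uses the first row of the file as the column headers.
import pandas as pd
df = pd.read_csv(myfile, sep='\t')  # myfile is the path to your tab-delimited file
df.head(n=5)  # the first 5 rows of your file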
For more info, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html#pandas.read_csv
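Note that `head` only trims the result after the whole file has been parsed. If only the first N rows are needed, `read_csv` can stop early through its `nrows` parameter. Below is a minimal sketch, assuming a hypothetical helper `read_first_rows` and a small sample file written only to make the example runnable; neither comes from the original question. The header row is not counted toward `nrows`.

```python
import os
import tempfile

import pandas as pd


def read_first_rows(path, n, sep="\t"):
    """Read the header plus the first n data rows of a delimited file.

    Hypothetical helper for illustration; not part of pandas.
    """
    # nrows limits parsing to n data rows, so the remaining rows
    # are never turned into a DataFrame.
    return pd.read_csv(path, sep=sep, nrows=n)


# Hypothetical sample data standing in for the real file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="") as tmp:
    tmp.write("id\tname\tscore\n")
    for i in range(100):
        tmp.write(f"{i}\tuser{i}\t{i * 2}\n")
    sample_path = tmp.name

first_five = read_first_rows(sample_path, 5)
print(list(first_five.columns))
print(len(first_five))
os.remove(sample_path)
```

If the file uses an unusual line break, `read_csv` also accepts a `lineterminator` argument, which is supported by the C parser only.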

Lawrence
- Thanks! I ran into memory issues. Is there a read function that doesn't load the entire file into memory first? I'm looking at a txt file with 45,000,000 records... – Scott Davis Feb 09 '19 at 00:24
- @ScottDavis try using `df = pd.read_csv(myfile,sep='\t',low_memory=False)`, or try using chunks. I've never had this problem before, but my files are much smaller. Good luck! – Lawrence Feb 11 '19 at 02:53
- Thanks Lawrence! I used the suggested code and still hit the "pandas.errors.ParserError: Error tokenizing data. C error: out of memory" – Scott Davis Feb 11 '19 at 23:35
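The "try using chunks" suggestion can be sketched as follows. According to the pandas documentation, `low_memory=False` turns off the internal chunked parsing that keeps memory use down, and the whole file still ends up in one DataFrame either way, so it is unlikely to help with an out-of-memory error. Passing `chunksize` instead makes `read_csv` return an iterator of smaller DataFrames. The helper name `count_rows_in_chunks`, the sample file, and the chunk size below are assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd


def count_rows_in_chunks(path, chunk_rows, sep="\t"):
    """Stream a delimited file and return (total row count, column names).

    Hypothetical helper for illustration; not part of pandas.
    """
    total = 0
    columns = None
    # With chunksize set, read_csv yields DataFrames of at most
    # chunk_rows rows each instead of building one large DataFrame.
    for chunk in pd.read_csv(path, sep=sep, chunksize=chunk_rows):
        if columns is None:
            columns = list(chunk.columns)  # headers are available on every chunk
        total += len(chunk)  # replace with the real per-chunk processing
    return total, columns


# Hypothetical sample data standing in for the 45,000,000-record file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="") as tmp:
    tmp.write("id\tvalue\n")
    for i in range(1000):
        tmp.write(f"{i}\t{i * 3}\n")
    sample_path = tmp.name

total_rows, column_names = count_rows_in_chunks(sample_path, chunk_rows=250)
print(total_rows, column_names)
os.remove(sample_path)
```

Only one chunk is held in memory at a time, so peak usage depends on the chunk size rather than on the file size, provided the per-chunk results that are kept stay small.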