Not able to read csv while skipping first row and using second as header in pandas for raw tick data of symbols

Question

I have many CSV files of different sizes containing tick data for some symbols. Here is an image of one sample file.

Everything is in one column, separated by ';'. I want to read the data using the second row as the header and skipping the first row. So far I have tried everything I could find about loading a CSV file while skipping the first row and using the second row as the header. Here are some of the code snippets that I tried:

df = pd.read_csv(cwd + folder + name +'.csv',delimiter=';', skip_blank_lines=True, encoding='utf-8', skiprows=[0])

Another one is like this:

df = pd.read_csv(cwd + folder + name +'.csv',delimiter=';', encoding='utf-8', skiprows=[0], header=1)

The output of all of these is a single column named 'Unnamed: 0', with every value in the dataframe being NaN. I have tried different solutions, such as

Python Pandas read_csv skip rows but keep header, but none of them worked for me. If I do not skip the first row and read the file without any delimiter, it raises a Unicode error in Python. How can I solve this problem?

After trying the two solutions from the first two answers, this is my output for both:

can you do `print(cwd+folder+name+'.csv')` and share the output. — meW, Dec 27 '18 at 06:56
that is path to the file. If it is wrong then it throws error directly. But it is something like this `E:/pirimid/trader/LEAD.csv` — Urvish, Dec 27 '18 at 06:57
@Urvish - the code that you have used itself gives me a correct dataframe as you expect. Very sure the error is with the file read. — Jim Todd, Dec 27 '18 at 06:57
@Urvish are you getting some output (not NAN) if you read it as table `pd.read_table('E:/pirimid/trader/LEAD.csv')` — meW, Dec 27 '18 at 06:59
Here is the exact output of `print(cwd+folder+name+'.csv')`: `E:\Pirimid\trader data/Tick Files/LEADAPR.csv` — Urvish, Dec 27 '18 at 07:01
@Urvish path is correct, try reading it as table. Are you getting any NaN? — meW, Dec 27 '18 at 07:03
Yes still it gives NaN in all rows. I am trying to upload the data file but it is too large so taking some time. Will add the link to that in question once uploaded. — Urvish, Dec 27 '18 at 07:05
Yeah, but that will change the file itself. So if there is anything wrong with the file then I won't be able to know. That is why I uploaded the whole file. — Urvish, Dec 27 '18 at 07:13

Nihal · Accepted Answer · 2018-12-27T07:50:13.020

2

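In `skiprows` you can give the number of rows you want to skip from the top of your CSV (a list of row indices such as `[0]` is also accepted).

The file is encoded as UTF-16, not UTF-8, so use `encoding='utf-16'`: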

df = pd.read_csv(cwd + folder + name +'.csv',delimiter=';', encoding='utf-16', skiprows=1)
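A minimal, self-contained sketch of the same call, for anyone who wants to try it without the original file. The sample text, the placeholder first row, and the helper name `read_tick_csv` are made up for illustration; only the `read_csv` arguments come from the answer above.

```python
import os
import tempfile

import pandas as pd

# Made-up sample: one unwanted first row, then the real header, then two ticks.
SAMPLE = (
    "placeholder first row\n"
    "Date;bid;ask;last;volume\n"
    "2017 06 05;149.6;149.7;0.0;0.0\n"
    "2017 06 05;149.5;149.7;0.0;0.0\n"
)


def read_tick_csv(path):
    """Hypothetical helper: skip the first row and use the second as header."""
    return pd.read_csv(path, sep=";", encoding="utf-16", skiprows=1)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    # Writing with encoding="utf-16" puts a byte-order mark at the start,
    # which mimics a file exported as UTF-16.
    with open(path, "w", encoding="utf-16", newline="") as fh:
        fh.write(SAMPLE)
    df = read_tick_csv(path)

print(df.columns.tolist())
print(df)
```

With `skiprows=1` the first physical line is dropped and the next line becomes the header, so no separate `header` argument is needed.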

for more info:

To check the encoding, I opened the file in LibreOffice. Its import dialog, where you choose the delimiter, also shows the character encoding of the file.
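If LibreOffice is not at hand, a rough alternative is to look at the first bytes of the file for a byte-order mark (BOM). This is only a sketch: the function names `sniff_bom` and `sniff_file` are hypothetical, and a file without a BOM can still be in any encoding, so a result of `None` proves nothing.

```python
import codecs
from typing import Optional

# UTF-32 must be checked before UTF-16, because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(raw: bytes) -> Optional[str]:
    """Return a codec name if raw starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read the first four bytes of a file and check them for a BOM."""
    with open(path, "rb") as fh:
        return sniff_bom(fh.read(4))
```

The returned name can be passed as the `encoding` argument of `pd.read_csv`; the generic names `utf-16` and `utf-32` let Python pick the byte order from the BOM itself.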

edited Dec 27 '18 at 07:50

answered Dec 27 '18 at 06:50

Nihal


`Unnamed: 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN` This is the output using your code. – Urvish Dec 27 '18 at 06:51

please paste the top 10 rows of the data in your question – Nihal Dec 27 '18 at 06:52
@Nihal agree with you. – meW Dec 27 '18 at 06:54
I meant for you to post some data from the CSV, not the wrong output – Nihal Dec 27 '18 at 06:55
added link for one data file. – Urvish Dec 27 '18 at 07:10
@Urvish there is a problem with your file: it can't be decoded as utf-8. – Nihal Dec 27 '18 at 07:27
if you remove the first row then it can be read and used as well. But you have to edit the file manually then – Urvish Dec 27 '18 at 07:31

score 0 · Answer 2 · answered Dec 27 '18 at 06:59


@Urvish - I used the same code as in your post and the output is as expected. Please check your file.

import pandas as pd
df = pd.read_csv("C:\\Users\\user\\Downloads\\sof.csv" ,delimiter=';', skip_blank_lines=True, encoding='utf-8', skiprows=[0])
print(df)

Output:

            Date    bid    ask  last  volume
2017 06 05   799  149.6  149.7   0.0     0.0
2017 06 05   799  149.6  149.7   0.0     0.0

answered Dec 27 '18 at 06:59

Jim Todd


What are your thoughts about what could go wrong in a file? I am not sure if anything could be wrong with that. – Urvish Dec 27 '18 at 07:03

Hope my guess is correct and happy that the problem is solved. – Jim Todd Dec 28 '18 at 15:42
