
I am working with a CSV of restaurant reviews, where each row holds a comment and a value of 1 or 0 based on the stars given. Some comments contain commas, so pd.read_csv() treats those commas as separators and raises an error. How can I fix this?

I tried pd.read_csv(path, on_bad_lines='skip') and error_bad_lines=False, but neither solved the problem.

Basically, the 1 or 0 value ends up on the review side and I am left with a NaN value, which causes problems later in the code.

This is the CSV:

Review,Liked
Wow... Loved this place.,1
Crust is not good.,0
Not tasty and the texture was just nasty.,0
Stopped by during the late May bank holiday off Rick Steve recommendation and loved it.,1
The selection on the menu was great and so were the prices.,1
Now I am getting angry and I want my damn pho.,0
Honeslty it didn't taste THAT fresh.),0
The potatoes were like rubber and you could tell they had been made up ahead of time being kept under a warmer.,0
The fries were great too.,1
A great touch.,1
Service was very prompt.,1
Would not go back.,0
The cashier had no care what so ever on what I had to say it still ended up being wayyy overpriced.,0
I tried the Cape Cod ravoli, chicken, with cranberry...mmmm!,1
I was disgusted because I was pretty sure that was human hair.,0
I was shocked because no signs indicate cash only.,0
Highly recommended.,1
Waitress was a little slow in service.,0
This place is not worth your time, let alone Vegas.,0
did not like at all.,0
The Burrittos Blah!,0
The food, amazing.,1
Service is also cute.,1
I could care less... The interior is just beautiful.,1
So they performed.,1
That's right....the red velvet cake.....ohhh this stuff is so good.,1
#NAME?,0
This hole in the wall has great Mexican street tacos, and friendly staff.,1
Took an hour to get our food only 4 tables in restaurant my food was Luke warm, Our sever was running around like he was totally overwhelmed.,0
The worst was the salmon sashimi.,0
Also there are combos like a burger, fries, and beer for 23 which is a decent deal.,1
This was like the final blow!,0
I found this place by accident and I could not be happier.,1
seems like a good quick place to grab a bite of some familiar pub food, but do yourself a favor and look elsewhere.,0
Overall, I like this place a lot.,1
The only redeeming quality of the restaurant was that it was very inexpensive.,1
Ample portions and good prices.,1
Poor service, the waiter made me feel like I was stupid every time he came to the table.,0
My first visit to Hiro was a delight!,1

The first line contains the column names.

molbdnilo
2brk
  • Can you share a small piece of your csv file *as text* and include it in your question? – Timeless Apr 08 '23 at 21:48
  • Depending on how your file is set up this thread may be helpful https://stackoverflow.com/questions/769621/dealing-with-commas-in-a-csv-file – Raisin Apr 08 '23 at 21:49
  • @Timeless added the csv – 2brk Apr 08 '23 at 21:51
  • You can export the file using a 3rd party program, e.g. Excel, with a custom separator such as ";" – GuyPago Apr 08 '23 at 21:52
  • Your csv is broken. If there is a comma in the field (and you use the comma as a separator), the field needs to be quoted. I'd try to fix whatever program generates the broken csv. – Robert Apr 08 '23 at 23:56
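
To illustrate the quoting point from the last comment, here is a minimal sketch using only Python's standard `csv` module. The two sample rows are taken from the file in the question; the variable names are made up for the illustration. It shows that a writer which quotes fields containing commas produces text that a CSV reader can split back into two columns.

```python
import csv
import io

# Two rows from the question's data, plus the header.
rows = [
    ("Review", "Liked"),
    ("The food, amazing.", 1),
    ("Crust is not good.", 0),
]

# csv.writer defaults to QUOTE_MINIMAL: only fields that contain the
# delimiter (or quotes/newlines) get wrapped in double quotes.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
quoted_csv = buffer.getvalue()

# Reading the quoted text back yields exactly two fields per row,
# even though the first review contains a comma.
parsed = list(csv.reader(io.StringIO(quoted_csv)))
```

A file written this way should load with a plain `pd.read_csv(path)`, since double quotes are the default quote character there as well.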

1 Answer


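One option is to use a regex separator, which pandas handles with the Python parsing engine. The pattern splits on a comma sitting directly between two word characters (as in the header Review,Liked) or on a comma followed by a 0 or 1 at the end of the line: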

df = pd.read_csv(path, sep=r"\b,\b|,(?=[01]$)", engine="python")

Output:

print(df.sample(6))

                                                          Review  Liked
26                                                        #NAME?      0
6                          Honeslty it didn't taste THAT fresh.)      0
13  I tried the Cape Cod ravoli, chicken, with cranberry...mmmm!      1
0                                       Wow... Loved this place.      1
10                                      Service was very prompt.      1
15            I was shocked because no signs indicate cash only.      0

Demo : [Regex]
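
As a possible alternative to the regex, and assuming every line ends with a comma followed by the label, each line can be split at its last comma only. This is a sketch, not part of the answer above: `split_on_last_comma` is a hypothetical helper name, and the sample lines are copied from the question.

```python
import io


def split_on_last_comma(lines):
    """Yield (review, liked) pairs, splitting each line at its final comma only."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # rpartition splits on the LAST comma, so commas inside the
        # review text stay part of the first field.
        review, _, liked = line.rpartition(",")
        yield review, liked


# Stand-in for open(path); the lines are taken from the question's CSV.
sample = io.StringIO(
    "Review,Liked\n"
    "Wow... Loved this place.,1\n"
    "I tried the Cape Cod ravoli, chicken, with cranberry...mmmm!,1\n"
    "This place is not worth your time, let alone Vegas.,0\n"
)

header, *rows = split_on_last_comma(sample)
records = [(review, int(liked)) for review, liked in rows]
```

The result can then be handed to pandas with `pd.DataFrame(records, columns=list(header))`. Unlike the `\b,\b` branch of the regex, this does not split a review that happens to contain a comma with no space after it.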

Timeless