
I collected data from social media platforms and stored it in a CSV file. The file is 122 MB and contains around 1.8 million code-mixed comments with emojis. I am trying to open it with pandas using engine='python', but I get this error:

ParserError: NULL byte detected. This byte cannot be processed in Python's native csv library at the moment, so please pass in engine='c' instead

So I tried engine='c', but now I get a different error:

ParserError: Error tokenizing data. C error: Buffer overflow caught - possible malformed input file.
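One thing that may be worth trying for the first error is to drop the NUL bytes before pandas sees the data. The following is only a minimal sketch: the helper name `read_csv_without_nul_bytes` is hypothetical, and it assumes the file is UTF-8 (or another single-byte-compatible encoding) with stray `0x00` bytes. If the file is actually UTF-16, the NUL bytes are part of the encoding and passing `encoding="utf-16"` to `read_csv` would be the thing to try instead.

```python
import io

import pandas as pd


def read_csv_without_nul_bytes(path, **read_csv_kwargs):
    """Read a CSV after dropping NUL (0x00) bytes from the raw file contents.

    Hypothetical helper: it loads the whole file into memory, which is
    assumed to be acceptable for a file of roughly 122 MB.
    """
    with open(path, "rb") as fh:
        raw = fh.read().replace(b"\x00", b"")
    # Hand the cleaned bytes to pandas; extra keyword arguments
    # (engine, encoding, ...) are passed straight through to read_csv.
    return pd.read_csv(io.BytesIO(raw), **read_csv_kwargs)
```

It could be called as `df = read_csv_without_nul_bytes("comments.csv", engine="python")`, where `comments.csv` is a placeholder file name. Whether this also avoids the buffer overflow error from the C engine depends on what else is malformed in the file.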
raviTeja
  • Does this answer your question? [How do I read a large csv file with pandas?](https://stackoverflow.com/questions/25962114/how-do-i-read-a-large-csv-file-with-pandas) – Dishin H Goyani Sep 24 '20 at 12:00
  • @Dishin H Goyani Yes, it worked for reading the file, but when I tried to convert it to a list I got the error AttributeError: 'Series' object has no attribute 'tolist'. I tried df = dd.read_csv() and then df["column"].tolist(). – raviTeja Sep 24 '20 at 12:17
  • Did you get the same error? – Dishin H Goyani Sep 24 '20 at 12:18
  • I am not able to access the content now, but I am able to read the file into a data frame. – raviTeja Sep 24 '20 at 12:19
  • Is there anything in the dataframe or is it showing null? – Dishin H Goyani Sep 24 '20 at 12:21
  • I changed the engine from "c" to "python" and tried df.head(5), which gave me the first five comments. But how can I convert the column to a list? I tried df["column"].tolist(), and I get the error that Series has no attribute tolist. – raviTeja Sep 24 '20 at 12:22
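
The AttributeError in the comments is consistent with `dd` being `dask.dataframe`: a dask Series is lazy, and calling `.compute()` on it returns a regular pandas Series, which does have `tolist()`. As an alternative that stays within pandas, the sketch below reads one column in chunks and collects it into a list. The function name `column_to_list` and the default chunk size are assumptions made for illustration, not something from the thread.

```python
import pandas as pd


def column_to_list(path, column, chunksize=100_000, **read_csv_kwargs):
    """Collect one CSV column into a Python list, reading the file in chunks.

    Hypothetical helper: only the requested column is parsed (usecols),
    so each chunk stays small even for a file with millions of rows.
    """
    values = []
    reader = pd.read_csv(
        path, usecols=[column], chunksize=chunksize, **read_csv_kwargs
    )
    for chunk in reader:
        # Each chunk is an ordinary pandas DataFrame, so tolist() is available.
        values.extend(chunk[column].tolist())
    return values
```

For example, `comments = column_to_list("comments.csv", "column")` would return the values of the column named `column` as a plain list, with `comments.csv` again standing in for the real file name.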
