Csv readlines issue

Question

I'm trying to read the lines of a CSV file. The code that reads the file in Python with readlines and csv.reader works fine on Windows, but on a Linux test server I am getting an issue, possibly because of the \n at the end of each line.

Is there a difference in how Python's readlines function reads lines on Windows and on Linux?

This is my code:

with open(r"C:\Users\prate\Downloads\meal_count_avg_meal.csv","r") as filemy:
    #mycontent=csv.reader(filemy)
    out = filemy.readlines()

Possible duplicate of [How would I specify a new line in Python?](https://stackoverflow.com/questions/11497376/how-would-i-specify-a-new-line-in-python) — Thecave3, Apr 27 '19 at 13:11
What is this "_some issue_" that you are getting? Can you also share the code for opening and `readlines`? — Gino Mempin, Apr 27 '19 at 14:08
with open(r"C:\Users\prate\Downloads\meal_count_avg_meal.csv","r") as filemy: #mycontent=csv.reader(filemy) out=filemy.readlines() — Prateek Mishra, Apr 27 '19 at 14:23
But I'm getting /r/n escape characters on Linux server at the end of every line but only /n in case of Windows file. This is causing an issue to access the rows of csv which i need to compare for validation. How to solve it? — Prateek Mishra, Apr 27 '19 at 14:25
Always open the file “rb” as directed in the documentation for csv. Read the documentation. — DisappointedByUnaccountableMod, Apr 27 '19 at 17:30

Martin Evans · Answer 1 · 2019-04-28T16:25:23.103

0

Normally, you would read the CSV file one row at a time and carry out any required processing (e.g. converting to ints or floats). It is, however, possible to read all of your rows into a list, with each row holding a list of the values:

import csv

filename = r"C:\Users\prate\Downloads\meal_count_avg_meal.csv"

with open(filename, "rb") as file_my:
    csv_my = csv.reader(file_my)
    # next(csv_my)     # Optionally skip the header
    rows = list(csv_my)

As you are still using Python 2.x, you will need to open your file with rb, which is required for the csv.reader() object. This approach should give you the same result on Windows or Linux.

rows would be something like:

[["header1", "header2", "header3"], ["value1", "value2", "value3"]]

This assumes your CSV file is in the standard comma-separated values format. If your CSV file contains a header and you don't want it included in rows, add the following line right after creating the reader:

next(csv_my)

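Using .readlines() will keep the line ending on each line (typically \r\n for files created on Windows and \n for files created on Linux), which is why the results can differ between the two systems.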
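To see the line endings that .readlines() leaves in place, here is a minimal sketch for Python 3. The sample data and the temporary file are made up for illustration and are not taken from the question's CSV file.

```python
import csv
import os
import tempfile

# Hypothetical sample data with Windows-style (\r\n) line endings.
sample = b"Financial Year,Total Meals\r\n2018,568\r\n"

with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    # newline="" turns off newline translation, so readlines() returns
    # each line with its original terminator still attached.
    with open(path, "r", newline="") as f:
        raw_lines = f.readlines()

    # Stripping the terminator by hand makes the lines comparable
    # regardless of how the file was saved.
    stripped = [line.rstrip("\r\n") for line in raw_lines]

    # csv.reader removes the terminator and splits the fields as well.
    with open(path, "r", newline="") as f:
        rows = list(csv.reader(f))
finally:
    os.remove(path)

print(raw_lines)
print(stripped)
print(rows)
```

If you do stay with readlines(), stripping "\r\n" from each line as above should remove the difference between the two systems, but you would still have to split the fields yourself.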

If one of your environments is Python 3.x, the code would need to be changed as follows:

import csv

filename = r"C:\Users\prate\Downloads\meal_count_avg_meal.csv"

with open(filename, "r", newline="") as file_my:
    csv_my = csv.reader(file_my)
    # next(csv_my)     # Optionally skip the header
    rows = list(csv_my)

If you plan on running the same code on 2.x and 3.x unchanged, your code would have to test which version you are running on and open the file differently depending on the version.
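One possible way to do that version test is sketched below. The helper name read_csv_rows is hypothetical (it is not part of the csv library), and the sketch assumes the file contains plain ASCII data.

```python
import csv
import sys


def read_csv_rows(filename):
    """Return all rows of a CSV file as a list of lists (hypothetical helper)."""
    if sys.version_info[0] >= 3:
        # Python 3: text mode, with newline="" so csv handles line endings.
        file_my = open(filename, "r", newline="")
    else:
        # Python 2: the csv module expects a file opened in binary mode.
        file_my = open(filename, "rb")
    try:
        return list(csv.reader(file_my))
    finally:
        file_my.close()
```

You would then call it as rows = read_csv_rows(filename) on either version.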

edited Apr 28 '19 at 16:25

answered Apr 28 '19 at 14:53

Martin Evans


Would using something like: with open(filename, "r") as file_my: csv_my=file_my.read().splitlines() be also a good alternative solution? – Prateek Mishra Apr 28 '19 at 15:44
Because reading in binary mode is giving me an issue in Python 2.7: "iterator should return strings, not bytes" (when I'm trying to convert csv_my to a list) – Prateek Mishra Apr 28 '19 at 15:47
Not really no. You would still have to process each line to split it on the delimiter and if needed also handle any quoted strings correctly. This is all handled for you using the CSV library. That library expects it to be opened in binary mode. It would help if you edited your question to show some example rows from your CSV file. – Martin Evans Apr 28 '19 at 15:52
If your CSV file is utf-8 encoded, you will need a different approach. Python 3.x handles these much better. – Martin Evans Apr 28 '19 at 15:53
`with open(r"/mnt/c/Users/prate/Downloads/meal_count_avg_meal.csv","r") as filemy: stocks = filemy.read().splitlines()`, then `header=stocks[0]`, `rest=stocks[1:]`, `header_list=header.split(",")`. The code above is fetching me the header of the csv file, which looks like: `Financial Year | Total Meals(in lacs) | Meals Per Day(Nos)`, followed by rows such as `2018 | 568 | 55` and so on – Prateek Mishra Apr 28 '19 at 15:55
That will work OK, but it will fail if any of your fields contain commas. In that case it is normal to quote the field, e.g. `123,456,"hello, world"`. Your code would give `['123', '456', '"hello', ' world"']`, whereas the CSV library would give the correct result. I happily use binary mode with Python 2.4.3 on one system with the CSV library, so I would need more detail to determine why it is not working for you. – Martin Evans Apr 28 '19 at 16:05
Perhaps you could use something like pastebin to upload your CSV file to, you could then copy the link to it here. It would then be possible to run your code. – Martin Evans Apr 28 '19 at 16:10
Yes, my business requirement is met, as that would never be the case. Also, when I run the code you provided, it gives me the bytes error – Prateek Mishra Apr 28 '19 at 16:10
Thanks a lot for your efforts. Would keep you posted with same. – Prateek Mishra Apr 28 '19 at 16:21
The error you are getting actually implies you are running the code on Python 3.x. If this is the case, it would need to be changed to not be binary, and instead you should add `newline=""` as a parameter. – Martin Evans Apr 28 '19 at 16:23
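The quoted-field point from the comments above can be checked with a short sketch. It uses the example line `123,456,"hello, world"` from that comment; the variable names are illustrative only.

```python
import csv

line = '123,456,"hello, world"'

# Naive split: the comma inside the quoted field is treated as a delimiter.
naive = line.split(",")

# csv.reader accepts any iterable of lines and respects the quoting.
parsed = next(csv.reader([line]))

print(naive)
print(parsed)
```

The naive split yields four pieces with the quote characters still attached, while csv.reader returns three fields with `hello, world` intact.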
