Remove the newline character in a list read from a file

Question

I have a simple program that takes an ID number and prints information for the person matching the ID. The information is stored in a .dat file, with one ID number per line.

The problem is that my program is also reading the newline character \n from the file. I have tried the 'name'.split() method, but this doesn't seem to work for a list.

My program:

from time import localtime, strftime

files = open("grades.dat")
request = open("requests.dat", "w")
lists = files.readlines()
grades = []

for i in range(len(lists)):
    grades.append(lists[i].split(","))

cont = "y"

while cont == "y" or cont == "Y":
    answer = raw_input("Please enter the Student I.D. of whom you are looking: ")
    for i in range(len(grades)):
        if answer == grades[i][0]:
            print grades[i][1] + ", " + grades[i][2] + (" "*6) + grades[i][0] + (" "*6) + grades[i][3]
            time = strftime("%a, %b %d %Y %H:%M:%S", localtime())
            print time
            print "Exams - " + grades[i][11] + ", " + grades[i][12] + ", " + grades[i][13]
            print "Homework - " + grades[i][4] + ", " + grades[i][5] + ", " + grades[i][6] + ", " + grades[i][7] + ", " +grades[i][8] + ", " + grades[i][9] + ", " + grades[i][10]
            total = int(grades[i][4]) + int(grades[i][5]) + int(grades[i][6]) + int(grades[i][7]) + int(grades[i][8]) + int(grades[i][9]) + int(grades[i][10]) + int(grades[i][11]) + int(grades[i][12]) + int(grades[i][13])
            print "Total points earned - " + str(total)
            grade = float(total) / 550
            grade = grade * 100
            if grade >= 90:
                print "Grade: " + str(grade) + ", that is equal to an A."
            elif grade >= 80 and grade < 90:
                print "Grade: " + str('%.2f' %grade) + ", that is equal to a B."
            elif grade >= 70 and grade < 80:
                print "Grade: " + str('%.2f' %grade) + ", that is equal to a C."
            elif grade >= 60 and grade < 70:
                print "Grade: " + str('%.2f' %grade) + ", that is equal to a D."
            else:
                print "Grade: " + str('%.2f' %grade) + ", that is equal to an F."
            request.write(grades[i][0] + " " + grades[i][1] + ", " + grades [i][2] +
                          " " + time)
            request.write("\n")


    print
    cont = raw_input("Would you like to search again? ")

if cont != "y" or cont != "Y":
    print "Goodbye."

What's the format for the grade data? `ID, (first/last) name, (last/first) name, etc.` I'd like to know so that I can provide a nice namedtuple solution. — Chris Morgan, Nov 30 '10 at 22:30
The format is ID, last, first, degree major, grade 1,2,3,4,5,6,7, 8, 9, 10. — Python Newbie, Nov 30 '10 at 23:39

score 112 · Accepted Answer · answered Nov 30 '10 at 22:18

str.strip() returns a copy of the string with leading and trailing whitespace removed; .lstrip() and .rstrip() remove only leading or only trailing whitespace, respectively.

grades.append(lists[i].rstrip('\n').split(','))
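
To illustrate how the three methods differ, here is a small sketch on a made-up line with a Windows-style ending (the sample data is hypothetical and not taken from grades.dat):

```python
# Hypothetical sample line with leading spaces and a CR+LF ending.
line = "  1001,Smith,John\r\n"

stripped_both = line.strip()        # whitespace removed from both ends
stripped_left = line.lstrip()       # leading whitespace removed only
stripped_right = line.rstrip()      # trailing whitespace removed only
newline_only = line.rstrip("\r\n")  # only trailing CR/LF characters removed

# Splitting after removing the line ending keeps the last field clean.
fields = newline_only.split(",")
print(fields)
```

Note that rstrip("\r\n") leaves the leading spaces of the first field alone, which matters if whitespace inside the data is significant.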

ephemient

It would definitely work on linux, but what if end of the line is ``CR+LF``(Windows) or just ``CR``(Mac)? – seler Apr 19 '12 at 21:54
`.rstrip('\r\n')`, or simply `.rstrip()`, would strip both. – ephemient Apr 20 '12 at 03:46
import os; endl = os.linesep; .strip(endl) or rstrip / lstrip... this way you don't have to worry about the OS :). – mthpvg Oct 29 '12 at 14:16
I only added ".rstrip('\n')", and now it works. Before, only the last iteration of my for-loop worked, because that line didn't have the "\n". It took me an hour to figure out that the problem had nothing to do with the loop. Thanks! – Michael Larsson Jul 14 '22 at 10:00

Michael Mrozek · Answer 2 · 2010-12-01T04:39:22.240

26

You can use the strip() method to remove trailing (and leading) whitespace; passing it an argument lets you specify which characters to strip:

for i in range(len(lists)):
    grades.append(lists[i].strip('\n'))

It looks like you can simplify the whole block, though: if your file stores one ID per line, grades is just lists with the newlines stripped:

Before

lists = files.readlines()
grades = []

for i in range(len(lists)):
    grades.append(lists[i].split(","))

After

grades = [x.strip() for x in files.readlines()]

(the above is a list comprehension)
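
If each line holds several comma-separated fields rather than a single ID (which is what the question's original loop assumes), the split can be kept inside the comprehension. A minimal sketch, using io.StringIO with made-up records as a stand-in for the real file:

```python
import io

# Stand-in for open("grades.dat"); both records are made up.
files = io.StringIO(u"1001,Smith,John,CS\n1002,Doe,Jane,Math\n")

# Remove only the trailing newline from each line, then split on commas.
grades = [line.rstrip("\n").split(",") for line in files]

print(grades[0][0])
```

Iterating over the file object yields one line at a time, so readlines() is not needed here.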

Finally, you can loop over a list directly, instead of using an index:

Before

for i in range(len(grades)):
    # do something with grades[i]

After

for thisGrade in grades:
    # do something with thisGrade
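
Applied to the ID lookup, direct iteration might look like the sketch below. The records and the answer value are made up for illustration; enumerate() is shown only for cases where the index is still needed.

```python
# Made-up records in the same shape as the parsed grades list.
grades = [["1001", "Smith", "John"], ["1002", "Doe", "Jane"]]

answer = "1002"  # hypothetical ID typed by the user

# Loop over the records themselves instead of over range(len(grades)).
matches = []
for this_grade in grades:
    if this_grade[0] == answer:
        matches.append(this_grade)

# enumerate() pairs each record with its index when the position matters.
positions = [i for i, record in enumerate(grades) if record[0] == answer]

print(matches)
print(positions)
```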

edited Dec 01 '10 at 04:39

answered Nov 30 '10 at 22:17


Thank you. To clarify, is this meant to be used with, or to replace my for i in range(len(lists)): grades.append(lists[i].split(",")) loop? – Python Newbie Nov 30 '10 at 22:20
@Python It replaces it; I edited the answer – Michael Mrozek Nov 30 '10 at 22:26
I replaced the first Before code with your first After code, but after entering an ID, it simply prints a blank line and then skips to "Would you like to search again?". Am I doing something wrong? – Python Newbie Nov 30 '10 at 23:46
-1 teaching newbies to use strip() when they want strip('\n') – John Machin Dec 01 '10 at 04:35
@John I don't think I've ever been downvoted for something so completely trivial, usually people just leave a comment...anyway, fixed – Michael Mrozek Dec 01 '10 at 04:40
@Michael Mrozek: Trivia like a dot instead of a comma blows up spacecraft. Teaching newbies to use a sledgehammer instead of a nutcracker without any analysis or comment is *NOT* trivial. – John Machin Dec 01 '10 at 04:50
@John The mistake itself was trivial, and is pretty much the reason comments exist; most people would just comment with "note that `strip()` will remove all whitespace, not just newlines; you might want to use `strip('\n')` instead". I said "You can use the strip() function to remove trailing (and leading) whitespace", and you're right that I should've passed `'\n'` to `strip()`, but downvotes are for completely unhelpful answers (see the downvote tooltip). I guess if you want to downvote answers that are fundamentally right and help the asker but have minor errors, that's your choice – Michael Mrozek Dec 01 '10 at 06:10

martineau · Answer 3 · 2021-11-19T17:54:35.780

7

You could actually put the newlines to good use: read the entire file into memory as a single long string, then split it into the list of grades with the string splitlines() method, which by default removes the line endings in the process.

with open("grades.dat") as file:
    grades = [line.split(",") for line in file.read().splitlines()]
...
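
As a rough illustration of how splitlines() differs from split('\n'), here is a sketch on a made-up string with mixed line endings and a trailing newline:

```python
# Made-up file contents: one CR+LF ending, two LF endings.
text = "1001,Smith\r\n1002,Doe\n1003,Lee\n"

by_splitlines = text.splitlines()  # recognises \r\n as well as \n
by_split = text.split("\n")        # leaves the \r and a trailing empty string

grades = [line.split(",") for line in by_splitlines]

print(by_splitlines)
print(by_split)
```

With split("\n") the first record keeps a stray "\r" and the final newline produces an empty last element; splitlines() avoids both.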

edited Nov 19 '21 at 17:54

answered Dec 01 '10 at 00:00


This was my thought too, which method is faster? – joemaller Jan 11 '13 at 23:34
@joemaller: Often the only way to tell for sure is by actually timing it with some test data -- which can usually be easily done with the `timeit` module -- and I think my recent revision would be very competitive. – martineau Jan 12 '13 at 01:40
nice update. I've been using `with open() as f: f.read().split('\n')` but `splitlines()` is cleaner and more obvious. `timeit` obviously, I was being lazy... – joemaller Jan 12 '13 at 06:04
@joemaller: FWIW `.splitlines()` was only a few msec faster than `.split('\n')` on a 4+ MB test file using `python -mtimeit "[line for line in open('AV1611Bible.txt').read().splitlines()]"`. The test file is a version of the Bible, downloaded and unzipped from [here](http://printkjv.ifbweb.com/AV%5Ftxt.zip). A few milliseconds on a file of nearly 34,000 lines hardly matters, so either one's fine. – martineau Jan 12 '13 at 14:31
@joemaller: In addition to being slightly faster, the [`str.splitlines()`](https://docs.python.org/2/library/stdtypes.html?highlight=splitlines#str.splitlines) method uses the [universal newlines](https://docs.python.org/2/glossary.html#term-universal-newlines) approach to splitting lines, whereas [`str.split('\n')`](https://docs.python.org/2/library/stdtypes.html?highlight=str.split#str.split) doesn't do that, so the former is also better because it's platform-independent. – martineau Apr 20 '15 at 15:48

score 2 · Answer 4 · answered Nov 30 '10 at 22:49

Here are various optimisations and applications of proper Python style to make your code a lot neater. I've put in some optional code using the csv module, which is more desirable than parsing the file manually. I've also put in a bit of namedtuple goodness, but I don't use the attributes that it then provides. The names of the parts of the namedtuple are inaccurate; you'll need to correct them.

import csv
from collections import namedtuple
from time import localtime, strftime

# Method one, reading the file into lists manually (less desirable)
with open('grades.dat') as files:
    grades = [[e.strip() for e in s.split(',')] for s in files]

# Method two, using csv and namedtuple
StudentRecord = namedtuple('StudentRecord', 'id, lastname, firstname, something, homework1, homework2, homework3, homework4, homework5, homework6, homework7, exam1, exam2, exam3')
grades = map(StudentRecord._make, csv.reader(open('grades.dat')))
# Now you could have student.id, student.lastname, etc.
# Skipping the namedtuple, you could do grades = map(tuple, csv.reader(open('grades.dat')))

request = open('requests.dat', 'w')
cont = 'y'

while cont.lower() == 'y':
    answer = raw_input('Please enter the Student I.D. of whom you are looking: ')
    for student in grades:
        if answer == student[0]:
            print '%s, %s      %s      %s' % (student[1], student[2], student[0], student[3])
            time = strftime('%a, %b %d %Y %H:%M:%S', localtime())
            print time
            print 'Exams - %s, %s, %s' % student[11:14]
            print 'Homework - %s, %s, %s, %s, %s, %s, %s' % student[4:11]
            total = sum(int(x) for x in student[4:14])
            print 'Total points earned - %d' % total
            grade = total / 5.5
            if grade >= 90:
                letter = 'an A'
            elif grade >= 80:
                letter = 'a B'
            elif grade >= 70:
                letter = 'a C'
            elif grade >= 60:
                letter = 'a D'
            else:
                letter = 'an F'

            if letter == 'an A':
                print 'Grade: %s, that is equal to %s.' % (grade, letter)
            else:
                print 'Grade: %.2f, that is equal to %s.' % (grade, letter)

            request.write('%s %s, %s %s\n' % (student[0], student[1], student[2], time))


    print
    cont = raw_input('Would you like to search again? ')

print 'Goodbye.'
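
For reference, the namedtuple attributes could be used roughly as in the sketch below. It is written for Python 3, the record is made up, and the field name major is an assumption based on the format given in the question's comments.

```python
import csv
import io
from collections import namedtuple

# Field names follow the format from the question's comments; "major" is an assumed name.
StudentRecord = namedtuple(
    "StudentRecord",
    "id lastname firstname major "
    "homework1 homework2 homework3 homework4 homework5 homework6 homework7 "
    "exam1 exam2 exam3",
)

# One made-up record standing in for grades.dat.
sample = io.StringIO("1001,Smith,John,CS,25,25,25,25,25,25,25,100,100,100\n")
grades = [StudentRecord._make(row) for row in csv.reader(sample)]

student = grades[0]
total = sum(int(x) for x in student[4:14])  # seven homework scores plus three exams
percent = total / 5.5                       # 550 possible points, as in the question

# Attribute access replaces the numeric indexes student[1], student[2], and so on.
print("%s, %s: %d points, %.2f%%" % (student.lastname, student.firstname, total, percent))
```

Because csv.reader handles the line endings itself, no explicit strip call is needed on this path.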

@John Machin: that's intentional, as the format appears to be CSV and may easily have spaces in the fields. (For that matter, that's why I recommend `csv`.) (Also, -1 seems a bit drastic for such a point when I've improved the code so much!) — Chris Morgan, Dec 01 '10 at 05:11

score 0 · Answer 5 · answered Nov 30 '10 at 22:18

You want the String.strip(s[, chars]) function, which will strip out whitespace characters or whatever characters (such as '\n') you specify in the chars argument.

See http://docs.python.org/release/2.3/lib/module-string.html

DGH


-1 for THREE reasons: (1) it's `string`, not `String` (2) string functions that have an equivalent `str` method are deprecated (3) The OP has not said that they are using an antique version of Python, so you should refer them to the docs of the current production version(s), 2.7 and 3.1, not 2.3. – John Machin Dec 01 '10 at 04:29
@John Machin: Good points, all. I probably rushed a bit when answering this question, and various languages I use tend to merge together in my head. Thank you, though, for explaining exactly why you down-voted me. I appreciate the chance to learn. – DGH Dec 01 '10 at 07:48
