121

This is my problem.

I'm trying to read a text file and then convert the lines into floats. The text file has \n and \t in it, though, and I don't know how to get rid of them.

I tried using line.strip() but it didn't remove them, and I got an error when I tried to convert the values to floats. I then tried line.strip("\n") but that didn't work either. My program works fine when I take the \t and \n out of the text file, but it's part of the assignment to get it to work with them.

I really don't know why this isn't working. Thanks for any help.

John Machin
Mallard Duck
  • 6
    Can you provide an excerpt of your text file? – Josh Feb 19 '12 at 07:10
  • 2
    Strip only removes whitespace from the start and end of a line. If you have tabs in the middle of the line, it won't remove those. – Swiss Feb 19 '12 at 07:14
  • 2
    Worth noting that "\n" isn't the return character on all systems. You may need to strip "\n", "\r", or "\r\n". If you show some complete code you tried and input data, this may be easier to solve. – Aaron Dufour Feb 19 '12 at 07:34
  • 1
    Explain "convert the lines into floats". What text do you have on each line, and what exactly do you want to get out? – Karl Knechtel Feb 19 '12 at 08:53
  • 7
    `.rstrip()` is by far the easiest. – SDsolar Jul 04 '18 at 01:08

7 Answers

214

You should be able to use line.strip('\n') and line.strip('\t'). But these don't modify the line variable; they just return a new string with the \n and \t stripped. So you'll have to do something like

line = line.strip('\n')
line = line.strip('\t')

That should work for removing from the start and end. If you have \n and \t in the middle of the string, you need to do

line = line.replace('\n','')
line = line.replace('\t','')

to replace the \n and \t with nothingness.
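A minimal sketch of the difference between the two approaches, using a made-up sample line (the value of `raw` is an assumption, not taken from the asker's file). It also shows that strip() treats its argument as a set of characters, so a single call can cover both \n and \t:

```python
# Hypothetical sample line: a tab at the start and in the middle, a newline at the end.
raw = "\t1.5\t2.5\n"

# strip() returns a new string; `raw` itself is left unchanged.
trimmed = raw.strip("\n\t")  # removes any mix of \n and \t from both ends only

# replace() removes the characters everywhere, including the middle.
collapsed = raw.replace("\n", "").replace("\t", "")

print(repr(trimmed))    # '1.5\t2.5'
print(repr(collapsed))  # '1.52.5'
```

Note that removing the interior tab glues the two numbers together, so replace() is probably not what you want right before converting to floats.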

austin1howard
  • 7
    I think you should escape the first backslash like this: `line = line.replace('\\n','')` – oho Dec 24 '18 at 14:53
  • 1
    This is a bad answer because it works only if the stripped characters are in that exact order. The solution to clear tabs, newlines and spaces in any order is: string = string.strip("\n\t ") – Boy Feb 03 '19 at 12:00
  • 10
    Note that the advice from `oho` above is bad and that you should indeed use `\n` and not `\\n`, since you want to get the newline character and not the literal `\n`. – Superdooperhero Nov 03 '19 at 19:03
34

The strip() method removes whitespace by default, so there is no need to call it with parameters like '\t' or '\n'. However, strings in Python are immutable and can't be modified, i.e. the line.strip() call will not change the line object. The result is a new string which is returned by the call.

As already mentioned, it would help if you posted an example from your input file. If there is more than one number on each line, strip() is not the function to use. Instead you should use split(), which is also a string method.

To conclude, assuming that each line contains several floats separated by whitespace, and that you want to build a list of all the numbers, you can try the following:

floats = []
with open(filename) as f:
    for line in f:
        floats.extend([float(number) for number in line.split()])
Dan Gerhardsson
8

You can use:

mylist = []
# Assuming that you have loaded data into a lines variable. 
for line in lines:
    mylist.append(line.strip().split('\t'))

to get a python list with only the field values for all the lines of data.

Jobel
6

How about using a python regex pattern?

import re
with open('test.txt', 'r') as f:  # the file is closed automatically
    strings = re.findall(r"\S+", f.read())  # every run of non-whitespace characters
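As a rough sketch of the next step, the tokens found by the regex could be converted to floats like this. The `text` value is an assumption standing in for the contents of test.txt:

```python
import re

# Hypothetical file contents; in practice this would come from f.read().
text = "1.0\t2.5\n3.25\t-4e2\n"

tokens = re.findall(r"\S+", text)        # every run of non-whitespace characters
floats = [float(tok) for tok in tokens]  # raises ValueError on a non-numeric token

print(tokens)  # ['1.0', '2.5', '3.25', '-4e2']
print(floats)  # [1.0, 2.5, 3.25, -400.0]
```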

In your case, line.strip() alone will not work because strip() removes only leading and trailing characters; any \t in the middle of the line stays where it is.

From the Python docs: "Return a copy of the string with leading and trailing characters removed. If chars is omitted or None, whitespace characters are removed. If given and not None, chars must be a string; the characters in the string will be stripped from the both ends of the string this method is called on."

sransara
1

Python's csv library is good for this.

http://docs.python.org/library/csv.html

CSV = comma separated values, but if you set delimiter='\t', then it works for tab separated values too.
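A minimal sketch of that idea, assuming every field on every line is a number. The `sample` data is made up and uses io.StringIO in place of a real file; with a real file you would pass the object returned by open(path, newline='') instead:

```python
import csv
import io

# Hypothetical tab-separated data standing in for the real file.
sample = io.StringIO("1.0\t2.5\n3.25\t4.0\n")

reader = csv.reader(sample, delimiter="\t")  # splits each line on tabs
rows = [[float(field) for field in row] for row in reader]

print(rows)  # [[1.0, 2.5], [3.25, 4.0]]
```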

Rusty Rob
1

Often, depending on how you read the lines, you can get rid of the \n in myline by taking myline[:-1], since \n is the last character of myline.

For the '\t' you can use replace() or strip()
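One caveat worth illustrating: slicing removes the last character whatever it is, so it can go wrong on a final line with no newline or on a line ending in \r\n. The sketch below compares it with rstrip() on made-up lines (the contents of `lines` are an assumption):

```python
# Hypothetical lines; the last line of a file often has no trailing newline.
lines = ["1.5\n", "2.5\r\n", "3.5"]

sliced = [ln[:-1] for ln in lines]              # blindly drops the last character
stripped = [ln.rstrip("\r\n") for ln in lines]  # drops only trailing \r and \n

print(sliced)    # ['1.5', '2.5\r', '3.']
print(stripped)  # ['1.5', '2.5', '3.5']
```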

jimifiki
1

If you're trying to convert lines of floats separated by tab characters, then just float(line) will try to convert the whole line into one float, which will fail if there's more than one. Using strip to get rid of leading and trailing whitespace isn't going to help that fundamental problem.

Maybe you need to split each line into pieces and do something with each piece?
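To make the failure concrete, here is a small sketch with a made-up line holding two tab-separated numbers (the value of `line` is an assumption about what the file looks like):

```python
line = "1.5\t2.5\n"  # hypothetical line with two tab-separated numbers

try:
    float(line.strip())  # the interior tab is still there after strip()
    whole_line_ok = True
except ValueError:
    whole_line_ok = False

# Split on any whitespace first, then convert each piece separately.
pieces = [float(p) for p in line.split()]

print(whole_line_ok)  # False
print(pieces)         # [1.5, 2.5]
```

float() already tolerates leading and trailing whitespace on its own, which is why the tab between the numbers is the real issue here rather than the trailing \n.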

Ben