List of strings to integers while keeping a format in python

Question

So what I want to do seems relatively simple, but for the life of me, I just can't quite get it. I have a .txt file like

4 2
6 5 1
9 4 5

And I want its information to be available to me like so (i.e. I do not need to write a new .txt file unless it would be necessary.)...

3 1
5 4 0
8 3 4

In other words, 1 is subtracted from every number but the formatting remains the same. There will never be a number less than 1 in the original, so negatives won't be possible. This whole headache is due to converting indexing to begin with 0 instead of 1. What may complicate things is that the original file prints like

['4 2 \n', '6 5 1 \n', '9 4 5 \n']

What I've Done

Well, it's a mishmash of different things I've found on StackOverflow, but I think I'm going about it in the most cumbersome way possible. And this one didn't make sense when I implemented it, although it may be on the same track with the issue with spaces:

original = open(file, 'r')
for line in original.readlines():
    newline = line.replace(" \n","")
    finalWithStrings.append(newline)

finalWithIntegers = [map(int,x) for x in finalWithStrings]
finalWithIntegers[:] = [x-1 for x in finalWithIntegers]

My thought process was that I need to remove the "\n" and convert these strings into integers so I can subtract 1 from them, and somehow keep the formatting. It's important to have the formatting be the same since each line contains information on the similarly indexed line of another file. I don't want to see the "\n" in the end result (or print statement) but I still want the effect of a new line beginning. The above code, however, won't work for two reasons (that I know of).

int(n[:]) throws an error since it doesn't like the spaces and when I put a value (say 0) in there, then the code prints the first number on each of the lines and subtracts one.. and puts it all on one line.

[3, 5, 8]

So, it seems redundant to take out a carriage return and have to throw another in, but I do need to keep the formatting, as well as have a way to get all the numbers!

This also didn't work:

for line in original.readlines():
    newline = line.replace(" \n","")
    finalWithStrings.append(newline)

finalWithIntegers = [map(int,x) for x in finalWithStrings]
finalWithIntegers[:] = [x-1 for x in finalWithIntegers]

but instead of just a wrong output it was an error:

ValueError: invalid literal for int() with base 10: ''

Does anyone have any ideas on what I'm doing wrong here and how to fix this? I am working with Python 2.6 and am a beginner.

mgilson · Accepted Answer · 2012-08-03T17:44:29.607

9

with open("original_filename") as original:
    for line in original:
        #if you just want the line as integers:
        integers = [ int(i) - 1 for i in line.split() ]
        #do something with integers here ...

        #if you want to write a new file, use the code below:
        #new_line = " ".join([ str(int(i) - 1) for i in line.split() ])
        #newfile.write(new_line + '\n')

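I've opened your file in a context manager in the above example because that is good practice (the with statement is available since Python 2.5 via `from __future__ import with_statement`, and enabled by default from 2.6). The context manager makes sure that your file is properly closed when you exit that context.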
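To make the "properly closed" part concrete, here is a minimal sketch. The temporary file and the names `closed_inside` and `closed_after` are only there for illustration and are not part of the original answer; the sample contents are assumed to match the question's file.

```python
import os
import tempfile

# Create a throwaway file so the sketch is self-contained
# (contents assumed to look like the question's data).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("4 2 \n6 5 1 \n9 4 5 \n")

with open(path) as original:
    first_line = original.readline()
    closed_inside = original.closed   # still open inside the block

closed_after = original.closed        # closed once the block is left

os.remove(path)                       # clean up the throwaway file
print(closed_inside, closed_after)
```

The file is closed on leaving the block even if an exception is raised inside it, which is the main reason to prefer this over a bare open() call.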

EDIT

It looks like you might be trying to create a 2D list ... To do that, something like this would work:

data = []
with open("original_filename") as original:
    for line in original:
        integers = [ int(i) - 1 for i in line.split() ]
        data.append(integers)

Or, if you prefer the 1-liner (I don't):

with open("original_filename") as original:
    data = [ [int(i) - 1 for i in line.split()] for line in original ]

Now if you print it:

for lst in data:
    print (lst)    # [3, 1]
                   # [5, 4, 0]
                   # [8, 3, 4]
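If you later want the rows back in the original space-separated layout, something along these lines should work. This is only a sketch: `format_rows` is a hypothetical helper name, and `data` is hard-coded here to the values from the question rather than read from a file.

```python
# The 2D list as it would look after reading the question's file
# and subtracting 1 from every number.
data = [[3, 1], [5, 4, 0], [8, 3, 4]]

def format_rows(rows):
    """Turn each row of integers back into one space-separated string."""
    return [" ".join(str(n) for n in row) for row in rows]

lines = format_rows(data)
for text in lines:
    print(text)   # one row per line, numbers separated by single spaces
```

Each string lines up with the same row index in `data`, so the correspondence with the other file's lines is kept.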

edited Aug 03 '12 at 17:44

answered Aug 03 '12 at 17:32

mgilson

Perfect, this is exactly what I was looking for! Thank you for the edit.. it was more specific to what I needed. – Ason Aug 03 '12 at 17:43
@Ason -- No problem. I re-read your post a little more carefully and came across the line that said you didn't need it in a new file unless that was the easiest way to accomplish this. So, I updated. – mgilson Aug 03 '12 at 17:45
@Ason -- I also condensed it down to a 1-liner (and added that as an alternative). I don't prefer it to the multi-line version, but it's not *too bad* so there might be some who like it better. – mgilson Aug 03 '12 at 17:50
@mgilson as a beginner, I like to see more of what I'm doing, so I'll stick with the multi-line, but thank you for adding more information for future users! – Ason Aug 03 '12 at 17:53
@mgilson just to clarify, is it necessary to convert the strings to integers to subtract `1`? I read that strings are unchangeable. – Ason Aug 03 '12 at 17:57
@Ason. Yes, it's necessary. I don't know where you read that strings and integers are interchangeable. That's not true. As proof, just try it in the interactive interpreter `>>> '8'-1` results in: `TypeError: unsupported operand type(s) for -: 'str' and 'int'`. – mgilson Aug 03 '12 at 17:59
@mgilson - I read that strings were _unchangeable_ so that's why I reasoned I needed a way to convert them to integers in order to operate on them. Thanks for the clarification though. – Ason Aug 03 '12 at 18:07
@Ason -- Ohh ... Yes, strings are unchangeable (or immutable). Here's a somewhat recent post on the subject ( http://stackoverflow.com/questions/11690220/python-byref-copy/11690298#11690298 ). I'm sure there are probably better ones. The specific post is asking about integers, but the same reasoning can be applied to tuples, strings, and other immutable types. – mgilson Aug 03 '12 at 18:13
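
The point from the comments above can be sketched in a few lines: subtracting from a string fails, and converting builds new objects instead of changing the string. The variable names are illustrative only.

```python
text = "8"

# Subtracting an int from a str is not defined, so this raises TypeError.
try:
    text - 1
    subtraction_failed = False
except TypeError:
    subtraction_failed = True

# Converting creates new objects; the original string is left untouched.
number = int(text) - 1
result = str(number)

print(subtraction_failed, number, result, text)
```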

Andrew Clark · Answer 2 · 2012-08-03T18:43:37.937

4

Here is a pretty straightforward way to accomplish this using regular expressions. The benefit here is that the formatting is guaranteed to stay exactly the same because it will replace the numbers in place without touching any of the whitespace:

import re

def sub_one_repl(match):
    return str(int(match.group(0))-1)

for line in original.readlines():
    newline = re.sub(r'\d+', sub_one_repl, line).rstrip('\n')
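
As a self-contained check of the whitespace claim, here is a small sketch that runs the same substitution on in-memory lines instead of a file. The sample `lines` (including the double space and the tab) are made up for illustration.

```python
import re

def sub_one_repl(match):
    # match.group(0) is the run of digits that was matched
    return str(int(match.group(0)) - 1)

# Hypothetical input with irregular whitespace, to show it is left alone.
lines = ["4 2 \n", "6  5\t1 \n", "9 4 5 \n"]

converted = [re.sub(r"\d+", sub_one_repl, line) for line in lines]
for line in converted:
    print(repr(line))
```

Only the digits are rewritten, so spaces, tabs and newlines come through unchanged. One caveat: a number that loses a digit (10 becoming 9) makes the line one character shorter, which matters only if the columns are position-aligned.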

edited Aug 03 '12 at 18:43

answered Aug 03 '12 at 17:39

Andrew Clark

Thank you so much for your answer! I'm not very familiar with regular expressions, so I'll have to select a different answer as it was easier to understand and implement.. but +1 for helping future visitors! – Ason Aug 03 '12 at 17:46
Great idea, though I think you mean `match.group` and not `m.group`. As well, you might want to make `sub_one_repl` either a little more safe (ie if the regex fails to match the .group will cause an exception) or just do a lambda. As well you could do it as a list comp or generator expression: `(re.sub(r'\d+', lambda m: str(int(m.group(0))-1), line) for line in original.readlines())` – Adam Parkin Aug 03 '12 at 18:17
@AdamParkin - Thanks, I originally had `m` as the argument and forgot to update the function. `sub_one_repl` will only be called on successful matches, which will always be all digits, so it should be safe as it is. One-line is an option but I would still move the `lambda` outside of it so you aren't recreating the function on each iteration. – Andrew Clark Aug 03 '12 at 18:48

score 2 · Answer 3 · answered Aug 03 '12 at 18:00

2

Another way is to use the csv module and list comprehension:

from csv import reader

with open("your_file") as f:
    # Trailing spaces produce empty fields with delimiter=' ', so skip them
    data = [[int(j) - 1 for j in row if j] for row in reader(f, delimiter=' ')]

With your data, this produces:

[[3, 1], [5, 4, 0], [8, 3, 4]]

answered Aug 03 '12 at 18:00

rcovre

score 0 · Answer 4 · answered Aug 03 '12 at 17:43

0

Try this:

with open(filepath) as f:
    for line in f:
        print " ".join([str(int(i)-1) for i in line.split()])

Hope that helps

answered Aug 03 '12 at 17:43

inspectorG4dget
