reading a file and storing values in a matrix using python

Question

I am trying to read some numbers from a file and store them in a matrix using Python. The first line of the file holds two numbers, n and m (the number of rows and the number of columns), and the following lines hold the n*m values. The complication is that the values are not laid out one matrix row per line: the second line, for example, may hold only m-2 values instead of m. So I cannot read the file one line at a time and store each line as a row of the matrix. Editing the files is not an option because some of them have 200 rows and 1000 columns. This is what a file with fewer rows and columns looks like:

4 5
1 2 3 
4 5 1 2 3 4 
5 1 2 
3 4 5 1 2 
3 4 5

I have managed to resolve this problem by storing all the values in an array and then deleting the first two values, which are n and m, and then creating a matrix from that array.

This is my code:

f = open('somefile2.txt')
numbers = []
for eachLine in f:
    # split() with no argument ignores trailing spaces and the newline
    for x in eachLine.split():
        numbers.append(int(x))
f.close()
print numbers
n = numbers[0]
del numbers[0]
m = numbers[0]
del numbers[0]
print n, m, numbers
vector = []
matrix = []
for i in range(n):
    for j in range(m):
        # offset by i * m so that each row takes the next m values
        vector.append(numbers[i * m + j])
    matrix.append(vector)
    vector = []
print matrix

This gives me the expected result, but is this the right way to do it, by using the extra array numbers, or is there an easier way in which I store all the values directly into a matrix?

score 2 · Accepted Answer · edited May 23 '17 at 12:33

You can use a generator function:

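def solve(f, n, m):
    lis = []  # buffer of values read so far but not yet yielded
    for line in f:
        # once more than m values are buffered, emit one full row
        if len(lis) > m:
            yield lis[:m]
            lis = lis[m:]
        lis.extend(map(int, line.split()))
    # emit whatever is still buffered, m values at a time
    for i in xrange(0, len(lis), m):
        yield lis[i:i+m]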

with open('abc1') as f:
    n, m = map(int, next(f).split())
    # Now you can either load the whole array at once using the list() call,
    # or use a simple iteration to get one row at a time.
    matrix = list(solve(f, n, m))
    print matrix

Output:

[[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
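
For readers on Python 3, the row-at-a-time iteration mentioned in the comment above might look like the sketch below. It is a port under a few assumptions, not the answer's exact code: the name iter_rows is hypothetical, the file is simulated with io.StringIO holding the sample from the question, and m is assumed to be positive.

```python
import io

def iter_rows(lines, m):
    """Yield lists of m ints from an iterable of whitespace-separated lines."""
    buf = []
    for line in lines:
        buf.extend(int(tok) for tok in line.split())
        # Emit every complete row currently held in the buffer.
        while len(buf) >= m:
            yield buf[:m]
            buf = buf[m:]
    if buf:
        # Leftover values that do not fill a whole row.
        yield buf

# In-memory stand-in for the sample file from the question.
sample = io.StringIO("4 5\n1 2 3 \n4 5 1 2 3 4 \n5 1 2 \n3 4 5 1 2 \n3 4 5\n")

n, m = map(int, next(sample).split())
rows = []
for row in iter_rows(sample, m):
    # Each row could be processed here without holding the whole matrix;
    # the rows are collected only so the result can be printed.
    rows.append(row)
print(rows)
```

Because only one partial row is buffered at a time, this form suits files that are too large to hold in memory as a single list.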

Another approach is to get a flattened iterator over all the items in the file, and then split that iterator into equally sized chunks.

from itertools import chain, islice

with open('abc1') as f:
    n, m = map(int, next(f).split())
    data = chain.from_iterable(map(int, line.split()) for line in f)
    matrix = [list(islice(data, m)) for i in xrange(n)]
    print matrix
    #[[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
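
A Python 3 sketch of the same chain/islice idea follows, with one addition that is not in the answer: islice returns a short list without complaint when the data runs out, so the sketch checks the row lengths. The function name read_matrix and the in-memory sample are assumptions made for illustration.

```python
import io
from itertools import chain, islice

def read_matrix(f):
    """Read 'n m' from the first line, then n rows of m ints from the rest."""
    n, m = map(int, next(f).split())
    # Lazily flatten all remaining tokens into a single stream of ints.
    data = chain.from_iterable(map(int, line.split()) for line in f)
    matrix = [list(islice(data, m)) for _ in range(n)]
    # islice yields short (or empty) rows if the stream ends early.
    if any(len(row) != m for row in matrix):
        raise ValueError("file holds fewer than n*m values")
    return matrix

# In-memory stand-in for the sample file from the question.
sample = io.StringIO("4 5\n1 2 3 \n4 5 1 2 3 4 \n5 1 2 \n3 4 5 1 2 \n3 4 5\n")
matrix = read_matrix(sample)
print(matrix)
```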


Thank you! It works with the first version. I will also try the second approach. I have another question: at the notifications I saw that I received 2 answers for my question, but I can only see your answer. Do you know why? — user1956185, Jan 20 '14 at 12:40
@user1956190 Other answer is deleted now, only users with 10K+ rep can see deleted posts. — Ashwini Chaudhary, Jan 20 '14 at 12:41

score 1 · Answer 2 · 2014-05-13T18:03:18.150

My 2 cents:

with open('somefile.txt') as f:
    strings = f.read().split()

numbers = map(int, strings)
n = numbers.pop(0)  # number of rows, as in the question
m = numbers.pop(0)  # number of columns

matrix = [numbers[i:i+m] for i in xrange(0, n*m, m)]

In Python 3 you would simply do:

n, m, *numbers = map(int, strings)
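
Put together for Python 3, the whole approach might look like the sketch below. It uses the question's naming (n rows, m columns), the function name parse_matrix is hypothetical, and the length check is an addition that assumes the file should hold exactly n*m values with m positive.

```python
def parse_matrix(text):
    """Build an n-by-m list of lists from 'n m v1 v2 ...' whitespace-separated text."""
    # Star-unpacking: the first two ints are the dimensions, the rest are the values.
    n, m, *numbers = map(int, text.split())
    if len(numbers) != n * m:
        raise ValueError("expected %d values, found %d" % (n * m, len(numbers)))
    # Slice the flat list into consecutive rows of m values.
    return [numbers[i:i + m] for i in range(0, n * m, m)]

# Sample content from the question; with a real file this would be f.read().
text = "4 5\n1 2 3 \n4 5 1 2 3 4 \n5 1 2 \n3 4 5 1 2 \n3 4 5\n"
matrix = parse_matrix(text)
print(matrix)
```

This reads the entire file into memory at once, which is simple and usually fine for sizes like 200 by 1000.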

Depending on what you want to do with the data you might want to have a look at NumPy which has some nice methods for reading text files.
