71

I'd like to read numbers from a file into a two-dimensional array.

File contents:

  • a first line containing w and h
  • h lines, each containing w integers separated by spaces

For example:

4 3
1 2 3 4
2 3 4 5
6 7 8 9
kravemir
  • are you stuck somewhere specific? have a look at http://docs.python.org/tutorial/inputoutput.html#methods-of-file-objects (I'm not the one downvoting here) – Jacob Jul 05 '11 at 13:44
  • But that is an example of how to read a file line by line, not as numbers. – kravemir Jul 05 '11 at 13:45
  • Your question misses both a clear description of the file content and of the desired output. – mac Jul 05 '11 at 13:46
  • @Miro getting the line is the first step; then you need to manipulate the string with something like split() http://docs.python.org/library/stdtypes.html#string-methods You said you are new to Python so I guess you want to learn something, but you are not going to learn a lot if you just use an answer here. Also, this looks like homework. – Jacob Jul 05 '11 at 13:48
  • Yes, the split function is what I'm looking for. I'm very good with C++ and I'm trying to do a puzzle in another language, at codercharts.com. – kravemir Jul 05 '11 at 13:52
  • 2
    But in C++ you read number by number, whereas here you split a string into numbers, so I had no idea where to start. – kravemir Jul 05 '11 at 14:12
  • @Miro: I agree, this problem is best tackled with a fundamentally different approach in python. To approach it iteratively just doesn't make sense, especially when you're buffering all the data at once anyway. Even those header lines, which might be critical to a standard C++ approach, seem unnecessary when doing it pythonically. See below. – machine yearning Jul 05 '11 at 14:17

6 Answers

104

Assuming you don't have extraneous whitespace:

with open('file') as f:
    w, h = [int(x) for x in next(f).split()] # read first line
    array = []
    for line in f: # read rest of lines
        array.append([int(x) for x in line.split()])

You could condense the last for loop into a nested list comprehension:

with open('file') as f:
    w, h = [int(x) for x in next(f).split()]
    array = [[int(x) for x in line.split()] for line in f]
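
One of the comments asks about non-integer values such as 3.15. As a minimal sketch, assuming the same file layout, the conversion function can be passed in as a parameter and the parsed rows checked against the header. The helper name `read_grid`, its `convert` parameter, and the use of `io.StringIO` in place of a real file are illustrative assumptions, not part of the answer above.

```python
import io

def read_grid(f, convert=int):
    """Read a 'w h' header line, then h rows of w values, from an open text file."""
    w, h = (int(x) for x in next(f).split())
    # Skip blank lines so a trailing newline does not produce an empty row.
    grid = [[convert(x) for x in line.split()] for line in f if line.strip()]
    if len(grid) != h or any(len(row) != w for row in grid):
        raise ValueError("data does not match the declared %d x %d size" % (w, h))
    return grid

# io.StringIO stands in for a real file opened with open(...).
sample = io.StringIO("4 3\n1 2 3 4\n2 3 4 5\n6 7 8 9\n")
print(read_grid(sample))

floats = io.StringIO("2 1\n3.15 2\n")
print(read_grid(floats, convert=float))
```

Passing `float` (or `complex`) as `convert` changes only the element type; the header is always parsed as integers.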
Zach Kelling
  • 3
    aren't values stored as strings then ? – ascobol Jul 05 '11 at 13:53
  • I think this is an okay solution, but I'm always hesitant to iterate and append... IMO it's usually more succinct and easier to read to work with list generators, where you can do both in a single operation and without a flow-control structure like a `for` loop. – machine yearning Jul 05 '11 at 14:29
  • I prefer list comprehensions as well, up until they become huge nested monstrosities. – Zach Kelling Jul 05 '11 at 14:36
  • 2
    +1 : This is an excellent answer and your skillful use of both `readline` and `line in f` earns my highest regards. Cheers. – machine yearning Jul 06 '11 at 02:51
  • Mixing `readline()` and iteration over the file is [explicitly disallowed by the documentation](https://docs.python.org/2/library/stdtypes.html#file.next) and does result in wrong behaviour in many cases. This specific use might happen to work in some cases, but it's wrong anyway. The correct solution is to use `next(f)` to get the first line. – Sven Marnach Mar 30 '16 at 21:14
  • @SvenMarnach It's not a problem to use `readline` **before** using `next`, as you are not yet using the read-ahead buffer. I have used this many times to read the first line of a CSV file, without a problem. It's probably less error-prone to use `next` though, in case the `readline` has to be moved later. –  Aug 02 '16 at 06:33
  • @Jean-ClaudeArbaut It might happen to work on current versions of CPython, but you are outside the API specification if you do this. Other Python implementations or future versions of Python are free to implement different behaviour. – Sven Marnach Aug 02 '16 at 12:17
  • @SvenMarnach Is there the same sentence in the Python3 documentation? I didn't find it. –  Aug 03 '16 at 12:49
  • 2
    @Jean-ClaudeArbaut No, you are right. In Python 3 you are allowed to freely mix `next(f)` and `f.readline()`, since `next()` is actually implemented using `readline()`, and buffering is moved to a separate class used by all mechanisms of reading from a file. Thanks for pointing this out. I now remember reading about it years ago, but it had slipped my mind when I wrote the previous comment. – Sven Marnach Aug 03 '16 at 13:53
  • for non-integer like 3.15? – Ka Wa Yip Jun 14 '17 at 00:30
  • For me, it seems that I have to add `array.append([x,y])` before reading the rest of the lines, otherwise my array starts with values from the second line and onwards, any clues? – user3374479 May 28 '21 at 08:32
19

To me this kind of seemingly simple problem is what Python is all about. Especially if you're coming from a language like C++, where simple text parsing can be a pain in the butt, you'll really appreciate how Python lets you build the solution from small functional units. I'd keep it really simple with a couple of built-in functions and some generator expressions.

You'll need open(name, mode), myfile.readlines(), mystring.split(), int(myval), and then you'll probably want to use a generator expression and a list comprehension to put them all together in a pythonic way.

# Open the file in 'r' (read) mode; the with block closes it automatically
with open('mynumbers.txt', 'r') as file_handle:
    # Read all the lines of the file into a list of lines
    lines_list = file_handle.readlines()
# Extract dimensions from the first line. Cast values from strings to integers.
cols, rows = (int(val) for val in lines_list[0].split())
# Use a nested list comprehension to get the rest of the data into the matrix
my_data = [[int(val) for val in line.split()] for line in lines_list[1:]]

Look up generator expressions. They can really simplify your code into discrete functional units! Imagine doing the same thing in 4 lines in C++... It would be a monster. I especially like list comprehensions: when I was a C++ guy I always wished I had something like that, and I often ended up building custom functions to construct each kind of array I wanted.
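
As a small illustration of the two constructs used above (a sketch with made-up sample strings, not taken from the original file): a generator expression produces values lazily and can be consumed only once, while a list comprehension builds the whole list immediately.

```python
line = "4 3\n"

# A generator expression is lazy: values are produced only as they are consumed.
gen = (int(val) for val in line.split())
cols, rows = gen          # tuple unpacking consumes the generator
leftover = list(gen)      # nothing is left afterwards, so this is empty

# A list comprehension builds the full list right away and can be reused.
row = [int(val) for val in "1 2 3 4\n".split()]

print(cols, rows, leftover, row)
```

Unpacking works for the header because it has exactly two values; for the matrix rows a list is the better fit since the data is indexed later.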

machine yearning
  • I don't think this works. `cols, rows = (int(val) for val in '4 3\n')` doesn't do what you want. Same for `[int(val) for val in line]` because `line` will be something like `'1 2 3 4\n'` – Jason R. Coombs Jul 05 '11 at 13:57
  • @Jason: Yeah sorry there were a couple of errors in my original code, but the gist was right. Corrected above. I guess that's what iterative development is for! :) – machine yearning Jul 05 '11 at 13:59
  • 3
    In the trivial case OP mentions, the C++ version, while slightly longer, would not be "a monster" as you say. You would use fscanf() or streams and vector<vector<int>> (or even int[][]). And C++ would provide much more control over memory management while reading and parsing the file. – dolphin Jun 18 '14 at 22:50
  • 2
    Actually, ifstreams are simpler to handle than fscanf, which is a C function, not a C++ one. If you're simply parsing text in C++ and have anything more complex than the suggested python solution you're clearly doing something wrong. – Andrej Jul 10 '15 at 01:43
6

I'm not sure why you need w and h. If these values are actually required and mean that only the specified number of rows and columns should be read, then you can try the following:

output = []
with open(r'c:\file.txt', 'r') as f:
    w, h = map(int, f.readline().split())
    tmp = []
    for i, line in enumerate(f):
        if i == h:
            break
        # list() is needed on Python 3, where map() returns a lazy iterator
        tmp.append(list(map(int, line.split()[:w])))
    output.append(tmp)
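
The same clipping idea can be sketched with `itertools.islice`, which stops after h lines without an explicit counter. The function name `read_clipped` and the `io.StringIO` sample are hypothetical, used only for illustration, and the sketch assumes Python 3.

```python
import io
from itertools import islice

def read_clipped(f):
    """Read a 'w h' header, then at most h rows clipped to at most w columns."""
    w, h = map(int, f.readline().split())
    # islice stops after h lines; the slice [:w] drops any extra columns.
    return [[int(x) for x in line.split()[:w]] for line in islice(f, h)]

# The sample deliberately has one extra row and one extra column.
sample = io.StringIO("2 2\n1 2 3\n4 5 6\n7 8 9\n")
print(read_clipped(sample))
```

Unlike the loop above, this returns the matrix directly rather than wrapping it in an outer list.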
Artsiom Rudzenka
  • 1
    Interesting approach to include the header data as well, I didn't even think of that. +1 for completeness... but it's a bit lengthy / hard to read :) – machine yearning Jul 05 '11 at 14:22
  • 1
    Thanx) I have created extended solution that iterates line by line and creates list of list for all occurrences of w,h. However the best answer is already selected))) – Artsiom Rudzenka Jul 05 '11 at 14:47
2

This works with both Python 2 (e.g. Python 2.7.10) and Python 3 (e.g. Python 3.6.4):

import numpy as np

with open('in.txt') as f:
  # the header is read as "rows cols" here; swap the two names if your file lists the width first
  rows, cols = np.fromfile(f, dtype=int, count=2, sep=" ")
  data = np.fromfile(f, dtype=int, count=cols*rows, sep=" ").reshape((rows, cols))

Another way, which also works with both Python 2 and Python 3 and handles complex matrices as well (see the example below; just change int to complex):

with open('in.txt') as f:
   data = []
   cols,rows=list(map(int, f.readline().split()))
   for i in range(0, rows):
      data.append(list(map(int, f.readline().split()[:cols])))
print (data)

I updated the code; this method works for any number of matrices and any kind of matrix (int, complex, float) in the in.txt input file.

The program below performs matrix multiplication as an application. It is written for Python 2; to run it with Python 3, make the following changes:

print to print()

and

print "%7g" %a[i,j],    to     print ("%7g" %a[i,j],end="")

the script:

import numpy as np

def printMatrix(a):
   print ("Matrix["+("%d" %a.shape[0])+"]["+("%d" %a.shape[1])+"]")
   rows = a.shape[0]
   cols = a.shape[1]
   for i in range(0,rows):
      for j in range(0,cols):
         print "%7g" %a[i,j],
      print
   print      

def readMatrixFile(FileName):
   rows,cols=np.fromfile(FileName, dtype=int, count=2, sep=" ")
   a = np.fromfile(FileName, dtype=float, count=rows*cols, sep=" ").reshape((rows,cols))
   return a

def readMatrixFileComplex(FileName):
   data = []
   rows,cols=list(map(int, FileName.readline().split()))
   for i in range(0, rows):
      data.append(list(map(complex, FileName.readline().split()[:cols])))
   a = np.array(data)
   return a

f = open('in.txt')
a=readMatrixFile(f)
printMatrix(a)
b=readMatrixFile(f)
printMatrix(b)
a1=readMatrixFile(f)
printMatrix(a1)
b1=readMatrixFile(f)
printMatrix(b1)
f.close()

print ("matrix multiplication")
c = np.dot(a,b)
printMatrix(c)
c1 = np.dot(a1,b1)
printMatrix(c1)

with open('complex_in.txt') as fid:
  a2=readMatrixFileComplex(fid)
  print(a2)
  b2=readMatrixFileComplex(fid)
  print(b2)

print ("complex matrix multiplication")
c2 = np.dot(a2,b2)
print(c2)
print ("real part of complex matrix")
printMatrix(c2.real)
print ("imaginary part of complex matrix")
printMatrix(c2.imag)

As the input file I use in.txt:

4 4
1 1 1 1
2 4 8 16
3 9 27 81
4 16 64 256
4 3
4.02 -3.0 4.0
-13.0 19.0 -7.0
3.0 -2.0 7.0
-1.0 1.0 -1.0
3 4
1 2 -2 0
-3 4 7 2
6 0 3 1
4 2
-1 3
0 9
1 -11
4 -5

and complex_in.txt

3 4
1+1j 2+2j -2-2j 0+0j
-3-3j 4+4j 7+7j 2+2j
6+6j 0+0j 3+3j 1+1j
4 2
-1-1j 3+3j
0+0j 9+9j
1+1j -11-11j
4+4j -5-5j

and the output looks like:

Matrix[4][4]
     1      1      1      1
     2      4      8     16
     3      9     27     81
     4     16     64    256

Matrix[4][3]
  4.02     -3      4
   -13     19     -7
     3     -2      7
    -1      1     -1

Matrix[3][4]
     1      2     -2      0
    -3      4      7      2
     6      0      3      1

Matrix[4][2]
    -1      3
     0      9
     1    -11
     4     -5

matrix multiplication
Matrix[4][3]
  -6.98      15       3
 -35.96      70      20
-104.94     189      57
-255.92     420      96

Matrix[3][2]
    -3     43
    18    -60
     1    -20

[[ 1.+1.j  2.+2.j -2.-2.j  0.+0.j]
 [-3.-3.j  4.+4.j  7.+7.j  2.+2.j]
 [ 6.+6.j  0.+0.j  3.+3.j  1.+1.j]]
[[ -1. -1.j   3. +3.j]
 [  0. +0.j   9. +9.j]
 [  1. +1.j -11.-11.j]
 [  4. +4.j  -5. -5.j]]
complex matrix multiplication
[[ 0.  -6.j  0. +86.j]
 [ 0. +36.j  0.-120.j]
 [ 0.  +2.j  0. -40.j]]
real part of complex matrix
Matrix[3][2]
      0       0
      0       0
      0       0

imaginary part of complex matrix
Matrix[3][2]
     -6      86
     36    -120
      2     -40
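
For comparison, reading several consecutive matrices from one file can also be sketched without NumPy. This assumes Python 3 (where iterating over a file and calling `readline()` can be mixed) and the "rows cols" header order used in in.txt above; the generator name `read_matrices` and its `convert` parameter are hypothetical.

```python
import io

def read_matrices(f, convert=float):
    """Yield every matrix in an open text file; each starts with a 'rows cols' line."""
    for header in f:
        if not header.strip():
            continue  # tolerate blank lines between matrices
        rows, cols = map(int, header.split())
        # Read exactly `rows` lines and keep at most `cols` values from each.
        yield [[convert(x) for x in f.readline().split()[:cols]] for _ in range(rows)]

# io.StringIO stands in for an open file; it holds a 2x2 and a 1x3 matrix.
sample = io.StringIO("2 2\n1 2\n3 4\n1 3\n5 6 7\n")
for matrix in read_matrices(sample):
    print(matrix)
```

Passing `convert=complex` handles entries such as 1+1j in the same way.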
Andrei
2

To keep the answer simple, here is a program that reads integers from a file (one per line) and sorts them:

f = open("input.txt", 'r')

nums = f.readlines()
nums = [int(i) for i in nums]

After reading the lines of the file, each string is converted to an integer.

nums.sort()

This sorts the numbers.

f.close()

f = open("input.txt", 'w')
for num in nums:
    f.write("%d\n" %num)

f.close()

This writes them back. As easy as that. Hope this helps.
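
The same read, sort, and write-back steps can be sketched with `with` blocks so the file is closed even if a conversion fails. The function name `sort_numbers_in_file` is hypothetical, and the temporary file exists only for the demonstration.

```python
import os
import tempfile

def sort_numbers_in_file(path):
    """Sort the integers stored one per line in `path`, rewriting the file in place."""
    with open(path) as f:
        nums = sorted(int(line) for line in f if line.strip())
    with open(path, "w") as f:
        f.writelines("%d\n" % n for n in nums)
    return nums

# Demonstration on a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("3\n1\n2\n")
print(sort_numbers_in_file(tmp.name))
os.remove(tmp.name)
```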

Stale Noobs
1

The shortest I can think of is:

with open("file") as f:
    (w, h), data = [int(x) for x in f.readline().split()], [int(x) for x in f.read().split()]

You can separate (w, h) and data if it looks neater. Note that data here is a flat list of all the numbers, not a list of rows.
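
If a two-dimensional result is needed, the flat list can be cut into rows of length w. This is a minimal sketch: `io.StringIO` stands in for the real file, and the variable name `rows` is an illustrative assumption.

```python
import io

# io.StringIO replaces open("file") so the example is self-contained.
with io.StringIO("4 3\n1 2 3 4\n2 3 4 5\n6 7 8 9\n") as f:
    (w, h), data = [int(x) for x in f.readline().split()], [int(x) for x in f.read().split()]

# Slice the flat list into h consecutive chunks of w values each.
rows = [data[i:i + w] for i in range(0, w * h, w)]
print(rows)
```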

Yu Hao
The Potato