
I'm trying to read the lines of a file into a list so that every N lines end up in the same tuple. Assuming the file is valid, so the number of lines is a multiple of N, how can I achieve it?

The way I read the lines into the list:

def readFileIntoAList(file, N):
    with open(file) as f:
        lines = [line.rstrip('\n') for line in f]
    return lines

What change do I have to make, using N, so that the result is a list of tuples, each of length N? For example, given the following file content:

ABC
abc xyz
123
XYZ
xyz abc
321

The output will be:

[("ABC","abc xyz","123"),("XYZ,"xyz abc",321")]
vesii
    Possible duplicate of [How do you split a list into evenly sized chunks?](https://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks) – Mateen Ulhaq Sep 22 '19 at 12:19
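
For reference, the grouping idiom from that linked question maps directly onto this problem. A minimal sketch, assuming (as the question states) that the line count is an exact multiple of N:

def readFileIntoAList(file, N):
    with open(file) as f:
        lines = [line.rstrip('\n') for line in f]
    # zip pulls from the same iterator N times, so consecutive
    # lines land in the same tuple; leftover lines would be dropped
    return list(zip(*[iter(lines)] * N))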

4 Answers


You could try using a chunking function:

def readFileIntoAList(file, n):
    with open(file) as f:
        lines = [line.rstrip('\n') for line in f]
    # slice the list into consecutive chunks of n lines, as tuples
    return [tuple(lines[i:i + n]) for i in range(0, len(lines), n)]

This will split the list of lines in the file into evenly sized chunks.
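
For example, assuming the six-line sample from the question is saved as data.txt (a file name chosen here purely for illustration):

>>> readFileIntoAList("data.txt", 3)
[('ABC', 'abc xyz', '123'), ('XYZ', 'xyz abc', '321')]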

miike3459

One way would be:

>>> data = []
>>> N = 3
>>> with open('/tmp/data') as f:
...     while True:
...         chunk = []
...         for i in range(N):
...             chunk.append(f.readline().strip('\n'))
...         if any(not c for c in chunk):
...             break
...         data.append(tuple(chunk))
...
>>> print(data)
[('ABC', 'abc xyz', '123'), ('XYZ', 'xyz abc', '321')]

Note that this assumes the file has the right number of lines and no empty lines; otherwise the loop above breaks early and silently drops the remaining data. A solution without that risk is:

data = []
N = 3
with open('/tmp/data') as f:
    i = 0
    chunk = []
    for line in f:
        chunk.append(line.strip('\n'))
        i += 1
        if i % N == 0:
            data.append(tuple(chunk))
            chunk = []

Neither of these approaches reads the whole file into memory, which should be more efficient when you process large files.
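
If each chunk is only needed once, the same streaming behaviour can also be packaged as a generator. A minimal sketch, with readFileInChunks being a hypothetical helper name:

def readFileInChunks(file, N):
    # yields one tuple of N stripped lines at a time,
    # never holding more than N lines in memory
    with open(file) as f:
        chunk = []
        for line in f:
            chunk.append(line.rstrip('\n'))
            if len(chunk) == N:
                yield tuple(chunk)
                chunk = []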

urban
    Doesn't answer the question, since OP is looking to chunk by a variable number of lines. – miike3459 Sep 22 '19 at 12:21
  • True... missed that! Fixing - trying to find a way that does not require reading the whole file... – urban Sep 22 '19 at 12:23
    More "pythonic" will be to use [`enumerate()`](https://docs.python.org/3/library/functions.html#enumerate) for indexing instead of manual increment. – Olvin Roght Sep 22 '19 at 13:03

You can use itertools.islice():

from itertools import islice

N = 3  # chunk size
with open("filename") as f:
    lines = []
    chunk = tuple(s.strip() for s in islice(f, N))
    while chunk:
        lines.append(chunk)
        chunk = tuple(s.strip() for s in islice(f, N))

You can also use map() if you prefer a functional style:

chunk = tuple(map(str.strip, islice(f, N)))
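
The surrounding loop can also be collapsed with the two-argument form of iter(), which keeps calling the lambda until it returns the empty tuple used as the sentinel. A sketch of that variant:

from itertools import islice

N = 3
with open("filename") as f:
    lines = [tuple(map(str.strip, chunk))
             for chunk in iter(lambda: tuple(islice(f, N)), ())]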
Olvin Roght
import math

def readFileIntoAList(file, N):
    lines = list()
    with open(file) as f:
        lines1 = [lineNew.rstrip("\n") for lineNew in f]
        # math.ceil keeps the final, shorter chunk when the
        # line count is not an exact multiple of N
        for a in range(math.ceil(len(lines1) / N)):
            lines.append(tuple(lines1[a * N:(a + 1) * N]))
    return lines

I used a loop and tried to keep it simple.

Gökhan