I am looking for code that concatenates several .txt files, but only up to a certain number of lines from each one.

Suppose we have several text files, as follows:

file1.txt:

    AAAAA
    BBBBB
    CCCCC
    DDDDD
    EEEEE

file2.txt:

    FFFFF
    GGGGG
    HHHHH
    IIIII
    JJJJJ

file3.txt:

    KKKKK
    LLLLL
    MMMMM
    NNNNN
    OOOOO

file4.txt:

    PPPPP
    QQQQQ
    RRRRR
    SSSSS
    TTTTT

How can we produce a single log file like the one below, assuming each file is concatenated only up to and including line 3?

result:

    AAAAA
    BBBBB
    CCCCC
    FFFFF
    GGGGG
    HHHHH
    KKKKK
    LLLLL
    MMMMM
    PPPPP
    QQQQQ
    RRRRR

This is for Python 3.7.3. I managed to concatenate the files using the examples in:

Python concatenate text files

but I could not adapt that code to cap the number of lines taken from each file.

My attempt so far (which does not work):

    a = open('newfile.log', 'wb')
    with a as wfd:
            for f in glob.glob(r'*.txt'):
                    with open(f,'rb') as fd:
                            for line in fd:
                                    for line in range (0, 3):
                                            a.write(line)  

Any help?

The error message I get is:

TypeError: a bytes-like object is required, not 'int'

Andrés F

4 Answers

0

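Are you sure you want to reuse the name line in both loops?

    for line in fd:
        for line in range(0, 3):

The inner loop rebinds line, so by the time a.write(line) runs, line is an integer (0, 1, or 2) rather than a line of the file. If you really do want to write those integers, note that the file is opened in binary mode, so they have to be converted to bytes first:

    a.write(str(line).encode())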
Untitled123
0

If I understand you correctly, try this:

import glob

limit = 3
with open('newfile.log', 'wb') as wfd:
    for f in glob.glob(r'*.txt'):
        with open(f, 'rb') as fd:
            line_count = 0
            for line in fd:
                if line_count >= limit:
                    break
                wfd.write(line)
                line_count += 1
Perplexabot
  • `for line_count, line in enumerate(fd):` to avoid manually tracking `line_count`. – Steven Rumbalski Jun 07 '19 at 18:19
  • @AndrésF Here at Stack Overflow, if an answer helped you, the common practice is to [upvote the answer or to mark it as accepted](https://stackoverflow.com/help/someone-answers). The upvotes and the green checkmark indicate to others which answer solved your problem. See [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers) – Gino Mempin Aug 04 '19 at 12:01
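The `enumerate` suggestion in the first comment could look roughly like this. It is a sketch only: the function name `concat_heads` and its parameters are hypothetical, and `sorted()` is an added assumption to make the file order predictable, since `glob.glob()` does not guarantee one.

```python
import glob


def concat_heads(pattern, out_path, limit=3):
    """Write the first `limit` lines of every file matching `pattern` to `out_path`."""
    # Collect the inputs first, sorted so the output order is deterministic.
    names = sorted(glob.glob(pattern))
    with open(out_path, "wb") as wfd:
        for name in names:
            with open(name, "rb") as fd:
                # enumerate() supplies the counter, so no manual line_count is needed.
                for line_count, line in enumerate(fd):
                    if line_count >= limit:
                        break
                    wfd.write(line)
```

For the files in the question, `concat_heads('*.txt', 'newfile.log')` would be the equivalent call. The output file should not itself match the pattern.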
0

The value of line in the first loop is a bytes object, because fd is opened in binary mode. The second loop overwrites it with an int, which is not what write() expects. Instead, you can use writelines() to write a list of lines obtained from readlines(), and slice that list to keep only the first 3 lines. Note that readlines() reads the whole file into memory before the slice is taken:

import glob

with open("newfile.log", "wb") as log:
    for f in glob.glob("*.txt"):
        with open(f, "rb") as fd:
            log.writelines(fd.readlines()[:3])
AmjadHD
0

If you can describe the line numbers you want with a call to range() then you can use itertools.islice for a more direct method:

import glob
from itertools import islice

max_lines = 3

with open('newfile.log', 'wb') as wfd:
    for f in glob.glob(r'*.txt'):
        with open(f, 'rb') as fd:
            # islice stops after max_lines lines, so the rest of the file is never read
            wfd.writelines(islice(fd, max_lines))
Steven Rumbalski
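The remark about range() refers to the fact that `itertools.islice` accepts the same start, stop, and step arguments. A minimal sketch of that more general form follows; the helper name `copy_line_range` and its parameters are hypothetical, and line indices are 0-based with `stop` excluded, as in range().

```python
from itertools import islice


def copy_line_range(src_path, dst, start, stop, step=1):
    """Copy the lines of src_path selected by range(start, stop, step) into dst.

    `dst` is any binary file-like object that is already open for writing.
    """
    with open(src_path, "rb") as fd:
        # islice(fd, start, stop, step) yields the lines whose 0-based index
        # is in range(start, stop, step), reading no further than `stop`.
        dst.writelines(islice(fd, start, stop, step))
```

With the files from the question, `copy_line_range('file1.txt', wfd, 1, 4)` would write BBBBB, CCCCC, and DDDDD.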