How to combine two text files into one?

Question

I have two text files. I want to combine some of their columns in a new text file.

I am trying this, but it is not working:

with open('1','r') as first:
    with open('2', 'r') as second:
        data1 = first.readlines()
        for line in data1:
            output = [(item.strip(), line.split(' ')[2]) for item in second]
            f = open("1+2","w")
            f.write("%s  %s\n" .format(output))
            f.close()

First text file that I have:

Second text file that I have:

I want a new file containing the column from the first file and the second column from the second file, like this:

score 0 · Answer 1 · answered Jul 19 '19 at 10:20

You can iterate over the two files line by line in parallel, and concatenate the first file's column with the second column of the second file:

with open('file_1.txt') as f1, open('file_2.txt') as f2, open('new_file.txt', 'w') as fr:
    for line in ("{} {}".format(l1.rstrip('\n'), l2.split(maxsplit=1)[1]) for l1, l2 in zip(f1, f2)):
        fr.write(line)

If you're sure that the columns are separated by a single space, you can also use str.partition like:

l2.partition(' ')[-1]
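To illustrate the difference between the two approaches, here is a small sketch. The sample lines are made up for the demonstration and are not taken from the question's files:

```python
# Hypothetical sample line from the second file: two columns, one space.
line = "1 3\n"

# split(maxsplit=1) splits on any run of whitespace and keeps the rest as-is.
via_split = line.split(maxsplit=1)[1]

# partition(' ') cuts at the first single space only.
via_partition = line.partition(' ')[-1]

# With several spaces between the columns, the results differ:
wide = "1   3\n"
wide_split = wide.split(maxsplit=1)[1]       # leading spaces are dropped
wide_partition = wide.partition(' ')[-1]     # remaining spaces are kept

print(repr(via_split), repr(via_partition))
print(repr(wide_split), repr(wide_partition))
```

So partition is only a safe substitute when the separator really is a single space.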

Example:

In [28]: with open('file_1.txt') as f1, open('file_2.txt') as f2, open('new_file.txt', 'w') as fr:
    ...:     for line in ("{} {}".format(l1.rstrip('\n'), l2.split(maxsplit=1)[1]) for l1, l2 in zip(f1, f2)):
    ...:         fr.write(line)
    ...:     

In [29]: cat new_file.txt
1 3
2 5
3 7
4 3

As an aside, zip stops at the end of the shorter file. If the two files don't have the same number of rows and you want to keep going until the longer one is exhausted, look at itertools.zip_longest instead of zip.
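A minimal sketch of the zip_longest variant, using in-memory StringIO objects as stand-ins for the files. The contents and the choice of an empty string as fillvalue are assumptions for illustration:

```python
from io import StringIO
from itertools import zip_longest

# In-memory stand-ins for the two files (hypothetical contents);
# the first one has an extra row.
f1 = StringIO("1\n2\n3\n4\n5\n")
f2 = StringIO("1 3\n2 5\n5 7\n7 3\n")

merged = []
# fillvalue="" supplies an empty line once the shorter file runs out.
for l1, l2 in zip_longest(f1, f2, fillvalue=""):
    left = l1.strip()
    parts = l2.split()
    right = parts[1] if len(parts) > 1 else ""
    merged.append("{} {}".format(left, right).strip())

print(merged)
```

How to treat the missing side (empty column, placeholder value, skipping the row) is a design decision; the empty string here is just one option.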

score 0 · Answer 2 · answered Jul 19 '19 at 10:26

Assuming both of your files contain numeric data, you can use the numpy module.

loadtxt loads a text file into an array.
savetxt saves an array to a text file. You can also specify the format of the saved numbers with the fmt option.

Here is the code:

import numpy as np

data1 = np.loadtxt("file1.txt")
data2 = np.loadtxt("file2.txt")
print(data1)
# [1. 2. 3. 4.]
print(data2)
# [[1. 3.]
#  [2. 5.]
#  [5. 7.]
#  [7. 3.]]

data2[:, 0] = data1
print(data2)
# [[1. 3.]
#  [2. 5.]
#  [3. 7.]
#  [4. 3.]]
np.savetxt('output.txt', data2, fmt="%d")

score 0 · Answer 3 · answered Jul 19 '19 at 10:26

# On Python 3 the built-in zip is already lazy; itertools.izip exists only on Python 2.
with open("file1.txt") as textfile1, open("file2.txt") as textfile2, open('output.txt', 'w') as out:
    for x, y in zip(textfile1, textfile2):
        x = x.strip()                # value from the first file
        y = y.split(" ")[1].strip()  # second column of the second file
        print("{0} {1}".format(x, y))
        out.write("{0} {1}\n".format(x, y))

score 0 · Answer 4 · answered Jul 19 '19 at 10:38

There are many interesting answers as to how to do that, but none of them show how to fix your code. I find it better for learning when we understand our own mistakes, rather than get a solution ;)

The tuple on that line has the object names the wrong way around: you want line (from the 1st file) stripped, and item (from the 2nd file) split, taking its second element (that is, index [1]).

With those small changes (and others, described in comments), we get:

with open('1','r') as first:
    with open('2', 'r') as second:
        #data1 = first.readlines() #don't do that, iterate over the file
        for line in first: #changed
            output = [(line.strip(), item.split(' ')[1]) for item in second]
            f = open("1+2","a") #use mode "a" for appending - otherwise you're overwriting your data!
            f.write("{}  {}".format(output)) # don't mix python2 and python3 syntax, removed extra newline
            f.close()

But it's still wrong. Why? Because of for item in second: that loop consumes the whole second file while you are still processing the very first line of the 1st file.
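A small sketch of that effect, with a StringIO object standing in for the second file (the contents are made up). A file object is an iterator, so once a loop has run through it, a second loop finds nothing left:

```python
from io import StringIO

# Stand-in for the opened second file (hypothetical contents).
second = StringIO("1 3\n2 5\n5 7\n7 3\n")

# The first comprehension reads every line of the "file"...
first_pass = [item.split(' ')[1].strip() for item in second]

# ...so a second pass over the same object yields nothing.
second_pass = [item.split(' ')[1].strip() for item in second]

print(first_pass)
print(second_pass)
```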

We need to change it so that we only take one element. I'd recommend you read this question and explanations about iterators.

Now let's apply this knowledge: second is an iterator. We only need one element from it and we need to do it manually (because we're in another loop - looping over 2 things at once is a tricky thing), so we'll be using next(second):

with open('1','r') as first:
    with open('2', 'r') as second:
        for line in first: 
            item = next(second)
            output = (line.strip(), item.split(' ')[1]) #no list comprehension here
            f = open("1+2","a") 
            f.write("{}  {}".format(*output)) #you have to unpack the tuple
            f.close()

Explanation about unpacking: basically, when you pass just output, Python sees it as one element and doesn't know what to fill the other {} with. You have to say "hey, treat this iterable (in this case: a 2-element tuple) as separate elements, not as a whole", and that's what the * does. :)
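A quick sketch of the difference, using a made-up tuple in place of the values read from the files:

```python
# Hypothetical pair of values, as built inside the loop above.
output = ("1", "3")

# With *, the tuple is spread over the two {} placeholders.
unpacked = "{}  {}".format(*output)

# Without *, the whole tuple is a single argument, so the second {}
# has nothing to use and str.format raises IndexError.
try:
    "{}  {}".format(output)
    error = None
except IndexError as exc:
    error = exc

print(unpacked)
print(type(error).__name__)
```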
