Here's an example of my current output:

['5216', 'SMITH', 'VICTORIA', 'F', '2009-12-19']

This is my code:

users1 = open('users1.txt','w')

with open('users.txt', 'r') as f:
    data = f.readlines()

    for line in data:
        words = str(line.split())
        #print words
        f.seek(0)
        users1.write(words)

I would like to read in users.txt and separate the information to send it to users1 and another text file I'll call users2. (Keep in mind this is a hypothetical situation and that I acknowledge it would not make sense to separate this information like I'm suggesting below.)

Is it possible to identify specific columns I'd like to insert into each text file?

For example, if I wanted users1.txt to contain, using my sample output from above, ['5216','2009-12-19'] and users2.txt to contain ['SMITH','VICTORIA'], what should I do?

Christina

3 Answers

You could use indexing and slicing to select items from the list. For example,

In [219]: words = ['5216', 'SMITH', 'VICTORIA', 'F', '2009-12-19']

In [220]: [words[0], words[-1]]
Out[220]: ['5216', '2009-12-19']

In [221]: words[1:3]
Out[221]: ['SMITH', 'VICTORIA']

with open('users.txt', 'r') as f,\
     open('users1.txt','w') as users1,\
     open('users2.txt','w') as users2:
    for line in f:
        words = line.split()
        # Append '\n' so each record ends up on its own line
        users1.write(str([words[0], words[-1]]) + '\n')
        users2.write(str(words[1:3]) + '\n')

Including the brackets [] in the output is non-standard. For portability, and proper handling of quoted strings and strings containing the comma delimiter, you would be better off using the csv module:

import csv
with open('users.txt', 'rb') as f,\
     open('users1.txt','wb') as users1,\
     open('users2.txt','wb') as users2:
    writer1 = csv.writer(users1, delimiter=',')
    writer2 = csv.writer(users2, delimiter=',')     
    for line in f:
        words = line.split()
        writer1.writerow([words[0], words[-1]])
        writer2.writerow(words[1:3])                
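
The binary modes ('rb'/'wb') above are what the csv module expects under Python 2. As a rough sketch of the same idea for Python 3, assuming whitespace-separated input, the files can be opened in text mode with newline='' for the outputs. The function name split_columns and its parameters are made up for illustration and are not part of the original answer:

```python
import csv


def split_columns(src, dst1, dst2):
    """Copy the first and last column of src to dst1, and columns 1-2 to dst2.

    Hypothetical helper: assumes every non-blank line of src holds
    whitespace-separated fields, as in the question's sample row.
    """
    # newline='' lets the csv writer control line endings itself (Python 3).
    with open(src, 'r') as f, \
         open(dst1, 'w', newline='') as users1, \
         open(dst2, 'w', newline='') as users2:
        writer1 = csv.writer(users1, delimiter=',')
        writer2 = csv.writer(users2, delimiter=',')
        for line in f:
            words = line.split()
            if not words:
                continue  # skip blank lines
            writer1.writerow([words[0], words[-1]])  # e.g. ID and date
            writer2.writerow(words[1:3])             # e.g. last and first name
```

Calling split_columns('users.txt', 'users1.txt', 'users2.txt') would then write one comma-separated row per input line to each output file.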
unutbu

I (too) suggest you use the csv module. However, by using its DictReader and DictWriter you can assign field names to each column and use them to easily specify which ones you want to go into which output file. Here's an example of what I mean:

import csv

users_fieldnames = 'ID', 'LAST', 'FIRST', 'SEX', 'DATE'  # input file field names
users1_fieldnames = 'ID', 'DATE'     # fields to go into users1 output file
users2_fieldnames = 'LAST', 'FIRST'  # fields to go into users2 output file

with open('users.txt', 'rb') as inf:
    csvreader = csv.DictReader(inf, fieldnames=users_fieldnames, delimiter=' ')

    with open('users1.txt', 'wb') as outf1, open('users2.txt', 'wb') as outf2:
        csvwriter1 = csv.DictWriter(outf1, fieldnames=users1_fieldnames,
                                    extrasaction='ignore', delimiter=' ')
        csvwriter2 = csv.DictWriter(outf2, fieldnames=users2_fieldnames,
                                    extrasaction='ignore', delimiter=' ')
        for row in csvreader:
            csvwriter1.writerow(row)   # writes data for only users1_fieldnames
            csvwriter2.writerow(row)   # writes data for only users2_fieldnames

Only the columns specified in the constructor calls to csv.DictWriter() will be written to the output file by the corresponding writerow() method call.
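
The 'rb'/'wb' modes in the code above suit Python 2. A possible Python 3 adaptation of the same DictReader/DictWriter approach is sketched below. It assumes fields are separated by exactly one space (with delimiter=' ', repeated spaces would produce empty fields), and the function name split_by_field is invented for this illustration:

```python
import csv

# Field names for the input file, taken from the example above.
USERS_FIELDNAMES = ('ID', 'LAST', 'FIRST', 'SEX', 'DATE')


def split_by_field(src, dst1, fields1, dst2, fields2):
    """Write the columns named in fields1 to dst1 and those in fields2 to dst2.

    Hypothetical helper: assumes single-space-separated input.
    """
    # Python 3: open csv files in text mode with newline=''.
    with open(src, 'r', newline='') as inf, \
         open(dst1, 'w', newline='') as outf1, \
         open(dst2, 'w', newline='') as outf2:
        reader = csv.DictReader(inf, fieldnames=USERS_FIELDNAMES, delimiter=' ')
        # extrasaction='ignore' drops keys not listed in fieldnames.
        writer1 = csv.DictWriter(outf1, fieldnames=fields1,
                                 extrasaction='ignore', delimiter=' ')
        writer2 = csv.DictWriter(outf2, fieldnames=fields2,
                                 extrasaction='ignore', delimiter=' ')
        for row in reader:
            writer1.writerow(row)
            writer2.writerow(row)
```

For the question's layout this could be called as split_by_field('users.txt', 'users1.txt', ('ID', 'DATE'), 'users2.txt', ('LAST', 'FIRST')).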

martineau

If your data has the same structure for every entry, you can use the pandas and numpy packages, which give you a lot of flexibility for selecting whatever columns you need.
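
The answer gives no code, so the following is only a rough sketch of how the pandas route might look. It assumes pandas is installed, that every line of the input has the same five whitespace-separated fields, and that the column names (borrowed from the field names used in another answer) fit the data. The function name split_with_pandas is hypothetical:

```python
import pandas as pd

# Assumed column names for the five whitespace-separated fields.
COLUMNS = ['ID', 'LAST', 'FIRST', 'SEX', 'DATE']


def split_with_pandas(src, dst1, dst2):
    """Write the ID/DATE columns to dst1 and the LAST/FIRST columns to dst2."""
    # dtype=str keeps values such as '5216' as text rather than integers.
    df = pd.read_csv(src, sep=r'\s+', header=None, names=COLUMNS, dtype=str)
    # Select columns by name and write them out without header or index.
    df[['ID', 'DATE']].to_csv(dst1, sep=' ', header=False, index=False)
    df[['LAST', 'FIRST']].to_csv(dst2, sep=' ', header=False, index=False)
```

Selecting by column name, as in df[['ID', 'DATE']], is the part that makes it easy to route different columns to different files.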

flamenco