How do I read in specific characters from each line of text and write it out to another file?

Question

I have a txt file named "tclust.txt" and another named "ef_blue.txt". I'm trying to write a Python script that copies certain characters from ef_blue.txt into tclust.txt. So far, I can only read in the whole of ef_blue.txt and have everything from that file go to tclust.txt. My ef_blue.txt has multiple lines of text, but I only want to take certain characters from each line (e.g. "7.827382" and "6.432342" from line 2).

blue = open("ef_blue.xpk", "rt")
contents = blue.read()

with open("tclust.txt","a") as f2: 
    f2.writelines(contents)

blue.close()
f2.close()

Edit: My tclust.txt file looks like this:

"type rbclust

Peak 0 8.5 0.05 4.0 0.05

Atom 0 125.H8 126.H1' label dataset sw sf"

My ef_blue.xpk file looks like this:

"label dataset sw sf

1H 1H_2

NOESY_F1eF2f.nv

4807.69238281 4803.07373047

600.402832031 600.402832031

1H.L 1H.P 1H.W 1H.B 1H.E 1H.J 1H.U 1H_2.L 1H_2.P 1H_2.W 1H_2.B 1H_2.E 1H_2.J 1H_2.U vol int stat comment flag0 flag8 flag9

0 {} 7.45766 0.01702 0.03286 ++ {0.0} {} {} 5.68094 0.07678 0.15049 ++ {0.0} {} 0.0 4.8459 0 {} 0 0 0

1 {} 8.11276 0.02278 0.03212 ++ {0.0} {} {} 5.52142 0.07827 0.11252 ++ {0.0} {} 0.0 2.0824 0 {} 0 0 0

2 {} 7.85285 0.02369 0.02232 ++ {0.0} {} {} 5.52444 0.07280 0.06773 ++ {0.0} {} 0.0 0.8844 0 {} 0 0 0

3 {} 7.45819 0.01630 0.02914 ++ {0.0} {} {} 5.42587 0.07081 0.11733 ++ {0.0} {} 0.0 2.8708 0 {} 0 0 0

4 {} 7.89775 0.01106 0.00074 ++ {0.0} {} {} 5.23989 0.07077 0.00226 ++ {0.0} {} 0.0 0.4846 0 {} 0 0 0

5 {} 7.85335 0.02665 0.03635 ++ {0.0} {} {} 5.23688 0.09117 0.12591 ++ {0.0} {} 0.0 1.5210 0 {} 0 0 0"

So what I want to do is take values such as "7.45766" and "5.68094" from line 7 of my ef_blue.xpk and write them out starting at line 3 of my tclust.txt file.

So I would like my tclust.txt file to look like:

type rbclust
Peak 0 8.5 0.05 4.0 0.05
       7.45766   5.68094
       8.11276   5.52142
 .... etc
Atom 0 125.H8 126.H1' label dataset sw sf

Edit2: @open-source

This is the output I get

What is your expected output? What is a sample of your "tclust.txt" and how should the final "tclust.txt" look like after the manipulation? — idjaw, Jul 23 '17 at 16:42
hi, I just edited my question so hopefully that answers yours — user130306, Jul 23 '17 at 17:03
@user130306 Sam Chats means something like this: https://stackoverflow.com/questions/13423624/python-regular-expression-match — Anton vBR, Jul 23 '17 at 17:21

score 0 · Accepted Answer · answered Jul 23 '17 at 17:27

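# use "ef_blue.xpk" here if that is the actual file name
with open("ef_blue.txt", "rt") as blue:
    contents = blue.readlines()  # one list entry per line

with open("tclust.txt", "a") as f2:
    for cont in range(len(contents)):
        if cont > 5:  # skip the first six lines
            # split() without an argument splits on any run of whitespace
            a = contents[cont].split()
            # blank or short lines have no a[2] / a[9] and would raise IndexError;
            # data rows start with an integer peak index
            if len(a) > 9 and a[0].isdigit():
                print(a[2] + '  ' + a[9])
                f2.write(a[2] + '  ' + a[9] + '\n')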

Try this: use readlines to turn the file into a list with one entry per line, loop over that list with a for loop and check whether the current line is one of the lines you want, then split that line into a list of fields. Tell me if that works.

Mauricio Cortazar


Hi! Thank you for answering. I just tried it but it says: – user130306 Jul 23 '17 at 17:54
File "readIn.py", line 13, in print(a[2]+ ' ' + a[9]) IndexError: index out of range: 2 – user130306 Jul 23 '17 at 17:55
Well, that did work for me. Try deleting that line, it is not important, and delete tclust.txt or rename it, then tell me. Type print(a) instead of that print, just to check that the list exists. – Mauricio Cortazar Jul 23 '17 at 23:57
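
The IndexError reported in the comments is consistent with indexing into a blank or short line, which has no third field. Below is a minimal sketch of the same split-based idea as a small function that only accepts data rows. It assumes, judging from the sample file in the question, that data rows start with an integer peak index and that the two wanted values sit at field positions 2 and 9. The name `extract_shift_pairs` and its parameters are hypothetical, chosen for illustration.

```python
def extract_shift_pairs(lines, first_col=2, second_col=9):
    """Return (first, second) string pairs from the data rows of a peak list.

    Assumption: a data row starts with an integer peak index and has
    enough fields; blank lines and header lines are skipped.
    """
    pairs = []
    for line in lines:
        fields = line.split()  # no argument: split on any run of whitespace
        if len(fields) > max(first_col, second_col) and fields[0].isdigit():
            pairs.append((fields[first_col], fields[second_col]))
    return pairs


# A few lines copied from the sample ef_blue.xpk, including blank lines
sample = [
    "label dataset sw sf\n",
    "\n",
    "4807.69238281 4803.07373047\n",
    "\n",
    "1H.L 1H.P 1H.W 1H.B 1H.E 1H.J 1H.U 1H_2.L 1H_2.P 1H_2.W 1H_2.B "
    "1H_2.E 1H_2.J 1H_2.U vol int stat comment flag0 flag8 flag9\n",
    "\n",
    "0 {} 7.45766 0.01702 0.03286 ++ {0.0} {} {} 5.68094 0.07678 0.15049 "
    "++ {0.0} {} 0.0 4.8459 0 {} 0 0 0\n",
    "\n",
    "1 {} 8.11276 0.02278 0.03212 ++ {0.0} {} {} 5.52142 0.07827 0.11252 "
    "++ {0.0} {} 0.0 2.0824 0 {} 0 0 0\n",
]

pairs = extract_shift_pairs(sample)
print(pairs)
```

Checking the first field rather than counting header lines means the function does not care whether the file has blank lines between rows.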

niraj · Answer 2 · 2017-07-23T18:58:11.053

You can try the following:

import re

# Read tclust.txt line by line and keep the last line in a variable
with open('tclust.txt') as f:
    lines = f.readlines()
last_line = lines[-1]

# Rewrite tclust.txt without its last two lines
# (the last line and the empty line before it)
with open('tclust.txt', 'w') as f:
    f.writelines(lines[:-2])

# Open both files
with open("ef_blue.xpk", "rt") as f1, open("tclust.txt", "a") as f2:
    # Read ef_blue.xpk line by line
    for line in f1:
        # look for numbers in 1.23232 format (raw string, so \s and \. are
        # passed to re unchanged); each match keeps its leading whitespace
        float_num = re.findall(r"\s[1-9]\.[0-9]+", line)
        # if the line has at least two numbers in that format,
        # assume the first two are the values wanted
        if len(float_num) > 1:
            # write with 6 spaces at the beginning, separated by a tab
            f2.write(' ' * 6 + float_num[0] + '\t' + float_num[1] + '\n')

    # finally write back the last line removed earlier
    f2.write(last_line)

Output for tclust.txt:

"type rbclust

Peak 0 8.5 0.05 4.0 0.05
       7.45766   5.68094
       8.11276   5.52142
       7.85285   5.52444
       7.45819   5.42587
       7.89775   5.23989
       7.85335   5.23688
Atom 0 125.H8 126.H1' label dataset sw sf"

Input: ef_blue.xpk

"label dataset sw sf

1H 1H_2

NOESY_F1eF2f.nv

4807.69238281 4803.07373047

600.402832031 600.402832031

1H.L 1H.P 1H.W 1H.B 1H.E 1H.J 1H.U 1H_2.L 1H_2.P 1H_2.W 1H_2.B 1H_2.E 1H_2.J 1H_2.U vol int stat comment flag0 flag8 flag9

0 {} 7.45766 0.01702 0.03286 ++ {0.0} {} {} 5.68094 0.07678 0.15049 ++ {0.0} {} 0.0 4.8459 0 {} 0 0 0

1 {} 8.11276 0.02278 0.03212 ++ {0.0} {} {} 5.52142 0.07827 0.11252 ++ {0.0} {} 0.0 2.0824 0 {} 0 0 0

2 {} 7.85285 0.02369 0.02232 ++ {0.0} {} {} 5.52444 0.07280 0.06773 ++ {0.0} {} 0.0 0.8844 0 {} 0 0 0

3 {} 7.45819 0.01630 0.02914 ++ {0.0} {} {} 5.42587 0.07081 0.11733 ++ {0.0} {} 0.0 2.8708 0 {} 0 0 0

4 {} 7.89775 0.01106 0.00074 ++ {0.0} {} {} 5.23989 0.07077 0.00226 ++ {0.0} {} 0.0 0.4846 0 {} 0 0 0

5 {} 7.85335 0.02665 0.03635 ++ {0.0} {} {} 5.23688 0.09117 0.12591 ++ {0.0} {} 0.0 1.5210 0 {} 0 0 0"

Input: tclust.txt

"type rbclust

Peak 0 8.5 0.05 4.0 0.05

Atom 0 125.H8 126.H1' label dataset sw sf"
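
To see what the regular expression in this answer actually picks up, here is a small sketch that runs it on the first data row of the sample. The pattern is written as a raw string with a capture group, so the results exclude the leading whitespace character. The name `SHIFT` is only for illustration. Note that the pattern also matches the `vol` column (4.8459), which is why only the first two matches are used, and that it would miss a value with two digits before the decimal point.

```python
import re

# Same idea as the answer's pattern: whitespace, one digit 1-9, a dot, digits.
# The parentheses make findall return only the number, without the whitespace.
SHIFT = re.compile(r"\s([1-9]\.[0-9]+)")

row = ("0 {} 7.45766 0.01702 0.03286 ++ {0.0} {} {} 5.68094 0.07678 0.15049 "
       "++ {0.0} {} 0.0 4.8459 0 {} 0 0 0")

matches = SHIFT.findall(row)
print(matches)      # ['7.45766', '5.68094', '4.8459']
print(matches[:2])  # ['7.45766', '5.68094']

# Header lines from the sample produce no matches
header_matches = SHIFT.findall("4807.69238281 4803.07373047")
print(header_matches)  # []
```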

Hi! Thank you so much, this code definitely helped me. However, I have a problem with the output. Instead of the output that you gave, which is what I want, I get "6 5.68094 8.11276 5.52142 7.85285 5.52444 7.45819 5.42587 7.89775 5.23989 7.85335 5.23688" – user130306, Jul 23 '17 at 18:51
Sorry, that may be hard to read, I'll edit my question to show the output — user130306, Jul 23 '17 at 18:51
@user130306 I used exact format of input you provided. Let me update the answer with input used for two files: — niraj, Jul 23 '17 at 18:55
Hi!, I just realized, the problem was with my formatting. Thank you very much again for all your help. — user130306, Jul 23 '17 at 19:23
@user130306 so did it solve the problem? If it did, you can accept the answer if you want. `Happy Coding`. — niraj, Jul 23 '17 at 19:25
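
Since formatting turned out to be the sticking point, here is a minimal sketch of placing the new rows before the final line entirely in memory, so the file only needs to be written once. It assumes the target layout shown in the question (no blank lines, new rows directly above the last line). The function name `insert_before_last_line`, the 7-space indent and the 3-space separator are assumptions taken from the look of the expected output, not from any specification.

```python
def insert_before_last_line(lines, rows, indent=" " * 7, sep=" " * 3):
    """Return new lines with the formatted rows placed before the last
    non-blank line. Blank lines are dropped (assumed layout)."""
    kept = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if not kept:
        raise ValueError("no non-blank lines to insert before")
    formatted = [indent + sep.join(row) for row in rows]
    return kept[:-1] + formatted + kept[-1:]


# tclust.txt content as shown in the question, with its blank lines
tclust = [
    "type rbclust\n",
    "\n",
    "Peak 0 8.5 0.05 4.0 0.05\n",
    "\n",
    "Atom 0 125.H8 126.H1' label dataset sw sf\n",
]
rows = [("7.45766", "5.68094"), ("8.11276", "5.52142")]

result = insert_before_last_line(tclust, rows)
print("\n".join(result))
```

The result can then be written back with a single `open("tclust.txt", "w")` and `"\n".join(result) + "\n"`.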
