Parse flat-file (positional text-file) to read the wavelength

Question

I have the following text file with data:

FI R       83.0000m   34.960    1.1262      Fe 2      1.32055m   33.626    0.0522      N  
2      5754.61A   33.290    0.0241
TI R       1800.00m   33.092    0.0153      Fe 2      1.24854m   32.645    0.0054      N  
2      915.612A   31.997    0.0012
NI Ra      2.85000m   36.291   24.1132      Fe 2      7637.54A   33.077    0.0147

What I want is to obtain the third column, which is the wavelength of the emergent line, but my problem arises when I put the condition in the if statement.

(Name1, ion1, wavelength1, da1, de1,
 name2, ion2, wavelength2, da2, de2,
 name3, ion3, wavelength3, da3, de3) = np.genfromtxt(
    'Emergent_line.txt', skip_header=3, delimiter="", unpack=True)

if(Name1=="Fe" and ion1==2):
    print(wavelength1)
elif(name2=="Fe" and ion2==2):
    print(wavelength2)
elif(name3=="Fe" and ion3==2):
    print(wavelength3)

In the txt I want to find the wavelength for Fe 2, but I think the problem is that the wavelength has a letter at the end. I don't want to delete it, because I have a long list like that. I tried other forms, but I haven't solved it.

What is the text (_wavelength_) you want to extract? Please [edit] your question and add an example. — hc_dev, May 22 '22 at 17:20

thamuppet · Accepted Answer · 2022-05-23T06:36:46.710

I think you are better off using a regular expression (Python's re module).

Example:

import re


text = '''FI R       83.0000m   34.960    1.1262      Fe 2      1.32055m   33.626    0.0522      N  
2      5754.61A   33.290    0.0241
TI R       1800.00m   33.092    0.0153      Fe 2      1.24854m   32.645    0.0054      N  
2      915.612A   31.997    0.0012
NI Ra      2.85000m   36.291   24.1132      Fe 2      7637.54A   33.077    0.0147'''

# Raw string so that \s reaches the regex engine unchanged.
find_this = re.findall(r'(Fe 2.*?[0-9].*?)\s', text)
print(find_this)

Output:

['Fe 2      1.32055m', 'Fe 2      1.24854m', 'Fe 2      7637.54A']

[Program finished]

Or if you only want the values.

# Same pattern, but only the wavelength itself is captured.
find_this = re.findall(r'Fe 2.*?([0-9].*?)\s', text)

Output:

['1.32055m', '1.24854m', '7637.54A']

[Program finished]
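
If the two numbers that follow the wavelength are needed as well, the same idea can probably be extended with more capture groups. This is only a minimal sketch: it assumes every Fe 2 entry is followed by a wavelength and two numeric columns, and the names intensity and abundance are taken from the asker's description of the columns, not from the file itself. The sample text is shortened to two lines.

```python
import re

text = '''FI R       83.0000m   34.960    1.1262      Fe 2      1.32055m   33.626    0.0522
NI Ra      2.85000m   36.291   24.1132      Fe 2      7637.54A   33.077    0.0147'''

# Three groups: wavelength (number plus unit letter), then two numeric columns.
# The column meanings are an assumption based on the asker's description.
pattern = re.compile(r'\bFe\s+2\s+([0-9.]+[A-Za-z])\s+([0-9.]+)\s+([0-9.]+)')

rows = [
    {'wavelength': w, 'intensity': float(i), 'abundance': float(a)}
    for w, i, a in pattern.findall(text)
]

for row in rows:
    print(row)
```

Each dictionary keeps the wavelength string untouched, so the trailing letter is preserved, while the other two columns become floats that can be compared or summed.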

ANSWER TO NEW QUESTION

Here's an example of how you could pick out the values between 1.35 and 1.40, using a for loop and converting each value to a float. Once the value is a float, we can use a condition like this one:

if (float_value >= 1.35) and (float_value <= 1.40):
    print(value)

If the value matches, the loop prints the untouched string, keeping the trailing letter.

Here's the full code (I shortened the text to make it easier to read):

import re


text = '''Fe 2      1.405A   33.077    0.0147
Fe 2      1.305A   33.077    0.0147
Fe 2      1.345A   33.077    0.0147
Fe 2      1.35A   33.077    0.0147
Fe 2      1.35623A   33.077    0.0147
Fe 2      1.40A   33.077    0.0147
Fe 2      1.37A   33.077    0.0147
Fe 2      1.41A   33.077    0.0147'''


# Raw string so that \s reaches the regex engine unchanged.
find_this = re.findall(r'Fe 2.*?([0-9].*?)\s', text)

for value in find_this:
    del_letters = re.sub('[A-Za-z]', '', value)
    float_value = float(del_letters)
    if (float_value >= 1.35) and (float_value <= 1.40):
        print(value)

Output:

1.35A
1.35623A
1.40A
1.37A
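
Note that the loop above strips the letter before comparing, so 1.37A and 1.37m would both pass the test. If the letter matters (the follow-up asked for values between 1.35m and 1.40m), one option is to capture the number and the letter separately. A rough sketch, where wanted_unit, low and high are names made up for this example and the sample data is invented:

```python
import re

text = '''Fe 2      1.37A   33.077    0.0147
Fe 2      1.37m   33.077    0.0147
Fe 2      1.405m   33.077    0.0147
Fe 2      1.35m   33.077    0.0147'''

# Capture the number and the trailing unit letter as two separate groups.
pairs = re.findall(r'Fe 2\s+([0-9.]+)([A-Za-z])\s', text)

wanted_unit = 'm'        # hypothetical choice for this example
low, high = 1.35, 1.40   # inclusive bounds, as in the loop above

# Keep only entries with the wanted letter and a number inside the range,
# and rebuild the original string so the letter is preserved.
selected = [
    number + unit
    for number, unit in pairs
    if unit == wanted_unit and low <= float(number) <= high
]
print(selected)
```

This only compares numbers that share the same letter; it makes no attempt to convert between units.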

How could I set conditions? If I want the Fe 2 line, but to show me the wavelength values between 1.35m and 1.40m I had not heard about "re" — Joan Lopez, May 22 '22 at 23:12
I have a question, in my txt is" name, ion,wavelength,intensity, abundance". How could i get the value from abundance, taking consideration the wavelength that the program find in the line "find_this = re.findall('(Fe 2.*?[0-9].*?)\s', text) "? — Joan Lopez, May 28 '22 at 02:22

score 2 · Answer 2 · answered May 22 '22 at 18:11

The text file you presented seems to be a flat file or fixed-width file, where the data (columns) are laid out

as positional text (each column starting at a predefined position)
in a fixed-width format (each column having a fixed width)

Pandas has a method for reading fixed-width file

You could use pandas and its IO tools function read_fwf.

import io  # just for demonstration without needing a file
import pandas


text = '''FI R       83.0000m   34.960    1.1262      Fe 2      1.32055m   33.626    0.0522      N  
2      5754.61A   33.290    0.0241
TI R       1800.00m   33.092    0.0153      Fe 2      1.24854m   32.645    0.0054      N  
2      915.612A   31.997    0.0012
NI Ra      2.85000m   36.291   24.1132      Fe 2      7637.54A   33.077    0.0147'''

buffer = io.StringIO(text)  # just a helper to read from text as from file

filepath_or_buffer = buffer  # can also be the file-path directly
df = pandas.read_fwf(filepath_or_buffer, colspecs='infer', widths=None, infer_nrows=100, header=None)
print(df)  # df represented as complete table read

# select the rows where column 3 is 'Fe' and column 4 is 2, then take column 5
wave_lengths = df.loc[(df[3] == 'Fe') & (df[4] == 2), 5]
print("== Wavelengths:")
print(wave_lengths)

buffer.close()

Prints:

    0    1                            2    3    4         5       6       7    8
0  FI    R  83.0000m   34.960    1.1262   Fe  2.0  1.32055m  33.626  0.0522    N
1   2  NaN  5754.61A   33.290    0.0241  NaN  NaN       NaN     NaN     NaN  NaN
2  TI    R  1800.00m   33.092    0.0153   Fe  2.0  1.24854m  32.645  0.0054    N
3   2  NaN  915.612A   31.997    0.0012  NaN  NaN       NaN     NaN     NaN  NaN
4  NI   Ra  2.85000m   36.291   24.1132   Fe  2.0  7637.54A  33.077  0.0147  NaN
== Wavelengths:
0    1.32055m
2    1.24854m
4    7637.54A

Note:

Python's io.StringIO was used as a helper to simulate a file buffer instead of reading from a file.
pandas' loc indexer was used to filter the Fe 2 rows, from which we printed the column labeled 5, which holds the wavelength.
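
Since the wavelengths come back as strings with a trailing letter, they may need to be split before any numeric comparison. A possible sketch using Series.str.extract with named groups; the Series below is a hand-typed stand-in for the filtered column, and the column names value and unit are chosen only for this example:

```python
import pandas

# Stand-in for the filtered wavelength column (values copied from the output above).
wave_lengths = pandas.Series(['1.32055m', '1.24854m', '7637.54A'])

# Named groups become the column names of the resulting DataFrame.
parts = wave_lengths.str.extract(r'^(?P<value>[0-9.]+)(?P<unit>[A-Za-z])$')
parts['value'] = parts['value'].astype(float)  # numeric part as float

print(parts)
```

With the number in its own float column, a range filter such as parts[parts['value'].between(1.2, 1.4)] becomes possible, while the unit column still records which letter each value carried.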
