How do I read data from a multidimensional txt file quickly and efficiently in Python? (real-time monitoring application)

Question

I am not a programmer and I am trying to implement a real-time monitoring application for a sensor system. The file is too big to upload to GitHub, but I will try my best to explain the main problem.

The data
x: time in seconds
y1: comma-separated list of intensities
y2: comma-separated list of wavelengths.

So for each datapoint/time, the sensor saves a complete spectrum and closes the txt file. Then it opens it again and saves the next datapoint/time after a very short integration time of milliseconds.
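Because the file only grows by one row at a time, a monitoring loop does not have to re-read it from the start on every update. The following is a minimal sketch of that idea, not something taken from the sensor software: `read_new_lines` is a hypothetical helper, the file name and rows are made up, and it assumes each datapoint ends with a newline. It remembers a byte offset and returns only the complete rows appended since the last call.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return (complete lines appended since `offset`, new offset)."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        chunk = fh.read()
    # Consume only up to the last newline, so a half-written row
    # is left in place and picked up on the next call.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end


# Demo with a temporary file standing in for the sensor output (made-up rows).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "spectra.txt")
    with open(path, "w") as fh:
        fh.write("0.0\t1,2,3\n")
    first, pos = read_new_lines(path, 0)

    with open(path, "a") as fh:
        fh.write("0.5\t4,5,6\n1.0\t7,8")  # last row is not finished yet
    second, pos = read_new_lines(path, pos)
```

Each call then hands back only the new rows, which can be parsed and appended to the data already plotted.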

My goal
I want to choose a wavelength (or multiple ones) from the spectrum and plot the time/intensity curve in real-time.

My problem
My code is very inefficient. It takes a long time to convert the raw data, which uses two different delimiters (tab and comma), and also to find the wavelengths closest to my chosen values (I choose integers, but the list only contains decimals).
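For the nearest-wavelength lookup, one common vectorized approach is `np.argmin` on the absolute difference. The sketch below only illustrates that lookup: `nearest_index` is a hypothetical helper name and the wavelength values are invented, not taken from the sensor file.

```python
import numpy as np


def nearest_index(wavelengths, target):
    """Return the index of the entry in `wavelengths` closest to `target`."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    return int(np.argmin(np.abs(wavelengths - target)))


# Made-up wavelength axis in nanometers, for illustration only.
axis = np.array([929.6, 930.2, 939.7, 940.4, 949.9, 950.6])

idx_930 = nearest_index(axis, 930)
idx_940 = nearest_index(axis, 940)
idx_950 = nearest_index(axis, 950)
```

This replaces the `min(..., key=...)` plus `np.where` combination with a single array operation per target wavelength.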

My code

```
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

folder = '...'
file = '...txt'

wavelength1 = 940

wavelength2 = 930
wavelength3 = 950

df = pd.read_table(folder+file, skiprows=[1], index_col=0)
time = df['Elapsed Time (Seconds)']
intensity = df['Spectra (Spectra Intensity)'].str.split(',', expand=False)
spectrum = df['Spectra Wavelength (nanometers)'].str.split(',', expand=False)

x = []
y1 = []
y2 = []
y3 = []
for i in range(len(df)):
    A = np.array([float(j) for j in spectrum[i]])
    index1 = np.where(A == min(A, key=lambda k:abs(k-wavelength1)))
    index2 = np.where(A == min(A, key=lambda k:abs(k-wavelength2)))
    index3 = np.where(A == min(A, key=lambda k:abs(k-wavelength3)))
    x.append(float(time[i]))
    y1.append(float(intensity[i][int(index1[0])]))
    y2.append(float(intensity[i][int(index2[0])]))
    y3.append(float(intensity[i][int(index3[0])]))

plt.plot(x, y1, label='940 nm')
plt.plot(x, y2, label='930 nm')
plt.plot(x, y3, label='950 nm')
plt.show()
```

If you have experience with such implementations, I would also appreciate tips for efficient live plotting libraries, etc. :)

Thank you very much.

Update
I made a test file here A2422_Spectra so you can see for yourself.

Are you manually splitting the comma-separated data? Why not let pandas handle that with ```pd.read_csv()``` or use numpy with ```np.loadtxt()```? You can also specify there that you want floats instead of strings, which saves you the list comprehension. Maybe this answers your question: https://stackoverflow.com/questions/65567213/convert-text-file-containing-multiple-delimiters-to-csv — Stefan, Mar 30 '22 at 09:31
Could you provide a few lines from your data file for us to test? — Serge Ballesta, Mar 30 '22 at 09:35
I updated the question with a link to a test file :) @SergeBallesta — Nrmn, Mar 30 '22 at 11:21
In the sample file, the *Spectra Wavelength* column contains the very same values on each and every row. If this is true for your real data file, it would allow for much simpler and faster processing. Could you please confirm (or refute) that? — Serge Ballesta, Mar 31 '22 at 10:10
@SergeBallesta That is correct, it is the same in every row :) How would you process it then? — Nrmn, Apr 05 '22 at 12:51
Unfortunately, I have no access to my computer for the next week (holidays...). The idea is to analyze only the first row to find the indexes of the wavelengths of interest, and then to split the spectra column and keep only those indexes. — Serge Ballesta, Apr 05 '22 at 22:37
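The idea from the last comment could look roughly like the sketch below. It is an illustration under stated assumptions, not a tested solution for the real file: the column names are taken from the question, but the in-memory sample, its numbers, and its layout (a single header line, tab-separated columns) are invented, so the `read_csv` arguments will likely need adjusting, for example to the `skiprows=[1]` used in the question.

```python
import io

import numpy as np
import pandas as pd

# Tiny made-up stand-in for the sensor file: tab-separated columns, with the
# two spectra columns holding comma-separated values. All numbers are invented.
sample = (
    "Elapsed Time (Seconds)\tSpectra (Spectra Intensity)\tSpectra Wavelength (nanometers)\n"
    "0.0\t10.0,20.0,30.0,40.0\t929.8,939.9,950.2,960.1\n"
    "0.5\t11.0,21.0,31.0,41.0\t929.8,939.9,950.2,960.1\n"
    "1.0\t12.0,22.0,32.0,42.0\t929.8,939.9,950.2,960.1\n"
)
df = pd.read_csv(io.StringIO(sample), sep="\t")

targets = [930, 940, 950]  # wavelengths of interest in nm

# Parse the wavelength axis once, from the first row only
# (valid only because every row carries the same wavelengths).
axis = np.array(df["Spectra Wavelength (nanometers)"].iloc[0].split(","), dtype=float)
indices = [int(np.argmin(np.abs(axis - t))) for t in targets]

# Split every intensity string into columns, then keep only the wanted ones.
intensity = df["Spectra (Spectra Intensity)"].str.split(",", expand=True)
selected = intensity[indices].astype(float).to_numpy()

t = df["Elapsed Time (Seconds)"].to_numpy(dtype=float)
```

Here `selected[:, k]` is the intensity-over-time curve for `targets[k]`, so plotting reduces to `plt.plot(t, selected[:, k])` for each chosen wavelength, and the wavelength column is never parsed for any row after the first.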
