There are some similar questions to this, but nothing exact that I can find.
I have a very odd text-file with lines like the following:
field1=1; field2=2; field3=3;
field1=4; field2=5; field3=6;
Matlab's textscan() function deals with this very neatly, as you can do this:
array = textscan(fid, 'field1=%d; field2=%d; field3=%d;');
and you will get back a cell-array where each column contains the respective field, and the text is simply ignored.
I'd like to rewrite the code that deals with this file in Python, but NumPy's loadtxt() and genfromtxt() don't seem to have this ability to ignore text interspersed with the desired numbers.
What are some Python ways to strip out the text and only get back the fields? I'm happy to use pandas or another library if required. Thanks!
EDIT: This question was suggested as an answer, but it only gives equivalents to the basic usage of textscan and does not deal with unwanted text in the input. The answer below with fromregex is what I needed.
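
For anyone landing here, a minimal sketch of the fromregex approach might look like the following. It assumes the fields are non-negative integers and that every line follows the layout shown above. The in-memory sample, the pattern, and the field names in the dtype are illustrative choices, not taken from the accepted answer.

```python
import io

import numpy as np

# Stand-in for the real file; np.fromregex also accepts a filename.
sample = io.StringIO(
    "field1=1; field2=2; field3=3;\n"
    "field1=4; field2=5; field3=6;\n"
)

# Each capture group becomes one field; the literal text around it is skipped.
# Assumption: values are non-negative integers (use -?\d+ to allow a sign).
pattern = r"field1=(\d+); field2=(\d+); field3=(\d+);"

# A structured dtype gives one named column per capture group.
dtype = [("field1", np.int64), ("field2", np.int64), ("field3", np.int64)]

data = np.fromregex(sample, pattern, dtype)

print(data["field1"])  # first column across all lines
print(data["field3"])  # third column across all lines
```

Each column is then available by name, e.g. data["field2"], which is roughly the counterpart of indexing into the cell array that textscan returns.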