There are a ton of posts about various ways to read data into MATLAB, but none seem to handle this particular problem. How do I only read every other line without a loop?

I have data formatted like:

1 2 3 4 5 6 7 8 9 string
1 2 -1 5 3 -1 ...
1 2 3 4 5 6 7 8 9 string
1 2 -1 5 3 -1 4 9 -1 ...
...

In other words, the data alternates between two line formats. I can't figure out how to grab only the numeric parts of the odd lines.

I know that I can use fscanf as fscanf(fid, '%f %f %f %f %f %f %f %f %f %*s') to actually read the relevant lines. However, this falls apart on the even lines which don't follow the same format.

I also tried fscanf(fid, '%f %f %f %f %f %f %f %f %f %*s\n%*[\n]'), thinking this might match two lines at a time (because of the included newline character) while skipping the data on the even lines through the asterisk and character-set combination. However, this didn't work. Note that the even lines vary in length, so I can't pattern-match them with a fixed format.

How can I do this?

marcman
  • Is there any particular reason you don't want to use a loop? It's really easy to do that way: http://stackoverflow.com/questions/5531082/matlab-how-to-read-every-nth-line-of-a-text-file – gariepy Jun 23 '16 at 03:55
  • A second solution involving pre-processing the file, which is faster if you have a large file and you're using Linux, is here: http://stackoverflow.com/questions/9894986/how-can-i-delete-every-xth-line-in-a-text-file – gariepy Jun 23 '16 at 03:56
  • @gariepy: I don't want to use a loop because I'm generally dealing with massive files. Also, I don't want to have to parse each line individually if I can avoid it. – marcman Jun 23 '16 at 03:59
  • Also, I'm on Windows and I don't want to edit the files. Just read the data – marcman Jun 23 '16 at 04:00
  • So, a suggestion...don't impose an arbitrary constraint like "no loops" if you really just want speed for large files. Loops are not evil. Sometimes the best answer is a loop. MATLAB doesn't really have a builtin utility for this sort of thing. – gariepy Jun 23 '16 at 04:24
  • Also, using a built-in function doesn't mean your script is free of loops. A built-in function is just a black box. – obchardon Jun 23 '16 at 07:38
  • How about reading the entire file and then processing just every other line? I don't think it would be much slower than trying to skip every other line. How many lines do you have in a file? – Finn Jun 23 '16 at 07:58
  • I agree with the above comments. Additionally, if you are dealing with huge files, you might run into memory problems when you read them whole. If speed is an issue, you can read the file in chunks in a loop and preallocate the memory for each chunk's array. – Lati Jun 23 '16 at 08:44

1 Answer

A working solution (although I didn't run any timing tests):

I created a text file, FileRead.txt, with the 4 rows you indicated.

% Open file
fid=fopen('FileRead.txt');

% Read every line of the file, as text, into a cell array
A=textscan(fid,'%s','Delimiter','\n');

% Close file
fclose(fid);

% Extract the even lines
Tmp=A{1,1};
out1=Tmp(2:2:end);

% Use cellfun to apply str2num to every cell in out1
out=cellfun(@str2num,out1,'UniformOutput',false);

Output: [screenshot not preserved]
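
The code above keeps the even lines (2:2:end), whereas the question asks for the numeric part of the odd lines. The following is a minimal sketch of that variant, not a tested or timed solution. It assumes every odd line holds exactly nine numbers followed by a single token with no spaces, and that the file contains no blank lines. The temporary sample file and the names fname, oddLines, fmt and data are hypothetical and used only for illustration.

```matlab
% Build a small sample file (hypothetical contents) in the layout from the question.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3 4 5 6 7 8 9 abc\n');
fprintf(fid, '1 2 -1 5 3 -1\n');
fprintf(fid, '11 12 13 14 15 16 17 18 19 def\n');
fprintf(fid, '1 2 -1 5 3 -1 4 9 -1\n');
fclose(fid);

% Read every line as text, exactly as in the answer above.
fid = fopen(fname);
C = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);
delete(fname);

% Keep the odd lines (1, 3, 5, ...).
oddLines = C{1}(1:2:end);

% Join the odd lines into one string and parse them with a single sscanf call.
% The format reads nine numbers and then skips one whitespace-delimited token;
% sscanf reapplies the format until the string is exhausted.
fmt = [repmat('%f', 1, 9) '%*s'];
data = sscanf(strjoin(oddLines.', ' '), fmt, [9 Inf]).';

disp(data)
```

Here data has one row per odd line and nine columns. The single sscanf call avoids an explicit per-line loop in user code, though the whole file is still held in memory as text first.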

BillBokeey
  • This is a great solution. I had been trying to recall the exact permutation of MATLAB file reader and format string that allow this. For files up to 100 million lines or so, this will work great. – gariepy Jun 23 '16 at 22:05