I am trying to read in a file which contains thousands of lines of the format:

AAAAAAAA    2013.99.2314.029    0    OFF    N

This is a tab-delimited file. The last column is a don't-care. The two columns before it vary, so I read them as strings. My main problem is the second column. It is a number that is divided into several parts:

2013.99.2314.029

is year 2013, day 99, second 2314.029.

I want to use textscan to read in the whole file at once, but somehow split that complicated date string as I read it in.

Currently I have the scan string:

SCAN_STR = '%s\t%f.%f\t%s\t%s\t%*s'

This reads the date string into two floats. What I'd really like is to read it into two ints and a float. But using

SCAN_STR = '%s\t%d.%d.%f\t%s\t%s\t%*s'

truncates it to 2013 and 2314 and messes up the rest of the line. I tried escaping the '.' as '\.', but that raises an error.

Any suggestions? I'd like to do this as it's scanned in due to the large size of the file. Memory runs low when you start trying to change the types of large data sets.

EDIT:

Really I need a scan string for 2013.99.2314.029 to return two integers and a float.

'%d.%d.%f'

doesn't work, nor does setting the delimiter to '.'. I tried %u as well. It rounds the decimal part as it reads the values in.

Le sigh.

polkid
  • First thing that comes to mind: instead of `%d`, you can try `%[0-9]`. However, this would read the integers as strings, so you'll have to convert them to numbers later (e.g. using `str2num`) if you need their numerical value. – Eitan T Jul 25 '13 at 17:42
  • Use textscan once for the whole line, then textscan again on the field you want to split up further? – Ansari Jul 25 '13 at 17:56
  • I hate textscan. Why don't they put some decent text parsing in Matlab? – Bitwise Jul 26 '13 at 00:18
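
The two-step idea from the comments above (read the date field as a string, then split it afterwards) could look roughly like the sketch below. It is only an illustration, not the asker's or any commenter's code: the sample rows, the variable names, and the choice of `regexp` plus `str2double` for the second step are all assumptions, and it holds the date strings in memory once before converting them, which may matter for a very large file.

```matlab
% Hypothetical sample rows standing in for the real tab-delimited file.
raw = sprintf(['AAAAAAAA\t2013.99.2314.029\t0\tOFF\tN\n' ...
               'BBBBBBBB\t2013.100.86399.5\t1\tON\tN\n']);

% Pass 1: read every kept column as a string and skip the last column.
C = textscan(raw, '%s %s %s %s %*s', 'Delimiter', '\t');
dates = C{2};                      % N-by-1 cell array of date strings

% Pass 2: split on the first two dots only, so the seconds keep their
% own decimal point (for example '2314.029').
tok = regexp(dates, '^(\d+)\.(\d+)\.(.+)$', 'tokens', 'once');
tok = vertcat(tok{:});             % N-by-3 cell array of strings
vals = str2double(tok);            % N-by-3 double array

year = int32(vals(:, 1));          % e.g. 2013
day  = int32(vals(:, 2));          % e.g. 99
sec  = vals(:, 3);                 % e.g. 2314.029
```

For a real file, `raw` would be replaced by a file identifier from `fopen`. A row whose date field does not match the pattern yields an empty token cell and is silently dropped by `vertcat`, so it may be worth checking that `size(vals, 1)` equals `numel(dates)`.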

1 Answer

I just tried this with MATLAB 2012b and it seems to work on my end.

SCAN_STR = '%s\t%4d.%d.%f\t%d\t%s%*[^\n]'
radarhead
  • It doesn't work on my end :( It reads 99 as 99.2314 and rounds it, leaving the float as 29. I tried adding %2d instead of %d for the second int, and it works. But it doesn't work if I use %3d. So it breaks after 100 days of the year. Or only works after 100 days of the year... – polkid Jul 26 '13 at 17:29