I want to export a numerical array as a .csv file. A simplified version of my code looks like this:

fid = fopen('output.csv', 'wt');   % open the file for writing in text mode
toprint = [1.0, 1.1, 1.2, 1.3];
fprintf(fid, '%f, %f, %f, %f\n', toprint);
fclose(fid);

In this case there is no problem. I use %f in the format string to preserve precision. However, the array often contains zeros, like this:

toprint = [1.0, 0, 0, 1.1];

In that case, I want to change the format string to:

'%f, %d, %d, %f\n' % where "%f" was replaced by "%d" at the positions of the zeros

This reduces the output file size, since zeros do not need the extra decimal places. My original solution was to loop over the array, check each value, and append '%d' to the format string whenever a zero was found. That seems very inefficient.

What I am looking for is an efficient way to adjust the format string depending on the input data. Is there any way to achieve this?

Dev-iL
DerrickTSE
  • Any reason why you're doing the file-writing using low-level functions rather than `csvwrite`, `dlmwrite`, etc.? I would suggest exporting the way you do, then postprocessing with e.g. find-and-replace to turn all `0.00` into `0`. – Dev-iL Dec 19 '19 at 07:34
  • BTW, if you have some number that is very close to (but not exactly) `0`, would you prefer to display it as `0.000000` or as `0`? In case of the latter, what is the accuracy (_read: tolerance_) you're looking for? – Dev-iL Dec 19 '19 at 07:43
  • Lastly, if file size is a serious concern, perhaps you should avoid a text-based export altogether and export in some binary format like `xlsx`, `HDF`, or just convert your array to `single` precision and export this as a `float32` array (i.e. **binary** write mode instead of text). If you're taking this route, you might as well export it as `float16` using [this library](https://www.mathworks.com/matlabcentral/fileexchange/23173). – Dev-iL Dec 19 '19 at 07:58
  • @Dev-iL Thanks for replying. The reason I don't use csvwrite or other library functions is that this will be implemented in C later, so I am trying to make it as library-independent as possible. Sorry for not clarifying that at first. As for the text output issue you mentioned last, at this stage I plan to work with and store the files in a text-based format. But I will definitely consider your advice! Really appreciate your support! – DerrickTSE Dec 20 '19 at 03:39

1 Answer

Two approaches:

  1. Use "%g", which drops trailing zeros from floating-point output where possible. This also shortens other whole numbers such as 1.0 or 2.0, which may or may not be what you want.
  2. Dynamically construct the format string based on the values.

Approach 1:

>> fprintf('%g %g %g %g\n', [1.0, 1.1, 1.2, 1.3])
1 1.1 1.2 1.3
>> fprintf('%g %g %g %g\n', [1.0, 1.1, 0, 1.3])
1 1.1 0 1.3
>> fprintf('%g %g %g %g\n', [1.0, 1, 0, 1.3])
1 1 0 1.3
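
One caveat with "%g" is worth a quick sketch: by default it keeps 6 significant digits, whereas "%f" keeps 6 digits after the decimal point, so larger values can lose digits. Giving an explicit precision such as "%.9g" avoids that while still printing zeros and whole numbers compactly. The value 9 and the variable names below are arbitrary choices for illustration, not something taken from the question.

```matlab
x = 123.456789;

s_f  = sprintf('%f', x);      % '123.456789' (6 digits after the point)
s_g  = sprintf('%g', x);      % '123.457'    (6 significant digits)
s_g9 = sprintf('%.9g', x);    % '123.456789' (9 significant digits)

% Zeros and whole numbers stay short even with the higher precision.
row = sprintf('%.9g, %.9g, %.9g\n', [x, 0, 1]);   % '123.456789, 0, 1' plus newline
```

Whether this is acceptable depends on how many significant digits the data actually needs.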

Approach 2:

>> a = [1.1 1.2 0 1.3]

a =

    1.1000    1.2000         0    1.3000

>> tokens = {'%f', '%d'}

tokens = 

    '%f'    '%d'

>> strformat = strcat(strjoin(tokens((a==0)+1), ', '), '\n')

strformat =

%f, %f, %d, %f\n

>> fprintf(strformat, a)
1.100000, 1.200000, 0, 1.300000
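
If some entries are only approximately zero, the same idea can be adapted, as a minimal sketch: compare against a tolerance instead of using `a==0`, and set those entries to exactly 0 before printing, because MATLAB falls back to exponential notation when "%d" receives a non-integer value. The tolerance `tol` and the variable names are assumptions for illustration, not part of the original answer.

```matlab
a   = [1.1 0 1e-12 1.3];
tol = 1e-9;                       % hypothetical tolerance, pick one that suits the data

tokens = {'%f', '%d'};
isZero = abs(a) < tol;            % true where the value counts as zero

% Index 1 selects '%f', index 2 selects '%d'.
strformat = [strjoin(tokens(isZero + 1), ', '), '\n'];

vals = a;
vals(isZero) = 0;                 % make near-zeros exact so '%d' prints a plain 0

csvLine = sprintf(strformat, vals);   % '1.100000, 0, 0, 1.300000' plus newline
```

Passing a file identifier to `fprintf` with the same `strformat` and `vals` writes the line to the file instead of building a string.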
9mat
  • What if one of the zeros is actually `0 + eps(0)`? You should do the comparison with a tolerance. See also: https://stackoverflow.com/questions/686439/ – Dev-iL Dec 19 '19 at 07:39
  • I interpret the question this way: preserve precision as much as possible, unless the number is really 0 (or a whole number), in which case format it as an integer. – 9mat Dec 19 '19 at 07:42
  • It is possible that your answer is just what the OP needs. However, the OP mentioned that the example is simplified, so it's hard to tell without additional information if the "edge case" I mentioned is a real problem (and hence, deserving treatment) or not. – Dev-iL Dec 19 '19 at 07:47
  • And that case is manageable by adjusting the condition `a==0` to `abs(a) – 9mat Dec 19 '19 at 07:49