3

I have an .csv file made in Excel and that Excel uses ; as delimiter. I don't like that, but I cannot change it.

So I have to accept it. I want to read that .csv file in GNU Octave, but I can't really do it because the float numbers inside the .csv file is separated with , e.g 15,25 and not separated with . e.g 15.25 because in Excel, it assumes that 15.25 is a text string and 15,25 is a number. Yes, I know, it's weird.

If I change all , to . in my .csv file, then I can read the .csv file with dlmread or csvread. But in this case, I don't want to change , to . because that's a standard Excel configuration.

So my question is:

If you have a .csv file were the float numbers are displayed with ,. How can you read that .csv file in GNU Octave then?

euraad
  • 2,467
  • 5
  • 30
  • 51

2 Answers2

3

For a much more flexible csv reader than the default matlab-compatible csvread and dlmread, use the csv2cell function from the io pkg. If you haven't used packages before, this is how you would do that:

pkg install io -forge   % install io package if you haven't already
pkg load io             % load it

Then use csv2cell to read the values as strings into a cell (declaring ; as the desired delimiter); then use strrep to replace , to . in your strings; and then finally use str2double to convert those strings to numbers.

You can do this with the following one-liner:

str2double( strrep( csv2cell( 'testo.csv', ';' ), ',', '.' ) )

You could also wrap this into an anonymous function:

read_my_csv = @( myfile ) str2double( strrep( csv2cell( myfile, ';' ), ',', '.' ) );
Data = read_my_csv( 'data.csv' );

PS. Tux the penguin added for extra perfection, as requested.

         _nnnn_                      
        dGGGGMMb     ,"""""""""""""".
       @p~qp~~qMb    | Linux Rules! |
       M|@||@) M|   _;..............'
       @,----.JM| -'
      JS^\__/  qKL
     dZP        qKRb
    dZP          qKKb
   fZP            SMMb
   HZM            MMMM
   FqM            MMMM
 __| ".        |\dS"qML
 |    `.       | `' \Zq
_)      \.___.,|     .'
\____   )MMMMMM|   .'
     `-'       `--' 

(source: https://www.asciiart.eu/computers/linux)

Tasos Papastylianou
  • 21,371
  • 2
  • 28
  • 57
2

Sample file

i1;i2;i3
1,234;34,134;5,987
3,14;6,96;85,05

Script

nrows = 2;
ncolumns = 3;

data = textscan (fopen("data.csv", 'r'), "%s", "Delimiter", ";", "HeaderLines", 1);
data = cell2mat(data);
data = str2double(strrep(data,',','.'));
data = reshape(data, ncolumns, nrows); % elements are accessed in column-major order
data = transpose(data)

Logic

  1. We read the contents of the file as strings separated by ;.
  2. We replace the , present in the strings with ..
  3. Convert the strings to doubles.
  4. The data is reshaped to a matrix.

References

Adriaan
  • 17,741
  • 7
  • 42
  • 75
Kitswas
  • 1,134
  • 1
  • 13
  • 30
  • Is there any easier way to do this? – euraad Jun 24 '22 at 13:21
  • I turned the file into a function and created a [gist](https://gist.github.com/kitswas/34befa9d22c08a2ee79e2614eac92e3c). – Kitswas Jun 24 '22 at 14:01
  • Just download the script into your working directory and call the function. `read_improper_csv("data.csv", 2, 3, ";", 1)` – Kitswas Jun 24 '22 at 14:06
  • Click on [`Raw`](https://gist.github.com/kitswas/34befa9d22c08a2ee79e2614eac92e3c#file-read_improper_csv-m) and save the file. – Kitswas Jun 24 '22 at 14:12