
I would like to save a large matrix as a .csv file. Following numpy's documentation, I tried this:

training_matrix = dict_vect.fit_transform(training_data_2_dict)
csv_matrix = np.savetxt("foo.csv", training_matrix, delimiter=",")

This is the shape of the matrix: (878049, 413439) and this is the exception:

Traceback (most recent call last):
  File "/Users/user/PycharmProjects/kaggle/modeling_the_problem.py", line 55, in <module>
    training_matrix)
  File "/usr/local/lib/python2.7/site-packages/numpy/lib/npyio.py", line 1044, in savetxt
    ncol = X.shape[1]
IndexError: tuple index out of range

Any idea how to save the matrix to a csv file?

john doe
  • What exactly is `training_matrix`? A plain `numpy` array? Or something else? – hpaulj Oct 17 '15 at 15:32
  • Thanks for the feedback @hpaulj, it's a sparse scikit-learn matrix. I assumed that was a numpy array. Anyhow, I tried to convert it to a numpy array and still can't solve this issue. – john doe Oct 17 '15 at 18:33
  • A sparse matrix is not a numpy array. Use `todense()` or `toarray()` (or the `.A` shorthand) to convert it to a regular dense matrix or array. Just beware that the saved text will have a lot of 0s, and 413439 'columns' (very long lines). – hpaulj Oct 17 '15 at 19:28
  • Thanks, but I also tried `toarray()` and it crashes my computer... – john doe Oct 17 '15 at 22:23
  • Probably the dense version is too big for your memory. What do you expect, or want to see, in the `csv` file? Slews of zeros? Have you tried converting a few rows of the matrix? Tried writing those? – hpaulj Oct 18 '15 at 00:44
  • `np.savetxt` just writes a header to the file, and then writes the array one 'row' at a time. Look at its code. Notice how it replicates the `fmt`, and then does a `fmt%tuple(row)`. So if you know how to convert your array to strings, row by row, you can write the file yourself. – hpaulj Oct 18 '15 at 01:03
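
The row-by-row idea from the comments above can be sketched as follows. This is only a minimal illustration, assuming `training_matrix` is a scipy sparse matrix; the function name `write_sparse_as_csv` and its `chunk_rows` and `fmt` parameters are hypothetical and not part of numpy or scipy. It converts a block of rows to a dense array at a time, so the full dense matrix never has to fit in memory. Note that with a shape of (878049, 413439) the dense output has roughly 3.6 × 10^11 entries, so the resulting file would still be hundreds of gigabytes.

```python
import numpy as np
from scipy import sparse


def write_sparse_as_csv(matrix, path, chunk_rows=1000, fmt="%g"):
    """Write a scipy sparse matrix as dense CSV text, one block of rows at a time.

    Hypothetical helper for illustration; not a numpy or scipy function.
    """
    # CSR format supports efficient row slicing.
    csr = sparse.csr_matrix(matrix)
    with open(path, "w") as handle:
        for start in range(0, csr.shape[0], chunk_rows):
            # Only this block of rows is ever held in dense form.
            block = csr[start:start + chunk_rows].toarray()
            # np.savetxt accepts an open file handle and appends the rows.
            np.savetxt(handle, block, delimiter=",", fmt=fmt)
```

Under these assumptions, a call such as `write_sparse_as_csv(training_matrix, "foo.csv")` would replace the failing `np.savetxt` call. Lowering `chunk_rows` reduces peak memory at the cost of more calls.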

1 Answer


If the matrix is a scipy sparse matrix, numpy will hit this error when it tries to save it. If so, the post here should explain.
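
As a small illustration of why this happens (a sketch using a toy 3×3 matrix, not the asker's data): numpy does not know how to unpack a scipy sparse matrix, so `np.asarray` wraps the whole object in a zero-dimensional object array. Its shape is an empty tuple, which is consistent with the `X.shape[1]` line in the traceback failing with `IndexError: tuple index out of range`. Newer numpy versions may report the same problem with a different exception.

```python
import numpy as np
from scipy import sparse

# Toy sparse matrix standing in for the real training matrix.
sparse_matrix = sparse.csr_matrix(np.eye(3))

# numpy cannot interpret the sparse matrix as a 2-D array, so it
# wraps the object itself in a 0-dimensional array of dtype object.
wrapped = np.asarray(sparse_matrix)
print(wrapped.shape)  # ()
print(wrapped.dtype)  # object

# Converting explicitly gives a real 2-D array that savetxt can handle.
dense = sparse_matrix.toarray()
print(dense.shape)  # (3, 3)
```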

KidMcC
  • Thanks for the help! Do you think it is possible to convert this scipy matrix to a numpy array and then save it to a file? – john doe Oct 17 '15 at 18:40