I have a problem uploading Excel data in Python.

Excel:

image

Code used to upload:

import io
import pandas as pd
from google.colab import files

# Opens the Colab file picker; returns a dict mapping file name to bytes
uploaded = files.upload()
df2 = pd.read_csv(io.BytesIO(uploaded['nodes.csv']), index_col=0)
print(df2)

Result:

image

Can you kindly help me?

Timus
user82523
  • what should the display be? you can use [`df.to_string()`](https://stackoverflow.com/a/39923958/4541045) to produce a string output to paste here – ti7 Apr 01 '22 at 16:51
  • I want to convert it to a numpy array. – user82523 Apr 01 '22 at 16:55
  • What is your problem? You say you have a problem, but don't describe what precisely this problem is. Anyway, I'm pretty sure it's an extension problem. Excel files have `.xls` or `.xlsx` extension, but not `.csv`, making the `read_csv()` method useless in your precise case – imperosol Apr 01 '22 at 16:55
  • The received result and excel table are not the same. – user82523 Apr 01 '22 at 17:07

2 Answers

You said you wanted to convert the imported data to a NumPy array. You can do that without pandas:

import numpy as np
arr = np.genfromtxt('nodes.csv', delimiter=',')
print(arr)

Check the documentation: https://numpy.org/doc/stable/reference/generated/numpy.genfromtxt.html
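
As a rough check of the call above without needing the file on disk, here is a small sketch that feeds the same kind of data to np.genfromtxt from memory. The five rows in csv_text are an assumption about what nodes.csv contains (no header row, comma-separated):

```python
import io
import numpy as np

# Assumed contents of nodes.csv: five comma-separated rows, no header row.
csv_text = "0,0\n2,0\n4,0\n1,1.732051\n3,1.732051\n"

# genfromtxt accepts a file name or a file-like object; every row is read as data.
arr = np.genfromtxt(io.StringIO(csv_text), delimiter=",")
print(arr.shape)
print(arr)
```

Because genfromtxt does not treat the first row as a header unless asked to, the first row of the file stays in the array as data.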

Omar
  • Please do not answer with a comment/question. Understandably, your rep is too low to comment, but that still does not mean answers should be used to make comments as an alternative. It would be preferable if you deleted this. – sushanth Apr 01 '22 at 16:51
  • No error, but in the second column, I have 0.1. In the excel file instead of this value, we have 0. – user82523 Apr 01 '22 at 16:52

If you have a CSV file file.csv

0,0
2,0
4,0
1,1.732051
3,1.732051

then

df = pd.read_csv("file.csv", index_col=0)

does produce

df =
        0.1
0          
2  0.000000
4  0.000000
1  1.732051
3  1.732051

Why is that? By default, read_csv uses the first row as column labels. That row contains two 0s, and pandas renames duplicate labels by appending a numeric suffix, so the second 0 becomes 0.1. The 0.1 isn't a number, it's a string (print(df.columns) will show Index(['0.1'], dtype='object')). If your file looked like

0,1
2,0
4,0
1,1.732051
3,1.732051

then this wouldn't happen, the output would look like

          1
0          
2  0.000000
4  0.000000
1  1.732051
3  1.732051

If your goal is a NumPy array, then

arr = pd.read_csv("file.csv", header=None).values

leads to

array([[0.      , 0.      ],
       [2.      , 0.      ],
       [4.      , 0.      ],
       [1.      , 1.732051],
       [3.      , 1.732051]])
Timus