I have an Excel file that contains my data: [image]

After reading it into the variable Data, I want to split it into two parts and assign each to its own variable. In other words, I want to extract two matrices with different shapes from my input. For example, if my data is what the picture above shows, I want these two out of it: [image] and [image]

I tried the indexing below, but it didn't work. This is the code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
FilePath='E:\\# Civil Engineering graduate\\Projects\\Python\\RoutePlanning'
FileName='\\Data.xlsx'
Data=pd.read_excel(FilePath+FileName)
print(Data)
Points=np.array(Data[1:,0:3])

And this is the error it throws:

Exception has occurred: TypeError
'(slice(1, None, None), slice(0, 3, None))' is an invalid key
  File "E:\# Civil Engineering graduate\Projects\Python\RoutePlanning\RoutePlanning.py", line 9, in <module>
    Points=np.array(Data[1:,0:3])

I've seen a few solutions that use loops and function definitions for this, which I'd rather avoid unless I have to. I have clearly made an indexing mistake here, since it isn't working. Can this be fixed, or is there some other indexing-style solution? If not, what would be the best-performing alternative?

Mohammad Amin

3 Answers

This is because Data is a pandas DataFrame, not a numpy.ndarray, so Data[...] does not accept a NumPy-style [rows, columns] key.

If you use Data.to_numpy()[1:, 0:3], it will work.
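
To illustrate the difference, here is a minimal sketch. The small DataFrame and its column names are made up and only stand in for the Excel data; the exact exception type raised by the failing line may differ between pandas versions, so the sketch catches a generic Exception.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet; the real column names are unknown.
data = pd.DataFrame(np.arange(12.0).reshape(3, 4), columns=["a", "b", "c", "d"])

# DataFrame[...] treats the key as a column label, so a tuple of slices fails.
try:
    data[1:, 0:3]
    raised = False
except Exception:  # exact exception type depends on the pandas version
    raised = True

# After converting to an ndarray, NumPy-style [rows, columns] slicing works.
arr = data.to_numpy()
points = arr[1:, 0:3]  # rows from index 1 on, first three columns
print(raised, points.shape)
```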

kuropan

I suppose Data is a DataFrame and you're trying to slice by position rather than by label. Try iloc:

Points=np.array(Data.iloc[1:,0:3])

You can get the two NumPy arrays you want like this:

points_left = Data.iloc[1:, :3].to_numpy()
points_right = Data.iloc[1:, 3:].to_numpy()
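
As a rough, self-contained sketch of the same idea: the table below is invented (9 columns, with assumed column names), and the 4/5 column split is only an example. Note that pd.read_excel treats the first sheet row as the header by default, so a row slice of 1: additionally skips the first data row; use : if you want every row.

```python
import pandas as pd

# Hypothetical 2-row, 9-column table; the column names are assumptions.
data = pd.DataFrame(
    [[1, 0, 0, 0, 1, 2, 2, 2, 1],
     [2, 5, 5, 5, 3, 4, 4, 4, 1]],
    columns=list("abcdefghi"),
)

# iloc selects by integer position: [row slice, column slice].
points_all = data.iloc[:, :4].to_numpy()     # every row, first 4 columns
obstacles_all = data.iloc[:, 4:].to_numpy()  # every row, remaining 5 columns

# A row slice of 1: drops the first data row (the header is already excluded).
points_skip_first = data.iloc[1:, :4].to_numpy()

print(points_all.shape, obstacles_all.shape, points_skip_first.shape)
```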
ashkangh

Thank you all. This one worked pretty well. But now I get NaNs!

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
FilePath='E:\\# Civil Engineering graduate\\Projects\\Python\\RoutePlanning'
FileName='\\Data.xlsx'
Data=pd.read_excel(FilePath+FileName)
Data=np.array(Data)
print(Data)
Points=np.array(Data[:,0:4])
print(Points)
Obstacles=np.array(Data[:,4:9])
print(Obstacles)

I've got my desired data out of it:

Data=

[[ 1.  0.  0.  0.  1.  2.  2.  2.  1.]
 [ 2.  5.  5.  5. nan nan nan nan nan]]

Points=

[[1. 0. 0. 0.]
 [2. 5. 5. 5.]]

Obstacles=

[[ 1.  2.  2.  2.  1.]
 [nan nan nan nan nan]]

Now all I need to do is remove the NaNs. Any recommendations?

Mohammad Amin
  • I'd recommend opening a new question for follow-up questions. But a quick hint: this happens because some fields were empty in the table you're loading. You could in principle use something like [described here](https://stackoverflow.com/questions/11620914/removing-nan-values-from-an-array) – kuropan Mar 06 '21 at 15:12
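
One possible way to drop the empty rows, as a minimal sketch: the array below copies the Obstacles output shown above, and the rule "discard any row that contains a NaN" is an assumption about what is wanted here.

```python
import numpy as np

# Same values as the Obstacles array printed in the post above.
obstacles = np.array([[1.0, 2.0, 2.0, 2.0, 1.0],
                      [np.nan, np.nan, np.nan, np.nan, np.nan]])

# Boolean mask: True for rows that contain at least one NaN.
has_nan = np.isnan(obstacles).any(axis=1)

# Keep only the rows without NaN.
clean = obstacles[~has_nan]
print(clean)
```

If the data is still a DataFrame at that point, Data.dropna() should give a similar result before converting to NumPy.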