I have the following array of ids:
import numpy as np
import pandas as pd
array_id = np.array([2,4,7])
I have the following dataset:
df = pd.DataFrame({'Name': ['Station', 'Sensor', 'Station', 'Sensor',
                            'Sensor', 'Sensor', 'Sensor'],
                   'Type': ['analog', 'dig', 'analog', 'analog',
                            'analog', 'analog', 'dig'],
                   'id': [1, 2, 3, 4, 5, 6, 7]})
I would like to select the rows of the dataframe (df) whose id is in the array of ids (array_id). I would like the output to be:
  Name    Type  id
Sensor     dig   2
Sensor  analog   4
Sensor     dig   7
I managed to implement code that does this, but I needed two nested for loops:
d = {'Name', 'Type', 'id'}
df_aux = pd.DataFrame(d)
df_select = pd.DataFrame(d)
for i in range(0, len(df)):
    for j in range(0, len(array_id)):
        if df['id'].iloc[i] == array_id[j]:
            array_aux = [(df['Name'].iloc[i],
                          df['Type'].iloc[i],
                          df['id'].iloc[i])]
            df_aux = pd.DataFrame(array_aux, columns=['Name', 'Type', 'id'])
            df_select = pd.concat([df_select, df_aux])
The output is:
print(df_select)
   0    Name    Type   id
  id     NaN     NaN  NaN
Type     NaN     NaN  NaN
Name     NaN     NaN  NaN
 NaN  Sensor     dig  2.0
 NaN  Sensor  analog  4.0
 NaN  Sensor     dig  7.0
I would like to learn a way to do this without the two nested for loops, and without the NaN rows appearing in df_select. Is there a way to do this?
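
For reference, one possible direction is a boolean mask built with pandas' Series.isin, which seems to avoid both loops. This is only a minimal sketch using the same df and array_id as above. The name mask is introduced here for illustration, and the reset_index(drop=True) call is an optional assumption about wanting a fresh 0..n-1 index. Because the result is taken directly from df, there is no need to seed df_select from d, and that seeding appears to be where the extra column 0 and the NaN rows come from.

```python
import numpy as np
import pandas as pd

array_id = np.array([2, 4, 7])
df = pd.DataFrame({
    'Name': ['Station', 'Sensor', 'Station', 'Sensor',
             'Sensor', 'Sensor', 'Sensor'],
    'Type': ['analog', 'dig', 'analog', 'analog',
             'analog', 'analog', 'dig'],
    'id': [1, 2, 3, 4, 5, 6, 7],
})

# Boolean Series: True for each row whose id appears in array_id
mask = df['id'].isin(array_id)

# Keep only the matching rows; drop=True discards the old row labels
df_select = df[mask].reset_index(drop=True)

print(df_select)
```

Leaving out reset_index keeps the original row labels from df on the selected rows, which may be preferable if they are needed later.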