I've run into an issue trying to drop a nan column from a table.
Here's the example that works as expected:
import pandas as pd
import numpy as np
df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                   columns=['A', 'B', 'C'],
                   index=['Foo', 'Bar'])
mapping1 = pd.DataFrame([['a', 'x'], ['b', 'y']],
                        index=['A', 'B'],
                        columns=['Test', 'Control'])
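# rename the columns using the mapping file; reindex returns NaN for labels
# missing from the mapping (.loc raises a KeyError for them in pandas >= 1.0)
df1.columns = mapping1['Test'].reindex(df1.columns)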
From here we see that the C column in df1 doesn't have an entry in the mapping file, and so that header is replaced with a nan.
# drop the nan column
df1.drop(np.nan, axis=1)
In this situation, passing np.nan to df1.drop finds the final header and drops it.
However, in the situation below, df2.drop does not work:
# set up table
sample1 = np.random.randint(0, 10, size=3)
sample2 = np.random.randint(0, 5, size=3)
df2 = pd.DataFrame([sample1, sample2],
                   index=['sample1', 'sample2'],
                   columns=range(3))
mapping2 = pd.DataFrame(['foo']*2, index=range(2),
                        columns=['test'])
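# assign columns using mapping file; reindex returns NaN for labels
# missing from the mapping (.loc raises a KeyError for them in pandas >= 1.0)
df2.columns = mapping2['test'].reindex(df2.columns)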
# try and drop the nan column
df2.drop(np.nan, axis=1)
And the nan column remains.
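One difference between the two cases is that df2 ends up with duplicate column labels ('foo' twice), which may be why the label-based drop behaves differently there. As a possible workaround, here is a minimal sketch that selects columns with a boolean mask instead of looking up the nan label. The frame below is a hypothetical stand-in for df2 with fixed values rather than random ones, and the names df, keep and trimmed are only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2 after the column assignment:
# two columns share the label 'foo' and the third label is NaN.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=['sample1', 'sample2'],
                  columns=['foo', 'foo', np.nan])

# Boolean mask over the column labels: True where the label is not NaN.
keep = df.columns.notna()

# Boolean selection works by position, so it does not depend on
# looking up the NaN label or on the labels being unique.
trimmed = df.loc[:, keep]
```

Since this returns a new frame rather than modifying df, the result has to be assigned back (for example df2 = df2.loc[:, df2.columns.notna()]) for the change to stick.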