I am starting to dig deeper into Python and am having trouble converting some of my R scripts into Python. I have a function defined in R:
Shft_Rw <- function(x) {
  for (row in 1:nrow(x)) {
    new_row = x[row, c(which(!is.na(x[row, ])), which(is.na(x[row, ])))]
    colnames(new_row) = colnames(x)
    x[row, ] = new_row
  }
  return(x)
}
This essentially takes the leading NAs of each row in a data frame and moves them to the end of the row, i.e.:
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': [np.nan, np.nan, 3], 'b': [3, np.nan, 5], 'c': [3, 4, 5]})
df
Out[156]:
     a    b  c
0  NaN  3.0  3
1  NaN  NaN  4
2  3.0  5.0  5
turns into:
df2 = pd.DataFrame({'a': [3, 4, 3], 'b': [3, np.nan, 5], 'c': [np.nan, np.nan, 5]})
df2
Out[157]:
   a    b    c
0  3  3.0  NaN
1  4  NaN  NaN
2  3  5.0  5.0
So far I have:
def Shft_Rw(x):
    for row in np.arange(0, x.shape[0]):
        new_row = x.iloc[row, [np.where(pd.notnull(x.iloc[row])), np.where(pd.isnull(x.iloc[row]))]]
But this throws errors. Using the sample df above, I can get a row by position using iloc, and the column positions where it is null/not null using np.where(), but I can't put the two together (I tried numerous variations with more brackets, etc.).
df.iloc[1]
Out[170]:
a    NaN
b    NaN
c    4.0
In[167] : np.where(pd.isnull(df.iloc[1]))
Out[167]: (array([0, 1], dtype=int64),)
df.iloc[1,np.where(pd.notnull(df.iloc[1]))]
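One possible way to put the two pieces together: np.where returns a one-element tuple (as the Out[167] line shows), so taking element [0] gives plain position arrays that can be concatenated into a single column order. The following is only a minimal sketch of a row-by-row port of the R function. It assumes every column is numeric, and the name shift_row_nans is made up for illustration.

```python
import numpy as np
import pandas as pd


def shift_row_nans(x):
    """Move the NaNs in each row to the end, keeping the order of the other values."""
    # Assumption: all columns are numeric. astype(float) also returns a copy,
    # so the caller's frame is left untouched.
    out = x.astype(float)
    for row in range(out.shape[0]):
        values = out.iloc[row].to_numpy()
        # np.where returns a 1-tuple of arrays; [0] extracts the position array
        not_null_pos = np.where(pd.notnull(values))[0]
        null_pos = np.where(pd.isnull(values))[0]
        # Non-null positions first, then the null ones
        order = np.concatenate([not_null_pos, null_pos])
        out.iloc[row] = values[order]
    return out


df = pd.DataFrame({'a': [np.nan, np.nan, 3], 'b': [3, np.nan, 5], 'c': [3, 4, 5]})
result = shift_row_nans(df)
```

On the sample df this should give the same layout as df2, except that every column comes back as float.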
Is anyone able to help replicate the function and/or show a more efficient way to solve the problem?
Thanks!
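For the "more efficient" part, one option that avoids the Python-level loop might be a stable argsort on the NaN mask of the underlying array. This is a sketch under the assumption that the frame holds only numeric data. The name shift_nans_right is hypothetical, and the result comes back with all-float columns.

```python
import numpy as np
import pandas as pd


def shift_nans_right(frame):
    """Vectorized variant: push NaNs to the end of every row at once."""
    # Assumption: numeric data only, so the frame converts to a float array
    values = frame.to_numpy(dtype=float)
    # Sorting the boolean mask with a stable sort puts non-NaN positions (False)
    # first and keeps their original left-to-right order within each row
    order = np.argsort(np.isnan(values), axis=1, kind='stable')
    shifted = np.take_along_axis(values, order, axis=1)
    return pd.DataFrame(shifted, index=frame.index, columns=frame.columns)


df = pd.DataFrame({'a': [np.nan, np.nan, 3], 'b': [3, np.nan, 5], 'c': [3, 4, 5]})
shifted_df = shift_nans_right(df)
```

The stable sort matters here: a default argsort does not guarantee that the non-NaN values keep their relative order.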