Read only numerical values from one dataframe and create another dataframe from those values

Question

I have imported an Excel file into a dataframe, and it looks like this:

rule_id  reqid1 reqid2  reqid3
50014     1.0    0.0     1.0
50238     0.0    1.0     0.0
50239     0.0    1.0     0.0
50356     0.0    0.0     1.0
50412     0.0    0.0     1.0
51181     0.0    1.0     0.0
53139     0.0    0.0     1.0

Then I wrote this code to compare corresponding reqids with each other and then drop the reqid columns:

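    import numpy as np
    import pandas as pd

    # df1 is the dataframe read from Excel (rule_id is assumed to be its index)
    # True where a value equals the value in the next column to its right
    m = df1.eq(df1.shift(-1, axis=1))

    # NaN where the value is 0, 1 where it matches its right neighbour, 100 otherwise
    arr1 = np.select([df1 == 0, m], [np.nan, 1], 100)

    dft4 = pd.DataFrame(arr1, index=df1.index).rename(columns=lambda x: 'comp{}'.format(x + 1))

    dft5 = df1.join(dft4)
    cols = [c for c in dft5.columns if 'reqid' in c]
    df8 = dft5.drop(cols, axis=1)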

The result looked like this:

Then I transposed it and the data looks like this:

Now I want to write this data into a separate dataframe where only numerical values are present and empty or null values are removed. The dataframe should look like this:

If anybody could help me, I would greatly appreciate it.

I have already shared it. It is the second last snap in my question. — vesuvius, Mar 13 '19 at 12:57

jezrael · Accepted Answer · 2019-03-13T13:10:50.533

2

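Use the justify function (a user-defined helper that must be defined separately; it is not part of pandas or NumPy) to shift the non-NaN values to the top of each column, then remove the rows that contain only NaNs with DataFrame.dropna and the parameter how='all':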

    df8 = dft5.drop(cols, axis=1).T

    df8 = pd.DataFrame(justify(df8.values,
                               invalid_val=np.nan,
                               axis=0, side='up'),
                       columns=df8.columns).dropna(how='all')
    print(df8)

    rule_id  50014  50238  50239  50356  50412  51181  53139
    0        100.0  100.0  100.0  100.0  100.0  100.0  100.0
    1        100.0    NaN    NaN    NaN    NaN    NaN    NaN
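
The definition of justify is not shown in this answer. Below is a minimal sketch of such a helper, assuming it behaves as the call above suggests: it takes a 2D numeric array and pushes every valid (non-`invalid_val`) entry to one side along the given axis, filling the rest with `invalid_val`. The body is an illustration written for this note and may differ from the function the answer actually used.

```python
import numpy as np

def justify(a, invalid_val=0, axis=1, side='left'):
    """Push valid entries of a 2D numeric array to one side (sketch, not the original helper)."""
    a = np.asarray(a, dtype=float)  # assumption: the data is numeric
    # Mark the entries that should be kept
    if isinstance(invalid_val, float) and np.isnan(invalid_val):
        mask = ~np.isnan(a)
    else:
        mask = a != invalid_val
    # Sorting booleans puts False first and True last along the axis
    justified_mask = np.sort(mask, axis=axis)
    if side in ('up', 'left'):
        # Reverse so the True slots come first
        justified_mask = np.flip(justified_mask, axis=axis)
    out = np.full(a.shape, invalid_val, dtype=float)
    if axis == 1:
        out[justified_mask] = a[mask]
    else:
        # Work on the transposed views so values are filled column by column
        out.T[justified_mask.T] = a.T[mask.T]
    return out

example = np.array([[np.nan, 1.0],
                    [2.0, np.nan],
                    [3.0, 4.0]])
print(justify(example, invalid_val=np.nan, axis=0, side='up'))
```

Note that this sketch relies on `np.flip`, which was added in NumPy 1.12. `np.fliplr` and `np.flipud` do not accept an `axis` argument, which is consistent with the error reported in the comments below.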

Another pandas solution:

    df8 = df8.apply(lambda x: pd.Series(x.dropna().values))
    print(df8)

    rule_id  50014  50238  50239  50356  50412  51181  53139
    0        100.0  100.0  100.0  100.0  100.0  100.0  100.0
    1        100.0    NaN    NaN    NaN    NaN    NaN    NaN
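
For anyone who wants to try the pandas-only approach end to end, here is a self-contained sketch. It rebuilds the sample data from the question by hand, on the assumption that rule_id is the index, and builds the comp columns directly instead of joining and then dropping the reqid columns. The name `compact` is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Sample data typed in from the question; rule_id is assumed to be the index
df1 = pd.DataFrame(
    {
        "reqid1": [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        "reqid2": [0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0],
        "reqid3": [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0],
    },
    index=pd.Index([50014, 50238, 50239, 50356, 50412, 51181, 53139],
                   name="rule_id"),
)

# Same comparison as in the question: NaN for zeros, 1 for a match
# with the next column, 100 otherwise
m = df1.eq(df1.shift(-1, axis=1))
arr1 = np.select([(df1 == 0).to_numpy(), m.to_numpy()], [np.nan, 1], 100)
comp = pd.DataFrame(arr1, index=df1.index).rename(
    columns=lambda x: "comp{}".format(x + 1)
)

# Transposed frame: one column per rule_id, one row per comp column
df8 = comp.T

# Drop the NaNs in each column and re-index the survivors from 0,
# so the remaining values stack at the top
compact = df8.apply(lambda x: pd.Series(x.dropna().values))
print(compact)
```

Because each column can keep a different number of values, pandas pads the shorter columns with NaN at the bottom, so no separate dropna(how='all') step is needed here.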

edited Mar 13 '19 at 13:10

answered Mar 13 '19 at 12:54

jezrael


Hi @jezrael, I have an older version of NumPy, so I used fliplr() instead of flip(), but it raises the error: fliplr() got an unexpected keyword argument 'axis' – vesuvius Mar 13 '19 at 13:06
@sagarkhanna - Hard to say, because I am not the author of the function. But I added an alternative - `df8 = df8.apply(lambda x: pd.Series(x.dropna().values))` – jezrael Mar 13 '19 at 13:11

Your updated pandas solution works. Thanks a lot @jezrael. Accepting:) – vesuvius Mar 13 '19 at 13:13
