1

I have a pandas dataframe (df) like below:

AccountName   AccountName2  DateTime
abc           guest         2016-06-10 20:46
              guest         2016-06-10 21:32
def                         2016-06-10 23:11
                            2016-06-10 23:31
ghi                         2016-06-10 24:41

I need to derive a new dataframe (df1) based on the above dataframe. df1 should have 2 fields, ResultAccount and DateTime.

if(df["AccountName"] != ' '):
 df1["ResultAccount"] = df["AccountName"]
elif(df["AccountName2] != ' '):
 df1["ResultAccount"] = df["AccountName2"]
else:
 df1["ResultAccount"] = "none"

This is the approach I followed but df1 is not getting populated as expected. Any help would be appreciated.

jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
user3447653
  • 3,968
  • 12
  • 58
  • 100

2 Answers2

0

I think you can first replace strings with spaces ' ' to NaN and then apply custom function f with last_valid_index. Output is Dataframe from Series ResultAccount and df.DateTime:

import pandas as pd
import numpy as np

df = pd.DataFrame({'AccountName2': {0: 'guest', 1: 'guest', 2: ' ', 3: ' ', 4: ' '}, 
                   'DateTime': {0: '2016-06-10 20:46', 1: '2016-06-10 21:32', 2: '2016-06-10 23:11', 3: '2016-06-10 23:31', 4: '2016-06-10 24:41'}, 
                   'AccountName': {0: 'abc', 1: ' ', 2: 'def', 3: ' ', 4: 'ghi'}})

print (df)
  AccountName AccountName2          DateTime
0         abc        guest  2016-06-10 20:46
1                    guest  2016-06-10 21:32
2         def               2016-06-10 23:11
3                           2016-06-10 23:31
4         ghi               2016-06-10 24:41
df[['AccountName','AccountName2']] = df[['AccountName','AccountName2']].replace(' ',np.nan)

def f(x):
    if x.first_valid_index() is None:
        return 'None'
    else:
        return x[x.first_valid_index()]

ResultAccount = (df[['AccountName','AccountName2']].apply(f, axis=1))

df1 = pd.DataFrame({'ResultAccount':ResultAccount ,'DateTime':df.DateTime}, 
                   columns=['ResultAccount','DateTime'])

print (df1)
  ResultAccount          DateTime
0           abc  2016-06-10 20:46
1         guest  2016-06-10 21:32
2           def  2016-06-10 23:11
3          None  2016-06-10 23:31
4           ghi  2016-06-10 24:41
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Thank you. The ResultAccount gets populated when AccountName is present. It does not work for other 2 cases - that is if AccountName is null and AccountName2 has content, then ResultAccount is the content in AccountName2. Also, it does not populate "none" if both are empty. – user3447653 Jun 14 '16 at 18:19
  • Ok, maybe I something missing. What is desired output of column `ResultAccount`? – jezrael Jun 14 '16 at 18:22
  • Its not based on which has to be first. Sorry if i had not made my question clear. To populate "ResultAccount", first it should check if (AccountName != null), if yes then ResultAccount = AccountName. Else ResultAccount = AccountName2. – user3447653 Jun 14 '16 at 18:29
  • Sorry, i am little bit confused. What is desired output? Which values are in column `ResultAccount` ? – jezrael Jun 14 '16 at 18:31
  • Desired output is what you have given in your answer, but I do not get that when i apply the function. "abc", "def" and "ghi" are getting populated as expected but "guest" and "None" are not getting populated. – user3447653 Jun 14 '16 at 18:32
0

You could use np.select. It is a multi-conditional generalization of np.where:

import numpy as np
import pandas as pd
df = pd.DataFrame(
    {'AccountName': ['abc', ' ', 'def', ' ', 'ghi'],
     'AccountName2': ['guest', 'guest', ' ', ' ', ' '],
     'DateTime': ['2016-06-10 20:46', '2016-06-10 21:32', '2016-06-10 23:11', '2016-06-10 23:31', '2016-06-10 24:41']})

conditions = [df['AccountName'] != ' ', df['AccountName2'] != ' ']
choices = [df["AccountName"], df["AccountName2"]]
df['ResultAccount'] = np.select(conditions, choices, default='none')

yields

  AccountName AccountName2          DateTime ResultAccount
0         abc        guest  2016-06-10 20:46           abc
1                    guest  2016-06-10 21:32         guest
2         def               2016-06-10 23:11           def
3                           2016-06-10 23:31          none
4         ghi               2016-06-10 24:41           ghi
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • Can you please reply to this http://stackoverflow.com/questions/3694918/how-to-extract-frequency-associated-with-fft-values-in-python#comment63088544_3695448 – optimus prime Jun 15 '16 at 05:49