
I have a dataframe as follows. What I would like is to generate another column (freq) where the rows will have values according to this logic:

  • If the value in the Mode column starts with the digit m, fill in the number n in the freq column.

    - m: 1, n: 12
    - m: 6, n: 4
    - m: 7, n: 2
    - m: 8, n: 1
    

DataFrame

    Mode
0   602
1   603
2   700
3   100
4   100
5   100
6   802
7   100
8   100
9   100
10  100

Here is the logic I tried to implement, but it does not work. An alternative solution that does not use my code would be welcome as well.

def check_mode(Mode):
    freq = ''
    if (Mode.str.startswith('8')).any(): 
        freq = 1
    elif (Mode.startswith("7")).all():  
        freq = 2
    elif (Mode.startswith("6")).any():  
        freq = 4
    elif (Mode.startswith("1")).any(): 
        freq = 12
    return freq

df['freq']=check_mode(df_ia['Mode'].values)

Some observations

If I use:

if (Mode.str.startswith('8')).any():

I receive the error:

AttributeError: 'numpy.ndarray' object has no attribute 'str'

If I use:

if (Mode.startswith('8')).any():

I receive the error:

AttributeError: 'numpy.ndarray' object has no attribute 'startswith'

Any help will be much appreciated. Thank you.

CypherX
William

4 Answers


Is this what you are after?

print(df1)

    Mode
0    602
1    603
2    700
3    100
4    100
5    100
6    802
7    100
8    100
9    100
10   100



import numpy as np

mode = df1['Mode'].astype(str)  # convert once so the string methods are available
c = [mode.str.startswith('8'), mode.str.startswith('7'), mode.str.startswith('6'), mode.str.startswith('1')]
ch = [1, 2, 4, 12]
df1['newcol'] = np.select(c, ch, 0)  # 0 is used when no condition matches

outcome

   Mode  newcol
0    602       4
1    603       4
2    700       2
3    100      12
4    100      12
5    100      12
6    802       1
7    100      12
8    100      12
9    100      12
10   100      12
wwnde
  • Hi friend, can you help me with this question? https://stackoverflow.com/questions/68476193/how-to-merge-2-pandas-daataframes-base-on-multiple-conditions-faster – William Jul 21 '21 at 20:38

Try with np.select

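import numpy as np

Mode = df.Mode.astype(str)  # compare the values as strings
cond1 = Mode.str.startswith('8')
cond2 = Mode.str.startswith("7")
cond3 = Mode.str.startswith("6")
cond4 = Mode.str.startswith("1")
freq = [1,2,4,12]
# each row gets the value for the first condition that is True (0 if none is)
df['new'] = np.select([cond1,cond2,cond3,cond4],freq)
df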
   Mode  new
0   602    4
1   603    4
2   700    2
3   100   12
4   100   12
5   100   12
6   802    1
7   100   12
8   100   12
9   100   12
10  100   12
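
For context, np.select checks the conditions in order, takes the choice for the first condition that is True in each row, and falls back to default (0 unless specified) when none match. A minimal sketch of the same idea with the conditions built in a loop; the unmatched value 250 and the default of -1 are assumptions added only to show the fallback, they are not part of the question's data:

```python
import numpy as np
import pandas as pd

# Subset of the question's data plus one hypothetical unmatched value (250)
df = pd.DataFrame({"Mode": [602, 700, 100, 802, 250]})

mode = df["Mode"].astype(str)
digits = ["8", "7", "6", "1"]
freqs = [1, 2, 4, 12]

# One boolean Series per leading digit, in the same order as freqs
conditions = [mode.str.startswith(d) for d in digits]

# default=-1 marks rows whose leading digit has no rule
df["new"] = np.select(conditions, freqs, default=-1)
print(df)
```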
BENY

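'startswith' is a Python string method; pandas exposes it for a Series through the .str accessor. You are passing a NumPy array (the result of .values) to check_mode(), and an array has neither .str nor .startswith. This is the reason for the error below: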

AttributeError: 'numpy.ndarray' object has no attribute 'str'

To avoid this, pass a pandas Series instead:

df['freq']=check_mode(df_ia['Mode'])

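Note: a Series has no 'startswith' method of its own, so you need to use str.startswith, and that accessor requires the data to be strings (for example, after astype(str)). Also keep in mind that .any() and .all() reduce the whole column to a single True/False, so the function as written returns one value for the entire column instead of one value per row.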
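
A minimal sketch of how check_mode could be reworked along these lines, assuming the goal is one freq value per row. The 0 used for rows with no matching digit is an assumption, not something the question specifies:

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({"Mode": [602, 603, 700, 100, 100, 100, 802, 100, 100, 100, 100]})

def check_mode(mode: pd.Series) -> pd.Series:
    """Return one freq value per row, based on the leading digit of Mode."""
    as_text = mode.astype(str)             # .str only works on string data
    freq = pd.Series(0, index=mode.index)  # assumed default: 0 when no rule matches
    for digit, value in {"8": 1, "7": 2, "6": 4, "1": 12}.items():
        # Row-wise boolean mask; no .any()/.all(), which would collapse to one scalar
        freq.loc[as_text.str.startswith(digit)] = value
    return freq

df["freq"] = check_mode(df["Mode"])  # pass the Series, not .values
print(df)
```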

Sushant Pachipulusu

Try this one-liner.

df['freq'] = df.Mode.astype(str).str.get(0).replace({'8': 1, '7': 2, '6': 4, '1': 12})

Now let us unpack what it does:

# You can run this cell and check the result as well

(df.Mode.astype(str)  # convert the "Mode" column to strings
   .str.get(0)        # take the first character (the leading digit) of each string
   # replace each leading digit using a dictionary that
   # maps it to its freq value
   .replace({'8': 1, '7': 2, '6': 4, '1': 12}))

Code

import pandas as pd

df = pd.DataFrame([602, 603, 700, 100, 100, 100, 802, 100, 100, 100, 100], columns=['Mode'])
df['freq'] = df.Mode.astype(str).str.get(0).replace({'8': 1, '7': 2, '6': 4, '1': 12})
df

## Output
#     Mode  freq
# 0    602     4
# 1    603     4
# 2    700     2
# 3    100    12
# 4    100    12
# 5    100    12
# 6    802     1
# 7    100    12
# 8    100    12
# 9    100    12
# 10   100    12
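
One thing to watch: replace leaves any leading digit that is not in the dictionary untouched, as a string. If that matters, a variant using map may be preferable, since map yields NaN for unmapped keys. A short sketch; the value 250 is a hypothetical addition to the data, included only to show the difference:

```python
import pandas as pd

# Part of the question's data plus one hypothetical value (250) with no rule
df = pd.DataFrame({"Mode": [602, 603, 700, 100, 802, 250]})

mapping = {"8": 1, "7": 2, "6": 4, "1": 12}
first_digit = df["Mode"].astype(str).str[0]

# .map returns NaN for digits missing from the dictionary,
# whereas .replace would keep the original string ("2") in that row
df["freq"] = first_digit.map(mapping)
print(df)
```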
CypherX