I have a dataframe of around 10,000 rows and want to fill multiple columns based on certain conditions.

If Operating System contains "Windows Server", Platform should be "Server"; if it contains "Windows 7" or "Windows 10", Platform should be "Workstation".

Code I have tried:

conditions = [
    (dfADTM['Operating System'].str.contains('Windows Server')),
    (dfADTM['Operating System'].str.contains('Windows 10|Windows 7|Windows XP')),
    (dfADTM['Operating System'].str.contains('Cisco|SLES|OnTap|unknown'))]
choices = ['Server', 'Workstation', 'Network Appliance']
dfADTM['Platform AD'] = np.select(conditions, choices, default='Check')
print(dfADTM.head())

Error I am facing:

[Running] python -u "c:\Users\Abhinav Kumar\Desktop\weekly\code.py"
Traceback (most recent call last):
  File "c:\Users\Abhinav Kumar\Desktop\weekly\code.py", line 36, in <module>
    dfADTM['Platform AD'] = np.select(conditions, choices, default='Check')
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 715, in select
    'invalid entry {} in condlist: should be boolean ndarray'.format(i))
ValueError: invalid entry 0 in condlist: should be boolean ndarray

[Done] exited with code=1 in 7.725 seconds

The expected resulting dataframe is: Dataframe

Abhinav Kumar
  • Does this answer your question? [Pandas conditional creation of a series/dataframe column](https://stackoverflow.com/questions/19913659/pandas-conditional-creation-of-a-series-dataframe-column) – Henry Yik Nov 11 '19 at 04:24
  • Checking the link – Abhinav Kumar Nov 11 '19 at 04:28
  • Please post the actual source code or printed output rather than links – Chris Nov 11 '19 at 04:32
  • @Chris, shared the actual code – Abhinav Kumar Nov 11 '19 at 05:01
  • @HenryYik - that is helpful but does not solve the problem, as the conditions need to be boolean and my case is whether a string contains a word or not. – Abhinav Kumar Nov 11 '19 at 05:02
  • `str.contains` returns a boolean array. It is unclear why you get the error - perhaps you have something other than `string` in your column `Operating System`. Use `dfADTM['Operating System'].str.contains('Windows Server', na=False)` instead. – Henry Yik Nov 11 '19 at 05:20
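
The `na=False` suggestion in the last comment can be sketched as follows. This is a minimal example under assumptions, not a confirmed diagnosis of the asker's data: the sample rows are made up, and the `None` entry stands in for whatever non-string value might be present in `Operating System`. With `na=False`, missing entries count as "no match", so every condition passed to `np.select` is a plain boolean Series.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the None entry stands in for a missing value
dfADTM = pd.DataFrame({
    'Operating System': ['Windows Server 2012', 'Windows 10', 'Cisco IOS', None, 'Ubuntu'],
})

os_col = dfADTM['Operating System']

# na=False treats missing values as "no match", keeping each condition boolean
conditions = [
    os_col.str.contains('Windows Server', na=False),
    os_col.str.contains('Windows 10|Windows 7|Windows XP', na=False),
    os_col.str.contains('Cisco|SLES|OnTap|unknown', na=False),
]
choices = ['Server', 'Workstation', 'Network Appliance']

# Rows matching none of the conditions (including the missing one) get 'Check'
dfADTM['Platform AD'] = np.select(conditions, choices, default='Check')
print(dfADTM)
```

Conditions are evaluated in order, so a row matching more than one pattern takes the first matching choice.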

2 Answers


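Not an efficient method, but it will get the job done:

# Assumes the default 0..n-1 index so that df['OS'][i] looks up row i
for i in range(len(df)):
    # Second space-separated token, e.g. 'Server' in 'Windows Server 2012'
    token = df['OS'][i].split(" ")[1]
    if token == 'Server':
        df.loc[i, 'Platform'] = 'Server'  # set_value was removed in pandas 1.0
    elif token in ('7', '10'):
        df.loc[i, 'Platform'] = 'Workstation'

If the index is not the default 0..n-1 range, reset it first with df.reset_index(drop=True).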

Equan Ur Rehman

You can try this:

import pandas as pd

# df is assumed to be an existing DataFrame with a default 0..n-1 index
workstation = ('Windows 10', 'Windows 7', 'Windows XP')
appliance = ('Cisco', 'SLES', 'OnTap', 'unknown')
df['Platform'] = None  # create an empty column in the dataframe
for i in range(len(df)):
    a = str(df['Operating System'][i])  # str() so missing values do not raise
    # ('x' or 'y') in a would only test 'x'; any() tests every keyword
    if any(k in a for k in workstation):
        df.loc[i, 'Platform'] = 'Workstation'
    elif any(k in a for k in appliance):
        df.loc[i, 'Platform'] = 'Network Appliance'
    elif 'Windows Server' in a:
        df.loc[i, 'Platform'] = 'Server'
    else:
        df.loc[i, 'Platform'] = 'Not mentioned'  # values that do not fall into any category
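
The same per-row classification can also be written as a function passed to `Series.apply`, which avoids index-based assignment altogether. This is only a sketch: the helper name `classify` and the sample rows are hypothetical, and the keyword groups are the ones used in this answer.

```python
import pandas as pd

def classify(os_name):
    """Return a platform label for one operating-system string."""
    text = str(os_name)  # tolerate non-string values such as NaN
    if any(k in text for k in ('Windows 10', 'Windows 7', 'Windows XP')):
        return 'Workstation'
    if any(k in text for k in ('Cisco', 'SLES', 'OnTap', 'unknown')):
        return 'Network Appliance'
    if 'Windows Server' in text:
        return 'Server'
    return 'Not mentioned'  # values that do not fall into any category

# Hypothetical sample data
df = pd.DataFrame({
    'Operating System': ['Windows 7 Pro', 'OnTap 9', 'Windows Server 2016', 'macOS'],
})
df['Platform'] = df['Operating System'].apply(classify)
print(df)
```

This still runs Python code once per row, so for large frames a vectorized approach based on `str.contains` is likely to be faster.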