I would like to know whether it is possible to print specific fields of one row on the screen, ask for a boolean value, and then store that value in the corresponding row of a new column.

For example, I have a dataframe:

Name     Age   Job
Alexandra 24  Student
Michael   42  Lawyer
John      53  Data Analyst
...

I would like to print the rows on the screen and check them one by one. So I should first see:

Alexandra Student

followed by a prompt asking whether Alexandra is female. Since Alexandra is female, I would enter True, which should be stored in a new column called Sex. Then I move on to the next row:

Michael Lawyer

Since Michael is not female, I would enter False in the Sex column. The same goes for John.

At the end, my expected output would be:

Name     Age   Job               Sex 
Alexandra 24  Student           True
Michael   42  Lawyer            False
John      53  Data Analyst      False
...

3 Answers


You could try this with df.to_records, which is reported to be one of the faster ways to iterate over rows:

sexs=[True if str(input(f'{row[1]} {row[3]}\nIs female?\n')).lower()=='true'  else False for  row in df.to_records()]

df['Sex']=sexs
print(df)
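The positions row[1] and row[3] work because to_records() places the index in front of the columns, so position 0 is the index and the data columns start at 1. A small sketch, using the three rows from the question as sample data, that shows the layout:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Alexandra", "Michael", "John"],
    "Age": [24, 42, 53],
    "Job": ["Student", "Lawyer", "Data Analyst"],
})

records = df.to_records()

# Field names of each record: the index comes first, then the columns.
fields = records.dtype.names

# The first record can be read by position or by field name.
first = records[0]
name_by_position = first[1]
job_by_position = first[3]
name_by_field = first["Name"]

print(fields)
print(name_by_position, job_by_position)
```

If the column order might change, reading by field name (row["Name"], row["Job"]) is less fragile than reading by position.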

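Or, to avoid the conditional True if str(input(f'{row[1]} {row[3]}\nIs female?\n')).lower()=='true' else False, and given that you will only ever enter 'True' or 'False', you can parse the input with ast.literal_eval. Note that it raises an exception for anything that is not a valid Python literal, such as lowercase true: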

import ast
sexs=[ast.literal_eval(str(input(f'{row[1]} {row[3]}\nIs female?\n'))) for row in df.to_records()]

df['Sex']=sexs
print(df)
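If typos are likely, it may be safer to validate the reply and ask again instead of letting ast.literal_eval raise. The following is only a sketch: ask_bool and its ask parameter are names made up for this example, and the sets of accepted words are an assumption. The ask parameter defaults to input, and is replaced by scripted replies here so the snippet runs without a keyboard.

```python
def ask_bool(prompt, ask=input):
    """Keep asking until the reply can be read as true or false."""
    truthy = {"true", "t", "yes", "y", "1"}
    falsy = {"false", "f", "no", "n", "0"}
    while True:
        reply = ask(prompt).strip().lower()
        if reply in truthy:
            return True
        if reply in falsy:
            return False
        print(f"Could not interpret {reply!r}; please answer True or False.")


# Scripted replies stand in for a person typing at the keyboard.
# The first reply is invalid, so the first question is asked twice.
replies = iter(["maybe", "TRUE", " no "])
results = [ask_bool("Is female?\n", ask=lambda _: next(replies)) for _ in range(2)]
print(results)
```

In interactive use you would drop the ask argument, for example ask_bool(f'{row[1]} {row[3]}\nIs female?\n') inside the list comprehension.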

And if you want to keep the inputs as strings 'True' or 'False', you could try:

sexs=[input(f'{row[1]} {row[3]}\nIs female?\n') for row in df.to_records()]

df['Sex']=sexs

Output of all the above options:

>>>Alexandra Student
>>>Is female?
True

>>>Michael Lawyer
>>>Is female?
False

>>>John Data Analyst
>>>Is female?
False

df

        Name  Age           Job    Sex
0  Alexandra   24       Student   True
1    Michael   42        Lawyer  False
2       John   53  Data Analyst  False
MrNobody33

This code should work. It iterates through the rows using a for loop.

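df['Sex'] = None  # object column, filled in row by row below

for i in range(len(df)):
    # Column 0 is Name and column 2 is Job in the example dataframe
    answer = input('Is {} {} a female? '.format(df.iloc[i, 0], df.iloc[i, 2]))
    # Store a boolean rather than the raw string; assumes the reply is True or False
    df.iloc[i, 3] = answer.strip().lower() == 'true'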
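A variation on the same loop, as a sketch: collect the replies in a list and assign the column once at the end, which avoids writing into the dataframe cell by cell. The function name add_flag_column and its ask parameter are hypothetical; ask defaults to input and is swapped for scripted replies here so the example runs unattended. It assumes the columns are called Name and Job, as in the question.

```python
import pandas as pd


def add_flag_column(df, column, question, ask=input):
    """Show Name and Job for each row, ask a question, store the boolean reply."""
    flags = []
    for row in df.itertuples(index=False):
        reply = ask(f"{row.Name} {row.Job}\n{question}\n")
        # Anything other than "true" (case-insensitive) is stored as False.
        flags.append(reply.strip().lower() == "true")
    out = df.copy()  # leave the caller's dataframe untouched
    out[column] = flags
    return out


# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Alexandra", "Michael", "John"],
    "Age": [24, 42, 53],
    "Job": ["Student", "Lawyer", "Data Analyst"],
})

# Scripted replies stand in for keyboard input.
scripted = iter(["True", "False", "False"])
labelled = add_flag_column(df, "Sex", "Is female?", ask=lambda _: next(scripted))
print(labelled)
```

In interactive use, call add_flag_column(df, "Sex", "Is female?") and type the replies at the prompt.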
rhug123

Iteration in Pandas is an anti-pattern and is something you should only do when you have exhausted every other option. By iteration, I mean using functions such as iterrows and itertuples that run in native Python.

I'm not sure how you intend to determine the gender, but a good fit for this problem is a multi-column apply:

df['Sex'] = df.apply(
    lambda row: find_gender(row['Name'], row['Job']),    
    axis=1
)

In your find_gender function, you can write your logic based on the name and job (as you have stated in your question). The function must return a Boolean, which becomes the value of Sex for that row.
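As a rough sketch of what find_gender could look like when the answer comes from the user, as in the question: the body below and the ask parameter are assumptions for illustration, not something the question specifies. Keep in mind that apply with axis=1 still calls the function once per row, so with an interactive prompt it is mainly a tidier way to write the loop. A scripted lookup replaces keyboard input here so the example runs on its own.

```python
import pandas as pd


def find_gender(name, job, ask=input):
    """Hypothetical helper: ask about one person and return a boolean."""
    reply = ask(f"{name} {job}\nIs female?\n")
    return reply.strip().lower() == "true"


# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Alexandra", "Michael", "John"],
    "Age": [24, 42, 53],
    "Job": ["Student", "Lawyer", "Data Analyst"],
})

# Stand-in for keyboard input: look the reply up by the name in the prompt.
known_replies = {"Alexandra": "True", "Michael": "False", "John": "False"}


def scripted(prompt):
    return known_replies[prompt.split()[0]]


df["Sex"] = df.apply(
    lambda row: find_gender(row["Name"], row["Job"], ask=scripted),
    axis=1,
)
print(df)
```

Leaving out ask=scripted makes find_gender fall back to input, so each row is shown and answered at the prompt.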

justahuman