
I trained a decision tree model on a universal dataset that includes string observations (customer names, with no text and no tokenization), and I want to test it on a new dataset by appending the predicted column to it. I have the following code; could you please guide me on how to use the function below to append the predicted column to the existing data frame?

```
def genderpredictor(a):
    test_name1 = [a]
    transform_dv = dv.transform(features(test_name1))
    vector = transform_dv.toarray()
    if dclf.predict(vector) == 0:
        print("Female")
    else:
        print("Male")
```

```
# Test model on new dataset
customers.head()
```

Output:
    |Cust_First_Name|
----|:-------------:|
0   |EBtissam       |
1   |Nawal          |
2   |Amer           |
3   |Joanna         |
4   |Stephany       |

```
# Test model on the above data frame
customers_list = customers.Cust_First_Name.tolist()
for n in customers_list:
    print(genderpredictor(n))
```


The above code only prints the predictions one by one. How can I generate a column from them and append it to the data frame "customers"?
  • `customers['new col'] = customers['Cust_First_Name'].apply(genderpredictor)` – mozway Oct 10 '22 at 19:47
  • When I type your code I get predictions as: female, male, male, female etc... But when I ask to show " customers.head() " the "new col" shows "None" at each observation/row. What should I do? – Carla Assaf Oct 10 '22 at 21:03
  • I guess your function prints the output instead of returning it. Fix the function to `return` the value and not print. – mozway Oct 10 '22 at 21:11
  • When I adjusted the code to look like this: `for n in customers_list: return(genderpredictor(n))` I got this message: "SyntaxError: 'return' outside function" – Carla Assaf Oct 10 '22 at 21:15
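
To illustrate what the comments suggest, here is a minimal sketch in which `return` goes inside `genderpredictor` itself (not inside the `for` loop, which is what triggers the `SyntaxError`), and `apply` then builds the new column. The `features` helper, the toy training names and labels, and the column name `Predicted_Gender` are all assumptions made up so the example runs on its own; the real `features`, `dv`, and `dclf` from the question would take their place.

```python
import pandas as pd
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier


# Hypothetical stand-in for the asker's features(): one dict per name.
def features(names):
    return [
        {"first": n[0].lower(), "last": n[-1].lower(), "last2": n[-2:].lower()}
        for n in names
    ]


# Toy training data (assumption): 0 = Female, 1 = Male.
train_names = ["Nawal", "Joanna", "Stephany", "Amer", "Omar", "Karim"]
train_labels = [0, 0, 0, 1, 1, 1]

# Stand-ins for the already fitted dv and dclf from the question.
dv = DictVectorizer()
dclf = DecisionTreeClassifier(random_state=0)
dclf.fit(dv.fit_transform(features(train_names)), train_labels)


def genderpredictor(name):
    """Return the predicted label instead of printing it."""
    vector = dv.transform(features([name])).toarray()
    # predict() returns an array, so take its single element.
    return "Female" if dclf.predict(vector)[0] == 0 else "Male"


customers = pd.DataFrame(
    {"Cust_First_Name": ["EBtissam", "Nawal", "Amer", "Joanna", "Stephany"]}
)

# apply() calls genderpredictor once per row and collects the return values.
customers["Predicted_Gender"] = customers["Cust_First_Name"].apply(genderpredictor)
print(customers.head())
```

Because the function now returns a string, `apply` has real values to collect, so the new column no longer fills with `None`. No explicit loop or `tolist()` call is needed.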
