Correct syntax for conditional statement in pandas

Question

I have a conditional statement I'm trying to work out in pandas in Anaconda. I've imported numpy as np.

I need to create a new "Text" field: if the existing "truncated" field is "False", use the string in the existing "text" field; otherwise (that is, if the "truncated" field is "True"), use the string in the existing "extended_tweet.full_text" field.

I'm trying to follow the instructions on this page, but it's not a direct parallel, because my 'choices' are the values of other fields rather than fixed strings: Pandas conditional creation of a series/dataframe column

Here's my code:

conditions = [
    (df['truncated'] == 'False'),
    (df['truncated'] == 'True')]
choices = ['text'], ['extended_tweet.full_text']
df['Text'] = np.select(conditions, choices, default='null')

After running that, every 'Text' value is 'null'.

I've tried variations for the 'choices' options code, and am thinking the problem is the way I'm indicating the options in the choices line (the example code I'm following is using given 'string' values). But I can't sort out the right way to indicate I want the string values in the stated fields used in the new 'Text' field.

Any help greatly appreciated.

PART 2: RESPONSE TO INPUT BELOW:

Thank you. I wasn't familiar with minimal reproducible examples.

Here's what I've come up with:

df5 = pd.DataFrame([["True", "Hello", "fine"], ["False", 'Howdy', 'good'], ["False", "Hi", "bien"]], columns=['truncated', 'text', 'extended_tweet.full_text'])
print(df5)

  truncated   text extended_tweet.full_text
0      True  Hello                     fine
1     False  Howdy                     good
2     False     Hi                     bien

conditions = [
    (df5['truncated'] == 'False'),
    (df5['truncated'] == 'True')]
choices = ['text'], ['extended_tweet.full_text']
df5['Text'] = np.select(conditions, choices, default='null')

df5['Text']

0    extended_tweet.full_text
1                        text
2                        text
Name: Text, dtype: object

However, it's returning the column names 'text' and 'extended_tweet.full_text' as literal strings, and not the values in those columns.

I tried the two suggestions, which I can't see now that I'm in edit mode. But here are my results:

I changed the 'choices' line to:

choices = ['text', 'extended_tweet.full_text']

And it returned this warning, and every 'Text' value was 'null':

/anaconda3/lib/python3.7/site-packages/pandas/core/ops.py:1649: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  result = method(y)
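
A possible reading of this warning, offered as an assumption rather than a confirmed diagnosis: it tends to appear when a non-string column (for example a boolean one) is compared against a string such as 'False', in which case no condition matches and every row falls through to the default. The sketch below uses a hypothetical stand-in frame, assuming the real 'truncated' column holds booleans, to show how to check the dtype and build a condition that works for either representation. The name `is_truncated` is introduced here for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the real data; assumes 'truncated' holds booleans
df = pd.DataFrame({"truncated": [True, False, False]})

# Inspect the dtype: bool means real booleans, object usually means strings
print(df["truncated"].dtype)

# Normalize to a boolean mask whether the column holds bools
# or the strings "True"/"False"
is_truncated = df["truncated"].astype(str).str.lower() == "true"
print(is_truncated.tolist())
```

With a mask like this, the conditions no longer depend on whether the values were stored as strings or booleans.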

I also tried this version:

conditions = [~df['truncated'],df['truncated']]
choices = ['text'], ['extended_tweet.full_text']
df['Text'] = np.select(conditions, choices, default='null')

But like my minimal example, it produced the 'text' and 'extended_tweet.full_text' strings, and not the values in those fields.
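
For reference, here is a minimal sketch of one way to get the column values rather than the column names, using the same df5 example and assuming 'truncated' holds the strings "True"/"False" as it does there. The idea is that each entry in `choices` is the column itself (a Series) instead of a string naming it. The `Text_alt` column is a hypothetical name added only to show the equivalent `np.where` form for a two-way choice.

```python
import numpy as np
import pandas as pd

df5 = pd.DataFrame(
    [["True", "Hello", "fine"], ["False", "Howdy", "good"], ["False", "Hi", "bien"]],
    columns=["truncated", "text", "extended_tweet.full_text"],
)

# 'truncated' holds strings in this example, so compare against strings
conditions = [
    df5["truncated"] == "False",
    df5["truncated"] == "True",
]

# Pass the columns themselves, not their names, as the choices
choices = [df5["text"], df5["extended_tweet.full_text"]]

df5["Text"] = np.select(conditions, choices, default="null")

# For a simple either/or choice, np.where is an alternative
df5["Text_alt"] = np.where(
    df5["truncated"] == "True", df5["extended_tweet.full_text"], df5["text"]
)

print(df5[["truncated", "Text", "Text_alt"]])
```

If the real column holds booleans instead of strings, the conditions would need to be boolean masks (for example `~df['truncated']` and `df['truncated']`) while the choices stay the same.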

Please see [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — G. Anderson, Jun 27 '19 at 17:25
`choices` should be a single list `choices = ['text', 'extended_tweet.full_text']` — anky, Jun 27 '19 at 17:37
Also you need to check if `True` and `False` are strings or boolean values, if boolean `conditions = [~df['truncated'],df['truncated']]` should work — anky, Jun 27 '19 at 17:38
Thank you for the input. I edited my question above, adding a second part. I've tried the examples, and while I'm now getting something other than 'null' in my 'Text' field, it's still not the right values. — dsx, Jun 27 '19 at 22:51
Sorry, just realized my tables aren't formatted correctly. I'm researching how to do that. — dsx, Jun 27 '19 at 22:52
