So I know I can add a new column trivially in Pandas like this:

df
=====
  A
1 5
2 6
3 7

df['new_col'] = "text"

df
====
  A    new_col
1 5    text
2 6    text
3 7    text

And I can also set a new column based on an operation on an existing column.

def times_two(x):
    return x * 2

df['newer_col'] = times_two(df['A'])
df
====
  A    new_col   newer_col
1 5    text      10
2 6    text      12
3 7    text      14

However, when I try to operate on a text column, I get an unexpected AttributeError.

df['new_text'] = df['new_col'].upper()
AttributeError: 'Series' object has no attribute 'upper'

It is now treating the value as a series, not the value in that "cell".

Why does this happen with text and not with numbers, and how can I update my DataFrame with a new column based on an existing text column?

EdChum

1 Answer

It's because a Series implements the * operator (as its mul operation), so the multiplication is applied element-wise, whilst upper isn't defined for a Series. You have to use str.upper, which is available through the .str accessor for a Series that holds strings:

In[53]:
df['new_text'] = df['new_col'].str.upper()
df

Out[53]: 
   A new_col new_text
1  5    text     TEXT
2  6    text     TEXT
3  7    text     TEXT

There is no magic here.

For df['new_col'] = "text", this is just assigning a scalar value and conforming to broadcasting rules: the scalar is broadcast to the length of the df. For an explanation of that, see: What does the term "broadcasting" mean in Pandas documentation?
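To make the three behaviours above concrete, here is a minimal sketch that rebuilds the frame from the question. The variable name has_upper is only for illustration; everything else mirrors the question's column names.

```python
import pandas as pd

# Rebuild the example frame from the question (index 1..3, one column A).
df = pd.DataFrame({"A": [5, 6, 7]}, index=[1, 2, 3])

# Assigning a scalar broadcasts it to every row of the new column.
df["new_col"] = "text"

# Series defines the arithmetic operators, so * is applied element-wise.
df["newer_col"] = df["A"] * 2

# Plain string methods are not defined on Series itself...
has_upper = hasattr(df["new_col"], "upper")  # illustrative check only

# ...they are reached through the .str accessor instead.
df["new_text"] = df["new_col"].str.upper()

print(df)
```

The numeric case works because the operator is defined on the Series, not because numbers are treated specially; the string case needs .str because upper belongs to each individual value.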

EdChum
  • Hey. So thank you. I figured something lower level was going on here. My actual use case is trying to sha256 encode a text field. First it needs to be encoded to utf-8. When I try to do any operation to the text field, including encoding, I get the error. Is the solution to pass a .str version? – Matt O'Neill Apr 12 '19 at 18:23
  • I did. But for some reason SO has decided to remove my points from the last decade and reset me to zero. So while I've upvoted your comment it won't allow me to show it. – Matt O'Neill Apr 12 '19 at 20:51
  • Okay. So now when I pass this I see it is "" and it's still not handling it correctly. Any further guidance? – Matt O'Neill Apr 13 '19 at 09:01
  • Sorry you need to post code not the error description – EdChum Apr 13 '19 at 09:06
  • I will do. But it seems like even using .str attributes it's not passing a proper string. It's some kind of pandas string and it's causing encode('utf-8') to barf. – Matt O'Neill Apr 13 '19 at 10:19
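The code behind the last comments was never posted, so the following is only a sketch of one way the SHA-256 use case could be handled, not a diagnosis of the actual error. It assumes the column holds plain Python strings; sha256_hex is a hypothetical helper name, and the column name hashed is made up for illustration. The idea is that hashlib works on one value at a time, so the function is mapped over the column rather than called on the Series.

```python
import hashlib

import pandas as pd

df = pd.DataFrame({"new_col": ["text", "text", "text"]}, index=[1, 2, 3])


def sha256_hex(value: str) -> str:
    """Hypothetical helper: UTF-8 encode one string, return its SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# map() calls the helper once per cell, so each call receives a real str.
# na_action="ignore" leaves missing values untouched instead of raising.
df["hashed"] = df["new_col"].map(sha256_hex, na_action="ignore")

# The .str accessor also offers encode(), which yields a column of bytes objects.
encoded = df["new_col"].str.encode("utf-8")

print(df)
```

Calling .str.encode('utf-8') on its own returns a Series of bytes, which still cannot be passed to hashlib.sha256 in one go; that may be what the comment above ran into, but without the code this is a guess.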