I want to convert all numeric columns in a dataframe to their absolute values and am doing this:

df = df.abs()

However, it gives the error:

*** TypeError: bad operand type for abs(): 'unicode'

How can I fix this? I would really prefer not to specify the column names manually.

user308827

4 Answers

You could use np.issubdtype with apply to check whether each column's dtype is a subtype of np.number. Using @Ami Tavory's example:

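import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['-1', '2'], 'b': [-1, 2]})
# Take abs() of numeric columns only; return other columns unchanged
res = df.apply(lambda x: x.abs() if np.issubdtype(x.dtype, np.number) else x)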

In [14]: res
Out[14]:
    a  b
0  -1  1
1   2  2

Or you could check the dtype's kind attribute, whose one-letter code is 'i', 'u', 'f', or 'c' for signed integer, unsigned integer, float, and complex dtypes:

res1 = df.apply(lambda x: x.abs() if x.dtype.kind in 'iufc' else x)


In [20]: res1
Out[20]:
    a  b
0  -1  1
1   2  2

Note: You may also be interested in the NumPy dtype hierarchy.
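
As a rough illustration of what the two checks above accept, the sketch below probes a handful of dtypes. The sample list is an assumption chosen for illustration and is not part of the original answer.

```python
import numpy as np

# Sample dtypes picked for illustration (an assumption, not from the answer).
samples = [np.int64, np.uint8, np.float32, np.complex128, np.bool_, np.object_]

results = {}
for t in samples:
    dt = np.dtype(t)
    # issubdtype walks the NumPy type hierarchy; kind is a one-letter code.
    is_number = bool(np.issubdtype(dt, np.number))
    in_iufc = dt.kind in 'iufc'
    results[dt.name] = (dt.kind, is_number, in_iufc)
    print(f"{dt.name:>10}  kind={dt.kind}  issubdtype={is_number}  iufc={in_iufc}")
```

For these samples the two checks agree: the integer, float, and complex dtypes pass, while bool and object do not.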

Anton Protopopov

Faster than the existing answers and more to the point:

df.update(df.select_dtypes(include=[np.number]).abs())

(Careful: I noticed that update sometimes does nothing when df has a non-trivial MultiIndex. I'll update this answer once I figure out where the problem is. It definitely works with a default RangeIndex.)
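
A possible variant, sketched here with a made-up three-column frame, assigns the result back by column label instead of calling update. Whether this sidesteps the MultiIndex behaviour described above has not been checked here; the column `c` and the name `num_cols` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Example frame: 'a' holds strings, 'b' and 'c' are numeric (illustrative data).
df = pd.DataFrame({'a': ['-1', '2'], 'b': [-1, 2], 'c': [-0.5, 1.5]})

# Labels of the columns whose dtype is a subtype of np.number.
num_cols = df.select_dtypes(include=[np.number]).columns

# Overwrite only those columns with their absolute values;
# the string column 'a' is left untouched.
df[num_cols] = df[num_cols].abs()
print(df)
```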

Bananach
  • @PikachuthePurpleWizard The other answers don't say anything about "how and why they solve the problem" or "what limitations and assumptions apply", they're just more verbose. Go ahead and delete the answer if you want, but this site will not get better if you stop people from quickly sharing bits of "knowledge" while they're doing something else that stops them from being more verbose – Bananach Apr 12 '19 at 16:56
  • I never said that your answer will be deleted, is bad, or is extremely low quality. I'm simply saying that editing to include an explanation increases the quality of your answer and lets other users know *why* your answer is useful. – Pika Supports Ukraine Apr 12 '19 at 16:58
  • @PikachuthePurpleWizard Sorry for the misdirected anger then. But someone voted to delete it, and I felt like your comment was going in the same direction. There are a lot of people with a "good-citizen complex" here that do more harm than good IMHO. – Bananach Apr 12 '19 at 17:11
  • That's okay. I understand the frustration. Now, about this answer having a delete vote... usually [the answer has to be negatively-scored](https://stackoverflow.com/help/privileges/trusted-user) in order for users to vote to delete it, but unfortunately a trusted user voted to delete it in [review](https://stackoverflow.com/review/low-quality-posts/22736293). I do not think there is any reason for this to be deleted, and hopefully it will not be. – Pika Supports Ukraine Apr 12 '19 at 17:13
  • @PikachuthePurpleWizard Thanks for the link to the review. Solid data to confirm my suspicion: 3/5 want to delete an answer that is obviously superior to the accepted one. Self-destructive site... – Bananach Apr 12 '19 at 17:24
  • I don't think it's that they're trying to harm the site, per se, but it does seem that nobody reads [the guidance](https://meta.stackoverflow.com/questions/287563/youre-doing-it-wrong-a-plea-for-sanity-in-the-low-quality-posts-queue) before starting to review in low quality posts. Not learning how to review correctly causes a *ton* of misguided reviews like this one. – Pika Supports Ukraine Apr 12 '19 at 17:25

Borrowing from an answer to this question, how about selecting the columns that are numeric?

Say you start with

>>> df = pd.DataFrame({'a': ['-1', '2'], 'b': [-1, 2]})
>>> df
    a  b
0  -1 -1
1   2  2

Then just do

>>> numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
>>> for c in [c for c in df.columns if df[c].dtype in numerics]:
...     df[c] = df[c].abs()
...
>>> df
    a  b
0  -1  1
1   2  2
Ami Tavory

If you know the positions of the columns you want to convert to absolute values, use this:

df.iloc[:,2:7] = df.iloc[:,2:7].abs()

which replaces all values in the third through seventh columns (inclusive) with their absolute values. The slice 2:7 covers positions 2 to 6, because the stop position is excluded.
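
To check which columns a positional slice touches, here is a small sketch using a hypothetical eight-column frame (the names c0 to c7 are made up for the example). Positional slices exclude the stop position, so 2:7 covers positions 2 through 6.

```python
import pandas as pd

# Hypothetical frame: eight columns c0..c7, one row of negative values.
df = pd.DataFrame([[-1] * 8], columns=[f'c{i}' for i in range(8)])

# Columns selected by the positional slice 2:7.
selected = list(df.iloc[:, 2:7].columns)
print(selected)  # ['c2', 'c3', 'c4', 'c5', 'c6']

# Only the selected columns are overwritten with their absolute values.
df.iloc[:, 2:7] = df.iloc[:, 2:7].abs()
print(df)
```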

If you don't, you can create a list of the column names whose dtype is not object:

col_list = [col for col in df.columns if df[col].dtype != object]

Then use .loc instead:

df.loc[:,col_list] = df.loc[:,col_list].abs()

I know it is wordy, but I think it avoids the overhead of apply with a lambda.
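
One thing to keep in mind: "not object" is broader than "numeric". The sketch below uses an invented frame with a datetime column `d` to show that such a column passes the `!= object` filter, whereas `select_dtypes(include=[np.number])` skips it. The frame and the names `not_object` and `numeric` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': pd.Series(['-1', '2'], dtype=object),           # strings, object dtype
    'b': [-1, 2],                                          # integers
    'd': pd.to_datetime(['2020-01-01', '2020-01-02']),     # datetimes
})

# Filter from the answer: everything whose dtype is not object.
not_object = [col for col in df.columns if df[col].dtype != object]

# Stricter filter: only dtypes that are subtypes of np.number.
numeric = list(df.select_dtypes(include=[np.number]).columns)

print(not_object)  # ['b', 'd']
print(numeric)     # ['b']

# Apply abs() only to the strictly numeric columns.
df.loc[:, numeric] = df.loc[:, numeric].abs()
```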