Python - problem with changing values to groups

Question

I have a dataset with several attributes, one of which is temperature. The temperature ranges from about -30 to about 30 degrees. For a machine learning study I want to bin the temperature into groups, on the principle: -30 and below: 0, -30 to -10: 1, and so on. I wrote the code below, but it doesn't work the way I want it to. The column's data type is int32; I converted it from float64.

dane = [treningowy_df]
for zbior in dane:
    zbior['temperatura'] = zbior['temperatura'].astype(int)
    zbior.loc[ zbior['temperatura'] <= -30, 'temperatura'] = 0
    zbior.loc[(zbior['temperatura'] > -30) & (zbior['temperatura'] <= -10), 'temperatura'] = 1
    zbior.loc[(zbior['temperatura'] > -10) & (zbior['temperatura'] <= 0), 'temperatura'] = 2
    zbior.loc[(zbior['temperatura'] > 0) & (zbior['temperatura'] <= 10), 'temperatura'] = 3
    zbior.loc[(zbior['temperatura'] > 10) & (zbior['temperatura'] <= 20), 'temperatura'] = 4
    zbior.loc[(zbior['temperatura'] > 20) & (zbior['temperatura'] <= 30), 'temperatura'] = 5
    zbior.loc[ zbior['temperatura'] > 30, 'temperatura'] = 6

For example, before the code runs, record 1 has a temperature of -3, and after the code is applied it has a temperature of 3. Why? Another record has a temperature of 22 before the change and 5 after it, i.e. that assignment was executed correctly.

It seems you're using pandas. Have you tried the `apply` or `map` function? – Gabriel May 27 '20 at 12:43
I have solutions now, since a few people have replied; I just have to decide which one to use. – Jakub Bidziński May 27 '20 at 12:44
I gave an answer showing how to use the `apply` function. It should work neatly :) – Gabriel May 27 '20 at 12:45

Gabriel · Accepted Answer · score 4 · 2020-05-28T01:33:49.917

It looks like you're manipulating a dataframe. Have you tried using the `apply` function?

Personally, I would go about it like this (and write the result to a new column).

1. Write a function to process the value

def _check_temperature_range(x):
  if x <= -30:
    return 0
  elif x <= -10:
    return 1
  # so on and so forth...

2. Apply the function onto the column of the dataframe

df[new_column] = df[column].apply(_check_temperature_range)

The results then appear in new_column, or in the original column if you assign the result back to it.

edited May 28 '20 at 01:33

answered May 27 '20 at 12:39

Gabriel

Why not just `elif x <= -10:`? – rvf May 27 '20 at 15:18

Jay · Answer 2 · 2020-05-27T12:37:56.393

I believe it has to do with the sequence of your code.

A record with temperature -3 gets assigned 2:

zbior.loc[(zbior['temperatura'] > -10) & (zbior['temperatura'] <= 0), 'temperatura'] = 2

Then, on the next line, the new value 2 is matched again as being between 0 and 10, and so it is reassigned to 3:

zbior.loc[(zbior['temperatura'] > 0) & (zbior['temperatura'] <= 10), 'temperatura'] = 3

One solution is to assign a placeholder number that no later condition will match, so the record doesn't "jump" into another category.

So, for -3, I'd assign 0 so it sticks around.

After that you can do another pass and change the placeholders to the actual numbers you wanted, e.g. 0 -> 2 for this record.
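
The overwrite described in this answer can be reproduced on a tiny dataframe. This is only a sketch with made-up sample values; it runs just the two consecutive steps quoted above.

```python
import pandas as pd

# Hypothetical sample data; the column name follows the question.
df = pd.DataFrame({"temperatura": [-3, 22]})

# This step matches the original value -3 and overwrites it with group code 2.
df.loc[(df["temperatura"] > -10) & (df["temperatura"] <= 0), "temperatura"] = 2
after_first_step = df["temperatura"].tolist()

# The next step reads the already modified column: 2 lies in (0, 10],
# so the same record is overwritten a second time, now with 3.
df.loc[(df["temperatura"] > 0) & (df["temperatura"] <= 10), "temperatura"] = 3
after_second_step = df["temperatura"].tolist()

print(after_first_step, after_second_step)
```

Each condition is evaluated against the column that the previous line has just changed, so any group code that falls inside a later range gets re-binned.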

score 2 · Answer 3 · answered May 27 '20 at 12:33

I think your code is being applied multiple times to the same row. Taking your example with the first record: temp = -3 gives 2, but then temp = 2 gives 3.

So I recommend creating a new column in your dataframe.
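
A minimal sketch of that idea, keeping the question's `.loc` conditions but reading from the untouched `temperatura` column and writing to a separate one. The column name `grupa` and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"temperatura": [-35, -3, 5, 22, 31]})
temp = df["temperatura"]  # original values, never modified below

# Start with group 0 (temp <= -30) and fill in the other groups.
# All conditions test `temp`, so earlier assignments cannot affect later ones.
df["grupa"] = 0
df.loc[(temp > -30) & (temp <= -10), "grupa"] = 1
df.loc[(temp > -10) & (temp <= 0), "grupa"] = 2
df.loc[(temp > 0) & (temp <= 10), "grupa"] = 3
df.loc[(temp > 10) & (temp <= 20), "grupa"] = 4
df.loc[(temp > 20) & (temp <= 30), "grupa"] = 5
df.loc[temp > 30, "grupa"] = 6
print(df)
```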

Adrien Lebas

score 2 · Answer 4 · edited May 27 '20 at 12:50

If `zbior` is a `pandas.DataFrame`, you can use the `map` function:

def my_func(x):
    if x <= -30:
        return 0
    elif x <= -10:
        return 1
    elif x <= 0:
        return 2
    elif x <= 10:
        return 3
    elif x <= 20:
        return 4
    elif x <= 30:
        return 5
    else:
        return 6

# Replace each temperature with its group code in one pass
zbior['temperatura'] = zbior['temperatura'].map(my_func)
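
For comparison, pandas also provides `pd.cut`, which bins a whole column at once. This is a different technique from `map` and not what this answer proposes; the sketch below assumes the question's boundaries, a hypothetical `grupa` column and made-up sample values.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, including values that sit exactly on a boundary.
df = pd.DataFrame({"temperatura": [-35, -30, -3, 0, 22, 31]})

# Bin edges taken from the question. Intervals are closed on the right by
# default, e.g. (-30, -10], which matches the `<=` comparisons above.
bins = [-np.inf, -30, -10, 0, 10, 20, 30, np.inf]

# labels=False returns the integer position of each bin instead of an interval.
df["grupa"] = pd.cut(df["temperatura"], bins=bins, labels=False)
print(df)
```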

edited May 27 '20 at 12:50

rvf

answered May 27 '20 at 12:38

Hussein Awala

What's the difference between using this and `apply()` as used in the accepted answer? – Jay May 27 '20 at 16:14
[Here](https://stackoverflow.com/q/19798153/9560594) you can find the answer to your question – Hussein Awala May 27 '20 at 16:35
