It's my first time using Python and pandas (please help this old man). I have a column of floats, some of them negative, and I want to replace the values based on conditions. For example, if a number is between -2 and -1.6, I'll replace it with -2, and so on. How can I write the conditions (using if/else or something else) to modify my column? Thanks a lot.

mean=[]

for row in df.values["mean"]:
    if row <= -1.5:
        mean.append(-2)
    elif row <= -0.5 and =-1.4:
        mean.append(-1)
    elif row <= 0.5 and =-0.4:
        mean.append(0)
    else:
       mean.append(1)
df = df.assign(mean=mean)

Doesn't work

HugoB
  • I suggest you provide an example dataframe so people can help you better - make sure you generate it in copy-pastable code. – kabanus Dec 26 '18 at 17:20
  • `np.where()` can do it. [refer](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.where.html) – samkart Dec 26 '18 at 17:20
  • Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation, as suggested when you created this account. [Minimal, complete, verifiable example](http://stackoverflow.com/help/mcve) applies here. We cannot effectively help you until you post your MCVE code and accurately describe the problem. We should be able to paste your posted code into a text file and reproduce the problem you described. StackOverflow is not a coding, review, or tutorial resource. – Prune Dec 26 '18 at 17:21
  • First of all, review your tutorials on constructing a compound conditional: you need something like `-1.4 < row <= 0.5`; your posted conditions aren't syntactically legal, and the closest legal expression will resolve to `True` all the time. – Prune Dec 26 '18 at 17:23
  • We expect you to research the answer on your own before posting. There are many examples of testing cutoff ranges and intervals available online. – Prune Dec 26 '18 at 17:24
  • @samkart no, don't use `where`, that would be horrendous for multiple conditions. You could use `pd.cut` or `np.histogram` or... The method I'm trying to think of to bin data. – roganjosh Dec 26 '18 at 17:24
  • Finally, we need a detailed specification of any computational problems: "Doesen't work [sic]" is not a problem spec. – Prune Dec 26 '18 at 17:25
  • Take a look at this solution using `pd.cut`. https://stackoverflow.com/a/49382340/3679377 – najeem Dec 26 '18 at 17:30
  • @roganjosh noted. Thanks! Reading through the docs, `pd.cut` is preferred way. – samkart Dec 26 '18 at 17:43
  • @samkart for some reason I have `factorize()` in my head as the best way with irregular bins, but the docs don't support it and I can't test. – roganjosh Dec 26 '18 at 17:44
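
To illustrate the point made in the comments about compound conditionals, here is a minimal sketch of the loop from the question with chained comparisons. The cut-offs at -1.5, -0.5 and 0.5 are a guess at what was intended, and the sample values and the name `to_label` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the question did not include any
df = pd.DataFrame({"mean": [-1.8, -1.45, -0.45, 0.2, 0.9]})

def to_label(x):
    # Cut-offs at -1.5, -0.5 and 0.5 are assumed, not stated in the question
    if x <= -1.5:
        return -2
    elif -1.5 < x <= -0.5:  # chained comparison: both bounds in one expression
        return -1
    elif -0.5 < x <= 0.5:
        return 0
    else:
        return 1

# Iterate over the column itself: df.values is a plain NumPy array,
# so it cannot be indexed by column name
labels = [to_label(x) for x in df["mean"]]
df = df.assign(mean=labels)
print(df["mean"].tolist())
```

Note that `row <= -0.5 and =-1.4` is a syntax error; in Python both bounds go into one chained expression, as in `-1.5 < x <= -0.5`.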

1 Answer

Create a function that defines your conditions, then apply it to your column (I fixed some of your conditionals based on what I thought they should be):

import pandas as pd

df = pd.read_table('fun.txt')

# map each value to a label according to the range it falls in
def labels(x):
    if x <= -1.5:
        return -2
    elif -1.5 < x <= -0.5:
        return -1
    elif -0.5 < x < 0.5:
        return 0
    else:
        return 1

df['mean'] = df['mean'].apply(labels)  # apply the function to every value in the column
print(df)

Another way to apply the function, which returns the same result:

df['mean'] = df['mean'].map(labels)
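
As a quick check that `apply` and `map` agree here, the sketch below runs both on an in-memory Series with the same values as `fun.txt`. The names `s`, `via_apply` and `via_map` are only for illustration.

```python
import pandas as pd

def labels(x):
    # Same ranges as in the function above
    if x <= -1.5:
        return -2
    elif -1.5 < x <= -0.5:
        return -1
    elif -0.5 < x < 0.5:
        return 0
    else:
        return 1

# Same values as fun.txt, built in memory so no file is needed
s = pd.Series([0, -1.5, -1, -0.5, 0.1, 1.1], name="mean")

via_apply = s.apply(labels)  # element-wise call through apply
via_map = s.map(labels)      # element-wise call through map

print(via_apply.equals(via_map))
```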

fun.txt:

mean
0
-1.5
-1
-0.5
0.1
1.1

output from above:

   mean
0     0
1    -2
2    -1
3    -1
4     0
5     1
d_kennetz
  • Well, he was asking how to update his dataframe with these conditions. I simply answered his question. – d_kennetz Dec 26 '18 at 17:27
  • Good. Could you please accept it as the answer if it answered your question? You can do this by clicking the green check mark underneath the arrows. – d_kennetz Dec 26 '18 at 17:51
  • Be aware that although this is a possible solution, the conditions in this answer are not correct. For an input like `-0.45` you get `1` as the result because of the ranges set in the conditions. The same issue occurs with a value like `-1.45`, for example. – Cedric Zoppolo Dec 26 '18 at 19:01
  • That is a good point, I have updated the ranges to reflect a proper result for the conditions. – d_kennetz Dec 26 '18 at 19:04
  • @CedricZoppolo Who knows what's correct or not here? OP's conditions are neither proper Python nor human-readable math (at least not without guessing what they _might_ mean), and even if one interprets them as charitably as possible, the first step (i.e. only fixing the inequality signs) would end up with what you call an incorrect answer. Then of course you can begin adjusting the border values too, but is this still what OP meant...? – SpghttCd Dec 26 '18 at 22:12
  • That's a good point @SpghttCd. I upvoted this answer before. But for anyone reading this question I wanted to clarify the ranges selected before correction may have possibly led to undesired outputs. – Cedric Zoppolo Dec 27 '18 at 00:28
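
For readers who want to try the `pd.cut` approach suggested in the comments on the question, here is a minimal sketch. The bin edges and labels are assumptions that mirror the ranges in the answer, the sample values are those of `fun.txt`, and the new column name `label` is made up. One difference to be aware of: with the default `right=True`, a value of exactly 0.5 falls into the 0 bin, whereas the `labels` function in the answer returns 1 for it.

```python
import numpy as np
import pandas as pd

# Same values as fun.txt in the answer, built in memory
df = pd.DataFrame({"mean": [0, -1.5, -1, -0.5, 0.1, 1.1]})

# Assumed bin edges and labels, mirroring the answer's ranges
bins = [-np.inf, -1.5, -0.5, 0.5, np.inf]
labels = [-2, -1, 0, 1]

# right=True (the default) makes each interval closed on the right: (a, b]
binned = pd.cut(df["mean"], bins=bins, labels=labels)

# pd.cut returns a categorical; convert back to plain integers
df["label"] = binned.astype(int)
print(df)
```

Because the whole column is binned in one call, this avoids a Python-level function call per row, which tends to matter on large frames.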