I have the following DataFrame in pandas:

   ID   Quantity    
   1    0.45
   2    1.2
   3    3.4
   4    3
   5    23.34
   6    122.34

I want to assign each observation to a bin of width 1.

Below is my desired DataFrame:

   ID   Quantity     buckets
   1    0.45         0-0.99
   2    1.2          1-1.99
   3    3.4          3-3.99
   4    3            3-3.99
   5    23.34        23-23.99
   6    122.34       122-122.99

How can I do it in pandas?

Neil
  • What bin would you want for quantity = 1.995 ? – jpp Nov 06 '18 at 09:42
  • My data is up to 2 decimals only. – Neil Nov 06 '18 at 09:43
  • OK, then you should be **very** careful. For example, you shouldn't store your values as `float` values. As an example, try this in your interpreter: `2.99 == 2.9900000000000002131628207280300557613372802734375` returns `True`. – jpp Nov 06 '18 at 09:47
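To make the floating-point caveat in the comment above concrete, here is a small sketch. The idea of keeping the quantities as integer hundredths is only one possible workaround and is an assumption on my part, not something the commenters prescribe; the names `hundredths` and `whole_units` are made up for illustration.

```python
from decimal import Decimal

# The literal 2.99 is stored as the nearest binary double, not as exactly 2.99.
# Decimal(float) shows the exact value that is actually stored.
stored = Decimal(2.99)
print(stored)

# The comparison quoted in the comment: both literals map to the same double.
same = 2.99 == 2.9900000000000002131628207280300557613372802734375
print(same)

# Hypothetical workaround: keep two-decimal quantities as integer hundredths,
# so the whole-unit part is exact integer arithmetic.
hundredths = [45, 120, 340, 300, 2334, 12234]
whole_units = [h // 100 for h in hundredths]
print(whole_units)
```

With only two decimals the truncation in this question is unlikely to go wrong, but values produced by arithmetic (for example sums that land a hair below a whole number) can end up in the bin below the one you expect.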

1 Answer

Convert the values to integers, then to strings, and finally join them together:

s = df['Quantity'].astype(int).astype(str)
df['buckets'] = s + '-' + s + '.99'

Alternative with f-strings:

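df['buckets'] = [f'{int(x)}-{int(x)}.99' for x in df['Quantity']]
# https://stackoverflow.com/a/42834054
# Note: the '.0f' format rounds rather than truncates (1.99 -> '2'),
# so this variant only matches when every fractional part is below .5:
df['buckets'] = [f'{x:.0f}-{x:.0f}.99' for x in df['Quantity']]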

print (df)
   ID  Quantity     buckets
0   1      0.45      0-0.99
1   2      1.20      1-1.99
2   3      3.40      3-3.99
3   4      3.00      3-3.99
4   5     23.34    23-23.99
5   6    122.34  122-122.99
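For anyone who wants to reproduce the output above from scratch, here is a self-contained sketch. It assumes the quantities are non-negative and contain no NaN, because `astype(int)` truncates toward zero and raises on missing values; the variable name `whole` is just for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Quantity": [0.45, 1.2, 3.4, 3, 23.34, 122.34],
})

# Truncate each quantity to its whole-number part, then to a string.
whole = df["Quantity"].astype(int).astype(str)

# Build the "n-n.99" label by plain string concatenation.
df["buckets"] = whole + "-" + whole + ".99"

print(df)
```

If negative quantities are possible, truncation and flooring give different bins (for example -0.5 truncates to 0), so the label logic would need to be decided separately.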

If you want intervals instead (by default they are closed on the right, so the left edge itself is excluded):

s = df['Quantity'].astype(int)
df['buckets'] = pd.IntervalIndex.from_arrays(s, s + .99)
print (df)
   ID  Quantity          buckets
0   1      0.45      (0.0, 0.99]
1   2      1.20      (1.0, 1.99]
2   3      3.40      (3.0, 3.99]
3   4      3.00      (3.0, 3.99]
4   5     23.34    (23.0, 23.99]
5   6    122.34  (122.0, 122.99]
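Note that the intervals printed above are open on the left, so the row with Quantity 3 is not actually contained in `(3.0, 3.99]`. If that matters, one possible variant (my assumption about what is wanted, not part of the original answer) is to use half-open bins `[n, n + 1)` via the `closed` parameter of `pd.IntervalIndex.from_arrays`. This also gives values such as 1.995 a well-defined bin.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Quantity": [0.45, 1.2, 3.4, 3, 23.34, 122.34],
})

# Lower edge of each unit-width bin.
lower = np.floor(df["Quantity"])

# closed="left" yields [n, n + 1): the lower edge is included, the upper is not.
df["buckets"] = pd.IntervalIndex.from_arrays(lower, lower + 1, closed="left")

# Check that every quantity lies inside its own bucket, including the exact 3.
inside = [q in iv for q, iv in zip(df["Quantity"], df["buckets"])]
print(df)
print(all(inside))
```

`np.floor` is used here instead of `astype(int)` so that negative values would also land in the bin below them rather than being truncated toward zero.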

Detail:

print (df['Quantity'].astype(int))
0      0
1      1
2      3
3      3
4     23
5    122
Name: Quantity, dtype: int32

print (df['Quantity'].astype(int).astype(str))
0      0
1      1
2      3
3      3
4     23
5    122
Name: Quantity, dtype: object
jezrael