I have a data frame called data_v, and one of its columns is salaries. The values range from 0 to 140,000. I want to split them into ranges (range 1: 0-10000, range 2: 10000-20000, ...), calculate the median of each range, and replace every value in a range with that range's median.

Using this I am able to get the desired output:

first = data_v$salaries[data_v$salaries>=0 & data_v$salaries<10000]
data_v$salaries[data_v$salaries>=0 & data_v$salaries<10000] = median(first)

second = data_v$salaries[data_v$salaries>=10000 & data_v$salaries<20000]
data_v$salaries[data_v$salaries>=10000 & data_v$salaries<20000] = median(second)

.............

ten=data_v$salaries[data_v$salaries>=90000 & data_v$salaries<=100000]
data_v$salaries[data_v$salaries >= 90000 & data_v$salaries <= 100000] = median(ten)

Output:

table(data_v$salaries)

median  7949  17523  25939  34302  42827  56840  65423  73292  81900  95479.75
#        130   2022   8481   9233   2661   1270   3864   2232    176         4

I tried to implement the same thing with a while loop, without success:

i <- 0
while (i <= 140000) {
  m = data_v$salaries[data_v$salaries >= i & data_v$salaries < (i + 10000)]
  data_v$salaries[data_v$salaries >= i & data_v$salaries < (i + 10000)] = median(m)
  i <- i + 10000
}

Any help or suggestions are more than welcome.

Jane
  • Hi Biljana, I would use dplyr but could you maybe give us a reproducible example: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – biomiha Feb 19 '17 at 10:02
  • I agree, we could use dplyr or data.table. Both packages are great for data manipulation. A reproducible example would be appreciated so we can give you a complete answer. – cderv Feb 19 '17 at 10:40

1 Answer

data(mtcars) # data for test
step = 10 # interval length, 10000 for your data
n = ceiling(max(mtcars$mpg)/step)  # number of intervals
mtcars$mpg_interval = cut(mtcars$mpg, step*(0:n))  # assign each value to its interval
mtcars$mpg_median = ave(mtcars$mpg, mtcars$mpg_interval, FUN = median)  # median within each interval
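
As a rough sketch, the same cut/ave approach could be applied to a salaries column as follows. The data frame here is made up for illustration, the column name salaries_median is hypothetical, and the arguments right = FALSE and include.lowest = TRUE are an assumption chosen to match the half-open ranges in the question (0 up to but not including 10000, and so on) while still keeping a value that sits exactly on the top break.

```r
# Hypothetical stand-in for data_v; the values are invented for illustration
data_v <- data.frame(
  salaries = c(0, 2500, 9999, 10000, 15000, 19999, 95000, 140000)
)

step <- 10000                                # interval length
n <- ceiling(max(data_v$salaries) / step)    # number of intervals
breaks <- step * (0:n)                       # 0, 10000, ..., 140000

# right = FALSE gives [0, 10000), [10000, 20000), ...
# include.lowest = TRUE closes the last interval, so a salary equal to
# the top break (140000 here) is not turned into NA
bins <- cut(data_v$salaries, breaks, right = FALSE, include.lowest = TRUE)

# Replace each salary with the median of the interval it falls into
data_v$salaries_median <- ave(data_v$salaries, bins, FUN = median)

print(data_v)
```

To overwrite the original column instead of adding a new one, the result of ave() can be assigned back to data_v$salaries.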
Gregory Demin