I have a data frame with two columns, p1 and p2. I need to split the p1 column into ranges such as 10-50, 50-100, 100-150, and so on. After binning p1, the corresponding p2 values should be printed for each range. The sample input is given below.

df = data.frame(p1 = c(10, 20, 70, 80, 150, 200), p2 = c(1000, 1111.7, 15522.1, 15729.3, 18033.8, 19358.2))

The sample output is attached below.

[Image: sample output]

When I try this on a large dataset, the p2 values get mixed up with the p1 values. [Image]

Tnau

2 Answers

One way of doing it:

library(dplyr)

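df %>%
  mutate(
    # Bin p1 into 50-wide intervals; the breaks run 0, 50, 100, ... past max(p1)
    p1 = cut(p1, breaks = 0:(max(p1) %/% 50 + 1) * 50, include.lowest = TRUE)
  ) %>%
  group_by(p1) %>%
  # Collect the p2 values of each bin into a list column
  summarise(p2 = list(p2))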
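
To see what the `breaks` expression evaluates to, here is a small base R check on the sample data from the question. It is only a sketch for inspecting the bins; the variable names `breaks` and `bins` are introduced here for illustration.

```r
p1 <- c(10, 20, 70, 80, 150, 200)

# `:` binds tighter than `*`, so this is (0:5) * 50
breaks <- 0:(max(p1) %/% 50 + 1) * 50
print(breaks)
# 0 50 100 150 200 250

# Assign each p1 value to its interval and count the values per interval
bins <- cut(p1, breaks = breaks, include.lowest = TRUE)
print(table(bins))
```

Because `max(p1)` is an exact multiple of 50 here, the last interval, (200,250], stays empty; `group_by()` simply does not produce a row for it.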
det
  • Thanks for the reply. But when I try this on a large dataset, the p2 values get mixed up with the p1 values. Can you please help me solve this? – Tnau Nov 11 '21 at 05:03
  • Mixed how? Can you give an example? – det Nov 11 '21 at 06:26

Maybe this?

setNames(
  aggregate(
    p2 ~ cut(p1, c(10, 50, 100, 150, 200), include.lowest = TRUE),
    df,
    c
  ), names(df)
)

gives

         p1               p2
1   [10,50]   1000.0, 1111.7
2  (50,100] 15522.1, 15729.3
3 (100,150]          18033.8
4 (150,200]          19358.2
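
If hard-coding the break points is impractical for a larger dataset, one possible variation computes them with `seq()` and uses `split()` so that the p2 values of each range end up in their own vector. This is a sketch under assumptions: the bins are 50 wide and start at 0 (so the first range is [0,50] rather than [10,50]), and the names `upper`, `breaks`, `bins` and `by_range` are made up for the example.

```r
df <- data.frame(
  p1 = c(10, 20, 70, 80, 150, 200),
  p2 = c(1000, 1111.7, 15522.1, 15729.3, 18033.8, 19358.2)
)

# Assumption: 50-wide bins starting at 0, ending at the first multiple of 50 >= max(p1)
upper  <- ceiling(max(df$p1) / 50) * 50
breaks <- seq(0, upper, by = 50)

# Label every row with its p1 interval
bins <- cut(df$p1, breaks = breaks, include.lowest = TRUE)

# One numeric vector of p2 values per non-empty interval
by_range <- split(df$p2, bins, drop = TRUE)
print(by_range)
```

Each element of `by_range` is named after its interval, for example `by_range[["(50,100]"]]`, so the p2 values stay separate from the p1 labels.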
ThomasIsCoding