After cutting a dataframe, how can a binary field (see desired below) be added such that the value alternates with the cuts?

x    cut     desired
1.1  (1, 2]  0
1.5  (1, 2]  0
1.6  (1, 2]  0
2.5  (2, 3]  1
3    (2, 3]  1
3.5  (3, 4]  0
3.5  (3, 4]  0
3.7  (3, 4]  0

The goal is to draw a histogram with ggplot, coloured similarly to the image below but with only two colours:

[image]

(for illustration only - doesn't correspond to above values)

Vlad
  • Possible duplicate of [How to use duplicate labels with cut function in R?](https://stackoverflow.com/questions/45711863/how-to-use-duplicate-labels-with-cut-function-in-r). The approach there is easily adaptable to your question: `c(0, 1, 0)[as.numeric(cut(x, breaks=c(-Inf, 2, 3, 4)))]` – duckmayr Nov 19 '17 at 12:33
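
The lookup-vector idea from the comment above can be sketched on the values from the question. This is only an illustration: the breaks `1:4` and the helper names `bins` and `labels` are assumptions chosen to match the sample table, and `rep(..., length.out = )` is used so the 0/1 pattern does not have to be typed out by hand.

```r
x <- c(1.1, 1.5, 1.6, 2.5, 3, 3.5, 3.5, 3.7)

# Bin the values into (1,2], (2,3], (3,4] (assumed breaks, taken from the sample table)
bins <- cut(x, breaks = 1:4)

# One label per bin, alternating 0, 1, 0, ...
labels <- rep(c(0, 1), length.out = nlevels(bins))

# Look up each value's label via the integer code of its bin
desired <- labels[as.numeric(bins)]
desired
```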

1 Answer

You can try this:

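set.seed(1)
val <- sort(rnorm(100))
df <- data.frame(x = val, cut = cut(val, 30))
# Integer code of each bin (its position among the factor levels)
df$cut_num <- as.numeric(df$cut)
# Odd-numbered bins get 1, even-numbered bins get 0
df$desired <- df$cut_num %% 2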

head(df, 10)
#            x            cut cut_num desired
# 1  -2.214700  (-2.22,-2.06]       1       1
# 2  -1.989352  (-2.06,-1.91]       2       0
# 3  -1.804959  (-1.91,-1.75]       3       1
# 4  -1.523567   (-1.6,-1.45]       5       1
# 5  -1.470752   (-1.6,-1.45]       5       1
# 6  -1.377060  (-1.45,-1.29]       6       0
# 7  -1.276592  (-1.29,-1.14]       7       1
# 8  -1.253633  (-1.29,-1.14]       7       1
# 9  -1.224613  (-1.29,-1.14]       7       1
# 10 -1.129363 (-1.14,-0.984]       8       0

Edit: Note that a bin produced by cut can be empty (see cut_num == 4 in the example, which never appears). In that case, the two bins on either side of it look consecutive in the data but get the same desired label.
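
If the alternation should follow only the bins that actually contain data, one possible workaround (a sketch, not part of the original answer) is to drop the unused factor levels before taking the integer codes. The column name `bin_rank` is a hypothetical helper, and the `+ 1` is only there so that the first occupied bin gets 0, as in the question.

```r
set.seed(1)
val <- sort(rnorm(100))
df <- data.frame(x = val, cut = cut(val, 30))

# droplevels() removes bins with no observations; the remaining levels keep their order,
# so the integer codes now count occupied bins only
df$bin_rank <- as.numeric(droplevels(df$cut))

# Alternate 0, 1, 0, ... over the occupied bins, starting at 0
df$desired <- (df$bin_rank + 1) %% 2

head(df, 10)
```

With this version the label flips every time the bin changes in the sorted data, even across an empty bin. If the histogram's bin boundaries match the cut breaks, `factor(desired)` can then be mapped to the `fill` aesthetic in ggplot.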

tobiasegli_te