I have a data frame my_df and would like to add a column, my_new_column, populated with random integers that add up to a given sum. Here is some reproducible code:

library(dplyr)
library(magrittr)
my_df <- as.data.frame(matrix(nrow = 10, ncol = 2))
colnames(my_df) <- c("Cat", "MarksA")
my_df$Cat <- LETTERS[1:nrow(my_df)]
my_df$MarksA <- sample(1:100, size = nrow(my_df))

In Tidyverse style, I tried the following:

my_df %<>% mutate(my_new_column=sample(n()))

However, this gives me a column whose sum I cannot control. How can I tweak my code so that the column adds up to a given total?

M--

2 Answers

Since you didn't specify a distribution, would this work? It draws 10 non-negative integers that sum to 1000 from a multinomial distribution with equal probabilities. I pulled my answer mostly from this post, which has more details and more options: Generate non-negative (or positive) random integers that sum to a fixed value

my_df %>%
  # rmultinom() returns a matrix; [, 1] keeps the column a plain integer vector
  mutate(int_sample = rmultinom(n = 1, size = 1000, prob = rep(1 / n(), n()))[, 1])
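
To check that the draw really hits the target, and to get strictly positive integers if zeros are not acceptable, something like the following base R sketch may help. The names n_rows and target are placeholders chosen for illustration, and the "+ 1" shift is just one possible way to rule out zeros, not the only one.

```r
set.seed(42)  # only so this sketch is reproducible

n_rows <- 10    # hypothetical number of rows (nrow(my_df) in the question)
target <- 1000  # hypothetical target sum

# Non-negative integers: one multinomial draw with equal cell probabilities
non_negative <- rmultinom(n = 1, size = target,
                          prob = rep(1 / n_rows, n_rows))[, 1]

# Strictly positive integers: distribute target - n_rows, then add 1 to each cell
positive <- rmultinom(n = 1, size = target - n_rows,
                      prob = rep(1 / n_rows, n_rows))[, 1] + 1L

sum(non_negative) == target  # TRUE by construction
sum(positive) == target      # TRUE by construction
all(positive >= 1)           # TRUE because of the + 1L shift
```

Either vector can then be assigned inside mutate(), provided n_rows matches the number of rows.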
Harrison Jones

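Since the sum of the integers from 1 to n is equal to n(n + 1)/2, a random permutation of 1..n always sums to that value. If that is the total you need, you may try something like this:

nb <- nrow(my_df)
# sample(nb) is a random permutation of 1..nb, so the column sums to nb * (nb + 1) / 2
my_df %<>% mutate(my_new_column = sample(nb))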
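
For a freely chosen total, a different construction is needed. One possible sketch, which is not part of the answer above, is to pick n - 1 distinct cut points between 1 and target - 1 and take the gaps between them. The names n_rows and target are placeholders, and the approach assumes target >= n_rows.

```r
set.seed(1)  # only so this sketch is reproducible

n_rows <- 10    # hypothetical number of rows (nrow(my_df) in the question)
target <- 1000  # hypothetical target sum; must be at least n_rows

# Choose n_rows - 1 distinct cut points in 1..(target - 1) and sort them
cuts <- sort(sample(target - 1, n_rows - 1))

# The gaps between consecutive cut points (with 0 and target as the ends)
# are positive integers that sum to target
parts <- as.integer(diff(c(0, cuts, target)))

length(parts) == n_rows
sum(parts) == target
all(parts >= 1)
```

The resulting parts vector can be assigned as the new column, for example with my_df$my_new_column <- parts when n_rows equals nrow(my_df).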
Julien