I want to multiply the counts in specific location columns (0-100, 0-12, etc., each its own variable column) by the number of days the count is present (the days column).

Here is an example of my data:

df <- structure(list(month = c("Apr", "Apr", "Aug", "Aug", "Aug", "Sep"
), Year = c(2018, 2018, 2018, 2018, 2018, 2018), First = 
 structure(c(17995, 
 17998, 17750, 17758, 17770, 17778), class = "Date"), Last = 
 structure(c(17999, 
 17998, 17750, 17761, 17771, 17778), class = "Date"), days = c(5, 
 1, 1, 4, 2, 1), `0-100` = c(1, 0, 1, 1, 1, 1), `0-12` = c(0, 
 0, 1, 1, 1, 1), `0-25` = c(1, 1, 1, 1, 1, 1), `0-50` = c(1, 0, 
1, 1, 1, 1)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", 
 "data.frame"))

So I was thinking something along the lines of:

df2 <- df %>%
  mutate("0-100b" = days * "0-100", "0-12b" = days * "0-12", "0-25b" = days * "0-25", "0-50b" = days * "0-25")

First, this doesn't seem to work. Second, there must be a more concise way than writing out each multiplication; with many more columns this would get tedious.

Edit, with renamed columns:

colnames(df) <- c("month", "Year", "First", "Last" , "days", "V", "I", 
"II", "III")

df2 <- df %>%
mutate(Vb = days * V, Ib = days * I, IIb = days * 
       II, IIIb = days * III)
camille
Lmm
  • To refer to columns that aren't easily recognized by R (e.g. names that start with numbers or contain "-"), wrap the name in backticks. Right now you're trying to multiply the numbers in `days` by the text "0-100", which doesn't make sense – camille May 28 '19 at 20:13
  • You can use `dplyr::mutate_at`, such as here: https://stackoverflow.com/q/45947787/5325862 – camille May 28 '19 at 20:16
  • It is better not to have non-standard column names, such as names that start with a digit. – akrun May 28 '19 at 20:26

1 Answer

Like I said above, you can select improperly named columns by wrapping them in backticks. One of the places that naming rules are laid out is in the docs of the base function make.names.
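As a minimal sketch of the backtick approach, the snippet below uses a small stand-in tibble (a hypothetical subset of the question's data, not the full `df`) and multiplies two of the non-syntactic columns by `days`:

```r
library(dplyr)

# Small stand-in for the question's data (hypothetical subset)
df_small <- tibble::tibble(
  days    = c(5, 1, 4),
  `0-100` = c(1, 0, 1),
  `0-12`  = c(0, 0, 1)
)

# Backticks make R treat the names as columns, not as character strings
df_small2 <- df_small %>%
  mutate(`0-100b` = days * `0-100`,
         `0-12b`  = days * `0-12`)
```

This fixes the error in the question's attempt, but it still means writing out one expression per column.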

The easiest solution to having improper names is to just create data with valid names to begin with...but in practice, that isn't always possible. There are several ways to change the names into valid ones. The aforementioned make.names does this from a character vector.
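For illustration, this is roughly what `make.names` does with the column names from the question (the vector `old_names` is just a name chosen for this sketch):

```r
# Non-syntactic names taken from the question's data
old_names <- c("0-100", "0-12", "0-25", "0-50")

# make.names() replaces invalid characters with "." and
# prepends "X" when a name would otherwise start with a digit
new_names <- make.names(old_names)
new_names
#> [1] "X0.100" "X0.12"  "X0.25"  "X0.50"
```

The result could then be assigned back with something like `names(df)[6:9] <- new_names`, assuming the count columns sit in positions 6 to 9 as in the example data.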

If you're working in a larger piped workflow, you can use rename_all with a few string manipulation functions to 1) convert to lowercase, 2) replace - with _, and 3) prepend an x before any leading digits. You can also use janitor::clean_names, which cleans all the names in a data frame.

library(dplyr)

df %>%
  rename_all(~tolower(.) %>% 
               stringr::str_replace_all(., "\\-", "_") %>%
               stringr::str_replace("^\\b(?=\\d)", "x"))
# omitted: same names as below

With clean names, you can use mutate_at, select the columns, and pass it a function that multiplies each one by days. If you pass the function in a named list, the list name is appended as a suffix to create new columns instead of overwriting the originals.

df %>%
  janitor::clean_names() %>%
  mutate_at(vars(x0_100:x0_50), list(b = ~. * days))
#> # A tibble: 6 x 13
#>   month  year first      last        days x0_100 x0_12 x0_25 x0_50 x0_100_b
#>   <chr> <dbl> <date>     <date>     <dbl>  <dbl> <dbl> <dbl> <dbl>    <dbl>
#> 1 Apr    2018 2019-04-09 2019-04-13     5      1     0     1     1        5
#> 2 Apr    2018 2019-04-12 2019-04-12     1      0     0     1     0        0
#> 3 Aug    2018 2018-08-07 2018-08-07     1      1     1     1     1        1
#> 4 Aug    2018 2018-08-15 2018-08-18     4      1     1     1     1        4
#> 5 Aug    2018 2018-08-27 2018-08-28     2      1     1     1     1        2
#> 6 Sep    2018 2018-09-04 2018-09-04     1      1     1     1     1        1
#> # … with 3 more variables: x0_12_b <dbl>, x0_25_b <dbl>, x0_50_b <dbl>

In this case, it might also make sense to select columns by regex:

df %>%
  janitor::clean_names() %>%
  mutate_at(vars(matches("^x\\d")), list(b = ~. * days))
# same output as above
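
As a side note, in dplyr 1.0.0 and later `mutate_at` is superseded by `across`. A rough equivalent might look like the sketch below; it assumes dplyr >= 1.0.0, uses a small hypothetical stand-in tibble rather than the full `df`, and selects the original non-syntactic names directly so no renaming is needed:

```r
library(dplyr)

# Small stand-in for the question's data (hypothetical subset)
df_small <- tibble::tibble(
  days    = c(5, 1, 2),
  `0-100` = c(1, 0, 1),
  `0-12`  = c(0, 0, 1)
)

# Select columns whose names start with a digit, multiply each by days,
# and name the new columns by appending "b" to the original name
out <- df_small %>%
  mutate(across(matches("^\\d"), ~ .x * days, .names = "{.col}b"))
```

The `.names` glue pattern controls the new column names, so `"{.col}_b"` would reproduce the `_b` suffix style used above.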
camille