I've been trying to make a graph using either barplot or ggplot but first I need to combine different observations from the same variable.

My variable has different observations depending on how relevant a subject is for each user, like this:

Count  Activity
10     Bikes for fitness reasons
22     Runs for fitness reasons
12     Bikes to commute to work
10     Walks to commute to work
5      Walks to stay healthy

My idea is to merge the observations from the "Activity" variable so it looks like this:

Count Activity
22    Bikes
22    Runs
15    Walks

I don't care about the reason for the activity; I just want to merge the rows so I can put that information into a bar graph.

user438383
Chris

3 Answers

Here is a tidyverse solution:

library(tidyverse)

df %>% 
  mutate(Activity = word(Activity, 1)) %>% 
  group_by(Activity) %>% 
  summarize(Count = sum(Count))

This gives us:

# A tibble: 3 x 2
  Activity Count
  <chr>    <dbl>
1 Bikes       22
2 Runs        22
3 Walks       15

Data:

df <- structure(list(Count = c(10, 22, 12, 10, 5), Activity = c("Bikes for fitness reasons", 
"Runs for fitness reasons", "Bikes to commute to work", "Walks to commute to work", 
"Walks to stay healthy")), row.names = c(NA, -5L), class = "data.frame")
Matt
    Did not know `word()`. Looks VERY useful. I would usually go through the `str_extract("^\\w+")` drill and the like here. This is so much simpler. +1 – GuedesBF Sep 23 '21 at 18:56

You could use grep() to find each term you are looking for, like this:

df <- data.frame(
  Count = c(10,22,12,10,5),
  Activity = c("Bikes for fitness reasons",
               "Runs for fitness reasons",
               "Bikes to commute to work",
               "Walks to commute to work",
               "Walks to stay healthy"))

# Look for this string
var <- "Bikes"

# Get the indices of the rows where "Bikes" appears
grep(pattern = var, x = df$Activity)
#> [1] 1 3

# Get Count values from each row where "Bikes" appears
df[grep(pattern = var, x = df$Activity), "Count"]
#> [1] 10 12
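
The steps above stop at the individual Count values for one term. A minimal sketch of how they could be extended to a total per activity, assuming the set of activities is known in advance; the names `keywords` and `totals` are made up for this illustration and are not part of the original answer:

```r
df <- data.frame(
  Count = c(10, 22, 12, 10, 5),
  Activity = c("Bikes for fitness reasons",
               "Runs for fitness reasons",
               "Bikes to commute to work",
               "Walks to commute to work",
               "Walks to stay healthy"))

# Hypothetical list of terms to look for (assumed known up front)
keywords <- c("Bikes", "Runs", "Walks")

# For each term, sum the Count values of the rows where it appears.
# fixed = TRUE treats the term as a literal string, not a regular expression.
totals <- sapply(keywords, function(k) {
  sum(df$Count[grep(pattern = k, x = df$Activity, fixed = TRUE)])
})

# Reshape the named vector into a two-column data frame
result <- data.frame(Activity = names(totals), Count = unname(totals))
result
```

Because `sapply()` keeps the terms as names, `totals` is a named vector that can be passed straight to `barplot()`. Note that this matches the term anywhere in the string, so it only behaves well when no term is contained in another.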
Skaqqs

Using trimws() to drop everything after the first word:

library(dplyr)
df %>% 
   group_by(Activity = trimws(Activity, whitespace = "\\s+.*")) %>% 
   summarise(Count = sum(Count))

-output

# A tibble: 3 x 2
  Activity Count
  <chr>    <dbl>
1 Bikes       22
2 Runs        22
3 Walks       15
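
To get from a summary like this to the bar graph the question asks about, here is a rough base R sketch. It swaps in `sub()` and `aggregate()` for the dplyr pipeline so that it runs without extra packages, and it assumes the activity is always the first word of the string; the axis labels are just placeholders:

```r
df <- data.frame(
  Count = c(10, 22, 12, 10, 5),
  Activity = c("Bikes for fitness reasons",
               "Runs for fitness reasons",
               "Bikes to commute to work",
               "Walks to commute to work",
               "Walks to stay healthy"))

# Keep only the first word: remove everything from the first whitespace onward
df$Activity <- sub("\\s.*$", "", df$Activity)

# Sum Count within each activity
totals <- aggregate(Count ~ Activity, data = df, FUN = sum)

# One bar per activity
barplot(totals$Count, names.arg = totals$Activity,
        xlab = "Activity", ylab = "Count")
```

With ggplot2, the same summarised data frame could be passed to `ggplot(totals, aes(Activity, Count)) + geom_col()` instead.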
akrun