I'm taking the edX Harvard R Basics and Data Visualization courses, but I'm having a hard time understanding what the dot (.) operator does.

I tried the code below:

gapminder %>%
  filter(year %in% c(1970, 2010) & !is.na(gdp)) %>%
  mutate(group = ifelse(region %in% west, "West", "Developing")) %>%
  ggplot(aes(dollars_per_day)) +
  geom_histogram(binwidth = 1, color = "black") +
  scale_x_continuous(trans = "log2") +
  facet_grid(year ~ group)

Here's where I get stuck. I want to keep only the countries that have data in both years, so I build two lists and intersect them. If I end both pipelines with "%>% .$country", intersect the results, and then draw the histogram, everything runs fine:

country_list_1 <- gapminder %>%
  filter(year == 1970 & !is.na(dollars_per_day)) %>% .$country
country_list_2 <- gapminder %>%
  filter(year == 2010 & !is.na(dollars_per_day)) %>% .$country
country_list <- intersect(country_list_1, country_list_2)

gapminder %>%
  filter(year %in% c(1970, 2010) & country %in% country_list) %>% 
  mutate(group = ifelse(region %in% west, "West", "Developing")) %>%
  ggplot(aes(dollars_per_day)) +
  geom_histogram(binwidth = 1, color = "black") +
  scale_x_continuous(trans = "log2") +
  facet_grid(year ~ group)

But if I skip the "%>% .$country" step, as below, it returns the error "Faceting variables must have at least one value":

country_list_1 <- gapminder %>%
  filter(year == 1970 & !is.na(dollars_per_day))
country_list_2 <- gapminder %>%
  filter(year == 2010 & !is.na(dollars_per_day))
country_list <- intersect(country_list_1, country_list_2)

gapminder %>%
  filter(year %in% c(1970, 2010) & country %in% country_list) %>% 
  mutate(group = ifelse(region %in% west, "West", "Developing")) %>%
  ggplot(aes(dollars_per_day)) +
  geom_histogram(binwidth = 1, color = "black") +
  scale_x_continuous(trans = "log2") +
  facet_grid(year ~ group)

I don't quite get the logic of that, nor the function of the dot per se.

Source: Section 3, 3.2 "Using the Gapminder Dataset", 5th video ("Comparing distributions") of the HarvardX course Data Science: Visualization in R.

mkrieger1
  • Hi @Carlo! Because your post focuses on `dplyr` only, I removed the references to `data.table` (whose dot has a different meaning). Cheers – Henrik Sep 09 '22 at 04:50
  • Perhaps [dot operator](https://stackoverflow.com/questions/54815607/r-combinations-with-dot-and-pipe-operator) as an overview. – Chris Sep 09 '22 at 04:53
  • Also, the `dot` tag on Stack Overflow refers to the Graphviz language ([dot](https://graphviz.org/doc/info/lang.html)), and your operator tag is unrelated to your question (edited to remove tags). In brief, the 'dot' is a placeholder for the 'whole dataset'. Instead of `.$country` you could also use `pull(country)` and get the same result. The dot is specific to the pipe operator and stands for whatever is on the LHS of the pipe. Please see https://magrittr.tidyverse.org/reference/pipe.html and https://stackoverflow.com/questions/42385010/using-the-pipe-and-dot-notation for a more thorough explanation – jared_mamrot Sep 09 '22 at 04:55

1 Answer

In addition to the comments above, you may need to learn about subsetting with $. When you run

my_new_df <- gapminder %>%
  filter(year == 1970 & !is.na(dollars_per_day))

you get back a data set (a tibble) with fewer rows than before, but you still have all of the columns. The $ lets you pick out a single column, so the countries end up in a plain vector:

my_new_df$country
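
To see the difference between the filtered tibble and the extracted column, here is a minimal sketch. It assumes dplyr is installed, and toy is a made-up stand-in for gapminder (its values are invented for illustration):

```r
library(dplyr)

# Hypothetical toy data standing in for gapminder (values are made up)
toy <- tibble(
  country = c("A", "B", "C", "A", "B", "C"),
  year = c(1970, 1970, 1970, 2010, 2010, 2010),
  dollars_per_day = c(1.5, NA, 3.2, 4.0, 2.5, NA)
)

# filter() drops rows but keeps every column, so the result is still a tibble
rows_1970 <- toy %>%
  filter(year == 1970 & !is.na(dollars_per_day))

is.data.frame(rows_1970)  # TRUE
names(rows_1970)          # "country" "year" "dollars_per_day"

# $ pulls one column out of the tibble as an ordinary vector
rows_1970$country         # "A" "C"
```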

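As mentioned in the comments to your question, the dot (.) is a placeholder: it says to put everything coming in from the left side of the pipe %>% into that spot. So "%>% .$country" takes the filtered tibble and extracts its country column. Without it, country_list_1 and country_list_2 are whole tibbles, so intersect() compares entire rows (none match, because the years differ) and country %in% country_list matches nothing. The plot then has no data left to facet, which is where the error comes from.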
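
The two versions from the question can be compared on a small example. This is only a sketch under assumptions: dplyr is installed, and toy is a hypothetical stand-in for gapminder with invented values.

```r
library(dplyr)

# Hypothetical toy data standing in for gapminder (values are made up)
toy <- tibble(
  country = c("A", "B", "C", "A", "B", "C"),
  year = c(1970, 1970, 1970, 2010, 2010, 2010),
  dollars_per_day = c(1.5, NA, 3.2, 4.0, 2.5, NA)
)

# With the dot: "." is the filtered tibble, and .$country extracts one column
list_1 <- toy %>%
  filter(year == 1970 & !is.na(dollars_per_day)) %>% .$country
list_2 <- toy %>%
  filter(year == 2010 & !is.na(dollars_per_day)) %>% .$country

# pull(country) is another way to get the same vector
list_1b <- toy %>%
  filter(year == 1970 & !is.na(dollars_per_day)) %>% pull(country)
identical(list_1, list_1b)  # TRUE

country_list <- intersect(list_1, list_2)  # "A"
kept_right <- toy %>%
  filter(year %in% c(1970, 2010) & country %in% country_list)

# Without the dot: both objects are whole tibbles, so intersect() compares rows
df_1 <- toy %>% filter(year == 1970 & !is.na(dollars_per_day))
df_2 <- toy %>% filter(year == 2010 & !is.na(dollars_per_day))
df_common <- intersect(df_1, df_2)
nrow(df_common)  # 0, because no row has the same year in both tibbles

# country %in% <tibble> does not look inside the country column,
# so no rows survive in this toy example
kept_wrong <- toy %>%
  filter(year %in% c(1970, 2010) & country %in% df_common)
nrow(kept_wrong)
```

An empty data set like kept_wrong is what ggplot then receives, so facet_grid(year ~ group) has no values to facet on.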

Michael Dewar