
Could you please explain the meaning of .$VariableName in this piece of code? I would also like to know which keywords to look up in textbooks to read more about it.

data %>%
  filter(Origin == origin, Dest == dest, UniqueCarrier == airline) %T>%
  {totalFlights <<- totalFlights + length(.$Origin)} %>%
  select(ifelse(is.na(Delay), 0, Delay)) %>%
  filter(Delay > 0) ->
  temp
NelsonGon
Irina Kärkkänen
  • I assume you're using the flights dataset. However, what does `%T>%` do?! As for `.$VariableName`, it depends on the context, but on its own it takes the data passed through the pipe and selects the column VariableName. – NelsonGon Feb 19 '19 at 07:41

1 Answer


Here is a simple explanation with a simple example:

iris %>% 
 split(.$Species)

The dot (.) is a placeholder for whatever data is passed into the pipe. In this example, .$Species extracts the Species column from that data, and split() uses it to split the data into groups. When you examine the output, you'll see three "splits", one per Species. Related: Meaning of ~. (tilde dot) argument?
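To make the role of the dot concrete, here is a minimal sketch that compares the piped call with its base R equivalent. It assumes the magrittr package is installed; the variable names `piped`, `direct` and `no_dot` are made up for illustration.

```r
library(magrittr)  # provides %>%

# The dot stands for the piped object, so these two calls do the same thing.
piped  <- iris %>% split(.$Species)
direct <- split(iris, iris$Species)
identical(piped, direct)

# One data frame per level of Species.
names(piped)

# Without the dot, split() looks for an object called Species
# outside the data frame and fails with an error.
no_dot <- tryCatch(iris %>% split(Species), error = function(e) e)
inherits(no_dot, "error")
```

Because the dot appears only inside `.$Species` and not as a direct argument, the pipe still passes `iris` as the first argument of `split()`.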

NelsonGon
  • Hi Nelson. So it is basically splitting by a factor. Thanks! – Irina Kärkkänen Feb 19 '19 at 07:48
  • Not every time. In this example, it is splitting by Species. The point is `.` carries "data" or "columns". The easiest way to think of it is in a `lm` formula. – NelsonGon Feb 19 '19 at 07:49
  • To add to this: If you use `dplyr` functions like `select` or `filter`, you can just use the column name (without the `.$`), since `dplyr` looks for the column in the data frame. But if you do not use `dplyr` functions (for example: `{totalFlights <<- totalFlights + length(.$Origin)}`), you have to specify that `Origin` is a column of the input data frame. Otherwise R will look for a variable named `Origin` in the calling environment (typically your global environment). – FloSchmo Feb 19 '19 at 07:52
  • So does writing data$Origin or .$Origin mean the same thing here? – Irina Kärkkänen Feb 19 '19 at 08:00
  • or simply Origin with dplyr functions... – Irina Kärkkänen Feb 19 '19 at 08:01
  • @IrinaKärkkänen data$Origin is the base R way. In `dplyr` you don't use "$", especially if you use the pipe. In base R, you cannot do `.$Origin`. I have a feeling this line (`totalFlights + length(.$Origin)`) doesn't work, but I'm not sure. I'm still curious about what `%T>%` means. – NelsonGon Feb 19 '19 at 09:14
  • To see the impact of the dot, try running `iris %>% split(Species)`. The dot is only used on special occasions with dplyr and/or formulae. – NelsonGon Feb 19 '19 at 09:17
  • This should be correct code; it comes from Microsoft. I have found %T>% in a book: it is called the tee pipe, and it applies to the left side of the expression. As I understand it, it shows that the code inside the {} brackets relates to the filter function... – Irina Kärkkänen Feb 19 '19 at 09:18
  • Found the tee pipe. It's from `magrittr`. It returns its left-hand side, which here is the result of the filter step. – NelsonGon Feb 19 '19 at 09:19
  • I still do not understand how this code filters on all three arguments: origin airport, destination airport, plus the airline... The original task is here: https://github.com/MicrosoftLearning/20773_Analyzing-Big-Data-with-Microsoft-R/blob/master/Instructions/20773A_LAB_05.md. The answer key is here: https://github.com/MicrosoftLearning/20773_Analyzing-Big-Data-with-Microsoft-R/blob/master/Instructions/20773A_LAB_AK_05.md – Irina Kärkkänen Feb 19 '19 at 09:26
  • @IrinaKärkkänen I can't understand the data right now. Try running this code and hopefully it helps you understand what's going on: `nycflights13::flights%>% filter(origin =="EWR" , dest == "JFK", carrier == "UA") %T>% {totalFlights <<-length(.$origin)} `. It returns 0 because there are no flights that match those criteria. You'll see the result of totalFlights in your global environment. – NelsonGon Feb 19 '19 at 09:33
  • Compare it to this: `nycflights13::flights%>% filter(origin =="JFK" , dest == "MCO") %T>% {totalFlights1 <<-length(.$origin)}` – NelsonGon Feb 19 '19 at 09:37
  • It looks for a match and then we take the `length` of the origin column and save that to totalFlights. The dot here is saying take whatever data meets the criteria and now from that data find the length of the column origin. – NelsonGon Feb 19 '19 at 09:43
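The points made in the comments above (bare column names inside `dplyr` verbs, `.$` inside a braced block, and the tee pipe returning its left-hand side) can be sketched on a built-in dataset. This is only an illustration, assuming the magrittr and dplyr packages are installed and that the code runs at the top level of a script or console; `totalRows` is a hypothetical counter standing in for `totalFlights`, and `iris` stands in for the flights data.

```r
library(magrittr)  # provides %>% and the tee pipe %T>%
library(dplyr)     # provides filter()

totalRows <- 0  # hypothetical counter, standing in for totalFlights

result <- iris %>%
  # Inside a dplyr verb, a bare column name is looked up in the data.
  filter(Species == "setosa") %T>%
  # Inside a plain braced block, the column must be reached through the dot.
  # The tee pipe runs this block for its side effect and passes on the
  # filtered data unchanged.
  {totalRows <<- totalRows + length(.$Species)} %>%
  filter(Sepal.Length > 5)

totalRows     # rows that passed the first filter
nrow(result)  # rows left after the second filter
```

The counter is updated from the output of the first `filter()`, while the second `filter()` still receives that same filtered data frame rather than the value of the braced block.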