
Update: I finally figured out how to do the plotting. The code below works for me:

mydf %>%
  dplyr::filter(NAME =="" & GENDER =="") %>%
  ggplot(aes(YEAR, RANK)) +
  geom_point()

Now I am converting it into a function that takes name and gender as arguments. The function is case sensitive and should still display a plot if the gender argument is missing. Here is my progress so far. It displays a plot when both arguments are given, but if I omit the gender argument, it displays a blank plot. Do I need to use grep or grepl inside my function? Thanks everyone!

name.plot <- function(name = "", gender = "", ignore.case = TRUE) {
  mydf %>%
    dplyr::filter(NAME == name & GENDER == gender) %>%
    ggplot(aes(YEAR, RANK)) +
    geom_point()
}

I am working on my homework and need some help. We were given a dataset of baby names and are supposed to write a function that takes a name and a gender and returns a plot of rank against year.

Currently, I'm figuring out how to display the plot first. I figured out how to display a name, but when I tried to add gender, it just gives me a blank plot. Can someone please help me see what I am doing wrong? I tried both the group_by and which functions, but no luck.

p1 <- mydf %>%
  filter(NAME =="Madison", GENDER =="girl") %>%
  ggplot(aes(YEAR, RANK)) +
  geom_point()
p1

Here is my sample dataset:

Babynames 1880-2008

oguz ismail
Pinaypy
  • I guess you would fare better separating your data in a more programmatic way, meaning using facets and aesthetics for grouping. E.g., use `+ geom_point(data = mydf, aes(YEAR, RANK, color = NAME)) + facet_grid(˜GENDER)` (apologies for the weird tilde sign - using a weird keyboard atm) – tjebo Mar 23 '20 at 17:17
  • Not sure how it works. Right now, I'm just figuring out how to display a plot, and later on I will have to create a function that will take name and gender as arguments and display a plot. – Pinaypy Mar 23 '20 at 17:28
  • Have you tried my code? Replace the weird tilde with a normal tilde sign, and add it to your p – tjebo Mar 23 '20 at 17:33
  • `@Tjebo`, I did and still not working :( – Pinaypy Mar 23 '20 at 17:48
  • There seems to be something peculiar going on with your data... John, William and James ranked first for girl names in 1880? – tjebo Mar 23 '20 at 17:52
  • `@Tjebo` thanks for pointing it out, will have to figure out why it does it when I read the CSV file. I added a sample dataset from the actual CSV file. – Pinaypy Mar 23 '20 at 17:58
  • Please read this thread on how to provide data and ask better questions: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – tjebo Mar 23 '20 at 18:02
  • I think I finally figured out how to plot, but how do I put it into a function that will take the name and gender as arguments? – Pinaypy Mar 23 '20 at 18:20
  • You are using the `filter` function improperly. You need to include a logical test (& or |) between your two filter statements like: `df %>% filter( NAME =="John" & GENDER =="boy")` – Dave2e Mar 23 '20 at 18:40
  • `@Dave2e`, Thank you, but I finally figured out that part. Now I'm stuck on how to put into a function that will take name and gender as arguments. If there are no matches, the function should still display an empty plot. – Pinaypy Mar 23 '20 at 19:19

1 Answer


There are a few ways to do this. I should point out that the filter() function you want to call, dplyr::filter(), often conflicts with stats::filter(). For that reason I usually call dplyr::filter() explicitly rather than using filter() alone.
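If you are unsure which filter() a bare call would pick up, one way to check is to look at the namespace the function object belongs to. The snippet below is only a rough illustration; `filter_home` is a hypothetical helper name, not something provided by dplyr or base R.

```r
# Hypothetical helper: report which package a function object comes from.
filter_home <- function(f) environmentName(environment(f))

# The time-series filter that ships with R's stats package
filter_home(stats::filter)

# In your own session, pass the bare name to see what it resolves to.
# With dplyr attached this is usually dplyr's version:
# filter_home(filter)
```

If the bare name does not resolve to dplyr, that would explain unexpected errors or results from an unqualified filter() call.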

Secondly, you can also filter the data with subset(df, ...) inside the data argument of a ggplot2 geom. So the code below should show you what you need:

ggplot(df, aes(YEAR, RANK)) +
    geom_point(data=subset(df, NAME=='Madison' & GENDER=='girl'))
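For the follow-up about turning this into a function with an optional gender, here is a minimal sketch rather than a definitive solution. It assumes the columns are named NAME, GENDER, YEAR and RANK as in the question, and that ggplot2 is installed. The names `filter_names`, `name_plot` and the `toy` data frame are made up for illustration. The idea is to apply the gender condition only when a gender was actually supplied, and to compare lower-cased strings when `ignore.case` is TRUE, so grep or grepl is not strictly needed for exact matches.

```r
# Hypothetical helper: keep rows matching a name and, optionally, a gender.
# Assumes columns NAME and GENDER, as in the question.
filter_names <- function(df, name, gender = NULL, ignore.case = TRUE) {
  # Lower-case both sides when case should be ignored
  norm <- if (ignore.case) tolower else identity
  keep <- norm(df$NAME) == norm(name)
  # Only restrict by gender when one was actually supplied
  if (!is.null(gender) && nzchar(gender)) {
    keep <- keep & norm(df$GENDER) == norm(gender)
  }
  # which() drops NA comparisons instead of returning NA rows
  df[which(keep), , drop = FALSE]
}

# Hypothetical plotting wrapper: rank against year for the matching rows.
name_plot <- function(df, name, gender = NULL, ignore.case = TRUE) {
  matches <- filter_names(df, name, gender, ignore.case)
  # With zero matching rows this should still return a (blank) plot
  ggplot2::ggplot(matches, ggplot2::aes(YEAR, RANK)) +
    ggplot2::geom_point()
}

# Toy data standing in for the real babynames file
toy <- data.frame(
  NAME = c("Madison", "Madison", "John", "madison"),
  GENDER = c("girl", "boy", "boy", "girl"),
  YEAR = c(2000, 2000, 1880, 2001),
  RANK = c(3, 900, 1, 2),
  stringsAsFactors = FALSE
)
```

With your own data you would call something like `name_plot(mydf, "Madison", "girl")`, or `name_plot(mydf, "madison")` to plot both genders for that name.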
chemdork123
  • I tried both of these and still not working :( It would display a plot for the first input, but if I tried a different name or gender, it was still giving me a blank plot. – Pinaypy Mar 23 '20 at 17:25
  • Hard to give better direction without checking the actual data - can you post the dataset? (use R syntax like `data.frame(NAME=c(...), GENDER=c(...),...)` to make it easier to help you). – chemdork123 Mar 23 '20 at 17:27
  • Oh - and note that you would have to change the code I have so that `df` is your actual dataframe name. If it's `mydf`, then use that in both places. – chemdork123 Mar 23 '20 at 17:28
  • I did that, I changed it to my actual data frame which is mydf. The same issues I have with my current code. It will display the first input name, but if I tried something else, it would give me a blank plot. – Pinaypy Mar 23 '20 at 17:31
  • I think I finally figured out how to plot, but how do I put it into a function that will take the name and gender as arguments? – Pinaypy Mar 23 '20 at 18:19
  • Sure - I would update your post to indicate what worked for you to plot it (give the code and the output would be helpful for us and others who find it in the future). Also - is your function taking strings as input or variables? I ask because `aes()` takes variable names as input, but for functions you often pass character strings, so you would use `aes_string()` in the calls to `ggplot`. Check the documentation on `aes_string()` to see if that's what you're looking for. – chemdork123 Mar 23 '20 at 19:37
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/210186/discussion-between-pinaypy-and-chemdork123). – Pinaypy Mar 23 '20 at 19:59