
I am working with the Divvy bike trip data from Chicago to complete the case study in the Coursera Data Analytics Professional Certificate. I want to sort the stations in descending order to find the 10 most popular ones. At first I tried to arrange the data directly, but because there are missing values, it returned an error.

I changed my code to the following:

top_ten <- all_trips %>% na.omit() %>% arrange(desc(start_station_name)) %>% head(n = 10)

Then this error was returned:

Error in exists(cacheKey, where = .rs.WorkingDataEnv, inherits = FALSE) : invalid first argument
Error in assign(cacheKey, frame, .rs.CachedDataEnv) : attempt to use zero-length variable name

I am a beginner in R. Can someone help me? Let me know if any additional information is needed. Thanks in advance!

  • Please provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). What does the data look like? – isaid-hi Jul 06 '23 at 10:05
  • That error is probably not (directly) related to your problem, it's a known RStudio bug that should now be fixed in Daily builds: https://github.com/rstudio/rstudio/issues/13188 . `dplyr::arrange()` should have no issues with missing values, arranging your dataset by station name on the other hand would probably not provide much insight about station popularity. `na.omit()` removes all incomplete cases (**all** rows with **any** NA-s), are you sure this is what you want? – margusl Jul 06 '23 at 10:22
  • Sample rows:
    ride_id | rideable_type | started_at | ended_at | start_station_name | start_station_id
    1 6842AA60… | electric_bike | 2023-03-16 08:20:34 | 2023-03-16 08:22:52 | Clark St & Armita… | 13146
    2 F984267A… | electric_bike | 2023-03-04 14:07:06 | 2023-03-04 14:15:31 | Public Rack - Ked… | 491
    3 FF7CF57C… | classic_bike | 2023-03-31 12:28:09 | 2023-03-31 12:38:47 | Orleans St & Ches… | 620
    Is this enough, @isaid-hi? – Gustivan Pangestu Jul 06 '23 at 10:51
  • It turns out you are right @margusl, `dplyr::arrange()` does not have issues with missing values. However, can you further explain the part where you said this does not provide much insight about station popularity? Thanks! – Gustivan Pangestu Jul 06 '23 at 11:03
  • You appear to have arranged by station name alphabetically, so your 'top 10' may well just be the same station ten times. If you want to assess the number of times each station is chosen, additional data manipulation is needed (such as `table(all_trips$start_station_name)`). – Paul Stafford Allen Jul 06 '23 at 11:36
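
Following the comments above, one way to rank stations by popularity is to count trips per start station instead of sorting the names. This is a minimal sketch, assuming `dplyr` is installed. The toy `all_trips` tibble, its station names, and the `n_trips` column name are made up for illustration; only `start_station_name` comes from the question.

```r
library(dplyr)

# Hypothetical stand-in for the real all_trips data frame
all_trips <- tibble(
  ride_id = c("r1", "r2", "r3", "r4", "r5", "r6", "r7"),
  start_station_name = c("Station A", "Station B", NA, "Station A",
                         "Station C", "Station B", "Station A")
)

top_ten <- all_trips %>%
  # Drop only rows where the station name itself is missing
  filter(!is.na(start_station_name)) %>%
  # Count trips per station and sort by that count, largest first
  count(start_station_name, sort = TRUE, name = "n_trips") %>%
  # Keep at most the ten busiest stations
  head(n = 10)

print(top_ten)
```

Filtering on `start_station_name` alone keeps rows that have missing values in other columns, whereas `na.omit()` would drop every incomplete row before counting.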
