1

EDITEDI have a dataset of healthcare providers, if they provide a certain service, the city they are in, and the year. Some city's have multiple providers. I've constructed it as a time series so there is a data point for each provider and its year (either 2021 or 2023), and then a binary variable for if it provided the service in that year. I want to make a table showing the total number of providers by city in 2021 compared to 2023. The name of the dataset is HCProvidersbyCity

Here is what the data looks like:

structure(list(City = c("Alamosa", "Alamosa", "Albany", "Albany", 
"Albany", "Albany", "Albuquerque", "Albuquerque", "Albuquerque", 
"Albuquerque", "Albuquerque", "Albuquerque", "Albuquerque", "Albuquerque", 
"Alexandria", "Alexandria", "Alexandria", "Alexandria", "Alhambra", 
"Alhambra", "Allentown", "Allentown", "Altamonte Springs", "Altamonte Springs", 
"Ames", "Ames", "Amherst", "Amherst", "Amsterdam", "Amsterdam", 
"Anaheim", "Anaheim", "Anchorage", "Anchorage", "Ann Arbor", 
"Ann Arbor", "Ann Arbor", "Ann Arbor", "Ann Arbor", "Ann Arbor", 
"Annapolis", "Annapolis", "Antioch", "Antioch", "Arcadia"), Provides_Service = c(1, 
0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1), Year = c(21, 23, 21, 23, 21, 23, 21, 23, 21, 23, 21, 23, 
21, 23, 21, 23, 21, 23, 21, 23, 21, 23, 21, 23, 23, 21, 21, 23, 
21, 23, 21, 23, 21, 23, 21, 23, 21, 23, 21, 23, 21, 23, 21, 21, 
21)), row.names = c(NA, -45L), class = c("tbl_df", "tbl", "data.frame"
))

I tried using the code: table(HCProvidersbyCity$City,HCProvidersbyCity$Year,HCProvidersbyCity$Provides_Service)

And I get this: table results

However, the table shows results that don't match the actual data. For example it will say a city like Chicago had 0 providers in both the years 2021 and 2023. I'll post an image of the results I get, but essentially most of the data is turning up as zero. For example, Ann arbor had 2 providers in 2021 and in 2 2023 as well, but is showing a 0 or 2021. Annapolis had 1 providers in 2021 and 1 provider in 2023, but has 0 for both.

I want a table where the City is the right column label, year is the row label, and each year has the number of providers the city had in that year. I did the same thing with Healthcare providers and states, and it worked. I'm looking for something like this, where each city only appears once. state results

I want my data to look like this: snippet of what i want my table to look like

I tried: t(table(HCProvidersbyCity[HCProvidersbyCity$Provides_Service == 1,])[,1,])

and it gave me this: !results from tt(table(HCProvidersbyCity[HCProvidersbyCity$Provides_Service == 1,])[,1,])]4

Abby
  • 11
  • 2
  • Please provide the output of `dput(AllPovidersforCity)` as part of your question to make your issue reproducible. – Phil Jul 13 '23 at 14:54
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jul 13 '23 at 14:55
  • The results you are showing don't seem to match up to the code you are showing. That's not what the output of `table` should look like. What does the actual `AllPovidersforCity` data.frame look like? Use `dput(AllPovidersforCity)` to share data in the question. – MrFlick Jul 13 '23 at 15:09
  • the output I showed is what appears in the console, do I use dput here on overstack? Sorry I'm new to this. – Abby Jul 13 '23 at 15:13
  • Yes. Assuming the data is not huge, then run `dput(AllPovidersforCity)`, copy the output on the console, [edit] your question here, and paste the output into a [code block]. If the data is too big, you can subset it (`dput(AllPovidersforCity[1:10,1:10])`, for instance), though please make sure you get enough rows/columns and adjust your desired output to reflect that sampled data. Thanks! – r2evans Jul 13 '23 at 15:21
  • 1
    @r2evans will do, am working on that right now, thank you!! – Abby Jul 13 '23 at 15:34
  • Could you also include a snippet of how your expected result shall look like? – Jan Jul 13 '23 at 16:22
  • @Jan thanks for the advice, I've done that! – Abby Jul 13 '23 at 16:28

1 Answers1

0

Could this be what you're looking for?

t(table(HCProvidersbyCity[HCProvidersbyCity$Provides_Service == 1,])[,1,])

where HCProvidersbyCity is the structure you provided.

pjheink
  • 306
  • 2
  • 5
  • That gives me a table where instead of there being a column for the year 2021 and 2023, with values showing the number of providers in each, there is table with a column labeled year and filled with 2021 or 2023. Essentially I want to collapse the time series data. but it does give me the information I need? Its helpful but not quite what i'm looking for. – Abby Jul 13 '23 at 16:13
  • It sounds like you want the original table described in this answer (and not the transpose) so just remove the transpose function ("t()") from my answer above. This will give you a table with two columns (one for each year) and 16 rows (one for each city). – pjheink Jul 13 '23 at 17:28
  • THANK YOU SO MUCH !!!! – Abby Jul 13 '23 at 19:25