0

I have some data from which I can create a table in the RStudio console using

#Open latest file
#get a vector of all filenames 
files <- list.files(path="MYFILEPATH",pattern="files.xls",full.names = TRUE,recursive = TRUE)

#get the directory names of these (for grouping)
dirs <- dirname(files)

#find the last file in each directory (i.e. latest modified time)
lastfiles <- tapply(files,dirs,function(v) v[which.max(file.mtime(v))])

File_List <- file.info(list.files("//MYFILEPATH", full.names = T))

Path_Of_Interest <- rownames(File_List)[which.max(File_List$mtime)]

library(readxl)
MyData <- read_excel(Path_Of_Interest,
    col_types = c("numeric", "skip", "skip", 
        "skip", "skip", "numeric", "skip", 
        "skip", "skip", "skip", "skip", "skip", 
        "skip", "skip", "text", "text", "skip", 
        "skip", "skip", "skip", "skip", "skip", 
        "skip", "skip", "skip", "skip", "date", 
        "date", "skip", "skip", "skip", "skip", 
        "skip", "skip", "skip", "skip", "skip", 
        "skip", "skip", "skip", "skip", "skip", 
        "skip", "skip", "skip", "skip"))

#Rename columns
MyData <- MyData %>%
rename(Age = "Age (Years)", Area = "Preferred Area", SeniorArea = "Preferred Senior Area", SaleDate = "Sale Date", ReturnDate = "Return Date")


#Group Age Column

MyData["Age"] = cut(MyData$Age, c(0, 17, Inf), c("0-17", "18+"), include.lowest=TRUE)

#Filter to most recent data only
MyData %>%
select(Age, Area, SaleDate) %>%
filter(SaleDate >= as.Date("2021-10-25") & SaleDate <= as.Date("2021-10-31"))

#Create table
table(MyData$Area, MyData$Age)

Raw data example here:

ID Area Age
1 AreaA 0-17
2 AreaA 0-17
3 AreaB 18+
4 AreaB 18+
5 AreaB 0-17

The output in the console looks like how I would like it to look, similar to below:

Area 0-17 18+
AreaA 310 405
AreaB 210 401

However when I try to create an object out of this table (in order to merge it with another later) and then view the object I get a table like the below (not all rows included but you can see the column names)

Var1 Var2 Freq
AreaA 0-17 310
AreaB 0-17 210

Can anyone please advise how to create the object to match format similar to the first table above for use in other things?

Basically I just need to be able to create an object which is the equivalent of an Excel pivot table with 'Area' for columns and count of 'Age' for rows.

Thank you!

  • Can you provide the code that you used to create the table object? A full minimum reproducible example would be best. – Harrison Jones Nov 02 '21 at 12:44
  • group and summarise the original data. Provide data-sample for exact answer... meanwhile: https://stackoverflow.com/questions/9723208/aggregate-summarize-multiple-variables-per-group-e-g-sum-mean – Wimpel Nov 02 '21 at 12:47
  • I cannot reproduce your issue. With `tb <- table(Data$Area, Data$Age)`, I then `rbind(tb, tb)` and get four rows (`Area*`) and two columns (`"0-17"` and `"18+"`). Please show the code that is giving you problems, as it is not obvious. – r2evans Nov 02 '21 at 13:44
  • My guess is that your mention of *"merge"* means that your `tb` is being converted to a frame, which means that the dim-names are placed in two columns and the third column is the number. The reason? the results of `table` and `xtabs` can include 1, 2, or more variables, which will be displayed on your console potentially in 3 or more dimensions (e.g., `xtabs(~ cyl + gear + vs, data = mtcars)`); when you want to do something frame-like with that, there needs to be a consistent way to convert `n`-dim to a 2d frame, so using columns for each dim-axis is a (only) sane way to go. – r2evans Nov 02 '21 at 13:47
  • Thank you, I have added full code. Basically the code pulls in the most recent data file from a folder, then filters to my choice of date and provides row data which has 'Area' and 'Age' (age cut by 0-17 and 18+. What I'm looking to do is a pivot table, Areas as column and age as rows, a la an Excel pivot table? Thanks. – JeffWithpetersen Nov 02 '21 at 14:24

0 Answers0