I am trying to create a vertical data frame from a really large crime database arranged horizontally. Its first 5 columns describe the monthly data which is presented in the following 5:16 columns. It is a yearly database, so underneath the data for a certain year is the data for the next year. What I want to do is arrange all data for a specific crime for a specific state for all years in vertical columns.
Since some rows of data are not complete I have to add an if statement that checks, since otherwise I would have to run the code and solve each problem manually until it worked.
Edit: I am trying to find a way to post the csv file containing the data, yet my stack overflow skills are pretty new. I am adding a link that contains the csv file. https://drive.google.com/file/d/1JU-mYOh68bzSHUF6weT0io1xVuNCgXFd/view?usp=share_link
I created a code like this:
## I added these base modifications to the df
df_15<-read.csv("Incidencia_municipal_2015_2022.csv", fileEncoding = "latin1")
df_15<-df_15%>%select(-colnames(df_15)[c(4,6)])%>%filter(Año<2022)%>%unite(col="Entidad", colnames(df_15)[c(3,5)])%>%unite(col="Tipo",colnames(df_15)[7:9])
df_15$Tipo<-sub(" ", "-", df_15$Tipo)
df_15$Entidad<-sub(" ", "-", df_15$Entidad)
##
tn<-unique(df_15$Tipo)
glos2<-data.frame(seq(1:98),tn)
ts<-rep(seq(1:98), times=16219)
df_15$Tipo<-ts
sn<-unique(df_15$Entidad)
cnt<-0
na<-c()
for(m in 1:length(sn)){
if(m==1){
abs<-matrix(,nrow = 84,ncol=242844)
nmsx<-seq(1:242844)
}
for (i in 1:length(tn)) {
agsx<-filter(df_15, Entidad==sn[m], Tipo==i)%>%select(colnames(df_15)[5:16])
agsx<-c(as.matrix(agsx))
t<-(m-1)*length(tn)+i
if(length(agsx)==nrow(abs)){
abs[,t]<-agsx } else{
cnt<-cnt+1
na[cnt]<-m
}
nmsx[t]<-paste(m,i, sep="_")
}
if(m==2466){
data_2015m<-as_tibble(abs)
colnames(data_2015m)<-nmsx
}
}
The code works. Yet, it takes forever to do its work. Is there an easier way to do this?