0

I'm basically taking a non normal dataset and converting it into a dataset that I can load into a SQL Server table. Using the example code below, is there a more efficient way to do this without having to explicity list the row indices of "ASRN1" or the dataframes I want spread, merge and bind? I have hundred of datasets i have to loop through, and some might have 3 sets of asrn1, service, and OCR, while others may have 30 sets of asrn1, service and ocr.

   Columns<-c("SERIVCE ORDER", "SERVICE ORDER DATE", "ASRN1", "SERVICE","OCR","ASRN1","SERVICE","OCR", "ASRN1", "SERVICE", "OCR", "COMMENTS")
Values<-c("peanuts", "06/09/2020","1111", "abcd","xxxx", "2222", "efgh", "yyyy", "3333", "ijkl", "zzzz", "zippitydoda" )

 df <- data.frame(Columns, Values)
  a = which(df$Columns == "ASRN1",arr.ind=FALSE, useNames = TRUE)[1]
  b = which(df$Columns == "ASRN1",arr.ind=FALSE, useNames = TRUE)[2]
  c = which(df$Columns == "ASRN1",arr.ind=FALSE, useNames = TRUE)[3]



 dfa<-spread(unique(df[0:(a-1),]),Columns,Values)
 dfb<-spread(df[a:(b-1),],Columns, Values)
 dfc<-spread(df[b:(c-1),],Columns,Values)
 dfe<-spread(tail(df,-c+1),Columns,Values)

 dff<-merge(dfa,dfb)
 dfg<-merge(dfa,dfc)
 dfh<-merge(dfa,dfe)


 dfj<-dplyr::bind_rows(dff, dfg,dfh)
Sean E
  • 3
  • 2
  • 1
    Can you show a small reproducible example – akrun Jun 09 '20 at 20:50
  • 1
    Hi Sean, welcome to Stack Overflow. It will be much easier to help if you provide at least a sample of your data with `dput(df3)` or if your data is very large `dput(df3[1:20,])`. You can [edit] your question and paste the output. You can surround it with three backticks (```) for better formatting. See [How to make a reproducible example](https://stackoverflow.com/questions/5963269/) for more info. – Ian Campbell Jun 09 '20 at 21:35
  • Please read Ian's comment again on how to share data. How do we use data from an image? – Ronak Shah Jun 10 '20 at 01:25

1 Answers1

0

Consider by to subset data frame by Columns subsets and then build a list of vectors to call cbind at end. This assumes repetition is the same for multiple values and all others values appear once.

# BUILD LIST OF VECTORS
vec_list <- by(df, df$Columns, function(sub) {
    # RENAME COLUMNS
    tmp <- setNames(sub, c("Columns", as.character(sub$Columns[1])))
    # REMOVE FIRST COLUMN
    tmp <- transform(tmp, Columns = NULL)
})

# CBIND ALL DF ELEMENTS
final_df <- do.call(cbind.data.frame, vec_list)
final_df
#   ASRN1    COMMENTS  OCR SERIVCE ORDER SERVICE SERVICE ORDER DATE
# 1  1111 zippitydoda xxxx       peanuts    abcd         06/09/2020
# 2  2222 zippitydoda yyyy       peanuts    efgh         06/09/2020
# 3  3333 zippitydoda zzzz       peanuts    ijkl         06/09/2020
Parfait
  • 104,375
  • 17
  • 94
  • 125