
When I add a new column to a data frame, the column apparently does not get the name I set. Instead, the name becomes a combination separated by a $ sign.

This is my code, with one example calculation that I run:

for(i in ScenarioRuns){
  if(i == ScenarioRuns[1]){
    print(paste0("Scenario ", i, " so nothing done!"))
  }
  else{ 
    #define all values to be used for column selection
    toMatch <- c("col1","CropType","col5","Base__HCON","Base__PROD",grep(paste0(i,"__"),names(CAPRI_PROD_agg),value=TRUE))
    #filter all necessary columns to temporary df
    a <- CAPRI_PROD_agg[,toMatch]
    head(a)
    #now we can calculate the difference from scenario i to the baseline scenario
    #1st we select the Consumer Prices for scenario i and subtract the respective values for the base scenario
    a$HCON_chg = (a[,(paste0(i,"__HCON"))] - a[,"Base__HCON"])/1
       
    #now we need to select all new diff columns:
    newCols <- grep("chg",names(a),value=TRUE)
    #we can now rename all selected columns and add the scenario information
    for (n in newCols) {
      # for each n (=old column name), we set the new column name to the scenario name + the old column name
      colnames(a)[colnames(a) == n] = paste0(i,"__",n)
    }
    # we have created the new difference calculations which can now be appended to the original data frame:
    CAPRI_PROD_agg <-  merge(CAPRI_PROD_agg,a)
    
    print(paste0("For Scenario ", i, " absolute and percentage difference to base was calculated and stored"))
  }  
}

To give you a simple example, this is the part where the error occurs:

Base__HCON <- c(23, 41, 32,23, 41, 32,23, 41, 32)
UBA_1__HCON <- c(23, 41, 32,23, 41, 32,23, 41, 32)
df <- data.frame(Base__HCON, UBA_1__HCON)
i <- "UBA_1"

df$HCON_chg <-  df[,(paste0(i,"__HCON"))] - df[,"Base__HCON"]

The problem is that, for instance, the column "HCON_chg" is not named as defined. Instead, it is named "HCON_chg$<name of what is behind 'a[,(paste0(i,"__HCON"))]'>".

In the simplified example this is not happening anymore, but I have no clue why it happens in my data frame.
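One way to narrow this down is to check what the single-column selection returns, and whether any column of the result is itself a data frame. The following is only a sketch on made-up values (the numbers and the helper names `sel` and `nested` are not from my real data); the same checks could be run on `a` inside the loop.

```r
# Toy data with hypothetical values, mirroring the simplified example
Base__HCON  <- c(23, 41, 32)
UBA_1__HCON <- c(25, 40, 30)
df <- data.frame(Base__HCON, UBA_1__HCON)
i <- "UBA_1"

# What does the single-column selection return?
sel <- df[, paste0(i, "__HCON")]
class(df)           # class of the container
class(sel)          # class of the selected column
is.data.frame(sel)  # FALSE for a plain data.frame with the default drop = TRUE

# After the assignment, look for columns that are themselves data frames
df$HCON_chg <- sel - df[, "Base__HCON"]
nested <- vapply(df, is.data.frame, logical(1))
names(df)[nested]   # empty here: no nested data-frame columns
```

If `is.data.frame(sel)` were TRUE on the real object, the difference would also be a data frame, which would be worth ruling out.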

Does anyone know why this is happening? My script ran perfectly until today...

Many thanks in advance,

Carlo

Carlo237
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick May 25 '23 at 14:03

1 Answer


I still wonder why this pattern occurs (some columns are assigned correctly, others are not...)

I have changed the assignment from:

a$HCON_chg = a[,paste0(i,"__HCON")] - a[,"Base__HCON"]

to the following approach and it seems to work:

a$HCON_chg <- ""
a[,"HCON_chg"] <-  a[,paste0(i,"__HCON")] - a[,"Base__HCON"]

I still wonder why the first approach sometimes triggers a wrong column naming.
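One possible explanation, which I have not confirmed on my real data, is that the single-column selection sometimes stays a data frame instead of dropping to a vector. The difference is then a one-column data frame, and `$<-` stores it as a nested column that some print methods and viewers may label with a `$`. The sketch below imitates that situation with `drop = FALSE` on made-up values; the names `kept`, `diff_df`, `bad` and `good` are only for illustration.

```r
# Toy data with hypothetical values
df <- data.frame(Base__HCON = c(23, 41, 32), UBA_1__HCON = c(25, 40, 30))
i <- "UBA_1"

# drop = FALSE is used here only to imitate a selection that stays a data frame
kept <- df[, paste0(i, "__HCON"), drop = FALSE]
diff_df <- kept - df[, "Base__HCON"]   # still a data frame, column "UBA_1__HCON"

# Assigning a data frame with $<- stores it as a nested data-frame column
bad <- df
bad$HCON_chg <- diff_df
is.data.frame(bad$HCON_chg)  # TRUE
names(bad$HCON_chg)          # "UBA_1__HCON"

# [[ always extracts the column itself, so the result is a plain numeric vector
good <- df
good$HCON_chg <- df[[paste0(i, "__HCON")]] - df[["Base__HCON"]]
is.data.frame(good$HCON_chg) # FALSE
```

If this is indeed the cause, selecting with `[[` should avoid the problem regardless of how `[` behaves on the object.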

Carlo237