
I have a data set in which each column also carries an attribute that stores data. That is, each column holds row-wise values, and the attribute attached to the column holds an additional value.

I can read the data attached to a column as an attribute using attr(). However, my goal is to capture these attribute values and replicate them as new columns.

Reading Attribute

> attr(data$`column1`, "metadata")$DP.SomeNumber1
"6200"
> attr(data$`column2`, "metadata")$DP.SomeNumber2
"7200"

Input Data

column1 column2
 -0.01   0.05
 -0.01   0.05
 -0.01   0.05
 -0.01   0.05
 -0.01   0.05
 -0.01   0.05
 -0.01   0.05
 -0.01   0.05

Using the code above, I want to append the attribute values as shown below.

Output Data

column1 SomeNumber1 column2 SomeNumber2
 -0.01    6200        0.05     7200
 -0.01    6200        0.05     7200
 -0.01    6200        0.05     7200
 -0.01    6200        0.05     7200
 -0.01    6200        0.05     7200
 -0.01    6200        0.05     7200
 -0.01    6200        0.05     7200
 -0.01    6200        0.05     7200

How can I repeat this for data with more than 1000 columns? Each read requires a call to attr() with a unique column name to capture the attribute data, which then has to be replicated as another, adjacent column.

I am confused about how to loop over all the columns, and how to do so efficiently.

Please share suggestions, thanks.

Chetan Arvind Patil

1 Answer


Unfortunately, you didn't provide a reproducible example, so I created one and hope it fits your problem:

column1 = rep(-0.01, 8)
attr(column1, "metadata")$DP.SomeNumber1 = "6200"
column2 = rep(0.05, 8)
attr(column2, "metadata")$DP.SomeNumber2 = "7200"

data = data.frame(column1, column2)
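
Before extracting anything, it may help to check what the attributes actually look like, since the structure above is only a guess at the real data. The following is a minimal sketch that rebuilds the same example data and collects the "metadata" attribute of every column; the variable name meta is just an illustration.

```r
# Example data, assumed to match the structure described in the question
column1 <- rep(-0.01, 8)
attr(column1, "metadata") <- list(DP.SomeNumber1 = "6200")
column2 <- rep(0.05, 8)
attr(column2, "metadata") <- list(DP.SomeNumber2 = "7200")
data <- data.frame(column1, column2)

# Collect the "metadata" attribute of every column in a named list
# (columns without the attribute yield NULL)
meta <- lapply(data, attr, which = "metadata")
str(meta)

# Number of metadata entries per column
lengths(meta)
```

If some columns report more than one entry, or none at all, the extraction code has to handle those cases explicitly.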

Using lapply you can iterate over the columns of your data frame. For each column, the values stored in its metadata attribute are added as new columns to the original data frame. Here is the code of my solution:

# Extract the "metadata" attribute of a column (given by name) and add each
# of its entries as a new column of the global data frame `data`
attr2col <- function(col) {
  myAttr <- attr(data[[col]], "metadata")
  # loop over the entries so that several attributes per column also work
  for (nm in names(myAttr)) {
    data[[sub("^DP\\.", "", nm)]] <<- myAttr[[nm]]
  }
}

# iterate over the columns of the original data frame
invisible(lapply(names(data), attr2col))
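
The function above modifies the global data frame and appends the new columns at the end. If the new columns should sit next to their source columns, as in the desired output of the question, a side-effect-free variant could look like the sketch below. The helper name expand_metadata and its arguments are hypothetical, and the example data is again an assumption about the real structure.

```r
# Example data, assumed to match the structure described in the question
column1 <- rep(-0.01, 8)
attr(column1, "metadata") <- list(DP.SomeNumber1 = "6200")
column2 <- rep(0.05, 8)
attr(column2, "metadata") <- list(DP.SomeNumber2 = "7200")
data <- data.frame(column1, column2)

# Hypothetical helper: returns a new data frame in which every column is
# followed by one column per entry of its "metadata" attribute
expand_metadata <- function(df, attr_name = "metadata", prefix = "^DP\\.") {
  pieces <- lapply(names(df), function(col) {
    out <- df[col]                      # one-column data frame, keeps the name
    meta <- attr(df[[col]], attr_name)  # NULL if the column has no metadata
    for (nm in names(meta)) {
      # a length-1 value is recycled to the number of rows
      out[[sub(prefix, "", nm)]] <- meta[[nm]]
    }
    out
  })
  # bind the per-column pieces side by side, preserving their order
  do.call(cbind, pieces)
}

result <- expand_metadata(data)
names(result)
```

Because nothing is assigned with <<-, the function can be applied to any data frame, and columns without a metadata attribute simply pass through unchanged. The attribute values stay character strings here; wrap them in as.numeric() if numbers are needed.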
MarkusN
  • .@MarkusN - Thanks. What's `col` in `function(col)`. It has to be passed from `attr2col()`, if I am correct? – Chetan Arvind Patil Sep 15 '17 at 00:35
  • @Chetan Arvind Patil - It's actually the column name, which is passed in by lapply. The long form would be lapply(names(data), function(x) attr2col(x)). – MarkusN Sep 15 '17 at 06:52
  • .@MarkusN - I get `Error: subscript out of bounds` for `myAttr[[names(myData)]]`. In `myAttr`, I do get all the required attributes. Any suggestions please? – Chetan Arvind Patil Sep 15 '17 at 22:35
  • @Chetan Arvind Patil - I assumed a data structure that matches your problem. Apparently your data is different, so you may have to provide an example of it. – MarkusN Sep 16 '17 at 21:45