I'm a novice R user learning the language to deal with data problems in research. I am trying to understand how knowledge evolves within an industry by looking at patenting in subclasses. So far I have managed to get the following:

# kn.matrices<-with(patents, table(Class,year,firm))
# kn.ind <- with(patents, table(Class, year))

patents is my data file, with Subclass, app.yr, and short.name as three of its 14 columns.

# for (k in 1:37)  
# kn.firms = assign(paste("firm", k ,sep=''),kn.matrices[,,k]) 

There are 37 different firms (in the real dataset; here only 5).

This gives 37 firm-specific matrices and 1 industry-specific matrix, each 2635 by 29 (in the real dataset). The firm-specific matrices are named firmk, with k running from 1 to 37.

I would like to perform many operations on each of the firm-specific matrices (e.g. compare the numbers in app.yr 't' with the average of the 3 previous years, across all rows). So I am looking for a way to loop the operations over every matrix named firm1, firm2, firm3, ..., firm37 and to generate new matrices with consistent names, e.g. firm1.3yearcomparison.

Hopefully I framed this question in an appropriate way. Any help would be greatly appreciated.

Following the comments, I'm trying to add a minimal reproducible example:

year<-c(1990,1991,1989,1992,1993,1991,1990,1990,1989,1993,1991,1992,1991,1991,1991,1990,1989,1991,1992,1992,1991,1993)

firm<-(c("a","a","a","b","b","c","d","d","e","a","b","c","c","e","a","b","b","e","e","e","d","e"))

class<-c(1900,2000,3000,7710,18000,19000,36000,115000,212000,215000,253600,383000,471000,594000)

These three vectors thus represent columns in a spreadsheet that forms the "patents" matrix mentioned before.

SJDS
  • You should not assign this to 37 matrices, but rather use `apply` on `kn.matrices`. – Roland Mar 04 '14 at 12:23
  • Hi Roland, could you please be a bit more specific about how I should use apply and how this would solve my problem? Also, if I store all 37 firm-specific matrices in a single very long matrix (97125 x 29), how would I still know where firm 1 ends and firm 2 begins? – SJDS Mar 04 '14 at 12:58
  • I cannot be more specific until you produce a [reproducible example](http://stackoverflow.com/a/5963610/1412059). However, as I said either use `apply` directly on the `table` data structure or use `aggregate`, `ddply` or other split-apply-combine functions on `as.data.frame(kn.matrices)`. It is generally not necessary and bad practice to pollute the global workspace with many objects. – Roland Mar 04 '14 at 13:56
  • Firm  Year  Class   Fw.Cit
    a     1989  7900    18
    a     1990  18000   9
    b     1991  212000  18
    b     1991  253600  7
    c     1991  7710    4
    c     1991  212000  35
    d     1993  215000  10
    d     1989  286000  7
    e     1989  19000   6
    e     1990  653000  26
    d     1990  119000  7
    c     1990  210190  2
    v     1992  217030  20
    b     1992  566000  10
    a     1991  249000  19
    a     1992  232800  5
    a     1993  425000  107
    a     1990  594000  3
    a     1990  36000   6
    b     1990  36000   32
    b     1992  36000   61
    b     1992  7900    63
    c     1991  18000   5
    c     1993  212000  0
    d     1989  253600  1
    e     1989  7710    53
    e     1990  7900    44
    d     1992  18000   10
    b     1991  212000  26
    c     1990  253600  8
    – SJDS Mar 04 '14 at 14:21
  • If you want to provide additional information, please edit your question. – Roland Mar 04 '14 at 14:22
  • @Roland, I can't make a reproducible example (see my attempt above), and I can't seem to upload an Excel file. In essence, there are 3 columns: one with names, one with years, and one with numbers (subclasses). Each name, year, and number occurs multiple times. When I use apply as you suggested, I am forced to select a function, which alters the data and changes the original structure. I can't work with that because I (think I) need to keep the (subclass x year) structure and the original dimensions of the matrices. I can only think of this in spreadsheet form, which might be the problem. – SJDS Mar 04 '14 at 14:56
  • Everyone can make a reproducible example. See countless other questions here. At least the good ones include a reproducible example. – Roland Mar 04 '14 at 15:15
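
The split-apply-combine route mentioned in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the `patents` data frame here is hypothetical toy data (not the asker's file), the column names `Class`, `year`, and `firm` follow the code in the question, and `kn.long`, `firm.year`, and `by.firm` are made-up names. It shows that in long format the `firm` column itself records where one firm ends and the next begins.

```r
# Hypothetical toy data; column names follow the code in the question
patents <- data.frame(
  Class = c(1900, 1900, 2000, 2000, 2000, 1900),
  year  = c(1990, 1991, 1990, 1990, 1991, 1991),
  firm  = c("a",  "a",  "a",  "b",  "b",  "b")
)

# Class x year x firm array of patent counts, as in the question
kn.matrices <- with(patents, table(Class, year, firm))

# Long format: one row per Class/year/firm combination, counts in `Freq`
kn.long <- as.data.frame(kn.matrices)

# Example summary: total patents per firm and year
firm.year <- aggregate(Freq ~ firm + year, data = kn.long, FUN = sum)

# If per-firm pieces are needed, a named list avoids creating many objects
by.firm <- split(kn.long, kn.long$firm)
```

Here `by.firm[["a"]]` holds the rows for firm "a" only, so nothing has to be assigned to separately named global objects.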

1 Answer

It looks like you already have a 3-dimensional array with all your data. You can view it as your 37 firm matrices stacked one on top of the other. You don't need to split it into separate matrices and loop over them. Instead, you can use R's apply function together with array indexing. The help page for apply (?apply) should show you how to do what you want. Here are a few basic examples:

# returns the sums of all columns for all matrices
apply(kn.matrices, 3, colSums)

# extract the 5th row of all matrices
kn.matrices[5, , ]

# extract the 5th column of all matrices
kn.matrices[, 5, ]

# extract the 5th matrix
kn.matrices[, , 5]

# mean of 5th column for all matrices
colMeans(kn.matrices[, 5, ])
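
For the comparison described in the question (year t against the average of the 3 previous years), one possible sketch is shown below. It is not a definitive implementation: the toy `patents` data is hypothetical, `compare3` and `comparison` are made-up names, and the code assumes the year columns of the table are consecutive years with no gaps. The point is that the result keeps the same Class x year x firm dimensions as `kn.matrices`.

```r
# Hypothetical toy data standing in for `patents` (not the real file)
patents <- data.frame(
  Class = c(1900, 1900, 1900, 1900, 1900, 2000, 2000, 2000, 2000, 1900),
  year  = c(1989, 1990, 1990, 1991, 1992, 1989, 1991, 1992, 1992, 1992),
  firm  = c("a",  "a",  "a",  "a",  "a",  "b",  "b",  "b",  "b",  "b")
)

# Class x year x firm array of patent counts, as in the question
kn.matrices <- with(patents, table(Class, year, firm))

# For one Class x year matrix: count in year t minus the mean of the counts
# in years t-1, t-2 and t-3. The first three years lack a full history,
# so they stay NA.
compare3 <- function(m) {
  out <- matrix(NA_real_, nrow(m), ncol(m), dimnames = dimnames(m))
  if (ncol(m) > 3) {
    for (t in 4:ncol(m)) {
      out[, t] <- m[, t] - rowMeans(m[, (t - 3):(t - 1), drop = FALSE])
    }
  }
  out
}

# apply() flattens each returned matrix into one column per firm,
# so reshape the result back to the original dimensions
res <- apply(kn.matrices, 3, compare3)
comparison <- array(res, dim = dim(kn.matrices),
                    dimnames = dimnames(kn.matrices))
```

The comparison matrix for a single firm is then `comparison[, , "a"]` (or `comparison[, , 1]` by position), which plays the role of something like firm1.3yearcomparison without creating a separate object per firm.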