
I have a matrix Q_hyda with 2 columns and n rows:

       [1]     [2]
[1]   1950    0.265
[2]   1950    0.176
[3]   1950    0.873
 .     ...     ...
[60]  1951   0.534
[61]  1951   0.142
 .     .        .
 .     .        .
 .     .        .
 .     .        .
[n]   2014    0.152

What I want to get is a matrix mat_HQa of this type:

      [1950]    [1951]    [1952] ... [2014]
[1]   0.265     0.534      ...       0.152
[2]   0.176     0.142      ...         ...
[3]   0.873      ...       ...         ...
 .     ...       ...       ...         ...
 .     ...       ...       ...         ...
 .     ...       ...       ...         ...
[n]    ...       ...       ...         ...

I tried it with some loops:

## Create a matrix mat_HQa with a_n columns (where a_n is the number of different years) and 366 rows

mat_HQa = matrix(0, 366, a_n)
colnames(mat_HQa)=as.vector(R_a) # the vector R_a is a timeline from 1950 to 2014

# fill matrix

for (i in 1:a_n)
  {for (j in 1:n) 
      {if (R_a[i] == Q_hyda[j,1]){mat_HQa[j,i] = Q_hyda[j,2]}}}

It works for the first column, but when it moves to the second column it keeps filling mat_HQa at row j instead of restarting at row 1, and I can't figure out how to start at the top of each column.

I'm very new to programming, since it's not my field. How can I achieve this? I'm sure there's a much easier way to do this. I'd be grateful for any advice.

Suever
Keulizzle
  • What does this have to do with MATLAB? I'm going to remove the tag since it seems irrelevant – Suever May 22 '16 at 16:47
  • I'm sorry if it's wrong. It was a suggested tag, and since R and MATLAB are related I thought 'Why not?'. – Keulizzle May 22 '16 at 16:50
  • [This post](http://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix) should be helpful; here, something like `xtabs(Q_hyda[, 2] ~ seq_len(nrow(Q_hyda)) + Q_hyda[, 1])` – alexis_laz May 22 '16 at 16:54
  • If I do so, I get this error message: `Error in matrix(Q_hyda[, 2], nrows = 65) : unused argument (nrows = 65)` – Keulizzle May 22 '16 at 16:57
  • Do `?matrix` and you will notice the argument is `nrow` and not `nrows`. :) – Gopala May 22 '16 at 16:59
  • I would convert the matrix into a data frame and use `spread` from the `tidyr` package or use the `reshape2` package. – Dave2e May 22 '16 at 17:27

3 Answers


Simple way using reshape2, which involves first putting your matrix into a data.frame:

Q_hyda <- matrix(c(1950, 1950, 1950, 1951, 1951, 2014,
                .265, .176, .873, .534, .142, .152),
              ncol = 2)
df <- as.data.frame(Q_hyda)
names(df) <- c("year", "val")
# give each row an ID within its year; ave() also works when the rows are
# unsorted or when every year has the same number of observations
df$obs <- ave(seq_along(df$year), df$year, FUN = seq_along)
df
#   year   val obs
# 1 1950 0.265   1
# 2 1950 0.176   2
# 3 1950 0.873   3
# 4 1951 0.534   1
# 5 1951 0.142   2
# 6 2014 0.152   1

Now we apply reshape2:

require(reshape2)
dfm <- melt(df, id.vars = c("obs", "year"), value.name = "val")
dfc <- dcast(dfm, obs ~ year, mean, value.var = "val")
dfc
#   obs  1950  1951  2014
# 1   1 0.265 0.534 0.152
# 2   2 0.176 0.142   NaN
# 3   3 0.873   NaN   NaN

A data.frame is usually more convenient than a matrix for subsequent manipulation, but if you really want a matrix, you can coerce it to one using:

mat_HQa <- as.matrix(dfc[, -1])
mat_HQa
#       1950  1951  2014
# [1,] 0.265 0.534 0.152
# [2,] 0.176 0.142   NaN
# [3,] 0.873   NaN   NaN
Ken Benoit

Here is a solution using the 'tidyr' package:

> col1 <- rep(1950:2014, each = 59)
> col2 <- runif(length(col1))
> # add 'sample' as first column for the new row name
> Q_hyda <- data.frame(sample = 1:59, year = col1, value = col2)
> library(tidyr)  # does it all for you
> 
> new_data <- spread(Q_hyda, year, value)
> 
> # small sample of data
> new_data[1:6, 1:4]
  sample       1950       1951       1952
1      1 0.59867896 0.68813505 0.06603773
2      2 0.94072166 0.04474356 0.04468876
3      3 0.78878882 0.55344089 0.40102737
4      4 0.01339499 0.54489195 0.11938488
5      5 0.49914844 0.18922653 0.52316301
6      6 0.49786329 0.79751386 0.95561927
> 
> View(new_data)
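
In current versions of tidyr, `spread()` is superseded by `pivot_wider()`. A minimal sketch of the same reshape, assuming a reasonably recent tidyr is installed; the small data frame below is made up for illustration and is not the data from the question:

```r
library(tidyr)

# made-up data: 3 years with 4 observations each
Q_hyda <- data.frame(sample = rep(1:4, times = 3),
                     year   = rep(1950:1952, each = 4),
                     value  = (1:12) / 10)

# one row per sample, one column per year
new_data <- pivot_wider(Q_hyda, names_from = year, values_from = value)
new_data
```

As with `spread()`, the `sample` column is what tells tidyr which row each value belongs in.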
0

Make up the data.

col1 <- rep(1950:2014, each = 59)
col2 <- runif(length(col1))
Q_hyda <- cbind(col1, col2)

This has column names, but it is still a matrix. Let's try the suggested solutions in order. First up is the suggestion from @ZheyuanLi:

mat_HQa <-  matrix(Q_hyda[, 2], ncol = 65); colnames(mat_HQa) <- 1950:2014
dim(mat_HQa)

## [1] 59 65

mat_HQa[1:5,1:3]

##           1950      1951       1952
## [1,] 0.5227552 0.3105570 0.33501591
## [2,] 0.4236526 0.7158999 0.04454956
## [3,] 0.8187411 0.1406177 0.02497711
## [4,] 0.5537462 0.6366948 0.92567469
## [5,] 0.2602161 0.7634615 0.85745645

This works, although it assumes you have identical numbers of observations per year. It's nice and direct, and doesn't have to convert to a data.frame.
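
If the years do not all have the same number of observations, a base R variant that still avoids a data.frame could look like the sketch below. It is only an illustration: the toy values are taken from the question, and the helper names `years` and `obs` are my own.

```r
# toy data with unequal numbers of observations per year
Q_hyda <- cbind(year  = c(1950, 1950, 1950, 1951, 1951, 2014),
                value = c(0.265, 0.176, 0.873, 0.534, 0.142, 0.152))

years <- sort(unique(Q_hyda[, "year"]))
# position of each row within its year: 1, 2, 3, 1, 2, 1
obs <- ave(seq_len(nrow(Q_hyda)), Q_hyda[, "year"], FUN = seq_along)

# start from NA so that shorter years stay visibly empty
mat_HQa <- matrix(NA_real_, nrow = max(obs), ncol = length(years),
                  dimnames = list(NULL, as.character(years)))

# a two-column (row, column) index matrix fills all cells in one step
mat_HQa[cbind(obs, match(Q_hyda[, "year"], years))] <- Q_hyda[, "value"]
mat_HQa
```

The within-year counter `obs` restarts at 1 for every year, which is the piece the loop in the question is missing: there the row index `j` keeps running over the whole input instead of starting again at the top of each column.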

Next is the suggestion of @alexis_laz to use xtabs():

mat_HQa <- xtabs(Q_hyda[, 2] ~ seq_len(nrow(Q_hyda)) + Q_hyda[, 1])
dim(mat_HQa)

## [1] 3835   65

mat_HQa[1:5,1:3]

##                      Q_hyda[, 1]
## seq_len(nrow(Q_hyda))      1950      1951      1952
##                     1 0.5227552 0.0000000 0.0000000
##                     2 0.4236526 0.0000000 0.0000000
##                     3 0.8187411 0.0000000 0.0000000
##                     4 0.5537462 0.0000000 0.0000000
##                     5 0.2602161 0.0000000 0.0000000

This is not the result we want: every observation gets its own row, so the matrix has 3835 rows and each column is mostly zeros. To make this work we need a 3rd variable that identifies which row the result should go in.

Q_hyda <- cbind(Q_hyda, rep(1:59, times = 65))
mat_HQa <- xtabs(Q_hyda[, 2] ~ Q_hyda[,3] + Q_hyda[, 1])
dim(mat_HQa)

## [1] 59 65

mat_HQa[1:5,1:3]

##            Q_hyda[, 1]
## Q_hyda[, 3]       1950       1951       1952
##           1 0.52275520 0.31055703 0.33501591
##           2 0.42365262 0.71589995 0.04454956
##           3 0.81874106 0.14061770 0.02497711
##           4 0.55374618 0.63669482 0.92567469
##           5 0.26021608 0.76346147 0.85745645

This is also what we want, but the result now has class xtabs, which inherits from table, not matrix. We can coerce it back to a matrix, but we have to remember to do that!

mat_HQa <- as.matrix(mat_HQa)
mat_HQa[1:5, 1:3] # looks fine

##            Q_hyda[, 1]
## Q_hyda[, 3]       1950       1951       1952
##           1 0.52275520 0.31055703 0.33501591
##           2 0.42365262 0.71589995 0.04454956
##           3 0.81874106 0.14061770 0.02497711
##           4 0.55374618 0.63669482 0.92567469
##           5 0.26021608 0.76346147 0.85745645

class(mat_HQa) # still not a matrix!

## [1] "xtabs" "table"

Or maybe we can't: as.matrix() leaves the class unchanged. So I'm not that impressed with this solution. Not having a plain matrix is probably no problem, but you never know.
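
If a plain matrix is really needed, one option (a sketch, not something taken from the comments above) is to strip the class with `unclass()` and drop the `call` attribute that `xtabs()` attaches. The small vectors below are made up for illustration:

```r
set.seed(1)
year  <- rep(1950:1952, each = 4)
value <- runif(length(year))
obs   <- rep(1:4, times = 3)

tab <- xtabs(value ~ obs + year)
inherits(as.matrix(tab), "table")  # as.matrix() keeps the table class

m <- unclass(tab)        # drop the "xtabs"/"table" classes
attr(m, "call") <- NULL  # remove the stored xtabs() call
is.matrix(m)
inherits(m, "table")
```

The dimnames, including their `obs` and `year` labels, are kept, so the result still prints with the year columns.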

Once we've added that extra column the problem is now in the form from this question, and all the solutions there would apply once you convert to a data.frame. That includes the answers using reshape2 or tidyr suggested by @Dave2e.

atiretoo