
I have the data.frame given below and am trying to reshape it from long to wide format, spreading on the date column. Using the spread function from the tidyr package presents a two-fold problem:

  • The data is filled with NA
  • The months get ordered by alphabetic order

So how do I go from

30-Apr-2015 632.95
28-May-2015 532.95
25-Jun-2015 232.95

to

30-Apr-2015 28-May-2015 25-Jun-2015
632.95      532.95      232.95

Instead, I end up with

30-Apr-2015 25-Jun-2015 28-May-2015 
632.95      NA      232.95
NA          232.95  NA
NA          NA      532.95

The actual dates don't matter, but their relative ordering does: the nearest month's data should go in the first column, followed by the other two months' data in successive order. This is necessary because I'm using rbind on the result.

The code I've tried:

data = tidyr::spread(data, key = EXPIRY_DT, value = CHG_IN_OI)
colnames(data)[3:5] = c('Month1', 'Month2', 'Month3')

The data.frame is as given below:

data = structure(list(SYMBOL = c("A", "A", "A", "B", "B", "B", "C", 
"C", "C", "D", "D", "D"), EXPIRY_DT = c("30-Apr-2015", "28-May-2015", 
"25-Jun-2015", "30-Apr-2015", "28-May-2015", "25-Jun-2015", "30-Apr-2015", 
"28-May-2015", "25-Jun-2015", "30-Apr-2015", "28-May-2015", "25-Jun-2015"
), OPEN = c(1750, 1789, 0, 1627.5, 1653.3, 0, 632.95, 644.1, 
0, 317.8, 319.5, 0), HIGH = c(1788.05, 1795, 0, 1656.5, 1653.3, 
0, 646.4, 650.5, 0, 324.6, 326.65, 0), LOW = c(1746, 1760, 0, 
1627.5, 1645.45, 0, 629.65, 635, 0, 315.85, 318.4, 0), CLOSE = c(1782.3, 
1791.85, 1695.1, 1642.95, 1646.75, 1613.9, 640.85, 644.35, 614.6, 
320.55, 322.35, 310.85), SETTLE_PR = c(1782.3, 1791.85, 1804.8, 
1642.95, 1653.85, 1664.35, 640.85, 644.35, 649.1, 320.55, 322.35, 
325.35), CONTRACTS = c(1469L, 78L, 0L, 2638L, 14L, 0L, 4964L, 
181L, 0L, 3416L, 82L, 0L), VALUE = c(6496.96, 347.91, 0, 10830.05, 
57.68, 0, 15869.41, 583.38, 0, 10969.31, 264.93, 0), OPEN_INT = c(1353750L, 
8500L, 0L, 1377250L, 17000L, 0L, 6264000L, 98000L, 0L, 8228000L, 
216000L, 0L), CHG_IN_OI = c(15250L, 1250L, 0L, -21000L, 1500L, 
0L, 73500L, 6000L, 0L, -192000L, 13000L, 0L), TIMESTAMP = c("10-APR-2015", 
"10-APR-2015", "10-APR-2015", "10-APR-2015", "10-APR-2015", "10-APR-2015", 
"10-APR-2015", "10-APR-2015", "10-APR-2015", "10-APR-2015", "10-APR-2015", 
"10-APR-2015")), .Names = c("SYMBOL", "EXPIRY_DT", "OPEN", "HIGH", 
"LOW", "CLOSE", "SETTLE_PR", "CONTRACTS", "VALUE", "OPEN_INT", 
"CHG_IN_OI", "TIMESTAMP"), row.names = 40:51, class = "data.frame")

Thanks for reading.

Edit:

After comments from @akrun, I'm adding the expected output. Because the values differ for each date, I need the data for each month placed one after another, with the column names suffixed with 'Month1', 'Month2', 'Month3' instead of the actual date. Hope that helps.

output = structure(list(SYMBOL = c("A", "B", "C", "D"), TIMESTAMP = c("10-Apr-15", 
"10-Apr-15", "10-Apr-15", "10-Apr-15"), OPEN.Month1 = c(1750, 
1627.5, 632.95, 317.8), HIGH.Month1 = c(1788.05, 1656.5, 646.4, 
324.6), LOW.Month1 = c(1746, 1627.5, 629.65, 315.85), CLOSE.Month1 = c(1782.3, 
1642.95, 640.85, 320.55), SETTLE_PR.Month1 = c(1782.3, 1642.95, 
640.85, 320.55), CONTRACTS.Month1 = c(1469L, 2638L, 4964L, 3416L
), VALUE.Month1 = c(6496.96, 10830.05, 15869.41, 10969.31), OPEN_INT.Month1 = c(1353750L, 
1377250L, 6264000L, 8228000L), CHG_IN_OI.Month1 = c(15250L, -21000L, 
73500L, -192000L), OPEN.Month2 = c(1789, 1653.3, 644.1, 319.5
), HIGH.Month2 = c(1795, 1653.3, 650.5, 326.65), LOW.Month2 = c(1760, 
1645.45, 635, 318.4), CLOSE.Month2 = c(1791.85, 1646.75, 644.35, 
322.35), SETTLE_PR.Month2 = c(1791.85, 1653.85, 644.35, 322.35
), CONTRACTS.Month2 = c(78L, 14L, 181L, 82L), VALUE.Month2 = c(347.91, 
57.68, 583.38, 264.93), OPEN_INT.Month2 = c(8500L, 17000L, 98000L, 
216000L), CHG_IN_OI.Month2 = c(1250L, 1500L, 6000L, 13000L), 
    OPEN.Month3 = c(0L, 0L, 0L, 0L), HIGH.Month3 = c(0L, 0L, 
    0L, 0L), LOW.Month3 = c(0L, 0L, 0L, 0L), CLOSE.Month3 = c(1695.1, 
    1613.9, 614.6, 310.85), SETTLE_PR.Month3 = c(1804.8, 1664.35, 
    649.1, 325.35), CONTRACTS.Month3 = c(0L, 0L, 0L, 0L), VALUE.Month3 = c(0L, 
    0L, 0L, 0L), OPEN_INT.Month3 = c(0L, 0L, 0L, 0L), CHG_IN_OI.Month3 = c(0L, 
    0L, 0L, 0L)), .Names = c("SYMBOL", "TIMESTAMP", "OPEN.Month1", 
"HIGH.Month1", "LOW.Month1", "CLOSE.Month1", "SETTLE_PR.Month1", 
"CONTRACTS.Month1", "VALUE.Month1", "OPEN_INT.Month1", "CHG_IN_OI.Month1", 
"OPEN.Month2", "HIGH.Month2", "LOW.Month2", "CLOSE.Month2", "SETTLE_PR.Month2", 
"CONTRACTS.Month2", "VALUE.Month2", "OPEN_INT.Month2", "CHG_IN_OI.Month2", 
"OPEN.Month3", "HIGH.Month3", "LOW.Month3", "CLOSE.Month3", "SETTLE_PR.Month3", 
"CONTRACTS.Month3", "VALUE.Month3", "OPEN_INT.Month3", "CHG_IN_OI.Month3"
), class = "data.frame", row.names = c(NA, -4L))
Frash
  • I think the date column needs to be converted to date class. – akrun Apr 12 '15 at 04:57
  • Regarding the NA part, if you need the date columns and SYMBOL column in the expected output, you can subset those columns, otherwise, the values in other columns are different for other rows to create 'NA` values in the date columns. – akrun Apr 12 '15 at 05:08
  • Yes, you are right, The NA values are because of the differing OHLC values for different dates. Didn't expect the OHLC to be different, guess I need to subset and cbind them back to get row numbers same. – Frash Apr 12 '15 at 05:33
  • I updated with the post with `data.table` method. – akrun Apr 12 '15 at 10:59
  • Accepted your answer. Sorry for the delay. – Frash Apr 12 '15 at 11:20
  • Thanks, no problem. Glad to help you. I understand that it takes some time to create the `output` dataset. – akrun Apr 12 '15 at 11:25

3 Answers

4

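We could use the devel version of data.table, i.e. 'v1.9.5', in which dcast can take multiple "value.var" columns (as far as I know, this has since shipped in the CRAN releases from 1.9.6 onward). Instructions to install the devel version are here.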

Change the 'data.frame' to 'data.table' (setDT(data)). Create a "Month" column by pasting the 'Month' with the row number for each "SYMBOL". Then, we can use dcast, specifying the value.var as the columns '3:11'.

library(data.table)
res <- dcast(setDT(data)[, Month:=paste0('Month', 1:.N), by=SYMBOL],
                 SYMBOL+TIMESTAMP~Month, value.var=names(data)[3:11])

If we need to change the column names to the particular format in the 'output', use setnames. I rearranged the order of the columns as in the expected result ('output') and changed the data.table to a data.frame (setDF).

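# dcast names the new columns 'Month1_OPEN' in the v1.9.5 devel used here;
# later releases appear to use 'OPEN_Month1'. Map either form to 'OPEN.Month1'.
nms <- sub('^(Month[0-9]+)_(.*)$', '\\2.\\1', colnames(res))
nms <- sub('^(.*)_(Month[0-9]+)$', '\\1.\\2', nms)
setnames(res, nms)
res1 <- setDF(res[,names(output), with=FALSE])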
res1
#  SYMBOL   TIMESTAMP OPEN.Month1 HIGH.Month1 LOW.Month1 CLOSE.Month1
#1      A 10-APR-2015     1750.00     1788.05    1746.00      1782.30
#2      B 10-APR-2015     1627.50     1656.50    1627.50      1642.95
#3      C 10-APR-2015      632.95      646.40     629.65       640.85
#4      D 10-APR-2015      317.80      324.60     315.85       320.55
#  SETTLE_PR.Month1 CONTRACTS.Month1 VALUE.Month1 OPEN_INT.Month1
#1          1782.30             1469      6496.96         1353750
#2          1642.95             2638     10830.05         1377250
#3           640.85             4964     15869.41         6264000
#4           320.55             3416     10969.31         8228000
#  CHG_IN_OI.Month1 OPEN.Month2 HIGH.Month2 LOW.Month2 CLOSE.Month2
#1            15250      1789.0     1795.00    1760.00      1791.85
#2           -21000      1653.3     1653.30    1645.45      1646.75
#3            73500       644.1      650.50     635.00       644.35
#4          -192000       319.5      326.65     318.40       322.35
#  SETTLE_PR.Month2 CONTRACTS.Month2 VALUE.Month2 OPEN_INT.Month2
#1          1791.85               78       347.91            8500
#2          1653.85               14        57.68           17000
#3           644.35              181       583.38           98000
#4           322.35               82       264.93          216000
#  CHG_IN_OI.Month2 OPEN.Month3 HIGH.Month3 LOW.Month3 CLOSE.Month3  
#1             1250           0           0          0      1695.10
#2             1500           0           0          0      1613.90
#3             6000           0           0          0       614.60
#4            13000           0           0          0       310.85
#  SETTLE_PR.Month3 CONTRACTS.Month3 VALUE.Month3 OPEN_INT.Month3
#1          1804.80                0            0               0
#2          1664.35                0            0               0
#3           649.10                0            0               0
#4           325.35                0            0               0
#  CHG_IN_OI.Month3
#1                0
#2                0
#3                0
#4                0

The TIMESTAMP column in 'output' is in a different format. After changing the format in 'res1', the result is identical to the expected output.

res1$TIMESTAMP <- format(as.Date(res1$TIMESTAMP, '%d-%b-%Y'), '%d-%b-%y')
all.equal(output, res1)
#[1] TRUE

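Or we can use reshape from base R, which also takes multiple value columns. Just as we created a sequence earlier, here we can use ave to create a 'MONTH' column and use that as timevar within reshape. (Start again from the original data.frame: setDT above converted 'data' to a data.table by reference and added a 'Month' column.)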

 data$MONTH <- with(data, paste0('MONTH', ave(seq_along(SYMBOL), 
                    SYMBOL, FUN=seq_along)))
 res2 <- reshape(data[-2], idvar=c('SYMBOL', 'TIMESTAMP'), 
                          timevar='MONTH', direction='wide')
akrun
  • Why use select? I need all the columns to be there, except that EXPIRY_DT be spread into columns. When I removed the select line, didn't work. Any idea why? – Frash Apr 12 '15 at 05:35
  • Thanks for the months being formatted as dates. I had tried strptime with as.POSIXct earlier, didn't work correctly. – Frash Apr 12 '15 at 05:37
  • @Frash Just updated now. As I mentioned in the comments earlier, if you didn't subset, some rows will have NAs due to the combination. One option would be to subset, then spread, and then `left_join` so that you won't have NA, here the values in the spread dates will fill up for each SYMBOL – akrun Apr 12 '15 at 05:37
  • @Frash Can you check the output of `left_join`? Also, it will be great if you can show the expected output based on the example provided. – akrun Apr 12 '15 at 06:04
  • Thanks, works nicely. Both the answers are great. Any way to make the column names in the format .Month ? Currently it is Month_. – Frash Apr 12 '15 at 11:03
  • @Frash If you check the `res1`, it is in the correct format. – akrun Apr 12 '15 at 11:04
  • Saw your updated reply. The dates doesn't need reformatting. Though didn't understand the use of `res1 <- setDF(res[,names(output), with=FALSE])`. Is it to reorder the columns? – Frash Apr 12 '15 at 11:11
  • @Frash It is just to order the dataset and to check whether the result I got is the same as the one you showed in 'output'. When we have a lot of columns, it is difficult to check column by column. I also updated with `reshape` which gets the column names in the format you wanted without making any extra effort. – akrun Apr 12 '15 at 11:14
2

Extremely tough problem. I've devised a solution that comes very close to your sample output; you should be able to clean up the little discrepancies afterward (see the end of my answer for a summary of discrepancies).

Assumptions

First, let me start with my assumptions:

  • The input data.frame data is already properly ordered with respect to the EXPIRY_DT (independently for each SYMBOL). Your sample input satisfies this assumption. Now, as a general recommendation, you should try to always use ISO 8601 for date formats, which naturally sort lexicographically, and would naturally allow you to coerce to Date format in R. Given your input date formats, if you wanted to guarantee proper order, you would have to call as.Date() and pass the input format, and then make a call to order(). Instead of including this in my code, I've just made the assumption that the data is already ordered.
  • Because your sample output seems to have unified all values of TIMESTAMP for each SYMBOL, I've made the assumption that those two columns comprise a multicolumn primary key to the data. If this is incorrect, you can simply change the keys variable I define in my code to not include TIMESTAMP. But if that is the case, then you will get additional TIMESTAMP.Month{mnum} columns in the output (which you could remove afterward, if desired).
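If the ordering assumption does not hold for your real data, the as.Date() plus order() step mentioned above could look roughly like this. It is only a sketch on a hypothetical toy data.frame (df, expiry and ordered_df are made-up names), and it assumes a locale in which %b matches English month abbreviations:

```r
# Toy data (hypothetical): expiry dates deliberately out of order
df <- data.frame(
  SYMBOL    = c("A", "A", "A"),
  EXPIRY_DT = c("25-Jun-2015", "30-Apr-2015", "28-May-2015"),
  stringsAsFactors = FALSE
)

# Sorting the raw strings compares characters, so it is not chronological
sort(df$EXPIRY_DT)

# Parse with the input format, then order by symbol and parsed date
expiry <- as.Date(df$EXPIRY_DT, format = "%d-%b-%Y")
ordered_df <- df[order(df$SYMBOL, expiry), ]
ordered_df$EXPIRY_DT
```

Running this once on the real data before computing the month numbers would make the rest of the code independent of the input row order.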

Code

keys <- c('SYMBOL','TIMESTAMP');
mnum <- ave(1:nrow(data), data[,keys], FUN=seq_along );
mnum;
##  [1] 1 2 3 1 2 3 1 2 3 1 2 3
mdata <- lapply(1:max(mnum), function(x) setNames(data[mnum==x,],ifelse(names(data)%in%keys,names(data),paste0(names(data),'.Month',x))) );
mdata;
## [[1]]
##    SYMBOL EXPIRY_DT.Month1 OPEN.Month1 HIGH.Month1 LOW.Month1 CLOSE.Month1 SETTLE_PR.Month1 CONTRACTS.Month1 VALUE.Month1 OPEN_INT.Month1 CHG_IN_OI.Month1   TIMESTAMP
## 40      A      30-Apr-2015     1750.00     1788.05    1746.00      1782.30          1782.30             1469      6496.96         1353750            15250 10-APR-2015
## 43      B      30-Apr-2015     1627.50     1656.50    1627.50      1642.95          1642.95             2638     10830.05         1377250           -21000 10-APR-2015
## 46      C      30-Apr-2015      632.95      646.40     629.65       640.85           640.85             4964     15869.41         6264000            73500 10-APR-2015
## 49      D      30-Apr-2015      317.80      324.60     315.85       320.55           320.55             3416     10969.31         8228000          -192000 10-APR-2015
## 
## [[2]]
##    SYMBOL EXPIRY_DT.Month2 OPEN.Month2 HIGH.Month2 LOW.Month2 CLOSE.Month2 SETTLE_PR.Month2 CONTRACTS.Month2 VALUE.Month2 OPEN_INT.Month2 CHG_IN_OI.Month2   TIMESTAMP
## 41      A      28-May-2015      1789.0     1795.00    1760.00      1791.85          1791.85               78       347.91            8500             1250 10-APR-2015
## 44      B      28-May-2015      1653.3     1653.30    1645.45      1646.75          1653.85               14        57.68           17000             1500 10-APR-2015
## 47      C      28-May-2015       644.1      650.50     635.00       644.35           644.35              181       583.38           98000             6000 10-APR-2015
## 50      D      28-May-2015       319.5      326.65     318.40       322.35           322.35               82       264.93          216000            13000 10-APR-2015
## 
## [[3]]
##    SYMBOL EXPIRY_DT.Month3 OPEN.Month3 HIGH.Month3 LOW.Month3 CLOSE.Month3 SETTLE_PR.Month3 CONTRACTS.Month3 VALUE.Month3 OPEN_INT.Month3 CHG_IN_OI.Month3   TIMESTAMP
## 42      A      25-Jun-2015           0           0          0      1695.10          1804.80                0            0               0                0 10-APR-2015
## 45      B      25-Jun-2015           0           0          0      1613.90          1664.35                0            0               0                0 10-APR-2015
## 48      C      25-Jun-2015           0           0          0       614.60           649.10                0            0               0                0 10-APR-2015
## 51      D      25-Jun-2015           0           0          0       310.85           325.35                0            0               0                0 10-APR-2015
## 
res <- Reduce(function(x,y) merge(x,y,by=keys,all=T), mdata );
res;
##   SYMBOL   TIMESTAMP EXPIRY_DT.Month1 OPEN.Month1 HIGH.Month1 LOW.Month1 CLOSE.Month1 SETTLE_PR.Month1 CONTRACTS.Month1 VALUE.Month1 OPEN_INT.Month1 CHG_IN_OI.Month1 EXPIRY_DT.Month2 OPEN.Month2 HIGH.Month2 LOW.Month2 CLOSE.Month2 SETTLE_PR.Month2 CONTRACTS.Month2 VALUE.Month2 OPEN_INT.Month2 CHG_IN_OI.Month2 EXPIRY_DT.Month3 OPEN.Month3 HIGH.Month3 LOW.Month3 CLOSE.Month3 SETTLE_PR.Month3 CONTRACTS.Month3 VALUE.Month3 OPEN_INT.Month3 CHG_IN_OI.Month3
## 1      A 10-APR-2015      30-Apr-2015     1750.00     1788.05    1746.00      1782.30          1782.30             1469      6496.96         1353750            15250      28-May-2015      1789.0     1795.00    1760.00      1791.85          1791.85               78       347.91            8500             1250      25-Jun-2015           0           0          0      1695.10          1804.80                0            0               0                0
## 2      B 10-APR-2015      30-Apr-2015     1627.50     1656.50    1627.50      1642.95          1642.95             2638     10830.05         1377250           -21000      28-May-2015      1653.3     1653.30    1645.45      1646.75          1653.85               14        57.68           17000             1500      25-Jun-2015           0           0          0      1613.90          1664.35                0            0               0                0
## 3      C 10-APR-2015      30-Apr-2015      632.95      646.40     629.65       640.85           640.85             4964     15869.41         6264000            73500      28-May-2015       644.1      650.50     635.00       644.35           644.35              181       583.38           98000             6000      25-Jun-2015           0           0          0       614.60           649.10                0            0               0                0
## 4      D 10-APR-2015      30-Apr-2015      317.80      324.60     315.85       320.55           320.55             3416     10969.31         8228000          -192000      28-May-2015       319.5      326.65     318.40       322.35           322.35               82       264.93          216000            13000      25-Jun-2015           0           0          0       310.85           325.35                0            0               0                0

Explanation

As you can see, the core of my solution involves splitting the input data into separate data.frames by month number, which makes possible adding suffixes to all non-key columns independently for each split, and then repeatedly calling merge() to merge them all together.

The mnum vector stands for "month number". You could consider it to be a kind of "detached" column of the input data object; it represents the month number within the primary key group to which each row in data belongs. I use ave() to call seq_along() once for each group, which generates a sequential integer vector of length equal to the group size (i.e. number of rows in the group), which ave() maps back to the positions of the group rows in the original data object.
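To see what ave() with seq_along() does in isolation, here is a tiny sketch on a made-up vector (symbol is hypothetical, and its groups are interleaved on purpose to show the mapping back to the original positions):

```r
# Hypothetical grouping vector with interleaved groups
symbol <- c("A", "B", "A", "B", "A")

# Number the rows within each group, keeping the original row positions
mnum <- ave(seq_along(symbol), symbol, FUN = seq_along)
mnum
## [1] 1 1 2 2 3
```

Each element is the running count of its own group, which is exactly the "month number" used above once the rows are in expiry order.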

The mdata object is a list of data.frames, where each component represents one month number. The actual extraction of the rows with a particular month number is done with a simple logical index operation:

data[mnum==x,]

where x is the mnum element, iterated over 1:max(mnum) by lapply(). The suffixing of non-key column names is done using setNames(), deriving the replacement column names as follows:

ifelse(names(data)%in%keys,names(data),paste0(names(data),'.Month',x))

The above leaves the names of key-columns untouched, but appends '.Month{mnum}' to the names of all non-key-columns.

Finally, all month-number splits must be merged back into one data.frame. I thought I'd be able to use a single call to merge() (possibly with a little help from do.call()) to do this, but was disappointed to discover that it only takes two arguments to merge, x and y (also see Simultaneously merge multiple data.frames in a list). Thus, I needed to call Reduce() to achieve the repeated calls. The all=T argument would be important if your different symbols had different numbers of expiry dates; then "short" symbols would not be represented on the RHS of the final merge(s), and thus would be dropped, if all=T was not passed.
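The effect of all=T is easiest to see with symbols that have different numbers of expiry dates. The following is a minimal sketch with two hypothetical splits (m1 and m2 are made up; symbol "B" has no second month):

```r
# Hypothetical month splits: "B" is missing from the second month
m1 <- data.frame(SYMBOL = c("A", "B"), OPEN.Month1 = c(1750, 1627.5))
m2 <- data.frame(SYMBOL = "A", OPEN.Month2 = 1789)
splits <- list(m1, m2)

# Default merge() is an inner join: "B" is dropped
inner <- Reduce(function(x, y) merge(x, y, by = "SYMBOL"), splits)

# all = TRUE keeps "B" and fills the missing month with NA
outer <- Reduce(function(x, y) merge(x, y, by = "SYMBOL", all = TRUE), splits)

nrow(inner)
nrow(outer)
outer
```

With all = TRUE the "short" symbol survives with NA in its missing month, which is usually what you want before an rbind.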

Discrepancies

My output doesn't exactly match your sample output. Here are the discrepancies:

  • Your sample output seems to have changed the format of the TIMESTAMP column from what it was in the input, for example, 10-APR-2015 changed to 10-Apr-15. My code does not touch the format of TIMESTAMP.
  • Your sample output is lacking the EXPIRY_DT columns, which my solution retains under their suffixed EXPIRY_DT.Month1, EXPIRY_DT.Month2, etc. names. You can remove those columns afterward with grep() on names() and negative indexing, if so desired.
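Dropping the suffixed EXPIRY_DT columns could be done along these lines. This is a sketch on a hypothetical one-row result (res_toy is made up); the grepl() variant is my own suggestion, since negative indexing with an empty grep() result would drop every column:

```r
# Hypothetical stand-in for the merged result
res_toy <- data.frame(
  SYMBOL           = "A",
  EXPIRY_DT.Month1 = "30-Apr-2015",
  OPEN.Month1      = 1750,
  EXPIRY_DT.Month2 = "28-May-2015",
  OPEN.Month2      = 1789
)

# grep() on names() plus negative indexing, as described above
trimmed <- res_toy[, -grep("^EXPIRY_DT", names(res_toy))]

# Alternative that stays correct even when nothing matches
trimmed_safe <- res_toy[, !grepl("^EXPIRY_DT", names(res_toy)), drop = FALSE]

names(trimmed)
```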
bgoldst
  • Works great. And detailed explanation is inspiring. Learned a lot. About the discrepancies, the date format change is because of hand editing in excel. The extra columns I've indexed out, so no problem there. Great answer and great explanation. – Frash Apr 12 '15 at 10:59
1

Just remembered that aggregate() has an overload for data.frames which can be used to achieve this requirement. The column names and order won't be exactly as you wanted, but they're certainly logical and usable (and could be adjusted afterward):

keys <- c('SYMBOL','TIMESTAMP');
aggregate(data[,!(names(data)%in%keys)],data[,names(data)%in%keys],identity);
##   SYMBOL   TIMESTAMP EXPIRY_DT.1 EXPIRY_DT.2 EXPIRY_DT.3  OPEN.1  OPEN.2  OPEN.3  HIGH.1  HIGH.2  HIGH.3   LOW.1   LOW.2   LOW.3 CLOSE.1 CLOSE.2 CLOSE.3 SETTLE_PR.1 SETTLE_PR.2 SETTLE_PR.3 CONTRACTS.1 CONTRACTS.2 CONTRACTS.3  VALUE.1  VALUE.2  VALUE.3 OPEN_INT.1 OPEN_INT.2 OPEN_INT.3 CHG_IN_OI.1 CHG_IN_OI.2 CHG_IN_OI.3
## 1      A 10-APR-2015 30-Apr-2015 28-May-2015 25-Jun-2015 1750.00 1789.00    0.00 1788.05 1795.00    0.00 1746.00 1760.00    0.00 1782.30 1791.85 1695.10     1782.30     1791.85     1804.80        1469          78           0  6496.96   347.91     0.00    1353750       8500          0       15250        1250           0
## 2      B 10-APR-2015 30-Apr-2015 28-May-2015 25-Jun-2015 1627.50 1653.30    0.00 1656.50 1653.30    0.00 1627.50 1645.45    0.00 1642.95 1646.75 1613.90     1642.95     1653.85     1664.35        2638          14           0 10830.05    57.68     0.00    1377250      17000          0      -21000        1500           0
## 3      C 10-APR-2015 30-Apr-2015 28-May-2015 25-Jun-2015  632.95  644.10    0.00  646.40  650.50    0.00  629.65  635.00    0.00  640.85  644.35  614.60      640.85      644.35      649.10        4964         181           0 15869.41   583.38     0.00    6264000      98000          0       73500        6000           0
## 4      D 10-APR-2015 30-Apr-2015 28-May-2015 25-Jun-2015  317.80  319.50    0.00  324.60  326.65    0.00  315.85  318.40    0.00  320.55  322.35  310.85      320.55      322.35      325.35        3416          82           0 10969.31   264.93     0.00    8228000     216000          0     -192000       13000           0

A clean, simple solution in base R!

Edit: Thanks to @Frash for pointing out the quirk in the above "solution". The situation can be rectified by wrapping the aggregate() as follows:

do.call(data.frame,...);

This is because data.frame() automatically expands matrices to independent columns in the resulting data.frame (except for matrices of class "model.matrix" and those protected by I()).
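Spelled out on a small example, the wrapping might look like this. It is a sketch on a hypothetical toy data.frame (df, agg and flat are made-up names), assuming every group has the same number of rows so that aggregate() returns matrix columns:

```r
# Hypothetical toy input: two rows per symbol
df <- data.frame(
  SYMBOL = c("A", "A", "B", "B"),
  OPEN   = c(1, 2, 3, 4),
  CLOSE  = c(5, 6, 7, 8)
)

# identity returns every value of the group, so each result column is a matrix
agg <- aggregate(df[, c("OPEN", "CLOSE")], df["SYMBOL"], identity)
ncol(agg)            # 3 columns, two of them matrices
is.matrix(agg$OPEN)

# data.frame() expands each matrix into independent columns
flat <- do.call(data.frame, agg)
names(flat)
```

After the do.call(), str(flat) shows ordinary vector columns with the .1, .2 suffixes instead of embedded matrices.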

bgoldst
  • Something fishy going on, the output is showing as data.frame of 4 obs of 12 variables. Putting the output through `str` gives that apart from 'SYMBOL' and 'TIMESTAMP' all the other variables have the dimension of 4 x 3 instead of the expected 4 x 1 dimensions! Each of the items is a matrix and the whole is embedded in a data.frame. – Frash Apr 13 '15 at 20:41
  • Whoa... you're right! Never seen that before. I guess a matrix can be embedded in a data.frame as a fancy kind of column, provided that it has the same number of rows as the data.frame. And when printing, each column within the matrix is displayed as an apparently independent column within the data.frame, with a `.{cnum}` suffix on it to distinguish it. Weird. – bgoldst Apr 13 '15 at 20:55