I have a dataset in a .mat file. Because most of my project is going to be in R, I want to analyze the dataset in R rather than in Matlab. I used the "R.matlab" library to read the file into R, but I am struggling to convert the data into a data frame for further processing.

> library(R.matlab)
> data <- readMat(paste(dataDirectory, 'Data.mat', sep=""))
> str(data)
List of 1
 $ Data: num [1:32, 1:5, 1:895] 0.999 0.999 1 1 1 ...
 - attr(*, "header")=List of 3
  ..$ description: chr "MATLAB 5.0 MAT-file, Platform: PCWIN, Created on: Fri Oct 18 11:36:04 2013                                          "
  ..$ version    : chr "5"
  ..$ endian     : chr "little"

I tried the following code, based on answers to other questions, but it does not do exactly what I want.

data = lapply(data, unlist, use.names=FALSE)
df <- as.data.frame(data)
> str(df)
'data.frame':   32 obs. of  4475 variables:

I want a data frame with 5 variables (Y, X1, X2, X3, X4), but right now I get 32 observations of 4475 variables.

I do not know how to go further from here, as I have never worked with such a large dataset and couldn't find a relevant post. I am also new to R and coding, so please excuse me if I have trouble with some of the answers. Any help would be greatly appreciated.

Thanks

  • after `data = lapply(data, unlist`, what does `str(data[[1]]$Data)` show? – Chris Jul 18 '20 at 15:47
  • Oh it just shows the same without the names. *Edit*: Nvm, I get ```Error in data[[1]]$Data : $ operator is invalid for atomic vectors``` – Tdta Mg Jul 18 '20 at 16:01
  • Did you perhaps mean this? ```> str(data[1]$Data) num [1:32, 1:5, 1:895] 0.999 0.999 1 1 1 ...``` – Tdta Mg Jul 18 '20 at 16:07
  • I guess I was just trying to get more of a sense of the data in `df`. The result for `str(df)` seems almost mute, like it's not saying anything. You get to `df` using the same approach as [SOF](https://stackoverflow.com/questions/28080579/how-to-load-a-matlab-struct-into-a-r-data-frame), and at this point, I guess 32 obs of 4475 var is correct, and you want to subset to `Y,X1,X2,X3,X4`? – Chris Jul 18 '20 at 16:08
  • I see, I kind of removed the data in ```df``` as I was not certain if I should show the full output. I thought that was enough. I am guessing that the original array is [1:32, 1:5, 1:895], and when converted to a data frame it becomes [1:32, 1:4475], and I want it to be [1:5, 1:(32*895)], I guess. I actually do not know if that is the best way; it's just that the data from the .mat file isn't ready to be processed by R. – Tdta Mg Jul 18 '20 at 16:20
  • So, perhaps `head(df, n = 20)` or so will give a sense of it, and put the output above. They look an awful lot like arrays, the `num [1:32, 1:5, 1:895]`, which I imagine is how Matlab holds its data. – Chris Jul 18 '20 at 16:22
  • The output is really long, so I will cut it a bit so as not to overflow the comment. ```> str(df) 'data.frame': 32 obs. of 4475 variables: $ Data.1 : num 0.999 0.999 1 1 1 ... $ Data.2 : num -0.109 -0.0862 -0.0284 -0.0116 0.0748 ... $ Data.3 : num -0.0732 -0.0472 -0.0513 -0.0421 -0.0618 ... $ Data.4 : num 0.007985 0.004074 0.00146 0.000489 -0.004618 ... $ Data.5 : num 0.00536 0.00223 0.00264 0.00177 0.00382 ... $ Data.6 : num 0.999 0.999 1 1 1 ... ``` This goes on until ```[list output truncated]``` – Tdta Mg Jul 18 '20 at 16:29
  • And ```> head(df, n=20) Data.1 Data.2 Data.3 Data.4 Data.5 Data.6 Data.7 Data.8 Data.9 Data.10 Data.11 Data.12 Data.13 Data.14 Data.15 Data.16 Data.17 Data.18 Data.19 Data.20 Data.21 Data.22 Data.23 Data.24 Data.25 Data.26 Data.27 Data.28 Data.29 Data.30 Data.31 Data.32 Data.33 Data.34 Data.35 Data.36 Data.37 Data.38 Data.39 Data.40 Data.41 Data.42 Data.43 Data.44 Data.45 ... Data.4475 [ reached 'max' / getOption("max.print") -- omitted 20 rows ]``` I would have moved this to a chat but it seems I am unable to as I do not have enough reputation. – Tdta Mg Jul 18 '20 at 16:31
  • Maybe `apply(data$Data, 2, rbind)`? This will produce a matrix, you can then coerce to data.frame. – Rui Barradas Jul 18 '20 at 19:18
  • Yes!!! Thank you, that works. I can't believe I did not see this. – Tdta Mg Jul 18 '20 at 19:35
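
For anyone landing here later, the suggestion from the comments can be sketched on a small synthetic array. This is only an illustration under assumptions: `arr` is a made-up 3 x 5 x 4 stand-in for `data$Data` (which is 32 x 5 x 895 in the question), and the column names are the ones the question asks for. The `aperm()` variant is an alternative not mentioned in the thread, shown for comparison.

```r
# Synthetic stand-in for data$Data: 3 x 5 x 4 instead of 32 x 5 x 895.
arr <- array(seq_len(3 * 5 * 4), dim = c(3, 5, 4))

# For each of the 5 slices along the second dimension, apply() flattens the
# 3 x 4 slice into one column, giving a (3 * 4) x 5 matrix.
mat <- apply(arr, 2, rbind)

# Coerce to a data frame and use the variable names from the question.
df <- as.data.frame(mat)
names(df) <- c("Y", "X1", "X2", "X3", "X4")

# Alternative (not from the thread): move the variable dimension last,
# then fold the remaining two dimensions into rows.
mat2 <- matrix(aperm(arr, c(1, 3, 2)), ncol = dim(arr)[2])

str(df)
```

With the real data, the same calls on `data$Data` should give 32 * 895 = 28640 rows and 5 columns. Rows are ordered with the first array index varying fastest, so each block of 32 consecutive rows comes from one slice along the third dimension.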
