I know that there are many threads with this title, but either the advice in them hasn't worked or I haven't understood it.

I have read what was an SPSS file into R. I cleaned some variables and added new ones. By this point the file size is 1,000 MB. I wanted to write it into a CSV to look at it more easily but it just stops responding - file too big I guess.

So instead I want to create a subset of only the variables I need. I tried a couple of things:

(besb <- bes[, c(1, 7, 8)])
data1 <- bes[,1:8]

I also tried referring to variables by name:

nf <- c(bes$approveGov, bes$politmoney)

All these attempts return errors about the number of dimensions.

Therefore could somebody please explain to me how to create a reduced subset of variables preferably using variable names?

Joe
Henry Cann
  • Welcome to StackOverflow. Please take a look at these tips on how to produce a [minimum, complete, and verifiable example](http://stackoverflow.com/help/mcve), as well as this post on [creating a great example in R](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – lmo Jan 26 '17 at 12:50
  • `c()` just concatenates variables, so you indeed want either `bes[, 1:8]` or `bes[, c(1,3,5)]`. What output does `str(bes)` give? And which function did you use to write to CSV? You may want to look at the `readr` package. – Daniel Jan 26 '17 at 12:53
  • Daniel, I used `new <- write.csv(bes, "BesAdd.csv") ` Also, both code lines you suggest give me the error **Error in bes[, 1:8] : incorrect number of dimensions** Thanks – Henry Cann Jan 26 '17 at 12:59
  • If there are any commas in the data, that will corrupt the creation of a CSV file by adding a value that makes it un-rectangular. Is there any chance that you have a , for a decimal? – sconfluentus Jan 26 '17 at 13:12
  • Also, is your data in a data frame or a table? That dimension error sometimes comes up when you are working with non-flat structures, which cannot be written to a flat file. How did you read that SPSS file in, and does it have a nested structure? – sconfluentus Jan 26 '17 at 13:25
  • @BethanyP you could be onto something there... I just noticed that the dataset, bes, is listed in the environment as a Value, a Large List with x elements. I only noticed because I just read in another file which is correctly listed in the environment as Data. The difference is that I read the second file in from Stata, as I would normally do. The only reason I used the SPSS version for the first file on this occasion was a frustrating "Binary read error". To answer your question, I read the SPSS file in like this: `bes <- read.spss("BES15W9.sav")`. Is this the problem maybe? – Henry Cann Jan 26 '17 at 13:34
  • It likely is the problem. Use `str(bes)` to see if it is a series of nested lists. You may get an idea from the structure of how to unpack it with loops to create the data frame you need. – sconfluentus Jan 26 '17 at 17:04

1 Answer

An easy way to subset variables from a data.frame is with the dplyr package. You can select variables with their bare names. For example:

library(dplyr)
nf <- select(bes, approveGov, politmoney)

It's fast for large data frames too.
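
Going by the comments, `bes` appears to be a plain list rather than a data frame, since `read.spss()` from the foreign package returns a list unless asked otherwise. That would explain the "incorrect number of dimensions" error. Below is a minimal base-R sketch of converting such a list and then subsetting by name. The small `bes` list is invented stand-in data, not the real BES file, and the `id` column is hypothetical.

```r
# Stand-in for what read.spss("BES15W9.sav") returns by default:
# a named list of equal-length columns (all values here are invented).
bes <- list(
  id         = 1:3,
  approveGov = c("Approve", "Disapprove", "Approve"),
  politmoney = c(2, 5, 3)
)

# bes[, 1:8] fails on a list because a list has no dimensions.
# Converting to a data frame gives it rows and columns.
bes_df <- as.data.frame(bes)

# Subset by variable name.
nf <- bes_df[, c("approveGov", "politmoney")]
str(nf)
```

With the real file, reading it as `read.spss("BES15W9.sav", to.data.frame = TRUE)` should give a data frame directly, after which `select(bes, approveGov, politmoney)` should work as well.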

Joe