I have a CSV with a set of cars loaded in R. How do I set up the data so that one group contains three specific cars and the other group contains the rest of the cars? I have tried

carA=someintervalvariable[car=="carA"]
carB=someintervalvariable[car=="carB"]
carC=someintervalvariable[car=="carC"]
ABC=which(c("A","B","C"))
others=someintervalvariable[-ABC]

AND

group1 <- data$car["carA","carB","carC"]

I am stuck. I don't know how to collect those three cars while keeping the other cars in a different set. I would like to run tests on those cars vs. the other cars with ratio and interval data. How do I save them separately?

Here is an example of my data:

car         mpg         satisfaction   
 carA:1   Min.   :12.00   Min.   :0.2000  
 carB:1   1st Qu.:21.00   1st Qu.:0.3850  
 carC:1   Median :23.00   Median :0.5600  
 carD:1   Mean   :22.43   Mean   :0.5386  
 carE:1   3rd Qu.:24.50   3rd Qu.:0.7150  
 carF:1   Max.   :31.00   Max.   :0.8100  
 carG:1                                  
  • Please show a few lines of your data and the expected result. Try `%in%` instead of `==` – akrun May 04 '15 at 17:11
  • You should define a categorical variable `data$cargroup = 0; data$cargroup[car%in%carlist[[1]]] <- 1; ...` and then use `split`. – Frank May 04 '15 at 17:20
  • Horace, neither your question nor your expected result are understandable from the post! – vagabond May 04 '15 at 17:32
  • @vagabond : I have added an example of my data if it helps. My question is "do cars A, B, C have higher satisfaction than the others?" I just need to know how to set cars A, B, C as one variable so that I can compare it to D, E, F, G saved into a different variable. – Horace Bixby May 04 '15 at 17:58
  • @akrun : look at my previous comment – Horace Bixby May 04 '15 at 17:59
  • Looks like your dataset is the output of `summary(yourdata)` – akrun May 04 '15 at 18:09
  • @akrun yes, you are correct. I copied the summary(cars) output. Does this help? I am still unsure what to do to partition this data. would it have anything to do with which(c("carA","carB","carC")) and then maybe dropping that from the set D,E,F,G? How would I create a new set of just the A,B,C? – Horace Bixby May 04 '15 at 18:14
  • If something is unclear, just let me know and be specific, i will try to clear it up. I'm not that good at R yet, I'm very new to it. – Horace Bixby May 04 '15 at 18:24
  • Have you tried the option by @Frank – akrun May 04 '15 at 19:20
  • I'm sorry, I still can't understand what you want to do. For next time, pasting the output of `head(data, 10)` is a good way to quickly provide some sample data. Also refer to: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – vagabond May 04 '15 at 21:06

1 Answer

I think you should provide a reproducible example, because I suspect (I would bet on it) that your problem involves regular expressions, and with those there is a huge number of possibilities.

To start, have a look at this code just to get an idea, and let us know if it is useful for you. It selects every car_X where X is any character other than the letters d through z (upper or lower case), which for this vector leaves car_A, car_B and car_C.

cars <- c("car_A", "car_B", "car_C", "car_D", "car_E")
# keep the names whose character after "car_" is not in d-z or D-Z
car1 <- grep("car_[^d-zD-Z]", cars, value = TRUE)
car1
[1] "car_A" "car_B" "car_C"

Subset a Data Frame

With a data frame, then, you can subset based on the output of your grep; consider the following example, which continues the previous one.

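values <- rnorm(5)  # random values, so your numbers will differ
data <- data.frame(cars, values)
# keep the rows whose first column matches car_A, car_B or car_C (either case)
data1 <- data[grep("car_[a-cA-C]", data[, 1]), ]
data1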
   cars     values
1 car_A -1.8553913
2 car_B -0.3562586
3 car_C -0.3208530
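
If the car names are known exactly, a regular expression is not strictly needed. As a minimal sketch following the `%in%` and `split` suggestions from the comments, and assuming a data frame with columns named `car` and `satisfaction` as in the question's summary (the name `cars_df` and the values below are made up for illustration):

```r
# Hypothetical data shaped like the question's summary; the values are invented.
cars_df <- data.frame(
  car = paste0("car", LETTERS[1:7]),
  satisfaction = c(0.20, 0.81, 0.56, 0.35, 0.42, 0.73, 0.70),
  stringsAsFactors = FALSE
)

# TRUE for the three cars of interest, FALSE for all the others
in_abc <- cars_df$car %in% c("carA", "carB", "carC")

# Two separate data frames: the selected cars and the rest
abc    <- cars_df[in_abc, ]
others <- cars_df[!in_abc, ]

# Or keep one data frame, add a grouping column, then split on it
cars_df$cargroup <- ifelse(in_abc, "ABC", "other")
groups <- split(cars_df$satisfaction, cars_df$cargroup)
```

`groups$ABC` and `groups$other` then hold the satisfaction values for each set, and can be passed to whichever test suits the data.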
SabDeM
  • Good answer. To get the complement (car_D and car_E), they can put a `-` before the selection: `data[-grep( "car_[a-cA-C]", data[ ,1] ) , ]` – Frank May 04 '15 at 19:27
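
One caveat with the negative-index form: if the pattern matches nothing, `grep` returns an empty integer vector, and indexing with `-integer(0)` selects zero rows rather than all of them. A small sketch of a variant using `grepl`, which returns a logical vector that can be negated with `!` (the data frame below just rebuilds the answer's example with fixed values instead of `rnorm`):

```r
cars <- c("car_A", "car_B", "car_C", "car_D", "car_E")
data <- data.frame(cars, values = 1:5)

# Logical vector: TRUE where the name matches car_A, car_B or car_C (either case)
is_abc <- grepl("car_[a-cA-C]", data$cars)

data1 <- data[is_abc, ]   # the three selected cars
data2 <- data[!is_abc, ]  # the complement: car_D and car_E

# With a pattern that matches nothing, !grepl() still keeps every row,
# whereas -grep() returns an empty data frame
none <- grepl("car_Z", data$cars)
nrow(data[!none, ])                      # 5
nrow(data[-grep("car_Z", data$cars), ])  # 0
```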