0

I want to use a loop to create a subset of an existing data frame. My data frame has 10 columns and many rows. One of the columns is labelled 'answers', and it has three possible values: 'yes', 'no' or 'i don't know'. I want to use a loop with an if statement to create a new data frame containing all the rows where the answer is 'i don't know'. Three of the 10 columns are 'name', 'subject' and 'contact number', and the new data frame should contain only those three columns. How can I use a loop and an if statement to create this new data frame?

  • 2
    Welcome to StackOverflow! To get the best answers, please follow the instructions [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Be sure to include sample data (fake is fine) and code showing what you've tried so far. – A. S. K. Jan 27 '21 at 20:56

2 Answers

1

I don't think you need a for loop to achieve that. Just use data frame indexing to subset to the desired values:

dataframename[dataframename$answers=="i don't know",c("name","subject","contact")]

The condition dataframename$answers=="i don't know" must use the exact column name and the exact string stored in the column. It returns TRUE for the rows where the column "answers" contains the value "i don't know", so those rows are kept and the others are dropped. Then c("name","subject","contact") selects only the columns with those names.
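To make the two steps concrete, here is a minimal sketch on made-up data. The rows and values are hypothetical; only the column names come from the question, and "contact" is assumed to be the actual name of the contact column.

```r
# Hypothetical sample data; only the column names are taken from the question
dataframename <- data.frame(
  name    = c("Ann", "Bob", "Cara", "Dev"),
  subject = c("maths", "art", "physics", "history"),
  contact = c("555-0101", "555-0102", "555-0103", "555-0104"),
  answers = c("yes", "i don't know", "no", "i don't know"),
  stringsAsFactors = FALSE
)

# Step 1: logical vector, TRUE where the answer matches the string exactly
keep <- dataframename$answers == "i don't know"

# Step 2: keep the TRUE rows, restricted to the three columns of interest
result <- dataframename[keep, c("name", "subject", "contact")]
print(result)
```

If the "answers" column can contain NA, the comparison yields NA for those rows and the subset will include all-NA rows; wrapping the condition in which() is one way to avoid that.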

  • Thank you! it worked. How would I be able to get the same output by using a loop? –  Jan 28 '21 at 09:30
  • @user15052672 What do you want to loop through? The answers columns? Looping through each row to find the same exact value is not convenient and the code will underperform. – Mr. Caribbean Jan 28 '21 at 20:34
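For completeness, a loop with an if statement could look roughly like the sketch below. The sample data and the names df, matching_rows and loop_result are hypothetical, and this is only one possible way to write it; it should give the same rows as the one-line subset, just more slowly on large data.

```r
# Hypothetical sample data; only the column names are taken from the question
df <- data.frame(
  name    = c("Ann", "Bob", "Cara", "Dev"),
  subject = c("maths", "art", "physics", "history"),
  contact = c("555-0101", "555-0102", "555-0103", "555-0104"),
  answers = c("yes", "i don't know", "no", "i don't know"),
  stringsAsFactors = FALSE
)

# Collect the indices of the rows whose answer matches
matching_rows <- integer(0)
for (i in seq_len(nrow(df))) {
  # identical() returns a single TRUE/FALSE even when the value is NA
  if (identical(df$answers[i], "i don't know")) {
    matching_rows <- c(matching_rows, i)
  }
}

# Build the new data frame once, after the loop has finished
loop_result <- df[matching_rows, c("name", "subject", "contact")]
print(loop_result)
```

Collecting indices and subsetting once at the end avoids growing a data frame with rbind() inside the loop, which tends to be slow.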
0

Sample data is always a great help. Check out the dput function to see how to easily do it. You can certainly create data frames with loops and conditional statements, but I think this should do what you're looking for in a more straightforward way just by filtering. Here is the base R approach, but I like using the tidyverse the most.

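# Sample data: y is drawn at random from "y", "n" and "o"
df <- data.frame(x = 1:10, y = sample(c("y","n","o"), 10, replace = TRUE))

# Base R: keep the rows where y is "o"
sub <- df[which(df$y == "o"),]

# tidyverse (dplyr) equivalent
library(tidyverse)
sub <- df %>% filter(y == "o")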
bischrob