I have a data set with 18 variables, and I am working in R. My data looks like this (Decision_date is the first variable):

1 2003-07-09 00:00:00.0      Austria  Agriculture and Rural Development
2 2002-03-20 00:00:00.0      Austria  Agriculture and Rural Development
3 2004-07-07 00:00:00.0      Austria  Agriculture and Rural Development
4 2003-10-06 00:00:00.0      Austria  Agriculture and Rural Development
5 2004-07-07 00:00:00.0      Austria  Agriculture and Rural Development
6 2003-10-06 00:00:00.0      Austria  Agriculture and Rural Development
# ... with 15 more variables: Title <chr>, Decision_type <chr>, Active_infringement_cases <chr>,
#   Not_communicated <chr>, dir_number <chr>, delegating_dir <dbl>, implementing_dir <dbl>, closure <dbl>,
#   let <int>, ro <int>, referral <int>, let2 <int>, ro2 <int>, sanction <dbl>, withdrawal <dbl>

Data for reproduction

structure(list(Decision_date = c("2003-07-09 00:00:00.0", "2002-03-20 00:00:00.0", 
"2004-07-07 00:00:00.0", "2003-10-06 00:00:00.0", "2004-07-07 00:00:00.0", 
"2003-10-06 00:00:00.0", "2003-12-16 00:00:00.0", "2003-10-06 00:00:00.0", 
"2004-07-07 00:00:00.0", "2003-10-06 00:00:00.0"), Member_state = c("Austria", 
"Austria", "Austria", "Austria", "Austria", "Austria", "Austria", 
"Austria", "Austria", "Austria"), Policy_area___Department_in_charge = c("Agriculture and Rural Development", 
"Agriculture and Rural Development", "Agriculture and Rural Development", 
"Agriculture and Rural Development", "Agriculture and Rural Development", 
"Agriculture and Rural Development", "Agriculture and Rural Development", 
"Agriculture and Rural Development", "Agriculture and Rural Development", 
"Agriculture and Rural Development"), Title = c("CODE RELATIF A L'EXERCICE DES PROFESSIONS ARTISANALES, COMMERCIALES ET INDUSTRIELLES", 
"CODE RELATIF A L'EXERCICE DES PROFESSIONS ARTISANALES, COMMERCIALES ET INDUSTRIELLES", 
"PRODUITS DE CACAO ET DE CHOCOLAT DESTINÉS À L'ALIMENTATION", 
"PRODUITS DE CACAO ET DE CHOCOLAT DESTINÉS À L'ALIMENTATION", 
"DIRECTIVE 2001/110/CE DU CONSEIL DU 20 DÉCEMBRE 2001 RELATIVE AU MIEL", 
"DIRECTIVE 2001/110/CE DU CONSEIL DU 20 DÉCEMBRE 2001 RELATIVE AU MIEL", 
"DIR 2001/111/CE DU CONSEIL DU 20/12/01 RELATIVE À CERTAINS SUCRES DESTINÉS À L'ALIMENTATION HUMAINE", 
"DIR 2001/111/CE DU CONSEIL DU 20/12/01 RELATIVE À CERTAINS SUCRES DESTINÉS À L'ALIMENTATION HUMAINE", 
"JUS DE FRUITS ET À CERTAINS PRODUITS SIMILAIRES DESTINÉS À L'ALIMENTATION HUMAINE", 
"JUS DE FRUITS ET À CERTAINS PRODUITS SIMILAIRES DESTINÉS À L'ALIMENTATION HUMAINE"
), Decision_type = c("Closing of the case", "Formal notice Art. 258 TFEU", 
"Closing of the case", "Formal notice Art. 258 TFEU", "Closing of the case", 
"Formal notice Art. 258 TFEU", "Closing of the case", "Formal notice Art. 258 TFEU", 
"Closing of the case", "Formal notice Art. 258 TFEU"), Active_infringement_cases = c("No", 
"No", "No", "No", "No", "No", "No", "No", "No", "No"), Not_communicated = c("No", 
"No", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"), 
    dir_number = c("", "", "", "", "2001/110", "2001/110", "2001/111", 
    "2001/111", "", ""), delegating_dir = c(0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0), implementing_dir = c(0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0), closure = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0), let = c(0L, 
    1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L)), .Names = c("Decision_date", 
"Member_state", "Policy_area___Department_in_charge", "Title", 
"Decision_type", "Active_infringement_cases", "Not_communicated", 
"dir_number", "delegating_dir", "implementing_dir", "closure", 
"let"), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))

I have created many new dummy variables. For example, one of my dummy variables is named "let".

let1 <- ifelse(infringements$Decision_type == "Formal notice Art. 258 TFEU", yes = TRUE, no = FALSE)
let2 <- ifelse(infringements$Decision_type == "Formal notice Art. 106 TFEU", yes = TRUE, no = FALSE)
let3 <- ifelse(infringements$Decision_type == "Formal notice Art. 258 TFEU + Press release", yes = TRUE, no = FALSE)
let4 <- ifelse(infringements$Decision_type == "Formal notice Art. 260 TFEU", yes = TRUE, no = FALSE)
let5 <- ifelse(infringements$Decision_type == "Formal notice Art. 260 TFEU + Press release", yes = TRUE, no = FALSE)
table(let2)

infringements$let <- let1 + let2 + let3 + let4 + let5
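As an aside, the five comparisons can probably be collapsed into a single membership test with `%in%`. The following is only a sketch: the three-row data frame and the name `notice_types` are made up for illustration, and the five strings are the ones compared above. Unlike `==`, `%in%` returns FALSE rather than NA for missing values, which may or may not be what is wanted.

```r
# Toy stand-in for the real data (hypothetical rows, not the actual data set)
infringements <- data.frame(
  Decision_type = c("Closing of the case",
                    "Formal notice Art. 258 TFEU",
                    "Formal notice Art. 260 TFEU + Press release"),
  stringsAsFactors = FALSE
)

# All decision types that should set the dummy to 1
notice_types <- c(
  "Formal notice Art. 258 TFEU",
  "Formal notice Art. 106 TFEU",
  "Formal notice Art. 258 TFEU + Press release",
  "Formal notice Art. 260 TFEU",
  "Formal notice Art. 260 TFEU + Press release"
)

# TRUE/FALSE membership test, converted to a 0/1 integer dummy
infringements$let <- as.integer(infringements$Decision_type %in% notice_types)

infringements$let
```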

If "let" equals 1, I want to extract the value of another variable called "Decision_date", which is in this format:

2003-07-09 00:00:00.0

I have something like this:

subSet <- infringements[infringements$let == 1,]
infringements$let_date <- infringements$Decision_date[infringements$let == 1]

But I get the following error messages:

subSet <- infringements[infringements$let == 1,]
Error: Variables must be length 1 or 288825. Problem variables: 'Decision_date', 'Member_state', 'Policy_area___Department_in_charge', 'Title', 'Decision_type', 'Active_infringement_cases', 'Not_communicated', 'delegating_dir', 'implementing_dir', 'closure', 'let', 'ro', 'referral', 'let2', 'ro2', 'sanction', 'withdrawal'

infringements$let_date <- infringements$Decision_date[infringements$let == 1]
Error in `$<-.data.frame`(`*tmp*`, "let_date", value = c("2002-03-20 00:00:00.0", : replacement has 19255 rows, data has 45165

In other words, I want the finished data set to look like this:

let            let_date
1              2003-07-09 00:00:00.0
1              2004-07-09 00:00:00.0
1              2005-07-09 00:00:00.0

Any help would be appreciated. Thanks very much.

Marc Brinkmann
reveraert

1 Answer

I'm also kinda new to R. I used to do it the way I know from other languages. Maybe this will help you:

if(infringements$let == 1) {
  infringements$let_date <- infringements$Decision_date
}  
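
One caveat with the `if` block above: `if` expects a single TRUE or FALSE, not a whole column. R 4.2.0 and later stop with an error when the condition has length greater than one, and older versions only warn and use the first element. A minimal vectorized sketch with `ifelse()` follows. It assumes that rows where `let` is not 1 should get NA, and the three-row data frame is a made-up stand-in for the real `infringements` data.

```r
# Toy stand-in for the real data (hypothetical rows)
infringements <- data.frame(
  Decision_date = c("2003-07-09 00:00:00.0",
                    "2002-03-20 00:00:00.0",
                    "2004-07-07 00:00:00.0"),
  let = c(0L, 1L, 0L),
  stringsAsFactors = FALSE
)

# ifelse() works element by element, so the result has one value per row:
# the decision date where let == 1, NA everywhere else
infringements$let_date <- ifelse(infringements$let == 1,
                                 infringements$Decision_date,
                                 NA_character_)

infringements
```

Because the result has as many elements as the data frame has rows, the assignment does not run into a length mismatch.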


Edit: digEmAll and Ronak Shah offered good solution approaches. If I got you right, this should do it for you:

infringements$let_date <- infringements$Decision_date[infringements$let == 1]

Edit2: With the above reproducible code I managed to get it to work with the following line of code:

df <- data.frame(let = infringements$let[infringements$let == 1], let_date = infringements$Decision_date[infringements$let == 1])
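
For context, the "replacement has 19255 rows, data has 45165" error seems to come from the fact that `infringements$Decision_date[infringements$let == 1]` is shorter than the data frame, so it cannot be stored as a new column of `infringements` itself. Putting the filtered rows into a separate data frame avoids that. The sketch below uses a made-up four-row stand-in for the data; `keep` and `let_rows` are illustrative names, and the final `as.Date()` step is optional and assumes only the calendar date is needed.

```r
# Toy stand-in for the real data (hypothetical rows)
infringements <- data.frame(
  Decision_date = c("2003-07-09 00:00:00.0", "2002-03-20 00:00:00.0",
                    "2004-07-07 00:00:00.0", "2003-10-06 00:00:00.0"),
  let = c(0L, 1L, 0L, 1L),
  stringsAsFactors = FALSE
)

keep <- infringements$let == 1

# The filtered vector is shorter than the data frame, hence the error
# when it is assigned back as a column of infringements
length(infringements$Decision_date[keep])
nrow(infringements)

# Keep only the matching rows and the two columns of interest
let_rows <- infringements[keep, c("let", "Decision_date")]
names(let_rows)[2] <- "let_date"

# Optional: parse the leading "YYYY-MM-DD" part into a Date;
# the trailing time portion of the string is ignored by the format
let_rows$let_date <- as.Date(let_rows$let_date, format = "%Y-%m-%d")

let_rows
```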
Community
Marc Brinkmann
  • Thank you for the answers, but I get this error when I run the second code: Error in `$<-.data.frame`(`*tmp*`, "let_date", value = c("2002-03-20 00:00:00.0", : replacement has 19255 rows, data has 45165 and when I ran the first code, it did not work. – reveraert Apr 04 '17 at 11:02
  • You are welcome! Could you provide some more information? http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Marc Brinkmann Apr 04 '17 at 11:04
  • Sure. I use this code without a problem: subSet <- infringements[infringements$let == 1,] infringements$Decision_date[infringements$let == 1] , but when I try to attach it to my data with this code: infringements$let_date <- infringements$Decisio.....i get the following error message: > Error in `$<-.data.frame`(`*tmp*`, "let_date", value = c("2002-03-20 00:00:00.0", : replacement has 19255 rows, data has 45165 > Error: Variables must be length 1 or 288825. Problem variables: 'Decision_date', 'Member_state', .....'' – reveraert Apr 04 '17 at 11:09
  • Could you give me some sample data and the full code? Maybe you wanna have a look at the link I posted. – Marc Brinkmann Apr 04 '17 at 11:12
  • Thanks for your help so far. And I hope what I added helps it make more sense. Thanks for suggesting to look at the link. – reveraert Apr 04 '17 at 11:25
  • Could you also create some sample data with dput(data)? – Marc Brinkmann Apr 04 '17 at 11:44
  • When I did that, I can only see many 0's and 1's. Sorry, but I don't know how to show that to you. – reveraert Apr 04 '17 at 12:05
  • Just run this code: dput(infringements) or run this if your dataset has more than 100 entries: dput(infringements[1:100,]) – Marc Brinkmann Apr 04 '17 at 12:16
  • I've added a solution – Marc Brinkmann Apr 04 '17 at 13:23