Deleting just the date in a cell

Question

I'd like to thank everyone who has given me helpful coding advice. I have a row of about 700 cells. Each cell has an "ID number, Month, Year, and Status". I'd like to write code that deletes the month and year in every cell but keeps the ID and Status.

One nice thing is that there is a white space between each value, so I am thinking of getting the code to recognize white space. Maybe something like "Hey R, can you delete everything between the 2nd and 4th white space?"

" 4475 10 2013 infected " turns into " 4475 infected "

Partial Code

Thanks, any tips or suggestions (even packages) help. I'd like to learn this too - I'll update my code as I figure out some more steps

Do not post your data or code as an image, please learn how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) — Jaap, Apr 08 '17 at 21:08

IRTFM · Answer 1 · 2017-04-08T22:29:43.360

R has lists and vectors that allow indexing. You should drop the term "cell" from your vocabulary when working in R. The scan function can be used to split character values at whitespace:

scan(text=" 4475 10 2013 infected ", what="")[c(1,4)] # Pick first and fourth.
#Read 4 items
#[1] "4475"     "infected"

If you want them rejoined, the paste function is available. The scan function is at the heart of read.table, which is what I would have used for the data pictured in your link. If you edit your question to include the output of dput(head(dataset)), you might get an answer that addresses your actual problem; at the moment, the only problem you offered in actual code has been addressed. (Pictures of datasets are not warmly welcomed on SO. Learn to post actual characters in the question text, e.g. the output of dput(head(dataset)).)
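As a minimal sketch of the rejoining step, assuming a single input string like the one in the question (the variable names `x`, `fields`, and `result` are only illustrative), the pieces kept by scan can be pasted back together with a space between them:

```r
# Split one string at whitespace, keep the 1st and 4th fields, then rejoin.
x <- " 4475 10 2013 infected "

# quiet = TRUE suppresses the "Read 4 items" message.
fields <- scan(text = x, what = "", quiet = TRUE)

# collapse = " " joins the kept fields into one string.
result <- paste(fields[c(1, 4)], collapse = " ")
result
# [1] "4475 infected"
```

Note that this drops the leading and trailing spaces of the original string, because scan discards surrounding whitespace.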

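This shows how to extract the 1st and 4th items from a multiline data input using scan (here txt is a character vector holding one input line per element, such as the three example strings used in the other answer):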

scan(text=txt, what=list(1, NULL, NULL, "")) # list of type-"examples"
#-----------
Read 3 records
[[1]]
[1] 4475 6685 3547

[[2]]
NULL

[[3]]
NULL

[[4]]
[1] "infected"    "infected"    "susceptible"
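A possible follow-up, sketched under the assumption that `txt` holds the three example strings and that a two-column data frame is the desired result (the column names `id` and `status` are made up for illustration): the non-NULL components of the scan result can be collected directly.

```r
# Example input: one record per element (same strings as in the thread).
txt <- c(" 4475 10 2013 infected ",
         " 6685 10 2013 infected ",
         " 3547 10 2013 susceptible")

# NULL entries in `what` tell scan to skip those fields (month and year).
res <- scan(text = txt, what = list(1, NULL, NULL, ""), quiet = TRUE)

# Keep the 1st (numeric ID) and 4th (character status) components.
df <- data.frame(id = res[[1]], status = res[[4]], stringsAsFactors = FALSE)
df
```

Here the ID comes back as a number rather than a string, because the first type-example in `what` is numeric.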

score 0 · Answer 2 · answered Apr 08 '17 at 21:48

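Another option uses sapply and strsplit. We split on a single space and drop the 3rd and 4th elements. Because each string starts with a space, the first element of the split is an empty string, so elements 3 and 4 are the month and year (the values that come between the 2nd and 4th space). Then we recombine: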

txt <-  c(" 4475 10 2013 infected ", 
          " 6685 10 2013 infected ", 
          " 3547 10 2013 susceptible")

sapply(strsplit(txt," "), function(x) paste0(unlist(x)[-3:-4], collapse=" "))
##[1] " 4475 infected"    " 6685 infected"    " 3547 susceptible"
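For comparison, here is a sketch of a different technique, a regular expression with sub rather than strsplit, which is closer to the "delete everything between the 2nd and 4th white space" idea in the question. It assumes every string has the layout shown above (optional leading space, then ID, month, year, status); the name `cleaned` is only illustrative.

```r
txt <- c(" 4475 10 2013 infected ",
         " 6685 10 2013 infected ",
         " 3547 10 2013 susceptible")

# Capture the leading whitespace plus the first field, then match the next
# two whitespace-separated fields and replace the whole match with the capture.
cleaned <- sub("^(\\s*\\S+)\\s+\\S+\\s+\\S+", "\\1", txt, perl = TRUE)
cleaned
# [1] " 4475 infected "   " 6685 infected "   " 3547 susceptible"
```

Unlike the strsplit version, this keeps any trailing space that was present in the input, so the first string matches the " 4475 infected " form asked for in the question.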
