1

I have a text file. It contains lots of text in the following format:

  • text
  • text
  • Date in format of 12 December 2016
  • text
  • text

How do I extract only the date in such a case, given that there is no other date in the text section of the file? I need an R program for it.

Mysterious

4 Answers

1

This would do the trick. The dates get parsed, while the rest become NA values, which you can filter out.

text <- c('a', 'b', '12 December 2016', '10 December 2015')

# elements that do not match the format are returned as NA
strptime(text, format = '%d %B %Y')
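
To make the filtering step concrete, here is a minimal sketch of dropping the NA values after parsing. It uses as.Date rather than strptime so the result is a plain Date vector, and it assumes an English locale, since %B matches month names in the current locale. The variable names parsed and dates are only for illustration.

```r
# Sample lines; only one of them is a date
text <- c("a", "b", "12 December 2016", "more text")

# Parse every line; lines that do not match the format become NA
# (assumes an English locale so that %B recognises "December")
parsed <- as.Date(text, format = "%d %B %Y")

# Keep only the elements that parsed successfully
dates <- parsed[!is.na(parsed)]
print(dates)
```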
karthikbharadwaj
0

I've called your data set demo_set for practical purposes. You start by reading in your file:

demo_set <- readLines("yourFile.txt") # read in file

You can use other ways of reading in your data set. Then you use a regex to find the lines that contain month names.

demo_set[grep(pattern = paste(month.name,collapse = "|"),demo_set)]
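
If the date shares a line with other text, the grep above returns the whole line. A possible extension, sketched here under the assumption that dates always look like "12 December 2016" (day, full English month name, four-digit year), is to pull out just the matching substring with regmatches. The sample lines and the names pattern and date_strings are made up for illustration.

```r
# Hypothetical sample lines standing in for the contents of the file
demo_set <- c("some text",
              "Posted on 12 December 2016 by someone",
              "more text")

# Day, full month name, four-digit year
pattern <- paste0("[0-9]{1,2} (",
                  paste(month.name, collapse = "|"),
                  ") [0-9]{4}")

# regexpr finds the first match in each line;
# regmatches returns only the matched substrings
date_strings <- regmatches(demo_set, regexpr(pattern, demo_set))
print(date_strings)
```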
biomiha
0

If the other lines of your text don't start with a number, you can use the code below:

abc <- subset(abc, grepl("^[0-9]", name))

where abc is your data frame and name is the column in it that holds the text.
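
As a quick illustration of how this behaves, here is a small sketch with a made-up data frame. The contents of abc are an assumption, and the approach only works if the date line is the only one that begins with a digit.

```r
# Hypothetical data frame with one line of the file per row
abc <- data.frame(name = c("text", "12 December 2016", "more text"),
                  stringsAsFactors = FALSE)

# Keep only the rows whose text starts with a digit
abc <- subset(abc, grepl("^[0-9]", name))
print(abc$name)
```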

I.G. Pascual
Arun kumar mahesh
0

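You can also use an if statement to check whether there are any values in a column such as date, and print them to the screen like so:

# is.na() returns one value per row, so wrap the test in any() for if()
if (any(!is.na(data$date))) {
  print(data$date[!is.na(data$date)])
}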

This prints every record that has a value in date. If you would rather see just a sample, use:

print(head(data$date, 10))
SM_Downes