
I have a string as follows:

a<-  "\n    Update Your Profile to Dissolve This Message\nSocial Media Learning and behaviour\n        Uploaded on May 3, 2020 at 10:56 in Research\n            View Forum\n        \n"

I need to extract the string "Social Media Learning and behaviour". For this I used the code below:

gsub("        Uploaded on .* ", "", gsub("\n    Update Your Profile to Dissolve This Message\n", "",a)) 

This gives me the following output:

"Social Media Learning and behaviour\n\n"

I am not able to match the exact pattern. What pattern would extract "Social Media Learning and behaviour" without the trailing "\n\n"?

djMohit
You could also capture the preceding line in a group and match the following line that contains "Uploaded on": `^(.*)\r?\n Uploaded on` https://regex101.com/r/bF5GKT/1 – The fourth bird May 31 '20 at 08:51

2 Answers


You could capture the previous line in a group and match the next line that contains Uploaded:

(.*)\r?\n[^\S\r\n]+Uploaded on


a<-  "\n    Update Your Profile to Dissolve This Message\nSocial Media Learning and behaviour\n        Uploaded on May 3, 2020 at 10:56 in Research\n            View Forum\n        \n"
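# str_match() returns a matrix: column 1 is the full match, column 2 is the first capture group
stringr::str_match(a, "(.*)\\r?\\n[^\\S\\r\\n]+Uploaded on")[, 2]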
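
If stringr is not installed, the same capture can be sketched in base R. This is only a sketch and not part of the original answer: it assumes an R version in which regexec() accepts perl = TRUE, and the variable names m and title are illustrative.

```r
a <- "\n    Update Your Profile to Dissolve This Message\nSocial Media Learning and behaviour\n        Uploaded on May 3, 2020 at 10:56 in Research\n            View Forum\n        \n"

# perl = TRUE selects PCRE, where `.` does not cross newlines by default
# and \S may be used inside a negated character class.
m <- regmatches(a, regexec("(.*)\\r?\\n[^\\S\\r\\n]+Uploaded on", a, perl = TRUE))

# m is a list with one character vector per input string:
# element 1 is the full match, element 2 is the first capture group.
title <- m[[1]][2]
title
```

The capture group holds the whole line that sits directly above the indented "Uploaded on" line, so no trailing newline ends up in the result.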
The fourth bird

You can extract the part between "Update Your Profile to Dissolve This Message" and "Uploaded on":

sub(".*Update Your Profile to Dissolve This Message\n(.*)\n\\s+Uploaded on.*", "\\1", a)
#[1] "Social Media Learning and behaviour"

You can also use str_match from stringr:

stringr::str_match(a, "Update Your Profile to Dissolve This Message\n(.*)\n\\s+Uploaded on")[, 2]
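
Both patterns above depend on the exact text of the preceding line. As an alternative that avoids a multi-line regex altogether, here is a minimal sketch that works line by line. It assumes the title is always the line directly above the one starting with "Uploaded on"; the names lines, idx and title are illustrative.

```r
a <- "\n    Update Your Profile to Dissolve This Message\nSocial Media Learning and behaviour\n        Uploaded on May 3, 2020 at 10:56 in Research\n            View Forum\n        \n"

# Split on literal newlines and strip leading/trailing whitespace from each line.
lines <- trimws(strsplit(a, "\n", fixed = TRUE)[[1]])

# Locate the line that starts with "Uploaded on" and take the line before it.
idx <- grep("^Uploaded on", lines)
title <- lines[idx - 1]
title
```

If the input could contain several "Uploaded on" lines, idx becomes a vector and title holds one entry per occurrence.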
Ronak Shah