
I need to extract the first sentence from every paragraph in a written text. I also need to preserve the paragraph structure so that the first sentence is its own paragraph.

I need to use R for this one.

I know I probably need some kind of loop, but I don't know how to write it.

Thanks a lot, guys.

  • Welcome to SO, Imran Luqman! Please make this question *reproducible*. This includes sample code you've attempted (including listing non-base R packages, and any errors/warnings received), sample *unambiguous* data (e.g., `data.frame(x=...,y=...)` or the output from `dput(head(x))`), and intended output given that input. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Nov 01 '21 at 01:19
  • Thank you, @r2evans. I only want to keep the first sentence of each paragraph in a text. For example: 1st paragraph: My name is xxx. I am 100 years old. I have a house. 2nd paragraph: I have a wife. She is 90 years old. She has a mansion. My desired output would be: 1st para: My name is xxx. 2nd para: I have a wife. – Imran Luqman Nov 01 '21 at 01:29

1 Answer


Suppose that sentences are separated by `.` and paragraphs are separated by `\n`. For example,

dummy <- c("first sentence. blablabla.
       first sentence2. blablablabblah.")

Then by using stringr::str_split,

library(stringr)  # provides str_split(); install once with install.packages("stringr")
sapply(str_split(dummy, "\n", simplify = TRUE), function(x) str_split(x, "\\.", simplify = TRUE)[1])

You can get

          first sentence. blablabla.            first sentence2. blablablabblah. 
                    "first sentence"                "           first sentence2" 

If your input is already a vector of paragraphs,

dummy <- c("first sentence. blablabla.","first sentence2. blablablabblah.")
sapply(dummy, function(x) str_split(x, "\\.", simplify = TRUE)[1])

   first sentence. blablabla. first sentence2. blablablabblah. 
             "first sentence"                "first sentence2" 

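For readers without stringr, the same idea can be sketched in base R. This is only a sketch under the same assumptions as above (sentences end with `.`, paragraphs are separated by `\n`); the variable names `paragraphs`, `pieces`, and `first` are illustrative. Base `strsplit()` has no `simplify` argument, so the first piece is picked out of each list element instead, and `trimws()` removes the leading indentation seen in the first output.

```r
# Base R sketch: first sentence of each paragraph, without stringr.
dummy <- "first sentence. blablabla.\n       first sentence2. blablablabblah."

# Split the text into paragraphs at each line break.
paragraphs <- strsplit(dummy, "\n", fixed = TRUE)[[1]]

# Split each paragraph at literal periods; fixed = TRUE avoids regex escaping.
pieces <- strsplit(paragraphs, ".", fixed = TRUE)

# Take the first piece of each paragraph and strip surrounding whitespace.
first <- trimws(vapply(pieces, function(p) p[1], character(1)))
first
```

As with the stringr version, the period itself is dropped by the split.
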
Code for your text.

dummy <- c("Now, I truly understand that because it's an election season expectations for what we will achieve this year is really low. But, Mister Speaker, I appreciate the very constructive approach that you and other leaders took at the end of last year to pass a budget and make tax cuts permanent for working families." , "So I hope we can work together this year on some priorities like criminal justice reform.So, who knows, we might surprise the cynics again.")

lapply(dummy, function(x) str_split(x, "\\.", simplify = TRUE)[1])

[[1]]
[1] "Now, I truly understand that because it's an election season expectations for what we will achieve this year is really low"

[[2]]
[1] "So I hope we can work together this year on some priorities like criminal justice reform"

unlist(lapply(dummy, function(x) str_split(x, "\\.", simplify = TRUE)[1]))
[1] "Now, I truly understand that because it's an election season expectations for what we will achieve this year is really low"
[2] "So I hope we can work together this year on some priorities like criminal justice reform" 
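
Splitting on `.` drops the period, and paragraphs that span several lines can give messy results. A possible variation, sketched here in base R, flattens the whitespace first and then keeps everything up to and including the first period. The helper name `first_sentence` is hypothetical (it is not part of stringr or base R), and the sample text is the example from the question's comments with a line break added.

```r
# Hypothetical helper: first sentence of each paragraph, keeping the period.
first_sentence <- function(paragraphs) {
  # Collapse line breaks and runs of whitespace so that a paragraph
  # spread over several lines behaves like a single line.
  flat <- trimws(gsub("[[:space:]]+", " ", paragraphs))
  # Keep everything up to and including the first period.
  # A paragraph without any period is returned unchanged.
  sub("^([^.]*[.]).*$", "\\1", flat)
}

text <- c(
  "My name is xxx. I am 100 years old.\n   I have a house.",
  "I have a wife. She is 90 years old. She has a mansion."
)

result <- first_sentence(text)

# Print one first sentence per paragraph, separated by a blank line.
writeLines(paste(result, collapse = "\n\n"))
```

This still treats every `.` as the end of a sentence, so abbreviations such as "Mr." or decimal numbers would cut the sentence short.
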
Park
  • Thank you very much, @Park. But I had an error saying simplify=T is an unused argument. If I deleted that one, the output will be: $`first sentence. blablabla.` [1] "first sentence" " blablabla" $`first sentence2. blablablabblah.` [1] "first sentence2" " blablablabblah" – Imran Luqman Nov 01 '21 at 03:59
  • @ImranLuqman Did you use `str_split` or `strsplit`? I'm sorry if this code is confusing. Could you try `sapply(stringr::str_split(dummy, "\n", simplify = TRUE), function(x) stringr::str_split(x, "\\.", simplify = TRUE)[1])`? – Park Nov 01 '21 at 04:01
  • If the error still occurs with either snippet when using `stringr::str_split`, please let me know. – Park Nov 01 '21 at 04:03
  • Thanks for getting back to me so soon. First, I used str_split, but they couldn't recognise the function. So, I used strsplit. It worked, but then the error message for the simplify=T came up. I'll give it a go with the new code and will update you. – Imran Luqman Nov 01 '21 at 04:05
  • @ImranLuqman I'm really sorry for the unclear explanation. You need the `stringr` package. Try `install.packages("stringr")` and then load the package with `library(stringr)`. – Park Nov 01 '21 at 04:07
  • Hey @Park. It's somewhat successful. But, I found out that the written text has to be in 1 straight line. My text example was on multiple lines. So, they gave me a mad output. Is there any way to rectify this? – Imran Luqman Nov 01 '21 at 04:15
  • @ImranLuqman If your example is not too long, can you paste that text into your question? Then I can adapt the code for your text. – Park Nov 01 '21 at 04:17
  • text <- c("Now, I truly understand that because it's an election season expectations for what we will achieve this year is really low. But, Mister Speaker, I appreciate the very constructive approach that you and other leaders took at the end of last year to pass a budget and make tax cuts permanent for working families.", "So I hope we can work together this year on some priorities like criminal justice reform.So, who knows, we might surprise the cynics again.") – Imran Luqman Nov 01 '21 at 04:38
  • Thanks so much for your help @Park. I really appreciate it. So, for the first paragraph, I would like to have "Now, I truly understand that because it's an election season expectations for what we will achieve this year is really low." For the second paragraph: "So I hope we can work together this year on some priorities like criminal justice reform." – Imran Luqman Nov 01 '21 at 04:39
  • @ImranLuqman I didn't realize that each of your paragraphs is this long. Instead of `sapply`, `lapply` will give you a better result. I have added code for your text. – Park Nov 01 '21 at 04:45
  • Yeah, and that's just a small part of the text. There are more paragraphs. Thanks again, @Park – Imran Luqman Nov 01 '21 at 04:59
  • Hey @Park. Sorry, I didn't notice that you had added the code. It works perfectly! Thank you for your help. Appreciate it. – Imran Luqman Nov 01 '21 at 11:46