I have a text file with the following pattern:
Prof. Imperdiet montes, metus elementum eleifend eget eget adipiscing augue.
Abstract title: Lorem ipsum dolor sit amet, consectetuer adipiscing
A, nec, quam eleifend quis, magnis sit pretium. leo augue. amet, elit. vel
Vel, dis eget nascetur justo. imperdiet consequat et sit Nam Aenean a, Quisque
Enim. a, dui. Aenean lorem Phasellus commodo quis, pretium ultricies nascetur
tincidunt. sem. vitae,
montes, tellus. amet, venenatis natoque enim. fringilla
quis, vitae, Aenean Etiam viverra ipsum dapibus ut elementum Aenean Lorem eget,
nisi mollis Curabitur Quisque Aenean rhoncus sociis justo, sem. justo, vel
Aenean ultricies nec, eu laoreet.
Dr. Enim. vitae, feugiat in, Aenean
Abstract title: Massa. sociis dis dapibus dolor semper ipsum
jalor
Semper tincidunt. ullamcorper commodo magnis viverra pede elit. eget aliquet
eleifend vel, eleifend feugiat pede Vivamus ridiculus vitae, a, ligula, et Nulla
ligula vulputate ac, nisi. enim dapibus. Donec metus In sit dolor Nam ultricies
imperdiet. pellentesque Cras eu, massa quis porttitor parturient varius ut,
Phasellus arcu. pretium. quam augue. eu, adipiscing felis, enim. ante,
vulputate Integer dui. ultricies a, dictum rutrum. Nullam nec, quis,
consequat Cum tellus. dis felis dolor. nulla Aliquam Donec massa. justo. in,
nascetur
Semper tincidunt. ullamcorper commodo magnis viverra pede elit. eget aliquet
eleifend vel, eleifend feugiat pede Vivamus ridiculus vitae, a, ligula, et Nulla
Dr. Justo. nisi elementum ante, Donec Aenean Nulla
Abstract title:
Aenean consectetuer leo penatibus eget imperdiet nisi. consequat
lorem pretium mus.
Prof. Dr. Aliquam metus semper
Abstract title: Aliquet augue. amet, enim ut justo, nec, eleifend lorem enim. nisi. ipsum
eleifend
More information will be available soon.
I want to extract these parts:
Abstract title: Lorem ipsum dolor sit amet, consectetuer adipiscing
Abstract title: Massa. sociis dis dapibus dolor semper ipsum jalor
Abstract title:
and
Abstract title: Aliquet augue. amet, enim ut justo, nec, eleifend lorem enim. nisi. ipsum eleifend More information will be available soon.
I found these questions helpful:
- Regex JS: Matching string between two strings including newlines
- Regular Expression to find a string included between two characters while EXCLUDING the delimiters
but my regex (?<=(Abstract title:))(.*)(?=\n{2})
returns only
Abstract title: Lorem ipsum dolor sit amet, consectetuer adipiscing
and
Abstract title:
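To show what I think is going wrong, here is a minimal sketch in Python (used only for illustration; it is not one of the tools I listed). It assumes that each title block in the real file ends with a blank line, which the `\n{2}` lookahead relies on but which may not be visible in the excerpt above. The shortened `SAMPLE` text and the `extract_titles` helper are hypothetical. My guess is that `.` does not match newlines by default, so the original pattern only finds titles that fit on one line; letting `.` cross newlines, matching lazily, and also accepting end of input seems to pick up the multi-line titles.

```python
import re

# Hypothetical, shortened sample. Assumption: a blank line ends each title
# block, and the last title simply runs to the end of the file.
SAMPLE = (
    "Prof. Imperdiet montes\n"
    "Abstract title: Lorem ipsum dolor sit amet\n"
    "\n"
    "A, nec, quam eleifend quis.\n"
    "\n"
    "Dr. Enim. vitae\n"
    "Abstract title: Massa. sociis dis dapibus\n"
    "jalor\n"
    "\n"
    "Semper tincidunt.\n"
    "\n"
    "Dr. Justo. nisi\n"
    "Abstract title:\n"
    "\n"
    "Aenean consectetuer leo.\n"
    "\n"
    "Prof. Dr. Aliquam\n"
    "Abstract title: Aliquet augue.\n"
    "eleifend\n"
    "More information will be available soon.\n"
)

# The pattern from the question: "." stops at "\n", so a title that wraps
# onto a second line is never followed by "\n\n" and is skipped.
ORIGINAL = re.compile(r"(?<=(Abstract title:))(.*)(?=\n{2})")

# Adjusted sketch: DOTALL lets "." cross newlines, "*?" stops at the first
# blank line, and "\Z" covers a title that runs to the end of the input.
ADJUSTED = re.compile(r"Abstract title:.*?(?=\n{2}|\Z)", re.DOTALL)


def extract_titles(text):
    """Return each title block with its line breaks collapsed to spaces."""
    return [" ".join(m.group(0).split()) for m in ADJUSTED.finditer(text)]


if __name__ == "__main__":
    for title in extract_titles(SAMPLE):
        print(title)
```

If the real file separates blocks differently (for example with Windows line endings or no blank lines at all), the lookahead would need to change accordingly.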
Also, I am not sure which tool would be most efficient for this: awk, shell, or R? Please forgive me if this is a noob question; I am open to suggestions.