
I'm very new to coding and only know the basics. I'm using Python and trying to print everything between two sentences in a text. I only want the content between them, not what comes before or after. It's probably very easy, but I couldn't figure it out.

*Ev 39 Fursetfjellet (Oppdøl - Batnfjordsøra) No reports. Ev 134 Haukelifjell (Liamyrane bom - Fjellstad bom)* **Ev 134 Haukelifjell Hordaland / Telemark — Icy. 10 o'clock 1 degree. Valid from: 05.01.2020 13:53** *Rv 3 Kvikne (Tynset (Motrøa) - Ulsberg)*

I want to collect the bold text to use on a website later. Everything except the italic text (the sentences before and after) is dynamic, in case that matters.

  • Does this answer your question? [How to extract the substring between two markers?](https://stackoverflow.com/questions/4666973/how-to-extract-the-substring-between-two-markers) – wjandrea Jan 05 '20 at 17:35
  • Also related: [Get a string after a specific substring](https://stackoverflow.com/q/12572362/4518341) – wjandrea Jan 05 '20 at 17:37
  • Welcome to Stack Overflow! Check out the [tour] and [ask]. – wjandrea Jan 05 '20 at 17:37

2 Answers


This looks like a job for regular expressions, which Python provides through the re module.

You should:

  • Open the file
  • Read its contents into a variable
  • Use the search function from the re module (match only looks at the start of the string)

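In particular, in the last step you should use your "surrounding" strings as delimiters and capture everything between them. You can achieve this with a regex pattern like re.escape(str1) + "(.*)" + re.escape(str2). The re.escape calls matter here because your surrounding strings contain parentheses, which are special characters in a regex.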

You can take a look at the regex documentation, but just to give you an idea:

  • ".*" matches any sequence of characters (except newlines, by default)
  • "()" lets you capture whatever is matched inside the parentheses and access it later by index (e.g. re.search(pattern, original_string).group(1))
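The steps above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the helper name extract_between is made up for this example, the sample text is copied from the question, and the non-greedy "(.*?)" plus the re.DOTALL flag are assumptions added so the match stops at the first closing delimiter and can span line breaks.

```python
import re


def extract_between(text, start, end):
    """Return the text between `start` and `end`, or None if not found.

    `extract_between` is a hypothetical helper name used for illustration.
    """
    # re.escape makes the delimiters literal, so characters such as
    # parentheses are not interpreted as regex syntax.
    pattern = re.escape(start) + "(.*?)" + re.escape(end)
    # re.DOTALL lets "." match newlines too (an assumption about the input).
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        return None
    return match.group(1).strip()


full_text = (
    "Ev 39 Fursetfjellet (Oppdøl - Batnfjordsøra) No reports. "
    "Ev 134 Haukelifjell (Liamyrane bom - Fjellstad bom) "
    "Ev 134 Haukelifjell Hordaland / Telemark — Icy. 10 o'clock 1 degree. "
    "Valid from: 05.01.2020 13:53 "
    "Rv 3 Kvikne (Tynset (Motrøa) - Ulsberg)"
)
str1 = "Ev 134 Haukelifjell (Liamyrane bom - Fjellstad bom)"
str2 = "Rv 3 Kvikne (Tynset (Motrøa) - Ulsberg)"

between = extract_between(full_text, str1, str2)
print(between)
```

If the text comes from a file, reading it with open(path, encoding="utf-8").read() and passing the result as the first argument should work the same way.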
wjandrea
albestro

You can use split to cut the string and access the parts that you are interested in.

If you know how to get the full text already, it's easy to get the bold sentence by removing the two constant sentences before and after.

full_text = "Ev 39 Fursetfjellet (Oppdøl - Batnfjordsøra) No reports. Ev 134 Haukelifjell (Liamyrane bom - Fjellstad bom) Ev 134 Haukelifjell Hordaland / Telemark — Icy. 10 o'clock 1 degree. Valid from: 05.01.2020 13:53 Rv 3 Kvikne (Tynset (Motrøa) - Ulsberg)"
s1 = "Ev 39 Fursetfjellet (Oppdøl - Batnfjordsøra) No reports. Ev 134 Haukelifjell (Liamyrane bom - Fjellstad bom)"
s2 = "Rv 3 Kvikne (Tynset (Motrøa) - Ulsberg)"

bold_text = full_text.split(s1)[1] # Remove the left part.
bold_text = bold_text.split(s2)[0] # Remove the right part.
bold_text = bold_text.strip()      # Clean up spaces on each side if needed.
print(bold_text) 
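One caveat with the split approach: if either constant sentence is missing from the text, split(s1)[1] raises an IndexError, and a missing s2 silently returns everything after s1. A possible variant, sketched here with str.partition, returns None in those cases instead. The function name between_markers is hypothetical and only for illustration.

```python
def between_markers(text, before, after):
    """Return the stripped text between `before` and `after`, or None.

    `between_markers` is a hypothetical helper name for this sketch.
    """
    # partition returns (head, separator, tail); the separator is an
    # empty string when it was not found in the text.
    _, found_before, rest = text.partition(before)
    if not found_before:
        return None
    middle, found_after, _ = rest.partition(after)
    if not found_after:
        return None
    return middle.strip()


print(between_markers("start MIDDLE end", "start", "end"))
```

Called as between_markers(full_text, s1, s2) with the variables above, it should give the same result as the split version when both sentences are present.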
Guimoute