2

I have a string wrapped in <Key> tags, and I need to get the text inside the tags.

string = "<Key>big_img/1/V071-e.jpg</Key>"

How can I get "big_img/1/V071-e.jpg"?

cmashinho

3 Answers

2

Using regular expressions:

import re

s = "<Key>big_img/1/V071-e.jpg</Key>"

re.findall(r"<Key>(.*)</Key>",s)
['big_img/1/V071-e.jpg']
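
One thing to watch with this pattern: (.*) is greedy, so if a string holds several <Key> elements on one line it would capture everything between the first <Key> and the last </Key>. A minimal sketch of a non-greedy variant follows; the helper name extract_key_texts and the two-tag sample string are assumptions made for illustration, not part of the answer above.

```python
import re


def extract_key_texts(text):
    """Return the text inside every <Key>...</Key> pair in `text`.

    Hypothetical helper, named here only for illustration.
    """
    # (.*?) is non-greedy, so each match stops at the nearest </Key>.
    # re.DOTALL lets the captured text span line breaks.
    return re.findall(r"<Key>(.*?)</Key>", text, flags=re.DOTALL)


single = "<Key>big_img/1/V071-e.jpg</Key>"
several = "<Key>big_img/1/V071-e.jpg</Key><Key>big_img/1/V072-e.jpg</Key>"

print(extract_key_texts(single))
print(extract_key_texts(several))
```

With a single tag the result is the same as before; with several tags each value comes back as its own list item.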
ODiogoSilva
0

The simplest solution:

string.strip()[5:-6]

This works for a string of any length, provided it starts with <Key> and ends with </Key> once surrounding whitespace is removed.

It works because:

  • strip() removes any leading and trailing whitespace characters
  • <Key> always occupies the first 5 chars of the string (indexes 0 to 4), so the slice starts at index 5, which is the 6th char because sequence/string indexes are 0-based
  • </Key> always occupies the last 6 chars of the string, so the slice stops 6 chars before the end
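
The slice silently returns the wrong text if the input is not actually wrapped in the tags. A small sketch of the same idea with a check added is shown here; the function name strip_key_tags and the choice to raise ValueError are assumptions for illustration only.

```python
def strip_key_tags(text):
    """Return the text between a leading <Key> and a trailing </Key>.

    Hypothetical helper: same slicing idea, plus a sanity check.
    """
    open_tag, close_tag = "<Key>", "</Key>"
    cleaned = text.strip()  # drop leading/trailing whitespace first
    if not (cleaned.startswith(open_tag) and cleaned.endswith(close_tag)):
        raise ValueError("expected a string wrapped in <Key>...</Key>")
    # len(open_tag) is 5 and len(close_tag) is 6, i.e. the [5:-6] slice.
    return cleaned[len(open_tag):-len(close_tag)]


print(strip_key_tags("  <Key>big_img/1/V071-e.jpg</Key>\n"))
```

Deriving the offsets from the tag strings avoids hard-coding 5 and 6 if the tag name ever changes.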
Zach Young
Klaus D.
0

Use Python's xml.etree.ElementTree module to parse the XML. If your data looks something like:

<root>
    <Key>big_img/1/V071-e.jpg</Key>
    <Key>big_img/1/V072-e.jpg</Key>
    <Key>big_img/1/V073-e.jpg</Key>
    <Key>...</Key>
</root>

First, parse your data:

from xml.etree import ElementTree

# To parse the data from a string (data_string holds the XML text).
doc = ElementTree.fromstring(data_string)

# Or, to parse the data from a file instead (use one or the other).
doc = ElementTree.parse('data.xml')

Then, read and print out the text from each <Key>:

for key_element in doc.findall('Key'):
    print(key_element.text)

This should output:

big_img/1/V071-e.jpg
big_img/1/V072-e.jpg
big_img/1/V073-e.jpg
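
Put together as one runnable piece, the approach might look like the sketch below. The inline data_string is an assumption that mirrors the sample document without its elided "..." entry, and the last lines cover the single-tag string from the question, where <Key> is itself the root element.

```python
from xml.etree import ElementTree

# Sample data assumed for illustration; it mirrors the document above.
data_string = """<root>
    <Key>big_img/1/V071-e.jpg</Key>
    <Key>big_img/1/V072-e.jpg</Key>
    <Key>big_img/1/V073-e.jpg</Key>
</root>"""

root = ElementTree.fromstring(data_string)

# findall('Key') returns the direct <Key> children of the root element.
keys = [key_element.text for key_element in root.findall("Key")]
for key in keys:
    print(key)

# For the single-tag string from the question, <Key> is the root itself,
# so its text is available directly.
single_text = ElementTree.fromstring("<Key>big_img/1/V071-e.jpg</Key>").text
print(single_text)
```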
Uyghur Lives Matter