2

I have a dictionary:

{'...': '...',
'unite_fonctionnelle': 'text to keeep',
'...': '...'}

In this case the dictionary is fine, but sometimes the value of unite_fonctionnelle needs to be cleaned. The syntax is always the same:

{'...': '...',
'unite_fonctionnelle' : 'asjbzjqnfknqfsnf<Run>text to keeep</Run>qsdqsdqsdqsfqsfegdhfkdnh',
'...': '...'}

When the tags <Run> and </Run> are present in the value of unite_fonctionnelle, I want to keep only the text between the two tags. How can I do this with split()?

Howins

3 Answers

2

You can simply use a regex here.

Try this:

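import re

for key, value in dic.items():
    # Non-greedy (.*?) stops at the first </Run> instead of the last one
    match = re.search(r'<Run>(.*?)</Run>', str(value))
    if match:  # skip values without a complete <Run>...</Run> pair
        dic[key] = match.group(1)  # keep only the text between the tags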
1

Given your dictionary d:

for k, v in d.items():
    if '<Run>' in v:
        d[k] = v.split('<Run>')[1].split('</Run>')[0]
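If a value might contain <Run> without a matching </Run>, or might not be a string at all, the same idea can be wrapped in a small helper. This is only a sketch under those assumptions: the name extract_run and the key other_key are hypothetical, and it uses str.partition rather than split() so that a missing tag is easy to detect.

```python
def extract_run(text):
    """Return the text between <Run> and </Run>, or text unchanged if a tag is missing."""
    # partition() returns (before, separator, after); separator is '' when not found
    _, open_tag, rest = text.partition('<Run>')
    inner, close_tag, _ = rest.partition('</Run>')
    return inner if open_tag and close_tag else text


d = {
    'unite_fonctionnelle': 'asjbzjqnfknqfsnf<Run>text to keeep</Run>qsdqsdqsdqsfqsfegdhfkdnh',
    'other_key': 'text without tags',  # hypothetical extra entry for illustration
}

# Build a cleaned copy; non-string values are passed through untouched
cleaned = {k: extract_run(v) if isinstance(v, str) else v for k, v in d.items()}
print(cleaned['unite_fonctionnelle'])
```

Building a new dictionary instead of assigning inside the loop leaves the original d untouched, which may or may not be what you want.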
Synthase
0

This is my approach.

import re

dic = {
    'unite_fonctionnelle': 'asjbzjqnfknqfsnf<Run>text to keeep</Run>qsdqsdqsdqsfqsfegdhfkdnh'
}

for key, val in dic.items():
    # Only rewrite values that contain both the opening and the closing tag
    if all(tag in val for tag in ('<Run>', '</Run>')):
        dic[key] = re.split('<Run>|</Run>', val)[1]
Buddy Bob