2

How in below example get NUM or CODE

lines = filename.split('\n')
    for line in lines:
        print("CODE is:" + ???);
        #CODE is: 1111
        #CODE is: 2222
        #CODE is: 3333

file

<FROM ID="1610350010160016">
      <PR NUM="1" CODE="1111" />
      <PR NUM="2" CODE="2222" />
      <PR NUM="3" CODE="3333" />
    </FROM>

How get NUM and CODE from abowe example?

Pointer
  • 2,123
  • 3
  • 32
  • 59

3 Answers3

4

This is one way using regex:

import re
lines = filename.split('\n')
    for line in lines:
        print('NUM is : {}'.format(''.join(re.findall(r'NUM="(\w+)', line))), end=" ")
        print('CODE is : {}'.format(''.join(re.findall(r'CODE="(\w+)', line))))

'''
NUM is : 1  CODE is : 1111
NUM is : 2  CODE is : 2222
NUM is : 3  CODE is : 3333
'''
Austin
  • 25,759
  • 4
  • 25
  • 48
  • Tnx for help, any solution when I try get DATE="10.10.2017" only get number 10 how get whole text --> 10.10.2017 ? – Pointer Mar 12 '18 at 13:12
  • Try `re.findall(r'DATE="(\d+\.\d+\.\d+)', line)` and let me know if it works. – Austin Mar 12 '18 at 13:17
  • Work fine, any web page where I can all example. GL – Pointer Mar 12 '18 at 13:28
  • If you are looking to go deep into `regex`, I recommend a start from here: https://docs.python.org/2/library/re.html#regular-expression-syntax Happy coding! – Austin Mar 12 '18 at 13:36
4

You can use the standard library (Python 2.5+):

import xml.etree.ElementTree as ET

for element in ET.parse('data.xml').getroot():
    print(element.attrib['NUM'], element.attrib['CODE'])

where data.xml contains the XML data.

If you have the data in a string, you can use the string parser instead:

ET.fromstring('Your XML data')

This solution doesn't use regex and is very compact and readable, and it works with any data type (for example, dates in the format 'gg.mm.aaaa').

Daniele Murer
  • 247
  • 3
  • 9
3

You seem to have XML data, not just text data. Please use the right tool for the job. XML should be read and interpreted using XML tools, not text tools like Regex or plain text parsing. Even if the regex answer works in this example, it is not right!

XML stands for EXTENSIBLE Markup Language. This means that the XML file content should be EXTENSIBLE without breaking the processing of the XML file. If you hard code the reading with regex or plain text, your integration will break when the XML changes. With xml tools used correctly this is not a problem. Of course you could count on that nothing will change in the future, but this has many times proven to be wrong assumption.

Odoo uses XML heavily. Odoo uses XML the correct way. When coding in Odoo, please also use XML correctly.

In Daniele Murer's answer you can find the right way to do it. Please use it.

For more information on processing xml in python, take a look at https://www.tutorialspoint.com/python3/python_xml_processing.htm, or search for "python xml" in Google.

Veikko
  • 3,372
  • 2
  • 21
  • 31
  • For a good dark humor answer why not use regex for xml/xhtml, please read this stackoverflow answer: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454. ps. Don't take this comment too seriously. The answer itself is serious. ;-) – Veikko Mar 12 '18 at 21:10
  • Tnx for clear answer. Do you have any good tools or example for my request? If you have please put here. GL – Pointer Mar 13 '18 at 14:51
  • The example https://stackoverflow.com/a/49235513/1818279 works for me perfectly. Place your xml document in file "data.xml". No other tools than Python and import xml needed. Happy coding :-) – Veikko Mar 13 '18 at 17:33