I have a .txt file of data, and the values highlighted in yellow need to be saved to a new .txt file (image: Data in text file).

I managed to print certain sections in Python, but that's about it:

with open('Podatki-zima-MEDVES.txt', mode='r+t') as file:
    for line in file:
        print(line[18:39])

Resulting in:

 EntryDate="20101126" 
 EntryDate="20101126"
 EntryDate="20101126"
 EntryDate="20101126"
EntryDate="20101127" 
EntryDate="20101128" 
 EntryDate="20101128"
 EntryDate="20101128"
 EntryDate="20101128"

I know it's a very basic question, but for someone experienced this wouldn't take a minute. Thanks

J Arun Mani
Wyvern

2 Answers

It looks like you're trying to parse XML data.

There is a standard library package that can do this. The documentation is pretty good and it includes a tutorial. Take a look at The ElementTree XML API.

In your case the code would look something like:

import xml.etree.ElementTree as ET

data = """
<data>
  <ROW EntryDate="20101126" SnowDepth="4"/>
  <ROW EntryDate="20101127" SnowDepth="8"/>
</data>"""

# Parse the string and loop over the <ROW> children of the root element
root = ET.fromstring(data)

for child in root:
    entries = child.attrib
    print(entries["EntryDate"], entries["SnowDepth"])

This gives the output you're looking for:

20101126 4
20101127 8
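
Since the question asks for the values to end up in a new text file, here is a minimal sketch of that last step. It assumes the raw file is just a run of `<ROW ... />` lines with no root element and no XML declaration, so a temporary root is wrapped around it before parsing. The helper names (`extract_rows`, `rows_to_text`, `save_rows`) are made up for illustration.

```python
import xml.etree.ElementTree as ET


def extract_rows(raw_text):
    """Return (EntryDate, SnowDepth) pairs from a run of <ROW .../> elements."""
    # Assumption: the raw text has no single root element, so add one here.
    root = ET.fromstring(f"<data>{raw_text}</data>")
    return [(row.get("EntryDate"), row.get("SnowDepth")) for row in root.iter("ROW")]


def rows_to_text(rows):
    """Format the pairs as one 'date depth' line per row."""
    return "".join(f"{date} {depth}\n" for date, depth in rows)


def save_rows(rows, out_path):
    """Write the formatted rows to a new text file."""
    with open(out_path, "w", encoding="utf-8") as out:
        out.write(rows_to_text(rows))


# Two sample lines in the shape of the question's data
sample = '''
<ROW RowState="5" EntryDate="20101126" Entry="" SnowDepth="4" />
<ROW RowState="13" EntryDate="20101126" Entry="Prvi sneg to zimo" SnowDepth="10" />
'''
rows = extract_rows(sample)
print(rows_to_text(rows), end="")
```

For the real data you would read the contents of 'Podatki-zima-MEDVES.txt' into a string, pass it to `extract_rows`, and call `save_rows` with an output name of your choice. The file's encoding is not known from the question, so UTF-8 is only a guess.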
Sam Broster
    After fiddling around with the module I managed to import the required data, and also edit it in Excel. Thanks for the help! – Wyvern Jan 20 '20 at 12:32
As an alternative to ElementTree, you could use an Expat parser for your structured markup data.

You first need to add an XML declaration and wrap a top-level element around your data, as follows:

<?xml version="1.0"?>
<podatki>
<ROW RowState="5" EntryDate="20101126" Entry="" SnowDepth="4" />
<ROW RowState="13" EntryDate="20101126" Entry="Prvi sneg to zimo" SnowDepth="10" />
</podatki>
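
If you would rather not edit the file by hand, the wrapping could be done with a few lines of Python. This is only a sketch: the helper names `wrap_rows` and `wrap_file` are hypothetical, and UTF-8 is an assumed encoding for the original file.

```python
def wrap_rows(raw_text, root_tag="podatki"):
    """Add an XML declaration and a single root element around bare <ROW/> lines."""
    return f'<?xml version="1.0"?>\n<{root_tag}>\n{raw_text.strip()}\n</{root_tag}>\n'


def wrap_file(src_path, dst_path, encoding="utf-8"):
    """Read the raw rows from src_path and write a well-formed XML file to dst_path."""
    # The encoding is an assumption; adjust it to match the real file.
    with open(src_path, encoding=encoding) as src:
        wrapped = wrap_rows(src.read())
    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write(wrapped)


# One sample line in the shape of the data above
sample = '<ROW RowState="5" EntryDate="20101126" Entry="" SnowDepth="4" />\n'
wrapped = wrap_rows(sample)
print(wrapped, end="")
```

Calling `wrap_file('Podatki-zima-MEDVES.txt', 'podatki.xml')` would then produce the file that the parser code reads.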

Then you could use an expat parser.

import xml.parsers.expat

def podatki(name, attrs):
    if name == "ROW":
        print(f'EntryDate={attrs["EntryDate"]},', 
              f'SnowDepth={attrs["SnowDepth"]}')

parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = podatki

with open('podatki.xml', 'rb') as input_file:
    parser.ParseFile(input_file)

The result should be

EntryDate=20101126, SnowDepth=4
EntryDate=20101126, SnowDepth=10
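
To save the values instead of printing them, the handler can append to a list that is written out once parsing finishes. The following is a sketch under the same assumptions as above (a wrapped document with `ROW` elements); `collect_rows` and `write_rows` are illustrative names, and it parses a string with `Parse` rather than a file with `ParseFile` so that it runs without any input file.

```python
import xml.parsers.expat


def collect_rows(xml_text):
    """Return (EntryDate, SnowDepth) pairs for every ROW element in xml_text."""
    rows = []

    def on_start(name, attrs):
        # Called by expat for each opening tag; keep only the ROW elements.
        if name == "ROW":
            rows.append((attrs["EntryDate"], attrs["SnowDepth"]))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)  # True marks this as the final chunk of input
    return rows


def write_rows(rows, out_path):
    """Write one 'date depth' line per row to a new text file."""
    with open(out_path, "w", encoding="utf-8") as out:
        for date, depth in rows:
            out.write(f"{date} {depth}\n")


sample = """<?xml version="1.0"?>
<podatki>
<ROW RowState="5" EntryDate="20101126" Entry="" SnowDepth="4" />
<ROW RowState="13" EntryDate="20101126" Entry="Prvi sneg to zimo" SnowDepth="10" />
</podatki>
"""
rows = collect_rows(sample)
text = "".join(f"{date} {depth}\n" for date, depth in rows)
print(text, end="")
```

The output path passed to `write_rows` is up to you; no file name for the result is given in the question.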
Dima Chubarov