In Scrapy, I wrote a custom pipeline that uses lxml to generate XML in the format I need. The XML is generated, but there is a bug: each new group (items) in the list overwrites the previous one. That is, regardless of the len() of the list, only one group (items) ends up in the saved file. The code is below. Can someone help me?
P.S.: Although it covers a similar topic, this is not a duplicate of the question "How do you append to a file in Python?", because there are extra quirks here, such as preserving the XML header and footer.
# -*- coding: utf-8 -*-
from yt.items import Lista
import lxml.etree
import lxml.builder


class ytXmlPipeline(object):

    def process_item(self, item, spider):
        E = lxml.builder.ElementMaker()
        ITEMS = E.items
        CHANNEL = E.channel
        TITLE = E.title
        LOGO = E.logo_30x30
        SINOPSE = E.description
        STREAM = E.stream_url
        lista = ITEMS(
            CHANNEL(
                TITLE('<![CDATA['+item["title"]+']]>'),
                LOGO('<![CDATA['+item["logo_30x30"]+']]>'),
                SINOPSE('<![CDATA[<center><img height="254" width="200" src="'+item["logo_30x30"]+'"/><p>'+item["description"]+'</p></center>]]>'),
                STREAM('<![CDATA['+item["stream_url"]+']]>'),
            )
        )
        # create a new XML file with the results
        mydata = lxml.etree.tostring(lista, encoding='utf-8', pretty_print=True, xml_declaration=True, method="xml")
        # lxml escapes the hand-written CDATA markers, so undo the escaping
        mydata = mydata.replace('&lt;', '<').replace('&gt;', '>')
        myfile = open("ytLista.xml", "w")
        myfile.write(mydata)
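
For reference, here is a minimal sketch of one way the overwrite might be avoided. It is not a definitive fix. The pipeline above opens "ytLista.xml" in "w" mode on every process_item call, so each item replaces the file. The sketch instead keeps a single <items> root for the whole crawl, adds one <channel> per item, and writes the file once when the spider closes, which also leaves exactly one XML declaration and one closing tag. The class name YtXmlPipeline and the output_path attribute are hypothetical names introduced for the example. It also swaps the string replace for lxml.etree.CDATA, which is an assumption about what output is wanted.

```python
import lxml.etree as etree


class YtXmlPipeline(object):
    """Sketch: collect every item, then write the XML file once at the end."""

    # Hypothetical attribute; same file name as in the code above.
    output_path = "ytLista.xml"

    def open_spider(self, spider):
        # One shared <items> root for the whole crawl.
        self.root = etree.Element("items")

    def process_item(self, item, spider):
        # Append a new <channel> instead of rebuilding the whole document.
        channel = etree.SubElement(self.root, "channel")
        description = (
            '<center><img height="254" width="200" src="%s"/>'
            '<p>%s</p></center>' % (item["logo_30x30"], item["description"])
        )
        fields = [
            ("title", item["title"]),
            ("logo_30x30", item["logo_30x30"]),
            ("description", description),
            ("stream_url", item["stream_url"]),
        ]
        for tag, text in fields:
            # etree.CDATA emits a real CDATA section, so no un-escaping is needed.
            etree.SubElement(channel, tag).text = etree.CDATA(text)
        return item

    def close_spider(self, spider):
        # Serialize once: a single XML declaration and a single closing </items>.
        data = etree.tostring(
            self.root, encoding="utf-8", pretty_print=True, xml_declaration=True
        )
        with open(self.output_path, "wb") as f:
            f.write(data)
```

Scrapy calls open_spider and close_spider once per crawl and process_item once per item, so the file is only touched after every item has been added to the tree. The trade-off is that the whole tree stays in memory until the crawl finishes.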