I have an XML file that I want to load into a pandas DataFrame. I have done this with other XML files, but the structure of this one is unclear to me: every attempt throws an error, and I do not end up with the DataFrame I need. Any suggestions? Here is a sample of the file:

<tweet id="591154373696323584">
Diga cuanto nos van a costar las 
<sentiment polarity="N" entity="Partido_Popular" aspect="Economia">autovías</sentiment>
de sus amiguetes ¿4500 millones o más ? @EsperanzAguirre @PPopular
</tweet>


<tweet id="591154532362670080">
@lhermoso_ @sanchezcastejon 
<sentiment polarity="N" entity="Partido_Socialista_Obrero_Espanol" aspect="Propio_partido">#DobleMoral</sentiment>
 Castilla antes que Aragón...
</tweet>

I am using the code below.

import xml.etree.cElementTree as et
import pandas as pd

def getvalueofnode(node):
    """ return node text or None """
    return node.text if node is not None else None

def main():
    parsed_xml = et.parse("stompol-train-tagged.xml")
    dfcols = ['tweet id', 'tweet', 'sentiment_polarity', 'entity', 'aspect', 'sentiment']
    df_xml = pd.DataFrame(columns=dfcols)

    for node in parsed_xml.getroot():
        tweetid = node.attrib.get('tweetid')
        tweet = node.find('tweet')
        sentiment_polarity = node.find('polarity')
        entity = node.find('entity')
        aspect = node.find('aspect')
        sentiment = node.find('sentiment')


        df_xml = df_xml.append(
            pd.Series([tweetid, tweet, sentiment_polarity, entity, aspect, sentiment], index=dfcols),
            ignore_index=True)

    print(df_xml)

main()

The resulting DataFrame contains None instead of the values I expect.
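For reference, here is a minimal sketch of one way the sample could be parsed. It rests on a few assumptions: that the real file wraps all `<tweet>` elements in a single root element (the `<tweets>` wrapper and the helper name `tweets_to_rows` are hypothetical), and that one row per `<sentiment>` element is wanted. The likely cause of the None values is that `find()` looks up child elements by tag, while `polarity`, `entity` and `aspect` are attributes of `<sentiment>`, and the tweet id is stored in the attribute `id`, not `tweetid`. The sketch also builds the DataFrame from a list of dicts, because `DataFrame.append` was removed in pandas 2.0, and imports `xml.etree.ElementTree`, because `cElementTree` was removed in Python 3.9.

```python
import xml.etree.ElementTree as ET
import pandas as pd

# Assumption: <tweets> stands in for whatever root element the real file uses.
SAMPLE = """<tweets>
<tweet id="591154373696323584">
Diga cuanto nos van a costar las
<sentiment polarity="N" entity="Partido_Popular" aspect="Economia">autovías</sentiment>
de sus amiguetes ¿4500 millones o más ? @EsperanzAguirre @PPopular
</tweet>
<tweet id="591154532362670080">
@lhermoso_ @sanchezcastejon
<sentiment polarity="N" entity="Partido_Socialista_Obrero_Espanol" aspect="Propio_partido">#DobleMoral</sentiment>
 Castilla antes que Aragón...
</tweet>
</tweets>"""


def tweets_to_rows(root):
    """Return one dict per <sentiment> element (hypothetical helper)."""
    rows = []
    for tweet in root.iter("tweet"):
        # itertext() yields the tweet's own text, the text of its children,
        # and the tail text after each child; split/join collapses whitespace.
        full_text = " ".join("".join(tweet.itertext()).split())
        for sent in tweet.iter("sentiment"):
            rows.append({
                "tweet_id": tweet.get("id"),               # attribute of <tweet>
                "tweet": full_text,
                "sentiment_polarity": sent.get("polarity"),  # attributes of <sentiment>
                "entity": sent.get("entity"),
                "aspect": sent.get("aspect"),
                "sentiment": sent.text,                    # the tagged word or phrase
            })
    return rows


# For the real file: root = ET.parse("stompol-train-tagged.xml").getroot()
root = ET.fromstring(SAMPLE)
rows = tweets_to_rows(root)
df_xml = pd.DataFrame(rows)
print(df_xml)
```

Note that a tweet with no `<sentiment>` child produces no row in this sketch, and the ids stay as strings. Both choices are easy to change if the data calls for it.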

Wisdom258