I have an XML file like the one below, and I need to extract the values of all its elements with a Python script. I tried BeautifulSoup, but it is too slow and it prints the values together with their tags.

<item>
<g:id>1212</g:id>
<g:title>tile 1</g:title>
<g:description>description </g:description>
<g:gtin>426567836816</g:gtin>
<g:brand>Amazon</g:brand>
<g:mpn>6789</g:mpn>
<g:link>link </g:link>
<g:image_link>link.jpg</g:image_link>
<g:availability>in stock</g:availability>
<g:condition>new</g:condition>
<g:unit_pricing_base_measure>10 g</g:unit_pricing_base_measure>
<g:unit_pricing_measure>50 g</g:unit_pricing_measure>
<g:product_type>type 001</g:product_type>
<g:produkteinheit>50 g</g:produkteinheit>
<g:google_product_category/>
<g:price>250.1</g:price>
<g:shipping>
<g:country>xxx</g:country>
<g:service>DHL</g:service>
<g:price>50</g:price>
</g:shipping>
</item>

Below is the Python code using BeautifulSoup that I have tried. Is there a better option for this?

from bs4 import BeautifulSoup

with open('google-shopping-1.xml') as fp:
    soup = BeautifulSoup(fp, 'lxml')

    for item in soup.findAll('item'):
        id = item.find('g:id')
        title = item.find('g:title')
        description = item.find('g:description')
        gtin = item.find('g:gtin')
        brand = item.find('g:brand')
        mpn = item.find('g:mpn')
        link = item.find('g:link')
        image_link = item.find('g:image_link')
        availability = item.find('g:availability')
        condition = item.find('g:condition')
        unit_pricing_base_measure = item.find('g:unit_pricing_base_measure')
        unit_pricing_measure = item.find('g:unit_pricing_measure')
        product_type = item.find('g:product_type')
        produkteinheit = item.find('g:produkteinheit')
        price = item.find('g:price')
        print(id)
        #print(title)
        #print(description)
        #print(gtin)
        #print(brand)
        #print(mpn)
        #print(link)
        #print(image_link)
        #print(availability)
        #print(condition)
        #print(unit_pricing_base_measure)
        #print(unit_pricing_measure)
        #print(product_type)
        #print(produkteinheit)
        #print(price)
        #print("\n")

Is there a better solution than this? Ultimately I want to insert the data into my database (Postgres).

Current output: 12 20
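
One possible direction, offered only as a sketch: the standard library's `xml.etree.ElementTree.iterparse` streams the file one element at a time and gives direct access to each element's text, without the tags. The example assumes the real feed declares the `g:` prefix on its root element with the usual Google Shopping namespace URI (the snippet above does not show that declaration), and it uses a small inline sample in place of `google-shopping-1.xml`. The helper name `iter_items` and the `shipping_` key prefix are illustrative choices, not part of any library.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: the feed's root element binds the g: prefix to this URI.
G_NS = "http://base.google.com/ns/1.0"

# Shortened stand-in for google-shopping-1.xml (hypothetical wrapper elements).
SAMPLE = f"""<rss xmlns:g="{G_NS}" version="2.0"><channel>
<item>
<g:id>1212</g:id>
<g:title>tile 1</g:title>
<g:google_product_category/>
<g:price>250.1</g:price>
<g:shipping>
<g:country>xxx</g:country>
<g:service>DHL</g:service>
<g:price>50</g:price>
</g:shipping>
</item>
</channel></rss>"""


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of a tag."""
    return tag.split("}", 1)[-1]


def iter_items(source):
    """Yield one {name: text} dict per <item>, reading the input as a stream."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "item":
            continue
        row = {}
        for child in elem:  # direct children only
            name = local_name(child.tag)
            if len(child):  # nested element such as g:shipping
                for sub in child:
                    key = f"{name}_{local_name(sub.tag)}"
                    row[key] = (sub.text or "").strip()
            else:
                row[name] = (child.text or "").strip()
        yield row
        elem.clear()  # release the item's children once it has been read


# A file name or an open binary file can be passed instead of the BytesIO.
rows = list(iter_items(io.BytesIO(SAMPLE.encode("utf-8"))))
print(rows[0]["id"])  # prints 1212
```

Each row is a plain dict, so it could be handed to a PostgreSQL driver as parameters of an `INSERT` statement, for example through `cursor.executemany` in psycopg2. Nested `g:shipping` values get their own keys, so the shipping price does not overwrite the item price. If staying with BeautifulSoup is preferred, calling `get_text()` on each found tag returns the value without the surrounding markup.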
