I'm back with another issue. I'm new to parsing XML with BeautifulSoup and have been stuck on this problem for two weeks, so I would appreciate your help. I have this structure:
<detail>
<page number="01">
<Bloc code="AF" A="000000000002550" B="000000000002550"/>
<Bloc code="AH" A="000000000035826" C="000000000035826" D="000000000035826"/>
<Bloc code="AR" A="000000000026935" B="000000000024503" C="000000000002431" D="000000000001669"/>
</page>
<page number="02">
<Bloc code="DA" A="000000000038486" B="000000000038486"/>
<Bloc code="DD" A="000000000003849" B="000000000003849"/>
<Bloc code="EA" A="000000000001029"/>
<Bloc code="EC" A="000000000063797" B="000000000082427"/>
</page>
<page number="03">
<Bloc code="FD" C="000000000574042" D="000000000610740"/>
<Bloc code="GW" C="000000000052677" D="000000000075362"/>
</page>
</detail>
This is my code (I know it is poor and needs improving):
    if soup.find_all('bloc') != None:
        for element in soup.find_all('bloc'):
            code_element = element['code']
            if element.find('m1'):
                m1_element = element['m1']
            else:
                None
            if element.find('m2'):
                m2_element = element['m2']
            else:
                None
            print(code_element, m1_element, m2_element)
I get an error because the 'm2' attribute is not present in every Bloc on every page, and I don't know how to handle this.
I would like to put the result in a DataFrame like this:
DataFrame columns: CODE, A, B, C, D, Page

    CODE  A                B                C                D                Page
    AF    000000000002550  000000000002550  NULL             NULL             01
    AH    000000000035826  NULL             000000000035826  000000000035826  01
    AR    000000000026935  000000000024503  000000000002431  000000000001669  01
    ... etc.
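
To make the target shape concrete, here is a minimal sketch of how the rows might be collected. It is only an illustration under some assumptions. It uses the standard library's xml.etree.ElementTree rather than BeautifulSoup to stay dependency-free. The sample in `xml_text` is shortened from the structure above. The names `xml_text` and `rows` are made up for this example. The key idea is reading each attribute with `.get()`, which returns None when the attribute is missing instead of raising an error.

```python
import xml.etree.ElementTree as ET

# Shortened sample of the structure shown in the question (assumption: same layout).
xml_text = """<detail>
<page number="01">
<Bloc code="AF" A="000000000002550" B="000000000002550"/>
<Bloc code="AH" A="000000000035826" C="000000000035826" D="000000000035826"/>
</page>
<page number="03">
<Bloc code="FD" C="000000000574042" D="000000000610740"/>
</page>
</detail>"""

root = ET.fromstring(xml_text)

rows = []  # hypothetical name: one dict per Bloc
for page in root.iter("page"):
    for bloc in page.iter("Bloc"):
        rows.append({
            "CODE": bloc.get("code"),
            # .get() returns None for an attribute that is not present
            "A": bloc.get("A"),
            "B": bloc.get("B"),
            "C": bloc.get("C"),
            "D": bloc.get("D"),
            "Page": page.get("number"),
        })

for row in rows:
    print(row)
```

A list of dicts like `rows` can be passed to `pandas.DataFrame(rows)` to get one row per Bloc, with the missing attributes left empty. BeautifulSoup tags also offer `element.get('B')`, which returns None for a missing attribute. Note that with an HTML parser the tag and attribute names are lowercased, so they would likely be 'bloc' and 'a', 'b', and so on.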
Thank you so much for your help