I got the following information from EDGAR:

<SERIES-AND-CLASSES-CONTRACTS-DATA>
<EXISTING-SERIES-AND-CLASSES-CONTRACTS>
<SERIES>
<OWNER-CIK>0000074663
<SERIES-ID>S000004984
<SERIES-NAME>Eaton Vance Income Fund of Boston
<CLASS-CONTRACT>
<CLASS-CONTRACT-ID>C000013484
<CLASS-CONTRACT-NAME>Eaton Vance Income Fund of Boston Class A
<CLASS-CONTRACT-TICKER-SYMBOL>EVIBX
</CLASS-CONTRACT>
<CLASS-CONTRACT>
<CLASS-CONTRACT-ID>C000013485
<CLASS-CONTRACT-NAME>Eaton Vance Income Fund of Boston Class B
<CLASS-CONTRACT-TICKER-SYMBOL>EBIBX
</CLASS-CONTRACT>
<CLASS-CONTRACT>
<CLASS-CONTRACT-ID>C000013486
<CLASS-CONTRACT-NAME>Eaton Vance Income Fund of Boston Class C
<CLASS-CONTRACT-TICKER-SYMBOL>ECIBX
</CLASS-CONTRACT>
<CLASS-CONTRACT>
<CLASS-CONTRACT-ID>C000013487
<CLASS-CONTRACT-NAME>Eaton Vance Income Fund of Boston Class R
<CLASS-CONTRACT-TICKER-SYMBOL>ERIBX
</CLASS-CONTRACT>
<CLASS-CONTRACT>
<CLASS-CONTRACT-ID>C000013488
<CLASS-CONTRACT-NAME>Eaton Vance Income Fund of Boston Class I
<CLASS-CONTRACT-TICKER-SYMBOL>EIBIX
</CLASS-CONTRACT>
</SERIES>
</EXISTING-SERIES-AND-CLASSES-CONTRACTS>
</SERIES-AND-CLASSES-CONTRACTS-DATA>

Ideally, I would like to scrape the information for every tag and its subtags. It seems that the tags inside a class contract (e.g., class-contract-id) have no closing tags.

Possibly for this reason, I get the following result when I try this out:

from bs4 import BeautifulSoup

with open("temp.txt", "r") as html_file:
    content = html_file.read()

soup = BeautifulSoup(content, "lxml")
series = soup.find("series")

for item in series:
    cik = item.find("owner-cik")
    print(cik)

Result:

-1
None

Is there any possible way to sort this out?

1 Answer

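The issue is that iterating over series yields its direct children, so in this case item itself is the OWNER-CIK tag. item.find('owner-cik') searches only inside that tag, finds nothing, and returns None. The -1 comes from the text node before it: that child is a plain string, so find there is the ordinary string method, which returns -1 when the substring is absent. series.find('owner-cik') will probably do what you want, as page 33 of the specification seems to say there's only one OWNER CIK per SERIES.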

It looks like there are also a number of existing python libraries for downloading/parsing EDGAR data. You may be able to use or modify one of those instead.
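If none of those libraries fits, the layout shown in the question looks regular enough to parse line by line with only the standard library. The following is a minimal sketch, not an official EDGAR parser: `parse_sgml` is a hypothetical helper, and it assumes that every tag starts its own line, that a tag followed by text on the same line is a leaf value, and that a tag alone on its line opens a container that runs until its closing tag.

```python
import re

# Matches "<TAG>value", "<TAG>" or "</TAG>" at the start of a stripped line.
TAG = re.compile(r"<(/?)([A-Z0-9-]+)>(.*)")


def parse_sgml(text):
    """Parse EDGAR-style header lines into nested dicts (hypothetical helper)."""
    root = {}
    stack = [root]  # stack[-1] is the container currently being filled
    for line in text.splitlines():
        match = TAG.match(line.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        closing, name, value = match.groups()
        value = value.strip()
        if closing:
            # A closing tag ends the current container; never pop the root.
            if len(stack) > 1:
                stack.pop()
        elif value:
            # Assumption: a tag with text on the same line is a leaf value.
            stack[-1][name] = value
        else:
            # Containers can repeat (e.g. CLASS-CONTRACT), so keep a list.
            child = {}
            stack[-1].setdefault(name, []).append(child)
            stack.append(child)
    return root


sample = """<SERIES>
<OWNER-CIK>0000074663
<SERIES-ID>S000004984
<SERIES-NAME>Eaton Vance Income Fund of Boston
<CLASS-CONTRACT>
<CLASS-CONTRACT-ID>C000013484
<CLASS-CONTRACT-NAME>Eaton Vance Income Fund of Boston Class A
<CLASS-CONTRACT-TICKER-SYMBOL>EVIBX
</CLASS-CONTRACT>
</SERIES>"""

data = parse_sgml(sample)
series = data["SERIES"][0]
print(series["OWNER-CIK"])
for contract in series["CLASS-CONTRACT"]:
    print(contract["CLASS-CONTRACT-ID"], contract["CLASS-CONTRACT-TICKER-SYMBOL"])
```

Because the leaf tags are never closed, an HTML parser has to guess where each one ends. Reading the file line by line avoids that guess, at the cost of breaking if a leaf tag ever appears with an empty value.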

yut23