Questions tagged [edgar]

EDGAR is the U.S. Securities and Exchange Commission's information system for company filing data. Use this tag for questions about parsing and querying that data or its public APIs.

EDGAR stands for Electronic Data Gathering, Analysis, and Retrieval. The system uses several data formats: the classic SGML-based format, the XML-based XBRL format for business reporting, and many more.
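
As an illustration of querying the public APIs, here is a minimal sketch that builds the submissions URL for a company from its CIK. The URL pattern (CIK zero-padded to 10 digits) is taken from one of the questions listed on this page; the function names `submissions_url` and `build_request` are hypothetical, and sending a descriptive `User-Agent` header is an assumption about what the SEC expects from automated clients, not a confirmed requirement.

```python
from urllib.request import Request

# URL pattern as it appears in a question on this page:
# https://data.sec.gov/submissions/CIK0001067983.json
SUBMISSIONS_URL = "https://data.sec.gov/submissions/CIK{cik:010d}.json"


def submissions_url(cik) -> str:
    """Return the submissions JSON URL for a CIK given as int or digit string."""
    # The CIK is zero-padded to 10 digits in the file name.
    return SUBMISSIONS_URL.format(cik=int(cik))


def build_request(cik, user_agent: str) -> Request:
    """Prepare (but do not send) a request for the submissions JSON.

    Assumption: the server wants automated clients to identify themselves,
    so a descriptive User-Agent (name plus contact) is attached.
    """
    return Request(submissions_url(cik), headers={"User-Agent": user_agent})


if __name__ == "__main__":
    print(submissions_url(1067983))
```

The prepared request could then be passed to `urllib.request.urlopen` and the body decoded with `json.loads`; that step is left out here so the sketch runs without network access.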

120 questions

1 answer

Web scraping SEC Edgar 10-K and 10-Q filings

Is there anyone experienced with scraping SEC 10-K and 10-Q filings? I got stuck while trying to scrape monthly realised share repurchases from these filings. Specifically, I would like to get the following information: 1. Period; 2. Total Number of…

web-scraping beautifulsoup edgar

asked Jul 20 '15 at 22:53

Jiayuan Chen

1 answer

SEC company filings: Is the tag valid SGML? If so, how to parse it?

I tried to parse SEC company filings from sec.gov. Starting from the FB 10-Q index.htm, let's look at a complete submission text filing. It has a structure like:

"some…

parsing sgml edgar

asked Nov 02 '19 at 12:11

Michael S

1 answer

From 10-K -- extract SIC, CIK, create metadata table

I am working with 10-Ks from Edgar. To assist in file management and data analysis, I would like to create a table containing the path to each file, the CIK number of the filing company (a unique ID issued by the SEC), and the SIC industry code…

python regex metadata finance edgar

asked Apr 17 '17 at 11:31

user7317101

1 answer

HTML Rendering of EDGAR .txt Filings

Currently, I'm working on a project where one PHP script grabs an index file from ftp://ftp.sec.gov and places all the company information into the database. The second PHP script then grabs the raw text file from the SEC and saves it locally for…

php html xml edgar

asked Nov 29 '15 at 03:59

Benjamin Schulz

2 answers

JSONDecodeError: Expecting value: line 1 column 1 (char 0) when scraping SEC EDGAR

My code is as follows: import requests import urllib from bs4 import BeautifulSoup year_url = r"https://www.sec.gov/Archives/edgar/daily-index/2020/index.json" year_content = requests.get(year_url) decoded_year_url = year_content.json() I could…

python edgar sec

asked Dec 29 '21 at 02:05

Julie

1 answer

Parse XML with Python lxml

I am trying to parse an XML file using the Python library lxml, and would like the resulting output to be in a dataframe. I am relatively new to Python and parsing, so please bear with me as I outline the problem. The original XML that I am trying to parse…

python xml parsing lxml edgar

asked Mar 09 '21 at 01:02

stump

2 answers

How to Use Beautiful Soup to Scrape SEC's Edgar Database and Receive Desired Data

Apologies in advance for the long question: I am new to Python and I'm trying to be as explicit as I can with a fairly specific situation. I am trying to identify specific data points from SEC filings on a routine basis; however, I want to automate this…

python beautifulsoup edgar

asked Apr 11 '19 at 00:04

bvd

2 answers

Arelle Webserver - How to extract the income statement from an XBRL filing?

I am trying to extract financial statement information based on the type of statement. Let me explain in a little more detail. I want to extract the income statement, balance sheet and cash flow statement from an XBRL instance – especially…

webserver finance xbrl edgar arelle

asked Apr 21 '17 at 12:49

rbr

0 answers

How would I approach a lot of structured-but-inconsistent data?

I'm attempting to parse EDGAR documents, which are SEC filings. Specifically, I'm attempting to parse both SEC Schedule 13D and Schedule 13G filings. There appear to be lots of failed attempts at parsing these filings, and I assume that's because…

regex parsing scrape sgml edgar

asked Apr 21 '15 at 20:47

Mr_Spock

1 answer

Parse SEC EDGAR XML Form Data with child nodes using BeautifulSoup

I am attempting to scrape individual fund holdings from the SEC's N-PORT-P/A form using Beautiful Soup and XML. A typical submission, outlined below and linked here, looks like:

python xml beautifulsoup portfolio edgar

asked Feb 06 '23 at 19:47

therdawg

1 answer

Downloading file from the website - HTTPError: HTTP Error 403: Forbidden

I am trying to download 10-Ks (annual reports of public companies) from EDGAR. I am running the code below (taken from a textbook; I don't understand much of it), but keep getting the following error: (I downloaded 'master.idx' files that are…

python edgar

asked Oct 07 '22 at 22:48

Alberto Alvarez

3 answers

How to get data from SEC Edgar with Python and JSON

On the page below, a JSON link is given as the data source: https://www.sec.gov/edgar/browse/?CIK=1067983&owner=exclude Data source: CIK0001067983.json -> https://data.sec.gov/submissions/CIK0001067983.json This is my code (it works…

python json edgar sec

asked Sep 27 '22 at 05:47

JKR

2 answers

How to web scrape SEC Edgar 10-K dynamic data

We are trying to parse an SEC Edgar filing using Python. I'm trying to get the table "Sales By Segment Of Business" at line 21. This is the link to the…

beautifulsoup edgar sec

asked Sep 02 '21 at 06:37

Tarun teja

1 answer

Extracting table of holdings from (Edgar 13-F filings) TXT (pre-2013) with python

I am working on extracting a table of holdings from the 13-F form on EDGAR. Before 2013, holdings were given in a txt file (see example). The output I am aiming for is a pd.DataFrame with the same shape as the "Form 13F Information Table" in the txt file (10…

parsing beautifulsoup python-requests edgar

asked Nov 22 '20 at 10:12

NoobFin

2 answers

Extracting xml from a txt file

I'm trying to extract the XML portion from a txt file in Python. The txt file I'm currently using is from the EDGAR database and contains multiple representations of a 10-K report in one file: HTML, then XML, and then some other…

xml beautifulsoup elementtree xbrl edgar

asked Apr 28 '20 at 02:05

segfault
