OK, I'm using this Git repository from Git Bash on Windows 7. After running it, I have the .txt files from the Securities and Exchange Commission's EDGAR database on my hard drive in this format. The .txt files have HTML tags inside.
I was wondering: since the SEC has kept these text files in a strict format since the early nineties, is there a way to extract a certain item, for example
<us-gaap:IncomeTaxExpenseBenefit contextRef="eol_PE9523----1310-K0013_STD_365_20131231_0"
decimals="-3" id="id_3914012_7F3BEF88-8CD1-49E7-8A78-91A091178D1B_1_13"
unitRef="iso4217_USD">40315000</us-gaap:IncomeTaxExpenseBenefit>
Could this be done accurately, given the strict format, with a script or an existing Git repository? How, for instance, can someone extract a whole table from the .txt file? Libraries, repositories, scripts: anything I can pick up with a little work and modification would be a fine start.
Can any of these repositories do such a job? I read the instructions (where there are any), but there is a lot I don't understand.
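
To make the goal concrete, here is a rough Python sketch of the kind of extraction meant above, using only the standard `re` module. The helper name `extract_facts` and the returned dictionary layout are hypothetical, chosen for illustration; the sample text is the tag quoted earlier. It assumes the tag appears in the file exactly as shown (opening tag, attributes in double quotes, closing tag), and a real XML parser would likely be more robust than a regular expression.

```python
import re


def extract_facts(text, tag):
    """Hypothetical helper: find every <tag ...>value</tag> in text.

    Returns a list of dicts holding the element's attributes plus
    its text under the key "value".
    """
    # Match the opening tag, capture its attributes and its text content.
    pattern = re.compile(
        r"<" + re.escape(tag) + r"\b([^>]*)>(.*?)</" + re.escape(tag) + r">",
        re.DOTALL,
    )
    facts = []
    for attrs, value in pattern.findall(text):
        # Split name="value" pairs; assumes double-quoted attribute values.
        attr_map = dict(re.findall(r'([\w:.-]+)\s*=\s*"([^"]*)"', attrs))
        attr_map["value"] = value.strip()
        facts.append(attr_map)
    return facts


# Sample taken from the tag quoted in the question.
sample = '''<us-gaap:IncomeTaxExpenseBenefit contextRef="eol_PE9523----1310-K0013_STD_365_20131231_0"
decimals="-3" id="id_3914012_7F3BEF88-8CD1-49E7-8A78-91A091178D1B_1_13"
unitRef="iso4217_USD">40315000</us-gaap:IncomeTaxExpenseBenefit>'''

facts = extract_facts(sample, "us-gaap:IncomeTaxExpenseBenefit")
for fact in facts:
    print(fact["value"], fact["unitRef"], fact["decimals"])
```

For a file on disk, the same function could be called on the result of `open(path, encoding="utf-8", errors="replace").read()`, where `path` is whichever downloaded .txt file is of interest.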