
I need to parse SGML files in Java. Below is the content of one such file; I need the FILING-DATE, CIK and ASSIGNED-SIC values. Please help me with this.

<ACCEPTANCE-DATETIME>20130226172602
<ACCESSION-NUMBER>0001193125-13-077271
<TYPE>10-K
<PUBLIC-DOCUMENT-COUNT>15
<PERIOD>20121231
<FILING-DATE>20130226
<DATE-OF-FILING-DATE-CHANGE>20130226
<FILER>
<COMPANY-DATA>
<CONFORMED-NAME>COGNIZANT TECHNOLOGY SOLUTIONS CORP
<CIK>0001058290
<ASSIGNED-SIC>7371
<IRS-NUMBER>133728359
<FISCAL-YEAR-END>1231
</COMPANY-DATA>
<FILING-VALUES>
<FORM-TYPE>10-K
<ACT>34
<FILE-NUMBER>000-24429
<FILM-NUMBER>13643872
</FILING-VALUES>
<BUSINESS-ADDRESS>
<STREET1>500 FRANK W. BURR BLVD.
<CITY>TEANECK
<STATE>NJ
<ZIP>07666
<PHONE>2018010233
</BUSINESS-ADDRESS>
<MAIL-ADDRESS>
<STREET1>500 FRANK W. BURR BLVD.
<CITY>TEANECK
<STATE>NJ
<ZIP>07666
</MAIL-ADDRESS>
</FILER>
</SEC-HEADER>
Zong
  • Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: [Stack Overflow question checklist](http://meta.stackexchange.com/questions/156810/stack-overflow-question-checklist) – reto Dec 11 '13 at 10:10
  • Are you working on text categorization? – Ashish Dec 11 '13 at 10:10

2 Answers

Take a look at these:
  • Simple SGML parser
  • SGML parser in Java
  • SAX-like API for SGML (SGML parser for Java)

Nidhish Krishnan

Although this is a very old post and the OP may already have found a solution, there is no useful reference here. I am not claiming that my answer is perfect or the best solution, but it served the purpose, and I was able to extract the data successfully even from very large SGML files. I hope it helps anyone who needs to parse an SGML file. Please refer to my previous answer here, and let me know if any clarification is required.
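
For the header shown in the question, a full SGML parser may not be needed. The following is a minimal sketch, assuming each field sits on its own line as `<TAG>value` (as in the sample) and that only the first occurrence of each tag is wanted. The class name `SecHeaderFields` and the method `extract` are made up for illustration, and this is not necessarily the approach used in the linked answer. It reads line by line, so a large file is never loaded into memory in one piece.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SecHeaderFields {

    // Matches "<TAG>value" lines. Container tags such as <FILER> carry no
    // value, and closing tags start with "</", so neither kind matches.
    private static final Pattern TAG_LINE = Pattern.compile("^<([A-Z0-9-]+)>(.+)$");

    /**
     * Returns the first value seen for each wanted tag.
     * Assumption: one tag per line, as in the sample from the question.
     */
    public static Map<String, String> extract(Reader source, Set<String> wanted)
            throws IOException {
        Map<String, String> found = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                // Stop at the end of the header; the rest of the file is not scanned.
                if (trimmed.startsWith("</SEC-HEADER>")) {
                    break;
                }
                Matcher m = TAG_LINE.matcher(trimmed);
                if (m.matches() && wanted.contains(m.group(1))) {
                    found.putIfAbsent(m.group(1), m.group(2).trim());
                }
                // Stop early once every wanted tag has a value.
                if (found.size() == wanted.size()) {
                    break;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Shortened copy of the header from the question.
        String sample = String.join("\n",
                "<TYPE>10-K",
                "<FILING-DATE>20130226",
                "<FILER>",
                "<COMPANY-DATA>",
                "<CIK>0001058290",
                "<ASSIGNED-SIC>7371",
                "</COMPANY-DATA>",
                "</FILER>",
                "</SEC-HEADER>");
        Map<String, String> fields = extract(
                new StringReader(sample),
                Set.of("FILING-DATE", "CIK", "ASSIGNED-SIC"));
        System.out.println(fields);
    }
}
```

For a file on disk, `java.nio.file.Files.newBufferedReader(path)` can be passed in place of the `StringReader`. If a header contains several `<FILER>` blocks, this sketch keeps only the first `CIK` and `ASSIGNED-SIC`; collecting every occurrence would need a list per tag instead of `putIfAbsent`.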

Shailesh Saxena