I have to read data from a PDF and then write it to Excel, using iTextSharp. I am able to read all the text from the PDF, but the problem is that I have to put that text into Excel in the same format as in the PDF (in the PDF it is in a table structure), and the data I am getting is just a series of strings.
Please suggest how I could separate the values from the table columns; I am not able to distinguish which string is a value and which is a table column.
Below is the table structure in the PDF:
----------------------------------
|                                |
|Name|jaydeep|Age|25 |Place|India|
----------------------------------
|Sex |Male   |Pin|000|Job  |Yes  |
----------------------------------
After extracting, I get all the text; now I have to populate Excel with this data in the same table structure:
---------------------------------------
|Table1                               |
---------------------------------------
|#|ActionPlan|Description|Failure Mode|
---------------------------------------
|1|Test      |Sample test|No          |
---------------------------------------
|2|Change R  |Sample 1   |No          |
---------------------------------------
|3|xxxxx     |Sample 2   |Yes         |
---------------------------------------
I have used some logic and am able to get the data into a string[] array in the format below:
BT /F3 9 Tf 1 1 1 rg 407.446 TL 297.648 364.176 Td (CCR Metrics) Tj T* ET
BT /F3 9 Tf 0.161 0.365 0.537 rg 407.446 TL 306.576 349.776 Td (#) Tj T* ET
BT /F3 9 Tf 0.161 0.365 0.537 rg 407.446 TL 375.912 349.776 Td (CCR) Tj T* ET
BT /F3 9 Tf 0.161 0.365 0.537 rg 407.446 TL 454.68 349.776 Td (Value) Tj T* ET
BT /F3 9 Tf 0.161 0.365 0.537 rg 407.446 TL 489.888 349.776 Td (Threshold) Tj T* ET
BT /F3 9 Tf 0.161 0.365 0.537 rg 407.446 TL 542.88 349.776 Td (Status) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 306.72 332.208 Td (1) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 332.208 Td (Program: ) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 322.704 Td (xxcxcxcx) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 313.2 Td (fdwdf44) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 303.696 Td (44dd) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 456.624 332.208 Td (981.80) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 505.872 332.208 Td (1152.00) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 306.72 290.16 Td (2) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 290.16 Td (Dataset: ) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 280.656 Td (P1924_w_V20) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 271.152 Td (ww55)-) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 261.648 Td (P978555520_JMC) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 456.624 290.16 Td (186.40) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 510.624 290.16 Td (512.00) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 306.72 248.112 Td (3) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 248.112 Td (RAM: ) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 238.608 Td (PddUPF_V20) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 229.104 Td (yurfcew345) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 219.6 Td (Pqsq0_JMC) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 461.376 248.112 Td (46.50) Tj T* ET
BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 515.376 248.112 Td (72.00) Tj T* ET
So the text inside the parentheses is the PDF data that I am getting from the table structure of the PDF.
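Each of those strings also carries a position: the two numbers before `Td` are the x and y operands of the text-positioning operator. A minimal sketch of pulling out position and text together follows, written in Python rather than C# for brevity. It assumes every `BT ... ET` block contains exactly one `Td` followed by one `(...) Tj`, as in the lines above; the names `TD_TJ` and `parse_fragment` are made up for illustration, and escaped characters, hex strings, and `TJ` arrays are not handled.

```python
import re

# Matches "<x> <y> Td (<text>) Tj" inside one BT ... ET block.
# The greedy (.*) keeps text that itself contains ")", e.g. "ww55)-".
TD_TJ = re.compile(r"(-?\d+(?:\.\d+)?) (-?\d+(?:\.\d+)?) Td \((.*)\) Tj")


def parse_fragment(line):
    """Return (x, y, text) for one content-stream line, or None if it does not match."""
    match = TD_TJ.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2)), match.group(3)


# Two lines copied from the array shown above.
sample = [
    "BT /F3 9 Tf 0.161 0.365 0.537 rg 407.446 TL 306.576 349.776 Td (#) Tj T* ET",
    "BT /F4 8.5 Tf 0 0 0 rg 384.81 TL 324.648 271.152 Td (ww55)-) Tj T* ET",
]
fragments = [parse_fragment(line) for line in sample]
```

With the coordinates available, fragments that share a y value sit on the same text line, and fragments that share an x value start at the same horizontal position.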
Now the job is to put this data into Excel in the same table structure.
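One possible way to rebuild the rows is sketched below, again in Python rather than C#. It is a heuristic that fits the sample data, not a general table detector. The assumptions are: the (x, y, text) triples are already available (here they are hard-coded from the first two data rows above), a new table row starts whenever a fragment appears in the leftmost column, x values closer together than `X_TOLERANCE` belong to the same column, and CSV is used as a stand-in for a real Excel workbook. The names `column_bounds`, `build_rows`, and `X_TOLERANCE` are hypothetical. The header row is left out because its text is positioned differently from the data (for example `CCR` at x = 375.912) and would need separate handling.

```python
import bisect
import csv
import io

# (x, y, text) triples from the Td operands and Tj strings of the first two data rows.
fragments = [
    (306.72, 332.208, "1"),
    (324.648, 332.208, "Program: "),
    (324.648, 322.704, "xxcxcxcx"),
    (324.648, 313.2, "fdwdf44"),
    (324.648, 303.696, "44dd"),
    (456.624, 332.208, "981.80"),
    (505.872, 332.208, "1152.00"),
    (306.72, 290.16, "2"),
    (324.648, 290.16, "Dataset: "),
    (324.648, 280.656, "P1924_w_V20"),
    (324.648, 271.152, "ww55)-"),
    (324.648, 261.648, "P978555520_JMC"),
    (456.624, 290.16, "186.40"),
    (510.624, 290.16, "512.00"),
]

X_TOLERANCE = 10.0  # assumed: x values closer than this share a column


def column_bounds(frags, tol):
    """Cluster the distinct x values; return the smallest x of each cluster."""
    bounds = []
    previous = None
    for x in sorted({f[0] for f in frags}):
        if previous is None or x - previous > tol:
            bounds.append(x)
        previous = x
    return bounds


def build_rows(frags, tol=X_TOLERANCE):
    """Group fragments (in stream order) into rows of joined cell strings."""
    bounds = column_bounds(frags, tol)
    rows = []
    for x, _y, text in frags:
        col = bisect.bisect_right(bounds, x) - 1
        if col == 0 or not rows:
            # A fragment in the leftmost column opens a new table row.
            rows.append([[] for _ in bounds])
        rows[-1][col].append(text.strip())
    # Multi-line cells are joined with a space.
    return [[" ".join(cell) for cell in row] for row in rows]


table = build_rows(fragments)

# Write the rows as CSV text, which Excel can open directly.
buffer = io.StringIO()
csv.writer(buffer).writerows(table)
csv_text = buffer.getvalue()
```

For the sample data this yields one row per numbered entry, with the multi-line second column collapsed into a single cell. The tolerance would need tuning for other PDFs, since right-aligned numbers such as `1152.00` and `512.00` start at slightly different x positions.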