pdftables is a Python package for extracting tables from PDF files.
Questions tagged [pdftables]
23 questions
4
votes
4 answers
AttributeError: module 'collections' has no attribute 'Iterable'
I am using the "pdftables" library to extract tables from a PDF.
This is my code:
import pdftables
pg = pdftables.get_pdf_page(open("filename.pdf","rb"),253)
print(pg)
table = pdftables.page_to_tables(pg)
print(table)
I am getting this error…

Prassana K
- 41
- 1
- 3
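One commonly used workaround for the `collections.Iterable` error in the question above is sketched below. It assumes the error comes from running on Python 3.10 or later, where the abstract base class aliases were removed from `collections` and live only in `collections.abc`, and that pdftables or one of its dependencies still refers to the old names. The excerpt confirms neither assumption, so treat this as a sketch rather than the accepted fix.

```python
import collections
import collections.abc

# Assumption: the failing library looks up names such as collections.Iterable,
# which exist only in collections.abc on Python 3.10+.
# Restore the old aliases before importing the library that needs them.
for name in ("Iterable", "Mapping", "MutableMapping", "Sequence"):
    if not hasattr(collections, name):
        setattr(collections, name, getattr(collections.abc, name))

# Import the affected library only after the aliases are in place, e.g.:
# import pdftables
```

The shim has to run before the first import of the affected package, because the attribute lookup usually happens at import time.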
3
votes
2 answers
Extract all tables from PDF in python
I have a PDF and want to extract all tables from it. When I run the code below, I get an empty list.
import pdftables
filepath = 'File_Set_-2_feasibility_Study/140u-td005_-en-p.pdf'
with open(filepath, 'rb') as fh:
table =…

Neeraj Sharma
- 174
- 1
- 3
- 14
2
votes
1 answer
Camelot Cannot extract entire table
I'm using Camelot to extract table information from a PDF that I converted from scanned to searchable using ocrmypdf (500 dpi).
Camelot identifies the table and extracts most of the data within it, but it seems to be unable…

Douglas Griffin
- 21
- 1
2
votes
0 answers
Trouble with tabulizer library in R recognizing non-alphanumeric (symbol) characters in a table in a PDF
I am using the tabulizer library in R to capture data from a table located inside a PDF on a public website
(https://www.waterboards.ca.gov/sandiego/water_issues/programs/basin_plan/docs/update082812/Chpt_2_2012.pdf).
The example table that I am…

user11036517
- 65
- 5
2
votes
1 answer
iText 7 prevent cell from splitting on page break
I'm trying to generate a PDF with a table that contains cells with shapes.
I override the CellRenderer class and, inside the new class, draw the shapes in DrawableCellRenderer#draw.
Sometimes when the table needs to split and the cell has a row span I want to…

Michael Azarzar
- 55
- 8
1
vote
1 answer
How to align multiple tables added to a single table using iTextSharp in C#?
I have created a table with 3 columns and another table with 6 columns, and both are then added to another table to make a single table. I want to align the second column of the 3-column table with the second column of the 6-column table like this:
Can…

ANK
- 71
- 1
- 8
1
vote
2 answers
How do I format/tag an accessible PDF table that spans multiple pages horizontally?
I'm responsible for remediating a PDF generated by a third-party, proprietary system whose layout and design I cannot access. The goal is to pass the Adobe Acrobat DC accessibility checker before publication.
Some of the…

Glamador
- 11
- 2
1
vote
1 answer
Get absolute width from PdfPTable column (iText)
How do I get the absolute width of a column from iText when the table columns are specified by their relative sizes?
What I tried
I specified 3 columns with their relative width as float like this:
PdfPCell cell2;
PdfPTable table2 = new PdfPTable(new…

Fakhryan Albar
- 113
- 14
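The arithmetic behind the question above can be sketched in Python without touching iText: a column's absolute width is the table's total width multiplied by that column's share of the summed relative widths. The function name, the relative widths, and the 500-point total width are illustrative assumptions. They are not values from the truncated question, and nothing here is an iText API call.

```python
def absolute_widths(relative_widths, total_width):
    """Scale relative column widths so that they add up to total_width."""
    total_relative = sum(relative_widths)
    if total_relative <= 0:
        raise ValueError("relative widths must sum to a positive number")
    # Each column receives its proportional share of the total width.
    return [total_width * w / total_relative for w in relative_widths]


# Hypothetical example: three columns with relative widths 1 : 2 : 2
# in a table that is 500 points wide.
widths = absolute_widths([1.0, 2.0, 2.0], 500.0)
print(widths)
```

The result is only meaningful once the table's total width is known, so any equivalent lookup in a PDF library presumably has to happen after that width has been set.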
0
votes
0 answers
Facing issue in extracting Tables from PDF with tabula
I am trying to extract multiple tables from the PDF, which throws a Command '['java', '-Dfile.encoding=UTF8', error.
Link to the PDF:
https://www.paypalobjects.com/marketing/web/US/en/merchant_fees/US-merchant-fees-24-July-2023.pdf
PDF has 42…

user21766269
- 19
- 2
0
votes
1 answer
Better Layout Output for PDF Tables Extracted using Camelot
I'm building a Python program using Camelot that extracts tables from a PDF (see code below). The code runs successfully, but I am hitting a roadblock on how to get a better output result. Specifically, I'm trying to get the code to…

CyberCoder
- 7
- 3
0
votes
1 answer
Flutter Multi Image Pick and inserting in PDF table only last list inserted
I am new to Flutter and am making an app that selects multiple images using the image_picker package and inserts them into a PDF table. I am able to get the images and build a list based on the number of rows required; however, when the pdf…

Bimal
- 23
- 2
0
votes
0 answers
How to wrap contents in a table? .docx to PDF using Apache POI
When converting a .docx to PDF using Apache POI, the contents of the table are not wrapped.
The following is the code I am using for the conversion:
XWPFDocument document = new XWPFDocument(is);
…

abhi
- 1
- 1
0
votes
0 answers
How To Circumvent 504 Errors
I am working in ReactJS, and one of the main aspects of our project is the ability to upload a scorecard and have all of its results parsed and placed into objects. However, due to the nature of these PDFs that get uploaded, there's a LOT of…
user18899735
0
votes
2 answers
I want to insert the photo taken with the camera into a cell of the table in the pdf. But I am getting the following error code
// I want to insert the photo taken with the camera into a cell of the table in the pdf. But I am getting the following error code.
Reloaded 1 of 1223 libraries in 6.711ms.
E/flutter (21256): [ERROR:flutter/lib/ui/ui_dart_state.cc(209)] Unhandled…

Murat Bayram
- 11
0
votes
1 answer
Auto-Breakline PdfTable Cells PdfFileWriter C#
I'm writing a program that retrieves customer data from a SQLite file and stores it in a PDF file in a PdfTable like:
PdfContents contentsTable = new PdfContents(page);
PdfTable table = new PdfTable(page, contentsTable, ArialNormal,…

Samuel Skorsetz
- 1
- 4