Questions tagged [python-camelot]

Camelot is a Python library that makes it easy for anyone to extract tabular data from PDF files.

Why Camelot?

You are in control. Unlike other libraries and tools which either give a nice output or fail miserably (with no in-between), Camelot gives you the power to tweak table extraction. (This is important since everything in the real world, including PDF table extraction, is fuzzy.)
Bad tables can be discarded based on metrics like accuracy and whitespace, without ever having to manually look at each table.
Each table is a pandas DataFrame, which seamlessly integrates into ETL and data analysis workflows.
Export to multiple formats, including JSON, Excel and HTML.
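
The points above can be sketched in a few lines. This is a minimal illustration, not an official recipe: the helper names `keep_good_tables` and `extract_and_export`, the threshold values, and the output file names are all hypothetical. It relies on each Camelot table exposing a `parsing_report` dict (with `accuracy` and `whitespace` keys), a `df` attribute, and `to_json` / `to_html` export methods.

```python
from types import SimpleNamespace


def keep_good_tables(tables, min_accuracy=90.0, max_whitespace=30.0):
    """Return the tables whose parsing report passes both thresholds.

    The thresholds are illustrative assumptions, not Camelot defaults.
    """
    good = []
    for table in tables:
        report = table.parsing_report  # dict with 'accuracy' and 'whitespace'
        if report["accuracy"] >= min_accuracy and report["whitespace"] <= max_whitespace:
            good.append(table)
    return good


def extract_and_export(pdf_path, out_prefix):
    """Hypothetical helper: read page 1, drop poor tables, export the rest."""
    import camelot  # imported lazily so the filter above works without it

    tables = camelot.read_pdf(pdf_path, pages="1", flavor="lattice")
    for i, table in enumerate(keep_good_tables(tables)):
        print(table.df.shape)  # table.df is a pandas DataFrame
        table.to_json(f"{out_prefix}_{i}.json")
        table.to_html(f"{out_prefix}_{i}.html")


# Stand-in objects (made-up numbers) so the filter can be tried without a PDF.
demo_tables = [
    SimpleNamespace(parsing_report={"accuracy": 99.0, "whitespace": 12.0}),
    SimpleNamespace(parsing_report={"accuracy": 40.0, "whitespace": 80.0}),
]
kept = keep_good_tables(demo_tables)
print(len(kept))
```

Filtering on the report keeps the decision automatic, so no table has to be inspected by hand; the thresholds would need tuning per document set.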

See comparison with other PDF table extraction libraries and tools.

197 questions

7 answers

AttributeError: module 'camelot' has no attribute 'read_pdf'

I am trying to extract tables from a PDF using Camelot and I get this attribute error. Could you please help? import camelot import pandas as pd pdf = camelot.read_pdf("Gordian.pdf") AttributeError Traceback (most recent…

python python-camelot

asked Oct 14 '19 at 12:15

Yousra

5 answers

Camelot: DeprecationError: PdfFileReader is deprecated

I have been using Camelot for our project, but for the past 2 days I have been getting the following error message. When trying to run the following code snippet: import camelot tables = camelot.read_pdf('C:\\Users\\user\\Downloads\\foo.pdf', pages='1') I get this…

python pypdf python-camelot

asked Dec 28 '22 at 11:35

Said Akyuz

2 answers

Camelot is reading only the first page of the pdf

tables = camelot.read_pdf(r"C:\Users\Ayush ShaZz\Desktop\Code_Python\FoodCaloriesList.pdf") for table in tables: print(table.df) It's reading only the first page. Could someone please help me out?

python python-camelot

asked Jun 26 '19 at 16:21

Ayush ShaZz

0 answers

Same table is extracted twice from a pdf by Camelot-py

I am trying to extract tables from a multi-page PDF file using camelot-py v0.7.3. So far it has been the best PDF reader tool for me. I just needed to read the PDF line by line and detect tables manually. I tried many other tools such as tabula,…

python pdf-reader pdf-parsing python-camelot

asked Feb 21 '20 at 18:12

mk09

17 answers

Python-camelot (Error: GhostscriptNotFound while it is installed)

I am trying to extract tabular data from pdf using camelot and I am getting the following error. Code: tables = camelot.read_pdf(file_name) Error: GhostscriptNotFound: Please make sure that Ghostscript is installed and available on the PATH…

python python-camelot

asked Nov 15 '18 at 12:03

Venkatesan R

3 answers

Python Camelot borderless table extraction issue

I'm trying to extract some borderless tables, as shown in the image below, from PDF files. I have installed python-camelot as shown here, and it works fine for bordered tables only. Please find the details below: platform -…

python-3.x python-camelot

asked Nov 08 '18 at 14:03

Richie

3 answers

No module named 'camelot.ext'

I have been trying to run Excalibur after installing it with pip. It asked me to install Camelot, and after that this error popped up: Traceback (most recent call last): File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main return…

python python-3.x python-camelot excalibur-py

asked Oct 26 '21 at 06:38

Virus

2 answers

Find PDF Dimensions with Camelot

I am using Camelot to read complete PDFs and extract about 112 attributes from each one. I use table areas to extract the attributes: test_variable = camelot.read_pdf(filename, flavor='stream', table_areas=['38, 340 ,50, 328'])…

python pdf-extraction python-camelot

asked Jan 14 '19 at 06:32

A.A. F

1 answer

Camelot PDF dimensions

I have searched Stack Overflow extensively before posting this and have not been able to find anything on Camelot page dimensions. There is this question, which suggests using table_region, but that does not solve the OP's problem or mine. I unfortunately…

python python-camelot pymupdf

asked Dec 03 '19 at 19:19

Jinx

2 answers

Python PDF Parsing with Camelot and Extract the Table Title

Camelot is a fantastic Python library to extract the tables from a pdf file as a data frame. However, I'm looking for a solution that also returns the table description text written right above the table. The code I'm using for extracting tables…

python pdfminer tabula python-camelot

asked Oct 01 '19 at 13:04

Ali Asad

3 answers

Problems to extract table data using camelot without error message

I am trying to extract tables from this PDF link using Camelot; however, when I try the following code: import camelot file = 'relacao_medicamentos_rename_2020.pdf' tables =…

python ghostscript python-camelot pdf-extraction

asked Dec 30 '21 at 15:12

Gabriel Souto

2 answers

tabula vs camelot for table extraction from PDF

I need to extract tables from PDFs. These tables can be of any type: multiple headers, vertical headers, horizontal headers, etc. I have implemented the basic use cases for both and found tabula doing a bit better than camelot, but still not able to detect…

python pdf tabula python-camelot

asked Apr 23 '20 at 12:32

Niranjan Kumar

3 answers

How to find table region for camelot

As mentioned in camelot, we can extract tables from a particular region like: tables = camelot.read_pdf('table_regions.pdf', table_regions=['170,370,560,270']) But how can I find these regions for my PDF?

python-camelot

asked Sep 20 '19 at 09:00

Shubham Mishra

1 answer

How to get table coordinates using python-camelot?

I am trying to parse some PDF files in order to extract some key information. There are a number of tables in each PDF that contain part of this information. So I tried to use Camelot to extract the tables and got good results, but I want to extract…

python-3.x pdf python-camelot

asked Sep 19 '19 at 11:59

jessy

0 answers

How to switch table area coordinates in Python Camelot and Tabula-Py

I have obtained the coordinates of a table bounding box using Camelot, but I need to use tabula-py to extract the table data, as camelot is only extracting the first line in each table cell, even in lattice mode. I have noticed that when defining…

python python-3.x tabula python-camelot

asked May 08 '19 at 16:17

John
