Questions tagged [paddleocr]
57 questions
5
votes
2 answers
How can I fix the 'Error in PyMuPDF' when installing paddleocr with pip?
When I run pip install paddleocr, I get an error while building the wheel for PyMuPDF.
Building wheels for collected packages: PyMuPDF
Building wheel for PyMuPDF (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py…

Jinen Rathore
- 65
- 1
- 6
5
votes
2 answers
ImportError: cannot import name 'inference' from 'paddle'
I am trying to implement paddleocr. I have installed it using:
#Github repo installation for paddle
! python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
#install paddle ocr
!pip install paddleocr
!git clone…

Vikas Kumar
- 85
- 1
- 11
3
votes
1 answer
`use_angle_cls` and `cls` arguments in PaddleOCR
I want to use PaddleOCR for a text detection and recognition task, but I couldn't find enough documentation on what the arguments use_angle_cls and cls are for. The following code illustrates text image inference in PaddleOCR.
from…

npn
- 304
- 1
- 14
3
votes
2 answers
PaddleOCR freezes on macOS Ventura M1
This is my first-ever post here.
I have been working on a Python script to get text from photos using PaddleOCR. Obviously everything works as expected on Windows x64.
I managed to install paddleocr successfully on my MacBook Pro M1 by manually…

MarcoT
- 31
- 2
2
votes
2 answers
PaddleOCR error: flag 'flagfile' was defined more than once
I am encountering an issue running PaddleOCR on an M1 MacBook:
ERROR: flag 'flagfile' was defined more than once (in files '/Users/paddle/xly/workspace/f2bafd01-b80e-4ac8-972c-1652775b2e51/Paddle/build/third_party/gflags/src/extern_gflags/src/gflags.cc'…

user2392965
- 435
- 2
- 4
- 13
1
vote
1 answer
How to improve PaddleOCR performance when it sometimes fails to detect spaces between words
I am working on data extraction from daily-use items using PaddleOCR. It works fine in most cases, but sometimes it merges two or more words into a single word and ignores the space between them.
Is there a better way to solve this?
Thanks in…

warriorwizard
- 11
- 2
1
vote
3 answers
Error: Can not import paddle core while this file exists
I am currently using an Intel i3 with no GPU.
I created a virtual environment with Python version 3.10.11, while my current Python version is 3.11.3.
In the virtual env, I tried installing paddleocr using pip install paddleocv and pip install…

Jinen Rathore
- 65
- 1
- 6
1
vote
1 answer
PaddleOCR gives "device id must be less than GPU count" error
I am trying to use PaddleOCR to read numbers from images, but it gives me this error:
"(InvalidArgument) Device id must be less than GPU count, but received id is: 0. GPU count is: 0.
[Hint: Expected id < GetGPUDeviceCount(), but received id:0…

sgl
- 95
- 8
1
vote
0 answers
Getting [CRITICAL] WORKER TIMEOUT when using gunicorn in a Docker container
I get this error on seemingly random requests when accessing the server. I am using gunicorn to expose some OCR reading done with paddleocr.
[2022-11-28 09:46:30 +0000] [7] [CRITICAL] WORKER TIMEOUT…

yardstick17
- 4,322
- 1
- 26
- 33
1
vote
1 answer
How to install paddlepaddle on a CPU without AVX support
I am trying to use paddleocr in a Docker container and keep getting the error below:
Dockerfile
FROM paddlecloud/paddleocr:2.5-gpu-cuda10.2-cudnn7-85d7d5
In the Docker container after building it:
In [1]: from paddleocr import PaddleOCR
/bin/grep:…

yardstick17
- 4,322
- 1
- 26
- 33
1
vote
1 answer
Not able to install PaddleOCR on Windows
I'm trying to install the PaddleOCR package following the Quick Start Guide. When I run this command to install the PaddleOCR whl package:
pip install "paddleocr>=2.0.1" # Recommend to use version 2.0.1+
I'm getting this error when building the…

Jaime Acevedo
- 11
- 2
1
vote
0 answers
AWS Textract tables and raw text in a single document
I am using Amazon Textract to extract data from PDF files to an S3 location. My documents consist of tables and paragraphs. When I parse the extracted output, I get the raw text of all the data; however, the table data is distorted.
Below is the actual…

Barry
- 13
- 3
1
vote
1 answer
How do I find, download and install a trained PaddleOCR model?
How do I find and install a trained OCR model for PaddleOCR? I got confused by the official documentation on GitHub. I am looking for a "smart and complete" OCR model for PaddleOCR for Python.

Igor Mikhailov
- 11
- 2
0
votes
0 answers
paddleocr error: invalid ppem value
Please help me.
font_path = 'D:/download/PaddleOCR-release-2.7/doc/fonts/simfang.ttf' # font package provided with PaddleOCR
image = Image.open(img_path).convert('RGB')
im_show = draw_structure_result(image, result, font_path=font_path)
How can I solve this bug?

liumin
- 1
0
votes
0 answers
How to extract reconstructed table data's corresponding coordinates on the page?
Using PaddleOCR, I'm able to extract the tables from the page into an Excel file. It also generates a res file in the following format:
{"type": "text", "bbox": [46, 292, 1469, 319], "res": [], "img_idx": 0}
{"type": "text", "bbox": [44,…

Dipanshu Juneja
- 1,204
- 14
- 29
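For the res-file question above, a minimal sketch of reading such a file might look like the following. It assumes the file holds one JSON object per line, as in the sample line quoted in the question, and that bbox is ordered [x_min, y_min, x_max, y_max]; that ordering is an assumption, not something the excerpt confirms. The helper name parse_res_lines is hypothetical and is not part of PaddleOCR.

```python
import json


def parse_res_lines(text):
    """Parse res-file text (assumed: one JSON object per line) into region dicts.

    parse_res_lines is a hypothetical helper, not a PaddleOCR function.
    """
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: bbox is [x_min, y_min, x_max, y_max] in page pixels.
        x_min, y_min, x_max, y_max = record["bbox"]
        regions.append(
            {
                "type": record["type"],
                "img_idx": record.get("img_idx"),
                "bbox": (x_min, y_min, x_max, y_max),
                "width": x_max - x_min,
                "height": y_max - y_min,
            }
        )
    return regions


# The one complete sample line quoted in the question.
sample = '{"type": "text", "bbox": [46, 292, 1469, 319], "res": [], "img_idx": 0}\n'
regions = parse_res_lines(sample)
for region in regions:
    print(region["type"], region["bbox"], region["width"], region["height"])
```

Each returned dict keeps the region type and page index next to its coordinates, so the entries with type "table" could be filtered out and matched back to the exported Excel data, if the res file does contain such entries.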