Questions tagged [paddleocr]

57 questions
5
votes
2 answers

How can I fix the 'Error in PyMuPDF' when installing paddleocr with pip?

When running pip install paddleocr, I get an error while building the wheel for PyMuPDF. Building wheels for collected packages: PyMuPDF Building wheel for PyMuPDF (setup.py) ... error error: subprocess-exited-with-error × python setup.py…
Jinen Rathore
5
votes
2 answers

ImportError: cannot import name 'inference' from 'paddle'

I am trying to implement paddleocr. I have installed it using: #Github repo installation for paddle ! python3 -m pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple #install paddle ocr !pip install paddleocr !git clone…
Vikas Kumar
3
votes
1 answer

`use_angle_cls` and `cls` arguments in PaddleOCR

I want to use PaddleOCR for my text detection and recognition task, but I couldn't find enough documentation explaining why the arguments use_angle_cls and cls are used. The following code illustrates the text image inference in PaddleOCR. from…
npn
3
votes
2 answers

PaddleOCR freezes on macOS Ventura M1

This is my first ever post here. I have been working on a Python script to get text from photos using PaddleOCR. Obviously everything works as expected on Windows x64. I managed to install paddleocr successfully on my MacBook Pro M1 by manually…
MarcoT
2
votes
2 answers

PaddleOCR error: flag 'flagfile' was defined more than once

I am encountering an issue running PaddleOCR on an M1 MacBook: ERROR: flag 'flagfile' was defined more than once (in files '/Users/paddle/xly/workspace/f2bafd01-b80e-4ac8-972c-1652775b2e51/Paddle/build/third_party/gflags/src/extern_gflags/src/gflags.cc'…
user2392965
1
vote
1 answer

How to improve PaddleOCR performance? It is sometimes not able to detect spaces between words

I am working on data extraction from daily-use items using PaddleOCR. It works fine in most cases, but sometimes it merges two or more words into a single word and does not take the space into account. Is there a better way to solve this? Thanks in…
1
vote
3 answers

Error: Can not import paddle core while this file exists

I am currently using an Intel i3 with no GPU. I created a virtual environment with Python version 3.10.11, while my current Python version is 3.11.3. In the virtual env I tried installing paddleocr using pip install paddleocv and pip install…
Jinen Rathore
1
vote
1 answer

Paddle OCR gives "device id must be less than GPU count" error

I am trying to use Paddle OCR for reading numbers from images, but it gives me this error: "(InvalidArgument) Device id must be less than GPU count, but received id is: 0. GPU count is: 0. [Hint: Expected id < GetGPUDeviceCount(), but received id:0…
sgl
1
vote
0 answers

Getting [CRITICAL] WORKER TIMEOUT when using gunicorn in the docker container

I get this weird error on random requests when accessing the server. I am using gunicorn and exposing some OCR reading using paddleocr. [2022-11-28 09:46:30 +0000] [7] [CRITICAL] WORKER TIMEOUT…
yardstick17
1
vote
1 answer

How to install paddlepaddle with no-avx core

I am trying to use paddleocr in a docker container and keep getting the error below: Dockerfile FROM paddlecloud/paddleocr:2.5-gpu-cuda10.2-cudnn7-85d7d5 In docker container after building it: In [1]: from paddleocr import PaddleOCR /bin/grep:…
yardstick17
1
vote
1 answer

Not able to install PaddleOCR on Windows

I'm trying to install the PaddleOCR package following the Quick Start Guide. When I run this command to install PaddleOCR Whl Package: pip install "paddleocr>=2.0.1" # Recommend to use version 2.0.1+ I'm getting this error when building the…
1
vote
0 answers

AWS Textract tables and raw text in a single document

I am using Amazon Textract to extract data from PDF files to an S3 location. My documents consist of tables and paragraphs. When I parse through the extract, I get the raw text of all data; however, the table data is distorted. Below is the actual…
1
vote
1 answer

How do I find, download and install a trained PaddleOCR model?

How do I find and install a trained OCR model for PaddleOCR? I got confused by the official documentation on GitHub. I am looking for a "smart and complete" OCR model for PaddleOCR for Python.
0
votes
0 answers

paddleocr error: invalid ppem value

Please help me. font_path = 'D:/download/PaddleOCR-release-2.7/doc/fonts/simfang.ttf' # font package provided with PaddleOCR image = Image.open(img_path).convert('RGB') im_show = draw_structure_result(image, result,font_path=font_path) How can I solve this bug?
liumin
0
votes
0 answers

How to extract reconstructed table data's corresponding coordinates on the page?

Using PaddleOCR, I'm able to extract the tables from the page into an Excel file. It also generates a res file in the following format: {"type": "text", "bbox": [46, 292, 1469, 319], "res": [], "img_idx": 0} {"type": "text", "bbox": [44,…
Dipanshu Juneja