@kurekj - any luck?
When parsing CVs with Docling on Ubuntu with Python 3.11, some regions of the PDF that contain text are incorrectly classified as images instead of being recognized as text. This happens even with OCR enabled, and after trying different OCR engines and settings.
Environment:
Docling version: 2.10.0
Docling Core version: 2.9.0
Docling IBM Models version: 2.0.7
Docling Parse version: 3.0.0
Operating System: Ubuntu
Python version: 3.11
Relevant Code:
IMAGE_RESOLUTION_SCALE = 10.0
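The rest of the original code is not shown here, so the following is only a minimal sketch of how a scale value like this is typically passed to Docling, not the poster's actual setup. It assumes the Docling 2.x `PdfPipelineOptions` API, where a scale of 1 corresponds to roughly 72 DPI. The helper names `scale_to_dpi` and `build_converter` are hypothetical, and forcing full-page OCR is just one setting worth trying when text regions end up classified as pictures.

```python
IMAGE_RESOLUTION_SCALE = 10.0  # value from the post
POINTS_PER_INCH = 72  # PDF user space: 1 point = 1/72 inch


def scale_to_dpi(scale: float) -> float:
    """Approximate render resolution (DPI) for a given image scale (hypothetical helper)."""
    return scale * POINTS_PER_INCH


def build_converter(scale: float = IMAGE_RESOLUTION_SCALE):
    """Build a PDF converter with OCR enabled (a sketch, not the poster's code)."""
    # Imported lazily so scale_to_dpi() works without Docling installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import EasyOcrOptions, PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions()
    options.do_ocr = True
    options.images_scale = scale
    options.generate_picture_images = True
    # Assumption: OCR the whole page instead of only the detected bitmap areas.
    options.ocr_options = EasyOcrOptions(force_full_page_ocr=True)

    return DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )
```

With these assumptions, a scale of 10.0 corresponds to about 720 DPI, which is far above the 2.0 used in Docling's own figure-export example, so it may be worth testing a lower value as well. A converter built this way would be used as `build_converter().convert("cv.pdf")`, where `cv.pdf` is a placeholder path, followed by `result.document.export_to_markdown()` to inspect which regions came out as text.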