lavaman131 2 hours ago
This is awesome! Been needing something like this for some research paper diagrams I've been indexing.
KetoManx64 9 hours ago
What's the performance like compared to Tesseract?
I don't see Tesseract mentioned anywhere in the README, which is surprising considering it's the tool most people reach for first for image-to-text OCR.
mrkn1 9 hours ago
No rigorous eval, and I love Tesseract. Here's the example that motivated me to build textsnap (it's in the GitHub README), parsed with Tesseract:
Very noticeable difference, and the exact issue I run into repeatedly with Tesseract! Definitely going to try dropping textsnap into my scripts now. Thanks!!
abstract257 18 hours ago
Curious how it does on multi-page scanned PDFs vs. single screenshots? The ORT vision/decoder split is the part that usually makes or breaks CPU VLM OCR...
krunck 17 hours ago
I had to extract the page images from the PDF for it to work, then run it on each extracted page image.
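For anyone wanting to script that, here's a rough sketch of the extract-then-OCR loop. It assumes poppler's `pdftoppm` is installed for the page extraction. The `ocr` argument is a placeholder for however you invoke textsnap on a single image (the thread doesn't show its CLI or API, so that part is left to the caller), and the function names are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List


def pdftoppm_command(pdf: Path, prefix: Path, dpi: int = 200) -> List[str]:
    """Build the pdftoppm call that writes one PNG per page as <prefix>-<n>.png."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(prefix)]


def page_number(image: Path) -> int:
    """Read the page number from a name like page-03.png."""
    return int(image.stem.rsplit("-", 1)[1])


def extract_pages(pdf: Path, out_dir: Path, dpi: int = 200) -> List[Path]:
    """Render every page of the PDF to a PNG and return the files in page order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(pdftoppm_command(pdf, out_dir / "page", dpi), check=True)
    # Sort numerically so page 10 does not land before page 2.
    return sorted(out_dir.glob("page-*.png"), key=page_number)


def ocr_pages(pages: Iterable[Path], ocr: Callable[[Path], str]) -> str:
    """Run the caller-supplied OCR function on each page image and join the text.

    `ocr` is a hypothetical hook: wrap whatever command or library call
    you use to run textsnap on one image.
    """
    return "\n\n".join(ocr(page) for page in pages)
```

Usage would be something like `ocr_pages(extract_pages(Path("scan.pdf"), Path("pages")), my_ocr)`, where `my_ocr` is your own wrapper. Keeping the OCR step as a plain callable also makes it easy to swap in Tesseract for a side-by-side comparison.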
abstract257 32 minutes ago
Thanks
vivzkestrel 16 hours ago
How well do you think this will work with code? I mean taking screenshots of code and converting them into actual code for VS Code.
https://imgur.com/a/i2eQra8