Make the recognition accuracy better

Let's face it - the accuracy isn't great. Make it better.

89 votes

Duncan McGregor (Admin, VelOCRaptor) shared this idea · March 04, 2009

planned

Lloyd commented · November 06, 2012 01:21

I like the idea and it works quickly, but the accuracy was a problem. Lots of fiddly corrections. I will come back again later and see if it is more accurate.

Leigh Grossman commented · September 15, 2010 15:43

I'd love to be able to use this, but the recognition on a clean test file was unacceptable; it would have taken forever to clean up compared to the older programs I'm still using. I'll keep an eye out, and if the accuracy improves I'd be happy to use (and recommend) it.

Katie commented · July 28, 2010 14:02

Yes please! I'd love to be able to not only search, but do the occasional copy-paste of a sentence or a paragraph.

fcchambers commented · February 18, 2010 08:50

Kind of a marketing/positioning comment here: I was drawn to this product primarily to make my PDFs searchable via Spotlight... so in my case if 90% of the text is accurate, it would probably be enough for me! Elsewhere in the comments was a suggestion to pass the text to Spotlight... I'm wondering if a "stripped down" version, where *all* it did was index a doc for Spotlight, might have merit?

HandyMac commented · February 07, 2010 10:18

I picked up VelOCRaptor on a whim when I saw it mentioned at MacInTouch last summer, mostly because I liked the spirit evident in the name and icon.

I have occasional need for OCR, say to copy a paragraph or a few pages out of a book I'm reading. In the classic Mac OS days I used OmniPage, which came with my scanner; it worked, but was somewhat confusingly complex for my needs. In the OS X era I've used OmniPage (awkwardly "ported" from its old version) occasionally, then tried ReadIRIS, which was really daunting in its complexity. Lately I've just been saving the scans for when I might have time and energy to master the software.

Until today, when I finally got around to trying VelOCRaptor, and found it, as the little bear said, "just right". I scanned about 15 pages out of a book (300dpi TIFFs), dropped the scans on the icon, then copied the text out of the PDFs VelOCRaptor created, pasted it into TextEdit and went over it. Added an extra return for each paragraph, then used Devon's excellent WordService service to delete all the superfluous returns (cmd-shift-7).
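
For anyone who would rather script the "delete the superfluous returns" step, here is a minimal Python sketch. It assumes the text pasted out of the PDF has a blank line between paragraphs (the "extra return" mentioned above) and hard line breaks inside them. The function name `unwrap_paragraphs` is made up for illustration; it is not part of VelOCRaptor or WordService.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines, keeping blank lines as paragraph breaks."""
    # A paragraph break is two newlines with only whitespace between them.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, collapse every whitespace run (including single
    # newlines) into one space.
    joined = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(joined)


if __name__ == "__main__":
    sample = "The quick\nbrown fox.\n\nA second\nparagraph.\n"
    print(unwrap_paragraphs(sample))
```

Splitting on blank lines first is what keeps the paragraph structure; without that extra return there is no reliable way to tell a wrapped line from a new paragraph.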

Yeah, it could be a little more accurate, which is why I'm posting in this thread -- but after all, "more accuracy" is what any serious developer of OCR software would be working toward all the time anyway, no? Does any OCR engine understand hyphenated words? And it was a little funny that it consistently read "McLean" as "McAllen", and read lower case "o" as "0" (zero) in some scans. So it's true that the OCR'd material requires a fairly close reading and a fair number of corrections, but it's sure a lot faster than typing it all in (and I'm a pretty fast typist when I get going).
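
Two of the error types mentioned above (words hyphenated across a line break, and "0" read in place of "o") can be partly cleaned up after the fact. The following is only a heuristic Python sketch with hypothetical function names, not anything VelOCRaptor does; it will mis-handle genuine compounds split at a line end (such as "well-" / "known") and identifiers that really do mix letters with a zero.

```python
import re


def rejoin_hyphenated(text: str) -> str:
    """Rejoin a word split as 'recog-' + newline + 'nition'."""
    # Only join when a lowercase letter sits on both sides of the break.
    return re.sub(r"([a-z])-\n([a-z])", r"\1\2", text)


def fix_zero_for_o(text: str) -> str:
    """Replace '0' with 'o' when it touches a lowercase letter and no digit."""
    pattern = r"(?<=[a-z])0(?![0-9])|(?<![0-9])0(?=[a-z])"
    return re.sub(pattern, "o", text)


if __name__ == "__main__":
    raw = "a b0ok fr0m 1990 with p0or recog-\nnition"
    print(fix_zero_for_o(rejoin_hyphenated(raw)))
```

The digit lookarounds are there so that real numbers such as "1990" or "20th" are left alone. The output still needs a read-through, as with any OCR text.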

Anyway, I'm sure VelOCRaptor will become more accurate as time goes on, but otherwise I like it fine. And I'm sure you have some good ideas to make it better, but don't go adding a whole lot of "kitchen sink" features trying to please everyone. There are already industrial-strength OCR programs available (with byzantine interfaces) for those who need them.

Duncan McGregor (Admin, VelOCRaptor) commented · July 21, 2009 12:20

Please see http://blog.velocraptor.com/2009/07/is-velocraptor-good-enough.html

godffreypratt commented · July 20, 2009 12:39

I agree with Dyno wholeheartedly. Not to sound discouraging, but it's really of no use in its current state. Even using the crispest font, it doesn't recognise half the number characters. And the PDF output is blurry. It's actually rather cheeky getting users to act en masse as beta testers in this way. (More dubious practices to follow, no doubt).

dyno commented · June 14, 2009 03:32

The accuracy is so poor at present (even with excellent quality source files) that I'm afraid the app isn't really in a useable state. However the interface is delightfully simple and straightforward, which is why I would like to strongly encourage further development.

Duncan McGregor (Admin, VelOCRaptor) commented · May 08, 2009 13:36

Our current engine is OCRopus, which does layout analysis and then hands over to Tesseract as its character recognizer. So yes!

I'm hoping to be able to announce big news on the accuracy front soon.

pornel commented · May 08, 2009 02:55

Have you considered the Tesseract engine?
http://code.google.com/p/tesseract-ocr/

pornel commented · May 06, 2009 15:03

I converted a blurry image and got the following result:

-¬ _
[¬ª¬
1
11*1111111111.11111.¬ J111‚Äò1-1111:1tF ,I::¬
.111 1111`1t‚Äò11.11‚Äô11111i11.11,111 :11%
