Provide a progress bar to indicate how long the OCR will take
This should give an indication of progress, not just the spinny wheel.
-
Anonymous commented
In addition to a progress bar, an optional log window would be useful. Surely the underlying engine has some sort of debug mode. It would help to know, for example, when it is getting lost or uncertain.
-
To be honest, I was initially naive and thought that people would throw their own scans at VelOCRaptor, rather than complete works!
Consequently, I give all the pages to the OCRopus OCR engine at once, and since it does not provide feedback, I have no idea where in the list of pages it is.
I'm planning to revise this so that I hand individual pages to the engine. That way I can provide progress feedback, and also the ability to cancel while still writing out the work done so far. I expect this to be the major feature in a 1.1 release, but due to other commitments I can't start work on it for a few days.
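
For anyone curious, the page-at-a-time idea could look roughly like the sketch below. This is only an illustration, not VelOCRaptor's code: `recognize` is a hypothetical stand-in for whatever call hands one page to the OCR engine, and `on_progress` and `cancel` are names made up for this example.

```python
import threading
from typing import Callable, Iterable, List, Optional


def ocr_pages(
    pages: Iterable[str],
    recognize: Callable[[str], str],  # hypothetical single-page OCR call
    on_progress: Optional[Callable[[int, int], None]] = None,
    cancel: Optional[threading.Event] = None,
) -> List[str]:
    """Recognize pages one at a time, reporting progress after each page.

    If `cancel` is set, stop before the next page and return the text
    recognized so far, so the finished work can still be written out.
    """
    page_list = list(pages)
    total = len(page_list)
    results: List[str] = []
    for done, page in enumerate(page_list, start=1):
        if cancel is not None and cancel.is_set():
            break  # keep what we have; skip the remaining pages
        results.append(recognize(page))
        if on_progress is not None:
            on_progress(done, total)  # e.g. update a progress bar
    return results
```

Because the loop only checks for cancellation between pages, a cancel request takes effect once the current page finishes, which is the trade-off of not being able to interrupt the engine mid-page.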
-
Scott Bayes commented
I just threw Principia Mathematica in as a test, and see two needs for more than the spinny wheel:
- PM is HUGE and I have no idea how long it might take to complete recognition. I aborted after looking at Activity Monitor (see next item).
- Activity Monitor shows a Ruby process "not responding" at 97% CPU and VOCRr loafing at 0.7% CPU (original MacBook at 2 GHz). Obviously VOCRr is monitoring the human interface and the Ruby proc, which is where the action is, but it's not clear whether the Ruby proc is hung or is just so deeply immersed that it is not taking time to respond (the Ruby proc is 1 thread, so it is probably mostly ignoring the outside world except for whatever IPC you use). If I could see a progress bar clomping/trudging along (perhaps with a textual "page n/m" incrementing too), I'd know things were copasetic. Or not.
The bar doesn't need to be very accurate for my needs: on a single-page job, some rough idea of where it is if the page takes more than 2-3 sec; on a multi-pager, just the page n of m converted to a percentage of the bar would do fine. I think the additional textual "page n/m" would be useful though, because big jobs like Principia might push the bar so slowly as to make progress almost imperceptible.
OCR is prone to long execution times, so I think you need to consider your users' anxiety level and provide stronger indications that things are proceeding as expected.
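
To make the "page n of m" suggestion concrete, here is a minimal sketch of turning a page count into a bar percentage plus a text label. The function name and label format are made up for illustration and are not taken from VelOCRaptor.

```python
def progress_label(page: int, total: int) -> str:
    """Return a 'page n/m (p%)' label for a multi-page job.

    The percentage is what a progress bar would show; the n/m text keeps
    moving even when the percentage does not change between pages.
    """
    if total <= 0:
        raise ValueError("total must be a positive page count")
    page = max(0, min(page, total))  # clamp to the valid range
    percent = 100 * page // total    # whole-number percentage, rounded down
    return f"page {page}/{total} ({percent}%)"


# On a hypothetical 700-page job the bar barely moves early on,
# but the page counter still shows that work is happening.
print(progress_label(3, 700))    # page 3/700 (0%)
print(progress_label(350, 700))  # page 350/700 (50%)
```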