multi language support

32 votes

4 comments · General

under review · Duncan McGregor (Admin, VelOCRaptor) responded

To be frank, I’d like to get English working better first, but I’ll look at this then.

Anonymous commented · Sep 5, 2012

I don't know whether this has to do with the GUI or with the documents being parsed. Presumably the Google folks are working on foreign-language font parsing?

Let me save as raw text

22 votes

1 comment · General

Anonymous commented · Sep 5, 2012

I know of only two free tools that do a decent job of text extraction from PDF (considering layout complexities): pdftotext (there are several by that name; I mean the one that comes with xpdf) and Multivalent. There need to be two modes, which at least pdftotext supports: one that attempts to preserve layout even in plain text, and another that just extracts the raw strings (this is easier).
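
For anyone wanting to try the two modes mentioned above, here is a minimal sketch in Python that wraps the pdftotext command line. It assumes a pdftotext binary (xpdf or a compatible build) is on the PATH; the `-layout` and `-raw` flags are the tool's own, while the function names `pdftotext_command` and `extract_text` are made up for this illustration.

```python
import subprocess
from pathlib import Path

# Map a friendly mode name to the corresponding pdftotext flag.
MODE_FLAGS = {"layout": "-layout", "raw": "-raw"}


def pdftotext_command(pdf_path, txt_path, mode="layout"):
    """Build the pdftotext argument list for one of the two extraction modes."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode: {mode!r}")
    return ["pdftotext", MODE_FLAGS[mode], str(pdf_path), str(txt_path)]


def extract_text(pdf_path, txt_path, mode="layout"):
    """Run pdftotext and return the extracted text (needs the binary installed)."""
    subprocess.run(pdftotext_command(pdf_path, txt_path, mode), check=True)
    return Path(txt_path).read_text(encoding="utf-8", errors="replace")
```

Calling `extract_text("scan.pdf", "scan.txt", mode="raw")` would give the plain string dump, while `mode="layout"` asks the tool to keep the physical layout as best it can.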

When dealing with scanned documents in PDF, the output file should not be much bigger than the input

40 votes

3 comments · General

planned · Duncan McGregor (Admin, VelOCRaptor) responded

Thanks for the suggestion. We currently re-encode the image into the new PDF, which may change its size. Was the original black and white or greyscale rather than colour?

Anonymous commented · Sep 5, 2012

I also have a black-and-white document. It started at 8 MB and became 66 MB afterwards.
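
To illustrate why the colour question matters, here is a rough sketch in Python of how much uncompressed pixel data one scanned page holds at different bit depths. The page dimensions (roughly A4 at 300 dpi) and the helper name `raw_raster_bytes` are assumptions for the example. It says nothing about how VelOCRaptor actually re-encodes images, and real PDF sizes depend heavily on the compression filter used.

```python
import math


def raw_raster_bytes(width_px, height_px, bits_per_pixel):
    """Uncompressed size of a raster image, with each row padded to whole bytes."""
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px


# Assumed page size: roughly A4 scanned at 300 dpi.
WIDTH, HEIGHT = 2480, 3508

for label, bpp in [
    ("1-bit black and white", 1),
    ("8-bit greyscale", 8),
    ("24-bit colour", 24),
]:
    size_mb = raw_raster_bytes(WIDTH, HEIGHT, bpp) / 1e6
    print(f"{label}: {size_mb:.1f} MB uncompressed per page")
```

The same page carries 8 times more raw data at 8 bits per pixel than at 1 bit, and 24 times more in colour, so storing a black-and-white scan at a higher bit depth can plausibly inflate the output even before compression differences come into play.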

provide a progress bar to indicate how long the ocr will take

69 votes

planned · 3 comments · General

Anonymous commented · Sep 5, 2012

In addition to a progress bar, an optional log window would be useful. Surely the underlying engine has some sort of debug mode. It would help to know for example when it is getting lost or uncertain.
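
As a rough idea of how a progress bar and a log could share one loop, here is a minimal Python sketch. Everything in it is hypothetical: `ocr_page` stands in for whatever the underlying engine exposes, it is assumed to return a `(text, confidence)` pair, and the 0.5 confidence threshold is an arbitrary choice for the example.

```python
import logging
import time

log = logging.getLogger("ocr_progress")

LOW_CONFIDENCE = 0.5  # arbitrary threshold, assumed for this sketch


def run_with_progress(pages, ocr_page, on_progress, clock=time.monotonic):
    """OCR each page, reporting progress and a naive time-remaining estimate.

    ocr_page(page) is a hypothetical callable returning (text, confidence).
    on_progress(done, total, seconds_remaining) drives the progress bar.
    """
    total = len(pages)
    start = clock()
    results = []
    for done, page in enumerate(pages, start=1):
        text, confidence = ocr_page(page)
        results.append(text)
        # Estimate remaining time from the average time per page so far.
        elapsed = clock() - start
        remaining = elapsed / done * (total - done)
        log.debug("page %d/%d confidence=%.2f", done, total, confidence)
        if confidence < LOW_CONFIDENCE:
            log.warning("page %d: low confidence (%.2f)", done, confidence)
        on_progress(done, total, remaining)
    return results
```

A GUI could pass an `on_progress` callback that updates the bar, and attach a logging handler to `ocr_progress` to feed the optional log window, so the warnings show where the engine is uncertain.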
