Japanese OCR (Optical Character Recognition) Software

In college, my Japanese wasn't quite up to par, and I had to read several legal articles for my thesis. Since there were so many kanji I didn't know, I used OCR (Optical Character Recognition) software to digitize the articles, and then read them using a combination of rikaichan and other computer-based Japanese dictionaries.

OCR software converts printed text you scan into digital text that you can read in Microsoft Word, Firefox, etc. For Japanese, it works decently. It is certainly not perfect, and you will have to look up more complicated, rare kanji on your own, but if you have some short articles and little Japanese ability it can save you a lot of time.

That said, scanning the text is quite time-consuming. You have to line up the pages properly, and even a small mistake, like forgetting to scan one page, can cost you more time later when you have to go back and set everything up again.

For this reason, I wouldn't recommend scanning a whole book under most circumstances. In the long run, if you want to read Japanese books, simply biting the bullet and studying the kanji will help you more.

But if you're in a situation like I was, OCR software can help. I tried many different programs, but the only two that gave me any usable results were OmniPage and Readiris. Both are standard commercial OCR programs with broadly comparable features. However, in my own experience, as far as Japanese is concerned, OmniPage did a significantly better job of recognizing the kanji correctly. I often had to rescan pages with Readiris, and even then the output from OmniPage was more accurate.
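For anyone who wants to put a number on a comparison like this, here is a minimal Python sketch. It is not how the comparison above was made. It assumes you have typed out a short passage by hand to serve as a reference and saved each program's output for the same passage as plain text. The function names and the labels "program_a" and "program_b" are made up for illustration. The score comes from the standard library's difflib, which measures general string similarity rather than a formal character error rate, so treat it as a rough guide only.

```python
from difflib import SequenceMatcher


def similarity(reference, ocr_output):
    """Return a 0.0-1.0 similarity score between reference text and OCR text."""
    # SequenceMatcher compares the strings character by character, so it works
    # on Japanese text without any word splitting.
    return SequenceMatcher(None, reference, ocr_output).ratio()


def rank_outputs(reference, outputs):
    """Return (label, score) pairs sorted from most to least similar.

    `outputs` maps a label of your choosing (hypothetical here) to the text
    that program produced for the same passage as `reference`.
    """
    scores = {label: similarity(reference, text) for label, text in outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    reference = "日本語の文章"  # hand-typed passage (made-up example)
    outputs = {
        "program_a": "日本語の文章",  # every character recognized
        "program_b": "日本語の文草",  # one kanji misread
    }
    for label, score in rank_outputs(reference, outputs):
        print(label, round(score, 3))
```

A short passage with plenty of difficult kanji makes a more useful reference than a long, easy one, since the difficult characters are where the programs differ.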

In theory, ruby text printed on the page is added in parentheses after the word; in my experience, this was very hit-and-miss. But most books for adults don't have much ruby, so it's not likely to be much of a problem; I was much more aggravated by incorrect kanji recognition. Another annoyance is that the software sometimes mis-recognizes the size of kana; for example, あ is transcribed as ぁ. I found it curious that although the software did decent kanji recognition, it couldn't consistently get the size of the kana right.
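Both of these quirks can be partly cleaned up after the fact with a small script. The following is a minimal Python sketch, not a feature of either program; all of the function names are made up for illustration. It assumes the text is formal prose, where the small vowels ぁぃぅぇぉ rarely appear in hiragana, and that ruby readings show up as kana-only parentheses directly after a run of kanji. Small っ, ゃ, ゅ and ょ are left alone because they are common in ordinary text.

```python
import re

# Small hiragana vowels and their full-size counterparts.
SMALL_VOWELS = "ぁぃぅぇぉ"
FULL_VOWELS = "あいうえお"
_SMALL_TO_FULL = str.maketrans(SMALL_VOWELS, FULL_VOWELS)

# A run of kanji followed by a kana-only reading in ASCII or full-width
# parentheses. Assumption: a kana-only parenthetical after kanji is a ruby
# reading; a genuine kana-only aside would be removed as well.
_RUBY_IN_PARENS = re.compile(r"([\u4e00-\u9fff々]+)[(（][\u3041-\u3096ー]+[)）]")


def find_small_vowels(text):
    """Return (index, character) pairs for every small hiragana vowel."""
    return [(i, ch) for i, ch in enumerate(text) if ch in SMALL_VOWELS]


def enlarge_small_vowels(text):
    """Replace small hiragana vowels with their full-size forms."""
    return text.translate(_SMALL_TO_FULL)


def strip_ruby_readings(text):
    """Drop parenthesized kana readings that directly follow kanji."""
    return _RUBY_IN_PARENS.sub(r"\1", text)


if __name__ == "__main__":
    print(find_small_vowels("ぁる日"))            # [(0, 'ぁ')]
    print(enlarge_small_vowels("ぁる日"))         # ある日
    print(strip_ruby_readings("漢字(かんじ)を読む"))  # 漢字を読む
```

Checking the list from find_small_vowels by hand is safer than replacing everything blindly, since small vowels are legitimate in casual writing and are a normal part of katakana loanwords (which this sketch does not touch).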

Of course, OCR software is quite expensive. For your own personal use, it's probably not worth the money. In my case, I was able to get my college to pay for it, so it wasn't such a problem; if you can, I recommend you do the same.

quality

Both of the programs you link to have pretty bad reviews on Amazon, and those are probably for English, so I don't want to imagine what they're like for Japanese. Which one did you use in college?

OmniPage

As noted in the review, I used OmniPage. At the time, I think it was OmniPage 14.

The reviews may not be good on Amazon, and, as I noted in my review, I wouldn't recommend spending money on this yourself. However, if you are studying or working in a place where one or the other program is available, it might be worth a shot, depending on what your needs are.

One curious note, though: in my limited experience, I found Japanese recognition more or less on par with English recognition. English recognition wasn't great, but Japanese recognition didn't seem significantly poorer.

The problem with OCR software is that it's likely to get all of the easy kanji right but not the difficult ones, and of course the difficult ones are the ones you need help with.

Cheers,
Kaeru
