Ask HN: What is the best method for turning a scanned book as a PDF into text?
204 points by resource_waste 7 days ago | 119 comments
I like reading philosophy, particularly from the authors rather than a secondhand account.
However I often run into that these come as scanned documents, Discourses on Livy and Politics Among Nations for example.
I would greatly benefit from turning these into text. I can snipping tool pages and put them in ChatGPT and it turns out perfect. If I used classic methods, it often screws up words. My final goal is to turn these into audiobooks, (or even just make it easier to copypaste for my personal notes)
Given the state of AI, I'm wondering what my options are. I don't mind paying.
aragonite 5 days ago | next |
I did this very recently for a 19th century book in German with occasionally some Greek. The method that produces the highest level of accuracy I've found is to use ImageMagick to extract each page as a image, then send each image file to Claude Sonnet (encoded as base64) with a simple user prompt like "Transcribe the complete text from this image verbatim with no additional commentary or explanations". The whole thing is completed in under an hour & the result is near perfect and certainly much better than from standard OCR softwares.