
Hacker Public Radio


hpr2771 :: Embedding hidden text in Djvu files

Part 2 of Klaatu's Djvu mini series


Hosted by Klaatu on Monday 2019-03-18 is flagged as Clean and is released under a CC-BY-SA license.
Tags: pdf, ebook, bloat, djvu.


To embed hidden text in a Djvu file, write a djvused script that gives, for each page, the bounding box of every piece of text. Each piece can be a character, word, line, paragraph, or region.

For good measure, you should first list the contents of your Djvu bundle:

$ djvused -e 'select; ls' test.djvu
   1 P   177062  p0001.djvu
   2 P   199144  p0002.djvu
   3 P    12323  p0003.djvu
   4 P    57059  p0004.djvu
   5 P    96725  p0005.djvu
   6 P    53868  p0006.djvu
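
The last column of that listing is the component name you pass to select later on. As a minimal sketch, the name for a given page number can be pulled out with awk. Here page_component is a hypothetical helper name, and the sample listing is the one shown above:

```shell
#!/bin/sh
# Hypothetical helper: print the component file name for page number $1,
# reading a `djvused -e 'select; ls'` style listing on standard input.
# Assumes the four-column layout shown above: number, type, size, name.
page_component() {
    awk -v n="$1" '$1 == n && $2 == "P" { print $4 }'
}

# Demonstration on the listing from this page, so djvused is not needed here.
page_component 4 <<'EOF'
   1 P   177062  p0001.djvu
   2 P   199144  p0002.djvu
   3 P    12323  p0003.djvu
   4 P    57059  p0004.djvu
   5 P    96725  p0005.djvu
   6 P    53868  p0006.djvu
EOF
```

On a real file, the same helper could be fed from djvused directly: djvused -e 'select; ls' test.djvu | page_component 4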

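Then define the location of the text in a script file called, for instance, content.dsed. Each entry gives its bounding box as xmin ymin xmax ymax in pixels, measured from the bottom-left corner of the page. This example assumes that each page is 1000 px by 1000 px: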

# select every page, then clear any existing annotations and hidden text
select; remove-ant; remove-txt

select "p0004.djvu" # page 4
set-txt
(page 0 0 1000 1000
 (word 100 600 450 800 "Hello" )
 (word 500 600 850 800 "world" ))

.

select "p0005.djvu"
set-txt
(page 0 0 1000 1000
 (line 100 400 900 600 "Hacker Puppy Radio"))

.
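
Many image tools report a box as x, y, width, and height measured from the top-left corner, while djvused expects xmin ymin xmax ymax measured from the bottom-left corner. A minimal sketch of that conversion follows. It assumes the bottom-left origin described in the djvused manual, and word_zone is a hypothetical helper name, not part of djvused:

```shell
#!/bin/sh
# Hypothetical helper: turn a top-left-origin box into a djvused word entry.
# Arguments: page_height x y_top width height text
# Assumption: djvused measures coordinates from the bottom-left corner.
word_zone() {
    page_h=$1 x=$2 y_top=$3 w=$4 h=$5 text=$6
    xmax=$((x + w))
    ymax=$((page_h - y_top))   # top edge, counted up from the bottom
    ymin=$((ymax - h))         # bottom edge
    # Escape backslashes and double quotes inside the quoted string.
    esc=$(printf '%s' "$text" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf ' (word %d %d %d %d "%s")\n' "$x" "$ymin" "$xmax" "$ymax" "$esc"
}

# Emit a set-txt block for a 1000 px by 1000 px page with two words side by side.
{
    printf 'select "p0004.djvu"\nset-txt\n(page 0 0 1000 1000\n'
    word_zone 1000 100 200 350 200 "Hello"
    word_zone 1000 500 200 350 200 "world"
    printf ')\n.\n'
}
```

Redirecting the output to a file such as content.dsed gives a script that can be applied in the usual way.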

Apply this script to your Djvu file with djvused. The -f option names the script, and -s saves the modified file once the script has run:

djvused -f ./content.dsed -s test.djvu

Converting from PDF to Djvu

You can convert PDF files to Djvu with the djvudigital command. Because of a license incompatibility, the Ghostscript driver it relies on is not bundled with Ghostscript, so you have to compile it yourself, but it's an easy build. Get the gsdjvu code, and then follow its README instructions.

Once you've built the Ghostscript driver, you can convert PDF to Djvu:

djvudigital --words foo.pdf foo.djvu
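
For more than one file, the same command can be wrapped in a loop. The following is a dry-run sketch that only prints the commands it would run. The file names are placeholders and djvu_name is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: derive the output name by swapping .pdf for .djvu.
djvu_name() {
    printf '%s\n' "${1%.pdf}.djvu"
}

# Dry run: print each conversion command instead of executing it.
# The input names below are placeholders for your own files.
for pdf in foo.pdf "my book.pdf"; do
    printf 'djvudigital --words "%s" "%s"\n' "$pdf" "$(djvu_name "$pdf")"
done
```

Once the printed commands look right, the printf line in the loop can be replaced with the djvudigital call itself.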
