Does this sound familiar? You open a huge PDF and need to quickly find the page containing the topic you're interested in. (We'll assume this PDF has no bookmarks or linked TOC that would do the job.)
Our Google-ized instincts immediately reach for the Find (Command/Control-F) field to enter the word or phrase we're looking for. Acrobat (or Reader, doesn't matter) finds the first couple of instances in a reasonable amount of time, but soon it slows to a crawl as we click Find Next one too many times and it hits a dry patch.
The little read-out says, "Searching 342 of 575 ... 343 of 575 ... 344 ... 345 ... 346 ..." Two minutes later and we're still staring at the page progression, hypnotized, waiting for a hit: "517 of 575 ... 518 ... 519 ... 520 ..."
Agh! Snap out of it, man!
By choosing one little command in Acrobat Pro v8, you can put an end to this misery for yourself and for anyone else who wants instant finding or searching, even in the most massive of PDFs.
-----
Embed an Index
-----
Using Acrobat Pro, you can create a full-text index
of the contents of a single PDF, similar to how Google indexes all the
text in the pages of a web site, and (new to v8) embed it into the PDF.
Then when you Find or Search, Acrobat or Reader searches the *index,*
not the PDF. Since the index file is much smaller, operations are
lightning-quick. And, since the index knows which page numbers its words
appear on, the end result is the same.
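To make the idea concrete, here is a minimal sketch of a full-text index in Python. It is an illustration of the general technique only, not Acrobat's actual index format or search code; the function names (`build_index`, `lookup`, `scan_pages`) and the three sample "pages" are made up for this example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase words (letters and digits only)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Build the index once: map each word to the pages it appears on."""
    index = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for word in tokenize(text):
            index[word].add(page_number)
    # Store sorted page lists so results come back in page order.
    return {word: sorted(numbers) for word, numbers in index.items()}

def lookup(index, term):
    """Fast path: one dictionary lookup, no page text is read."""
    return index.get(term.lower(), [])

def scan_pages(pages, term):
    """Slow path: read every page, every time (the unindexed search)."""
    term = term.lower()
    return [n for n, text in enumerate(pages, start=1) if term in tokenize(text)]

# Hypothetical sample document: one string per page.
pages = [
    "Blend modes control how colors mix.",
    "Use the Pages panel to add pages.",
    "Apply a blend to the selected frame.",
]

index = build_index(pages)
print(lookup(index, "blend"))
print(scan_pages(pages, "blend"))
```

Both calls report the same pages, but `lookup` never touches the page text. The cost of reading the whole document is paid once, when the index is built, rather than on every search.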
We've been able to create indexes in Acrobat Pro for many versions now, always using the Catalog command. PDF content providers typically index a folder full of PDFs so that a single Search (Command/Control-Shift-F) can hunt down the search text in a whole collection of PDFs. And I suppose you could use Catalog to create an index of a single PDF too, though I never bothered.
All that is still possible in Acrobat Pro 8, and the old ways of associating an index with a particular PDF still work.
But as I mentioned, Acrobat Pro 8 added a new twist: Indexes are embeddable in a PDF. Once they're embedded, you no longer have to keep track of the separate .pdx and .idx files generated for each PDF's index, making sure they always travel with the file. End users don't have to figure out how to tell Reader to use the index during Finds and Searches, since Reader 8 and Acrobat 8 automatically use it if it's embedded. (Earlier versions of Reader and Acrobat ignore the embedded index.)
Cool, huh? Best of all, it's dead-simple to do.
1. Open the PDF in Acrobat Pro 8 and choose Advanced > Document Processing > Manage Embedded Index.
2. The resulting dialog box will tell you that "the document does not contain an embedded index." Ignore that and click the Embed Index button.
3. An alert pops up, saying that Acrobat is about to 1) Save and close the document; 2) Build a search index for it; 3) Embed the index; and 4) Reopen the document. Click the OK button if you want to proceed ... yes indeedy, you do, so click!
The PDF closes, and after a few seconds of watching a progress bar create the index, it opens right back up again.
-----
Before and After
-----
For my guinea pig test file, I downloaded the InDesign CS3 "full documentation" PDF from Adobe's web site:
http://www.adobe.com/support/documentation/en/indesign_incopy/
This puppy tips the scales at 46.35 MB and 762 pages. Whoa, mama!
Before I indexed it, I ran a search (Edit > Search) for the term "blend" and timed it. On my late-model Compaq, Acrobat Pro 8 took 24 seconds to display the 153 matches in its Results window.
After embedding the index (which added 2.8 MB to the file size) and purging the Search cache (see below) to keep things fair, I ran the same search. This time, Acrobat took about, oh, a nanosecond to display the same 153 matches. I had the same blink-of-an-eye results in Reader 8, on both platforms.
You can bet that from now on, I'll be routinely embedding indexes in all of the larger PDFs on my hard drive, especially all those software documentation ones I keep needing to find things in.
If you post large PDFs for your customers to download, like catalogues or periodicals, you might want to do the same.
-----
About that Search Cache
-----
Both Acrobat and Reader already do something
similar when you're repeatedly hunting for terms in the same PDF. They
cache the text and save it in a file so that subsequent Finds and
Searches are fast. You can adjust the size of the cache, or purge it, in
Preferences > Search.
But embedding an index in a PDF ensures that Finds and Searches are always fast in Reader 8 or Acrobat 8, regardless of the state of the user's cache, even if it's the first time they need to find something quickly.