One day I hope to have a standard interface for searching Mozilla bookmarks, Eudora mailboxes, and OpenOffice files. Today, searching for data on my own computer is still much harder than it should be.
Rick Klau’s recent post about the 50,000th visitor to his website (http://www.rklau.com/tins/002228.html) got me thinking. It’s great when people visit my site, better when they link to it, and better still when they add value by commenting on the substance of a particular article. But for me, one of the best measures of success is how many people send me e-mail about a particular article. Since 1992, I’ve written four books and nearly 100 articles, and it is the positive feedback that I get via e-mail that keeps me going.
That said, searching for information in e-mail is not as easy as it should be. I started reading e-mail in Emacs and later switched to Eudora, which uses the same UNIX-style mailbox format: each mailbox is a single file that holds many e-mail messages. The problem with this storage method is that the files (i.e., mailboxes) get large rather quickly, so every year I create a new hierarchy of Eudora mailboxes. But every year, it gets more difficult to search for information. I’ve been saving much of my e-mail since 1992, and I currently have about 100,000 e-mail messages in about 1,400 mailboxes.
Eudora includes a “Find” feature, but it is not as capable as full-text search engines such as SWISH-E (http://www.swish-e.org/). However, all of the full-text search engines that I’ve considered operate at the file level: they index and let you find information in individual files. The problem is that an individual e-mail message is only a small part of a much larger file (i.e., a mailbox). I also have valuable information stored in my Mozilla bookmarks file, which is one (very large) file. So I have to use Mozilla to search for information in my bookmarks (by bookmarking my bookmarks.html file and using the “Find” command from the “Edit” menu), Eudora to search for e-mail messages, and a third program to search for everything else.
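To make the file-versus-message mismatch concrete, here is a minimal sketch of how an indexer could treat each message inside a mailbox as its own logical document. It assumes the mailbox follows the standard UNIX mbox convention (messages separated by lines beginning with "From ") and uses Python's standard mailbox module. The function name iter_messages and the "path#key" document ID scheme are my own inventions for illustration, not part of Eudora or SWISH-E.

```python
import mailbox
from typing import Iterator, Tuple


def iter_messages(mbox_path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (doc_id, subject, body) for each message in one mbox file.

    Assumption: the file is a standard mbox. The doc_id format
    "<path>#<key>" is hypothetical; it only needs to be unique.
    """
    # create=False: raise an error instead of creating a missing mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for key, msg in box.iteritems():
            subject = str(msg.get("Subject", ""))
            payload = msg.get_payload()
            # Multipart messages return a list of parts; this sketch
            # only indexes simple single-part text bodies.
            body = payload if isinstance(payload, str) else ""
            yield f"{mbox_path}#{key}", subject, body
    finally:
        box.close()
```

Each tuple could then be handed to a full-text indexer as if it were a separate file, so a search hit would point at one message rather than at an entire mailbox.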
It’s worth noting that I have about 33,000 files in my data directory. If each of my roughly 100,000 e-mail messages were stored as a logical file, it would quadruple the number of logical files on my computer, to about 133,000.
What is needed, therefore, is a new type of search engine: one that can tell whether it is searching a bookmark file, a mailbox, or a word processing file, and that produces different output accordingly. Simson Garfinkel’s article in the December 2002 issue of Technology Review hit the nail on the head: we need a new paradigm for e-mail (http://www.simson.net/clips/2002.TR.12.EmailClient.htm).
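As a rough illustration of what "telling the difference" might look like, the sketch below guesses a file's type from its first few bytes and picks an output format to match. It relies on three assumptions: a Mozilla bookmarks file begins with the NETSCAPE-Bookmark-file-1 doctype, an mbox-style mailbox begins with a "From " line, and OpenOffice documents are ZIP archives (which start with the bytes "PK"). The names classify, classify_file, and format_hit, and the output templates, are hypothetical.

```python
def classify(head: bytes) -> str:
    """Guess the kind of file from its leading bytes."""
    if head.startswith(b"PK\x03\x04"):
        # ZIP signature; OpenOffice documents are ZIP archives,
        # but so are many other files, so this is only a hint.
        return "zip-document"
    if head.lstrip().lower().startswith(b"<!doctype netscape-bookmark-file-1>"):
        return "bookmarks"
    if head.startswith(b"From "):
        return "mailbox"
    return "other"


def classify_file(path: str) -> str:
    """Read just enough of the file to classify it."""
    with open(path, "rb") as f:
        return classify(f.read(64))


def format_hit(kind: str, location: str, snippet: str) -> str:
    """Produce output suited to the kind of file that was searched."""
    templates = {
        "bookmarks": "Bookmark in {loc}: {snip}",
        "mailbox": "Message in {loc}: {snip}",
        "zip-document": "Document {loc}: {snip}",
    }
    template = templates.get(kind, "File {loc}: {snip}")
    return template.format(loc=location, snip=snippet)
```

A real tool would need sturdier detection than a 64-byte sniff, but the shape is the point: classify first, then index and display each kind of file on its own terms.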