
InfoRaptor web-bot project

Jorn Barger March 2003

See alt.hypertext for current comments.

The theory is that as one does research on the Web, current tools continually throw away information, so that when you want to look up something you ran across some time ago, your odds of finding it quickly are pretty slim.

So the first strategy of InfoRaptor will be to archive and word-index all visited pages, and especially all search results.
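As a rough illustration of the archive-and-index idea, here is a minimal Python sketch. The class name `WordIndex` and its methods are hypothetical, invented for this example rather than taken from any InfoRaptor code, and it keeps everything in memory where a real tool would write the archive to disk.

```python
import re
from collections import defaultdict


class WordIndex:
    """Hypothetical in-memory archive plus inverted word index."""

    def __init__(self):
        self.pages = {}                    # url -> archived page text
        self.postings = defaultdict(set)   # word -> set of urls containing it

    def add_page(self, url, text):
        """Archive the page text and index every word in it."""
        self.pages[url] = text
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.postings[word].add(url)

    def search(self, query):
        """Return the sorted urls of pages containing every query word."""
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return []
        hits = set.intersection(*(self.postings.get(w, set()) for w in words))
        return sorted(hits)
```

Search-result pages can be added the same way as any other page, so a query you ran months ago is itself findable later. This sketch does not remove stale postings when a page is re-archived with different text.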

But a local word-index is only going to be a slight improvement over Google, so the second level of attack is topical master-pages, which bring together on a single local (unpublished) webpage all the resources you've come across for a given topic (even likely links that you haven't followed yet).

Pages for publication will be represented as a subset of the master page, and InfoRaptor will allow the author to click on a section of a published page and jump to the corresponding part of the master page, where all the related-but-rejected links, and all the related search queries, will be cached.

So the design challenge is to automate the maintenance of these master-pages as much as possible, so that they're useful, despite their gigantic size...

I've posted before to alt.hypertext about what I called 'action hypertext' [posts]; this is the latest evolution of that. The (new) general idea is that you have a master hierarchy of topics, and as you do research InfoRaptor will display your current place in that hierarchy, plus a range of nearby topics.
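One possible reading of 'nearby topics' is the parent, siblings, and children of the current topic. The sketch below assumes the hierarchy is stored as a simple topic-to-parent mapping; the function name `nearby_topics` and the sample topics are made up for illustration.

```python
def nearby_topics(parent_of, current):
    """Return the parent, siblings, and children of `current`.

    `parent_of` maps each topic name to its parent's name (None for the root).
    """
    parent = parent_of.get(current)
    children = sorted(t for t, p in parent_of.items() if p == current)
    if parent is None:
        siblings = []  # the root has no siblings
    else:
        siblings = sorted(
            t for t, p in parent_of.items() if p == parent and t != current
        )
    return {"parent": parent, "siblings": siblings, "children": children}


# Hypothetical hierarchy, for illustration only.
hierarchy = {
    "hypertext": None,
    "history": "hypertext",
    "theory": "hypertext",
    "action hypertext": "theory",
}
print(nearby_topics(hierarchy, "theory"))
```

A wider 'range' could be had by walking further up or down the tree, but that choice is not specified here.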

When you want to 'dismiss' a page, you select the best topic-match, and type a few words describing the page. (Action-hypertext theory suggests that there will be various standard categories that may be offered in a menu as well, like 'image' or 'etext' or 'analysis', or 'read later' or 'too deep' or 'too shallow' etc.)

The master-page will remember the date, the annotations, the original link, and the local archived copy, sorted into a convenient spot under the proper subtopic.
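To make the record-keeping concrete, here is a sketch of what a master-page entry and the 'dismiss' step might look like as data structures. All the names (`Entry`, `Topic`, `dismiss`) and the choice to sort entries by date are assumptions for this example, not a description of an existing implementation.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Entry:
    url: str            # the original link
    archived_path: str  # where the local archived copy lives
    annotation: str     # the few words typed when dismissing the page
    category: str       # e.g. 'image', 'etext', 'read later'
    dismissed_on: date


@dataclass
class Topic:
    name: str
    entries: list = field(default_factory=list)
    subtopics: dict = field(default_factory=dict)

    def subtopic(self, path):
        """Walk (and create as needed) the subtopic named by `path`."""
        node = self
        for name in path:
            node = node.subtopics.setdefault(name, Topic(name))
        return node


def dismiss(master, path, entry):
    """File `entry` under the subtopic at `path`, keeping entries in date order."""
    node = master.subtopic(path)
    node.entries.append(entry)
    node.entries.sort(key=lambda e: e.dismissed_on)
    return node
```

Sorting by date is only one guess at what 'a convenient spot' means; sorting by category or by annotation would work with the same structure.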

You'll have the ability to flag search-strings and websites for the bot to monitor for new content or other changes. (One of the main functions is to semi-automate finding substitute links when one goes 404.)
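The monitoring step could be sketched as follows, using only the Python standard library. Comparing a SHA-256 digest of the page body is an assumed way of detecting 'new content or other changes', and the function names `fetch` and `check` are hypothetical. Finding a substitute link after a 404 is left out, since the text does not say how that would be done.

```python
import hashlib
import urllib.error
import urllib.request


def fetch(url):
    """Return (http_status, body_bytes) for `url`."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # 404 and other HTTP errors arrive as exceptions; report the code.
        return err.code, b""


def check(url, last_digest, fetcher=fetch):
    """Compare the page with its last known digest.

    Returns (state, digest), where state is one of
    'gone', 'error', 'changed', or 'unchanged'.
    """
    status, body = fetcher(url)
    if status == 404:
        return "gone", last_digest      # candidate for a substitute link
    if status != 200:
        return "error", last_digest     # try again on a later run
    digest = hashlib.sha256(body).hexdigest()
    if digest != last_digest:
        return "changed", digest
    return "unchanged", digest
```

The `fetcher` parameter exists so the check can be exercised without network access. Network failures other than HTTP errors (for example, a DNS failure) are not handled in this sketch and would raise `urllib.error.URLError`.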


(Feedback to jorn@robotwisdom.com)



Hosting provided by instinct.org. Content may be copied under Open Web Content License.