[Up: tech] [Robot Wisdom home page]

What will your harddrive hold in ten years?

Jorn Barger December 2000

Software and hardware are always co-evolving, most dramatically when an incremental hardware advance makes a whole new genre of software possible.

While most of the media attention is currently on the Internet as a driving force of software innovation, quiet advances in harddrive storage [Dvorak passim] have opened up some possibilities that are hardly being discussed yet.

So long as you're working with Photoshop or MP3s or TiVo video, etc, a 40-gig harddrive is just bigger. But when you talk about ASCII text, 40 gigs is effectively infinite-- no human could read 40 gigs of text in their lifetime! [calculation]
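A rough version of that calculation can be sketched in Python. Every constant here (bytes per word, reading speed, hours per day) is an assumption picked for illustration, not a figure from this essay, so plug in your own.

```python
# Rough estimate: how long would one person need to read 40 gigs of plain text?
# All constants are assumptions for illustration only.

DRIVE_BYTES = 40 * 10**9   # 40 gigs, taking 1 gig = 10^9 bytes
BYTES_PER_WORD = 6         # assumed: about 5 letters plus a space, 1 byte each
WORDS_PER_MINUTE = 250     # assumed reading speed
HOURS_PER_DAY = 8          # assumed: reading as a full-time job, every day


def years_to_read(total_bytes: int) -> float:
    """Return the years of reading needed to get through total_bytes of text."""
    words = total_bytes / BYTES_PER_WORD
    minutes = words / WORDS_PER_MINUTE
    days = minutes / 60 / HOURS_PER_DAY
    return days / 365


if __name__ == "__main__":
    print(f"about {years_to_read(DRIVE_BYTES):.0f} years of reading")
```

With these guesses the answer comes out at roughly 150 years of eight-hour days, so the claim survives fairly generous changes to any one assumption.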

So the design question becomes-- how do you fill those 40 gigs with etexts in the most useful way, for computer-users-in-general (eg preinstalled when you buy a new PC)... and how do you customize it to be most useful to you personally?

You certainly, immediately, want it to start archiving everything you read on the Net, for future reference. You want this all to be word-indexed, like a generic search-engine but entirely 'local'. You want all the good stuff to be sorted by topic, into your own personal Yahoo/DMoz, reflecting your priorities... and you want it to watch and alert you whenever a good webpage you've archived/mirrored is updated.
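The 'word-indexed but entirely local' part might look something like this minimal sketch: an in-memory inverted index over archived pages. The class name `LocalIndex` and its methods are hypothetical, and a real archive would keep the index on disk rather than in memory.

```python
import re
from collections import defaultdict


def tokens(text):
    """Split text into lowercase words (letters and digits only)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class LocalIndex:
    """A tiny in-memory word index over archived pages (hypothetical design)."""

    def __init__(self):
        self.pages = {}                 # url -> archived text
        self.words = defaultdict(set)   # word -> set of urls containing it

    def archive(self, url, text):
        """Store a page's text and index every word in it."""
        old = self.pages.get(url)
        if old is not None:
            # Re-archiving: drop the old copy's words so stale hits disappear.
            for word in tokens(old):
                self.words[word].discard(url)
        self.pages[url] = text
        for word in tokens(text):
            self.words[word].add(url)

    def search(self, query):
        """Return the set of urls whose text contains every word in the query."""
        terms = tokens(query)
        if not terms:
            return set()
        hits = set(self.words.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self.words.get(term, set())
        return hits
```

Calling `archive()` on everything the browser loads and `search()` from a local search box would give the basic behaviour; sorting by topic and ranking are left out.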

When you think a whole website looks good, you probably want to mirror the whole site, so that your future local searches will find whatever it offers on that topic. When you start doing this, you can quickly fill your 40 gigs... but you can plan to double that capacity every year via inexpensive upgrades. So downloading the equivalent of the Encyclopedia Britannica-- about 200 megs-- every day, forever, is already a perfectly reasonable strategy.
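The storage budget can be checked with a few lines of Python. This is only a sketch: the function names are made up, 1 gig is taken as 1000 megs, and the upgrade schedule is assumed to be exactly one doubling at the end of each year.

```python
def days_to_fill(capacity_gb, daily_mb):
    """Days until a drive of capacity_gb is full at daily_mb per day."""
    return capacity_gb * 1000 / daily_mb


def yearly_budget(start_gb, daily_mb, years):
    """Yield (year, capacity_gb, archived_gb) for each year.

    capacity_gb is the drive size in use during that year;
    archived_gb is the total downloaded by the end of that year.
    """
    capacity = start_gb
    for year in range(1, years + 1):
        archived = daily_mb * 365 * year / 1000
        yield year, capacity, archived
        capacity *= 2   # assumed: one doubling upgrade per year


if __name__ == "__main__":
    print(days_to_fill(40, 200), "days to fill the first drive")
    for year, capacity, archived in yearly_budget(40, 200, 6):
        print(f"year {year}: {capacity} gigs of disk, {archived:.0f} gigs archived")
```

On this arithmetic a 40-gig drive fills in 200 days at 200 megs a day, so the plan depends on the upgrades arriving on schedule (and compressing the text would stretch it further).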

While you sleep, your Internet connection will be busy checking and updating and capturing whole sites (which EB.com will probably resent!).
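A minimal sketch of the overnight check, assuming the simplest possible test for 'updated': the page's bytes hash differently than they did last time. The function names are hypothetical. A real tool would want politeness delays, error handling, and a comparison of extracted text rather than raw bytes, since rotating ads or timestamps change the bytes without changing the content.

```python
import hashlib
import urllib.request


def fetch(url):
    """Download a page's raw bytes (no retries or delays in this sketch)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def fingerprint(content):
    """Return a SHA-256 digest that changes whenever the page's bytes change."""
    return hashlib.sha256(content).hexdigest()


def find_updates(mirror, fetcher=fetch):
    """Re-fetch every mirrored url and return the ones whose content changed.

    mirror maps url -> fingerprint of the copy saved last time. It is updated
    in place, so the next run compares against the newest copy.
    """
    changed = []
    for url, old_digest in list(mirror.items()):
        new_digest = fingerprint(fetcher(url))
        if new_digest != old_digest:
            mirror[url] = new_digest
            changed.append(url)
    return changed
```

The `fetcher` argument is there so the comparison logic can be tried out without touching the network.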

This moves in very much the opposite direction from Microsoft's .NET-- it assumes you can't trust any random website to be there when you need it, with the material you expected it to have.

And it improves customizability, because you can (eg) add links and comments to your copy of the page, making it your own...

topic-indexing

Building your personal Yahoo should center around creating custom topic-pages, which you may or may not publish to the Web. These will be like notebook pages that you continually revise as you study the topic-- accumulating links to web originals, to your local mirrors as well, and annotating these for your own reference. [more, wrt academia]

I find that most of these topic-pages require a timeline-- so the generic preinstalled 40-gig etextbase should include a basic topic-page-with-timeline for every popular subject! These will be halfway between Yahoo index-pages and Britannica articles-- they should include up-to-date links to the Web for more info, but also an encyclopedia-like summary of the topic. (I've been building lots of these, experimentally: old info)
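One way to picture a topic-page-with-timeline is as a small data structure: a summary, a dated list of events, and annotated links that can point at both the Web original and a local mirror. The sketch below is only an illustration; the class and field names are invented, not a format this site uses.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    url: str
    note: str = ""         # your annotation for this link
    local_copy: str = ""   # path to your mirror of the page, if any


@dataclass
class TimelineEntry:
    date: str              # "YYYY" or "YYYY-MM-DD", so plain string sort is date order
    event: str


@dataclass
class TopicPage:
    title: str
    summary: str = ""
    timeline: list = field(default_factory=list)
    links: list = field(default_factory=list)

    def add_event(self, date, event):
        """Add a timeline entry and keep the timeline in date order."""
        self.timeline.append(TimelineEntry(date, event))
        self.timeline.sort(key=lambda entry: entry.date)

    def render(self):
        """Return the page as plain text: title, summary, timeline, then links."""
        lines = [self.title, "", self.summary, ""]
        lines += [f"{entry.date}  {entry.event}" for entry in self.timeline]
        lines.append("")
        lines += [f"{link.url}  {link.note}".rstrip() for link in self.links]
        return "\n".join(lines)
```

For example, `page = TopicPage("James Joyce")` followed by `page.add_event("1922-02-02", "Ulysses published")` and `page.add_event("1882-02-02", "Joyce born")` leaves the 1882 entry first, whatever order the events were entered in.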

There will be lots of competing efforts to choose between, with their own spins and specialisations. It should be easy to pick a starter set based on reviews, but you'll eventually need a way to merge parts of many competing ones into your own synthesis...

Webloggers have a headstart on the challenge of building their own, because they've started archiving the best links, with annotations. They can go back thru these archives and sort the links by topic-- my netlit portal was a first try at this, using the categories: fun, art, media, issues, net, tech, science, history, search, shop. [qv]

But I feel like that experiment was a near-complete failure-- I hardly ever use those portal pages myself. Esthetically, they're just too noisy. (I trimmed almost all the pullquotes when I sorted them, which may have been a mistake, and I started rewriting my blurbs, but it didn't help that much, imho.)

I don't know if everybody shares this experience, but I've started lots of notebooks (lists, etc) on paper that were meant to accumulate ideas on a topic over an extended period, and I never managed to stay motivated by them-- given the choice between digging them out and adding a new thought, I always prefer just to start fresh. And this paradox can apply to webpages, too, if you're not careful.

I'm trying now to recycle the same portal categories as subdirectories, and I've started targeting my new topic pages for those directories. For example, this essay is for my 'tech' subdirectory, and it has the generic filename 'futures.html' so I expect to evolve it gradually into a more comprehensive topic-page on tech-futures-in-general. (I've also started an 'issues-futures' page: qv.)

As I think of more futures-topics I want to explore, the (old, original) content of this page will surely get moved to a subpage, but will still be linked and annotated from this page.

So in the long run I might offer a small Yahoo of topic-pages I've created, that others can download and merge into their own Yahoo, with their webpage-editor being smart enough to differentiate 'Jorn's old annotation for this link' from 'my new comments that override Jorn's' (hiding Jorn's in a link, probably, and using a different textcolor or other formatting for each, if they're both still on the same page?).
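The merge rule described here (your comment overrides the downloaded one, but the old one stays reachable) can be sketched as follows. The function name and the shown/hidden layout are assumptions made up for this illustration; how a webpage-editor would actually display the two layers is left open.

```python
def merge_pages(theirs, their_name, mine, my_name):
    """Merge two url -> annotation maps, letting my notes override theirs.

    Returns url -> {"shown": (author, text), "hidden": [(author, text), ...]}.
    "shown" is the annotation to display; "hidden" holds overridden ones,
    which an editor could tuck behind a link or show in another colour.
    """
    merged = {}
    for url, text in theirs.items():
        merged[url] = {"shown": (their_name, text), "hidden": []}
    for url, text in mine.items():
        entry = merged.setdefault(url, {"shown": None, "hidden": []})
        if entry["shown"] is not None:
            # Keep the overridden annotation instead of throwing it away.
            entry["hidden"].append(entry["shown"])
        entry["shown"] = (my_name, text)
    return merged


if __name__ == "__main__":
    downloaded = {"http://example.org/a": "Jorn's old annotation for this link"}
    own = {"http://example.org/a": "my new comments that override Jorn's"}
    for url, entry in merge_pages(downloaded, "Jorn", own, "me").items():
        print(url, entry)
```

Keeping the author's name with each annotation is what lets the editor tell the two layers apart later, even after several rounds of merging.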

This should work better than the 'portal' approach because each topic-page is created as an esthetic unity, not a heap of semi-random links. So I challenge other webloggers to try this approach-- start with a topic you care about and write a page that both gives an overview, and links the best stuff you've found about it.

Here's one approach

the Joyce experiment

For the subject you care most about, you may find you want to customize every topic-index-page about it. My James Joyce site has provided me with a lot of surprising insights on how this happens-- lately I've been working on a biographical timeline [150k] but I'm finding I have to spin off subpages for: Joyce's wife Nora, his daughter Lucia, his best friend Byrne, his nemesis Gogarty, his schoolpals, his siblings... and there's no end in sight, nor should there be.

As you study, you learn new things, and if you care to bother, you'll want to update all your relevant pages to reflect this new learning. When a topic page grows to the 100k region-- as the most central ones certainly will-- you'll naturally look for ways to split it into appropriate subtopics.

But you have to be very careful to minimize the number of different, overlapping places you'll have to update for a given topic-- geekthink imagines you can solve this problem technologically using an XML-type solution, but I don't believe it. (The categories XML would need, to do this, can only be discovered by the slow, thoughtful, un-automate-able process I'm describing.)

