Backups on the Home Front

In computing, you have exactly two options: 1) Have current, working, tested backups or 2) don’t care if your data is there tomorrow. There is no third option. Pretending that there is leads only to substantial cussing.

Unfortunately, people mostly either know this already, or won’t be convinced of it until they learn from the school of bitter experience. So, this isn’t a post to try to convince you to take backups; it’s a post about how I do it, presented in the hopes that it’ll make doing so easier and safer. (As with many of my computer-related posts, the actual implementation is somewhat specific to UNIX-like systems, though many of the general principles apply universally.)

There are moon-letters here.

Return of the King, 1st US Edition Cover

Did you hear the one about the Aggie who had a truly first-class library of science fiction and fantasy?

The denizens of Texas A&M University take a lot of stick, some fraction of which they may perhaps deserve. As I’m a Rice alumnus, you may believe me when I say I’ve heard (and repeated) my share of the dreadful jokes.

But this post is about one of the places where not only have the Aggies excelled, but have done so within the realm of unqualified, unabashed flat-out geekishness — one of my personal favorite sorts of excellence, and one I deeply admire.

Finding files with identical contents

Need to search a filesystem for all the files which have identical contents? Read on for a Perl script that does that.

Note that this isn’t just a solution to the (much simpler) problem of finding files in different directories which have the same name; I’m talking about the actual data inside the file being duplicated. This script also works reasonably efficiently, so it’s still useful in cases where you have an extremely large number of files and/or the files in question are very large.
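To make the idea concrete, here is a minimal sketch in Python. It is not the Perl script the post refers to, and the names `file_digest` and `find_duplicates` are invented for this illustration. It assumes one common way of staying efficient: group files by size first, then read and hash only the files that share a size with at least one other file. Matching SHA-256 digests are treated as identical contents; a stricter version could follow up with a byte-by-byte comparison.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Chunked reads keep memory use flat even for very large files.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a list of groups (sorted lists of paths) with identical contents."""
    # Pass 1: group regular files by size. A file whose size is unique
    # cannot have a duplicate, so it is never opened.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and anything that is not a regular file
            by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates` on a directory returns one list per set of identical files, regardless of what the files are named or where under that directory they live. Hard links to the same file will show up as duplicates of each other in this sketch.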

Book: Heavy Words, Lightly Thrown

This is a thing I have recently read. You might read it, too, if you like:

Heavy Words, Lightly Thrown: The Reason Behind the Rhyme by Chris Roberts.

Each chapter starts with a bit of nursery rhyme, then describes — in a very conversational way — possible meanings, origins and interpretations. Though the subject matter may seem of interest only to those who believe literature to have user-serviceable parts inside, this book strives to entertain, even when it means stepping away from academic rigor.

The subject matter leaps from political intrigue to sexual innuendo to the dense web of literary reference, but the narrative remains interesting and informative throughout. It taught this jaded bibliophile a few new things, and dispelled as myths a few things I’d previously assumed to be true.

I found the English-to-American glossary in the back to be unnecessary and perhaps a little condescending, though I have my suspicions this was the idea of the publisher rather than the author.

[Edited 2010-02-10 by dhenke to remove Amazon links. See explanatory post.]

Barbarians at the gate: Excluding Bing via robots.txt

This isn’t about the variety of cherry. If you haven’t heard of Bing.com, it’s Microsoft’s recent attempt at a search engine. (If you’re curious, Google it.)

If you are not particularly pleased with the idea of a company like Microsoft making money from your creative work, then my strong suggestion is to create a robots.txt file in the root of your web-space, with contents not unlike:

User-agent: msnbot
Disallow: /

User-agent: *
Disallow:

The robots.txt is a voluntary standard which allows web page authors to exclude search engines from part or all of their sites. There’s a helpful website that has details about why you might want to do this, and how to go about it.
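If you want to check that rules like these say what you intend before uploading them, one option is the robots.txt parser in Python's standard library. The sketch below is only an illustration: the helper name `is_allowed` and the example.com URLs are made up, and it shows how a well-behaved crawler should read the rules, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# The rules from this post: shut out msnbot, allow everyone else.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if these robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical crawler names and URLs, for illustration only.
    for agent in ("msnbot", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "http://example.com/index.html"))
```

An empty `Disallow:` line means "nothing is disallowed", which is why the catch-all block leaves the site open to every crawler not named earlier in the file.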

Of course, there have been some allegations that Bing isn’t honoring the robots.txt standard. But announcing that they’re unwelcome is a fine symbolic act, even should they fail to honor your wishes.

Wherein the author both tests WordPress and inaugurates the site…

Is this thing on?

I’ve been henke@insync.net for a long damn time.

To the best of my ability to recall, my first email address was something or other @lanl.gov. This would be around or about 1987. After a four-year (somewhat overlapping) stint as something boring @rice.edu (Go Owls!), I wallowed in low-grade snark and minor infamy as henke@netcom.com and henke@scaly.ssc.gov. This would bring our timeline to approximately 1993.

After that, it was a move to Houston and henke@phoenix.net. I have fond memories of the phoenix account, notably the improvement in the signal-to-noise ratio of my email stream stemming from the inability of stupid people to spell “phoenix”. This was about the time spam was becoming a real problem, so although I was a frequent poster to UseNet, I munged addresses.

That brings us to 1999 (party as though it were same) and henke@insync.net. As of this writing, that account is still active. As of the dawn of 2010, it will not be. Alas, the fantastic, local, hacker-run insync.net was too good for this troubled world, and was gobbled up by texas.net lo, these many years ago. While the original domain still stands, I’ve grown tired of shelling out the Croessian sum of twenty bucks a month for a simple email forward and POP mailbox.

Thus, I’ve hired Hostmonster and registered the domain ‘mythopoeic.org’, at which I’m dhenke (as there’s an ahenke also, with whom I desire email parity; while I’m still Henke of Clan Henke, that’s only for formal occasions).

While hand-crafting HTML (and/or XHTML) with a plain old text editor has long been my habit, I have come around to the view that time spent on that sort of attention to detail is time that would serve both my readers and myself better were I to spend it on content. And, Hostmonster had an easy way to install WordPress, so, well… here we are.