The Register reports in a recent article “Excel ate my DNA” that Excel's “smart” formatting function has been causing consternation to medical researchers, and corruption to gene research databases…
…The errors are introduced because some genetic identifiers look very like dates to Excel. If the spreadsheet is not properly set up, it will convert an identifier, such as SEPT2 to a date: 2-Sep. The conversion, the researchers say, is irreversible: once the error has been introduced, the original data is gone.
…“A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package,” they write. “The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included.”…
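For anyone who wants to check a list of identifiers before it goes anywhere near a spreadsheet, here is a minimal sketch. It does not reproduce Excel's actual parsing rules; the two patterns are my own rough heuristics (a month name followed by a day number, and digits around a letter E), and `risky_identifier` and the sample value `2310009E13` are hypothetical names and data made up for illustration.

```python
import re
from typing import Optional

# Assumption: month names or abbreviations followed by 1-2 digits are the
# identifiers most likely to be read as dates. This is a heuristic, not
# Excel's real rule set.
_MONTHS = (
    r"(JAN|FEB|MAR|MARCH|APR|APRIL|MAY|JUN|JUNE|JUL|JULY|"
    r"AUG|SEP|SEPT|OCT|NOV|DEC)"
)
DATE_LIKE = re.compile(rf"^{_MONTHS}[-/ ]?\d{{1,2}}$", re.IGNORECASE)

# Assumption: digits, a single E, then more digits looks like scientific
# notation to a spreadsheet.
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def risky_identifier(identifier: str) -> Optional[str]:
    """Return 'date' or 'float' if the identifier looks convertible, else None."""
    text = identifier.strip()
    if DATE_LIKE.match(text):
        return "date"
    if FLOAT_LIKE.match(text):
        return "float"
    return None


if __name__ == "__main__":
    for name in ["SEPT2", "DEC1", "2310009E13", "TP53"]:
        print(name, risky_identifier(name))
```

Anything this flags is a candidate for importing as explicit text rather than letting the spreadsheet guess the type.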
These poor scientists are being bitten by a pet peeve of mine: software that's too fucking smart. I have this apparently odd belief that software should do what you tell it to do. No less, and certainly NO MORE. I assure both the software and its developers that they in fact do NOT know better than I do what I want to do with my computer.
I've been tripped up by Excel's oversized brain on a number of occasions. For example, I once was working with a database that had a single field holding two statistics about an entity, call them length and width. Now a proper database should probably split these into two separate fields (I certainly would, but I didn't build this database and I wasn't about to rebuild it). All I wanted to do was take the data out, massage it in certain ways, and put it back in. I chose Excel to do this. Unfortunately Excel bit me, hard, because the length/width column formatted length and width just like that. For example, 8/7 meant “8 inches long, 7 inches wide”. Import the data into Excel and suddenly 8/7 became 7-Aug-2002. I was working with a lot of columns at the time and didn't notice the change. Since the raw format for Excel dates is in fact a floating point number, when I pasted the data back into the database, the string “8/7” had changed into the number “38206”.
Gee. Thanks Excel.
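To see how a harmless string turns into a five-digit number, here is a rough sketch of the arithmetic. It assumes Excel's default 1900 date system, in which (for any date after February 1900) the serial number is the count of days since 30 December 1899, and it assumes a US-style month/day reading of the cell. The function names are mine, and the year is passed in explicitly because Excel fills in whatever year it assumes at the time the data is entered.

```python
from datetime import date

# Assumption: Excel's 1900 date system. For dates after February 1900 the
# serial number equals the number of days since 1899-12-30.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial(d: date) -> int:
    """Day-count serial for a date under the assumed 1900 date system."""
    return (d - EXCEL_EPOCH).days


def mangle(cell: str, assumed_year: int) -> int:
    """Mimic the damage: read 'M/D' as a date, return the serial number.

    The month/day order and the assumed year are illustrative assumptions,
    not a reproduction of Excel's locale handling.
    """
    month, day = (int(part) for part in cell.split("/"))
    return excel_serial(date(assumed_year, month, day))


if __name__ == "__main__":
    print(mangle("8/7", 2004))
```

Note that the result changes with the assumed year, so the same "8/7" pasted in different years would come back as different numbers.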
I've worked with a number of word processors that will notice when one has selected only part of a word and which assume one must have meant to select the WHOLE word. Is there anything more annoying than using a tool that argues with you?
When I pick up a hammer and hammer in a nail, I don't have to worry about the hammer deciding for me how the nail should be sunk. If I want to leave the nail sticking out of the wall because I intend to hang something from it, the hammer doesn't care. If I want to use the hammer to work metal, crack rocks, or scramble eggs, the hammer does exactly what I tell it to do.
If I'm writing a report in Microsoft Word and I copy an excerpt from a website to paste into my word document, should I really have to TELL Word, “by the way, paste this text I copied as TEXT okay?” Yep. Otherwise Word says “Hey this was copied from a website! I'm going to try and decipher the underlying HTML and format it appropriately here!” No you fucking retarded word processor, this is MY document, and I have already set up the styles I want to use. Stop trying to *guess* what I want, and just do as I ask.
Curly Quotes in Word is another example of software being too smart. When I type an apostrophe or a double quote, I expect my word processor to insert an apostrophe (ASCII 39) or a double quote (ASCII 34). I would assume that if I turned on “curly quotes” the quotes would appear curly, and they would print curly, but the actual character data would be unchanged. After all, when I select a letter and make it bold, it doesn't change into a different letter. A bold capital A (ASCII 65) is still a capital A (ASCII 65).
At least the last time I had to fight with curly quotes (some years ago), this was not the case. When I typed an apostrophe, Word would actually insert some high-order character which was presented as a single quote with the appropriate upward or rightward or whateverthefuckward curl. The problem is that ASCII only standardizes the low-order characters (codes 0 to 127). The high-order characters (128 to 255) are NOT part of ASCII at all, and different pieces of software map them to different characters. Thus when you pasted your Wordified text into an e-mail message, or a database, or a spreadsheet, or in fact pretty much anywhere other than inside a Word document, all of the curly quotes would be replaced with garbage characters, and suddenly your text became annoying and difficult to read.
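The garbage is easy to reproduce. The sketch below assumes the curly apostrophe was written out in the Windows-1252 code page (I can't say for certain which encoding any given version of Word used, so treat that as an assumption) and then read back by software expecting something else. `roundtrip` is a hypothetical helper name.

```python
PLAIN_APOSTROPHE = "'"        # ASCII 39
CURLY_APOSTROPHE = "\u2019"   # right single quotation mark, not in ASCII


def roundtrip(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one codec and decode it with another.

    Bytes the reader cannot decode are replaced rather than raising,
    which is roughly what a forgiving application would do.
    """
    return text.encode(written_as).decode(read_as, errors="replace")


if __name__ == "__main__":
    sample = "it" + CURLY_APOSTROPHE + "s"
    # Same codec on both sides: the text survives.
    print(roundtrip(sample, "cp1252", "cp1252"))
    # Writer and reader disagree: the apostrophe turns into junk.
    print(roundtrip(sample, "cp1252", "mac_roman"))
    print(roundtrip(sample, "utf-8", "cp1252"))
    # The single byte that carries the curly apostrophe in Windows-1252.
    print(CURLY_APOSTROPHE.encode("cp1252"))
```

A plain apostrophe makes the same trips unharmed, because every one of these code pages agrees on the first 128 characters.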
This problem may since have been resolved through the new Unicode character standard, but it never should have arisen in the first place. Microsoft still hasn't learned from the experience, and if you open a Word document and type three periods in a row (which most people call an ellipsis) Word will by default auto-replace them with a single special three-dotted character (which Microsoft calls an ellipsis). Type in a pair of hyphens between two words and Word will replace them with an em-dash, a single elongated dash character. Neither of these characters falls in the low-order ASCII table and there's no guarantee that they will look right when pasted into other applications, and in fact, it's a good bet they won't.
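If you are stuck cleaning up after this, one option is to undo the cleverness before the text leaves your hands. This is a minimal sketch, not a complete list of everything Word might substitute: the mapping covers only the curly quotes, the ellipsis character, and the two long dashes, and `to_plain_ascii` is a name I made up.

```python
# Assumption: these are the substitutions worth reversing. Other
# auto-replacements (fractions, symbols, and so on) are not handled.
SMART_TO_PLAIN = str.maketrans({
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / curly apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2026": "...",  # single-character ellipsis
    "\u2014": "--",   # em-dash
    "\u2013": "-",    # en-dash
})


def to_plain_ascii(text: str) -> str:
    """Replace the 'smart' punctuation above with what was actually typed."""
    return text.translate(SMART_TO_PLAIN)


if __name__ == "__main__":
    wordified = "\u201cWait\u2026\u201d \u2014 it\u2019s broken"
    cleaned = to_plain_ascii(wordified)
    print(cleaned)
    print(cleaned.isascii())
```

Characters outside the table pass through untouched, so legitimate accented text is not flattened along the way.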
Here's another one. Open MS Word and type this: Word is a *pain* sometimes. Chances are what you actually got was: Word is a pain sometimes, with the asterisks gone and “pain” in bold. Word, in its wisdom, has concluded that when you put asterisks around a word for emphasis, what you really meant to do was put it in bold. Thanks Word, what I really meant to do was exactly what I did, and by the way, would you please fuck off and do what I tell you to do?
For the curious, I took a quick peek at Word's TEN PAGES of settings and didn't see an obvious way to turn this particular annoyance feature off. I'm sure it's buried in there somewhere. Many of these features can be turned off by unchecking some checkbox buried in a preferences dialog somewhere. The features are turned on by default because if they weren't, they would never be used: nobody in their right mind wants to spend any time perusing a preferences dialog. Face it, we largely only visit such dialogs when the software is pissing us off.
Nearly every “smart” feature in the software I use on a regular basis in the end gets shut off by me. Which leads me to conclude that adding the smart feature in the first place wasn't all that smart. Software engineers and interface designers everywhere should think long and hard before they decide to make my work “easier”.
So if you find yourself annoyed by “smart” software, I recommend taking a hammer to it. The hammer will get the job done, at least until Microsoft starts making hammers.
PS: My thanks to my friend Paul for calling my attention to the article in The Register.