New Beginnings and Improved Focus
By: Gil Price
Original: 11/9/2005
Modified: 12/30/2005
Over the past few months I have been asking myself some tough questions. Many of the answers have been less than stellar, and I have been making changes as a result. The re-targeting of this site is the first of many such changes.
The Issues:
If you have been reading my site for the past few weeks, you know I've recently become concerned about "data permanence": how permanent is the data I put online or keep on my hard drive? To tell the truth, while it seems like it will always be there, I must be realistic. All my data is subject to the failure of the hardware, the obsolescence of the applications which created it, or my inability to find it at some time in the future. So let's look at these 3 issues separately and then try to determine a safe course of action to resolve them in as painless and expeditious a manner as possible.
Hardware Failure:
While modern hardware is very resilient, it nonetheless sometimes fails. The failure can be anywhere from the power supply of the server where I host/store my files to the workstation(s) where I create them. It can be a hard drive failure, a network failure on my broadband connection, or a failure of my router, the wiring, or the power to my house. These failures are often impossible to anticipate, and time-consuming or costly to resolve.
In 1999, I signed up for the Time Warner Roadrunner service. This was the first broadband service available to me in Lexington, South Carolina. Since then I have been without service for about 5 weekends. While this seems really reliable, I do host my own web and e-mail server. In the past 4 years I have agreed to host 2 commercial web sites for which I do all the development and deployment. While I'm not really concerned if my wonderful prose is missing from the 'net for a few days, I am concerned about 2 businesses being offline.
I have never suffered a power-supply or hard drive failure, but my hardware isn't being replaced on any life cycle and is getting older every day. I have had numerous power outages due to weather (ice storms), and sometimes the power company cuts the power when they are making changes to the electrical grid or replacing my meter with a "new technology" one. The outages are usually of short duration, but 3 or 4 times in the past 10 years they have lasted from 3 to 9 hours.
My part of the United States is prone to tornadoes, hurricanes, and lightning strikes, any of which can cause total loss of my data as well as my home.
Obsolescence of Applications:
This doesn't seem to be that big a deal. I mean, an .html file is an .html file, and these can be edited with any old text editor. But I do create spreadsheets, databases, and formatted word documents on a fairly consistent basis. In my garage, in boxes in a corner somewhere, is a large collection of 5.25 inch disks filled with volumes of information created 12 to 15 years ago in MultiMate, Lotus 123 and Dbase III. These are part of my military history and all totally useless to me today. I don't have a 5.25 inch disk drive, and I certainly don't have any of the applications used to create the files. So what about my current files? I have been using MS Word since version 4.2. Anybody remember Office 4.2? Okay, we know template files created in Office 95 are not compatible with Office 2000, but everything is now guaranteed to be backward compatible to Office 2000, right? Think again. It would be bad planning to assume a Word document from 2006 will be readable by the computers we use in 2018 or later.
What about the web? Are PHP/Ruby/Perl/Python/TCL still going to be current scripting languages in 2018? I don't know and that's the rub. With all of this uncertainty, can I really count on technology to free my creativity and expect my children or their grandchildren to enjoy the fruits of my labor on their computers of the future?
Finding Files and Published Data:
Can I find the files I store on my workstation? And what about the files I store on the server? How many copies of the same file do I have scattered between my workstation and backup systems? How many copies are really necessary? Currently, I'm pretty good at making backups of my entire hard drive, individual work folders, etc. to backup media. The backup media I use is pretty standard fare. I have an external 250GB hard drive for backup of my server, a 120GB external drive for my workstation and an 80GB external hard drive for my laptop. I back up current working files to one of 3 or 4 USB "thumb" drives and weekly copy the files on them to my server. Sounds good? Well, it has been up till now. Recently I noticed I had the same file, in different versions, in 14 different locations on the external server backup drive. This file was also in 4 different locations on my laptop and in 7 locations on the workstation. Do I really need 25 copies of an Excel spreadsheet, saved at various times with different data? All but the current one on the laptop and its copy on the "thumb" drive are obsolete.
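The hunt for scattered copies described above could be automated with a short script. What follows is only a minimal sketch in Python, not the method I actually used: it assumes the backup drives are mounted as ordinary folders, it treats files with the same name as copies of one another, and the mount points at the bottom are made-up placeholders.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_copies(roots):
    """Group files under the given folders by lower-cased file name.

    Returns {name: [(path, mtime, digest), ...]} for every name that
    appears more than once, with each list sorted newest first.
    """
    by_name = defaultdict(list)
    for root in roots:
        for folder, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(folder, name)
                by_name[name.lower()].append(
                    (path, os.path.getmtime(path), file_digest(path))
                )
    return {
        name: sorted(entries, key=lambda entry: entry[1], reverse=True)
        for name, entries in by_name.items()
        if len(entries) > 1
    }


if __name__ == "__main__":
    # Hypothetical mount points (assumptions); replace with real folders.
    report = find_copies(["/mnt/server_backup", "/mnt/workstation"])
    for name, entries in report.items():
        versions = len({digest for _path, _mtime, digest in entries})
        print(f"{name}: {len(entries)} copies, {versions} distinct versions")
        for path, _mtime, _digest in entries:
            print("   ", path)
```

The first entry in each list is the most recently modified copy, which is usually the one worth keeping. Matching on file name alone will also lump together unrelated files that happen to share a name, so the output is a list of candidates to review, not a list of files to delete.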
Publishing my data on the Internet back in 1999 seemed like a really good idea. In fact I was quite the happy writer during the 2000 Presidential election, putting out my thoughts and opinions about the events leading up to the election and then joyfully offering my opinions on the aftermath. Interspersed in all of this I published bits and pieces of information gleaned and collated to bring a historical perspective on the events from a constitutional view, along with interesting bits about past elections going back to the very early ones in our nation's history. Sounds like fun? It was, but as those events slowly passed into history and my interests changed, I changed my publishing system. With the change in system came the realization I would have to convert nearly 500 entries by copying and pasting the text into the new system and adding a title, publish date and category to each. Guess what? I kept only the 80 to 100 best entries and let the rest disappear. Over the next few months I noticed those old, no-longer-to-be-found entries were still drawing a lot of traffic to my site. The visitors were getting the ole 404 (file not found) error from my server. In fact, for close to 8 months the 404 page was the most requested page on my site. Search engines are truly the work of the devil! It's great to see them driving visitors to your site for current pages, but painful when visitors are being served the ole 404!
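Finding out which missing pages visitors keep asking for is the kind of thing a few lines of script can do. The sketch below, in Python, is an illustration under assumptions: it expects the server to write its access log in the Common Log Format, and the log file name in the usage note is a placeholder.

```python
import re
from collections import Counter

# Picks the requested path and the status code out of a Common Log Format
# line such as:
# 1.2.3.4 - - [10/Oct/2005:13:55:36 -0500] "GET /old/entry.html HTTP/1.0" 404 209
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3})\b'
)


def count_missing(lines):
    """Count how often each requested path was answered with a 404."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts
```

Calling `count_missing(open("access.log"))` and then `.most_common(20)` on the result gives the twenty most requested missing pages, which is a reasonable shortlist of old entries to restore or redirect.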
Resolution:
So now that I have laid out my most pressing issues of the day, how am I going to resolve them? Glad you asked. While I don't have answers to all of them, I do have a strategy for getting to where I want to be in the near term.
Hardware Issues: I have subscribed to an online provider to host my web content! My files will reside locally and, when changed, be "published" to an off-site service which has a much better history of staying online and accessible than my home server. So online content is now safe. My online provider has redundant network connections, daily backup of all servers, and emergency power, none of which exist in my home.
I have also subscribed to an online storage service. Currently I have 4GB of online space to store the really important files. This also facilitates my accessing certain files from any web browser anywhere in the world. So if there are any files I might need while on vacation, I bundle them up into a .zip file and put them in online storage. I also copy the .zip to a "thumb" drive and make sure my laptop is one of the first things I pack. This solution also allows me a method of working on files from locations outside of my home, saving them to online storage and then retrieving them for further work at home from my workstation.
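The bundle-and-copy routine above is easy to script. Here is a minimal sketch in Python, assuming the thumb drive shows up as an ordinary folder. The function name and the paths in the usage note are made up for illustration, and the upload to online storage is left out because it depends on the service.

```python
import shutil
import zipfile
from pathlib import Path


def bundle_files(files, archive_path, copy_to=None):
    """Zip the given files and optionally copy the archive elsewhere.

    Files are stored under their bare names, so two files with the same
    name in different folders would collide inside the archive.
    """
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in files:
            file = Path(file)
            archive.write(file, arcname=file.name)
    if copy_to is not None:
        # copy_to may be a folder (such as a mounted thumb drive) or a file path.
        shutil.copy2(archive_path, copy_to)
    return archive_path
```

For example, `bundle_files(["itinerary.doc", "budget.xls"], "vacation.zip", copy_to="/media/thumbdrive")` would leave one archive ready to upload and a second copy on the thumb drive.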
Obsolete Applications: No immediate resolution really, but I am considering using XML more, along with more reliance on purely text-based solutions whenever possible. That means less reliance on proprietary applications and systems and more on open-source software. I must admit the majority of my work is with proprietary applications, but the resultant file is easily translated and/or imported into open-source solutions. I might not be able to work with the source document (proprietary database-backed application), but I can import and work with the final draft of a document (published open-source format).
Finding Files and Published Data: Here's the rub. Once the files are created or applications saved, where do I go to find the latest version of a file or application? In the past I've devised many schemes to try and organize my files and folders. For some reason, laziness or lack of attention, these plans or schemes always fail. The same thing happens with finding information on the Internet. I find a page with just the answer I'm looking for and save it to my favorites, and in 6 or so months, when I go back, the data is no longer available or has moved. My "favorite" links become outdated, or the list gets so long, with so many sub-folders for organization, that I spend too much time looking for the applicable link or information. I'm sure you know what I am talking about. So once again I am devising a system for keeping track of my important data. Only this time I am trying to be disciplined enough to stick to the plan. A future article here will look back at the changes I am making and measure their success.
Web-based information is another story altogether. I'm still struggling with deciding on my final publishing system. I've used and enjoyed so many in the past 4 to 5 years. Of particular note are MovableType, OpenACS, CityDesk, TextPattern and Wordpress. While all of them do the things they are designed to do very well, I'm torn between the flexibility of some and the ease of use of others. I've tried and become familiar with many more, but the five mentioned here are the ones I find the most useful, with the largest online communities. Currently I am using CityDesk to publish this site, but I have a desire to return to MovableType, and as always I keep my eye on OpenACS.
You might question my choice of CityDesk, but the choice is really rather academic at the moment. With the move of my primary publishing to an established provider, I really don't need to have a "system" installed any longer. On my personal server running from my home, I used MovableType and OpenACS the most, with an occasional foray into the world of ExpressionEngine, Drupal and others. But having an off-site provider has led me to publishing predominantly static pages with CityDesk. Although the application runs on the client side, CityDesk publishes my files to the off-site provider as static HTML. This gives my pages a smaller size and correspondingly faster load times.
Conclusion:
What am I really talking about here? I'm talking about a fundamental new view of computers and the role they play in my life. Rather than being the object to master, they are now the tool to facilitate my memory, the sharing of my thoughts and ideas and at the most basic level nothing more than a tool to get things done. I'm starting to use them to enhance my other life systems, not only as something to use for e-mail, word processing or balancing my checkbook.
Over the next few months I will be writing a short series of articles explaining how I am using computers, networks, and the Internet to keep track of those things they are best suited to keep track of, and how I go about protecting my data and systems from loss and failure.
(2,050 words)