

This is actually the hardest part of this method: figuring out what to compare and getting your arms around the items that are proving to be the problem. For example, if many (but not all) users are reporting a problem, particularly an intermittent one, it can seem overwhelming at first, because it’s hard to say where to start and which pieces of the network might be at fault. In cases like this, you’ll save your sanity if you start logging the calls and writing down, in tabular format, what the problems are, when they happened, under what circumstances, and how the workstations and users involved are configured. Such a chart might look like the one shown in Table 5.1.

Table 5.1 Sample log sheet.

Date  Time   Department  User   Problem                                                         Type of PC
----  -----  ----------  -----  --------------------------------------------------------------  ----------
8/11  2:00   Finance     Jack   Illegal operation error - WordPerfect                           Clone/486
8/11  2:20   Finance     Leona  Fatal exception 06 (during system boot). Seemed OK afterwards.  Dell/P6
8/11  3:00   Finance     Tracy  Illegal operation error - WordPerfect                           Clone/486
8/12  9:00   Finance     Tracy  Illegal operation error - WordPerfect                           Clone/486
8/12  10:45  Finance     Jack   Illegal operation error - WordPerfect                           Clone/486
8/12  1:00   Finance     Jill   Locked up. Had to reboot when reading email.                    Clone/486
8/13  11:00  Finance     Jack   Illegal operation error - WordPerfect                           Clone/486
8/13  12:45  Finance     Bill   Illegal operation error - WordPerfect                           Clone/486

A look at this detailed log makes it apparent that illegal operation errors account for most of your calls. (With more incidents in the log, the pattern would be even clearer. The other items are not repetitive and amount to “bumps in the road,” not hard errors.) So now you know that “some of these things are not like the others.” The ones that are alike are the persistent errors you’re trying to get rid of. Why isn’t Bill calling as much as Tracy or Jack? A quick phone call reveals that he was out of town on the 11th and 12th, so he’s not as much of a wildcard as the log implies.
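If the log is kept in a file rather than on paper, the same "which errors repeat?" question can be answered with a quick tally. The following is only a minimal sketch in Python: the rows are transcribed from Table 5.1, while the tuple layout and the function names `count_by_problem` and `repeat_problems` are assumptions made for illustration.

```python
from collections import Counter

# Rows transcribed from Table 5.1: (date, time, user, problem, pc_type).
# The tuple layout is an assumption made for this sketch.
LOG = [
    ("8/11", "2:00", "Jack", "Illegal operation error - WordPerfect", "Clone/486"),
    ("8/11", "2:20", "Leona", "Fatal exception 06 (during system boot)", "Dell/P6"),
    ("8/11", "3:00", "Tracy", "Illegal operation error - WordPerfect", "Clone/486"),
    ("8/12", "9:00", "Tracy", "Illegal operation error - WordPerfect", "Clone/486"),
    ("8/12", "10:45", "Jack", "Illegal operation error - WordPerfect", "Clone/486"),
    ("8/12", "1:00", "Jill", "Locked up reading email", "Clone/486"),
    ("8/13", "11:00", "Jack", "Illegal operation error - WordPerfect", "Clone/486"),
    ("8/13", "12:45", "Bill", "Illegal operation error - WordPerfect", "Clone/486"),
]


def count_by_problem(log):
    """Return a Counter mapping each problem description to its report count."""
    return Counter(row[3] for row in log)


def repeat_problems(log, minimum=2):
    """Return (problem, count) pairs reported at least `minimum` times."""
    return [(p, n) for p, n in count_by_problem(log).most_common() if n >= minimum]


# Print the tally, most frequent problem first.
for problem, n in count_by_problem(LOG).most_common():
    print(f"{n:2d}  {problem}")
```

Anything that shows up only once lands at the bottom of the tally, which is a reasonable first cut at separating the "bumps in the road" from the persistent errors. A tally cannot explain why a user such as Bill appears less often than the others; that still takes the phone call.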

Let’s look at a concrete example. I was recently involved in upgrading a department to Windows 95, which included upgrading to a new version of WordPerfect. The word from management was that this was a strategic change, making the possibility of rollback very small; it was up to us to make it work. Although the sample machines I had pilot-tested worked just fine, we started to have problems a day or so after the upgrade. Certain (not all) users began to report illegal operation errors. I was absolutely sure that all users were configured the same—identical login scripts, file permissions, and home directory configuration—but, unfortunately, because we were using clone hardware, I wasn’t as sure about the workstations.

I logged the incidents over several days and discovered that some users never reported the error. This indicated that the problem was not a moving target: it was staying in the same places. This is important to establish, because some errors do not pop up in the same places all the time; that is, they move from workstation to workstation. An error that moves around like that typically indicates a systemic problem rather than a problem with the individual workstations.

Next, I saw that only the clone PCs were having a problem with the illegal operation error; none of the name-brand PC users had reported it. Finally, I saw that not all of the users who had clones were reporting errors, only some of them. This led us to believe that there were component problems with certain clones.
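The two comparisons just described (clones versus name-brand PCs, then affected clones versus unaffected clones) amount to grouping the log by machine type and splitting each group's users by whether they reported the error. Here is a rough sketch in Python using the sample data from Table 5.1; the shortened problem labels and the function name `split_by_pc_type` are assumptions for illustration. Keep in mind that a call log only lists users who called, so a real comparison would also need an inventory of the users who stayed silent.

```python
from collections import defaultdict

# (user, pc_type, problem) triples taken from Table 5.1.
# The short problem labels are this sketch's own shorthand.
INCIDENTS = [
    ("Jack", "Clone/486", "illegal operation"),
    ("Leona", "Dell/P6", "fatal exception 06"),
    ("Tracy", "Clone/486", "illegal operation"),
    ("Tracy", "Clone/486", "illegal operation"),
    ("Jack", "Clone/486", "illegal operation"),
    ("Jill", "Clone/486", "lockup"),
    ("Jack", "Clone/486", "illegal operation"),
    ("Bill", "Clone/486", "illegal operation"),
]


def split_by_pc_type(incidents, problem):
    """For each PC type, split its users by whether they reported `problem`."""
    users = defaultdict(set)     # pc_type -> every user seen on that type
    affected = defaultdict(set)  # pc_type -> users who reported `problem`
    for user, pc_type, reported in incidents:
        users[pc_type].add(user)
        if reported == problem:
            affected[pc_type].add(user)
    return {
        pc_type: {
            "affected": sorted(affected[pc_type]),
            "unaffected": sorted(seen - affected[pc_type]),
        }
        for pc_type, seen in users.items()
    }


result = split_by_pc_type(INCIDENTS, "illegal operation")
for pc_type, groups in result.items():
    print(pc_type, groups)
```

With the sample log, the name-brand machine has no affected users, and the clone group splits into users who reported the error and one who did not, which is the same "only certain clones" pattern.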

Obviously, one group of these things was not like the other! To rule out a user problem, I switched the PC of a user who didn’t have the problem with the PC of a user who did have the problem. (This didn’t endear me to either user, but the errors stayed with the PC rather than following the user, which told me that the problem was definitely workstation related.)

I took inventory of the workstations that were acting up because I wanted to see what those workstations had in common. I made a new chart of the lower-level components (see Table 5.2). I left out Windows 95 and its components, because I had taken pains during the rollout to make sure that all the workstations were identical in this regard. I also didn’t scan for viruses, because I rolled out a scanner along with the Windows installation.

Table 5.2 A Sample Chart of Lower-Level Components

Workstation  Mfr/BIOS  Video Card  Hard Drive  RAM   Hub/Port
-----------  --------  ----------  ----------  ----  --------
Jack         AMI       Brand-X     Quantum     16MB  8/2
Tracy        Phoenix   Brand-X     Conner      16MB  9/3
Bill         AMI       Brand-X     Seagate     24MB  8/1
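A chart like this one can also be checked mechanically for columns in which every troubled workstation has the same value. What follows is a minimal sketch in Python, with the rows transcribed from Table 5.2; the dictionary keys and the function name `shared_components` are assumptions for illustration. A shared value only marks a suspect: the same check would have to come up different on the workstations that work before the component could be blamed.

```python
# Rows transcribed from Table 5.2 (workstations that were acting up).
# The dictionary keys are this sketch's own names for the table columns.
COMPONENTS = {
    "Jack": {"bios": "AMI", "video": "Brand-X", "disk": "Quantum",
             "ram": "16MB", "hub_port": "8/2"},
    "Tracy": {"bios": "Phoenix", "video": "Brand-X", "disk": "Conner",
              "ram": "16MB", "hub_port": "9/3"},
    "Bill": {"bios": "AMI", "video": "Brand-X", "disk": "Seagate",
             "ram": "24MB", "hub_port": "8/1"},
}


def shared_components(inventory):
    """Return {column: value} for columns identical on every workstation."""
    rows = list(inventory.values())
    if not rows:
        return {}
    return {
        column: value
        for column, value in rows[0].items()
        if all(row.get(column) == value for row in rows)
    }


common = shared_components(COMPONENTS)
print(common)
```

For the three rows in the chart, the only column that survives the check is the video card.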

