[c-nsp] Prove it's not the network!

Whisper whisper555 at gmail.com
Thu May 15 03:56:29 EDT 2008


Justin, I have always been under the impression that a Network Engineer's
primary role was going around constantly proving that the Network is not the
problem. :)

Your rant, I suspect, is more or less repeated on a daily basis by Network
Engineers all around the world.

On Thu, May 15, 2008 at 3:41 PM, Justin Shore <justin at justinshore.com>
wrote:

> Nathan wrote:
>
> > Proceed by elimination. If there is someone else in the office (I
> > suppose the T1 is not just for one person) whose Outlook is *not*
> > slow, and especially if "someone else" can be extended to "everybody
> > else" then the problem is not the network.
> >
> > Outlook can have severe speed/response problems when not kept healthy;
> > most notably there's something called PST files that have to be kept
> > at a reasonable size, or re-indexed or something, and people who like
> > to keep all their mail tend to run into that.
>
> Here's a long account of a similar battle over PSTs that I fought.
>
> I fought a 'blame-the-network' battle at a customer's site a couple
> years ago.  We built a brand-new GigE greenfield network in a new
> building and helped the customer move into their new digs.  Shortly
> thereafter a certain group of users started complaining that their
> computers were horribly slow, most especially Outlook.  This reached
> upper management before it came back down to us contractors so it was a
> huge deal when it landed at our feet.
>
> First thing we did was narrow down exactly who had the problem and who
> didn't.  95% of the complaints were "me too!" complaints and weren't
> legitimate.  The remaining 5% were isolated to one group of users in one
> specific area of the new building.  Their IT staff that was working on
> this problem with us immediately blamed us again because "it had to be
> the network's fault because all the users are in the same physical
> vicinity".  I showed them graph after graph of the network I/O from the
> Exchange servers through the core and down through the uplinks to
> distribution.  In the end we ended up graphing every affected user's
> port.  The graphs did not help; we were still to blame.
>
> Finally one day I sat down with the squeakiest user and had her show me
> exactly what was slow and the steps she took to make that happen from
> minute 1 of her walking into her office.  I had her shut down and start
> from a cold boot.  She commented that the login process was faster than
> normal and asked what I'd done to fix it (grrr).  She fired up Outlook
> and I noticed that it was very slow.  She said that it was faster than
> normal.  Finally Outlook came up and she started scrolling through her
> email.  She selected a message and waited 10 seconds or so for the
> message to come up.  Then she'd try to save the attachment to the
> desktop and it would take 4-5 minutes (for a 20MB attachment).  She
> continued on with her daily routine and started scrolling down through her
> Outlook folders.  I stopped her when I saw "Inbox, Sent, Drafts, etc"
> scroll by more than once.  This was the sign I was looking for.  I took
> the wheel at this point and started counting.  She had 8 (count them,
> EIGHT) sets of default Outlook folders because she had 8 PSTs mounted in
> Outlook.  She explained that she hits the Exchange PST hard limit of 2GB
> every 8-10 months.  The company's IT folks would export everything to a
> new PST to give her a fresh inbox.  Then they'd mount it in Outlook so
> she could have access to it (it was tax stuff so Legal wouldn't let her
> delete anything, literally).  I started hunting for the PSTs and found
> them on an old file server, one that we had no idea was related to the
> mail system.  She was mounting 8 roughly 2GB PSTs across the network to
> Outlook on a PC running XP w/ 128MB of RAM.  Wonderful.
>
> But it gets better.  I noticed that her inbox wasn't on the server but
> was instead in a PST on the same file server and her email was set to
> deliver to PST, not Exchange directly.  In this situation the way
> Exchange works, email is held on the server for PST users until they
> bring their Outlook online.  OL then downloads the queued up email and
> stuffs it into the PST.  Well, the PST was stored on the server so the
> client would have to manipulate the PST on the server.
>
> Oh, but it gets better still.  A few days later one of the sys admins was
> looking at the newly discovered file server that was apparently critical to
> the function of the mail server.  From across the room we hear loud
> profanity and run over to see what happened.  He discovered that the
> idiot IT staff set up Windows to compress the non-RAIDed drive that
> contains all the user PSTs and home directories because they ran low on
> drive space about a year earlier.  Before a user's OL client can modify
> the PST the server has to decompress the entire PST, then write the
> changes for the client, and recompress the PST and then write it back to
> disk.  The server was a low-end MS box with 256MB of RAM with no RAID
> and a backup that usually failed.  Oh, and that sys admin also
> discovered shortly thereafter that all of the users created in the past
> year and a half were set to deliver to PST because of, you guessed it,
> another drive space issue.  Isn't that nice.
>
> All the users that reported this problem turned out to be users that
> handled tax data and couldn't delete any email.  That's why that group
> of users all experienced the problem.  Every single one of these users
> was mounting 2-8 2GB PSTs across the network.  Those that shut down at
> night would come in at 8am and fire up their computers.  A couple dozen
> different users would all try to pull down their PSTs from the
> compressed file system of the poor server.  So it wasn't the network's
> fault.  The network was running like a champ.  The POS server put into
> mission critical service by incompetent IT staff was to blame.  We spent
> weeks troubleshooting the problem and trying to convince management that
> the network was fine.  In the end I had to sit down with a user, watch
> everything that they did and then analyze their steps to figure out what
> was causing the problem.  Oh, and the reason it was faster the day I
> worked with her was because we did this mid-morning, not at 8am.  Did
> anyone ever apologize (even figuratively) to the network folks?  Nope.
> Of course not.
>
>
> As a network engineer I've found that the vast majority of my job is
> helping other people find their problems.  The network seldom breaks and
> when it does it's not subtle; it's catastrophic.  Even highly skilled
> technical people still blame the network when their stuff doesn't work
> right (after all my network is just a bunch of tubes, right?).
> Networking is like mysterious dark magic that no one seems to
> understand.  It's the gremlins on the wire that cause Windows to crash,
> not poor programming and a lack of QA.  Networking is simply not
> understood by most people and it's human nature to fear and loathe what
> they don't understand.  To be able to do my job effectively I have to
> know my shit and everyone else's well enough to know how something works
> when it inevitably breaks.  Had I not come into networking with a
> systems background and were I not a quick study under fire I would not
> be good at what I do.  Did something "suddenly" break that must have
> been caused by the network maintenance I did last week?  No, it's the
> fact that it never worked to begin with and you never actually tested it
> when you deployed it a year ago.  It wasn't until a user tested it for
> you that you became aware of the fact that it wasn't working.  It just
> happened to come a week after I did maintenance on an unrelated device
> on an unrelated network.  But I'm going to spend all morning sniffing
> and decoding traffic to help you realize that this device off to the
> side over here couldn't possibly be involved.  *sigh*  Story of my life.
>
> </OT RANT>
>
> Justin
> _______________________________________________
> cisco-nsp mailing list  cisco-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/
>

