
Network Nightmares!

Boy do I have one for you today! All right, check this out. Take a small office network, mixed Windows boxes, add in the need to access the Internet at specific times of the day, and what do you have? Stalled connectivity!

So here is the deal. My sister-in-law works for a small company that has been experiencing the strangest network behavior lately. Twice a day, without fail - at roughly the same time (give or take ten minutes), all Internet connectivity has gone out the Window. According to the local fill-in admin, things have been so bad that they have even switched out ISPs - still no success! The only remaining constants are the machines on the network and the router itself (SoHo type model).

Because all machines are fully updated and secured with some kind of anti-virus protection, I am thinking it must be something on one of the lesser maintained boxes hosting some sort of nasty malware? Personally, I am suspicious as to how safe any of these boxes really are using old versions of Windows and IE6. Seriously, this is always a recipe for disaster!

My two recommendations were to obviously run malware removal products again on all the boxes (disconnected from the network), but also to see what happens if all the boxes but one are disconnected during an event. Check the netstat for hits and see what the bandwidth and system resources are doing, one box at a time. Something like hunting for mice. What do you think might be going on? Has someone been feeding the gremlins again?
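For the "check the netstat" step, here is a minimal sketch of how the output could be tallied on each box. It assumes the four-column layout that Windows `netstat -n` prints (Proto, Local Address, Foreign Address, State); the function name and the sample addresses are made up for illustration, not taken from the office in question.

```python
from collections import Counter


def count_remote_hosts(netstat_text):
    """Count TCP connections per remote host in `netstat -n` style output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State.
        # Header lines and blank lines fail this check and are skipped.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        remote = fields[2]
        # Drop the port so connections to the same host are grouped together.
        host = remote.rsplit(":", 1)[0]
        counts[host] += 1
    return counts


# Hypothetical sample, using documentation-only IP ranges.
SAMPLE = """\
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    192.168.1.5:1045       203.0.113.9:25         ESTABLISHED
  TCP    192.168.1.5:1046       203.0.113.9:25         ESTABLISHED
  TCP    192.168.1.5:1047       198.51.100.7:80        TIME_WAIT
"""

if __name__ == "__main__":
    for host, n in count_remote_hosts(SAMPLE).most_common():
        print(host, n)
```

A box that suddenly shows dozens of connections to one outside host (especially on port 25) during the outage window would be the first place I would look.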

[tags]malware,gremlins,windows,network,isp,SoHo[/tags]

27 Comments

Could one or more of the boxes have preset automated scheduled tasks that initiate at those times of the day which would then slow down the whole network due to inadequate memory to support the network processes? Just a thought.

Larry

Put in IPCop with BOT addon. Something should be logged.

I think that we techies often know “too much” and tend to look at solutions within their area of expertise rather than globally.

A friend had a very similar problem some time ago. The system would reset at roughly 6pm every evening. It turned out to be a cleaner disconnecting a box in order to plug in the vacuum.

Worth checking to see if someone is unplugging the router to plug in a kettle or a cleaner!

Since you ruled out the ISP, I’m really wondering. Verizon locally quits twice a day here between 1:00 and 1:08 PM for about 2-3 minutes, and between 7:50 and 8:12 for about 4-6 minutes. It can vary between those times, but it happens every day. Verizon denies it, and says it’s a faulty router. Strangely the 2wire I bought myself, as backup does it at the same times.

Probably not your problem. I would ask one thing. Are any [or all] of these connected wirelessly? If so there could be a 2.4GHz event happening close by. Just a thought.

Go back to basics. I had this and it turned out to be a bad cable upstream.

Matt,
Hard to guess with the limited info. One possible cause: everything goes through a server box to connect, and that server is starting some sort of scan or update that consumes most if not all of the resources on that machine. I know I have had some machines brought to a stop by some of the popular security suites out there. It almost has to be a single point of blockage. Let us know.
Dan

I had a similar issue a while back that had me really scratching my head until I found, down behind a desk and buried in a mass of wires, a small 4-port switch that someone had plugged in to provide a few extra network jacks. Instead of running it just as a hub, it was busy arguing with the firewall (which provided IPs etc.) and this was causing regular network drops. Replaced the switch with a regular hub and problem gone.

This reminds me of the old story about how a company would, at the same time every day, lose access to the company server. Turns out the cleaning crew came by the computer closet that time, unplugged the server, plugged in the vacuum and proceeded to work. When they were done, they would unplug the vacuum, and replug in the server. The crew had no idea what they were unplugging.

Perhaps the AV is updating twice a day??

So you’re going to get a million responses saying check this, do that. Why not really understand what’s on the wire and sniff it? Use Wireshark, or your tool of choice. Just make sure you’re spanning the largest portion of the network you can. If it’s a switch the PCs plug into, then place a hub in between it and the router and sniff there. The answer will be there. Good luck.
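Once there is a capture, the quickest question to ask of it is "who was talking the most when things stalled?" A rough sketch, assuming the capture has been exported to rows of (timestamp in seconds, source IP, packet length); the function names and the row layout are my own assumptions, not something any particular sniffer produces by default.

```python
from collections import defaultdict


def bytes_per_source_by_minute(packets):
    """Sum bytes per source IP in one-minute buckets.

    packets: iterable of (epoch_seconds, source_ip, length_bytes) tuples.
    Returns {minute_start_seconds: {source_ip: total_bytes}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for ts, src, length in packets:
        minute = int(ts // 60) * 60  # floor to the start of the minute
        totals[minute][src] += length
    return totals


def top_talker(totals, minute):
    """Return (source_ip, total_bytes) for the busiest source, or None."""
    per_src = totals.get(minute, {})
    if not per_src:
        return None
    return max(per_src.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    rows = [(0, "192.168.1.5", 100), (30, "192.168.1.9", 500), (61, "192.168.1.5", 50)]
    totals = bytes_per_source_by_minute(rows)
    print(top_talker(totals, 0))
```

Comparing the top talker in a quiet minute against the top talker in the bad ten minutes should narrow the hunt to one or two machines.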

Robert Kubichek

June 5th, 2007
at 9:06am

I would also check the update scheduling for all the programs on the computers. It could be that whoever set them up used the default settings for the software, and if it is installed on all the computers, think of the sudden outbound traffic it would cause…
I have run into this several times, and with a larger lan, it is not good…
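If someone writes down the scheduled update and scan times for each machine, spotting a pile-up is easy to automate. A small sketch under that assumption; the machine names, the times, the 10-minute window, and the three-machine threshold are all invented for illustration.

```python
def to_minutes(hhmm):
    """Convert 'HH:MM' (24-hour) to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def clustered_jobs(schedules, window=10, min_machines=3):
    """Find times where several machines start a job close together.

    schedules: {machine_name: ["HH:MM", ...]}
    Returns a list of (start_minute, [machine names]) for every stretch of
    `window` minutes in which at least `min_machines` distinct machines start.
    """
    events = sorted(
        (to_minutes(t), name) for name, times in schedules.items() for t in times
    )
    clusters = []
    i = 0
    while i < len(events):
        start = events[i][0]
        j = i
        machines = set()
        # Collect every job that starts within `window` minutes of this one.
        while j < len(events) and events[j][0] - start <= window:
            machines.add(events[j][1])
            j += 1
        if len(machines) >= min_machines:
            clusters.append((start, sorted(machines)))
            i = j  # skip past the cluster we just reported
        else:
            i += 1
    return clusters


if __name__ == "__main__":
    # Hypothetical schedule collected by hand from each box.
    schedules = {
        "pc1": ["12:00", "19:55"],
        "pc2": ["12:05", "20:00"],
        "pc3": ["12:08", "03:00"],
    }
    print(clustered_jobs(schedules))
```

If a cluster lines up with the twice-a-day stall, staggering the schedules is a cheap experiment before anyone buys new hardware.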

“all machines are fully updated and secured with some kind of anti-virus protection”

Maybe all machines are trying to do the daily anti-virus updates at the same time?

JimP

If you don’t have a UPS on the router, put a small one on it (450 VA would be sufficient and cheap) to eliminate building power problems.

One possibility - mismatch between full and half duplex on the router? Just a thought…

B

Some routers can be programmed to reboot themselves on a regular schedule. Has this possibility been examined?

Learn all you can about their network, what device is doing what and how, and get feedback from each user; don’t rely on third-party feedback.

Be there 1 hour before the usual time it happens and be ready when it happens. When it does happen look at the network devices starting from the modem and router and then work your way back to the desktops.

You will never know what is happening unless you are there when it happens. It can be anything from the janitor unplugging a power cord, to all desktop apps updating at the same time, to a misconfigured network, to a bot on your network.

Also, set up a packet sniffer between the network and the router, on a hub, for at least two days and then go in to do the “hands on inspection” when the event happens.

Go in early so you can go over the packet sniffer log before the “Internet slow down event” starts.

NEW INFO!

No one has been unplugging anything. Cleaner cleans the office before anyone arrives in the morning. The problem is not with our T1 line, as we had the problem before switching from DSL. The problem is not with the router, as our out of state office experiences no drops and is hooked up through the server in the local office. I’m plumb out of ideas.

Claire is the person in question, folks. So I would say that packet sniffing is one way to track this as is simply breaking down and calling for a tech to come out - not me, of course… :)

Hello,

Has anything interesting shown up in the packet captures?

Regards,

Aryeh Goretsky

Dave Williams

June 6th, 2007
at 9:34pm

Had a client once who experienced regular ‘issues’ on their lan. Appears to have been interference caused by the bakery next door turning on the ovens :-o

Is the router set to ‘Keep Alive’? How about enabling ‘Clone MAC’?

Our virus scanner does a hard drive scan during the lunch break. When we had older machines (W98 etc.) it was a real hog. Made them completely useless for 15 minutes…

Since this is a small network and you know the time this hang-up happens, you can do what I call a plug and play approach. Start at the SoHo router: does it have any diagnostics? Can you ping out from the router to the Internet? Remove the SoHo router and connect one PC directly to the Internet. Can this one PC connect to the Internet?
I’ve always tried to narrow down the problem. Since you know the time this hang-up happens, you can use the one-PC-to-Internet mode to see if the ISP is the problem. You can also use the one-PC-to-Internet mode by physically allowing only one PC to access the Internet via the SoHo router during this time period. Disconnect all other PCs from the SoHo router and see if the one PC can access the Internet.

Good Luck

There’s a basic question here that hasn’t been answered/clarified, and there’s really no point in speculating until it is:

When the internet connection goes “out the window”, does the connectivity actually get dropped? Or is there so much attempted connectivity that individual connections slow down so much as to be unusable?

Those are 2 very different problems, and until we know which is actually the case, troubleshooting is pretty pointless.
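One way to tell the two apart is to time a small connection attempt every few seconds through the bad period and look at the pattern afterwards. A hedged sketch: `tcp_probe` uses a plain TCP connect as a stand-in for ping, and the labels, the 20% loss cutoff, and the 5x slowdown factor in `classify` are arbitrary assumptions that would need tuning against a known-good baseline.

```python
import socket
import statistics
import time


def tcp_probe(host, port=80, timeout=2.0):
    """Return the TCP connect time in seconds, or None if the attempt fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def classify(samples, baseline, slow_factor=5.0, max_loss=0.2):
    """Label a batch of probe results.

    samples: list of connect times in seconds, with None for failed attempts.
    baseline: typical connect time in seconds when the link is healthy.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ok = [s for s in samples if s is not None]
    if not ok:
        return "dropped"  # nothing got through at all
    loss = 1 - len(ok) / len(samples)
    if loss > max_loss or statistics.median(ok) > slow_factor * baseline:
        return "saturated"  # some traffic passes, but slowly or unreliably
    return "normal"


if __name__ == "__main__":
    # Hypothetical batches, not real measurements.
    print(classify([None, None, None], baseline=0.05))
    print(classify([0.5, 0.6, 0.4], baseline=0.05))
```

Running the probe against both the router's own address and an outside host at the same time would also show whether the trouble sits on the LAN side or beyond it.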

Claire’s comment that the offsite office is still able to connect “in” through the connection makes me assume that the connection is overloading, rather than dropping out, but that’s just a guess.

And if the offsite office is using the same connection, then you’re going to have to factor its usage characteristics into the equation.

Are the local office and remote office running peer to peer, or is this a domain-based system?

If the connection is in fact saturating, my guess is that something at the out-of-state office is replicating to the local server on a schedule - database, Active Directory, etc.

Can you get us any more info?

Paul

Matt,

Don’t know if this will help you or not:

A small manufacturing firm specializing in small jet engines & parts (I’d love to have one of their 450 hp turbines in my car!) called to say that their network was “going up and down.” Every XP machine would either get “Network Cable Unplugged” or “This connection has limited connectivity” messages. Both of these messages pointed to network equipment as the culprit. So that’s where we looked.

We figured it had to be the switch, so we put in a spare. Didn’t work. It didn’t make sense that anything else could be responsible, except maybe for new machines being installed in the shop that were causing power surges. But that hadn’t happened, so that wasn’t it. It was off to the races.

I went into the system event logs on the servers and found hundreds of warnings and information entries that went “link down”/“link up,” many of them in the overnight hours. This being an industrial area, I began to consider dirty power and brownouts on the power grid as the source of the problem.

But they had a battery backup unit in place, so that should handle brownouts. I went up and pulled the plug on it just to make sure it was doing its job and the network went down. Aha! Problem solved.

Cheers!
The Geek
