[Users] Need help with hanging servers

Brian Moon brian at moonspot.net
Tue Jul 6 14:03:49 EDT 2010


> Do the production servers hang, or is the hang restricted to all containers?

The host OS and containers are all non-responsive.

>> And it is a weird hang. They still respond to ping. And TCP
>> connections answer (connect) but don't respond. There is nothing in
>> syslog on the host server or any containers. There is nothing on the
>> console.
>
> I know that problem from hanging hard disks, and from various security
> incidents where I was called in to investigate.
>
> Things I'd do to investigate further:
> Set up a host which runs tcpdump and some web server. I'll call that
> host diaghost.
> Open a screen session (GNU screen, a textmode utility) on one of the
> production servers (not in a VE), and run the following commands, each
> in its own screen:
> ping diaghost
> while true; do curl -s http://diaghost/ -o /dev/null; sleep 1; done
>
> Monitor the network interface of diaghost with tcpdump (you only need
> ICMP), and monitor the web server logs as well. If your production
> server starts to hang, does it still send pings and HTTP requests to
> diaghost?
> Check the hard disk light as well (make sure it works for all disks). Is
> the hard disk light on while the server hangs, or is it off?
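
The ping and curl loop suggested above could be combined into one small
heartbeat script, sketched here under a few assumptions: the names
DIAGHOST, INTERVAL, heartbeat_line, probe and heartbeat_loop are made up
for illustration, and diaghost is assumed to answer plain HTTP on port 80.
Each request carries a sequence number in the query string, so the last
number in diaghost's web server log should show roughly when the
production server stopped sending.

```shell
#!/bin/sh
# Heartbeat sketch: one timestamped HTTP probe per interval against diaghost.
# DIAGHOST, INTERVAL and all function names are illustrative assumptions.

DIAGHOST="${DIAGHOST:-diaghost}"
INTERVAL="${INTERVAL:-1}"

# Format one log line: UTC timestamp, sequence number, probe result.
heartbeat_line() {
    printf '%s seq=%s http=%s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$1" "$2"
}

# Probe once: print "ok" if curl completed a request within 5 seconds,
# otherwise "fail". The sequence number ends up in diaghost's access log.
probe() {
    if curl -s -m 5 -o /dev/null "http://$1/?seq=$2"; then
        echo ok
    else
        echo fail
    fi
}

# Probe forever, printing one line per attempt.
heartbeat_loop() {
    seq=0
    while true; do
        seq=$((seq + 1))
        heartbeat_line "$seq" "$(probe "$DIAGHOST" "$seq")"
        sleep "$INTERVAL"
    done
}

# Only start the loop when called with "run", e.g.: sh heartbeat.sh run
if [ "${1:-}" = "run" ]; then
    heartbeat_loop
fi
```

Run it in a screen window on the host (not in a VE), next to the plain
ping. On diaghost, something like "tcpdump -n icmp" on the relevant
interface should be enough to watch the pings. If the sequence numbers
stop arriving while ICMP continues, that would suggest userland is stuck
while the kernel still answers.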

We will try some of this out, thanks. Unfortunately the servers are
off-site, so looking at them in person is not always possible. We do
have the consoles exported over serial.

Brian.

