[Devel] Re: Node Crashes on 64 bit owing to a single file.
Ligesh
myself at ligesh.com
Sat Jan 13 09:57:26 PST 2007
On Thu, Jan 11, 2007 at 07:19:52PM +0300, Kirill Korotaev wrote:
> Ligesh,
>
> >
> > -> ls -al in /vz/private/100 and /vz/private/400 shows large integer values in place of last modified time.
> > -> Once this particular vps containing the bad file is started, all applications--IN THE NODE--start segfaulting. An strace show it happening after getftime, so I think it could be related to modified time of the file.
> how did you strace it if you say that *ALL* applications start segfaulting?
> strace doesn't segfault?
Hehe. All _system_ applications crash. As I said, it crashes somewhere while reading time, here's the trace:
strace w.
---------------------------
open("/proc/19241/stat", O_RDONLY) = 4
read(4, "19241 (w) R 19240 19240 14483 34"..., 1023) = 227
read(4, "w\0", 2047) = 2
close(4) = 0
getdents64(3, /* 0 entries */, 1024) = 0
close(3) = 0
open("/etc/localtime", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2a983ab000
read(3, "", 4096) = 0
close(3) = 0
munmap(0x2a983ab000, 4096) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Process 19241 detached
------------------------------------------------
The kernel is the latest stable 64bit one. ovzkernel-smp-2.6.9-023stab037.3.x86_64.rpm
> >
> > I will send the kernel dmesg and the strace one I am able to login back to the server.
> would be very helpfull!
Actually the dmesg doesn't contain any information. But here's the ls -al of /vz/private and /vz/root directory. You will see that the offending vpses have their 'mtimes' shown as very large integers. They are the vpses which, when stopped, the problem goes away.
----------------------
# ls -al /vz/private
drwxr-xr-x 20 root root 4096 Jan 13 16:54 290
drwxr-xr-x 20 root root 4096 Jan 13 16:55 310
drwxr-xr-x 3 root root 4096 Jan 13 13:40 320
drwxr-xr-x 22 root root 4096 2666130989161975770 330
drwxr-xr-x 21 root root 4096 Feb 11 2007 340
drwxr-xr-x 20 root root 4096 Feb 11 2007 350
[root at cluster /]# ls -al /vz/root
total 100
drwxr-xr-x 20 root root 4096 Jan 13 16:53 280
drwxr-xr-x 20 root root 4096 Jan 13 16:54 290
drwxr-xr-x 20 root root 4096 Jan 13 16:55 310
drwxr-xr-x 22 root root 4096 2666130989161975770 330
drwxr-xr-x 21 root root 4096 Feb 11 2007 340
drwxr-xr-x 2 root root 4096 Jan 13 02:49 350
-------------------------
More information about the Devel
mailing list