[Users] Processes in D state when vzctl chkpnt suspend

Stoyan Stoyanov s.stoianov at maxtelecom.bg
Tue Mar 20 12:44:19 EDT 2012


Hi,

I have an issue that occurs randomly when running vzbackups.
The problem is in the vzctl chkpnt veid --suspend step.

What happens is that all of the VE's processes go into D state.
There is nothing logged in dmesg or anywhere else, either on the node or in the container itself.
As you know, processes in this state are uninterruptible (un-killable).
I'm not sure what exactly happens, so please help me.
The VZ server doesn't use NFS or anything like that, but the filesystem is on LVM.
The kernel version is: Linux vz2 2.6.32-5-openvz-amd64 #1 SMP Mon Oct 3 05:12:50 UTC 2011 x86_64 GNU/Linux

Here is the ps axu output from the node, showing only the frozen container's processes:
204 root      6688  0.0  0.0   8352   636 ?        Ds   Mar12   0:01 init [2]
204 root      7296  0.0  0.0 119692  1292 ?        Dl   Mar12   0:01 /usr/sbin/rsyslogd -c4
204 root      7366  0.0  0.0  82588  3316 ?        Ds   Mar12   0:12 /usr/sbin/apache2 -k start
204 root      7384  0.0  0.0  20900   712 ?        Ds   Mar12   0:01 /usr/sbin/cron
204 root      7577  0.0  0.0  37160  2096 ?        Ds   Mar12   0:00 /usr/lib/postfix/master
204 101       7587  0.0  0.0  39380  2224 ?        D    Mar12   0:00 qmgr -l -t fifo -u
204 root      7622  0.0  0.0  49168   960 ?        Ds   Mar12   0:00 /usr/sbin/sshd
204 101       8899  0.0  0.0  39224  2132 ?        D    Mar17   0:00 pickup -l -t fifo -u -c
204 www-data 25719  0.0  0.0  82728  4044 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 26052  0.0  0.0  82728  4032 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 26894  0.0  0.0  82728  3900 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 27409  0.0  0.0  82728  3860 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 27542  0.0  0.0  82728  3832 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 27905 99.6  0.0  82728  3824 ?        R    Mar17 5182:40 /usr/sbin/apache2 -k start
204 www-data 28113  0.0  0.0  82728  3768 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 28191  0.0  0.0  82728  3760 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 28347  0.0  0.0  82728  3708 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 28720  0.0  0.0  82728  3628 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 28750  0.0  0.0  82728  3596 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 www-data 28849  0.0  0.0  82728  3560 ?        D    Mar17   0:00 /usr/sbin/apache2 -k start
204 root     28956 99.3  0.0  10220   520 ?        Rs   Mar17 5163:04 /usr/sbin/vzctl chkpnt 204 --suspend

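As you can see, almost all of them are in D state. The two exceptions are one apache2 worker (PID 27905) and the vzctl chkpnt process itself (PID 28956), both in R state at about 99% CPU.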
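
To make the split between D and R states easier to see, a listing like the one above can be tallied by its STAT column. This is only a minimal sketch in Python. It assumes the listing has the container ID as an extra first column followed by the usual ps axu fields, so that STAT is the 9th field; the function name tally_states is made up for illustration.

```python
from collections import Counter


def tally_states(ps_text):
    """Count processes by the first letter of the STAT column."""
    counts = Counter()
    for line in ps_text.splitlines():
        fields = line.split()
        # Assumed layout: VEID USER PID %CPU %MEM VSZ RSS TTY STAT START TIME [COMMAND...]
        # Headers and wrapped command continuation lines do not match and are skipped.
        if len(fields) < 11 or not fields[0].isdigit() or not fields[2].isdigit():
            continue
        counts[fields[8][0]] += 1
    return counts


sample = """\
204 root      6688  0.0  0.0   8352   636 ?        Ds   Mar12   0:01 init [2]
204 www-data 27905 99.6  0.0  82728  3824 ?        R    Mar17 5182:40 /usr/sbin/apache2 -k start
204 root     28956 99.3  0.0  10220   520 ?        Rs   Mar17 5163:04 /usr/sbin/vzctl chkpnt 204 --suspend
"""

if __name__ == "__main__":
    for state, count in sorted(tally_states(sample).items()):
        print(state, count)
```

Only the first letter of STAT is counted, so Ds, Dl and D are all grouped under D.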

Here is the stack trace for the vzctl chkpnt process:

[714486.771855] Pid: 28956, comm: vzctl Not tainted 2.6.32-5-openvz-amd64 #1 feoktistov X9SCL/X9SCM
[714486.771857] RIP: 0010:[<ffffffff810484cf>]  [<ffffffff810484cf>] wait_task_inactive+0x41/0xfb
[714486.771861] RSP: 0018:ffff8803578f1cf8  EFLAGS: 00000246
[714486.771863] RAX: 0000000000000001 RBX: 800000000000015d RCX: ffff8803578f1c78
[714486.771864] RDX: ffff880011a56940 RSI: 0000000000000296 RDI: 0000000000000292
[714486.771866] RBP: ffff880421c2e800 R08: ffff8803578f0000 R09: ffff88043a160780
[714486.771868] R10: 0000000100000000 R11: ffff880011b96940 R12: ffff880011a56940
[714486.771869] R13: 0000000000000000 R14: 0000000000016940 R15: ffff88043d280800
[714486.771871] FS:  00007f11a6e7e700(0000) GS:ffff880011b80000(0000) knlGS:0000000000000000
[714486.771873] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[714486.771875] CR2: 00007f9c12391ae0 CR3: 000000041f983000 CR4: 00000000000406e0
[714486.771877] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[714486.771878] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[714486.771880] Call Trace:
[714486.771881]  <NMI>  <<EOE>>  [<ffffffffa03defb6>] ? cpt_vps_suspend+0xede/0x138a [vzcpt]
[714486.771887]  [<ffffffffa03dca7f>] ? cpt_ioctl+0x5e5/0xcd2 [vzcpt]
[714486.771889]  [<ffffffffa03dc49a>] ? cpt_ioctl+0x0/0xcd2 [vzcpt]
[714486.771891]  [<ffffffff81134cde>] ? proc_reg_unlocked_ioctl+0xa2/0xc2
[714486.771894]  [<ffffffff810fd096>] ? vfs_ioctl+0x21/0x6c
[714486.771896]  [<ffffffff810fd5d3>] ? do_vfs_ioctl+0x47c/0x4cb
[714486.771899]  [<ffffffff810f1aa4>] ? vfs_write+0xcd/0x102
[714486.771901]  [<ffffffff810fd65f>] ? sys_ioctl+0x3d/0x5c
[714486.771903]  [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
[714486.771904] Pid: 28956, comm: vzctl Not tainted 2.6.32-5-openvz-amd64 #1
[714486.771905] Call Trace:
[714486.771906]  <NMI>  [<ffffffff8100fdda>] ? show_regs+0x3c/0x5d
[714486.771909]  [<ffffffff812ec738>] ? nmi_watchdog_tick+0xb7/0x1aa
[714486.771912]  [<ffffffff812ebe83>] ? do_nmi+0xa5/0x264
[714486.771914]  [<ffffffff812eb920>] ? nmi+0x20/0x30
[714486.771916]  [<ffffffff810484cf>] ? wait_task_inactive+0x41/0xfb
[714486.771917]  <<EOE>>  [<ffffffffa03defb6>] ? cpt_vps_suspend+0xede/0x138a [vzcpt]
[714486.771921]  [<ffffffffa03dca7f>] ? cpt_ioctl+0x5e5/0xcd2 [vzcpt]
[714486.771924]  [<ffffffffa03dc49a>] ? cpt_ioctl+0x0/0xcd2 [vzcpt]
[714486.771926]  [<ffffffff81134cde>] ? proc_reg_unlocked_ioctl+0xa2/0xc2
[714486.771928]  [<ffffffff810fd096>] ? vfs_ioctl+0x21/0x6c
[714486.771931]  [<ffffffff810fd5d3>] ? do_vfs_ioctl+0x47c/0x4cb
[714486.771933]  [<ffffffff810f1aa4>] ? vfs_write+0xcd/0x102
[714486.771935]  [<ffffffff810fd65f>] ? sys_ioctl+0x3d/0x5c
[714486.771937]  [<ffffffff81010c12>] ? system_call_fastpath+0x16/0x1b
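
The trace above covers only the vzctl side. To see where the D-state processes themselves are sleeping, the wait channel of each one could be read from /proc on the node. This is only a rough sketch in Python. It assumes that /proc/<pid>/stat and /proc/<pid>/wchan are present and readable on this kernel; the function name blocked_tasks and its proc_root parameter are made up for illustration.

```python
import os


def blocked_tasks(proc_root="/proc"):
    """Return sorted (pid, comm, wchan) tuples for every task in state D."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                stat = f.read()
            # comm sits in parentheses and may contain spaces;
            # the state letter is the first field after the last ')'.
            comm = stat[stat.index("(") + 1:stat.rindex(")")]
            state = stat[stat.rindex(")") + 1:].split()[0]
            if state != "D":
                continue
            with open(os.path.join(base, "wchan")) as f:
                wchan = f.read().strip()
        except (OSError, ValueError, IndexError):
            continue  # task exited meanwhile, or the file is unreadable
        found.append((int(entry), comm, wchan))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm, wchan in blocked_tasks():
        print(pid, comm, wchan)
```

It would need to run as root on the node. The output is one line per blocked task, giving the PID, the command name and the kernel symbol it is waiting in.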

I think I know what is happening, but I don't know how to fix it, and I would like to hear some suggestions.

Is anyone else suffering from this issue?
Do you have any idea what is happening? If I can provide any other useful information, please let me know.






Stoyan Stoyanov
Core System Administrator
