[Users] Kernel panic on restore

Roman Haefeli reduzent at gmail.com
Fri Mar 28 03:03:52 PDT 2014


On Wed, 2014-03-26 at 22:30 +0400, Andrew Vagin wrote:
> Hello Roman,
> 
> Could you file a bug to bugzilla.openvz.org and assign it to me?

Sure: https://bugzilla.openvz.org/show_bug.cgi?id=2926

I discovered some other problems related to that which I'd still like to
discuss here on the list.

After the crash of the HN that tried to restore the CT, I manually tried
to start the CT on another HN. I did:

$ vzctl start ploop4

and the HN on which I issued this command crashed immediately as well,
because instead of starting the CT it tried to restore it from the dump
file that was still lying around. To me it looks like the crash merely
happens at the time of the restore; the actual problem is that the dump
file is corrupt and crashes every HN that tries to restore from it.
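
For now the only way I see to get such a CT running again is to get the
leftover dump out of the way before starting, so 'vzctl start' does a
cold start instead of a restore. Untested sketch (it assumes the dump
ends up at $DUMPDIR/Dump.<CTID>, default /vz/dump -- adjust for your
setup and CTID/name):

#!/bin/sh
# Move a leftover dump aside before starting a CT, so vzctl does a
# cold start instead of restoring from a possibly corrupt dump.
CTID=ploop4
DUMPDIR=/vz/dump                     # assumption: default DUMPDIR
DUMP="$DUMPDIR/Dump.$CTID"           # assumption: default dump name

if [ -f "$DUMP" ]; then
    echo "found leftover dump $DUMP, moving it aside" >&2
    mv "$DUMP" "$DUMP.suspect.$(date +%s)"
fi

vzctl start "$CTID"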

This raises the questions:

* How is it possible to end up with a corrupt dump file in the first
  place?

* Shouldn't 'vzctl restore' make sure the dump file it reads from
  is valid and cannot cause a kernel panic? (See the sketch below.)
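
Until vzctl does such a check itself, a wrapper could at least catch
dumps that got damaged after they were written, e.g. in transit or on
the NFS mount. Untested sketch, assuming a checksum is written next to
the dump on the source HN at suspend time and that both HNs see the
dump under the same path on the shared mount:

#!/bin/sh
CTID=ploop4
DUMP=/vz/dump/Dump.$CTID             # assumption: default dump path

# On the source HN, right after the dump has been written:
sha1sum "$DUMP" > "$DUMP.sha1"

# On the destination HN, before restoring:
if sha1sum -c "$DUMP.sha1"; then
    vzctl restore "$CTID"
else
    echo "dump $DUMP fails its checksum, refusing to restore" >&2
    exit 1
fi

Of course this only catches a dump that was damaged after it was
written; if the checkpointing code already dumps broken state, the
checksum will happily match, which brings us back to the first
question.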

Consider the following situation:

We are running an OpenVZ cluster with several HNs where VE_ROOT,
VE_PRIVATE and DUMPDIR are on a shared NFS mount. The CTs are managed by
the corosync / pacemaker HA stack. Now if a planned migration of a CT
leads to a corrupt dump file, the destination HN will crash. The cluster
stack will fence that HN and try to start the CT on a different HN. This
other HN will also immediately crash. This will go on and crash all
remaining HNs until meatware manually removes the dump file.
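
The only automated way I can think of to break that cascade is a guard
in front of the cluster's start action. Untested sketch, assuming
DUMPDIR is on the shared NFS mount (as in our setup) and that the
resource agent can be pointed at a wrapper instead of calling vzctl
start directly:

#!/bin/sh
# Guard for the cluster's CT start action (hypothetical wrapper):
# leave a marker on shared NFS before a restore attempt; if a marker
# is already there, a previous HN most likely died on this dump, so
# quarantine it and cold-start instead.
CTID=ploop4
DUMPDIR=/vz/dump                     # assumption: default DUMPDIR, on shared NFS
DUMP="$DUMPDIR/Dump.$CTID"
MARKER="$DUMP.restore-attempt"

if [ -f "$DUMP" ]; then
    if [ -f "$MARKER" ]; then
        mv "$DUMP" "$DUMP.suspect.$(date +%s)"
        rm -f "$MARKER"
    else
        touch "$MARKER"              # survives our crash, it lives on NFS
    fi
fi

vzctl start "$CTID" && rm -f "$MARKER"

That way at most one HN should go down per corrupt dump instead of all
of them.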

Roman
 


> On Wed, Mar 26, 2014 at 05:35:55PM +0100, Roman Haefeli wrote:
> > Hi all
> > 
> > I happened to be able to crash one hostnode of our testing cluster when
> > restoring a CT.
> > 
> > Hostnodes:
> > * 3 hostnodes running Debian 7 amd64 with OpenVZ kernel
> > * Kernel:  042stab085.20
> > * VE_ROOT / VE_PRIVATE is on an NFS mount shared by nodes
> > 
> > Test-CT:
> > * Debian 7 from self-made template
> > * amd64
> > * ploop
> > * runs mysql server and apache2 web server
> > * runs scripts to cause load on mysql and web server
> > 
> > For testing purposes, I was online-migrating the test-CT between nodes
> > once every 30 seconds. This went fine for a while, but after a few
> > cycles (~20) one of the hostnodes crashed when trying to restore the CT.
> > 
> > This issue is most likely not specific to the kernel version. I got
> > similar crashes with older versions as well, but was too lazy to report
> > them. 
> > 
> > I'm aware that migrating a CT every 30 seconds might be considered
> > extreme, though we experienced similar crashes on production systems at
> > the time of online migration and on those we migrate every few weeks at
> > most. Before using online migration on production again, I'd like to
> > verify that the most extreme situation I can think of is handled
> > gracefully by the kernel.
> > 
> > Here is the part of the syslog I was able to catch at the time of the
> > crash, let me know if further information is needed: 
> > 
> > Mar 26 16:17:05 virtuetest3 kernel: [ 1000.279251]  ploop46524: p1
> > Mar 26 16:17:05 virtuetest3 kernel: [ 1000.289409]  ploop46524: p1
> > Mar 26 16:17:05 virtuetest3 kernel: [ 1000.313031] EXT4-fs (ploop46524p1): mounted filesystem with ordered data mode. Opts: 
> > Mar 26 16:17:05 virtuetest3 kernel: [ 1000.314840] EXT4-fs (ploop46524p1): loaded balloon from 12 (0 blocks)
> > Mar 26 16:17:05 virtuetest3 kernel: [ 1000.383837] lo: Dropping TSO features since no CSUM feature.
> > Mar 26 16:17:05 virtuetest3 kernel: [ 1000.384787] CT: 54: started
> > Mar 26 16:17:05 virtuetest3 kernel: [ 1000.399195] device veth54.0 entered promiscuous mode
> > Mar 26 16:17:05 virtuetest3 kernel: [ 1000.399286] br_206: port 2(veth54.0) entering forwarding state
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.660051] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.660232] IP: [<ffffffff814adcfe>] inet_csk_reqsk_queue_prune+0x29e/0x2c0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.660372] PGD 0 
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.660419] Oops: 0000 [#1] SMP 
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.660498] last sysfs file: /sys/devices/virtual/block/ploop46524/removable
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.660616] CPU 0 
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.660657] Modules linked in: vzethdev vznetdev pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop simfs vzrst nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 vzcpt nf_conntrack vziolimit vzmon xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_multiport xt_limit xt_dscp ipt_REJECT ip_tables nfs fscache vzdquota vzdev vzevent ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse nfsd nfs_acl auth_rpcgss lockd sunrpc ipv6 bridge 8021q garp stp llc snd_pcsp radeon iTCO_wdt iTCO_vendor_support snd_pcm ttm snd_page_alloc drm_kms_helper snd_timer lpc_ich i5000_edac drm ioatdma mfd_core edac_core snd i2c_algo_bit i5k_amb i2c_core soundcore serio_raw dca shpchp ext4 jbd2 mbcache sg sd_mod crc_t10dif ata_generic pata_acpi mptsas mptscsih bnx2 ata_piix mptbase scsi_transport_sas [last unloaded: ploop]
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] 
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] Pid: 0, comm: swapper veid: 0 Not tainted 2.6.32-openvz-042stab085.20-amd64 #1 042stab085_20 IBM IBM eServer BladeCenter HS21 -[7995L3G]-/Server Blade
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] RIP: 0010:[<ffffffff814adcfe>]  [<ffffffff814adcfe>] inet_csk_reqsk_queue_prune+0x29e/0x2c0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] RSP: 0018:ffff880028203d50  EFLAGS: 00010202
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] RAX: 0000000000000000 RBX: 00000001000ab4dc RCX: 0000000000000000
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] RDX: ffff88034c05a500 RSI: ffff880362581c80 RDI: ffff880366f2b080
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] RBP: ffff880028203dc0 R08: ffff88002821c320 R09: 0000000000000000
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880366f2b080
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] R13: 000000000001d4c0 R14: ffff880366f2b3c0 R15: ffff880362581c80
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] CR2: 0000000000000018 CR3: 0000000349579000 CR4: 00000000000007f0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] Process swapper (pid: 0, veid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] Stack:
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  0000840bffff23fa ffff88034c05a500 00000000000000c8 0000000181c0f7a8
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] <d> 0000009d0000002f 00000000000003e8 ffff88034c05a000 000000058146c918
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] <d> ffffc900035d9000 ffff880366f2b080 ffff880366f2b0c8 ffffffff81aaa180
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] Call Trace:
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  <IRQ> 
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff814c2757>] tcp_keepalive_timer+0x187/0x2e0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81089b7c>] run_timer_softirq+0x1bc/0x380
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff814c25d0>] ? tcp_keepalive_timer+0x0/0x2e0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8107f3c3>] __do_softirq+0x103/0x260
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8100c44c>] call_softirq+0x1c/0x30
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81010195>] do_softirq+0x65/0xa0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8107f1ed>] irq_exit+0xcd/0xd0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81539515>] do_IRQ+0x75/0xf0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8100ba93>] ret_from_intr+0x0/0x11
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  <EOI> 
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81016ce7>] ? mwait_idle+0x77/0xd0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81535a9a>] ? atomic_notifier_call_chain+0x1a/0x20
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8100a013>] cpu_idle+0xb3/0x110
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81514db5>] rest_init+0x85/0x90
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81c31f80>] start_kernel+0x412/0x41e
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81c3133a>] x86_64_start_reservations+0x125/0x129
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81c31453>] x86_64_start_kernel+0x115/0x124
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] Code: ff 0f 1f 40 00 4c 89 fa e9 50 fe ff ff 41 f6 47 49 10 74 09 31 c9 e9 6f ff ff ff 66 90 49 8b 47 20 4c 89 fe 48 89 55 98 4c 89 e7 <ff> 50 18 85 c0 48 8b 55 98 0f 84 68 fe ff ff 41 f6 47 49 10 0f 
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] RIP  [<ffffffff814adcfe>] inet_csk_reqsk_queue_prune+0x29e/0x2c0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  RSP <ffff880028203d50>
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] CR2: 0000000000000018
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] Tainting kernel with flag 0x7
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] Pid: 0, comm: swapper veid: 0 Not tainted 2.6.32-openvz-042stab085.20-amd64 #1
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003] Call Trace:
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  <IRQ>  [<ffffffff81075e65>] ? add_taint+0x35/0x70
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff815339b4>] ? oops_end+0x54/0x100
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8104af5b>] ? no_context+0xfb/0x260
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8104b1d5>] ? __bad_area_nosemaphore+0x115/0x1e0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffffa034fff8>] ? br_nf_pre_routing_finish+0x238/0x350 [bridge]
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8104b2b3>] ? bad_area_nosemaphore+0x13/0x20
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8104ba02>] ? __do_page_fault+0x322/0x490
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffffa03505ba>] ? br_nf_pre_routing+0x4aa/0x7e0 [bridge]
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81497ab9>] ? nf_iterate+0x69/0xb0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffffa0349e00>] ? br_handle_frame_finish+0x0/0x320 [bridge]
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81497c76>] ? nf_hook_slow+0x76/0x120
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffffa0349e00>] ? br_handle_frame_finish+0x0/0x320 [bridge]
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff8153597e>] ? do_page_fault+0x3e/0xa0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81532d05>] ? page_fault+0x25/0x30
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff814adcfe>] ? inet_csk_reqsk_queue_prune+0x29e/0x2c0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff814c2757>] ? tcp_keepalive_timer+0x187/0x2e0
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff81089b7c>] ? run_timer_softirq+0x1bc/0x380
> > Mar 26 16:17:07 virtuetest3 kernel: [ 1001.661003]  [<ffffffff814c25d0>] ? tcp_keepalive_tim
> > 
> > 
> > _______________________________________________
> > Users mailing list
> > Users at openvz.org
> > https://lists.openvz.org/mailman/listinfo/users



