[Devel] double faults in Virtuozzo KVM
Roman Kagan
rkagan at virtuozzo.com
Fri Sep 29 12:39:25 MSK 2017
On Fri, Sep 29, 2017 at 11:25:20AM +0300, Denis Kirjanov wrote:
> >> > > _Some_ of them related to the fact that during the faults RSP points
> >> > > to userspace and it leads to double-fault scenario.
> >> >
> >> > The postmortem you quote doesn't support that.
> >>
> >>
> >> I'll post a relevant trace
>
> Here it is:
>
> [32065.459255] double fault: 0000 [#1] SMP
> [32065.459975] Modules linked in: dm_mod hcpdriver(POE) kmodlve(OE)
> vzdev ppdev pcspkr sg i2c_piix4 parport_pc parport ip_tables ext4
> mbcache jbd2 sd_mod crc_t10dif crct10dif_generic crct10dif_common
> virtio_console virtio_scsi virtio_net sr_mod cdrom ata_generic
> pata_acpi bochs_drm drm_kms_helper ttm drm serio_raw virtio_pci
> ata_piix virtio_ring virtio i2c_core libata floppy
> [32065.460041] CPU: 0 PID: 22951 Comm: cdp-2-6 ve: 0 Tainted: P
> OE ------------ 3.10.0-714.10.2.lve1.4.61.el7.x86_64 #1 29.2
> [32065.460041] Hardware name: Virtuozzo KVM, BIOS 1.9.1-5.3.2.vz7.6 04/01/2014
> [32065.460041] task: ffff8801e9ab8ff0 ti: ffff8800ab598000 task.ti:
> ffff8800ab598000
> [32065.460041] RIP: 0010:[<ffffffff816a1bdd>] [<ffffffff816a1bdd>]
> async_page_fault+0xd/0x30
> [32065.460041] RSP: 002b:00007f1a1290afe8 EFLAGS: 00010016
> [32065.460041] RAX: 00000000816a192c RBX: 0000000000000001 RCX: ffffffff816a192c
> [32065.460041] RDX: 0000000000000008 RSI: 0000000000000000 RDI: 00007f1a1290b0a8
> [32065.460041] RBP: 00007f1a1290b098 R08: 0000000000000001 R09: 0000000000000000
> [32065.460041] R10: 0000000000000000 R11: 0000000000000000 R12: 00007f1a12919960
> [32065.460041] R13: 0000000000000028 R14: 0000000000000000 R15: 00000000011f3f20
> [32065.460041] FS: 00007f1a1291e700(0000) GS:ffff88023fc00000(0000)
> knlGS:0000000000000000
> [32065.460041] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [32065.460041] CR2: 00007f1a1290afd8 CR3: 0000000036794000 CR4: 00000000000007f0
> [32065.460041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [32065.460041] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [32065.460041] Stack:
> [32065.460041] BUG: unable to handle kernel paging request at 00007f1a1290afe8
> [32065.460041] IP: [<ffffffff8102d6c9>] show_stack_log_lvl+0x109/0x180
> [32065.460041] PGD 36799067 PUD 3679a067 PMD 220382067 PTE 0
> [32065.460041] Oops: 0000 [#2] SMP
> [32065.460041] Modules linked in: dm_mod hcpdriver(POE) kmodlve(OE)
> vzdev ppdev pcspkr sg i2c_piix4 parport_pc parport ip_tables ext4
> mbcache jbd2 sd_mod crc_t10dif crct10dif_generic crct10dif_common
> virtio_console virtio_scsi virtio_net sr_mod cdrom ata_generic
> pata_acpi bochs_drm drm_kms_helper ttm drm serio_raw virtio_pci
> ata_piix virtio_ring virtio i2c_core libata floppy
> [32065.460041] CPU: 0 PID: 22951 Comm: cdp-2-6 ve: 0 Tainted: P
> OE ------------ 3.10.0-714.10.2.lve1.4.61.el7.x86_64 #1 29.2
> [32065.460041] Hardware name: Virtuozzo KVM, BIOS 1.9.1-5.3.2.vz7.6 04/01/2014
> [32065.460041] task: ffff8801e9ab8ff0 ti: ffff8800ab598000 task.ti:
> ffff8800ab598000
> [32065.460041] RIP: 0010:[<ffffffff8102d6c9>] [<ffffffff8102d6c9>]
> show_stack_log_lvl+0x109/0x180
> [32065.460041] RSP: 002b:ffff88023fc04e18 EFLAGS: 00010046
> [32065.460041] RAX: 00007f1a1290aff0 RBX: 00007f1a1290afe8 RCX: 0000000000000000
> [32065.460041] RDX: ffff88023fc03fc0 RSI: ffff88023fc04f58 RDI: 0000000000000000
> [32065.460041] RBP: ffff88023fc04e68 R08: ffff88023fbfffc0 R09: ffff8800369f5900
> [32065.460041] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88023fc04f58
> [32065.460041] R13: 0000000000000000 R14: ffffffff818d31b8 R15: 0000000000000000
> [32065.460041] FS: 00007f1a1291e700(0000) GS:ffff88023fc00000(0000)
> knlGS:0000000000000000
> [32065.460041] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [32065.460041] CR2: 00007f1a1290afe8 CR3: 0000000036794000 CR4: 00000000000007f0
> [32065.460041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [32065.460041] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [32065.460041] Stack:
> [32065.460041] ffff880200000008 ffff88023fc04e78 ffff88023fc04e38
> 000000007ddc4235
> [32065.460041] 00007f1a1290afe8 ffff88023fc04f58 00007f1a1290afe8
> ffff88023fc04f58
> [32065.460041] 000000000000002b 00000000011f3f20 ffff88023fc04ec8
> ffffffff8102d7f6
> [32065.460041] Call Trace:
> [32065.460041] <#DF>
> [32065.460041]
> [32065.460041] [<ffffffff8102d7f6>] show_regs+0xb6/0x240
> [32065.460041] [<ffffffff816a2b0f>] __die+0x9f/0xf0
> [32065.460041] [<ffffffff8102e878>] die+0x38/0x70
> [32065.460041] [<ffffffff8102b5f2>] do_double_fault+0x72/0x80
> [32065.460041] [<ffffffff816aba88>] double_fault+0x28/0x30
> [32065.460041] [<ffffffff816a192c>] ? restore_args+0x30/0x30
> [32065.460041] [<ffffffff816a1bdd>] ? async_page_fault+0xd/0x30
> [32065.460041] <<EOE>>
> [32065.460041] Code:
> [32065.460041] 4d b8 4c 89 45 c0 48 89 55 c8 48 8b 5b f8 e8 37 6b 66
> 00 48 8b 55 c8 4c 8b 45 c0 8b 4d b8 85 c9 74 05 f6 c1 03 74 4c 48 8d
> 43 08 <48> 8b 33 48 c7 c7 b0 31 8d 81 89 4d b4 4c 89 45 b8 48 89 45 c8
> [32065.460041] RIP [<ffffffff8102d6c9>] show_stack_log_lvl+0x109/0x180
> [32065.460041] RSP <ffff88023fc04e18>
> [32065.460041] CR2: 00007f1a1290afe8
I don't see anything here that suggests that the hypervisor is at fault.
Back to your question:
> >> > > Is it known problem?
No, it is not; we've never seen it in our testing with various
workloads in different distro kernels.
Can you please try to reproduce it with a kernel without your
customisations, e.g. the CentOS/RH one which yours is based upon, and
post the reproducer?
> >> > I'd guess the problem is with your kernel. Doesn't it reproduce on
> >> > bare
> >> > metal?
> >
> > I still hold on this. What guest kernel are you using? What are your
> > reasons to blame the hypervisor and not the kernel?
>
> Nope, we dont have reports from bare-metal hosts.
Perhaps you have test data from other hypervisors, e.g. other Virtuozzo
versions, RHEV, VirtualBox, etc.?
Thanks,
Roman.