Kernel crash. Related to KVM?


Post by alex0801 » 04.10.2021 09:11:28

Hello everyone,

my system has recently started crashing every now and then.

I found this in the kernel log:

Code:

Oct  4 08:00:02 blackbox kernel: [73627.870035] BUG: kernel NULL pointer dereference, address: 0000000000000068
Oct  4 08:00:02 blackbox kernel: [73627.870040] #PF: supervisor read access in kernel mode
Oct  4 08:00:02 blackbox kernel: [73627.870042] #PF: error_code(0x0000) - not-present page
Oct  4 08:00:02 blackbox kernel: [73627.870043] PGD 0 P4D 0 
Oct  4 08:00:02 blackbox kernel: [73627.870045] Oops: 0000 [#1] SMP PTI
Oct  4 08:00:02 blackbox kernel: [73627.870047] CPU: 0 PID: 2281 Comm: CPU 1/KVM Not tainted 5.14.0-1-amd64 #1  Debian 5.14.6-2
Oct  4 08:00:02 blackbox kernel: [73627.870049] Hardware name: BIOSTAR Group H110MHV3/H110MHV3, BIOS 5.12 05/04/2018
Oct  4 08:00:02 blackbox kernel: [73627.870051] RIP: 0010:internal_get_user_pages_fast+0x823/0xe10
Oct  4 08:00:02 blackbox kernel: [73627.870055] Code: 00 00 00 0f 85 fa 05 00 00 48 81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 81 e3 00 f0 ff ff e9 b6 fb ff ff <48> 81 78 68 60 a4 83 90 0f 85
Oct  4 08:00:02 blackbox kernel: [73627.870057] RSP: 0018:ffffb6c1c2aeba20 EFLAGS: 00010046
Oct  4 08:00:02 blackbox kernel: [73627.870059] RAX: 0000000000000000 RBX: ffffef120ce53140 RCX: ffffef120ce53174
Oct  4 08:00:02 blackbox kernel: [73627.870061] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffef120ce53140
Oct  4 08:00:02 blackbox kernel: [73627.870062] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffef120ce53140
Oct  4 08:00:02 blackbox kernel: [73627.870062] R10: ffffb6c1c2aebb97 R11: 000000000000000d R12: ffff9c5708c8c6f0
Oct  4 08:00:02 blackbox kernel: [73627.870063] R13: 0000000000080005 R14: 00007fb7a88df000 R15: 80000003394c5867
Oct  4 08:00:02 blackbox kernel: [73627.870065] FS:  00007fb7e6dbd640(0000) GS:ffff9c57dec00000(0000) knlGS:ffff8f1137f00000
Oct  4 08:00:02 blackbox kernel: [73627.870066] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct  4 08:00:02 blackbox kernel: [73627.870068] CR2: 0000000000000068 CR3: 0000000150ee2002 CR4: 00000000003726f0
Oct  4 08:00:02 blackbox kernel: [73627.870069] Call Trace:
Oct  4 08:00:02 blackbox kernel: [73627.870073]  get_user_pages_fast_only+0x13/0x20
Oct  4 08:00:02 blackbox kernel: [73627.870076]  hva_to_pfn+0xa4/0x410 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870108]  ? mmu_spte_update+0x11/0x190 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870138]  try_async_pf+0xa1/0x250 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870168]  direct_page_fault+0x11d/0xad0 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870196]  ? kvm_mtrr_check_gfn_range_consistency+0x61/0x120 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870223]  kvm_mmu_page_fault+0x7a/0x670 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870251]  ? direct_page_fault+0x1c5/0xad0 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870279]  ? kvm_check_async_pf_completion+0xdf/0x110 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870299]  ? vmx_vmexit+0x1d/0x40 [kvm_intel]
Oct  4 08:00:02 blackbox kernel: [73627.870306]  ? vmx_vmexit+0x11/0x40 [kvm_intel]
Oct  4 08:00:02 blackbox kernel: [73627.870312]  vmx_handle_exit+0x120/0x750 [kvm_intel]
Oct  4 08:00:02 blackbox kernel: [73627.870317]  kvm_arch_vcpu_ioctl_run+0xc48/0x16a0 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870344]  kvm_vcpu_ioctl+0x267/0x650 [kvm]
Oct  4 08:00:02 blackbox kernel: [73627.870363]  __x64_sys_ioctl+0x83/0xb0
Oct  4 08:00:02 blackbox kernel: [73627.870366]  do_syscall_64+0x3b/0xc0
Oct  4 08:00:02 blackbox kernel: [73627.870369]  entry_SYSCALL_64_after_hwframe+0x44/0xae
Oct  4 08:00:02 blackbox kernel: [73627.870372] RIP: 0033:0x7fb7ef84c957
Oct  4 08:00:02 blackbox kernel: [73627.870374] Code: 3c 1c 48 f7 d8 4c 39 e0 77 b9 e8 24 ff ff ff 85 c0 78 be 4c 89 e0 5b 5d 41 5c c3 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48
Oct  4 08:00:02 blackbox kernel: [73627.870376] RSP: 002b:00007fb7e6dbc528 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Oct  4 08:00:02 blackbox kernel: [73627.870378] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007fb7ef84c957
Oct  4 08:00:02 blackbox kernel: [73627.870379] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000013
Oct  4 08:00:02 blackbox kernel: [73627.870380] RBP: 000055df10b5feb0 R08: 000055df0edff5b8 R09: 0000000000000000
Oct  4 08:00:02 blackbox kernel: [73627.870381] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Oct  4 08:00:02 blackbox kernel: [73627.870382] R13: 000055df0f24a020 R14: 0000000000000000 R15: 0000000000000000
Oct  4 08:00:02 blackbox kernel: [73627.870384] Modules linked in: binfmt_misc dm_mod vhost_net vhost vhost_iotlb tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 x
Oct  4 08:00:02 blackbox kernel: [73627.870423]  fuse ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic zstd_compress efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_
Oct  4 08:00:02 blackbox kernel: [73627.870453] CR2: 0000000000000068
Oct  4 08:00:02 blackbox kernel: [73627.870454] ---[ end trace 4ee6356bdf6b03d6 ]---
Oct  4 08:00:02 blackbox kernel: [73629.743896] RIP: 0010:internal_get_user_pages_fast+0x823/0xe10
Oct  4 08:00:02 blackbox kernel: [73629.743907] Code: 00 00 00 0f 85 fa 05 00 00 48 81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 81 e3 00 f0 ff ff e9 b6 fb ff ff <48> 81 78 68 60 a4 83 90 0f 85
Oct  4 08:00:02 blackbox kernel: [73629.743909] RSP: 0018:ffffb6c1c2aeba20 EFLAGS: 00010046
Oct  4 08:00:02 blackbox kernel: [73629.743912] RAX: 0000000000000000 RBX: ffffef120ce53140 RCX: ffffef120ce53174
Oct  4 08:00:02 blackbox kernel: [73629.743914] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffef120ce53140
Oct  4 08:00:02 blackbox kernel: [73629.743915] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffef120ce53140
Oct  4 08:00:02 blackbox kernel: [73629.743917] R10: ffffb6c1c2aebb97 R11: 000000000000000d R12: ffff9c5708c8c6f0
Oct  4 08:00:02 blackbox kernel: [73629.743918] R13: 0000000000080005 R14: 00007fb7a88df000 R15: 80000003394c5867
Oct  4 08:00:02 blackbox kernel: [73629.743920] FS:  00007fb7e6dbd640(0000) GS:ffff9c57dec00000(0000) knlGS:ffff8f1137f00000
Oct  4 08:00:02 blackbox kernel: [73629.743922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct  4 08:00:02 blackbox kernel: [73629.743924] CR2: 0000000000000068 CR3: 0000000150ee2003 CR4: 00000000003726f0
The box runs a virtualized Debian under QEMU/KVM. "kvm" shows up here and there in the stack, so the crash seems to be related to it somehow. I am not using PCIe passthrough or anything like that.
Sometimes the whole host freezes; sometimes killing the QEMU process is enough (if the host still responds).
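
To put the "kvm shows up in the stack" observation on firmer ground, the call trace can be tallied per module. The following is only a rough sketch: `FRAME_RE` and `frames_by_module` are made-up names, and the regular expression assumes the frame layout seen in the log above (`symbol+0xoff/0xsize [module]`). Frames marked with `?` are skipped because the kernel prints them as unverified stack entries.

```python
import re

# Matches a call-trace frame such as
#   "[73627.870076]  hva_to_pfn+0xa4/0x410 [kvm]"
# The layout is assumed from the log in this thread.
FRAME_RE = re.compile(
    r"\]\s+(?P<guess>\?\s+)?(?P<sym>[A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?\s*$"
)


def frames_by_module(log_text):
    """Count reliable call-trace frames per kernel module.

    Frames without a [module] tag are counted as "core kernel".
    """
    counts = {}
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            # First non-frame line (e.g. the user-space RIP) ends the trace.
            in_trace = False
            continue
        if match.group("guess"):
            # "?" frames are unverified; leave them out of the tally.
            continue
        module = match.group("mod") or "core kernel"
        counts[module] = counts.get(module, 0) + 1
    return counts
```

Fed with the log above, a tally like this would show most reliable frames tagged `[kvm]` or `[kvm_intel]`. The faulting function itself, `internal_get_user_pages_fast`, carries no module tag, so the fault is in core memory-management code that was reached from KVM.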

Until now this setup ran flawlessly. Current kernel: 5.14.0-1-amd64 #1 SMP Debian 5.14.6-2 (2021-09-19)


Any ideas?

Regards
Alex


Re: Kernel crash. Related to KVM?

Post by alex0801 » 08.10.2021 08:56:18

I went back to kernel 5.10.0-8-amd64 #1 SMP Debian 5.10.46-4 (2021-08-03); since then it has been running stable again.


Re: Kernel crash. Related to KVM?

Post by alex0801 » 12.10.2021 09:14:36

Back on 5.14, this time 5.14.0-2-amd64, and it hangs again. This time, though, only QEMU (that is, my virtualized Linux) hangs.

Code:

[83295.912864] BUG: kernel NULL pointer dereference, address: 0000000000000068
[83295.912868] #PF: supervisor read access in kernel mode
[83295.912870] #PF: error_code(0x0000) - not-present page
[83295.912871] PGD 0 P4D 0 
[83295.912873] Oops: 0000 [#1] SMP PTI
[83295.912875] CPU: 3 PID: 2689 Comm: CPU 0/KVM Not tainted 5.14.0-2-amd64 #1  Debian 5.14.9-2
[83295.912878] Hardware name: BIOSTAR Group H110MHV3/H110MHV3, BIOS 5.12 05/04/2018
[83295.912879] RIP: 0010:internal_get_user_pages_fast+0x823/0xe10
[83295.912883] Code: 00 00 00 0f 85 fa 05 00 00 48 81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 81 e3 00 f0 ff ff e9 b6 fb ff ff <48> 81 78 68 60 a4 a3 a6 0f 85 3a fd ff ff 44 89 e8 be 01 00 00 00
[83295.912885] RSP: 0018:ffffb93e03007a20 EFLAGS: 00010046
[83295.912887] RAX: 0000000000000000 RBX: fffffaac4c545780 RCX: fffffaac4c5457b4
[83295.912888] RDX: 0000000000000000 RSI: 0000000000000001 RDI: fffffaac4c545780
[83295.912889] RBP: 0000000000000000 R08: 0000000000000000 R09: fffffaac4c545780
[83295.912890] R10: ffffb93e03007b97 R11: 000000000000000d R12: ffff98c3c4a8f628
[83295.912891] R13: 0000000000080005 R14: 00007f6c7cec6000 R15: 800000031515e867
[83295.912892] FS:  00007f6d58895640(0000) GS:ffff98c71ed80000(0000) knlGS:ffff9dee39c00000
[83295.912894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[83295.912895] CR2: 0000000000000068 CR3: 000000019fc9e004 CR4: 00000000003726e0
[83295.912896] Call Trace:
[83295.912899]  get_user_pages_fast_only+0x13/0x20
[83295.912903]  hva_to_pfn+0xa4/0x410 [kvm]
[83295.912933]  try_async_pf+0xa1/0x250 [kvm]
[83295.912965]  direct_page_fault+0x11d/0xad0 [kvm]
[83295.912994]  ? kvm_mtrr_check_gfn_range_consistency+0x61/0x120 [kvm]
[83295.913022]  kvm_mmu_page_fault+0x7a/0x670 [kvm]
[83295.913050]  ? direct_page_fault+0x30c/0xad0 [kvm]
[83295.913077]  ? kvm_check_async_pf_completion+0xdf/0x110 [kvm]
[83295.913098]  ? vmx_vmexit+0x1d/0x40 [kvm_intel]
[83295.913105]  ? vmx_vmexit+0x11/0x40 [kvm_intel]
[83295.913110]  vmx_handle_exit+0x120/0x750 [kvm_intel]
[83295.913116]  kvm_arch_vcpu_ioctl_run+0xc48/0x16a0 [kvm]
[83295.913143]  kvm_vcpu_ioctl+0x267/0x650 [kvm]
[83295.913162]  __x64_sys_ioctl+0x83/0xb0
[83295.913164]  do_syscall_64+0x3b/0xc0
[83295.913167]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[83295.913170] RIP: 0033:0x7f6d5cbf3957
[83295.913172] Code: 3c 1c 48 f7 d8 4c 39 e0 77 b9 e8 24 ff ff ff 85 c0 78 be 4c 89 e0 5b 5d 41 5c c3 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 94 0c 00 f7 d8 64 89 01 48
[83295.913173] RSP: 002b:00007f6d58894528 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[83295.913175] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f6d5cbf3957
[83295.913176] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000012
[83295.913177] RBP: 000055f1cb66db60 R08: 000055f1c9d2f5b8 R09: 00000000ffffffff
[83295.913178] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[83295.913179] R13: 000055f1ca17a0a0 R14: 0000000000000000 R15: 0000000000000000
[83295.913181] Modules linked in: dm_mod vhost_net vhost vhost_iotlb tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink bridge stp llc ipmi_devintf ipmi_msghandler tun rfkill intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp nls_ascii coretemp nls_cp437 kvm_intel vfat snd_hda_codec_hdmi fat xfs kvm irqbypass snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel ledtrig_audio mei_hdcp snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec aesni_intel crypto_simd snd_hda_core snd_hwdep snd_pcm cryptd rapl intel_cstate snd_timer intel_uncore iTCO_wdt intel_pmc_bxt iTCO_vendor_support pcspkr serio_raw efi_pstore snd mei_me ftdi_sio at24 watchdog soundcore usbserial sg mei evdev intel_pmc_core acpi_pad parport_pc nfsd ppdev lp parport auth_rpcgss nfs_acl lockd grace configfs fuse sunrpc ip_tables
[83295.913221]  x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic zstd_compress efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid0 multipath linear raid1 md_mod hid_generic sd_mod usbhid hid i915 i2c_algo_bit ttm drm_kms_helper nvme cec xhci_pci ahci rc_core xhci_hcd libahci nvme_core drm t10_pi usbcore libata crc32_pclmul crc_t10dif crct10dif_generic crc32c_intel psmouse crct10dif_pclmul crct10dif_common r8169 i2c_i801 i2c_smbus scsi_mod realtek mdio_devres libphy usb_common fan video button
[83295.913251] CR2: 0000000000000068
[83295.913252] ---[ end trace 220188609fe8c208 ]---
[83296.018146] RIP: 0010:internal_get_user_pages_fast+0x823/0xe10
[83296.018154] Code: 00 00 00 0f 85 fa 05 00 00 48 81 c4 a8 00 00 00 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 81 e3 00 f0 ff ff e9 b6 fb ff ff <48> 81 78 68 60 a4 a3 a6 0f 85 3a fd ff ff 44 89 e8 be 01 00 00 00
[83296.018156] RSP: 0018:ffffb93e03007a20 EFLAGS: 00010046
[83296.018158] RAX: 0000000000000000 RBX: fffffaac4c545780 RCX: fffffaac4c5457b4
[83296.018160] RDX: 0000000000000000 RSI: 0000000000000001 RDI: fffffaac4c545780
[83296.018161] RBP: 0000000000000000 R08: 0000000000000000 R09: fffffaac4c545780
[83296.018162] R10: ffffb93e03007b97 R11: 000000000000000d R12: ffff98c3c4a8f628
[83296.018163] R13: 0000000000080005 R14: 00007f6c7cec6000 R15: 800000031515e867
[83296.018164] FS:  00007f6d58895640(0000) GS:ffff98c71ed80000(0000) knlGS:ffff9dee39c00000
[83296.018165] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[83296.018166] CR2: 0000000000000068 CR3: 000000019fc9e004 CR4: 00000000003726e0
[83459.732962] virbr0: port 1(vnet0) entered disabled state
[83459.734923] device vnet0 left promiscuous mode
[83459.734928] virbr0: port 1(vnet0) entered disabled state
Killing the process and restarting it gets everything running again.

That is no permanent solution, though. Does anyone have another idea besides a kernel downgrade?
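
Both oopses in this thread appear to hit the same spot. A small sketch like the one below could confirm that by comparing a crude "signature" of each log. The names `RIP_RE`, `CR2_RE` and `oops_signature` are hypothetical, and the signature used here (kernel-mode RIP symbol plus the CR2 fault address) is just one plausible choice, not an established convention.

```python
import re

# Kernel-mode RIP line, e.g. "RIP: 0010:internal_get_user_pages_fast+0x823/0xe10"
RIP_RE = re.compile(r"RIP: 0010:(\S+)")
# Fault address register, e.g. "CR2: 0000000000000068"
CR2_RE = re.compile(r"CR2: ([0-9a-f]{16})")


def oops_signature(log_text):
    """Return (faulting symbol+offset, fault address) or None if not found."""
    rip = RIP_RE.search(log_text)
    cr2 = CR2_RE.search(log_text)
    if rip is None or cr2 is None:
        return None
    return rip.group(1), int(cr2.group(1), 16)


def same_fault(log_a, log_b):
    """True if both logs yield the same, non-empty signature."""
    sig_a = oops_signature(log_a)
    return sig_a is not None and sig_a == oops_signature(log_b)
```

For the two logs posted here, the RIP line (`internal_get_user_pages_fast+0x823/0xe10`) and the CR2 value (`0000000000000068`) are identical on 5.14.6-2 and 5.14.9-2. That suggests the same fault persists across both package versions, rather than a one-off glitch.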
