VMware VM no longer boots after changing to UEFI and specifying GRUB_TERMINAL="serial console"
This document (000021680) is provided subject to the disclaimer at the end of this document.
Environment
VMware ESXi 7.x
SUSE Linux Enterprise Server 15 SP6
Situation
A VMware VM no longer boots after being upgraded to SLE 15 SP6 and switched from BIOS to UEFI (Secure Boot disabled). Additionally, GRUB_TERMINAL="gfxmode" was changed to GRUB_TERMINAL="serial console".
Using this configuration, the system no longer boots and powers itself off with the following error:
vcpu-0 - [msg.efi.exception] The firmware encountered an unexpected exception. The virtual machine cannot boot.
The error is triggered by a page fault similar to:
localhost login: [ 115.792006][ T1] reboot: Power down
!!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000000 I:0 R:0 U:0 W:0 P:0 PK:0 S:0
RIP - 000000000E0675BD, CS - 0000000000000018, RFLAGS - 0000000000210202
RAX - 0000000000000004, RCX - 0000000000000000, RDX - 0000000000000004
RBX - 000000A054415253, RSP - 000000000FFBC3F0, RBP - 000000000FFBC3F0
RSI - 000000007FDF6D7C, RDI - 000000A054415253
R8 - 0000000000000053, R9 - 000000000FFBC501, R10 - 0000000000000001
R11 - 0000000000000000, R12 - 0000000000000007, R13 - 000000007FDF6D7C
R14 - 000000000FF660C0, R15 - 000000000FF66000
DS - 0000000000000008, ES - 0000000000000008, FS - 0000000000000008
GS - 0000000000000008, SS - 0000000000000008
CR0 - 0000000080010033, CR2 - 000000A054415253, CR3 - 000000000FF76000
CR4 - 0000000000000668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 00000000FFFFFC10 000000000000002F, LDTR - 0000000000000000
IDTR - 000000000FEC44E0 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 000000000FFBC050
!!!! Can't find image information. !!!!
Resolution
For details on setting up a serial port in VMware, see https://www.suse.com/support/kb/doc/?id=000018805
Add or modify the following parameter in /etc/default/grub:
GRUB_SERIAL_COMMAND="serial --unit=0"
The system then boots as expected.
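The change above can be sketched as a small shell helper. This is only an illustration, not part of any SUSE tooling: the function name set_serial_command is hypothetical, and it assumes GNU sed (for the -i option) and that the file to edit is passed as an argument.

```shell
#!/bin/sh
# Sketch: make sure a GRUB defaults file pins the serial port explicitly.
# set_serial_command is a hypothetical helper name used for illustration.
set_serial_command() {
    file="$1"
    value='GRUB_SERIAL_COMMAND="serial --unit=0"'
    if grep -q '^GRUB_SERIAL_COMMAND=' "$file"; then
        # An assignment already exists: replace it in place (GNU sed).
        sed -i "s|^GRUB_SERIAL_COMMAND=.*|$value|" "$file"
    else
        # No assignment yet: append one at the end of the file.
        printf '%s\n' "$value" >> "$file"
    fi
}

# Typical use on the affected system, as root:
#   set_serial_command /etc/default/grub
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

Changes to /etc/default/grub only take effect once the GRUB configuration has been regenerated, which is why the usage comment ends with grub2-mkconfig.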
Cause
The error was caused by a crash. Analysis revealed that a page fault occurs when the command "serial" is executed without any parameters. Before GRUB 2.12, a bare "serial" was equivalent to "com0". The 2.12 release incorporated a new feature [1] that changes this behavior: a bare "serial" is now equivalent to "serial auto", meaning that the ACPI SPCR [2] table, if present, is used to discover the serial port.
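Since the new behavior depends on whether an ACPI SPCR table is present, it can help to check for one from a running Linux system (for example, the same VM booted in a working configuration). The following is a minimal sketch: it assumes the kernel exposes ACPI tables by signature under /sys/firmware/acpi/tables, and the helper name has_spcr is hypothetical. The optional argument only exists so the check can be pointed at another directory.

```shell
#!/bin/sh
# Sketch: report whether an ACPI SPCR table is exposed by the kernel.
# has_spcr is a hypothetical helper name used for illustration.
has_spcr() {
    # Default location where Linux lists ACPI tables by signature.
    tables_dir="${1:-/sys/firmware/acpi/tables}"
    [ -e "$tables_dir/SPCR" ]
}

# Typical use:
#   if has_spcr; then echo "SPCR table present"; else echo "no SPCR table"; fi
```

The result only indicates whether "serial auto" has a table to consult. It does not explain the page fault itself.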
Why the page fault occurs remains unknown, and it may be firmware specific. The same serial settings appear to work with ESXi in legacy boot mode.
To work around the issue, always specify the port explicitly so that the behavior stays the same regardless of the GRUB version. To retain backward compatibility with the old "serial" behavior, GRUB_SERIAL_COMMAND="serial --unit=0" is suggested [3].
Additional Information
[1] https://git.savannah.gnu.org/cgit/grub.git/commit/?id=7b192ec4cd
[3] http://10.67.129.96/tests/2999#step/add_serial_console/24
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000021680
- Creation Date: 28-Jan-2025
- Modified Date: 31-Jan-2025
- SUSE Linux Enterprise Server
- SUSE Linux Enterprise Server for SAP Applications
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com