Upgrading SUSE Linux Enterprise in the Public Cloud
A common question customers ask is how to upgrade SUSE Linux Enterprise instances in the public cloud properly. There are two different scenarios to take into consideration. The first scenario is when you want to perform a service pack upgrade of a SLES or SLES for SAP instance. For example, you want to upgrade from SLES 15 SP1 to SLES 15 SP2. The second scenario is when you want to perform a major release upgrade of a SLES or SLES for SAP instance. For example, you want to upgrade from SLES 12 SP4 to SLES 15 SP2. This post will detail how to accomplish these scenarios and, more importantly, discuss problems you may encounter or need to be aware of and some tips on validating your environment once the upgrade is complete.
Note that this post assumes SUSE software repositories are available to your instance, whether it was launched from a PAYG or a BYOS image. It also does not discuss other important procedures that may be advised by your organization, for example, volume snapshots or instance backups before upgrading. For more information on preparing your instance or more details on the procedures mentioned in this post, consult the SLE deployment guide [1]. If you have any issues while working through any of the scenarios documented here, be sure to first consult the bulleted list at the bottom of this post discussing possible issues and caveats.
Before doing anything, you will want to familiarize yourself with the supported upgrade paths. Consult the deployment guide for the SLE version you need to upgrade to [1].
I will refer to each phase of the upgrade process as either the PREPARE phase, where the instance is readied to be upgraded, or the MIGRATION phase, where the instance is actually migrated.
Let’s explore the first scenario I mentioned above, which is performing a service pack upgrade. This could be upgrading from SLE 12 SP3 to SLE 12 SP4 or upgrading from SLE 15 SP1 to SLE 15 SP2. This is not a major release upgrade like SLE 12 to SLE 15. I’ll discuss that later. Performing a service pack upgrade is a fairly straightforward process. The first thing you will want to do is PREPARE the instance, ensuring the instance is fully up-to-date with the latest updates. To do this, run:
zypper patch
Make sure to read the output while running this command. There are cases where you may need to run this command more than once if a package manager restart is required. This will be clearly displayed as “Note: Package manager restart required. (Run this command once again after the update stack got updated)”.
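The rerun can also be scripted. The following is a minimal sketch, not an official SUSE tool: it relies on zypper's documented exit code 103 (ZYPPER_EXIT_INF_RESTART_NEEDED), and the function name patch_until_settled and the limit of five runs are assumptions made for illustration.

```shell
#!/bin/sh
# patch_until_settled CMD [ARGS...]
# Hypothetical helper: runs CMD repeatedly while it exits with 103, the
# zypper exit code meaning the package manager itself was updated and the
# command should be run again. Any other exit code is passed through.
patch_until_settled() {
    max_runs=5   # assumed safety limit so the loop cannot run forever
    run=1
    while [ "$run" -le "$max_runs" ]; do
        if "$@"; then
            rc=0
        else
            rc=$?
        fi
        if [ "$rc" -ne 103 ]; then
            return "$rc"
        fi
        run=$((run + 1))
    done
    return 103
}

# Example on a real instance (run as root):
#   patch_until_settled zypper --non-interactive patch
```

Other non-zero exit codes, such as 102 (reboot needed), are returned unchanged so you can still read and react to them.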
Once this is complete, the instance is ready for the MIGRATION phase. To initiate the migration of the instance, run:
zypper migration
Follow the prompts to select your migration target. Once the upgrade completes, reboot the instance. You can continue to run these two commands if you need to upgrade the instance further. For example, if the instance was just upgraded from SLES 12 SP2 to SLES 12 SP3, you can rerun these two commands to upgrade the instance from SLES 12 SP3 to SLES 12 SP4. It’s that simple.
Let’s move on to the second scenario, which is performing a major release upgrade. These types of upgrades are unique in that they can only occur offline. This means that running zypper while the instance is online is not an option. Fortunately, our public cloud engineering team created a robust system called the SUSE Distribution Migration System [2]. This system is available to all Azure, GCP, and AWS public cloud customers with PAYG or BYOS images. Before getting into the upgrade process, I first want to give an overview of how this system works.
During a major release upgrade, the instance needs to boot from the installation media of the distribution it is being upgraded to, in this case SLE 15. Unlike with on-premise servers or VMs, it’s impossible to attach an ISO or USB device physically or virtually through a console. There needs to be another way. This is the problem the SUSE Distribution Migration System solves. Before the migration is started on the instance, the instance bootloader configuration is modified, and an entry is created to boot into a migration image during the next reboot. From there, the distribution migration begins.
Before starting this process, though, it’s important first to PREPARE the instance, which consists of multiple steps. The first step is to use the first scenario process above to upgrade the instance to either SLE 12 SP4 or SLE 12 SP5. If the instance was at one of these levels already, then make sure it’s fully patched. As a reminder, to do this, run:
zypper patch
Next, the SUSE Distribution Migration System packages will need to be installed by running:
zypper in suse-migration-sle15-activation
This will create the bootloader configuration previously mentioned and also add the ISO image to the instance. This image is what the bootloader entry will load during the next boot. One last optional step is to perform customization of the upgrade process. By default, instances will be upgraded to SLE 15 SP1. But if the intention is to upgrade to SLE 15, then customization is required [3]. To change the migration product, create a file named /etc/sle-migration-service.yml and add the desired migration product. For example, to upgrade to SLES for SAP 15 instead of SLES for SAP 15 SP1, the /etc/sle-migration-service.yml file should contain:
SLES_SAP/15/x86_64
After completing any customization, the next phase, MIGRATION, can begin. To kick this process off, run the following:
run_migration
This will initiate a reboot into the upgrade live image and begin the automated distribution migration process. Now sit back and wait for the process to complete. If you want to monitor the progress, you can use the serial console, if supported, or console screenshot. You can also perform the following to get a play by play:
sudo ssh migration@IP_OF_INSTANCE tail -f /system-root/var/log/distro_migration.log
Once the process is complete, the instance will automatically reboot. If all went as planned, the instance will be upgraded, and the OS version can be confirmed with:
cat /etc/os-release
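If you want to check the version from a script rather than by eye, the os-release file can be sourced directly, since it consists of shell-compatible variable assignments. This is a small sketch; the function name print_os_version is made up for illustration.

```shell
#!/bin/sh
# print_os_version [FILE]
# Hypothetical helper: prints "NAME VERSION_ID" from an os-release style
# file (default /etc/os-release). The file is sourced in a subshell so its
# variables do not leak into the calling shell.
print_os_version() {
    file=${1:-/etc/os-release}
    ( . "$file" && printf '%s %s\n' "$NAME" "$VERSION_ID" )
}

# Example on a real instance:
#   print_os_version
```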
Be aware that the distribution migration will always upgrade the instance to the latest available packages. There is no way to manipulate this.
If problems were encountered which prevented the upgrade from completing, the instance should revert to its original state. This is not guaranteed, though, and rollback depends on how far the migration progressed before failing. The log file /var/log/distro_migration.log should reveal this or any other problem. For this reason, it’s advised to always have a restore plan before performing an upgrade. If further help is needed for PAYG images, reach out to your cloud service provider. For BYOS images, reach out to SUSE.
All customer environments are unique, and as such, there may be issues encountered as a result. Performing a major release upgrade is exactly that, major, and some aspects of the environment aren’t in the exact state they were in before the upgrade. It’s important to perform a post-check of all applications and configurations as a result. I’d recommend performing a check for orphaned packages, which are packages that don’t belong to an active repository. The following command will give you a list of orphaned packages so you can decide if these should be uninstalled:
zypper packages --orphaned
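To feed that list into further tooling, the package names can be pulled out of the table zypper prints. The sketch below assumes the usual pipe-separated layout with the package name in the third column (S | Repository | Name | Version | Arch); verify this against the output on your own instance. The function name orphan_names is hypothetical.

```shell
#!/bin/sh
# orphan_names
# Hypothetical helper: reads the table printed by
# "zypper packages --orphaned" on stdin and prints one package name per
# line. Assumes the name is the third pipe-separated column; the header
# row and any lines that are not table rows are skipped.
orphan_names() {
    awk -F'|' 'NF >= 5 && $3 !~ /^ *Name *$/ {
        gsub(/^ +| +$/, "", $3)   # trim padding around the name
        print $3
    }'
}

# Example on a real instance:
#   zypper packages --orphaned | orphan_names
```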
Also, check for *.rpmnew or *.rpmsave files in /etc and examine their contents to determine if these need to be merged with the current configuration. You can reference the Upgrade Guide for more information [1].
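A quick way to locate those files is with find. The following is a minimal sketch; the function name find_rpm_leftovers is an assumption for illustration.

```shell
#!/bin/sh
# find_rpm_leftovers [DIR]
# Hypothetical helper: lists *.rpmnew and *.rpmsave files below DIR
# (default /etc), sorted, so each one can be compared with the active
# configuration file it belongs to.
find_rpm_leftovers() {
    find "${1:-/etc}" -type f \( -name '*.rpmnew' -o -name '*.rpmsave' \) 2>/dev/null | sort
}

# Example on a real instance (run as root to see all of /etc):
#   find_rpm_leftovers
#   diff -u /etc/motd /etc/motd.rpmnew
```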
I want to share a collection of issues and side effects that my colleagues and I have seen. I recommend proactively checking the instance or environment for them before starting the process or after the process completes:
- A common cause of failure is unsupported or no longer working repositories. To test this before starting the distribution migration process, run:
zypper refresh
This should refresh the repositories and complete without needing manual intervention. If the refresh fails for a repository, then the repository should either be removed or rectified before continuing.
  - This has been seen in GCP, where custom ‘google-cloud-monitoring’ repositories were missing GPG keys. As a result, the migration failed since user intervention was needed to accept a prompt to continue with an unsigned key.
- In GCP, if instances are using Cloud NAT and the minimum ports per VM instance haven’t been increased from 64, the migration will always fail. Reference the following TID to make sure Cloud NAT is properly configured to handle the upgrade [4].
- There can be installed packages that cause conflicts. On SLES for SAP instances, the following two packages should never be present:
sle-ha-release sle-ha-release-POOL
If the instance is SLES for SAP, remove these packages before starting the distribution migration:
zypper remove sle-ha-release sle-ha-release-POOL
- Configuration files marked as %config in spec files will have a .rpmnew version but won’t be replaced. It is the customer’s responsibility to adapt those, for example, a custom /etc/motd file in public cloud images.
- Repositories without an upgrade path stay in the migrated system and are outdated. It is the customer’s responsibility to update those to point to the new distribution if required at all. Example: static Nvidia repositories in public cloud images
- Instance metadata in the cloud console will still point to the old system. If you run a SLE 12 instance in the cloud and migrate it, only the system itself knows it got migrated. But the cloud console will not be able to show a change of the operating system. Example: See AMI ID in EC2 console of a SLE 12 PAYG image after migration to SLE 15. There is nothing that can be done to change this behavior.
- The migration of instances that use a root filesystem or setup which is not directly reachable from the GRUB bootloader is not supported. The loopback boot support used with the migration must load the image from the root filesystem. If this is not possible for one of the following reasons, the migration system will not boot:
  - unsupported filesystem from the GRUB space
  - encrypted root filesystem
- There is no support for upgrading intermediate distribution states like Beta or RC.
- A SLE 12 SP1 to SLE 15 upgrade is out of scope. It also breaks our advertised “skip a service pack” capabilities. A SLE 12 SP1 to SLE 15 upgrade would skip three service packs.
- When installing suse-migration-sle15-activation, if zypper reports that the package was not found, it’s likely the instance does not have the sle-module-public-cloud repositories. This may be experienced with BYOS more so than PAYG images. To add the repositories, activate the module and then try the package install again:
SUSEConnect -p sle-module-public-cloud/12/x86_64
- DNS resolution issues may occur if the following are both true:
  - running a BYOS image where /etc/resolv.conf has not been modified manually and is only managed via DHCP
  - rebooting to kick off the distribution migration system instead of using run_migration
If the instance and process meet both criteria, before starting the migration, manually populate /etc/resolv.conf with a nameserver entry such as:
nameserver 1.1.1.1
or kick off the migration by running run_migration instead of rebooting.
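Two of the checks from the list above are easy to script before starting a migration. The following is a rough sketch under stated assumptions: the function names has_nameserver and conflicting_packages are made up for illustration, and the package check only matters on SLES for SAP instances.

```shell
#!/bin/sh
# has_nameserver [FILE]
# Hypothetical helper: succeeds if FILE (default /etc/resolv.conf)
# contains at least one nameserver entry.
has_nameserver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "${1:-/etc/resolv.conf}"
}

# conflicting_packages
# Hypothetical helper: reads installed package names on stdin, one per
# line, and prints those that should be removed on SLES for SAP before
# the distribution migration.
conflicting_packages() {
    grep -Fx -e 'sle-ha-release' -e 'sle-ha-release-POOL'
}

# Example on a real instance:
#   has_nameserver || echo "no nameserver entry in /etc/resolv.conf"
#   rpm -qa --qf '%{NAME}\n' | conflicting_packages
```

Both functions only report; removing packages or editing /etc/resolv.conf is left as a deliberate manual step.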
In closing, realize that you also have the option of starting fresh instead of upgrading. Starting fresh means launching a new instance and then migrating data, applications, and configuration to the new instance from the old. You’ll hear differing opinions on which is preferred depending on who you ask. Ultimately, the decision is yours.
[1] https://documentation.suse.com/sles
[4] https://www.suse.com/support/kb/doc/?id=000019662