Definitive solution to libvirt guest naming

The answer is libvirt NSS. This is Fedora 26:

yum install libvirt-nss

And enable the NSS module with two “libvirt” keywords:

# egrep ^host /etc/nsswitch.conf
hosts: files libvirt libvirt_guest dns myhostname

DNS resolution just works for all my libvirt guests now. The NSS module resolves names from the dnsmasq DHCP records (the hostname entry). If a guest does not advertise a hostname, the VM name is used instead. For FQDNs you need to rename your VMs to include the domain too, e.g. vm1.home.lan.
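A quick way to confirm that both modules are really enabled is to inspect the hosts line itself. The following is only a minimal sketch: the function name nss_libvirt_enabled and its optional file argument are my own illustration, not part of libvirt.

```shell
#!/bin/sh
# Check that the "hosts:" line of an nsswitch.conf-style file lists both
# libvirt NSS modules. The optional file argument is an assumption made
# for illustration; it defaults to /etc/nsswitch.conf.
nss_libvirt_enabled() {
    conf="${1:-/etc/nsswitch.conf}"
    hosts_line=$(grep -E '^hosts:' "$conf") || return 1
    for module in libvirt libvirt_guest; do
        # Compare whole words so that "libvirt_guest" alone does not
        # satisfy the check for "libvirt".
        printf '%s\n' "$hosts_line" | tr -s ' \t' '\n' | grep -qx "$module" || return 1
    done
    return 0
}
```

If the check passes, something like `getent hosts vm1.home.lan` should then print the guest address, assuming a running guest of that name.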

For more details, visit the documentation.

Hurray! No more fiddling with /etc/hosts, no more dnsmasq split setups or hacks via virsh. This is an elegant solution.

Via Kamil Páral’s blog.

10 October 2017 | linux | fedora

Ryzen and Linux is a disaster (2017)

If you are reading this blog post in 2018 or later, chances are that AMD has fixed all Linux issues. But that was not the case months after launch in 2017. Also, I work at Red Hat, these lines are solely my own opinion blah blah. Let’s start, shall we?

The Ryzen 7 series was introduced in early March 2017. I bought the parts for my Ryzen workstation at the beginning of June 2017, thinking that a few months would be enough for AMD to settle down and release the necessary BIOS updates and CPU microcode upgrades. I was wrong. Terribly wrong.

I was excited when doing the build, mostly because of the new case and power supply I bought: a Fractal Design Define R5 and a Seasonic M12 EVO Bronze 520W. This is a premium case and I enjoyed every bit of doing the build. Plenty of space for hard drives, many cool ideas in the case design (screw-less mounting), huge and silent fans, and the modularity of the case. With a big enough Seagate SATA drive, I was planning to set up bcache and never run out of space for my home directory.

My primary storage for the OS was an Intel 600p M.2 256GB NVMe SSD, but shortly after installing Fedora 25, when I performed the initial grub2 package upgrade, UEFI was not able to boot. It was a weird error and I spent a week figuring it out until I found a Red Hat Bugzilla entry describing corruption with non-Windows filesystems (specifically XFS).

In short, Intel did not care enough to test this premium (expensive) drive under Linux on XFS. Since I wanted to start working on my workstation, I replaced the drive with a Samsung NVMe and double-checked before the purchase that it works just fine with Linux. The bug has not been resolved yet (October 2017); some users still report problems with Fedora or CentOS.

Anyway, I needed a graphics card and I had a spare MSI GTX 750 Ti, which I used. With the Nouveau driver the PC was freezing and showing artifacts, a terrible experience. After years and years with Intel CPUs and IGPs, I had almost forgotten how terrible this is. Well, I wanted to buy an AMD Radeon RX 5xx anyway, but I couldn’t due to Bitcoin (cards were sold out). I sold the NVidia card and bought a used, very old entry-level Radeon card, which worked like a charm. Later on I bought an RX 540 by ASUS. Radeon with open-source drivers is like day and night compared to NVidia. It used to be the other way around.

Now, the Ryzen thing. Get ready.

My PC was freezing, restarting and coredumping when I was working with VMs running Red Hat Satellite, the product I work on. I tried the Ryzen GCC compiler test and my CPU was indeed affected. My CPU was from a week 22 batch, and people were saying that units produced before week 25 are affected. Since this was a hardware issue, I contacted AMD support and the RMA process started. I did not bother calling Alza (the Czech “Amazon”), where I purchased my CPU, because I would not expect them to be able to figure out the issue.

I was resetting the BIOS, taking pictures of the BIOS screens of my ASUS B350 mobo with certified Kingston DDR4 2400MHz memory (32 GB), and trying various settings. After one week with AMD support, they sent DHL Express for the CPU and I was without it for another two weeks until they sent a replacement (week 30, UA 1730SUS). And it worked, memtest passed twice overnight. Problem solved?

Nope, my PC did random restarts or hard freezes when it was idle. What a disaster - you put Ryzen 7 under decent load, problem. You leave it idling, another problem! The solution? I googled that turning off C-State support in the BIOS helps, which of course increased power consumption. Well, I can live with this for now, but I want this to be fixed with a BIOS/microcode update.

Will my Ryzen PC freeze again? I don’t know. But one thing is clear - AMD did not care enough about testing their hardware under Linux. This is rather surprising; I remember the Opteron CPU launch, which was pretty good compared to this.

Anyway, I am a big fan of AMD, my very first x86 chip was made by AMD and I am glad there is, hopefully, some competition for Intel. Things will settle down, I just hope AMD will work hard with motherboard manufacturers to solve these issues for Linux early adopters.

And here is a bonus:

Since there was plenty of space in my case, I took my old SATA SSD (Samsung 840) and installed Windows 10 on it to be able to perform BIOS and SSD firmware upgrades. Windows was randomly erroring out during boot about a corrupted “winload.efi”. I spent endless nights on this, thinking that Windows 10 does not like my Fedora on the same ESP partition. I went through countless tutorials, restoring Windows boot files, checking the ESP. Since I partitioned everything as GPT, I could not switch from EFI to legacy BIOS boot anymore. Then I found it - I had a loose SATA cable to my Windows SSD. The Samsung drive did not have holes, so the SATA plug did not “click” in. It is a common issue with SATA cables; I swapped it for a different one and taped it just to be sure.

06 October 2017 | linux | fedora | ryzen

EFI with libvirt in RHEL7

RHEL 7 does not ship with EFI firmware by default, at least not version 7.4, which I just tested. There is an easy remedy though if you want to try EFI in libvirt, which does not make much sense for anything other than development purposes.

We will follow the steps from the Fedora Wiki, adapted to the RHEL context. Download the firmware repository file, which provides the latest builds of EFI QEMU firmware and NVRAM images:

# wget -O /etc/yum.repos.d/firmware.repo

Install the package:

# yum -y install edk2.git-ovmf-x64

There are several flavors available and honestly - I just randomly picked one of these:

# rpm -ql edk2.git-ovmf-x64

There are also builds for the AArch64 platform, but I will focus on x86_64 in this article.

Here comes the trick: the libvirt QEMU driver in RHEL 7 is not configured to search this rather unusual EDK2 path, so let’s add it to its configuration:

# cat >>/etc/libvirt/qemu.conf <<EOQEMU
nvram = [

And restart the daemon.

systemctl restart libvirtd

Done, now you can create VMs with UEFI firmware instead of BIOS. Note that virt-manager cannot change this once a VM is created, but you should be able to use the command line tools:

# virsh edit my_domain
<type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
<loader readonly='yes' type='pflash'>/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
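To double-check that a domain really ended up with the pflash loader, its XML can be inspected. This is just a sketch: uses_pflash_loader is a hypothetical helper name of mine, and it only looks for a loader element with type='pflash' on standard input.

```shell
#!/bin/sh
# Read libvirt domain XML on stdin and succeed if it contains a
# <loader> element of type "pflash" (the UEFI firmware loader type).
# The helper name is hypothetical, chosen for this example only.
uses_pflash_loader() {
    grep -Eq "<loader[^>]*type=['\"]pflash['\"]"
}
```

It could be used as `virsh dumpxml my_domain | uses_pflash_loader && echo UEFI`, where my_domain is the same placeholder name as above.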

If you are PXE booting, I vaguely remember that the libvirt driver had some issues, so it should be safer to use RTL chipset emulation. That’s all for today.

03 October 2017 | linux | fedora | uefi | efi | libvirt

CentOS and security updates

I often see articles, blog posts or even video tutorials on how to apply security-only errata in CentOS environments or set up a cron job to do this regularly. While it can be very useful to keep components on a specific version and only update those which have security fixes, it has one drawback.

It does not work in CentOS.

The thing is that the yum-plugin-security plugin, which is available in CentOS, installs just fine and appears to operate properly, returning no security updates when people test it. The missing bit is the security metadata: CentOS repositories do not provide it.

The official position of the CentOS project on the yum-plugin-security is that the project does not test for CVE closure on updates so does not publish the necessary metadata for the security plugin to function. If you require such validation you are encouraged to use RHEL. End of official statement which you can get by typing “@yumsecurity” on the #centos IRC channel.

Third-party repositories might provide security-related metadata, EPEL to name one. This makes it look like everything works just fine while it does not. The core components (e.g. kernel, libc, ssh) are not in EPEL, so you can easily be fooled into thinking that you are safe.

There are several reasonable workarounds, including watching security news or Red Hat security alerts and applying updates manually, buying a Red Hat Enterprise Linux subscription, or simply applying all updates. It’s not as bad as you think: tracking security news is something that every administrator should do anyway in order to install just the minimum possible set of updates on mission-critical systems.
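One way to see whether a given repository publishes errata metadata at all is to look for an updateinfo entry in its repodata/repomd.xml index. A minimal sketch follows; repo_has_updateinfo is a hypothetical helper name and it simply greps the XML passed on standard input.

```shell
#!/bin/sh
# Read a repository's repomd.xml on stdin and succeed if it advertises
# "updateinfo" metadata, which is what the security plugin relies on.
# The helper name is hypothetical, chosen for this example only.
repo_has_updateinfo() {
    grep -Eq '<data[[:space:]]+type="updateinfo"'
}
```

For example, `curl -s "$REPO_URL/repodata/repomd.xml" | repo_has_updateinfo || echo "no errata metadata"`, where REPO_URL is a placeholder for the base URL of the repository you want to check.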

01 August 2017 | linux | fedora | centos | rhel

Git auto fetch script I run every day

I am a “shutdowner”, meaning I always shut down my laptop (now workstation) at the end of the day. I have a script to do that, which first sleeps 5 seconds (so I can change my mind - e.g. when I dig through shell history incorrectly and quickly hit enter - it really happened, yeah). It is simple:

  • puts my monitors into standby mode
  • applies all OS updates
  • runs duplicity backup on my home folder
  • fetches git repos
  • filesystem sync call
  • fstrim root volume
  • poweroff
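The steps above could be wired together roughly like this. It is only a sketch under assumptions, not my actual script: the duplicity target is a made-up path, fetch_git_repos is a stand-in, and everything is a dry run unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of an end-of-day shutdown script. Dry-run by default:
# export DRY_RUN=0 to really execute the steps (most of them need root).
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or only print it when in dry-run mode.
step() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

# Stand-in for the git fetch step; replace with your own fetch logic.
fetch_git_repos() {
    :
}

shutdown_sequence() {
    step sleep 5                       # grace period to press Ctrl+C
    step xset dpms force standby       # monitors into standby mode
    step dnf -y upgrade                # apply all OS updates
    step duplicity "$HOME" file:///mnt/backup/home   # hypothetical backup target
    step fetch_git_repos               # fetch git repos
    step sync                          # flush filesystem buffers
    step fstrim /                      # trim the root volume
    step poweroff
}
```

Calling shutdown_sequence in the default dry-run mode only prints the commands in order, which is a cheap way to review the sequence before trusting it with a real poweroff.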

I learned the trick I want to write about today from a colleague of mine, Mirek Suchý, but I think he runs it from cron (not a “shutdowner” guy). The idea is simple:

  • find all directories containing .git/ and run on all of them:
  • git fetch --all
  • git gc

So every time I do git pull on a repo that I don’t use much (e.g. the Ruby language), I don’t need to wait for seconds to pull all the commits. Clever. Now I’ve improved it a bit.

With my Ryzen 1700 CPU (8 cores, 16 threads), I am able to leverage GNU parallel to do this in parallel. That will be faster. But how much? Let’s test against the git repo I use the most:

# git -c pack.threads=1 gc --aggressive

# git -c pack.threads=16 gc --aggressive

Initially I thought that running 16 GNU parallel worker processes would be fine, but git gc is really slow on one core (see above), so I usually ended up with several very slow garbage collection tasks while all the others had finished downloading. The sweet spot for git is around 4 threads, where it always gives reasonable times even for bigger repos.

But I think a little bit of CPU overcommit won’t kill anyone, therefore I’ve decided to go with 8x4, which might sound crazy (32 threads in theory), but in practice garbage collection is executed only on the few repositories I regularly work on.

Lot of words, I know. Here is the snippet:

find ~/work -name '.git' -type d | \
    parallel -j 6 'pushd "{}"; git fetch --all; git -c pack.threads=4 gc --aggressive --no-prune --auto; popd'
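As a rough sanity check on the overcommit, the worst case is simply the number of parallel jobs times the pack threads. A tiny sketch with hypothetical helper names:

```shell
#!/bin/sh
# Worst-case number of busy threads: parallel jobs times git pack threads.
worst_case_threads() {
    echo $(( $1 * $2 ))
}

# Load as a percentage of the available hardware threads.
overcommit_percent() {
    echo $(( $1 * 100 / $2 ))
}
```

`worst_case_threads 8 4` prints 32, and `overcommit_percent 32 16` prints 200 for a 16-thread CPU. With the -j 6 used in the snippet, `worst_case_threads 6 4` prints 24.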

I think I could go further, but this already gives me a good experience, and while my PC is doing this, I am already walking away from it. No biggie. Final notes on the git flags I use:

  • aggressive - much slower collect giving better results
  • no-prune - I don’t want to lose any commits at any point in time
  • auto - git will decide when to actually run gc

17 July 2017 | linux | fedora | git