
Regression in CPU utilization virtualization #538

Open
webdock-io opened this issue May 19, 2022 · 13 comments
Labels
External (Issue is about a bug/feature in another project), Feature (New feature, not a bug), Maybe (Undecided whether in scope for the project)

Comments

@webdock-io

LXD v5.0.0
Ubuntu Jammy 5.15.0-27-generic

Create an Ubuntu container and stress all CPUs on the host system, e.g. stress -c 72. Enter the container and run htop: the system reports CPU utilization at 100% across all CPUs. Load average reporting is OK at ~0.
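
For reference, roughly what the reproduction looks like as commands (container name and image are placeholders; 72 matches my host's thread count, and stress/htop may need installing first):

# on the host: saturate every CPU
stress -c 72 &

# create a container and watch CPU usage from inside it
lxc launch ubuntu:jammy c1
lxc exec c1 -- htop    # shows ~100% on every CPU even though the container itself is idle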

This seems to be a regression, as on Focal 5.13.0-35-generic systems running v4.19, CPU utilization is correctly reported as ~0% across all threads inside the container.

@webdock-io
Author

OK, this seems to be something connected to the HWE kernel that ships with Jammy, or some change in Jammy in general, although please see the questions below.

I did a pretty hardcore nuke of the 5.0 install:

snap remove --purge lxd
apt purge snapd && apt install snapd
reboot

Then snap install lxd --channel=4.0/stable after the reboot, and I still see this issue present. I did another reboot after installing 4.0.9 for good measure and still see the issue.

However, I can't find a way to check which version of lxcfs is actually running, so I can't be sure lxcfs was really downgraded as well. How can I check this?
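
(For context, a rough way one might inspect this; the binary path below is a guess based on the LXD snap layout:)

# see which lxcfs process and binary are actually running
ps -eo pid,cmd | grep [l]xcfs

# ask that binary for its version (path assumed for the LXD snap)
/snap/lxd/current/bin/lxcfs --version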

Also, despite the hardcore nuke above, I still saw some old data hanging around; for example, lxc remote list showed my old remotes still present! How is that even possible? Isn't that supposed to be stored in the database, which almost certainly should have been nuked during the uninstall??

@stgraber Got any input on all this? Thanks :) I would like my testing to be valid and to actually revert to an older version of lxcfs, in order to determine for sure whether this is an lxcfs regression or whether something has changed in newer kernels/Jammy that causes this.

If I don't get any response here, I guess my next test is to start over with my system and install Focal instead of Jammy, and see if things work with LXD v5.0. That would help determine whether this is some change in Jammy which lxcfs is not handling, or a regression as I initially thought.

@webdock-io
Author

All right, some more testing later and the results are in:

On Ubuntu Focal 20.04 with LXD 5.0, both with the stock kernel and after installing the HWE kernel (5.13.0-40-generic), the issue is not present.

So this is definitely some change in Jammy and/or kernels newer than 5.13.0-40.

I have a bunch of systems deployed in a datacenter which I am unable to go live with due to this bug, as customers will surely complain, so my only choice here is to do a full reinstall over KVM to Focal - which is exceedingly slow and painful - unless I can get some hints on how to track this down and/or fix this :)

I'll give it a day or two to see if anybody here wakes up and gives me some pointers before diving into that particular madness. Thanks!

@tomponline
Contributor

Have you tried booting with unified_cgroup_hierarchy=0 as a kernel boot argument to see if it's cgroupv2 related?

@webdock-io
Author

@tomponline Thank you for chiming in and for providing this hint.

I tried adding this to /etc/default/grub and running update-grub, with:

GRUB_CMDLINE_LINUX_DEFAULT="swapaccount=1 init_on_alloc=0 systemd.unified_cgroup_hierarchy=0"

After performing this change and a reboot, the problem has gone away!
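
(For reference, a quick way to confirm which hierarchy is active after the reboot, assuming the standard /sys/fs/cgroup mount:)

stat -fc %T /sys/fs/cgroup/   # prints "cgroup2fs" on a pure cgroup v2 host, "tmpfs" on the legacy/hybrid v1 layout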

Note for the curious: swapaccount is an LXD-related setting we need for our setup, and init_on_alloc is a ZFS-related optimization.

Follow-up questions:

Although this solves my immediate problem, is it an issue for me moving forward having done this? cgroupv2 seems like a good thing and, I'm guessing, is what you will support going forward...

But I guess I could always switch back to using cgroupv2 after lxcfs is fixed, or is that naive of me?

@tomponline
Contributor

I suspect this is a problem with LXCFS in pure cgroupv2 environments, which needs fixing.
Yes, you can switch back to cgroupv2 later once it's fixed.

@webdock-io
Author

Great, thank you for the update; we have already downgraded all affected systems. I'll be back to test this once a fix has been implemented :)

@varuzam

varuzam commented Jul 30, 2022

I have the same issue on Debian 11 with lxcfs 5.0.1. Switching to cgroupv1 fixed it. Looking forward to proper cgroupv2 support.

@stgraber
Member

stgraber commented Aug 1, 2022

@brauner interested in looking into this one?

@salanki

salanki commented Aug 5, 2022

Getting this fixed would be very nice

@lflare

lflare commented Feb 26, 2023

Chiming in that this is still experienced in v5.0.3.

@mihalicyn
Member

mihalicyn commented Feb 27, 2023

That's not an LXCFS bug. The problem is that currently the cpuset cgroup controller is not enabled by default, but it is required to properly "virtualize" CPU stats inside the LXC container:

if (!cgroup_ops->get(cgroup_ops, "cpuacct", cg, "cpuacct.usage_all", &usage_str)) {

I'll put this on my ToDo list to look at how to properly enable this controller with LXD.

-- Update:

The cgroup-v1 cpuacct and cpu controllers were replaced by a single cpu controller in cgroup-v2. Unfortunately, the cgroup-v2 cpu controller doesn't provide an analog of the cpuacct.usage_all file. There is a cpu.stat file, but it only gives aggregated times (a sum across all CPUs).

So, that's a kernel limitation. Nothing can be done here from the LXCFS side.
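
For illustration, the gap is visible directly in the two filesystems (paths assume the default mount points):

# cgroup v1: cpuacct exposes per-CPU usage, which is what lets lxcfs
# virtualize the per-CPU lines of /proc/stat
cat /sys/fs/cgroup/cpuacct/cpuacct.usage_all   # one row per CPU: "cpu user system" (in ns)

# cgroup v2: the cpu controller only exposes totals aggregated across all CPUs
cat /sys/fs/cgroup/cpu.stat                    # usage_usec / user_usec / system_usec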

cc @stgraber

@mihalicyn mihalicyn added the Feature (New feature, not a bug), External (Issue is about a bug/feature in another project) and Maybe (Undecided whether in scope for the project) labels on Mar 20, 2024
@webdock-io
Author

Ping +2 years later

Now that cgroupv2 is inevitable and will fully replace v1 (upcoming systemd releases will apparently even refuse to boot under cgroupv1), I feel this issue needs to be revisited and has become more pressing.

Would it be a matter of putting in a request with whoever maintains cgroupv2 to resurrect cpuacct.usage_all, and then you'd be able to implement this in lxcfs?

We really depend on this functionality to provide accurate CPU utilization metrics to container customers and are motivated to get this moving in the right direction. If you could point us to where we can raise this issue, or even provide financial motivation to implement the metric that's needed in cgroupv2, that would be much appreciated.

@stgraber
Member

@mihalicyn ^
