Background:
The /proc/diskstats information for a container is assembled in the proc_diskstats_read function by reading the blkio subsystem of the container's cgroup.
Problem:
However, the blkio cgroup subsystem can be inaccurate. For instance, if a container operates on the disk /dev/sdc, the host's /proc/diskstats may show many operations on /dev/sdc, while the blkio cgroup does not reflect them: blkio.io_serviced_recursive, blkio.io_service_time_recursive, and other files related to these operations may be empty.
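To make the mismatch concrete, here is a minimal Python sketch (lxcfs itself is written in C; this only illustrates the comparison). It assumes the cgroup v1 per-operation layout of lines such as `8:32 Read 120` followed by a final `Total N` line, and the standard /proc/diskstats column order. The helper names `parse_blkio_per_op` and `host_ios` are hypothetical, and the sample numbers are made up for illustration.

```python
from collections import defaultdict


def parse_blkio_per_op(text):
    """Parse cgroup v1 blkio per-operation text into {"major:minor": {op: value}}."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        parts = line.split()
        # Device lines have three fields ("8:32 Read 120");
        # the trailing grand-total line ("Total 150") has two and is skipped.
        if len(parts) != 3:
            continue
        dev, op, value = parts
        stats[dev][op] = int(value)
    return dict(stats)


def host_ios(diskstats_text, name):
    """Return (reads completed, writes completed) for `name`, or None if absent."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Columns: major, minor, name, reads completed, ..., writes completed (index 7), ...
        if len(fields) >= 14 and fields[2] == name:
            return int(fields[3]), int(fields[7])
    return None


# Illustrative input only: an empty blkio file next to a busy host entry.
sample_blkio = ""
sample_host = "   8      32 sdc 1000 0 8000 500 2000 0 16000 900 0 700 1400\n"
cgroup_view = parse_blkio_per_op(sample_blkio)
host_view = host_ios(sample_host, "sdc")
```

With an empty blkio file the cgroup view is an empty mapping even though the host entry for sdc reports completed reads and writes, which is the situation described above.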
Proposed Solution:
To address this issue, one approach is to read the host's /proc/diskstats to assemble the container's /proc/diskstats.
This can be accomplished by configuring independent disks for each container, such as using /dev/sdc as the data disk of container A and /dev/sdd as the data disk of container B. This practice is also widely adopted in the industry to isolate container resources.
To support this approach, lxcfs may need to isolate /proc/partitions (I don't know why the community hasn't supported this so far) and modify the diskstats assembly logic so that it reads the entries for the disks used by the container from the host's /proc/diskstats and reassembles them. A better approach may be to add an option (--enable-host-diskstats) to switch between the old and new behavior, ensuring compatibility.
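The reassembly step could look roughly like the following Python sketch. It is not lxcfs code: the function name `filter_diskstats` and the `allowed` set are hypothetical, and how lxcfs would learn which disks belong to a container is left open here. The sketch only shows the filtering of host /proc/diskstats lines by device name.

```python
def filter_diskstats(host_text, allowed):
    """Keep only the /proc/diskstats lines whose device name is in `allowed`."""
    kept = []
    for line in host_text.splitlines():
        fields = line.split()
        # The device name is the third column (after major and minor).
        if len(fields) >= 3 and fields[2] in allowed:
            kept.append(line)
    # Return an empty string when nothing matches, otherwise newline-terminated text.
    return "\n".join(kept) + ("\n" if kept else "")


# Illustrative host content; the numbers are made up.
sample_host = (
    "   8       0 sda 10 0 80 5 20 0 160 9 0 7 14\n"
    "   8      32 sdc 1000 0 8000 500 2000 0 16000 900 0 700 1400\n"
    "   8      33 sdc1 900 0 7200 450 1800 0 14400 800 0 600 1200\n"
    "   8      48 sdd 300 0 2400 100 400 0 3200 200 0 150 300\n"
)

# Container A is assumed to own /dev/sdc and its partition sdc1.
container_a_view = filter_diskstats(sample_host, {"sdc", "sdc1"})
```

Passing the original lines through unchanged keeps every column the host kernel reports, so fields that the blkio files do not provide are preserved. An option such as the proposed --enable-host-diskstats would then only select between this path and the existing cgroup-based one.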
Furthermore, given the limited accuracy of blkio cgroups in most cases, it may be redundant to use them to assemble /proc/diskstats.
Thank you for your attention!
Can anyone give me some advice? I tried to implement a solution: a simpler method of disk isolation that does not depend on cgroups and only displays the data of the disks used by the current container in /proc/diskstats. @stgraber @mihalicyn
I assume there is a reason why it would not suffice to simply have the container manager bind mount /proc/diskstats from the host into the container after the lxcfs mounts?
In practice, we found that the disk data recorded by the cgroup is inaccurate and incomplete, for example the fields rq_ticks and ios_pgr.
Especially in production environments, we usually configure independent disks for containers to ensure service quality. For example, partitioning is a common operation when a large disk is to be allocated to 2 or more containers. Therefore, we can consider using the host's /proc/diskstats data.