support enable-host-diskstats option. #601

peppaJoeng · 2023-07-07T03:44:09Z

new interface

--enable-host-diskstats

Experimental procedure

Disk layout after partitioning:

$ lsblk
sdd                8:48   0  100G  0 disk 
├─sdd1             8:49   0   50G  0 part 
├─sdd2             8:50   0    1K  0 part 
└─sdd5             8:53   0   20G  0 part

Start lxcfs

$ lxcfs --enable-host-diskstats /var/lib/lxc/lxcfs/ &

Start the container

$ docker run -tid -m 256m --device=/dev/sdd1:/dev/test -v /var/lib/lxc/lxcfs/proc/meminfo:/proc/meminfo:rw -v /var/lib/lxc/lxcfs/proc/diskstats:/proc/diskstats:rw --name test ubuntu bash

View the container's /proc/diskstats

$ docker exec -it test ls /dev/test
/dev/test
$ docker exec -it test cat /proc/diskstats
8       49 test 249 0 17928 24 75 1115 535153 152 0 169 180 51 0 104857608 3
$ cat /proc/diskstats 
   8       0 sda 23881 2843 2239558 980807 62612 124673 23769904 27578300 0 1989128 29020938 0 0 0 0 11198 461830
   8      32 sdc 1329 0 20890 1088 5376 71021 5013944 101385 0 52829 168482 21 0 41943048 1 277 66007
   8      48 sdd 1383 0 75670 143 104 1115 536836 643 0 813 1279 51 0 104857608 3 5 488
   8      49 sdd1 249 0 17928 24 75 1115 535153 152 0 169 180 51 0 104857608 3 0 0
   8      50 sdd2 10 0 68 0 0 0 0 0 0 10 0 0 0 0 0 0 0
   8      53 sdd5 312 0 20032 29 22 0 1667 2 0 82 32 0 0 0 0 0 0

Delete the container

$ docker rm -f `docker ps -aq`

After enabling the enable-host-diskstats option, when the user accesses the /proc/diskstats file, the host /proc/diskstats data will be used instead of reading data from the cgroup. This option is suitable for scenarios where the container exclusively uses block/character devices.
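The experiment above shows the host entry for 8:49 (sdd1) appearing in the container under the name test. A minimal sketch of that per-line rewrite, assuming a lookup table keyed by major:minor, could look like the following. The names `dev_alias` and `rewrite_diskstats_line` are hypothetical and are not taken from the patch; the sketch also normalizes column padding rather than preserving it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical table entry: one device node visible inside the container. */
struct dev_alias {
	unsigned int dev_major;
	unsigned int dev_minor;
	const char *name; /* node name under the container's /dev */
};

/*
 * Rewrite one host /proc/diskstats line so that it carries the
 * container-side device name.
 *
 * Returns the number of bytes written to out, 0 if the device is not in
 * the table (the line should be dropped), or -1 on a parse error or if
 * out is too small.
 */
static int rewrite_diskstats_line(const char *line,
				  const struct dev_alias *aliases, size_t n,
				  char *out, size_t outlen)
{
	unsigned int dev_major, dev_minor;
	char host_name[64];
	int off = 0;

	/* Line layout: "<major> <minor> <name> <counters...>" */
	if (sscanf(line, "%u %u %63s%n", &dev_major, &dev_minor,
		   host_name, &off) != 3)
		return -1;

	for (size_t i = 0; i < n; i++) {
		int len;

		if (aliases[i].dev_major != dev_major ||
		    aliases[i].dev_minor != dev_minor)
			continue;

		/* Keep the counters untouched, swap only the name. */
		len = snprintf(out, outlen, "%u %u %s%s", dev_major,
			       dev_minor, aliases[i].name, line + off);
		if (len < 0 || (size_t)len >= outlen)
			return -1;
		return len;
	}

	return 0;
}
```

Matching on major:minor rather than on the name is what lets the container see test while the host sees sdd1.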

fix #599

Signed-off-by: peppaJoeng <[email protected]>

mihalicyn · 2023-08-01T10:07:17Z

Hi @peppaJoeng!

I'll review your PR this week. Sorry about the delay.

peppaJoeng · 2023-08-23T08:57:37Z

Hi @mihalicyn, is there any progress?

mihalicyn · 2023-08-29T12:08:59Z

src/proc_fuse.c

+			lxcfs_error("Error opening dir /dev: %s\n", strerror(errno));
+			goto child_out;
+		}
+		while ((ptr = readdir(dir)) != NULL) {


Here you are iterating over /dev inside the container's mount namespace to build a list of the block devices used by the container, right?

What do you think about using the /proc/<pid>/mountinfo file instead to get all used block devices? More precisely, this file lets you get all block devices that are mounted inside the container's mount namespace.

Another limitation here is that a container can have several different mount namespaces.
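To make the mountinfo suggestion concrete, here is a rough sketch of pulling the major:minor pair and the mount point out of one /proc/<pid>/mountinfo line. The names `mount_dev`, `parse_mountinfo_line` and `backed_by_named_device` are hypothetical. Treating major 0 as "not a real block device" is only a heuristic, since some filesystems also report anonymous device numbers.

```c
#include <stdio.h>

/* Hypothetical result type for one parsed mountinfo line. */
struct mount_dev {
	unsigned int dev_major;
	unsigned int dev_minor;
	char mount_point[256];
};

/*
 * Parse the leading fields of a mountinfo line:
 *   <mount id> <parent id> <major>:<minor> <root> <mount point> ...
 * Returns 0 on success, -1 if the line does not match this layout.
 */
static int parse_mountinfo_line(const char *line, struct mount_dev *out)
{
	if (sscanf(line, "%*d %*d %u:%u %*s %255s", &out->dev_major,
		   &out->dev_minor, out->mount_point) != 3)
		return -1;

	return 0;
}

/*
 * Heuristic: major 0 is used for anonymous devices (tmpfs, devpts,
 * cgroup, ...), so only a non-zero major points at a named device.
 */
static int backed_by_named_device(const struct mount_dev *m)
{
	return m->dev_major != 0;
}
```

This only covers filesystems that are mounted; a device node that exists under /dev without being mounted never shows up in this file.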

peppaJoeng · 2023-08-31 (edited)

Good idea! Thank you for your suggestion, @mihalicyn.

For the first question, I did this by going into the container's namespace and traversing the /dev directory underneath it.

I had thought about using /proc/PID/mountinfo on the host side (if I understand correctly) instead of entering the container's namespace to achieve this, but I ran into some problems.

I cannot get the unmounted block device from this file; in fact, the block device mounted through --device cannot be obtained through the above method.

[root@localhost ~]# docker run -tid --name test --device /dev/sdb:/dev/xxx rnd-dockerhub.huawei.com/official/ubuntu-arm64 bash
c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809
[root@localhost ~]# docker inspect --format '{{.State.Pid}}' c82db9506ffaff30ce480852d313
3325697
[root@localhost ~]# cat /proc/3325697/mountinfo | grep sdb
[root@localhost ~]# cat /proc/3325697/mountinfo | grep xxx
[root@localhost ~]# cat /proc/3325697/mountinfo | grep /dev
520 518 0:59 / /dev rw,nosuid - tmpfs tmpfs rw,seclabel,size=65536k,mode=755
521 520 0:60 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=666
530 523 0:37 /docker/c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809 /sys/fs/cgroup/devices ro,nosuid,nodev,noexec,relatime master:11 - cgroup cgroup rw,seclabel,devices
540 520 0:57 / /dev/mqueue rw,nosuid,nodev,noexec,relatime - mqueue mqueue rw,seclabel
541 518 253:0 /var/lib/docker/containers/c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809/resolv.conf /etc/resolv.conf rw,relatime - ext4 /dev/mapper/euleros-root rw,seclabel
542 518 253:0 /var/lib/docker/containers/c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809/hostname /etc/hostname rw,relatime - ext4 /dev/mapper/euleros-root rw,seclabel
543 518 253:0 /var/lib/docker/containers/c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809/hosts /etc/hosts rw,relatime - ext4 /dev/mapper/euleros-root rw,seclabel
544 520 0:56 / /dev/shm rw,nosuid,nodev,noexec,relatime - tmpfs shm rw,seclabel,size=65536k
464 520 0:60 /0 /dev/console rw,nosuid,noexec,relatime - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=666

As you can see, I didn't get this device. However, I can get it in the following way:

[root@localhost ~]# docker exec -it test bash -c "ls /dev/xxx"
/dev/xxx

An important reason why I enter the container's /dev directory to read the device names is that the device names in the container and on the host are different.

I hope I understood you correctly. Feel free to share your thoughts.

botieking98 · 2023-12-25T01:31:11Z

Can anybody take some time to review this PR?

notlate · 2024-01-15T07:43:37Z

Hi @mihalicyn, could you take a moment to review this code?

mihalicyn

I like this feature in general, especially since it's an optional thing. If it's useful for some workloads, I think we can take it after some rework and polishing.

mihalicyn · 2024-03-19T17:51:07Z

src/proc_fuse.c

-			return read_file_fuse("/proc/diskstats", buf, size, d);
-	}
+	if (opts && opts->use_host_diskstats) {
+		lock_mutex(&container_dev_mutex);


Why do we need this mutex?

mihalicyn · 2024-03-19T17:54:54Z

src/proc_fuse.c

+			lxcfs_error("Error setting mnt ns: %s\n", strerror(errno));
+			goto child_out;
+		}
+		dir = opendir("/dev");


You can avoid fork() and the related setup by doing something like opendir("/proc/<pid>/root/dev"). In that case you don't need an extra process or setns(); you just read the directory.

mihalicyn · 2024-03-19T18:00:37Z

src/proc_fuse.c

+			lxcfs_error("Error opening dir /dev: %s\n", strerror(errno));
+			goto child_out;
+		}
+		while ((ptr = readdir(dir)) != NULL) {


> I cannot get the unmounted block device from this file; in fact, the block device mounted through --device cannot be obtained through the above method.

Yes, with this method you can only get devices that are mounted inside the container. The --device option in Docker does not mount the device; it just (I guess) performs a bind-mount of it inside the container.

mihalicyn · 2024-03-19T18:02:47Z

Hi @peppaJoeng

Sorry for the long delay with the review (please feel free to ping us next time).

Please rebase this PR if it's still relevant. I have left some review comments.

Kind regards,
Alex

peppaJoeng mentioned this pull request Jul 20, 2023

/proc/diskstats information is inaccurate #599

Open

mihalicyn reviewed Aug 29, 2023

mihalicyn reviewed Mar 19, 2024

stgraber added the "Incomplete" label (waiting on more information from reporter) Mar 26, 2024
