support enable-host-diskstats option. #601
After enabling the enable-host-diskstats option, when the user accesses the /proc/diskstats file, the host /proc/diskstats data will be used instead of reading data from the cgroup. This option is suitable for scenarios where the container exclusively uses block/character devices. Signed-off-by: peppaJoeng <[email protected]>
Hi @peppaJoeng! I'll review your PR this week. Sorry about the delay.
Hi, @mihalicyn
lxcfs_error("Error opening dir /dev: %s\n", strerror(errno));
goto child_out;
}
while ((ptr = readdir(dir)) != NULL) {
Here you are iterating over /dev inside the container mount namespace to build a list of the block devices used by the container, right?
What do you think about using the /proc/<pid>/mountinfo file instead to get all used block devices? More precisely, this file lets you get all block devices that are mounted inside the container mount namespace.
Another limitation here is that a container can have several different mount namespaces.
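To make the mountinfo suggestion concrete, here is a minimal sketch of parsing one line of /proc/<pid>/mountinfo into a device number and a mount source. It is not code from this PR: `struct mount_dev`, `parse_mountinfo_line` and `has_nonzero_major` are hypothetical names, and treating major 0 as "no block device" is only a heuristic (some filesystems also report anonymous device numbers).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one mountinfo entry; not an lxcfs type. */
struct mount_dev {
	unsigned int major;
	unsigned int minor;
	char source[256];
};

/*
 * Parse one line of /proc/<pid>/mountinfo.
 * Returns 0 and fills *out when the line is well formed, -1 otherwise.
 */
static int parse_mountinfo_line(const char *line, struct mount_dev *out)
{
	const char *sep;

	/* Fields 1-3: mount ID, parent ID, major:minor. */
	if (sscanf(line, "%*d %*d %u:%u", &out->major, &out->minor) != 2)
		return -1;

	/* The optional fields end at a lone "-"; fstype and source follow. */
	sep = strstr(line, " - ");
	if (!sep)
		return -1;

	if (sscanf(sep + 3, "%*s %255s", out->source) != 1)
		return -1;

	return 0;
}

/*
 * Heuristic: major 0 is used for anonymous devices (tmpfs, devpts,
 * cgroup, ...), so a non-zero major usually means a real block device.
 */
static int has_nonzero_major(const struct mount_dev *m)
{
	return m->major != 0;
}
```

A caller would read the file line by line with fgets() and keep the entries for which `has_nonzero_major()` is true. A device node that exists in /dev but is not mounted anywhere never shows up in this file, so this approach only covers mounted devices.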
Good idea! Thank you for your suggestion, @mihalicyn.
For the first question: yes, I did this by entering the container's mount namespace and traversing the /dev directory in it.
I had thought about reading /proc/PID/mountinfo on the host side (if I understand you correctly) instead of entering the container namespace, but I ran into some problems.
- I cannot get the unmounted block device from this file; in fact, the block device mounted through --device cannot be obtained through the above method.
[root@localhost ~]# docker run -tid --name test --device /dev/sdb:/dev/xxx rnd-dockerhub.huawei.com/official/ubuntu-arm64 bash
c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809
[root@localhost ~]# docker inspect --format '{{.State.Pid}}' c82db9506ffaff30ce480852d313
3325697
[root@localhost ~]# cat /proc/3325697/mountinfo | grep sdb
[root@localhost ~]# cat /proc/3325697/mountinfo | grep xxx
[root@localhost ~]# cat /proc/3325697/mountinfo | grep /dev
520 518 0:59 / /dev rw,nosuid - tmpfs tmpfs rw,seclabel,size=65536k,mode=755
521 520 0:60 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=666
530 523 0:37 /docker/c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809 /sys/fs/cgroup/devices ro,nosuid,nodev,noexec,relatime master:11 - cgroup cgroup rw,seclabel,devices
540 520 0:57 / /dev/mqueue rw,nosuid,nodev,noexec,relatime - mqueue mqueue rw,seclabel
541 518 253:0 /var/lib/docker/containers/c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809/resolv.conf /etc/resolv.conf rw,relatime - ext4 /dev/mapper/euleros-root rw,seclabel
542 518 253:0 /var/lib/docker/containers/c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809/hostname /etc/hostname rw,relatime - ext4 /dev/mapper/euleros-root rw,seclabel
543 518 253:0 /var/lib/docker/containers/c82db9506ffaff30ce480852d3133cd4528a013d427372c77d91621edfb75809/hosts /etc/hosts rw,relatime - ext4 /dev/mapper/euleros-root rw,seclabel
544 520 0:56 / /dev/shm rw,nosuid,nodev,noexec,relatime - tmpfs shm rw,seclabel,size=65536k
464 520 0:60 /0 /dev/console rw,nosuid,noexec,relatime - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=666
As you can see, I didn't get this device. However, I can get it in the following way:
[root@localhost ~]# docker exec -it test bash -c "ls /dev/xxx"
/dev/xxx
- An important reason why I enter the container and read its /dev directory for the device names is that the device names in the container and on the host are different.
I hope I understood you correctly. I'm happy to discuss this further.
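The naming mismatch described above (the host's /dev/sdb appearing as /dev/xxx in the container) can be bridged by matching on the device number rather than the name. The following is only a sketch under that assumption, not the PR's implementation: `rename_diskstats_line` is a hypothetical helper, and the major/minor pair is assumed to come from `st_rdev` of the container's device node.

```c
#include <stdio.h>
#include <string.h>

/*
 * Rewrite one /proc/diskstats line so that the device identified by
 * (major, minor) is reported under the name it has inside the container.
 * Returns the number of bytes written to out, 0 if the line belongs to
 * a different device, or -1 on a malformed line or a too-small buffer.
 */
static int rename_diskstats_line(const char *line,
				 unsigned int major, unsigned int minor,
				 const char *container_name,
				 char *out, size_t outlen)
{
	unsigned int maj, min;
	char host_name[64];
	int consumed = 0;
	int n;

	/* Leading columns: major, minor, device name; %n records the offset. */
	if (sscanf(line, "%u %u %63s%n", &maj, &min, host_name, &consumed) != 3)
		return -1;

	if (maj != major || min != minor)
		return 0;

	/* Keep the counters untouched and swap only the name column. */
	n = snprintf(out, outlen, "%4u %7u %s%s", maj, min, container_name,
		     line + consumed);
	if (n < 0 || (size_t)n >= outlen)
		return -1;

	return n;
}
```

With the host line for device 8:16 named sdb and a container node called xxx, the output line carries the same counters under the name xxx, which is what a tool inside the container would expect to find.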
I cannot get the unmounted block device from this file; in fact, the block device mounted through --device cannot be obtained through the above method.
Yes, with this method you can only get devices that are mounted inside the container. The --device option in Docker does not mount the device, but just (I guess) performs a bind-mount of it inside the container.
Can anybody take some time to review this PR?
Hi @mihalicyn, could you take a moment to review this code?
I like this feature in general, especially since it's optional. If it's useful for some workloads, I think we can take it after some rework and polishing.
return read_file_fuse("/proc/diskstats", buf, size, d);
}
if (opts && opts->use_host_diskstats) {
lock_mutex(&container_dev_mutex);
Why do we need this mutex?
lxcfs_error("Error setting mnt ns: %s\n", strerror(errno));
goto child_out;
}
dir = opendir("/dev");
You can avoid fork() and the related machinery by doing something like opendir("/proc/<pid>/root/dev"). In that case you don't need an extra process or setns(); you just read the directory.
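As a rough illustration of this suggestion, the sketch below scans a directory such as /proc/<pid>/root/dev from the host and reports the block device nodes it finds, with no fork() or setns(). `list_block_devices` is a hypothetical helper, not part of lxcfs; the caller is assumed to have permission to access /proc/<pid>/root, and the result reflects only the mount namespace of that one pid.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Print every block device node found directly under dev_dir, for
 * example "/proc/<pid>/root/dev". Returns the number of block devices
 * found, or -1 if the directory cannot be opened.
 */
static int list_block_devices(const char *dev_dir)
{
	struct dirent *ent;
	struct stat st;
	int count = 0;
	DIR *dir;

	dir = opendir(dev_dir);
	if (!dir)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		/* Do not follow symlinks: they would resolve against the host root. */
		if (fstatat(dirfd(dir), ent->d_name, &st, AT_SYMLINK_NOFOLLOW) < 0)
			continue;

		if (!S_ISBLK(st.st_mode))
			continue;

		/* st_rdev carries the device number shared with the host. */
		printf("%s %u:%u\n", ent->d_name,
		       major(st.st_rdev), minor(st.st_rdev));
		count++;
	}

	closedir(dir);
	return count;
}
```

The name printed is the container-side name, while the major:minor pair is the same one the host reports in /proc/diskstats, so the two can be matched without entering the container.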
Hi @peppaJoeng, sorry for the long delay with the review (please feel free to ping us next time). Please rebase this PR if it's still relevant. I have left some review comments. Kind regards,
new interface
--enable-host-diskstats
Experimental procedure
$ lxcfs --enable-host-diskstats /var/lib/lxc/lxcfs/ &
$ docker run -tid -m 256m --device=/dev/sdd1:/dev/test -v /var/lib/lxc/lxcfs/proc/meminfo:/proc/meminfo:rw -v /var/lib/lxc/lxcfs/proc/diskstats:/proc/diskstats:rw --name test ubuntu bash
After enabling the enable-host-diskstats option, when the user accesses the /proc/diskstats file, the host /proc/diskstats data will be used instead of reading data from the cgroup. This option is suitable for scenarios where the container exclusively uses block/character devices.
fix #599