Workaround for Linux CGroup V2 causing dropped packets with recording and a memory limit set #3004

Draft
wants to merge 1 commit into base: 0.x

Conversation

cb22
Contributor

@cb22 cb22 commented Jun 18, 2022

I'd like to preface this by saying this is not a bug in Janus - but rather subtle / broken behaviour from CGroup V2 on Linux. With a memory limit set, the page cache is considered as 'in use' when allocating network buffers. This PR is a workaround that's worked so far for us, but it's likely not the best solution and might not be suitable for inclusion. This is more to get a discussion going, or even just point anyone who's had the same problem in the right direction!

We deploy Janus in Docker containers with a 512MB memory limit, and started noticing that whenever we have a large-ish number of participants in a room (>6) and recording turned on, within 30 minutes, incoming video packets would start getting lost by Janus. This would manifest as glitchy video (NACKs would be sent and they'd recover, but the cycle was continuous) and slowlink events.

Digging further, these losses would show up as drops in /proc/net/udp and with dropwatch. My first thought was that UDP receive buffer sizes were somehow involved, but experimenting with manually tweaking SO_RCVBUF in libnice didn't yield any results.

It seemed related to recording: turning it on / off started and stopped the issue, and the issue disappeared when the fwrite calls in record.c were simply commented out. I tried a few combinations, such as a different filesystem and tmpfs. Interestingly, running on tmpfs did not replicate the problem, but running on an ext4 FS image backed by tmpfs did.

Eventually, I traced it down to __sk_mem_raise_allocated failing in the kernel because it couldn't charge the CGroup for the usage. CGroup V2 changed the behaviour here: the page cache is now tracked on a per-CGroup basis, rather than globally as in CGroup V1 (which is why we hadn't hit this before!). Combined with Linux not freeing page cache when trying to grow a socket buffer, this led to video packets being dropped. I figured audio was unaffected since audio packets are smaller and the buffer didn't need to be reallocated.

This was easy to confirm; once the server had gotten into the dropping state, running sync; echo 3 > /proc/sys/vm/drop_caches fixed it completely up until the page cache filled up again.

If you'd like to recreate this, it should be easy enough:

  • Use CGroups V2
  • Run a Janus Docker container with a low memory limit set (e.g. 64MB) to make reproduction quicker
  • Enable recording, and make sure you're recording to a real disk
  • Connect a few browsers to a videoroom on it.

@cb22 changed the title from "Workaround for Linux CGroup V2 causing dropped packets with a memory limit set" to "Workaround for Linux CGroup V2 causing dropped packets with recording and a memory limit set" Jun 18, 2022
@lminiero
Member

Thanks for the detailed analysis of the problem and the root causes, this was an interesting read!

I agree that it's probably not something we can merge as it is, mostly because I don't know what the implications and the impact on performance in general could be on systems not affected by this. That said, I think it does make sense to add a way to simply enable this behaviour, e.g., at compile time. A simple way might be to make this functionality dependent on a define, e.g., one that defines the maximum value you use to trigger the action (100 in your case). I'm imagining something like

#ifdef RECORD_MAX_UNFLUSH
	recorder->unflushed_count++;
	if (recorder->unflushed_count == RECORD_MAX_UNFLUSH) {
		posix_fadvise(fileno(recorder->file), 0, 0, POSIX_FADV_DONTNEED);
		recorder->unflushed_count = 0;
	}
#endif

(where unflushed_count is conditionally defined as well), which means you should be able to simply pass -DRECORD_MAX_UNFLUSH=100 as a CFLAG parameter when configuring/building Janus to turn that on. This way, we could keep the pre-existing behaviour by default, and allow those that need this feature to trigger it.

You can also add a commented out define in record.h, e.g.

//~ #define RECORD_MAX_UNFLUSH 100

which is what we do in refcount.h as well as a quick way to turn on reference counter debugging, and it would be an easier way of enabling the functionality (you just uncomment that line and recompile).

@lminiero lminiero added the legacy Related to Janus 0.x label Jun 20, 2022
@atoppi
Member

atoppi commented Jun 20, 2022

@cb22 first of all, thanks for the very interesting analysis!

Let me clarify that I'm not an expert on CGroups / Linux page caching, and honestly this is the first time I've had to deal with this kind of challenge :-)

According to Linux kernel docs posix_fadvise is meant to

announce an intention to access file data in a specific pattern in the future, 
thus allowing the kernel to perform appropriate optimizations.
...
The advice is not binding; it merely constitutes an expectation on behalf of the application.

so IIUC this system call is useful for (micro?) optimizations and is not guaranteed to work.

You mentioned that this is not a bug in Janus, but I'm wondering if this is an issue that could potentially arise in Cgroups V2 docker instances with "common" memory limits (e.g. ~GB).
In other words, is this bug in your opinion reproducible with a properly configured janus image and an increasing number of videoroom participants ?

@cb22
Contributor Author

cb22 commented Jun 20, 2022

Thanks! @lminiero your solution does look good to me, it'd certainly be perfect for our use case. I'd be happy to make the changes and update this PR, depending on your thoughts on the next section.

@atoppi I had so many false starts when debugging this, it's not every day one hits strange edge cases in the kernel :)

Regarding posix_fadvise: yes, that's what the docs say, but the Linux kernel (currently!) does treat it as an instruction and flushes the page cache (https://unix.stackexchange.com/questions/208237/is-posix-fadvice-posix-fadv-dontneedfunctional-in-linux and https://github.com/torvalds/linux/blob/master/mm/fadvise.c#L111). I can see the flushing take place when I test it manually too.
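
One way to see that for yourself is to check page-cache residency with mincore() before and after the call. A small standalone sketch (hypothetical test program, error handling omitted for brevity):

#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count how many pages of 'path' are currently resident in the page cache. */
static size_t resident_pages(const char *path) {
	long psz = sysconf(_SC_PAGESIZE);
	int fd = open(path, O_RDONLY);
	struct stat st;
	fstat(fd, &st);
	size_t pages = (st.st_size + psz - 1) / psz;
	void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	unsigned char *vec = malloc(pages);
	mincore(map, st.st_size, vec);
	size_t resident = 0;
	for (size_t i = 0; i < pages; i++)
		resident += vec[i] & 1;
	free(vec);
	munmap(map, st.st_size);
	close(fd);
	return resident;
}

int main(int argc, char **argv) {
	if (argc < 2)
		return 1;
	printf("resident before: %zu pages\n", resident_pages(argv[1]));
	/* Ask the kernel to drop this file's clean pages from the page cache */
	int fd = open(argv[1], O_RDONLY);
	posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
	close(fd);
	printf("resident after:  %zu pages\n", resident_pages(argv[1]));
	return 0;
}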

Of course, whether we should rely on this behaviour is indeed a very valid question. The only other alternatives I saw were:

  1. Use O_DIRECT - but this has numerous pitfalls (see the sketch below), won't work on some filesystems, and we do want the cache when writing, at least
  2. Fix the behaviour in the Linux kernel so that it frees some page cache when it needs to allocate network buffers, just like it does for normal memory usage.

(2) would be the best choice, but it might take quite a while or not be acceptable for the kernel depending on their design decisions.
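
To illustrate the first pitfall: O_DIRECT shifts all of the alignment work onto the application, which doesn't fit a stream of variable-sized RTP packets well. A minimal hypothetical sketch, unrelated to the actual Janus code:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK 4096  /* must be a multiple of the filesystem's logical block size */

int main(void) {
	/* O_DIRECT bypasses the page cache entirely... */
	int fd = open("recording.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);
	if (fd < 0)
		return 1;

	/* ...but in exchange the buffer address and the I/O size must be block-aligned. */
	void *buf = NULL;
	if (posix_memalign(&buf, BLOCK, BLOCK) != 0)
		return 1;
	memset(buf, 0, BLOCK);

	/* A ~1200-byte RTP packet can't simply be written as-is: it has to be
	 * batched/padded up to a full block first, which is the main pitfall
	 * for a packet-by-packet recorder. */
	ssize_t n = write(fd, buf, BLOCK);

	free(buf);
	close(fd);
	return n == BLOCK ? 0 : 1;
}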

In other words, is this bug in your opinion reproducible with a properly configured janus image and an increasing number of videoroom participants ?

Yes - the reduced memory limit just causes you to hit it faster. If you give janus say 2GB of memory, you'd then require ~2GB of video data to be written (without any participants leaving since this would cause files to get fclose()d) and you'd hit it. At a lower bitrate and common call lengths this might not ever happen in practice - which is why we only started noticing it on bigger groups of ~6+.
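
For a rough sense of scale (hypothetical but typical numbers): at a ~512 kbps cap each recorded video stream writes about 64 KB/s, so six recorded streams produce roughly 0.4 MB/s, i.e. around 1.3-1.4 GB per hour; that's enough to fill a 2 GB limit's worth of page cache in about an hour and a half of continuous recording.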

@atoppi
Member

atoppi commented Jun 20, 2022

Yes - the reduced memory limit just causes you to hit it faster. If you give janus say 2GB of memory, you'd then require ~2GB of video data to be written (without any participants leaving since this would cause files to get fclose()d)

I don't understand why the page cache should grow up to the actual size of the file.
Doesn't fwrite flush the buffer content once data has been written out to the filesystem ?
Sorry if my questions sound silly.

@cb22
Contributor Author

cb22 commented Jun 20, 2022

Doesn't fwrite flush the buffer content once data has been written out to the filesystem ?

fwrite flushes the internal libc buffer to the filesystem, but the kernel will cache this data in its page cache even after the contents have been written to disk. Idea being that if there's free memory around, it makes sense to cache as much as you can (free memory is wasted memory) since if something more important comes along, it'll just cause evictions from the page cache.

You can see this by running the following:

# Create a dummy file for testing
federico@laptop ~> dd if=/dev/zero of=testing.bin bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 0.387327 s, 2.7 GB/s

# Read it back, 8GB/s is much faster than my laptop's SSD
federico@laptop ~> dd if=testing.bin of=/dev/null bs=1M
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 0.132486 s, 7.9 GB/s

# Flush the page cache
federico@laptop ~> sync; echo 3 | sudo tee /proc/sys/vm/drop_caches

# Now it actually reads from disk
federico@laptop ~> dd if=testing.bin of=/dev/null bs=1M
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 1.48592 s, 706 MB/s

@tmatth
Contributor

tmatth commented Jun 20, 2022

@cb22 would calling fsync() (or equivalent) have an impact here?

@atoppi
Member

atoppi commented Jun 20, 2022

Doesn't fwrite flush the buffer content once data has been written out to the filesystem ?

fwrite flushes the internal libc buffer to the filesystem, but the kernel will cache this data in its page cache even after the contents have been written to disk. Idea being that if there's free memory around, it makes sense to cache as much as you can (free memory is wasted memory) since if something more important comes along, it'll just cause evictions from the page cache.

Oh, got it now. Thanks!

@atoppi
Member

atoppi commented Jun 22, 2022

@cb22 are you limiting the memory for the instance with the -m / --memory flag? I can see there is also a --kernel-memory option.

@atoppi
Member

atoppi commented Jun 22, 2022

nvm, I see that --kernel-memory has been deprecated on v2.

@atoppi
Member

atoppi commented Jun 22, 2022

@cb22 I'm unable to reproduce.

Steps done:

  • enable cgroups v2 on the host machine (Ubuntu 20.04) through kernel parameter systemd.unified_cgroup_hierarchy=1
  • reboot
  • start a janus docker container with -m 512M and a specific ext4 volume for the recordings (/tmp/recordings)
  • setup a janus videoroom with high bitrate and recording enabled under /tmp/recordings
  • join the room with multiple tabs
  • check memory limits, replace <long ID> with the specific docker container long id (docker ps --no-trunc)
# max memory allowed
cat /sys/fs/cgroup/system.slice/docker-<long ID>.scope/memory.max

# current used memory
cat /sys/fs/cgroup/system.slice/docker-<long ID>.scope/memory.current

# memory stats
cat /sys/fs/cgroup/system.slice/docker-<long ID>.scope/memory.stat
  • check recordings total size
du -h /tmp/recordings/

After a certain amount of time:

  • memory.current settles at a figure around memory.max (536870912)
  • the recordings' total size exceeds 512MB without issues (in my case I stopped the test after 700MB)
  • there are no errors about missing packets and everything works as expected

By inspecting memory.stat I can see that the largest part of the memory is used for file.
According to the kernel docs, file refers to

Amount of memory used to cache filesystem data, including tmpfs and shared memory.

I guess that also takes into account the kernel page cache you are mentioning.
When the memory stops increasing, the file memory stops at around 480MB.
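
If it helps to watch this figure from inside the container, the file entry can also be read programmatically; a minimal sketch (the path /sys/fs/cgroup/memory.stat is what a cgroup v2 container typically sees from the inside, adjust it to the docker-<long ID>.scope path when reading from the host):

#include <stdio.h>
#include <string.h>

/* Return the "file" (page cache) figure, in bytes, from a cgroup v2 memory.stat,
 * or -1 on error / if the key is missing. */
static long long cgroup_file_bytes(const char *stat_path) {
	FILE *f = fopen(stat_path, "r");
	if (!f)
		return -1;
	char key[64];
	long long value = -1, v;
	while (fscanf(f, "%63s %lld", key, &v) == 2) {
		if (strcmp(key, "file") == 0) {
			value = v;
			break;
		}
	}
	fclose(f);
	return value;
}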

I'm attaching a screenshot of the test.

Am I missing something?
Screenshot from 2022-06-22 18-20-24

@cb22
Contributor Author

cb22 commented Jun 22, 2022

@atoppi apologies for the delay, I'll look over this thread and get back to you properly soon, just been a busy past few days! One thing off the top of my head - what kernel version are you running there? My tests were specifically with Debian, 5.10 and 5.16.

@tmatth I remember thinking fsync() wouldn't work; but I didn't explicitly test it and rule it out - I'll give that a shot too.

@tmatth
Contributor

tmatth commented Jun 22, 2022

@tmatth I remember thinking fsync() wouldn't work; but I didn't explicitly test it and rule it out - I'll give that a shot too.

Yeah, given your description I would honestly be surprised if it mitigated the issue, but it did occur to me that it could.

Also IIUC the janus_recorder probably should be calling fsync regardless (see: https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/) but that's probably out of scope for this issue (unless you find that it does indeed have an impact).
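
For reference, the kind of change that would imply is small: flush and sync before closing (a generic sketch, not the actual recorder code):

#include <stdio.h>
#include <unistd.h>

/* Hypothetical close helper: make sure recorded data has actually reached
 * the disk before the FILE* goes away, instead of only the page cache. */
static int close_recording(FILE *file) {
	if (fflush(file) != 0)          /* libc buffer -> kernel page cache */
		return -1;
	if (fsync(fileno(file)) != 0)   /* page cache -> disk */
		return -1;
	return fclose(file);
}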

@atoppi
Member

atoppi commented Jun 23, 2022

One thing off the top of my head - what kernel version are you running there? My tests were specifically with Debian, 5.10 and 5.16.

@cb22

The host machine is running

  • Ubuntu 20.04
  • 5.13.0-51-generic kernel (x86_64)
  • spectre / meltdown mitigations turned off (mitigations=off in kernel parameters)

The docker instance is running

  • Ubuntu 20.04
  • janus master
  • janus dependencies installed from source
    • libnice v0.1.19
    • libsrtp v2.4.2
    • libwebsockets v4.3
  • other janus deps installed from apt

@atoppi
Member

atoppi commented Jun 28, 2022

@cb22 any update ?

@cb22
Contributor Author

cb22 commented Jul 1, 2022

Hey @atoppi - finally got around to this!

I managed to reproduce it reliably on my local system running Arch (5.18.2-arch1-1), after splitting out most of our application-specific signalling stuff. It happens both with host networking and with Docker bridge networking (although with bridge networking, you won't see the details of the losses in the host's /proc/net/udp).

I've attached this here - janus-docker.tar.gz - feel free to run as is, or use parts of it. It should build cleanly, all commits / deps are pinned, and I just pointed the videoroom demo at it.

Built using: sudo docker buildx build -t janus .
Run using: sudo docker run --rm -ti --memory 32MB janus (the low memory causes the repro much faster)

Some perhaps relevant deviations from the usual configs:

  • Single event loop

Another thought is that perhaps Ubuntu have patched this issue in their kernel.

(FWIW, we've been running my hack in production for a bit now, and we've seen our UDP packet loss go basically to 0 since the rollout. It used to be significant before that.)

@atoppi
Member

atoppi commented Jul 1, 2022

I'm getting "Timeout expired for session ..." errors even for just 1 publisher after a few seconds.
I tried to update lws to v4.3, still no luck.

@atoppi
Member

atoppi commented Jul 1, 2022

Oh, I noticed you set the session timeout to 20 in janus.jcfg; I increased it to 60 and now it's fixed.

@atoppi
Member

atoppi commented Jul 1, 2022

I've managed to reproduce the issue with your dockerfile.
Host kernel is the same, so that is not relevant.
I've also ruled out most of the differences in the janus configuration (event loops, ice tcp, event handlers etc.).

@atoppi atoppi removed the needinfo label Jul 1, 2022
@atoppi
Member

atoppi commented Jul 1, 2022

With my setup I got 0 drops, while with the one you shared I consistently get plenty of NACKs

[Fri Jul  1 17:20:59 2022] [6329852838831458] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:04 2022] [8161021847113136] Sent NACKs for 2 missing packets (video stream #0)
[Fri Jul  1 17:21:04 2022] [6329852838831458] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:09 2022] [8161021847113136] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:09 2022] [6329852838831458] Sent NACKs for 3 missing packets (video stream #0)
[Fri Jul  1 17:21:14 2022] [6329852838831458] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:19 2022] [6329852838831458] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:29 2022] [6329852838831458] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:34 2022] [6329852838831458] Sent NACKs for 2 missing packets (video stream #0)
[Fri Jul  1 17:21:43 2022] [8161021847113136] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:43 2022] [6329852838831458] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:48 2022] [8161021847113136] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:49 2022] [6329852838831458] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:53 2022] [8161021847113136] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:57 2022] [6329852838831458] Sent NACKs for 1 missing packets (video stream #0)
[Fri Jul  1 17:21:58 2022] [8161021847113136] Sent NACKs for 1 missing packets (video stream #0)

and many "drops", as read by

cat /proc/net/udp | awk '{print $(NF)}' | grep -v -e ^0

drops
69
87

as soon as the memory usage reaches the top limit.

Since the host kernel is the same, that is not part of the problem.
There must be something different (in either dockerfile or jcfgs) between our setups that triggers the issue.

@cb22
Contributor Author

cb22 commented Jul 1, 2022

@atoppi that's really interesting - can you share your build and config, and how you're running it? I'll try and reproduce on my end too.

@atoppi
Member

atoppi commented Jul 4, 2022

@cb22 I finally managed to reproduce it on my dockerfile too.
Indeed there was a configuration mismatch: I didn't notice that your videoroom config had a 10 Mbps bitrate limit (too many zeroes to read, I guess...), while I was setting 1.2 Mbps.

So the issue is definitely confirmed.

@atoppi
Member

atoppi commented Jul 4, 2022

I'm also removing the legacy label, since I reproduced the issue on the master branch.

EDIT: I'm keeping the label (the PR is for 0.x), but bear in mind the issue is also present on master.

@atoppi atoppi added the bug label Jul 4, 2022
@lminiero
Member

lminiero commented Jul 4, 2022

I didn't notice that your videoroom cfg had a 10 Mbps limit in the videoroom

That is a very unusual setting, though, and no browser will send more than 2 or 2.5mbps anyway in a PeerConnection, no matter what we advertise via REMB.

@cb22
Contributor Author

cb22 commented Jul 4, 2022

That is a very unusual setting, though, and no browser will send more than 2 or 2.5mbps anyway in a PeerConnection, no matter what we advertise via REMB.

True - I put that there to speed up reproducing the issue. For us, in production and with my original testing, it was definitely occurring with our standard bitrate cap of ~512kbps.

@atoppi
Member

atoppi commented Jul 4, 2022

True - I put that there to speed up reproducing the issue. For us, in production and with my original testing, it was definitely occurring with our standard bitrate cap of ~512kbps.

Very likely because you have dozens of streams to handle (while I'm testing with just a couple of tabs).

@atoppi
Member

atoppi commented Jul 4, 2022

By inspecting dropwatch output I can see where drops come from in my case

1 drops at udp_queue_rcv_one_skb+3da (0xffffffffaf4a253a) [software]
1 drops at udp_queue_rcv_one_skb+3da (0xffffffffaf4a253a) [software]
1 drops at udp_queue_rcv_one_skb+3da (0xffffffffaf4a253a) [software]

@cb22 any suggestion on how to dig further into the reason for these drops?

@cb22
Contributor Author

cb22 commented Jul 24, 2022

Sorry @atoppi, I somehow missed this.

Right, so that's the same output I saw. A big clue came when I tried a newer kernel that had torvalds/linux@a3ce2b1 in it.

If you're running a kernel with that feature (UDP_MIB_MEMERRORS), you'll see a new item in netstat -u -s: MemErrors. This is what will be incrementing for each drop.
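
In case it's useful, the same counters can be read directly from /proc/net/snmp, which is where netstat gets them. A minimal sketch that pairs the header and value rows of the two Udp: lines (the MemErrors column only appears on kernels that carry that commit):

#include <stdio.h>
#include <string.h>

int main(void) {
	FILE *f = fopen("/proc/net/snmp", "r");
	if (!f)
		return 1;
	/* The first "Udp:" line holds the counter names, the second the values */
	char names[512] = "", values[512] = "", line[512];
	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, "Udp:", 4) != 0)
			continue;
		if (names[0] == '\0')
			strcpy(names, line + 5);
		else
			strcpy(values, line + 5);
	}
	fclose(f);

	/* Print name/value pairs; MemErrors is the one to watch for these drops */
	char *sn = NULL, *sv = NULL;
	char *n = strtok_r(names, " \n", &sn);
	char *v = strtok_r(values, " \n", &sv);
	while (n && v) {
		printf("%-15s %s\n", n, v);
		n = strtok_r(NULL, " \n", &sn);
		v = strtok_r(NULL, " \n", &sv);
	}
	return 0;
}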

My UDP memory sysctls were set sky high, so I was quite sure those weren't a problem. The call chain in the kernel is something like (https://github.com/torvalds/linux/blob/a3ce2b109a59ee9670706ae8126dcc04cfe261cd/net/ipv4/udp.c):

udp_queue_rcv_one_skb -> __udp_queue_rcv_skb -> __udp_enqueue_schedule_skb

UDP_MIB_MEMERRORS is only incremented in __udp_queue_rcv_skb if __udp_enqueue_schedule_skb returns -ENOBUFS. The only way I can see for that to happen is if __sk_mem_raise_allocated(sk, delta, amt, SK_MEM_RECV) fails.

__sk_mem_raise_allocated (https://github.com/torvalds/linux/blob/7e062cda7d90543ac8c7700fc7c5527d0c0f22ad/net/core/sock.c#L2907) was failing, I figured, because mem_cgroup_charge_skmem inside it was failing.

mem_cgroup_charge_skmem is here for ref https://github.com/torvalds/linux/blob/03c765b0e3b4cb5063276b086c76f7a612856a9a/mm/memcontrol.c#L7033 - but I didn't go looking much further when combined with seeing that dropping the page cache stopped the problem (and that the problem only occurred when recording was on).

@atoppi
Member

atoppi commented Jul 25, 2022

Unfortunately I don't have that feature on my kernel.
I am confident the root of the issue is the same: what are the chances of having incoming datagrams dropped in the same kernel function (udp_queue_rcv_one_skb) and under the same conditions (cgroups v2), but for a different reason?

__sk_mem_raise_allocated (https://github.com/torvalds/linux/blob/7e062cda7d90543ac8c7700fc7c5527d0c0f22ad/net/core/sock.c#L2907) was failing I figured because mem_cgroup_charge_skmem inside it was failing.

@cb22 How did you check this? I see that __sk_mem_raise_allocated may return 1 in many other places.

As for the general progress on this issue: we are concerned that this kind of dropping is going to be a major issue in the near future, both for Janus users and for people using cgroups v2 with other services.
Since its nature involves a (potentially broken) Linux kernel feature, we feel the problem will eventually hit other servers and projects too, so it is better to leave the PR as a draft and use it as a tracker to collect feedback from people hitting the same issue, or from people more experienced than Lorenzo and me with the kernel internals.
Meanwhile, something might change in the kernel upstream or another workaround might come up.

@cb22
Contributor Author

cb22 commented Jul 26, 2022

@cb22 How did you check this? I see that __sk_mem_raise_allocated may return 1 in many other places.

When __sk_mem_raise_allocated fails, it returns 0; I remember walking through the logic and convincing myself it had to be due to https://github.com/torvalds/linux/blob/7e062cda7d90543ac8c7700fc7c5527d0c0f22ad/net/core/sock.c#L2915 since all the other conditions would be fine, but I happily admit I was quite tired by this point :)

It should actually be easy enough to confirm by adding a printk there. I'll see if I get a chance to try that out. I definitely think it's worth seeking some feedback from someone more experienced with CGroups on the kernel side. If you know of any better places / contact people do let me know, otherwise I think I'll reach out to the CGroup kernel mailing list - https://www.spinics.net/lists/cgroups/
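
For reference, that confirmation would roughly amount to a one-liner like the following (purely illustrative: the exact code around the memcg charge in __sk_mem_raise_allocated() differs between kernel versions, so treat the placement as an assumption):

/* net/core/sock.c, __sk_mem_raise_allocated(): on the branch taken when
 * mem_cgroup_charge_skmem() refuses the charge (i.e. right before the
 * goto suppress_allocation), log it rate-limited. */
pr_warn_ratelimited("__sk_mem_raise_allocated: memcg refused charge of %d pages\n", amt);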

@atoppi
Member

atoppi commented Aug 2, 2023

Just a little update: I've been hit by the same issue while working on another project (unrelated to Janus) that was writing a huge log file to the disk.

So I guess this is still something to consider even with an updated environment:

ubuntu 22.04
kernel v5.19.0-50
docker v24.0.5

@atoppi
Member

atoppi commented Aug 21, 2023

It looks like redis did a similar fix recently.
They basically addressed the issue with the following chain of calls:

fflush
fdatasync
posix_fadvise

According to redis docs and comments, the first two calls are used to "make sure data will not remain on the OS's output buffers", since "in Linux posix_fadvise only take effect on writeback-ed pages, so a sync(or fsync, fdatasync) is needed to flush the dirty page before posix_fadvise if we reclaim write cache".
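
As a self-contained illustration of that chain (a hypothetical helper, not redis or Janus code; fdatasync is like fsync but skips flushing file metadata such as timestamps):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Flush stdio and kernel buffers for 'f', then ask the kernel to drop the
 * now-clean pages from the page cache. On Linux, POSIX_FADV_DONTNEED only
 * affects pages that have already been written back, hence the fdatasync(). */
static int drop_write_cache(FILE *f) {
	int fd = fileno(f);
	if (fflush(f) != 0 || fdatasync(fd) != 0)
		return -1;
	return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}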

Also they added a very useful macro to identify the OS support:

#if defined(__linux__) || __FreeBSD__ >= 10
#define HAVE_FADVISE
#endif

What is not very clear to me is the timing of the cache reclaim they are doing.
Anyway, I plan to switch from a packet-number criterion (like this draft does) to a size threshold (e.g. 1-2 MB), since it makes more sense to me to reclaim after a defined amount of bytes.

@atoppi
Member

atoppi commented Aug 21, 2023

What I was thinking of is something like this:

diff --git a/record.c b/record.c
index 6883ccf9..8774692c 100644
--- a/record.c
+++ b/record.c
@@ -21,6 +21,7 @@
 #include <sys/stat.h>
 #include <errno.h>
 #include <libgen.h>
+#include <fcntl.h>
 
 #include <glib.h>
 #include <jansson.h>
@@ -465,6 +466,15 @@ int janus_recorder_save_frame(janus_recorder *recorder, char *buffer, uint lengt
                header->seq_number = htons(seq);
                header->timestamp = htonl(timestamp);
        }
+#if defined(__linux__) || __FreeBSD__ >= 10
+       recorder->unflushed_bytes+=length;
+       if (recorder->unflushed_bytes >= 2000000) {
+               fflush(recorder->file);
+               fsync(fileno(recorder->file));
+               posix_fadvise(fileno(recorder->file), 0, 0, POSIX_FADV_DONTNEED);
+               recorder->unflushed_bytes = 0;
+       }
+#endif
        /* Done */
        janus_mutex_unlock_nodebug(&recorder->mutex);
        return 0;
diff --git a/record.h b/record.h
index 354474f3..4db35fb2 100644
--- a/record.h
+++ b/record.h
@@ -71,6 +71,8 @@ typedef struct janus_recorder {
        janus_mutex mutex;
        /*! \brief Atomic flag to check if this instance has been destroyed */
        volatile gint destroyed;
+       /*! \brief Used to flush the page cache */
+       uint unflushed_bytes;
        /*! \brief Reference counter for this instance */
        janus_refcount ref;
 } janus_recorder;

Basically, I'd change the draft by:

  • adding the fflush and fsync calls before posix_fadvise
  • using a macro to detect OS support
  • invoking the cache reclaim on a written-bytes basis instead of a number of frames

@cb22 in case you are still interested in the patch, could you please take a look and provide some feedback?

@cb22
Contributor Author

cb22 commented Aug 22, 2023

@cb22 in case you are still interested in the patch, could you please take a look and provide some feedback?

@atoppi at first glance, looks good to me! I can run it in production on a single instance for a few days and give some more feedback, if you'd like. We have monitoring set up for packet drops, so should be able to catch any issues.

@atoppi
Member

atoppi commented Aug 22, 2023

@cb22 that would be great! Thanks!

@atoppi
Member

atoppi commented Sep 7, 2023

@cb22 have you had the chance to test the patch?

@atoppi
Member

atoppi commented Feb 15, 2024

@cb22 just noticed that we never got feedback. Any update on this?
