No Fan Control for RX 7900 #293
Comments
As others have reported, I also see this msg about the time I start radeon-profile |
Okay, so, this seems to be more of a problem than I expected, since I had some crashes, and all I can think is that the card is overheating. More testing will tell, but that's all I have at this point. In the meantime, my question is: what component is broken that makes the fan control interface that exists as /sys/drm/.../hwmon/pwm1 just not work for this card? Is it the kernel? Is it firmware? Is it AMDGPU? |
Same here, PMouse. I have some 6800XTs that run fine with radeon-profile, but the new 7900XT I just picked up has problems: no fan control. |
I also have no fan control on a 7600. The highest I've seen the fan go was around 50%. I had a crash when I tried out Starfield and hit 85 degrees :/ Any fix, anyone? |
I have the AMD Radeon™ RX 7900 XTX and still no fan control. Has anyone gotten control using another program? |
Just a follow-up here. I went deep and replaced 100% of my system and RMA-ed my card 3 times trying to troubleshoot this problem...no joy. Pretty disappointed with AMD and PowerColor. However, they have convinced me that the problem is NOT overheating. While I do want to be able to control the fan curve--I don't see any point in having such a weak fan curve when my case fans are way louder--I'm quite confident this has nothing to do with the crashing. I switched BOINC applications and the system is very stable. The latest generation of cards self-regulates quite well, apparently. The root cause of my problems is hardware/software, it seems. I've been told to try ROCm 5.7, and after that, 6.0 for improved support. |
On the other hand, I see on Phoronix forums that support for voltage and clock tuning is beyond the initial stages. I get the sense it's available, possibly upstream, but available. I'm not sure where to look, but maybe radeon-profile maintainers do? |
Overclocking is easy on Linux! No real need for Windows type flashy tools. Just understand the commands then write a script, then for permanency you can run it as a startup script.
Point 1 above will give you pointers on how to set voltages, power limits, frequencies for CPU and GPU, etc. EXAMPLE: 6900XTXH-OC.sh
I just basically run the above and it is overclocked to what it says in that echo statement above. |
Cool, yeah, I guess. That's not changing the fan curve. Is there a way to do that? Fair, though; I guess I didn't make that clear in my post, but that was what I was driving at in this issue. Where do you find the documentation for these parameters? 381,000,000? That would be in microwatts? Odd unit, but okay, I guess it needs to be an integer. |
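On the unit question: hwmon power attributes such as power1_cap are expressed in microwatts, so 381,000,000 corresponds to 381 W. A minimal sketch of the conversion follows; the function names are made up for illustration and are not part of any tool mentioned in this thread.

```python
def watts_to_microwatts(watts: float) -> int:
    """Convert a power limit in watts to the integer microwatt value hwmon uses."""
    return round(watts * 1_000_000)


def microwatts_to_watts(microwatts: int) -> float:
    """Convert a microwatt value read from hwmon back to watts."""
    return microwatts / 1_000_000


print(watts_to_microwatts(381))          # 381000000
print(microwatts_to_watts(381_000_000))  # 381.0
```

The value is an integer because sysfs attributes of this kind are written as plain decimal strings.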
Have a look at that link I pasted in the previous post, the archlinux wiki on overclocking. You can also cross-reference with this... https://www.reddit.com/r/Amd/comments/agwroj/how_to_overclock_your_amd_gpu_on_linux/ |
Oh, I did. I didn't see anything about the fan curve there.
Yes, I think this information is all quite old, though, right? These interfaces have been exposed for a long time. I don't need to manually control them, because tools like radeon-profile work fine...but not for 7900 & RDNA3. I'm just not seeing this information as applicable. I'm NOT trying to overclock; I hope that is clear. I can see why the information I'm seeking might show up in an overclocking guide, but it isn't in the ones that have been referenced in this thread. It's been reported by several people that these old interfaces are known to NOT WORK with the new generation of cards. And we know why: the old interface is being deprecated, and they haven't decided on a new standard. I know, it's frustrating. But nothing that existed before RDNA3 is going to work for RDNA3. And I'm pretty sure they're not going to expose the fan speed directly, as has been the case for a long time. We will not be able to set the fan speed to a constant percent or RPM. They're not going to rebuild that capability. |
Oh yes, you did mention you were after fan control. Apologies I was for some reason only thinking of power and overclocking and controlling volts.
Yeah mate, sorry I thought you were after overclocking. However the base path for your RDNA3 card hwmon files are in the same area and that includes fan config, input and variables for manipulation. Linux doesn't change its stripes for drm cards just because AMD have a new card. They have to follow the Linux "way" to interface with Linux.
I've got an RDNA2 card, 6900XTXH; it is only a generation older than yours. It isn't "OLD". If all you care about is controlling the fan, you may have to experiment at your own risk. Do be careful though. That Arch Linux link I gave a few posts back also had a link to a GUI for AMD fancontrol written in Rust. Just control-f and search for "fan". Good luck! All that said, if you want maximum performance for your RDNA3, get a waterblock and liquid cool it! You'll never have to worry about the fan on cards again! Just the custom fans on the radiator. My radiator for the 6900XTXH has two 140mm Noctua industrials on it controlled by pwmconfig fancontrol. With standard consumer PWM 120mm/140mm fans, they'll always be easy to control with lm-sensors, pwmconfig, fancontrol and systemd. Here's my config:
That config controls two sets of radiators: one 360 radiator for my CPU and a 280 radiator for my GPU, along with a couple of case fans. Be careful not to manually alter the fan and pwm files if you find them until you are confident in what you are doing. |
Thanks. I think we are talking past each other. I know about the /dev & /proc filesystems. This problem has been widely discussed. The PWM control is broken on RDNA3. They just didn't program that interface to work on the RDNA3 cards. That was 6 months ago. Since then, I heard that kernel 6.7 has fan curve control functional. NOT manual PWM control, like RDNA2 and earlier, but you program the curve in the card and the card follows it for you. I can't find the link right now, but I caught a bit of how to do it and it kinda made sense. Yes, it's poking at the /dev files manually, which is fine with me, I just didn't see any explanation of how to encode the curve. Haven't had time to follow up. But! That is good news. radeon-profile should be able to implement the new fan-curve interface now that it's available in the upstream kernel. I'm running kernel 6.7rc4 now, so I should be able to poke at it. Just need time. I'd be happy to test for radeon-profile though!!! |
Ah, yes, I found what I was thinking about! Here is a little bit of detail about how to use the new fan curve settings: https://gitlab.freedesktop.org/drm/amd/-/issues/2402#note_2184713 And, as someone there pointed out, you can implement something like manual control by setting a single fan curve point with a low temp threshold. Make sense. Hmm, that seems like something I could try right now...Argh, sorry, must do work! must stay focused! Maybe during the holiday break. Good news everyone! |
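The "something like manual control" idea can be sketched as a flat curve. This is only an illustration under assumptions that do not come from this thread: that the interface accepts lines of the form "index temperature speed", that there are five anchor points indexed 0 to 4, and that writing "c" commits the change. The temperatures are arbitrary placeholders, and this variant sets every anchor point to the same speed rather than a single point. The code only builds the strings; it does not touch the hardware.

```python
def flat_fan_curve(speed_pct: int, temps_c=(25, 30, 35, 40, 45)) -> list[str]:
    """Build the strings for a curve that holds one fan speed at every anchor point.

    Assumed format (not confirmed in this thread): "<index> <temp C> <speed %>",
    followed by "c" to commit.
    """
    if not 0 <= speed_pct <= 100:
        raise ValueError("speed_pct must be a percentage between 0 and 100")
    lines = [f"{index} {temp} {speed_pct}" for index, temp in enumerate(temps_c)]
    lines.append("c")  # assumed commit command
    return lines


for line in flat_fan_curve(60):
    print(line)
```

Each string would then be written to the fan curve file as its own write. The card may still reject values outside the range it reports.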
Tried this for the past three days through both Mainline and AMD-DRM-Next kernels. It's a great idea and something I've done in the past with other cards; however, it doesn't seem to do squat, at least not on the 7900 Taichi. You can absolutely mess with the fan curve through LACT, FanControl-gui, even CoreCtrl (radeon-profile fan controls remain greyed out), but they don't seem to do anything. Others had luck setting a curve via echo to /sys/class/drm/card#/device/gpu_od/fan_ctrl/fan_curve (card# being 0 or 1 depending), but mine never took, whether it was that way or through Neovim, despite having amdgpu.ppfeaturemask=0xffffffff set. It's been an aggravating weekend lol. |
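Since the card number varies between systems, here is a small sketch for locating the fan curve file mentioned above. The glob pattern mirrors the path quoted in the comment; the function name and the `sysfs_root` parameter are hypothetical and exist only so the lookup can be tried against a test directory.

```python
from pathlib import Path


def find_fan_curve_files(sysfs_root: str = "/sys") -> list[Path]:
    """Return every card's fan_curve file found under the given sysfs root."""
    pattern = "class/drm/card[0-9]*/device/gpu_od/fan_ctrl/fan_curve"
    return sorted(Path(sysfs_root).glob(pattern))


if __name__ == "__main__":
    paths = find_fan_curve_files()
    if not paths:
        print("no gpu_od/fan_ctrl/fan_curve found")
    for path in paths:
        print(path)
```

An empty result means the driver is not exposing the interface on that system, which is a different problem from a curve that is accepted but ignored.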
So what's the current state of affairs? I am considering "upgrading" to Kernel 6.7 but need to know beforehand if fan control works on rdna3. |
@acheronte No idea about this repo, but I stumbled upon this issue searching for the new The new Bear in mind I am a random user and not an expert. Just make sure you monitor temps after testing. To my knowledge, all you need is:
To my knowledge this is purely runtime. Resetting the GPU or rebooting should:tm: clear it, but I am not 100% sure. If you get an error you can consult To clarify, those are not files you can edit through a text editor! They don't behave anything like normal files, the kernel does whatever it wants with reads and writes, same as other sysfs tunables. |
The solution above worked awesome for me. I decided to write a script for everyone else running arch (worked on garuda, not sure what would change on other flavors) to do this very easily. |
Have RX 7600 Pulse OC card from Sapphire, experiencing the same issue. Are there any plans to implement fan curve control for RDNA3 arch in this software? |
I am running Arch with qtile; my GPU is an RX 7900 XT. I added amdgpu.ppfeaturemask=0xffffffff to the grub boot config, but I still can't find /sys/class/drm/card#/device/gpu_od. I don't have the gpu_od directory. Does anyone know if I missed something or how to do this? |
Works for me. You're on Linux >=6.7, right? |
running 6.6.22-1-lts. |
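Going by the replies above, the gpu_od directory is only expected on Linux 6.7 or newer, so a 6.6 LTS kernel would not have it. A minimal sketch of that version check, assuming the release string starts with "major.minor" as in the examples in this thread; the function name is made up:

```python
import re


def kernel_at_least(release: str, minimum=(6, 7)) -> bool:
    """Compare the major.minor part of a kernel release string to a minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


print(kernel_at_least("6.6.22-1-lts"))      # False
print(kernel_at_least("6.10.0-3-MANJARO"))  # True
```

On a running system the release string can be obtained with `platform.release()` from the Python standard library.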
My fan is still at 0 rpm even though my temps are in the high 40s. They do work on Windows, and they also worked once on Linux; I don't know what happened for it to spin, but then it never did again. |
The 0rpm fan behavior is unclear to me and it seems possible that it is dealt with independently from the fan curve... For instance, I've noticed that if I set a custom curve, the fan always ramps up then down after a while even though the card and hotspot are way cool enough. I haven't found a way to control this behavior. |
I just ran a gpu stress test, the fan ramps up but the junction temp is already at 90ish c. I'm not sure but I read somewhere that some cards change the fan level based on usage not on temp. I'll try to test this. |
I've faced the same issue. Behavior is a bit odd, but it works (somehow) |
Yes, this is the incredibly annoying decision by AMD to introduce a "quiet" zone to the fan curves. No idea how to fix that. If you go into Windows you will notice that the Adrenalin software will allow you to disable or tweak the quiet zone temperature, below which the fans do not run until they reach that threshold. I assume that the quiet zone is the default for the card settings since we all have that issue at 40~something degrees. |
Today I was able to run curve adjustment on Fedora 40 Kernel 6.8+
The parameters above need to be added separated by a space in /etc/default/grub
Next, I set the fan curve using CORECTRL and it worked!!!))) |
Thanks for reporting in @Banditman74. My personal solution, and it might not apply to everyone, was to ditch Linux altogether and use Windows + WSL2.0, to have the best of both worlds. AMD Adrenalin works on Windows and allows me to set a custom fan curve with a few clicks, while WSL allows me to do Linux stuff like writing code and terminal commands, without dual booting or VMs. I have since deleted the Ubuntu partition from my computer. I don't have the time or patience to be tinkering with OS settings as I used to, I have opted for the path of least resistance, so I can focus on doing work on my machine, rather than fight against it. |
For me it was an open gestalt))) |
I haven't used
|
Unfortunately, at least for me, the fan curve simply gets ignored. I get that below 50 degrees the fans won't spin no matter what I do, but I was at least expecting that beyond that threshold I could set up a fan curve. Of course I added ppfeaturemask to grub, but still, it's not working. Kernel: 6.10.0-3-MANJARO; I also tried various kernels 6.6 - 6.9, all with the same behaviour. EDIT: I'm using a 7600 XT, not the 7900, but I think this might apply to all RDNA3 cards |
I feel like the 0rpm behavior might be independent from the fan curve. Does it work if you e.g. have 100% fan speed set and get a high enough temp? Also, I am not 100% sure, but I think the fan curve is governed by the hotspot temperature, not the edge temp. I don't think I've reliably seen the threshold to be 50°C on either the hotspot or edge, but it might be OEM-dependent. |
The fan kicks in at about 65-67 degrees at hotspot, not at core |
Yes, I tried both 35% fixed and 100% fixed while running FurMark 2. Either way, the built-in curve remains applied, leading to a steady fan speed increase while running... |
Can you try using the commandline way mentioned earlier in the thread? Do you get any error on one of the echo commands? Normally, when applying fails for a reason or another with the new sysfs API, it gets logged in |
First of all, thanks for replying that fast and pointing me to these interesting logs! Just tried it, and while the first "echo" (25% fan speed) led to an error (in terminal: Invalid argument, in dmesg: pwm fan curve setting (25) must be within [35,100]), changing the line to 35 led to no errors. Nevertheless, the fan curve gets completely ignored; see attached screenshot (blue is the fan; you can see its speed increasing constantly, as opposed to the "fixed 35%" that I set up). So still, it's not working... |
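The "must be within [35,100]" error suggests checking the card's allowed range before writing a point. Below is a rough sketch that parses the range out of the text read back from the fan curve file. The readout layout in `SAMPLE` is an assumption about what the driver prints, not output captured from this thread, and the function names are hypothetical.

```python
import re

# Assumed layout of the fan_curve readout; the real text may differ per card and kernel.
SAMPLE = """OD_FAN_CURVE:
0: 0C 0%
1: 0C 0%
2: 0C 0%
3: 0C 0%
4: 0C 0%
OD_RANGE:
FAN_CURVE(hotspot temp): 25C 100C
FAN_CURVE(fan speed): 35% 100%
"""


def fan_speed_range(readout: str) -> tuple[int, int]:
    """Extract the (min, max) fan speed percentages from the OD_RANGE section."""
    match = re.search(r"FAN_CURVE\(fan speed\):\s*(\d+)%\s+(\d+)%", readout)
    if match is None:
        raise ValueError("no fan speed range found in readout")
    return int(match.group(1)), int(match.group(2))


def speed_allowed(speed_pct: int, readout: str) -> bool:
    """Return True if the requested speed lies inside the reported range."""
    low, high = fan_speed_range(readout)
    return low <= speed_pct <= high


print(fan_speed_range(SAMPLE))    # (35, 100)
print(speed_allowed(25, SAMPLE))  # False
print(speed_allowed(35, SAMPLE))  # True
```

A value that passes this check can still be ignored by the card, as reported above; the check only avoids the "Invalid argument" rejection.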
Interesting. I still don't have any |
I am not really sure, but do make sure you're checking the hotspot temp. I feel like the one you're plotting is the edge temp, which is sometimes significantly lower, in a very workload-dependent way. I don't really recall if corectrl allows you to plot that but |
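To compare edge and hotspot readings without relying on a plotting tool, the hwmon files can be read directly. A minimal sketch, assuming the card's hwmon directory contains tempN_label / tempN_input pairs (inputs are in millidegrees Celsius) and that amdgpu labels them along the lines of "edge" and "junction"; the labels and the example directory path are assumptions.

```python
from pathlib import Path


def read_temperatures(hwmon_dir: str) -> dict[str, float]:
    """Map each temperature label in a hwmon directory to its reading in Celsius."""
    temps = {}
    for label_file in sorted(Path(hwmon_dir).glob("temp*_label")):
        input_file = label_file.with_name(label_file.name.replace("_label", "_input"))
        if not input_file.exists():
            continue
        label = label_file.read_text().strip()
        millidegrees = int(input_file.read_text().strip())
        temps[label] = millidegrees / 1000
    return temps


if __name__ == "__main__":
    # Hypothetical path; the hwmon number differs between systems.
    print(read_temperatures("/sys/class/drm/card0/device/hwmon/hwmon0"))
```

If the junction (hotspot) value runs well above the edge value under load, that would be consistent with the fan reacting to a temperature other than the one being plotted.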
It's not surprising to me that stuff like fan curves, power limits, etc. are gated behind a kernel commandline flag. I think this was the case for the prior interfaces as well. |
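One way to confirm that the kernel flag actually took effect is to test the OverDrive bit in the active feature mask. This sketch assumes the bit is 0x4000 (my recollection of PP_OVERDRIVE_MASK in the amdgpu headers) and that the running value can be read from /sys/module/amdgpu/parameters/ppfeaturemask; treat both as assumptions to verify on your own system.

```python
PP_OVERDRIVE_MASK = 0x4000  # assumption: OverDrive bit as defined in the amdgpu headers


def overdrive_enabled(mask_text: str) -> bool:
    """Return True if the OverDrive bit is set in a ppfeaturemask value."""
    # int(..., 0) accepts both "0x..." hex strings and plain decimal strings.
    return bool(int(mask_text.strip(), 0) & PP_OVERDRIVE_MASK)


print(overdrive_enabled("0xffffffff"))  # True
print(overdrive_enabled("0xfff7bfff"))  # False
```

If the bit reads as unset after a reboot, the bootloader configuration was probably not regenerated or the parameter was misspelled.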
radeon-profile seemed to work great with my 5700XT and 6800XT, but seems to have no effect at all on my 7900. Is anyone else seeing this? Wouldn't be a problem except that it seems to be running hot. That was a problem with my 5700XT; the onboard/BIOS fan curve was way too shallow.
Is there something I can do to help?