Hi,
My system sometimes crashes suddenly and reboots itself. It happens at random (browsing the web, idling, checking mail), and I couldn’t find a trigger. This is the only log I could find about the crash:
mce: [Hardware Error]: CPU 1: Machine Check: 0 Bank 5: baa0000000030150
microcode: CPU23: patch_level=0x0a201025
fbcon: Taking over console
mce: [Hardware Error]: TSC 0 MISC d012000100000000 SYND 4d000002 IPID 500b000000000
mce: [Hardware Error]: PROCESSOR 2:a20f10 TIME 1689019332 SOCKET 0 APIC 2 microcode a201025
EDIT: my thermals are fine btw, 40C at idle and 70C at max on heavy tasks
Woohoo, an MCE. If it’s always the same core, you could disable it with something like ‘echo 0 > /sys/devices/system/cpu/cpu3/online’
This would have to be run as root on every boot, since the setting doesn’t persist. There may be kernel options to do the same thing.
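For anyone who wants to try the sysfs trick, here is a rough sketch of it as a small shell function. The function name set_cpu_online and the SYSFS_CPU_ROOT override are made up for illustration (the override just lets you dry-run against a scratch directory); the only real interface used is the cpuN/online file under /sys/devices/system/cpu. Which logical CPU number to pass is an assumption you need to check against your own logs, and with SMT enabled one physical core usually shows up as two logical CPUs.

```sh
#!/bin/sh
# Sketch: take one logical CPU offline (or back online) via sysfs.
# Needs root on a real system. SYSFS_CPU_ROOT is a hypothetical
# override for dry runs, not something the kernel knows about.

set_cpu_online() {
    # $1 = logical CPU number, $2 = 0 (offline) or 1 (online)
    root="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"
    ctl="$root/cpu$1/online"

    # Some CPUs (often cpu0) have no 'online' file and cannot be offlined.
    if [ ! -e "$ctl" ]; then
        echo "cpu$1 has no 'online' control file" >&2
        return 1
    fi

    echo "$2" > "$ctl"
}
```

Usage would be something like ‘set_cpu_online 3 0’ from a root shell or a boot-time script, and ‘set_cpu_online 3 1’ to bring the CPU back.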
Lol.
This is barbaric and I love it.
Lol those cores are totally there for redundancy… Right? :P
I have an old Itanium server that ‘boots’ with like 3/8 working cores… Unfortunately the hardware has some other unknown issues that panic Linux shortly after loading. Somehow the EFI system seems to be stable…
TIL
I can save like 20 W per real core. Nice tip for a home server.