Odroid XU4: wrong temperature reading shuts down CPU cores

After the last upgrade to v145, the second CPU group has been disabled because of an incorrect temperature value registered in a file.
If I run the cpu command:
root@odroid:/home/odroid# cpu
cat: /sys/devices/system/cpu/cpu4/cpufreq/scaling_cur_freq: No such file or directory
cat: /sys/devices/system/cpu/cpu5/cpufreq/scaling_cur_freq: No such file or directory
cat: /sys/devices/system/cpu/cpu6/cpufreq/scaling_cur_freq: No such file or directory
cat: /sys/devices/system/cpu/cpu7/cpufreq/scaling_cur_freq: No such file or directory

─────────────────────────────────────────────────────
DietPi CPU Info
Use dietpi-config to change CPU / performance options
─────────────────────────────────────────────────────
Architecture | armv7l
Temp | Warning: 125’c | Reducing the life of your device.
Governor | conservative
Throttle up | 65% CPU usage

       Current Freq   Min Freq   Max Freq
CPU0 | 200 MHz        200 MHz    1400 MHz
CPU1 | 200 MHz        200 MHz    1400 MHz
CPU2 | 200 MHz        200 MHz    1400 MHz
CPU3 | 200 MHz        200 MHz    1400 MHz

CPUs 4 to 7 are offline because of the temperature value stored in:
root@odroid:/home/odroid# cat /sys/devices/virtual/thermal/thermal_zone0/temp
125000

root@odroid:/home/odroid# cat /sys/class/thermal/thermal_zone0/temp
125000

but the correct values from the thermal sensors are:
cat /sys/devices/10060000.tmu/temp
sensor0 : 0
sensor1 : 0
sensor2 : 58000
sensor3 : 54000
sensor4 : 54000



I suppose the process that writes /sys/devices/virtual/thermal/thermal_zone0/temp (or /sys/class/thermal/thermal_zone0/temp) from /sys/devices/10060000.tmu/temp has an error and cannot collect the proper temperature.
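To make the mismatch described above easy to re-check, here is a minimal sketch that compares the thermal zone reading with the hottest TMU sensor. It assumes the two file formats shown in this thread (a single number in millidegrees, and lines like "sensor2 : 58000"). The function names and the 5000 millidegree tolerance are hypothetical choices for illustration, not part of DietPi or the kernel.

```shell
#!/bin/sh
# Sketch only: function names and the default tolerance are assumptions.

# Print the highest value found in a file with lines like "sensor2 : 58000".
max_sensor_temp() {
    awk -F':' 'NF >= 2 { v = $2 + 0; if (v > max) max = v } END { print max + 0 }' "$1"
}

# Compare the zone reading ($1) with the hottest TMU sensor ($2).
# Prints "OK ..." when they agree within the tolerance ($3, millidegrees),
# "MISMATCH ..." otherwise.
compare_temps() {
    zone=$(cat "$1")
    tmu=$(max_sensor_temp "$2")
    tol=${3:-5000}
    diff=$((zone - tmu))
    [ "$diff" -lt 0 ] && diff=$((-diff))
    if [ "$diff" -le "$tol" ]; then
        echo "OK zone=$zone tmu_max=$tmu"
    else
        echo "MISMATCH zone=$zone tmu_max=$tmu"
    fi
}

# Example on the device:
# compare_temps /sys/class/thermal/thermal_zone0/temp /sys/devices/10060000.tmu/temp
```

With the values pasted above (zone 125000, hottest sensor 58000) this would report a mismatch.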


I cannot find the service/process that controls the fan.
Any help would be appreciated

Hi,

Thanks for the report. Looks like an issue with the kernel.

Let's check your current kernel version, please paste the results:

uname -a

Here is mine for reference:

root@DietPi:~# uname -a
Linux DietPi 3.10.104+ #1 SMP PREEMPT Tue Feb 21 14:20:54 CET 2017 armv7l GNU/Linux

If you're running x.104, let's try reinstalling the kernel/image:

apt-get install --reinstall linux-image-armhf-odroid-xu3 linux-image-3.10.104+

When prompted with a warning about overwriting the current kernel, say OK.

Same image as yours:
root@odroid:/home/odroid# uname -a
Linux odroid 3.10.104+ #1 SMP PREEMPT Tue Feb 21 14:20:54 CET 2017 armv7l GNU/Linux

It also happened on Linux odroid 3.10.103, which I upgraded from some days ago.

If I reboot the system, still only CPUs 0-3 are active, due to a persistently stored temperature value, I suppose.

Some days ago I had to remove my eMMC card and insert a new microSD with a fresh DietPi. Then all cores were active.
I then removed the microSD and put the same eMMC card back, and all cores were active again.

Very mysterious problem

Weird lol :frowning:

Just wondering if this is a possible hardware issue, very strange results and not something I’ve personally seen, or had our users report.

During these tests, did you apply the eMMC/SD switch as required as well?

Also, which PSU are you running (e.g. 5V/4A)? And with the CloudShell add-on or not?

Yes, I set the switch to SD card with the SD card inserted, and back to eMMC when the SD card was removed and the eMMC inserted.

No CloudShell used, and the power supply is the original Odroid one. I'm not sure, but I recall it is 5V/4A.
I will check it tonight.

I will try shutting it down for some minutes and powering it on again to verify whether it could be a hardware issue. :frowning:

Other options are:
1- trying to set the fan to manual mode so I can operate it myself. Something like https://github.com/nthx/odroid-xu3-fan-control
2- a fresh install of DietPi

Last night I shut down the Odroid.
I waited 1 minute and powered it on with the button: same result, 4 cores off.

After another shutdown, I unplugged the power cord, waited 3 minutes and plugged it back in, and all 8 cores were on.

That makes me think the Odroid retains some state in the cores that prevents them from being powered on.

I have limited the CPU speed to 1.4 GHz on all cores to keep the temperature down. I think that if the temperature rises to 100 °C, 4 cores are switched off, and
the kernel does not re-read the temperature values to power the offline cores back on once the temperature is normal again.
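One way to test this hypothesis without a power cycle is to look at the CPU hotplug files in sysfs and try to bring the cores back by hand. The sketch below uses the standard /sys/devices/system/cpu/cpuN/online interface. The function names are hypothetical, and whether the 3.10.104+ kernel accepts the write while its thermal driver still reports 125000 is an assumption I cannot confirm.

```shell
#!/bin/sh
# Sketch only: function names are assumptions; run as root on the device.

# List CPUs whose "online" file reads 0 under a sysfs CPU directory.
# (cpu0 often has no "online" file and is skipped automatically.)
offline_cpus() {
    base=${1:-/sys/devices/system/cpu}
    for f in "$base"/cpu[0-9]*/online; do
        [ -e "$f" ] || continue
        if [ "$(cat "$f")" = "0" ]; then
            d=${f%/online}
            echo "${d##*/}"
        fi
    done
}

# Try to bring every offline CPU back by writing 1 to its "online" file.
bring_online() {
    base=${1:-/sys/devices/system/cpu}
    for c in $(offline_cpus "$base"); do
        echo 1 > "$base/$c/online" || echo "could not online $c" >&2
    done
}

# Example on the device:
# offline_cpus && bring_online && offline_cpus
```

If the cores go offline again right after the write, that would suggest the kernel is still acting on the stale temperature value rather than remembering an old state.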

Really strange how 125’c is being reported, yet the sensors say otherwise.

On my XU4, looks like sensor 2 is CPU probe?

root@DietPi:~# cat /sys/class/thermal/thermal_zone0/temp; cat /sys/devices/10060000.tmu/temp
42000
sensor0 : 37000
sensor1 : 40000
sensor2 : 42000
sensor3 : 39000
sensor4 : 37000



I suppose the process that writes /sys/devices/virtual/thermal/thermal_zone0/temp (or /sys/class/thermal/thermal_zone0/temp) from /sys/devices/10060000.tmu/temp has an error and cannot collect the proper temperature.

Not sure what sensors 0/1 are for, but yours appear to report no reading.

It's as if your device is skipping thermal throttling (90-95’c) and going straight to emergency core shutdown (125’c).
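To see which thresholds the kernel actually has configured, the trip points of the thermal zone can be listed. This is a minimal sketch using the generic thermal sysfs files trip_point_N_temp and trip_point_N_type. The function name is hypothetical, and it is an assumption that the 3.10 Odroid kernel exposes these files for thermal_zone0.

```shell
#!/bin/sh
# Sketch only: the function name is an assumption, and the trip point files
# may not exist on every kernel.

# Print "<type> <temp>" for every trip point found in a thermal zone directory.
list_trip_points() {
    zone=${1:-/sys/class/thermal/thermal_zone0}
    for t in "$zone"/trip_point_*_temp; do
        [ -e "$t" ] || continue
        type_file=${t%_temp}_type
        kind=unknown
        [ -r "$type_file" ] && kind=$(cat "$type_file")
        echo "$kind $(cat "$t")"
    done
}

# Example on the device:
# list_trip_points /sys/class/thermal/thermal_zone0
```

Values are in millidegrees Celsius, like the temp file itself, so they can be compared directly with the readings pasted in this thread.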

I cannot find the service/process that controls the fan.
Any help would be appreciated

http://forum.odroid.com/viewtopic.php?f=52&t=16308

I would personally test another image (even the official HK Ubuntu). If issues persist and the heatsink has good contact with the CPU, it looks like a possible failed CPU/hardware issue :frowning:

root@odroid:/home/odroid# cat /sys/class/thermal/thermal_zone0/temp; cat /sys/devices/10060000.tmu/temp
56000
sensor0 : 51000
sensor1 : 51000
sensor2 : 56000
sensor3 : 52000
sensor4 : 51000

Less stress today, CPU max speed set to 1400 MHz.

Looks good. Maybe your CPU just runs really hot (fabrication variance affects how much heat a CPU produces, although chips are usually tested to fall within a specific range), or thermal throttling is having no effect :thinking:

Either way, I'd personally recommend upgrading the heatsink or re-seating it.

Will do it in the future. Thanks!!

Finally did a fresh install and everything looks fine.
All cores are on and at a normal temperature!!