Half speed m.2 SSD vs Raspberry OS x64

After using DietPi on a 2 GB RPi 4, I absolutely loved the setup. Great job there. I purchased an Argon One M.2 case and thought DietPi would be the bee’s knees on it, so I installed it. The 2 GB RPi 4 has a SanDisk Extreme Pro 128 GB microSD card, and when running the benchmarks, it was 10 MB/s faster than my M.2 SSD, which I thought was curious. So I found a script at pibenchmarks.com that benchmarks drive I/O, and ran it on both DietPi and Raspberry Pi OS x64. DietPi scores 3831 (stock clocks) and RPi OS scores 7405 (stock clocks). I have the DietPi results up, so I’ll paste them here:
Category  Test                  Result
HDParm    Disk Read             28.17 MB/s
HDParm    Cached Disk Read      23.17 MB/s
DD        Disk Write            26.8 MB/s
FIO       4k random read        3825 IOPS (15300 KB/s)
FIO       4k random write       6907 IOPS (27629 KB/s)
IOZone    4k read               10310 KB/s
IOZone    4k write              19369 KB/s
IOZone    4k random read        7851 KB/s
IOZone    4k random write       21023 KB/s

Score: 3831

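Here are the results of RPi OS:
Category  Test                  Result
HDParm    Disk Read             133 MB/s
HDParm    Cached Disk Read      264.83 MB/s
DD        Disk Write            259.95 MB/s
FIO       4k random read        10199 IOPS (40796 KB/s)
FIO       4k random write       13671 IOPS (54686 KB/s)
IOZone    4k read               31897 KB/s
IOZone    4k write              26847 KB/s
IOZone    4k random read        18449 KB/s
IOZone    4k random write       31511 KB/s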
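
To put the two result sets side by side, here is a minimal Python sketch that computes the per-test ratio between them. The numbers are copied from the two posts above; the row labels for the RPi OS values are an assumption (they are taken to follow the same row order as the DietPi table), and the helper name `speedup` is made up for illustration.

```python
# Results as posted above. Units are MB/s, IOPS or KB/s as noted in each label.
# ASSUMPTION: the RPi OS values follow the same row order as the DietPi table.
dietpi = {
    "HDParm Disk Read (MB/s)": 28.17,
    "HDParm Cached Disk Read (MB/s)": 23.17,
    "DD Disk Write (MB/s)": 26.8,
    "FIO 4k random read (IOPS)": 3825,
    "FIO 4k random write (IOPS)": 6907,
    "IOZone 4k read (KB/s)": 10310,
    "IOZone 4k write (KB/s)": 19369,
    "IOZone 4k random read (KB/s)": 7851,
    "IOZone 4k random write (KB/s)": 21023,
}
rpi_os = {
    "HDParm Disk Read (MB/s)": 133,
    "HDParm Cached Disk Read (MB/s)": 264.83,
    "DD Disk Write (MB/s)": 259.95,
    "FIO 4k random read (IOPS)": 10199,
    "FIO 4k random write (IOPS)": 13671,
    "IOZone 4k read (KB/s)": 31897,
    "IOZone 4k write (KB/s)": 26847,
    "IOZone 4k random read (KB/s)": 18449,
    "IOZone 4k random write (KB/s)": 31511,
}


def speedup(fast, slow):
    """Return {test: fast/slow} for every test present in `slow`."""
    return {name: fast[name] / slow[name] for name in slow}


ratios = speedup(rpi_os, dietpi)
score_ratio = 7405 / 3831  # overall scores from the first post

for name, ratio in ratios.items():
    print(f"{name:32s} {ratio:6.2f}x")
print(f"{'Overall score':32s} {score_ratio:6.2f}x")
```

The overall score differs by roughly a factor of two, while the individual tests differ by quite varying factors, so it can be worth looking at which rows diverge the most rather than only at the score.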

I’m mainly wondering if I’m missing something simple. I adore dietpi, it suits my needs amazingly well, but I’m using an SSD, and I would like to get the max performance out of it. I searched the forums for similar issues, found one about the PSU, mine is a 3.5A wall wart, and I’d think if it were the issue, it would happen in both OSes.

Here are the benchmark links, if they’re helpful. I used this benchmark for consistency.

Raspberry Pi OS: https://pibenchmarks.com/benchmark/59530
DietPi: https://pibenchmarks.com/benchmark/59555

Thanks for any ideas.

Just for clarification: you are using the very same setup for both tests, meaning the same hardware and so on? And you used the DietPi ARMv8 64-bit image to compare against RPi OS 64-bit?

Just asking because, in the end, DietPi is nothing more than RPi OS with a reduced set of packages and some scripting on top.

MichaIng
FYI

Yes sir. Same system, pulled the SSD out, flashed, booted, tested, repeat. The reason I posted is because I dug around trying to find an explanation, everywhere I read it stated ‘dietpi is RasPi OS x64 with scripts.’ I don’t understand the anomaly. The aforementioned 2gb pi with the SD card is faster using the built in bench than my 4gb with a SSD. It isn’t unusable, it’s just half the speed of the same setup as RasPi OS x64. I really want to stick with dietpi. I was just hoping it was some package I needed to install or a config file that isn’t perfect.

You still did not answer the question: did you use the DietPi ARMv8 image? We offer 3 different versions for RPi SBCs, so it’s important to use the correct one. Also, did you update all packages so that both systems have the same software versions?

My unzipped download is DietPi-ARMv8-Bullseye.

If you’re asking if I ran apt update and upgrade, yes. Before the benchmark. For both OSes.

There seems to be a difference in how the CPU was set up or behaving:

RPi OS

CPU:
  Info: model: N/A variant: cortex-a72 bits: 64 type: MCP arch: ARMv8 family: 8 model-id: 0
    stepping: 3
  Topology: cpus: 1x cores: 4 smt: N/A cache: N/A
  Speed (MHz): avg: 1500 min/max: 600/1500 scaling: driver: cpufreq-dt governor: ondemand cores:
    1: 1500 2: 1500 3: 1500 4: 1500 bogomips: 432

DietPi

CPU:
  Info: model: N/A variant: cortex-a72 bits: 64 type: MCP arch: ARMv8 family: 8 model-id: 0
    stepping: 3
  Topology: cpus: 1x cores: 4 smt: N/A cache: N/A
  Speed (MHz): avg: 700 min/max: 600/1500 scaling: driver: cpufreq-dt governor: schedutil cores:
    1: 700 2: 700 3: 700 4: 700 bogomips: 432

Looks like the frequency is stuck at 700 MHz on DietPi. The difference is governor: ondemand vs governor: schedutil.
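
For anyone who wants to watch the governor and clock while a benchmark is running, here is a minimal Python sketch that reads the standard Linux cpufreq files under /sys/devices/system/cpu. The function name `read_cpufreq` and its `sysfs_root` parameter are made up for illustration; the parameter only exists so the function can be pointed at a different directory.

```python
from pathlib import Path


def read_cpufreq(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: (governor, current_MHz)} for each core exposing cpufreq."""
    result = {}
    for cpufreq in sorted(Path(sysfs_root).glob("cpu[0-9]*/cpufreq")):
        governor = (cpufreq / "scaling_governor").read_text().strip()
        # scaling_cur_freq is reported in kHz
        cur_khz = int((cpufreq / "scaling_cur_freq").read_text())
        result[cpufreq.parent.name] = (governor, cur_khz / 1000)
    return result


if __name__ == "__main__":
    for cpu, (governor, mhz) in read_cpufreq().items():
        print(f"{cpu}: governor={governor} current={mhz:.0f} MHz")
```

Running it in a loop in a second terminal during the benchmark would show whether the clock actually ramps up under load or stays near the minimum.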

Nice catch, I’ll change that in the config and give it a run. Will post back with results.

Thanks!

The schedutil governor is the modern replacement for the ondemand, conservative and interactive governors. It changes the clock much more rapidly, in incremental steps, based on usage obtained directly from the CPU scheduling driver. That makes it, at least in theory, much more efficient than ondemand, which switches directly to the max clock when a given CPU load is hit within a given timeframe, and back after a given threshold. It would be interesting to know whether this negatively affects real-world disk I/O; the benchmark results are indeed worse by some factors. Can you change it to ondemand via the dietpi-config performance options and rerun the benchmark? A run with performance (fixed max clock) would be interesting as well.
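
As a rough illustration of the difference described above, here is a toy Python model of how the two governors pick a target frequency. It is a simplified sketch, not the kernel code: the schedutil part uses the documented idea of scaling the max frequency by utilisation with 25% headroom, the ondemand part jumps to max above a threshold and scales proportionally below it, and the 80% threshold, the function names and the 600/1500 MHz defaults (taken from the inxi output above) are assumptions for illustration.

```python
def schedutil_target(util, min_mhz=600, max_mhz=1500, headroom=1.25):
    """Toy schedutil: target scales with utilisation (0.0..1.0) plus headroom."""
    wanted = headroom * max_mhz * util
    return max(min_mhz, min(max_mhz, wanted))


def ondemand_target(load, min_mhz=600, max_mhz=1500, up_threshold=0.80):
    """Toy ondemand: jump straight to max above the threshold (assumed 80%)."""
    if load > up_threshold:
        return max_mhz
    return min_mhz + load * (max_mhz - min_mhz)


if __name__ == "__main__":
    for pct in (10, 30, 50, 70, 90):
        u = pct / 100
        print(f"load {pct:3d}%: schedutil {schedutil_target(u):7.1f} MHz, "
              f"ondemand {ondemand_target(u):7.1f} MHz")
```

In this model an I/O-bound benchmark that keeps the CPU mostly idle would leave both governors near the low end of the range, which is one reason the fixed-clock performance governor is a useful comparison point.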

Indeed I will. I made sure the governor was schedutil and my clocks were still default. Ran the bench again, similar results. I’m going to change to your request and I’ll post those results.

This is ondemand: https://pibenchmarks.com/benchmark/59559
This is performance: https://pibenchmarks.com/benchmark/59560

They scored identical.

RasPi OS x64: https://pibenchmarks.com/benchmark/59530

Just to keep things a bit tidy. I would like you both to know, I love the OS, I’m not complaining about the performance, it’s definitely usable. I just found it odd, between the two. Thank you both for your time.

The thing is that kernel and firmware are identical, so it is quite unexpected that performance differs at all. Because of its lower base RAM usage and fewer background jobs, DietPi is expected to perform a little better, as long as processes on RPi OS are not stopped.

One thing I have never benchmarked is that we have ext4 checksumming enabled (default on Debian also when formatting via mkfs), while RPi OS has this, AFAIK, disabled. This implies some additional computation and disk I/O, but shouldn’t cause such a dramatic difference. It reduces the chance for filesystem corruption, so generally worth it.

Another thing is that RPi OS has arm_boost=1 set in /boot/config.txt, which raises the max CPU frequency on RPi 400 and most 4B models to 1800 MHz. This is not visible in your benchmark, but it is at least worth trying on DietPi. This is actually something we may add by default as well, but it needs adjustments to the overclocking profiles. You could also monitor and compare CPU usage during the benchmarks.

And DietPi runs an APT update at boot. Please ensure that this is no longer running (e.g. via htop) when doing the benchmark.
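
Besides eyeballing htop, the same check can be scripted. Below is a minimal Python sketch that scans /proc for processes whose name looks like a package tool. The set of names in `PACKAGE_TOOLS` is an assumption for illustration (DietPi’s own update script may show up under a different name), and `running_package_tools` is a made-up helper name.

```python
from pathlib import Path

# ASSUMPTION: process names worth waiting for; extend as needed.
PACKAGE_TOOLS = {"apt", "apt-get", "dpkg"}


def running_package_tools(proc_root="/proc", names=PACKAGE_TOOLS):
    """Return a sorted list of (pid, name) for running processes named in `names`."""
    found = []
    for comm in Path(proc_root).glob("[0-9]*/comm"):
        try:
            name = comm.read_text().strip()
        except OSError:
            continue  # process exited while we were scanning
        if name in names:
            found.append((int(comm.parent.name), name))
    return sorted(found)


if __name__ == "__main__":
    busy = running_package_tools()
    if busy:
        print("Still running:", busy)
    else:
        print("No package tools running, OK to start the benchmark.")
```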

Another option to try would be:

  1. install plain RPi OS
  2. perform all possible updates
  3. run the benchmark
  4. use dietpi-installer to get DietPi on top of the existing RPi OS: https://dietpi.com/docs/hardware/#make-your-own-distribution
  5. run the benchmark again

That would “remove” the checksumming indeed. I think it can also be done from an external system via tune2fs.

I’m going to try removing checksumming, just to see if that’s the cause. If it is, we will all have the data. I’m just tinkering, I like this kinda stuff. I have a bunch of Linux machines, I will pull the drive and test it once I’m back home from work. Definitely will post the results. Y’all are awesome, keep it up!

Edit: I’m going to add arm_boost to the config first, since I don’t have to pull the drive, run the bench and log that data, too.

Well, I have done the testing, and I don’t see any changes. To be 100% fair, I’m not entirely sure how to check whether metadata checksumming is disabled: the terminal did its thing, but there wasn’t a progress bar or any status displayed. I ran debugfs, and every group does have a checksum at the tail, but I’m pretty sure that’s unrelated to checksumming being enabled or disabled; that’s a journal thing. I’ll attach the results of the benchmark below. I’ll continue to run it as is, unless y’all want to continue to try and figure it out. This pi is my tinkertoy, so I have no issues breaking the OS and reinstalling. Thank y’all, again.

Arm_boost=1: https://pibenchmarks.com/benchmark/59587
Arm_boost=1 and disabled checksumming (allegedly): https://pibenchmarks.com/benchmark/59593
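
On the question of how to verify whether metadata checksumming is really off: `tune2fs -l <device>` prints a "Filesystem features:" line, and the feature is active if `metadata_csum` appears in it. Here is a minimal Python sketch that parses that line. The device path `/dev/sda2` is only a placeholder, the helper names are made up, and the command needs root.

```python
import subprocess


def ext4_features(tune2fs_output):
    """Extract the feature names from the output of `tune2fs -l`."""
    for line in tune2fs_output.splitlines():
        if line.startswith("Filesystem features:"):
            return set(line.split(":", 1)[1].split())
    return set()


def has_metadata_csum(device):
    """Run `tune2fs -l` on the device (needs root) and check for metadata_csum."""
    out = subprocess.run(
        ["tune2fs", "-l", device],
        capture_output=True, text=True, check=True,
    ).stdout
    return "metadata_csum" in ext4_features(out)


if __name__ == "__main__":
    device = "/dev/sda2"  # PLACEHOLDER: replace with the actual root partition
    print(device, "metadata_csum enabled:", has_metadata_csum(device))
```

If the feature still shows up, it was not removed. As far as I know, it is cleared with `tune2fs -O ^metadata_csum <device>` on an unmounted filesystem, which matches the suggestion above to do it from an external system.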

Did you try installing plain RPi OS first and then performing the transformation into DietPi afterwards? https://dietpi.com/forum/t/half-speed-m-2-ssd-vs-raspberry-os-x64/11232/12

Not yet, that’s my next step. I’ll let y’all know my findings. :upside_down_face:

Alright, this is a bit interesting.

RasPi OS to Dietpi, stock : https://pibenchmarks.com/benchmark/59598

RasPi OS to dietpi, 1800mhz, perf gov: https://pibenchmarks.com/benchmark/59599

I did run the bench on the RasPi OS minimal, after updating, the score was around 7,400. I wanted to submit it, but I had an interwebz issue. I hope I’m not wasting y’all’s time, it’s merely a conundrum that is interesting, to me.

Very interesting indeed, so it must be something with the packages (the uninstalled ones) or the few system configurations. I currently have no idea what would affect I/O performance that dramatically. I will have a look through the DietPi-Installer. It probably makes sense to test restoring the original config.txt from RPi OS. It would not be the first time that some setting had unexpected side effects.
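
Before swapping the whole file, it may help to see which effective settings actually differ between the two config.txt files. Here is a minimal Python sketch that ignores comments and blank lines, tracks the `[section]` filters, and reports the lines unique to each file. The file names in the usage part and the helper names are made up for illustration.

```python
from pathlib import Path


def effective_lines(text):
    """Return the set of (section, setting) pairs, ignoring comments and blanks."""
    section = "all"  # settings before any [filter] apply to all models
    lines = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].lower()
            continue
        lines.add((section, line))
    return lines


def config_diff(text_a, text_b):
    """Return (only_in_a, only_in_b) as sorted lists of (section, setting)."""
    a, b = effective_lines(text_a), effective_lines(text_b)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # HYPOTHETICAL file names: copies of /boot/config.txt from each install.
    a_path, b_path = Path("config-rpios.txt"), Path("config-dietpi.txt")
    if a_path.exists() and b_path.exists():
        only_a, only_b = config_diff(a_path.read_text(), b_path.read_text())
        print("Only in RPi OS:", *only_a, sep="\n  ")
        print("Only in DietPi:", *only_b, sep="\n  ")
```

Settings that show up on only one side could then be toggled one at a time instead of replacing the file wholesale, which would narrow down which one matters.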

But also, as noted above: did you ensure that the initial APT and DietPi update check at boot had finished before running the benchmark?

Btw, is this the benchmark tool you use? https://github.com/TheRemote/PiBenchmarks

I stopped all services installed, made sure htop wasn’t reporting any APT or other background processes using any I/O. I also waited at least 10 minutes after boot to start the bench/htop. Monitored my network statistics as well, just to double check. Minus the bare minimum services required to run and MATE (same desktop I used on vanilla RasPi OS x64), nothing was running. That is the benchmark I have been using, only for consistency and linkable results. I will swap out the config and see if that makes a difference. I did make sure the OS was updated and firmware was up to date on every install. It has piqued my interest, hopefully we can figure it out, just don’t let me take y’all from more important matters. Thanks again, to the both of y’all.