DNS server performance with Pi-hole and Unbound

I have tested DNS server performance with DietPi using dnsperf for uncached entries. It seems to be 10x faster than Raspberry Pi OS and Alpine Linux on the Raspberry Pi 4 (20k queries per second vs 1,800 qps and 1,345 qps respectively). Can anyone else confirm this?

I cannot, but I’m interested in your use case. Do you want to build a public resolver? :smiley:

I have a Pi 3 running DietPi, Pi-hole and Unbound. It just feels faster to me than other solutions I have experimented with. I haven't tried much, but I used to have Pi-hole going via OpenDNS, and I also used an option via a router for a while.

I just like how neat it is. One little board to handle DNS and blocking. Works well for me.

When I say “feels faster”, I just mean watching page loading. Nothing really tested.

Why did I swap to Unbound instead of a public option like OpenDNS or Cloudflare? Partially privacy, partially independence. (laughs at Cloudflare falling over again this week) Mostly just as an excuse to try it.

The DNS cache of Pi-hole as well as Unbound will probably have an effect on the benchmark, unless every query uses a different hostname. The full resolving cycle for an uncached hostname should be slower than with upstream DNS providers, because of the intermediate step through Pi-hole, the recursive root server queries and DNSSEC. But once the cache is filled with the websites you use, it is fast :slightly_smiling_face:.

Thanks for your comments.

I agree you need to avoid the cache for a fair benchmark.

There seems to have been a recent update to Pi-hole that results in many more effective cache hits on my setups, which gives an incredible increase in subjective speed.

I created a list of DNS queries with a shell script, which dnsperf then reads when you run it, in an attempt to benchmark this objectively:

dnsperf -s 192.168.1.13 -d queries.txt
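
The script that built queries.txt is not shown in this thread, so the following is only a minimal sketch of how such a list could be produced. It assumes a hypothetical input file domains.txt with one hostname per line, and the helper name make_queries is made up for illustration. dnsperf reads one query per line in the form "name type", and each name is emitted only once so that a single pass does not hit the cache through repeats.

```shell
#!/bin/sh
# Sketch: turn a list of hostnames (one per line on stdin) into dnsperf input.
# dnsperf expects one query per line: "<name> <record type>".
# make_queries is a hypothetical helper, not the script used in this thread.
make_queries() {
    # Strip carriage returns, skip blank lines and "#" comments,
    # keep only the first occurrence of each name, then append the
    # record type (defaults to A when no argument is given).
    tr -d '\r' | awk -v qtype="${1:-A}" '
        NF == 0 || $1 ~ /^#/ { next }
        !seen[$1]++ { print $1, qtype }
    '
}

# Example (domains.txt is an assumed input file):
#   make_queries A < domains.txt > queries.txt
#   dnsperf -s 192.168.1.13 -d queries.txt
```

Note that a second run against the same list will mostly be answered from the cache, so the resolver caches would need to be cleared or a fresh list used between runs.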

In addition to the odd 10x faster performance in queries per second with DietPi, I decided to test with a second DNS server (another DietPi instance) with exactly the same setup. The second server was fast at 10k qps, but only half as fast as the first DietPi (20k qps).

The solution was the cpu-governor!

The fastest performance was on the Pi with the governor set to performance.
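
For anyone who wants to compare two machines the same way, here is a rough sketch of reading and changing the governor through the standard Linux cpufreq sysfs files. The helper names and the CPU_SYSFS override are assumptions for illustration (the override only exists so the functions can be tried against a scratch directory), and writing to the real files needs root.

```shell
#!/bin/sh
# Sketch: show and set the CPU frequency governor via the cpufreq sysfs
# interface. show_governors / set_governor are hypothetical helper names.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"

show_governors() {
    # Print "cpuN: <governor>" for every core that exposes cpufreq.
    for f in "$CPU_SYSFS"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue
        cpu=${f#"$CPU_SYSFS"/}
        printf '%s: %s\n' "${cpu%%/*}" "$(cat "$f")"
    done
}

set_governor() {
    # $1 is the governor name, e.g. performance, ondemand or schedutil.
    # Writing to the real sysfs files requires root.
    for f in "$CPU_SYSFS"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -w "$f" ] || { echo "cannot write $f" >&2; return 1; }
        printf '%s\n' "$1" > "$f"
    done
}

# Example (as root):
#   show_governors
#   set_governor performance
```

A change made this way does not survive a reboot, so it is mainly useful for switching governors between benchmark runs.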

Interesting that this is so CPU-intensive. Instead of running the device at max CPU frequency 24/7, you might want to test the ondemand governor, and tweak its parameters if needed. Compared to schedutil, it ramps up to max frequency earlier/faster, and stays there for longer (1s with default sampling rate and down factor). In a benchmark case, i.e. if there is a burst of queries, that should practically mean max frequency until the burst ends, hence almost identical to performance.

schedutil is a very fast adapting governor, i.e. it raises the frequency quickly if needed, but makes more use of the intermediate steps (no throttle up threshold). And it scales down just as quickly, possibly leading to many ups and downs in fast iterating workloads. That saves power, also since it runs mostly in the CPU scheduler rather than in CPUFreq with sampling overhead etc., but reduces performance in certain scenarios, especially such benchmarks.

For a typical home network, this should be almost irrelevant. Personally, I only get about 500–1,000 requests in 10 minutes during the day. That’s only about one to two requests per second. It’s highly unlikely that a typical household would have 10,000 or more requests per second.
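
The arithmetic behind that estimate, as a quick sketch (the qps helper is just an illustrative name):

```shell
#!/bin/sh
# Sketch: convert "N requests in M minutes" into queries per second.
qps() {
    awk -v n="$1" -v m="$2" 'BEGIN { printf "%.1f\n", n / (m * 60) }'
}

qps 500 10    # prints 0.8
qps 1000 10   # prints 1.7
```

Against the 10k to 20k qps measured above, that leaves several orders of magnitude of headroom whichever governor is used.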