USB r/w speed issues

Creating a bug report/issue

Required Information

  • DietPi version | cat /boot/dietpi/.version
    G_DIETPI_VERSION_CORE=8
    G_DIETPI_VERSION_SUB=7
    G_DIETPI_VERSION_RC=1
    G_GITBRANCH=‘master’
    G_GITOWNER=‘MichaIng’

  • Distro version | echo $G_DISTRO_NAME $G_RASPBIAN
    bullseye 0

  • Kernel version | uname -a
    Linux pubbypi 5.15.32-v8+ #1538 SMP PREEMPT Thu Mar 31 19:40:39 BST 2022 aarch64 GNU/Linux

  • SBC model | echo $G_HW_MODEL_NAME or (EG: RPi3)
    RPi 4 Model B (aarch64)

  • Power supply used | (EG: 5V 1A RAVpower)
    5V 3.5A

  • USB card used | (EG: SanDisk ultra)
    Kingston Datatraveler Exodia 3.2 [gen 1?] 128GB

Additional Information (if applicable)

  • Software title | (EG: Nextcloud)
  • Was the software title installed freshly or updated/migrated?
  • Can this issue be replicated on a fresh installation of DietPi?
    ← If you sent a “dietpi-bugreport”, please paste the ID here →
  • Bug report ID | echo $G_HW_UUID
root@pubbypi:~# vcgencmd bootloader_version
2022/04/26 11:24:28
version 507b2360eb46af23c05844b289dc5ae4ecfc3cca (release)
timestamp 1650968668
update-time 1659651248
capabilities 0x0000007f

Steps to reproduce

Expected behaviour


  • Slightly faster r/w at least

Actual behaviour

  • Custom Filesystem : Write = 2 MB/s | Read = 23 MB/s

Extra details

root@pubbypi:~# blkid
/dev/sda1: LABEL_FATBOOT="boot" LABEL="boot" UUID="4ADC-240F" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="8f4dbd00-01"
/dev/sda2: LABEL="rootfs" UUID="89cef7ba-d45a-47e7-8d42-aff18afcec80" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="8f4dbd00-02"
/dev/loop0: TYPE="squashfs"
/dev/loop1: TYPE="squashfs"
root@pubbypi:~# lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
    |__ Port 2: Dev 2, If 0, Class=Hub, Driver=hub/4p, 5000M
        |__ Port 4: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 2: Dev 3, If 0, Class=Hub, Driver=hub/4p, 480M
            |__ Port 2: Dev 4, If 1, Class=Human Interface Device, Driver=usbhid, 1.5M
            |__ Port 2: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 1.5M
            |__ Port 3: Dev 5, If 0, Class=Human Interface Device, Driver=usbhid, 12M
            |__ Port 3: Dev 5, If 1, Class=Human Interface Device, Driver=usbhid, 12M
root@pubbypi:~# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0    7:0    0  87.7M  1 loop /snap/core/10862
loop1    7:1    0  88.1M  1 loop /snap/core/11996
sda      8:0    1 115.2G  0 disk
├─sda1   8:1    1   256M  0 part /boot
└─sda2   8:2    1  74.3G  0 part /

(lsbusb -v whilst connected via powered hub atm)

root@pubbypi:~# lsusb -vs 2:3

Bus 002 Device 003: ID 0951:1666 Kingston Technology DataTraveler 100 G3/G4/SE9 G2/50
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.20
  bDeviceClass            0
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         9
  idVendor           0x0951 Kingston Technology
  idProduct          0x1666 DataTraveler 100 G3/G4/SE9 G2/50
  bcdDevice            0.01
  iManufacturer           1 Kingston
  iProduct                2 DataTraveler 3.0
  iSerial                 3 E0D55EA573D8F6506938294A
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x002c
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              296mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0400  1x 1024 bytes
        bInterval               0
        bMaxBurst              15
Binary Object Store Descriptor:
  bLength                 5
  bDescriptorType        15
  wTotalLength       0x0016
  bNumDeviceCaps          2
  USB 2.0 Extension Device Capability:
    bLength                 7
    bDescriptorType        16
    bDevCapabilityType      2
    bmAttributes   0x00000006
      BESL Link Power Management (LPM) Supported
  SuperSpeed USB Device Capability:
    bLength                10
    bDescriptorType        16
    bDevCapabilityType      3
    bmAttributes         0x00
    wSpeedsSupported   0x000e
      Device can operate at Full Speed (12Mbps)
      Device can operate at High Speed (480Mbps)
      Device can operate at SuperSpeed (5Gbps)
    bFunctionalitySupport   2
      Lowest fully-functional device speed is High Speed (480Mbps)
    bU1DevExitLat          10 micro seconds
    bU2DevExitLat        2047 micro seconds
can't get debug descriptor: Resource temporarily unavailable
Device Status:     0x0000
  (Bus Powered)
root@pubbypi:~# hdparm -tT /dev/sda1

/dev/sda1:
 Timing cached reads:   1734 MB in  2.00 seconds = 868.23 MB/sec
 Timing buffered disk reads: 252 MB in  3.01 seconds =  83.84 MB/sec

I’ve used with/without usb-storage.quirks=0951:1666:u in /boot/cmdline.txt to no avail.
I’ve connected with/without a hub (mind you this is just a usb stick itself), on every usb port with no difference. One odd thing I’ve noticed is it can randomly be fast, at least with the reading, however it’s still atrocious with writes, especially when I’m trying to do updates. I recently just got this usb stick, but I also have a spare sd card which I tried booting up via my transcend usb card reader and it has no speed problems, the same with another 2.5inch usb drive I was recently using (which this kingston was supposed to replace). I cloned that 2.5inch drive to the kingston using balena which is why you see it’s not using the whole disk yet. I notice that boot up times can be seemingly random, sometimes fast, sometimes very very slow, especially when it reaches redis and mariadb section of startup. Is there anything I might have missed or can do in this situation as I don’t think a hub chipset would be an issue if I’m having the same problem with just the kingston stick plugged in. Are kingston datatraveler 3.2 gen 1’s just bad with rpi4’s in general? I’d like to know if there is anything else I can try as I would like to exhaust my options before I consider getting a different stick.

If I’m looking at this issue correctly, it’s the random iops that is the main problem and not the sequential r/w speeds? Anyone else using this usb stick and not having problems like I am?

Since we just faced it with an older NVMe SSD, does this help?

hdparm -B 254 /dev/sda
root@pubbypi:~# hdparm -B 254 /dev/sda

/dev/sda:
 setting Advanced Power Management level to 0xfe (254)
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 APM_level      = not supported

Was updating a few packages after this but did not notice any improvements in the speed.

Okay APM is not supported by this drive, which is true for most modem drives.

Does it help when you switch to performance CPU governor (always max CPU frequency) e.g. via dietpi-config Performance Options?

I tried changing the governor to performance, but it got me even more confused because all the results I had when I timed it were very random (I tested on small package, irssi) - in fact testing between different overvolts [just -2/0/2] and different governors, testing each 10 times before changes seemed to yield random speeds, just nothing consistent. It was either slow, or at times fast(er). I really can’t make heads or tails out of it. One thing I know though, whenever I boot up, as soon as I see the services start slow I know it will yield slow speed, so I keep rebooting until I see them load fast and then the speeds seem a little more consistent. I’ve also tried booting the pi without the stick inserted to see the diagnostic messages of the hub getting init, then I plug the usb stick in which might be my imagination but I see the services tend to boot quicker doing it that way. Big mystery this one.
I tried doing some other i/o tests:

root@pubbypi:~# ioping -c 20 /tmp/
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=1 time=29.9 us (warmup)
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=2 time=42.9 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=3 time=42.5 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=4 time=44.4 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=5 time=42.0 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=6 time=42.6 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=7 time=41.6 us (fast)
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=8 time=40.6 us (fast)
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=9 time=43 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=10 time=43.3 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=11 time=41.3 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=12 time=40.2 us (fast)
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=13 time=41.3 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=14 time=41.9 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=15 time=90.2 us (slow)
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=16 time=43.2 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=17 time=43.3 us
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=18 time=41.6 us (fast)
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=19 time=37.2 us (fast)
4 KiB <<< /tmp/ (tmpfs tmpfs 3.84 GiB): request=20 time=42.6 us

--- /tmp/ (tmpfs tmpfs 3.84 GiB) ioping statistics ---
19 requests completed in 845.7 us, 76 KiB read, 22.5 k iops, 87.8 MiB/s
generated 20 requests in 19.0 s, 80 KiB, 1 iops, 4.21 KiB/s
min/avg/max/mdev = 37.2 us / 44.5 us / 90.2 us / 10.9 us

fio ran fine with filesize of 10M, but when I tried 100M and 1G it got stuck in D state:

root@pubbypi:~# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=fiotest --filename=testfio --bs=4k --iodepth=64 --size=10M --readwrite=randrw --rwmixread=75
fiotest: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.25
Starting 1 process
fiotest: Laying out IO file (1 file / 10MiB)
Jobs: 1 (f=1)
fiotest: (groupid=0, jobs=1): err= 0: pid=16825: Wed Aug 10 00:56:24 2022
  read: IOPS=1526, BW=6107KiB/s (6254kB/s)(7628KiB/1249msec)
   bw (  KiB/s): min= 5008, max= 6656, per=95.49%, avg=5832.00, stdev=1165.31, samples=2
   iops        : min= 1252, max= 1664, avg=1458.00, stdev=291.33, samples=2
  write: IOPS=522, BW=2091KiB/s (2141kB/s)(2612KiB/1249msec); 0 zone resets
   bw (  KiB/s): min= 1872, max= 2128, per=95.64%, avg=2000.00, stdev=181.02, samples=2
   iops        : min=  468, max=  532, avg=500.00, stdev=45.25, samples=2
  cpu          : usr=2.56%, sys=11.30%, ctx=2573, majf=0, minf=16
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=1.2%, >=64=97.5%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=1907,653,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=6107KiB/s (6254kB/s), 6107KiB/s-6107KiB/s (6254kB/s-6254kB/s), io=7628KiB (7811kB), run=1249-1249msec
  WRITE: bw=2091KiB/s (2141kB/s), 2091KiB/s-2091KiB/s (2141kB/s-2141kB/s), io=2612KiB (2675kB), run=1249-1249msec

Disk stats (read/write):
  sda: ios=1761/605, merge=1/0, ticks=1387/606, in_queue=1993, util=91.80%

and here is fio in D state:

root@pubbypi:~# ps aux | awk '$8 ~ /^D/{print}'
root          97  0.0  0.0      0     0 ?        D    Aug09   0:16 [usb-storage]
root       16348  0.0  0.0      0     0 ?        D    00:38   0:00 [kworker/u8:3+flush-8:0]
root       16507  0.0  0.0      0     0 ?        D    00:47   0:00 [kworker/1:2+events_freezable_power_]
root       16828  0.6  0.2 384040 22224 ?        D    00:56   0:04 fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=fiotest --filename=testfio --bs=4k --iodepth=64 --size=1G --readwrite=randrw --rwmixread=75
root       16923  0.2  0.2 384036 22280 pts/8    Dl+  01:00   0:01 fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=fiotest --filename=testfio --bs=4k --iodepth=64 --size=100M --readwrite=randrw --rwmixread=75

Just wanted to update post that I ordered several diff power supplies and cables but the performance remains the same. To reiterate, I am using the Kingston Datatraveler Exodia usb 3.2 usb flash drive. This is the only flash drive I have with this affliction where the writes are really slow (seems to do fine in windows however). If anyone is using this particular usb flash drive on a Raspberry Pi4B and not having this issue, please let me know. Maybe it is something to do with sequential vs random iops I am not sure but I would be glad to run additional tests. Would really like to figure this one out though for now I can use it in it’s “crippled” state since I dish most of my stuff to ram. In worst case scenario I will of course swap with another stick, and use this in another box. I guess I just want to know the cause if I can’t do anything about it, it’s itching at my mind haha.

That ioping was on the /tmp tmpfs and not a disk was intended?

Can you post the dmesg logs when inserting the USB drive after boot, and probably also thr whole log when it’s inserted on boot?

And when it’s slow, does htop show a full memory usage bar with blue (write buffer) and/or yellow (read cache) parts? Generally on Linux disk I/O can be inconsistent when available memory is fully used by buffer/cache ao that actual disk I/O starts. However, with --direct=1 you disable the buffer for fio, I guess.

lol my bad with the /tmp (happens when tired and doing lots of tests)

dietpi in 🌐 pubbypi in ~
❯ sudo ioping -c 20 /home/dietpi/test
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=1 time=626.7 us (warmup)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=2 time=1.03 s
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=3 time=6.28 ms
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=4 time=1.08 s
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=5 time=1.84 ms
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=6 time=1.09 s
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=7 time=7.84 ms (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=8 time=1.05 s
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=9 time=1.18 ms (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=10 time=1.03 s
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=11 time=7.50 ms (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=12 time=617.7 us (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=13 time=615.9 us (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=14 time=630.9 us (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=15 time=625.8 us (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=16 time=459.0 us (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=17 time=1.62 s (slow)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=18 time=1.48 ms (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=19 time=653.0 us (fast)
4 KiB <<< /home/dietpi/test (ext4 /dev/root 73.1 GiB): request=20 time=651.1 us (fast)

--- /home/dietpi/test (ext4 /dev/root 73.1 GiB) ioping statistics ---
19 requests completed in 6.92 s, 76 KiB read, 2 iops, 11.0 KiB/s
generated 20 requests in 19.9 s, 80 KiB, 1 iops, 4.02 KiB/s
min/avg/max/mdev = 459.0 us / 364.4 ms / 1.62 s / 546.2 ms
dietpi in 🌐 pubbypi in ~ took 32s
❯ sudo ioping -c 20 /
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=1 time=533.0 us (warmup)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=2 time=714.1 us
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=3 time=648.2 us
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=4 time=642.8 us
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=5 time=1.28 s
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=6 time=540.2 us
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=7 time=625.6 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=8 time=623.4 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=9 time=623.8 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=10 time=610.8 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=11 time=609.0 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=12 time=586.5 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=13 time=623.1 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=14 time=624.3 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=15 time=597.3 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=16 time=688.6 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=17 time=619.8 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=18 time=652.5 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=19 time=617.8 us (fast)
4 KiB <<< / (ext4 /dev/root 73.1 GiB): request=20 time=598.1 us (fast)

--- / (ext4 /dev/root 73.1 GiB) ioping statistics ---
19 requests completed in 1.29 s, 76 KiB read, 14 iops, 59.1 KiB/s
generated 20 requests in 19.3 s, 80 KiB, 1 iops, 4.15 KiB/s
min/avg/max/mdev = 540.2 us / 67.7 ms / 1.28 s / 284.7 ms

This is what I’m getting atm. I can’t bring the pi down atm because people are on the network and they need the pihole/unbound for resolving but I’ll do it once people aren’t connected. I will keep a lookout for htop mem usage.