Part 5: Redundancy tasks.
The first four posts “Using a RAID system with DietPi – part 1“, “– part 2“, “– part 3” and “– part 4” covered the setup, access, management/diagnosis and maintenance of a RAID system, using a RAID 5 as example.
This blog post deals with redundancy tasks, e.g. changing RAID disks, rebuilding the RAID, re-activating disks in the RAID, etc.
These tasks will be updated or extended whenever suitable issues come up (e.g. via the DietPi Forum).
This blog post is one of a series regarding setup, usage and management of a RAID system:
- Using a RAID system with DietPi – part 1: System overview and installation of the RAID
- Using a RAID system with DietPi – part 2: Access the RAID
- Using a RAID system with DietPi – part 3: Basic management and diagnosis tasks
- Using a RAID system with DietPi – part 4: RAID maintenance
- Using a RAID system with DietPi – part 5: Redundancy tasks
Table of contents
1. Re-add a disk if it is missing
2. Stuck at status “clean, degraded”
3. References
1.1 Issue description
Issue: One of the RAID disks is inactive (shows “removed” in the output of mdadm --detail /dev/md0).
Problem description: One of the RAID disks is inactive. The RAID itself is still running properly, but without this disk it has no redundancy any more, i.e. a further disk failure will lead to data loss.
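As a quick check whether an array is running degraded, the kernel’s sysfs interface can be queried in addition to the mdadm commands shown below. A minimal sketch, using the array name /dev/md0 from the previous parts of this series:
# Number of missing devices in the array; 0 means full redundancy
cat /sys/block/md0/md/degraded
# Compact view of the array state and device counters
mdadm --detail /dev/md0 | grep -E 'State|Devices'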
1.2 Observation
The missing RAID disk can be detected via the lsblk command:
root@raid:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5 /mnt/raid
sdb 8:16 0 1,8T 0 disk
sdc 8:32 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5 /mnt/raid
sdd 8:48 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5 /mnt/raid
mmcblk0 179:0 0 29,8G 0 disk
└─mmcblk0p1 179:1 0 29,8G 0 part /
As can be seen, the disk /dev/sdb is no longer part of the RAID (i.e. it does not show an underlying md0 entry).
Another option to investigate the RAID status is the mdadm --examine command or the /proc/mdstat output:
root@raid:~# mdadm --examine --scan --verbose
ARRAY /dev/md/0 level=raid5 metadata=1.2 num-devices=4 UUID=c38c30c6:86c7eef7:d62fa28e:0f008f8a name=raid:0
devices=/dev/sdd,/dev/sdc,/dev/sdb,/dev/sda
root@raid:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdd[2] sda[1] sdc[4]
5860147200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
bitmap: 1/15 pages [4KB], 65536KB chunk
unused devices: <none>
As can be seen, only three disks are contained in the RAID; /dev/sdb is missing.
The RAID then also shows a degraded state:
root@raid:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Fri Sep 16 23:00:10 2022
Raid Level : raid5
Array Size : 5860147200 (5588.67 GiB 6000.79 GB)
Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
Raid Devices : 4
Total Devices : 3
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Jun 22 22:49:17 2023
State : clean, degraded
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : raid:0 (local to host raid)
UUID : c38c30c6:86c7eef7:d62fa28e:0f008f8a
Events : 12634
Number Major Minor RaidDevice State
- 0 0 0 removed
1 8 0 1 active sync /dev/sda
2 8 48 2 active sync /dev/sdd
4 8 32 3 active sync /dev/sdc
This output shows:
- Only three disk devices: “Total Devices”, “Active Devices”, “Working Devices”
- One disk device removed: shown in the first line of the bottom table
- The RAID is generally working: State “clean”
- No full redundancy any more: State “degraded”
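Before re-adding the disk, it is worth checking whether /dev/sdb dropped out because of a hardware problem. A minimal sketch (the SMART check assumes that the smartmontools package is installed):
# Identify the physical drive behind /dev/sdb by model and serial number
lsblk -o NAME,SIZE,MODEL,SERIAL /dev/sdb
# Look for kernel messages that explain why the disk was dropped
dmesg | grep -i sdb
# Quick SMART health verdict of the disk
smartctl -H /dev/sdb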
1.3 Measures and monitoring
1.3.1 Measures
The formerly present RAID disk /dev/sdb can be put back into the RAID with the mdadm --add command:
root@raid:~# mdadm --add /dev/md0 /dev/sdb
mdadm: added /dev/sdb
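If the disk is still listed as a “faulty” member of the array, mdadm may refuse the --add with a “device busy” type of error. In that case the disk has to be removed from the array first and can then be added again. A minimal sketch, assuming the disk itself is healthy:
# Remove the failed member from the array (only needed if it is still listed there)
mdadm --remove /dev/md0 /dev/sdb
# Add it again afterwards; mdadm starts the rebuild automatically
mdadm --add /dev/md0 /dev/sdb
Since the array uses an internal write-intent bitmap (Consistency Policy: bitmap), mdadm --re-add can often resynchronize only the blocks changed since the disk dropped out, instead of running a full rebuild.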
Then, in the lsblk output, the disk /dev/sdb shows up again:
root@raid:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5 /mnt/raid
sdb 8:16 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5 /mnt/raid
sdc 8:32 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5 /mnt/raid
sdd 8:48 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5 /mnt/raid
mmcblk0 179:0 0 29,8G 0 disk
└─mmcblk0p1 179:1 0 29,8G 0 part /
1.3.2 Monitoring
After (re-)adding the disk to the RAID, it needs to be rebuilt. The rebuild (recovery) process can be monitored via the mdadm --detail command:
root@raid:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Fri Sep 16 23:00:10 2022
Raid Level : raid5
Array Size : 5860147200 (5588.67 GiB 6000.79 GB)
Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Jun 22 23:03:46 2023
State : clean, degraded, recovering
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Rebuild Status : 0% complete
Name : raid:0 (local to host raid)
UUID : c38c30c6:86c7eef7:d62fa28e:0f008f8a
Events : 12637
Number Major Minor RaidDevice State
5 8 16 0 spare rebuilding /dev/sdb
1 8 0 1 active sync /dev/sda
2 8 48 2 active sync /dev/sdd
4 8 32 3 active sync /dev/sdc
The state “recovering” in the overall status and the state “spare rebuilding” in the bottom table indicate the running rebuild, while the “Rebuild Status” line shows its progress.
Additionally, the recovery process can be monitored via:
root@raid:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdb[5] sdd[2] sda[1] sdc[4]
5860147200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
[>....................]  recovery =  1.4% (28251228/1953382400) finish=361.3min speed=88800K/sec
bitmap: 0/15 pages [0KB], 65536KB chunk
unused devices: <none>
Once the progress reaches 100 %, the RAID has regained its full redundancy.
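To avoid running without redundancy unnoticed in the future, mdadm’s monitor mode can send an e-mail as soon as an array degrades or a disk fails. A minimal sketch, assuming a working mail transport agent and using admin@example.com as a placeholder address:
# Add the notification address to the mdadm configuration
echo "MAILADDR admin@example.com" >> /etc/mdadm/mdadm.conf
# Send a test alert for every array to verify that mail delivery works
mdadm --monitor --scan --test --oneshot
On Debian based systems like DietPi, the mdadm package usually ships a monitoring service that picks up this setting automatically.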
2. Stuck at status “clean, degraded”
2.1 Issue description
Issue: The RAID is not accessible and its status is stuck at “clean, degraded”
Problem description: One of the RAID disks is stuck in the state “spare rebuilding” and the RAID itself is stuck at the status “clean, degraded”, but no rebuild of the “spare” disk takes place.
2.2 Observation
The RAID does not show up as a mounted volume, which can be detected via the lsblk command:
root@raid:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5
sdb 8:16 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5
sdc 8:32 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5
sdd 8:48 0 1,8T 0 disk
└─md0 9:0 0 5,5T 0 raid5
mmcblk0 179:0 0 29,8G 0 disk
└─mmcblk0p1 179:1 0 29,8G 0 part /
As can be seen, /dev/md0 is present on all disks /dev/sdX, but it is no longer mounted at /mnt/raid.
The RAID shows a “clean, degraded” state with one disk with a “spare rebuilding” state:
root@raid:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Mon Aug 7 00:23:42 2023
Raid Level : raid5
Array Size : 5860147200 (5.46 TiB 6.00 TB)
Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Aug 7 10:39:00 2023
State : clean, degraded
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : raid:0 (local to host raid)
UUID : 7ad81bc0:814202bc:8917b731:73953d0e
Events : 6098
Number Major Minor RaidDevice State
0 8 0 0 active sync /dev/sda
1 8 16 1 active sync /dev/sdb
2 8 32 2 active sync /dev/sdc
4 8 48 3 spare rebuilding /dev/sdd
The disk /dev/sdd shows the “spare rebuilding” state.
The system stays in this state, without any progress visible via cat /proc/mdstat:
root@raid:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active (auto-read-only) raid5 sdb[1] sdc[2] sdd[4] sda[0]
5860147200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
bitmap: 0/15 pages [0KB], 65536KB chunk
unused devices: <none>
This output shows that /dev/md0 is active, but in “auto-read-only” mode. In this mode, which md uses for freshly assembled arrays until the first write, the pending rebuild is not started, so the array stays degraded.
2.3 Measures and monitoring
2.3.1 Measures
To overcome this auto-read-only state, the command mdadm --readwrite is used:
mdadm --readwrite /dev/md0
This puts the RAID back into read-write mode, so the pending rebuild can start.
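Since the lsblk output above showed the RAID without a mount point, the filesystem may also need to be mounted again once the array is writable. A minimal sketch, assuming the mount point /mnt/raid used in this series:
# Mount via an existing /etc/fstab entry, if one was set up for the RAID
mount /mnt/raid
# Otherwise, mount the md device explicitly
mount /dev/md0 /mnt/raid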
2.3.2 Monitoring
After bringing the RAID back to read-write, the system shows:
root@raid:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdb[1] sdc[2] sdd[4] sda[0]
5860147200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
bitmap: 0/15 pages [0KB], 65536KB chunk
unused devices: <none>
The output shows that /dev/md0 is active and no longer in “auto-read-only” mode.
The output of mdadm --detail /dev/md0 shows that the recovery process has resumed:
root@raid:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Mon Aug 7 00:23:42 2023
Raid Level : raid5
Array Size : 5860147200 (5.46 TiB 6.00 TB)
Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Aug 7 13:48:37 2023
State : clean, degraded, recovering
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Rebuild Status : 85% complete
Name : raid:0 (local to host raid)
UUID : 7ad81bc0:814202bc:8917b731:73953d0e
Events : 6101
Number Major Minor RaidDevice State
0 8 0 0 active sync /dev/sda
1 8 16 1 active sync /dev/sdb
2 8 32 2 active sync /dev/sdc
4 8 48 3 spare rebuilding /dev/sdd
The recovery can also be monitored via cat /proc/mdstat:
root@raid:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdb[1] sdc[2] sdd[4] sda[0]
5860147200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
[=================>...] recovery = 85.5% (1670961084/1953382400) finish=78.4min speed=60029K/sec
bitmap: 0/15 pages [0KB], 65536KB chunk
unused devices: <none>
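Instead of re-running cat /proc/mdstat manually, the progress can be followed continuously, and the md rebuild speed limits can be inspected or raised temporarily if the rebuild runs slower than expected. A minimal sketch; the speed value is an example in KiB/s, not a recommendation:
# Refresh the mdstat output every 10 seconds
watch -n 10 cat /proc/mdstat
# Show the kernel-wide rebuild speed limits (KiB/s per device)
sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max
# Temporarily raise the minimum rebuild speed (resets at reboot)
sysctl -w dev.raid.speed_limit_min=100000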
3. References
- https://raid.wiki.kernel.org/index.php/Replacing_a_failed_drive
- https://raid.wiki.kernel.org/index.php/Assemble_Run
- https://raid.wiki.kernel.org/index.php/Recovering_a_damaged_RAID
- https://ctaas.de/software-raid.htm#RAID-Verbund_aktivieren