Check health status on OSS
# cat /proc/fs/lustre/health_check
device lustre-OST0000 reported unhealthy
device lustre-OST0001 reported unhealthy
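When many devices are listed, it can help to pull out just the device names. A minimal sketch, assuming the health_check text has the "device NAME reported unhealthy" form shown above; the function name unhealthy_devices is hypothetical and not part of Lustre.

```shell
# Hypothetical helper: print the names of devices reported as unhealthy.
# Reads health_check text on stdin, so it can be tried on saved output.
unhealthy_devices() {
    awk '$1 == "device" && /reported unhealthy/ { print $2 }'
}

# Example on an OSS (not executed here):
#   unhealthy_devices < /proc/fs/lustre/health_check
```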
Find OST names on OSS
# lctl dl | grep osd-ldiskfs
0 UP osd-ldiskfs lustre-OST0000-osd lustre-OST0000-osd_UUID 5
5 UP osd-ldiskfs lustre-OST0001-osd lustre-OST0001-osd_UUID 5
Find device ID on MDT
# lctl dl | grep osc
8 UP osp lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
9 UP osp lustre-OST0001-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
Temporarily deactivate OSTs on MDT
lctl --device 8 deactivate
lctl --device 9 deactivate
NOTE: This is not persistent. The OSTs become active again after the MDT reboots.
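Device IDs differ between systems, so looking them up by OST name is less error-prone than typing them by hand. A minimal sketch, assuming `lctl dl` output in the column layout shown above; the helper deactivate_cmds is hypothetical and only prints the commands so they can be reviewed before running.

```shell
# Hypothetical helper: print (do not run) one deactivate command for each
# osp device whose name starts with one of the OST names given as arguments.
# Reads `lctl dl` output on stdin.
deactivate_cmds() {
    awk -v osts="$*" '
        BEGIN { n = split(osts, want, " ") }
        $3 == "osp" {
            for (i = 1; i <= n; i++)
                # Device names look like <fsname>-OSTxxxx-osc-MDTxxxx.
                if (index($4, want[i] "-osc-") == 1)
                    print "lctl --device " $1 " deactivate"
        }
    '
}

# Example on the MDT (not executed here):
#   lctl dl | deactivate_cmds lustre-OST0000 lustre-OST0001
```

Once the printed commands look right, the output could be piped to sh.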
Check status on MDT
# cat /proc/fs/lustre/lov/lustre-MDT0000-mdtlov/target_obd
0: lustre-OST0000_UUID INACTIVE
1: lustre-OST0001_UUID INACTIVE
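The status check can be scripted so that later steps only run once the OSTs are really inactive. A minimal sketch, assuming target_obd lines of the form "index: UUID STATE" as shown above; the function name all_inactive is hypothetical.

```shell
# Hypothetical helper: succeed only if every OST UUID given as an argument
# is listed as INACTIVE. Reads target_obd text on stdin.
all_inactive() {
    awk -v uuids="$*" '
        BEGIN { n = split(uuids, want, " ") }
        { state[$2] = $3 }          # remember the state of each UUID
        END {
            for (i = 1; i <= n; i++)
                if (state[want[i]] != "INACTIVE") exit 1
        }
    '
}

# Example on the MDT (not executed here):
#   all_inactive lustre-OST0000_UUID lustre-OST0001_UUID \
#       < /proc/fs/lustre/lov/lustre-MDT0000-mdtlov/target_obd
```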
List OST UUIDs on OSS
# lfs osts
OBDS::
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
. . .
Move data on client node
NOTE: Make sure a swap partition is enabled, because the migration can run out of memory on the master node or on the node doing the move.
time lfs find --obd lustre-OST0000_UUID /lustre | lfs_migrate -sy
time lfs find --obd lustre-OST0001_UUID /lustre | lfs_migrate -sy
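Before going further, it is worth confirming that no files remain on the drained OSTs. A minimal sketch, assuming that rerunning the same `lfs find --obd` command prints one path per line; the function name count_remaining is hypothetical.

```shell
# Hypothetical helper: count non-empty lines on stdin.
# Each line is one file that still has objects on the OST.
count_remaining() {
    awk 'NF { n++ } END { print n + 0 }'
}

# Example on a client (not executed here):
#   lfs find --obd lustre-OST0000_UUID /lustre | count_remaining
```

A result of 0 for each OST suggests the migration is complete; anything else means the migrate step should be repeated or the leftover files inspected.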
All clients need the lazystatfs option once an OST has been removed; without it, mounting the client can be difficult.
/usr/sbin/lctl set_param llite.*.lazystatfs=1
Add as permanent mount option
sed -i 's|.*lustre.*|10.0.0.2@tcp:/lustre /lustre lustre defaults,_netdev,flock,lazystatfs 0 0|g' /etc/fstab
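The sed command replaces every line that contains the word "lustre", including comments, with one fixed entry. A more conservative alternative is sketched below, assuming standard six-field fstab lines; the function name add_lazystatfs is hypothetical. It only appends lazystatfs to the options of entries whose filesystem type is lustre and leaves everything else alone.

```shell
# Hypothetical helper: append lazystatfs to the mount options of lustre
# entries that do not already have it. Reads fstab text on stdin and
# writes the result to stdout; it does not edit any file in place.
add_lazystatfs() {
    awk '
        # Leave comments and short or blank lines untouched.
        /^[ \t]*#/ || NF < 4 { print; next }
        # Field 3 is the filesystem type, field 4 the mount options.
        # Changing $4 rewrites that line with single-space separators.
        $3 == "lustre" && ("," $4 ",") !~ /,lazystatfs,/ { $4 = $4 ",lazystatfs" }
        { print }
    '
}

# Example on a client (not executed here); review the result before
# copying it over /etc/fstab:
#   add_lazystatfs < /etc/fstab > /tmp/fstab.new
```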
Deactivate OSTs permanently on MDT (must be the MGS node)
lctl conf_param lustre-OST0000.osc.active=0
lctl conf_param lustre-OST0001.osc.active=0
Check on client
# lctl get_param osc.*-OST000*.active
osc.lustre-OST0000-osc-ffff880605e5ac00.active=0
osc.lustre-OST0001-osc-ffff880605e5ac00.active=0
osc.lustre-OST0002-osc-ffff880605e5ac00.active=1
osc.lustre-OST0003-osc-ffff880605e5ac00.active=1
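With many OSTs, the get_param output is easier to check when reduced to the names that are inactive. A minimal sketch, assuming parameter names of the form osc.NAME-osc-INSTANCE.active=VALUE as shown above; the function name inactive_osts is hypothetical.

```shell
# Hypothetical helper: print the OST names whose osc "active" value is 0.
# Reads `lctl get_param` output on stdin.
inactive_osts() {
    sed -n 's/^osc\.\(.*\)-osc-[^.]*\.active=0$/\1/p'
}

# Example on a client (not executed here):
#   lctl get_param 'osc.*.active' | inactive_osts
```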
Unmount OSTs on OSS node
umount /mnt0
umount /mnt1
Shut down OSS node.