LUSTRE - move data and remove OST and OSS

Posted on Sat 10 October 2015 by Pavlo Khmel

Check health status on OSS

# cat /proc/fs/lustre/health_check
device lustre-OST0000 reported unhealthy
device lustre-OST0001 reported unhealthy

Find OST names on OSS

# lctl dl | grep osd-ldiskfs
0 UP osd-ldiskfs lustre-OST0000-osd lustre-OST0000-osd_UUID 5
5 UP osd-ldiskfs lustre-OST0001-osd lustre-OST0001-osd_UUID 5

Find device ID on MDT

# lctl dl | grep osc
8 UP osp lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5
9 UP osp lustre-OST0001-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 5

Temporarily deactivate the OSTs on MDT

lctl --device 8 deactivate
lctl --device 9 deactivate

NOTE: This deactivation is not persistent. The OSTs will be active again after the MDT is rebooted.
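
The device numbers can also be looked up by OST name instead of being read off the listing by hand. This is a minimal sketch that assumes the `lctl dl` column layout shown above; the function name osc_device_ids is hypothetical and not part of Lustre.

```shell
# Print the device number (first column) of each device whose name
# (fourth column) starts with "<OST name>-osc-", e.g. lustre-OST0000-osc-MDT0000.
# Reads `lctl dl` output on stdin.
osc_device_ids() {
    awk -v ost="$1" '$4 ~ ("^" ost "-osc-") { print $1 }'
}

# Usage on the MDT (shown as a comment, not executed here):
#   for id in $(lctl dl | osc_device_ids lustre-OST0000); do
#       lctl --device "$id" deactivate
#   done
```

Matching on the name avoids deactivating the wrong device if the numbering differs between nodes.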

Check status on MDT

# cat /proc/fs/lustre/lov/lustre-MDT0000-mdtlov/target_obd
0: lustre-OST0000_UUID INACTIVE
1: lustre-OST0001_UUID INACTIVE
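
The same check can be scripted so that later steps only run once an OST is really inactive. This is a sketch that assumes the three-column target_obd format shown above; the helper name ost_state is hypothetical.

```shell
# Print the state (ACTIVE or INACTIVE) recorded for one OST UUID.
# Reads target_obd content on stdin; prints nothing if the UUID is absent.
ost_state() {
    awk -v uuid="$1" '$2 == uuid { print $3 }'
}

# Usage on the MDT (shown as a comment, not executed here):
#   state=$(ost_state lustre-OST0000_UUID \
#       < /proc/fs/lustre/lov/lustre-MDT0000-mdtlov/target_obd)
#   [ "$state" = "INACTIVE" ] || echo "lustre-OST0000 is still active"
```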

List OST UUIDs (run on a node where the file system is mounted)

# lfs osts
OBDS::
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
. . .

Move data on client node
NOTE: Make sure swap is enabled, because the migration can run out of memory on the master node or on the node doing the move.

time lfs find --obd lustre-OST0000_UUID /lustre | lfs_migrate -sy
time lfs find --obd lustre-OST0001_UUID /lustre | lfs_migrate -sy
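
Before deactivating an OST permanently it is worth confirming that no files are left on it. The sketch below wraps the migration shown above and then counts what `lfs find --obd` still reports. The function name drain_ost and the LFS and LFS_MIGRATE override variables are assumptions added for illustration.

```shell
# Commands can be overridden, for example for a dry run (assumption).
LFS="${LFS:-lfs}"
LFS_MIGRATE="${LFS_MIGRATE:-lfs_migrate}"

# Migrate all files off one OST, then report how many files still
# have objects on it. Arguments: OST UUID, Lustre mount point.
drain_ost() {
    uuid="$1"
    mnt="$2"
    "$LFS" find --obd "$uuid" "$mnt" | "$LFS_MIGRATE" -sy
    remaining=$("$LFS" find --obd "$uuid" "$mnt" | wc -l | tr -d ' ')
    echo "$uuid: $remaining file(s) remaining"
}

# Usage on a client (shown as a comment, not executed here):
#   drain_ost lustre-OST0000_UUID /lustre
#   drain_ost lustre-OST0001_UUID /lustre
```

A non-zero count usually means some files were skipped and the migration should be repeated for them before continuing.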

All clients need the lazystatfs option once an OST has been removed; without it, mounting the client becomes difficult.

/usr/sbin/lctl set_param llite.*.lazystatfs=1

Add it as a permanent mount option (the sed expression uses | as the delimiter because the replacement text contains slashes)

sed -i 's|.*lustre.*|10.0.0.2@tcp:/lustre /lustre lustre defaults,_netdev,flock,lazystatfs 0 0|' /etc/fstab
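
The sed command above replaces every fstab line that mentions lustre with one fixed entry. Where the existing entries should be kept, a more conservative option is to append lazystatfs only to the options field of lustre mounts. This is a sketch, assuming standard whitespace-separated fstab fields; the function name add_lazystatfs is hypothetical.

```shell
# Print an fstab-style file with ",lazystatfs" appended to the options
# (field 4) of every entry whose type (field 3) is lustre, unless the
# option is already present. Comment lines and other entries are unchanged.
# Note: modified lines are re-joined with single spaces.
add_lazystatfs() {
    awk '$1 !~ /^#/ && $3 == "lustre" && $4 !~ /(^|,)lazystatfs(,|$)/ {
             $4 = $4 ",lazystatfs"
         }
         { print }' "$1"
}

# Usage (shown as a comment, not executed here): write to a new file,
# review it, then move it into place.
#   add_lazystatfs /etc/fstab > /etc/fstab.new
```

Writing to a separate file first leaves the original fstab untouched until the result has been checked.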

Deactivate OST permanently on MDT

lctl conf_param lustre-OST0000.osc.active=0
lctl conf_param lustre-OST0001.osc.active=0

Check on client

# lctl get_param osc.*-OST000*.active
osc.lustre-OST0000-osc-ffff880605e5ac00.active=0
osc.lustre-OST0001-osc-ffff880605e5ac00.active=0
osc.lustre-OST0002-osc-ffff880605e5ac00.active=1
osc.lustre-OST0003-osc-ffff880605e5ac00.active=1

Unmount OST on OSS node

umount /mnt0
umount /mnt1

Shut down the OSS node.