SCSI Device Names And Bus IDs
Problem
(Asked on SuperUser.com)
I have a CentOS 4.x server running software RAID. The server has two SCSI disks in hot-swap trays. mdadm reports that a disk in the RAID has failed, so I would like to replace it.
I examine /proc/mdstat, which tells me my two raid devices have components on disks sda and sdb. It also tells me that sdb is the one that failed.
I examine /proc/scsi/scsi, which tells me I have two physical devices, at SCSI IDs 0-0-0-0 and 0-0-1-0.
Because I want to do the change hot, I assume that sdb is 0-0-1-0. So I say:
# echo "scsi remove-single-device 0 0 1 0" > /proc/scsi/scsi
...and the computer barfs because sda has just been removed, leaving the system with no valid drives.
Now upon reflection, the way I got into this mess was that the last time a drive failed, it was sda/0-0-0-0, and I did it the old-school way -- stop the computer, remove the dead drive, then boot off of the survivor in its old slot. This left me with a computer that thought that sda was 0-0-1-0. I then obtained and inserted, hot, a replacement, and added it like so:
# echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi
...which worked, which meant I could apply a disk label, partition, and reconstruct my raid arrays. This also meant that the computer thinks that sdb is 0-0-0-0. Now sdb dies again (350 days later, but that's another issue) and I have forgotten all this.
So. Assuming that both my memory and my record-keeping skills are inadequate for reminding me that this has happened, is there a way that in future I can compare the SCSI device names (0-0-$n-0) and associate them with named devices (sd$x)?
Solution
Each SCSI ID has its own directory under /sys/bus/scsi/devices/, and in that directory is a symbolic link named block, which points to the device name:
[root@stargate2 0:0:0:0]# pwd
/sys/bus/scsi/devices/0:0:0:0
[root@stargate2 0:0:0:0]# ls -ld block
lrwxrwxrwx 1 root root 0 Dec 11 14:58 block -> ../../../../../../../../block/sdb
In this example, 0:0:0:0 is actually sdb.
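To check every SCSI ID at once before removing anything, the same lookup can be wrapped in a loop. The following is only a sketch: the function name scsi_map is made up for illustration, and the branch that treats block as a directory is an assumption about newer kernels, not something shown in the listing above.

```shell
#!/bin/sh
# Print "H:C:T:L -> sdX" for every SCSI device that has a block device.
# scsi_map is a hypothetical helper name. The optional first argument
# overrides the sysfs directory to scan.
scsi_map() {
    base="${1:-/sys/bus/scsi/devices}"
    for dev in "$base"/*; do
        link="$dev/block"
        if [ -L "$link" ]; then
            # Layout shown above: "block" is a symlink whose target
            # ends in the device name (e.g. .../block/sdb).
            name=$(basename "$(readlink "$link")")
        elif [ -d "$link" ]; then
            # Assumption: on newer kernels "block" is a real directory
            # that contains an entry named after the device.
            name=$(ls "$link" | head -n 1)
        else
            # No block device behind this entry; skip it.
            continue
        fi
        echo "$(basename "$dev") -> $name"
    done
}
```

Calling scsi_map with no argument reads the live sysfs tree. On the machine in the example above, one of the lines it prints should be 0:0:0:0 -> sdb, which is the check to make before echoing remove-single-device into /proc/scsi/scsi.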