OK, this is not a complete hard disk recovery method. I’m simply sharing how I managed to fix my M10 hard disk.
I was upgrading JUNOS from version 8 to 9 when I got a bunch of errors and suddenly the hard disk was no longer available. Unfortunately, I did not grab a log when this happened.
When I rebooted the box I was seeing this error on startup:
mfs: /dev/ad1s1b: Device not configured
/dev/ad1s1f: CAN'T CHECK FILE SYSTEM.
/dev/ad1s1f: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
Can't open /dev/ad1s1f: Device not configured
Once the box finally started up from the compact-flash I attempted to run diagnostics on the hard disk but could not:
root> request chassis routing-engine diagnostics hard-disk
Hard disk is absent from the boot list.
Hard disk may have had previous errors, skipping test.
So basically it looks like JUNOS has removed the hard drive from the boot list. How can we stick it back? First you need to log in as root so you get the % prompt. Do NOT go into cli mode yet.
sysctl -w machdep.bootdevs=pcmcia-flash,compact-flash,disk,lan
This tells JUNOS to put the boot order back to the default, including the hard drive.
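If you want to sanity-check a boot list before or after changing it, the idea can be sketched in a few lines of Python. This is only an illustration run off-box: the function names are mine, and the "degraded" list in the usage note is a guess at what a list without the hard disk might look like, not something I captured from the router.

```python
# Default boot order taken from the sysctl command above.
DEFAULT_BOOT_ORDER = "pcmcia-flash,compact-flash,disk,lan"


def has_hard_disk(bootdevs: str) -> bool:
    """Return True if 'disk' is one of the comma-separated boot devices."""
    devices = [d.strip() for d in bootdevs.split(",") if d.strip()]
    return "disk" in devices


def restore_command(bootdevs: str = DEFAULT_BOOT_ORDER) -> str:
    """Build the sysctl command that writes the given boot order."""
    return f"sysctl -w machdep.bootdevs={bootdevs}"


if __name__ == "__main__":
    # Hypothetical degraded list with the hard disk missing (an assumption).
    degraded = "pcmcia-flash,compact-flash,lan"
    if not has_hard_disk(degraded):
        print(restore_command())
```

Splitting on commas rather than doing a plain substring search avoids matching "disk" inside some other device name.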
Now reboot the box and the hard disk should come back into the boot order. This is what I saw on my boot:
[3970/16/63] at ata0-master PIO4
ad1: DMA limited to UDMA33
ad1: 11513MB [23392/16/63] at ata0-slave UDMA33
Mounting root from ufs:/dev/ad0s1a
if_pfe_open: listener socket opened, listening...
Mounted jbase package on /dev/vn0...
So far so good. Once the box is back up go back into root mode and type:
root@% smartd -oe /dev/ad1
With luck your drive is now back up. If not, you can run some extended SMART tests on it. If you’ve lost your partitions, you can repartition the disk by first going into cli mode and then running:
root> request system partition hard-disk
WARNING: The hard disk is about to be partitioned. The contents
WARNING: of /altroot and /altconfig will be saved and restored.
WARNING: All other data is at risk. This is the setup stage, the
WARNING: partition happens during the next reboot.
Setting up to partition the hard disk ...
WARNING: A REBOOT IS REQUIRED TO PARTITION THE HARD DISK. Use the
WARNING: 'request system reboot' command when you are ready to proceed
WARNING: with the partitioning. To abort the partition of the hard disk
WARNING: use the 'request system partition abort' command.

root> request system reboot
Reboot the system ? [yes,no] (no) yes
I had to repartition my disk and reboot. All looks good, with /var back on the hard disk (ad1):
root> show system storage
Filesystem     Size   Used   Avail  Capacity  Mounted on
/dev/ad0s1a    992M    51M    932M        5%  /
devfs           16K    16K      0B      100%  /dev/
/dev/vn0        14M    14M      0B      100%  /packages/mnt/jbase
/dev/vn1        57M    57M      0B      100%  /packages/mnt/jkernel-8.1-20071208.0
/dev/vn2       6.4M   6.4M      0B      100%  /packages/mnt/jpfe-M10-8.1-20071208.0
/dev/vn3       2.9M   2.9M      0B      100%  /packages/mnt/jdocs-8.1-20071208.0
/dev/vn4        20M    20M      0B      100%  /packages/mnt/jroute-8.1-20071208.0
/dev/vn5       8.3M   8.3M      0B      100%  /packages/mnt/jcrypto-8.1-20071208.0
/dev/vn6       9.6M   9.6M      0B      100%  /packages/mnt/jpfe-common-8.1-20071208.0
mfs:260        1.5G   8.0K    1.4G        0%  /tmp
mfs:267        1.5G   248K    1.4G        0%  /mfs
/dev/ad0s1e    189M   5.0K    187M        0%  /config
procfs         4.0K   4.0K      0B      100%  /proc
/dev/ad1s1f    7.7G   6.6M    7.1G        0%  /var
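If you save that output to a file and want to check it off-box, here is a rough Python sketch that pulls out the filesystems living on the hard disk. The function name and the choice of ad1 as the disk name are my own; the sample rows are copied from the output above, and the parsing assumes each data row has exactly six whitespace-separated fields, as they do here.

```python
# A few rows copied from the 'show system storage' output above.
SAMPLE = """\
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1a 992M 51M 932M 5% /
devfs 16K 16K 0B 100% /dev/
/dev/ad0s1e 189M 5.0K 187M 0% /config
/dev/ad1s1f 7.7G 6.6M 7.1G 0% /var
"""


def hard_disk_mounts(text: str, disk: str = "ad1") -> dict:
    """Map mount point -> device for filesystems on the given disk."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six fields; the header splits into seven and is skipped.
        if len(fields) != 6:
            continue
        device, mount_point = fields[0], fields[5]
        if device.startswith(f"/dev/{disk}"):
            mounts[mount_point] = device
    return mounts


if __name__ == "__main__":
    print(hard_disk_mounts(SAMPLE))
```

An empty result would suggest the hard disk is still not mounted, which is the situation I started with.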
After this I went from 8.1 to 9.3 and then to 10.4 and everything looks good :)
I’ve moved my blog from my old provider to a new one. Everything looks fine so far, but you may get things broken here or there. If you see anything broken please let me know via mellow[dot]drifter[at]gmail[this_should_be_a_dot]com
I know the title is quite a mouthful, but I did want to cover all the above in this post. Daniel asked me to check a few things as I have ready access to real switches.
You learn in your studies that layer 2 control packets are ‘special’: control traffic going over the trunk between two switches does not follow the same tagging rules as ordinary traffic. Let’s use Wireshark to see exactly what is going on in a bunch of scenarios. It’ll also give me the opportunity to do a bit of testing with SPAN and RSPAN.
Let’s first set up a span session on the 3750. I will monitor port gi1/0/9 in both directions and send that traffic to gi1/0/24 to be picked up by the laptop.
monitor session 1 source interface Gi1/0/9
monitor session 1 destination interface Gi1/0/24
The first thing I noticed when I plugged in my laptop, however, is that Windows is of course very noisy. My capture was already filling up with stuff that Windows was sending out. So I’ve downloaded an NST ISO, which I’ll boot on the laptop and capture with instead.
So now that I’ve booted up into NST and got Wireshark running, I hardly see anything at all happening between the 2 switches. Where is all the layer 2 control traffic? Well the problem is that control traffic is not automatically replicated to a SPAN port. You need to enable encapsulation replication in order for it to work. Let’s do so:
C3750#conf t
C3750(config)#monitor session 1 destination interface gi1/0/24 encapsulation replicate

C3750#sh monitor session 1
Session 1
---------
Type              : Local Session
Source Ports      :
    Both          : Gi1/0/9
Destination Ports : Gi1/0/24
    Encapsulation : Replicate
          Ingress : Disabled
I can now see lots of control traffic in my Wireshark capture.
Both switches are already connected to each other. By default they’ll create a trunk link and vlan 1 will be the native vlan. I’ll then configure the switches to tag the native vlan and see what happens.
Let’s ensure the native vlan is not currently tagged:
C3750#sh vlan dot1q tag native
dot1q native vlan tagging is disabled
So what do my CDP/STP/DTP control packets look like in Wireshark? Note that I’m running the default mode of STP on the switch for now.
I do see something odd. I am seeing three kinds of STP packet: one with a dot1q tag of 1, one with a dot1q tag of 10, and one with no tag at all. 10 I can understand because I have created vlan 10 and it has a separate STP instance. But why would the main one be tagged with vlan 1 if vlan 1 is the native vlan?
dot1q vlan 1
dot1q vlan 10
no dot1q tag
Both CDP and DTP are currently sent with no vlan tag at all. CDP does carry information about the native vlan in its packets, which is why CDP complains when the native vlan doesn’t match on both ends. But the important thing is that both are untagged.
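For anyone who would rather check tags in a script than by eye in Wireshark, here is a minimal Python sketch of the test being applied to each frame: look at the two bytes after the source MAC, and if they are 0x8100 the frame carries a dot1q tag whose low 12 bits are the vlan ID. The frames built at the bottom are hand-made examples, not bytes from my captures; the source MAC and payload are made up, and the destination is the Cisco multicast address that CDP and DTP are sent to.

```python
import struct
from typing import Optional

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_tag(frame: bytes) -> Optional[int]:
    """Return the 802.1Q vlan ID of an Ethernet frame, or None if untagged."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 12-13 follow the destination and source MAC addresses.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != TPID_8021Q:
        return None
    if len(frame) < 18:
        raise ValueError("frame too short to hold an 802.1Q tag")
    # Tag Control Information: 3 bits priority, 1 bit DEI, 12 bits vlan ID.
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci & 0x0FFF


# Hand-built example frames (hypothetical source MAC and payload).
DST = bytes.fromhex("01000ccccccc")  # Cisco multicast used by CDP and DTP
SRC = bytes.fromhex("001122334455")  # made-up source MAC
PAYLOAD = bytes(38)                  # placeholder payload of zero bytes

untagged = DST + SRC + struct.pack("!H", len(PAYLOAD)) + PAYLOAD
tagged_vlan1 = (DST + SRC + struct.pack("!HH", TPID_8021Q, 1)
                + struct.pack("!H", len(PAYLOAD)) + PAYLOAD)

if __name__ == "__main__":
    print(vlan_tag(untagged), vlan_tag(tagged_vlan1))
```

The mask with 0x0FFF matters because the priority bits share the same two bytes as the vlan ID, so a tagged frame with a non-zero priority would otherwise report the wrong vlan.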
Let’s now tag the native vlan and see what happens.
C3750(config)#vlan dot1q tag native
C3750#sh vlan dot1q tag native
dot1q native vlan tagging is enabled
Interesting. DTP has no tag, but CDP is using a tag of 1.
What happens if I change the native vlan to 10? Well, no need to paste output because it’s exactly the same as the previous example. I.e. vlan 10 is now the native vlan and tagged, but CDP is still using a tag value of 1. STP and DTP remain unchanged.
Now let’s try something else. Let’s keep vlan 10 as the tagged native, but let’s remove vlan 1 from the trunk:
interface GigabitEthernet1/0/9
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 10
 switchport trunk allowed vlan 2-4094
DTP is unchanged. i.e. it’s still sending untagged traffic. CDP however is sending tagged traffic. In vlan 1!
As a quick test I created int vlan 1 on both switches in the same subnet and tried to ping across. I could not. Therefore it looks like Cisco will use vlan 1 tagged to send certain control data, even if vlan 1 is removed from the trunk, but no user data will be allowed on that vlan.
For STP I still have the same 3 outputs: a tagged vlan 1 STP frame, a tagged vlan 10 STP frame, and an untagged STP frame.
One last thing I wanted to test now was RSPAN config. I’ve always been a little confused as to the correct config on a switch that is the RSPAN end-point and is also sending traffic to be monitored. I.e. let’s say that the 3550 above is monitoring traffic on vlan 2 with a destination of remote span vlan 500. The 3750 is the RSPAN end-point, which monitors RSPAN vlan 500 and sends it out to a local port on the switch. What happens if the 3750 is also monitoring vlan 2 on its own ports and sending that out? Do we configure the destination to vlan 500 or straight to a local port?
Let’s configure it like so:
C3750#sh monitor session all
Session 1
---------
Type              : Remote Source Session
Source Ports      :
    Both          : Gi1/0/9
Dest RSPAN VLAN   : 500

Session 2
---------
Type              : Remote Destination Session
Source RSPAN VLAN : 500
Destination Ports : Gi1/0/24
    Encapsulation : Replicate
          Ingress : Disabled
I’ve tested sending the 3750’s own monitored traffic to RSPAN vlan 500 and I don’t see any of it at all. As soon as I change the session to send traffic directly to the local port, it works.
EDIT (05/06/12) – I’ve uploaded my captures to Cloudshark so you can take them apart to do your own research
Untagged native vlan 1
Tagged native vlan 1
Tagged native vlan 10
Tagged native vlan 10 with vlan 1 removed from trunk