One of my customers called and complained that Wi-Fi wasn’t working in one specific room in one of their buildings. My first question is always: what exactly is “not” working? Does the client get associated? Does it get an IP address? Does it work for 2 minutes and then disconnect? Well, the client didn’t even connect to the Wi-Fi, and the same client worked great elsewhere in the office.
OK, for once it actually did sound like an 802.11 issue. As the site was more than 500 km (300 miles) away from my office, I had the customer add an access-point in the room that I could use for troubleshooting.
The customer had a Cisco wireless setup, and by putting the newly added 3702i access-point in Spectrum Expert Connect (SE-Connect) mode I could get an idea of the layer 1 environment.
For those who don’t know, all Cisco CleanAir access-points allow for remote monitoring of the RF environment. Using the (outdated) Cisco Spectrum Expert program or MetaGeek Chanalyzer with a CleanAir license, you can connect to an access-point using the AP’s IP address and the Network Spectrum Interface Key. In a client-serving mode (Local or FlexConnect) you get spectrum information from the operating channel only, but if the access-point is in SE-Connect mode you get the full 2.4 GHz or 5 GHz spectrum supported by the access-point.
I like to start my troubleshooting by spending just a few minutes on layer 1. (Unfortunately I don’t have any screenshots of the Cisco Spectrum Expert output.)
As the radio frequency spectrum looked fairly normal, I changed the access-point to Sniffer mode.
Again, for those who don’t know, a Cisco access-point can be put into Sniffer mode, where the radio captures frames and sends them to your Wireshark machine in the PEEKREMOTE format.
Looking at the sniffer trace in Wireshark I quickly discovered something odd. Every time the client got associated, the BSSID sent a deauthentication frame.
The client sends an 802.11 Authentication frame and the AP answers with an Authentication frame. Looking good.
The client sends an 802.11 Association Request and the AP answers with an Association Response. Still looking good.
Then the AP broadcasts a Deauthentication frame, and then the client sends a Deauthentication and a Disassociation frame.
And just to make sure, the AP sends a unicast Deauthentication frame to the client.
Why on earth would an access-point behave like this? The reason code for the deauthentication was 2: Previous authentication no longer valid.
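If you want to translate reason codes quickly while reading a trace, a small lookup table helps. The sketch below is only an illustration: it covers reason codes 1 to 8 with abbreviated wording from the IEEE 802.11 standard, and the names REASON_CODES and describe_reason are my own, not part of Wireshark or any Cisco tool.

```python
# Minimal sketch: map the first few 802.11 reason codes to their meaning.
# The table is deliberately short; the standard defines many more codes.
REASON_CODES = {
    1: "Unspecified reason",
    2: "Previous authentication no longer valid",
    3: "Deauthenticated because sending STA is leaving (or has left) the IBSS or ESS",
    4: "Disassociated due to inactivity",
    5: "Disassociated because AP is unable to handle all currently associated STAs",
    6: "Class 2 frame received from nonauthenticated STA",
    7: "Class 3 frame received from nonassociated STA",
    8: "Disassociated because sending STA is leaving (or has left) the BSS",
}


def describe_reason(code: int) -> str:
    """Return the text for a reason code, or a placeholder for codes not listed here."""
    return REASON_CODES.get(code, f"Reason code {code} (not in this short table)")


if __name__ == "__main__":
    # The code seen in the trace described above.
    print(describe_reason(2))
```

Keep in mind that the reason code is simply a field the sender fills in, so it tells you what the sender claims, not necessarily who the sender really is.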
This Wireless LAN Controller had over 500 access-points serving clients, and only this access-point seemed to misbehave. Why would an access-point do that? Rushing into the belief that this was surely a bug, I created a service request with Cisco TAC.
While waiting for TAC to look into my findings, I looked over my Wireshark trace again and again. Finally, I noticed the different signal strengths on the frames.
My sniffer access-point was placed right next to the client and the AP servicing the client, but still the Deauthentication frames were received at -79 dBm, -80 dBm and -81 dBm.
How can an access-point send beacons that are received with a signal of -26 dBm and then send a Deauthentication frame that is received at -81 dBm?
And why are the client’s Deauthentication and Disassociation frames seen at -80/-81 dBm when its Authentication and Association frames are seen at -36 dBm?
Who is moving the client and the access-point? How can an access-point be in 2 different locations at once? Well, it can’t!
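The signal strength comparison above can be automated once the frames are exported from the capture. What follows is a rough sketch, not a finished tool: the Frame class, the MAC addresses and the 20 dB threshold are all assumptions made up for illustration, and only the dBm values come from the trace described here. The idea is to build a baseline signal level per transmitter address from its ordinary frames and flag any Deauthentication or Disassociation frame that deviates far from that baseline.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Frame:
    transmitter: str  # transmitter address as shown in the capture (made-up values below)
    subtype: str      # "beacon", "auth", "assoc_req", "deauth", "disassoc", ...
    rssi_dbm: int     # signal strength reported by the sniffer


TEARDOWN = ("deauth", "disassoc")


def suspicious_frames(frames, threshold_db=20):
    """Flag deauth/disassoc frames whose RSSI is far from the transmitter's baseline.

    threshold_db is an assumed value; tune it for your own environment.
    """
    baseline_samples = {}
    for f in frames:
        if f.subtype not in TEARDOWN:
            baseline_samples.setdefault(f.transmitter, []).append(f.rssi_dbm)

    flagged = []
    for f in frames:
        if f.subtype in TEARDOWN and f.transmitter in baseline_samples:
            baseline = median(baseline_samples[f.transmitter])
            if abs(f.rssi_dbm - baseline) > threshold_db:
                flagged.append((f, baseline))
    return flagged


def power_ratio(db_difference):
    """Convert a difference in dB to a linear power ratio."""
    return 10 ** (db_difference / 10)


AP = "aa:aa:aa:aa:aa:01"      # hypothetical address of the real AP
CLIENT = "cc:cc:cc:cc:cc:02"  # hypothetical address of the client

capture = [
    Frame(AP, "beacon", -26),
    Frame(AP, "beacon", -26),
    Frame(CLIENT, "auth", -36),
    Frame(CLIENT, "assoc_req", -36),
    Frame(AP, "deauth", -79),
    Frame(CLIENT, "deauth", -80),
    Frame(CLIENT, "disassoc", -81),
    Frame(AP, "deauth", -81),
]

if __name__ == "__main__":
    for frame, baseline in suspicious_frames(capture):
        print(f"{frame.subtype} from {frame.transmitter}: "
              f"{frame.rssi_dbm} dBm vs baseline {baseline} dBm")
    print(f"55 dB corresponds to a power ratio of about {power_ratio(55):.0f}")
```

The difference between -26 dBm and -81 dBm is 55 dB, which is a power ratio of more than 300,000 to 1. A stationary transmitter does not vary that much from one frame to the next, so such a gap strongly suggests a second transmitter using the same address.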
The Deauthentication frames were spoofed, sent from a different wireless STA. Looking at the sniffer trace without any filter, I found multiple beacons containing other SSIDs, and some of them were received at around -80 dBm. Looking at the BSSIDs, I found that those beacons were sent by Cisco wireless LAN equipment. And what feature on a Cisco WLC can make an access-point turn evil and deauthenticate and disassociate clients?
The neighbouring building’s wireless infrastructure was containing this corner of the building, treating our access-points as rogues. The rest of the office was too far away for the neighbour’s infrastructure to care.
The customer wanted us to configure something to fix this. We could start looking into Client MFP (Management Frame Protection), but that would be a workaround. The real fix was for the customer to track down the owner of the neighbouring infrastructure and get this stopped.