One of my regular customers reported that some end users were complaining about one-way audio on Skype for Business calls. It only occurred at some sites, not all. The customer has around 20-25 larger sites with wireless service, handled by 5 sets of Cisco 8510 Wireless LAN Controllers.
Just recently (read: four weeks ago) we moved from 18.104.22.168 software to 22.214.171.124 software. The one-way audio was reported for the first time only yesterday, but the time from a user having an issue to it being reported to the helpdesk, and from the helpdesk pushing it on to the Wireless Operations department, can be several weeks. So it could not be ruled out that the software upgrade triggered this.
But then again, if it was the software, why did only some sites have the issue? Furthermore, some users reported that it only happened during some kinds of S4B calls, not all.
Armed with Wireshark for 802.11 frames, Wireshark on a Cisco 3850 access switch connecting the access point, and a standard customer laptop, I started troubleshooting. Everything worked. I could see the packets and frames being marked correctly and forwarded as expected. Then again, no one at this location was complaining about the issue, but at least I had established a baseline.
The next day I went to a site where a user could reproduce the issue. I started all the sniffer traces and, sitting right next to her, called her from my wired laptop. I could hear everything she said, but she got no audio back. Looking at the sniffer trace, it was clear that packets were going back and forth between the WLC and the access points, but the RTP packets downstream towards her only got as far as the access point. At the access point the packets were dropped. I could not believe my eyes. An access point dropping all audio packets downstream.
So what was different between this site and the one where I did the baseline? The lowest data rate. At this site the customer was using an RF profile to set different data rates. The lowest data rate available here was 36 Mbps, whereas at the site where I took the baseline it was 24 Mbps. Not finding any other differences, we decided to make 24 Mbps the lowest mandatory data rate. That did the trick.
After 20 minutes of searching the Cisco bug search engine I found this bug:
voice tagged frames drop at AP radio after upgrade to 8.2 and later (CSCva07307)

Symptom: Random frame drops observed after upgrade to 126.96.36.199. Frames are voice TID.

Conditions: AP on local mode. This is configuration on 188.8.131.52 or higher, with low data rates disabled (6, 9, 12, 24), and traffic is marked as voice.

Workaround: Enable at least one of the low data rates (6, 9, 12, 24).

This is triggered due to radio firmware not finding any of the "voice" data rates available, which have restricted retry counts.
When a packet was marked as a voice packet (DSCP EF) and was going downstream towards the client, the access point's firmware dropped it. The reason: the access point has a hardcoded set of data rates allowed for voice packets (6, 9, 12 or 24 Mbps), and with none of these enabled the packet was dropped.
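The condition from the bug note can be sketched as a simple set check, which is handy when auditing many RF profiles. This is only a rough illustration: the voice rate list is taken from the bug description above, while the function names and the example rate lists (beyond the 36 Mbps and 24 Mbps minimums mentioned earlier) are my own assumptions, not anything the WLC or the access point exposes.

```python
# Rates (in Mbps) that the bug note lists as usable for voice frames.
VOICE_RATES_MBPS = {6, 9, 12, 24}


def usable_voice_rates(enabled_rates_mbps):
    """Return the enabled rates that are also on the voice rate list."""
    return sorted(VOICE_RATES_MBPS & set(enabled_rates_mbps))


def is_exposed_to_bug(enabled_rates_mbps):
    """True when no voice rate is enabled, i.e. the condition in the bug note."""
    return not usable_voice_rates(enabled_rates_mbps)


# Hypothetical RF profiles: only the lowest rate of each is from the story,
# the higher rates are assumed for illustration.
failing_site = [36, 48, 54]
baseline_site = [24, 36, 48, 54]

print(is_exposed_to_bug(failing_site))    # no overlap with the voice rates
print(usable_voice_rates(baseline_site))  # 24 Mbps is still available
```

Running the same check against every RF profile would have flagged the affected sites without having to visit each one.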
Looking in the access-point shows no working active voice rates:
AP*****#sh controllers d1
!
interface Dot11Radio1
Radio AN 5GHz, Base Address remo.vedf.bef0, BBlock version 0.00, Software version *.*.*
Serial number: FOC********
Unused dynamic SQRAM memory: 0x0001F3A8 (124 KB)
Unused dynamic SDRAM memory: 0x00318C08 (3171 KB)
Spectrum FW version: 1.15.4
Number of supported simultaneous BSSID on Dot11Radio1: 16
Carrier Set: Denmark (DK) (-E)
...(SNIP)...
Default Voice Rates: basic-6.0 basic-12.0 basic-24.0
Active Voice Rates: basic-6.0 basic-12.0 basic-24.0
Managment Rates: basic-36.0
But the user still reported that some calls were working fine before the change. After interviewing a couple of end users we found the following:
- Client to Client = failed
- Client to Meeting = failed
- Client to external client = worked
It turns out that calls to external networks (normal PSTN, cell phones, etc.) had worked even before the 24 Mbps data rate fix. Looking at a Wireshark trace of one of these calls quickly explained why: QoS was misconfigured for external calls. RTP packets from the voice gateway handling external calls were marked as best effort and therefore did not trigger the bug.
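To make the difference between the two kinds of calls concrete, here is a small sketch of how the DSCP value is read out of the IP header's ToS/Traffic Class byte, the same field Wireshark decodes. The DSCP is the upper six bits of that byte; EF is 46 and best effort is 0. The function names are hypothetical and only meant for illustration.

```python
DSCP_EF = 46           # Expedited Forwarding, the usual marking for voice
DSCP_BEST_EFFORT = 0   # default marking


def dscp_from_tos(tos_byte):
    """Extract the 6-bit DSCP from the 8-bit ToS / Traffic Class byte."""
    return (tos_byte & 0xFF) >> 2


def is_marked_ef(tos_byte):
    """True when the packet carries DSCP EF, i.e. is marked as voice."""
    return dscp_from_tos(tos_byte) == DSCP_EF


# 0xB8 is the ToS byte for DSCP EF, 0x00 is best effort.
internal_call_tos = 0xB8
external_call_tos = 0x00

print(is_marked_ef(internal_call_tos))  # marked as voice
print(is_marked_ef(external_call_tos))  # best effort, not marked as voice
```

Packets like the second one were never treated as voice by the access point, which is why the external calls kept working.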