Revisiting QoS on the Nexus 9k, Part 1: Design Strategy

It’s been over a year since the last post detailing my experience with QoS and TCAM utilization on the Nexus 9k. I’ve recently been working on the replacement of HP 6120xg blade switches with Cisco B22HP Fabric Extenders and am revising our QoS policies.

The initial configuration was completed rapidly to support a greenfield deployment at a new corporate headquarters. After a year of production, it’s time to re-evaluate how our traffic is classified, marked, and queued. This has been a great opportunity to review the Cisco Nexus 9000 Series NX-OS Quality of Service Configuration Guide and learn where mistakes were made in the form of unsupported configuration.

First on the list is FEX queuing. I had originally applied policy maps to the FEX HIF ports which performed classification and marking based on ACLs and DSCP values. This is not supported – reviewing the matrix in the FEX QoS Configuration section of the Cisco guide, it’s clear that the FEX supports the following at the system level only:

Classification of traffic via ‘match cos’
Setting the qos-group based on the CoS classification
Input and output queuing (bandwidth, bandwidth remaining, and priority level 1)

Also note the follwing conditions:

“When configuring end to end queuing from the HIF to the front panel port, the QoS classification policy needs to be applied to both system and HIF. This allows the FEX to queue on ingress appropriately (system) and allows the egress front panel port to queue appropriately (HIF).“
“For VLAN-tagged packets, priority is assigned based on the 802.1p field in the VLAN tag and takes precedence over the assigned internal priority (qos-group). DSCP or IP access-list classification cannot be performed on VLAN-tagged frames.“

OK, good to know. The policy maps which were performing classification and marking based on … well, anything … were basically doing nothing. Time to get the house in order!

To begin, let’s look at the classification and marking strategy. The traffic classes we expect to see (as well as some examples) are show below:

Traffic Class	Examples	DSCP Value
Voice	G.711, G.729	EF
Network Control	Routing Protocols	CS6
Conferencing	Jabber, RDP	AF41
Storage	iSCSI	CS4
Streaming	vMotion, vSphere Management	AF31
Signaling	SCCP, SIP, Jabber Control	CS3
Transactional Data	DB2, MSSQL, SAS	AF21
Ops / Admin Mgmt (OAM)	iLO, Active Directory, DNS, SSH	CS2
Bulk Data	NDMP, CIFS, NFS, FTP	AF11

In this environment, the Network Services team controls both the server and network infrastructure. This helps to reduce the size of classification ACLs by offloading the marking to hosts and virtual machines as follows:

Windows Group Policies have been created to classify and mark as much traffic as possible (Active Directory, CIFS, DNS, and other traffic to/from servers)
Hosts licensed for vSphere Enterprise Plus are running virtual Distributed Switches with classification/marking policies
Cisco CallManager will mark voice and signaling traffic accordingly
Cisco Expressway will mark Jabber and Jabber control as required
Linux hosts will (eventually) be managed by a configuration management tool such as Ansible which allows the use of iptables ‘mangle’ rules to mark traffic

In addition, we don’t need to classify or mark Network Control traffic, as the NXOS QoS Guide clarifies that “Control traffic, such as BPDUs, routing protocol packets, LACP/CDP/BFD, GOLD packets, glean traffic, and management traffic, are automatically classified into a control group based on a criteria. These packets are classified into qos-group 8 and have a strict absolute priority over other traffic. These packets are also given a dedicated buffer pool so that any congestion of data traffic does not affect control traffic. The control qos-group traffic classification cannot be modified.”

Because we’re limited to 4 qos-groups and must classify based on CoS for the FEX, it’s time to identify the traffic we care about. A NetApp is being used for iSCSI storage and will mark traffic to CoS 4 per NetApp Knowledgebase article FA126, which we will assign into its own qos-group. vMotion will be mapped into a separate qos-group and assigned CoS 3. The priority queue on the Nexus 9k in a 4q model is qos-group 3, so that will handle the voice traffic. Everything else will go into the default queue. The PHB to qos-group mapping will be as follows:

PHB	qos-group
EF	3
CS6	Control
AF41	Default
CS4	2
AF31	Default
CS3	1
AF21	Default
CS1	Default
AF11	Default
Default	Default

And the qos-group queuing policy will be created to allocate bandwidth as follows:

qos-group	Bandwidth
3	Priority
2	40% Remaining
1	20% Remaining
default	40% Remaining

Most uplinks are 10GbE and I don’t expect an abundance of voice traffic, so this queuing config appears reasonable for our environment. QoS is just “managed unfairness,” so the dedicated qos-groups for iSCSI and vMotion should be sufficient to keep those classes of traffic flowing during periods of congestion while allowing the default class to consume as much as needed during periods of lower demand.

Now that the basic strategy has been created, it’s time to create the configs and apply to the switches. This will be covered in a follow-up post, so stay tuned!