This document provides answers to Frequently Asked Questions (FAQ) about Content Services Switch 11000 (CSS) series flow information.
Q. What is a Flow Control Block (FCB) and where does it live?
A. FCB is a term used for two different things. There is a fastpath FCB that is used to map a flow and allow transformations on each frame in the flow to be done entirely by the fastpath microcode running on either the etherPIF (for 10/100 ports) or the XPIF (for Gigabit ports). Fastpath FCBs come out of memory that is local to the chip used for the port. An etherPIF chip contains four 10/100 Ethernet port interfaces, and the memory used for FCBs is shared by all ports on that etherPIF. In the CSS 11150 and CSS 11050 products, ports 1-4 share memory, ports 5-8 share memory, and so on. For XPIF ports, the memory is not shared between ports. If you have uplink 10/100 ports, keep them isolated from any other occupied ports (such as server ports) if you have that option. Obviously, if all ports are populated, you do not get to make this decision.
The flow manager also has FCBs. The flow manager FCBs contain all the information in the fastpath FCB, plus additional information to allow the flow manager to properly manage the flow. There is one FCB used per ingress direction of a flow. Thus, a TCP connection is going to take two FCBs, and a UDP flow is going to take one. UDP flows can be bi-directional, however, they often are not (for example, streaming audio).
Q. What is mapped as flows?
A. Only TCP and UDP traffic are mapped as flows. IPSec traffic that may include embedded TCP or UDP is not mapped as a flow. ICMP is never mapped as a flow; however, the CSS pays attention to ICMP frames so that it can properly associate them with a flow and perform NAT where appropriate. This ensures that the eventual recipient gets the "right" information back. This is particularly important for Path MTU Discovery (PMTUD) support because the CSS has to be aware of which session it is adjusting the TCP MSS for.
Below is the list of port numbers (UDP or TCP) that the CSS does not set up a flow for. If the CSS receives a packet whose source or destination port matches one of the ports below, the frame is simply routed; a flow is not set up.
67, /* BOOTP server */
68, /* BOOTP client */
137, /* NETBIOS name service */
138, /* NETBIOS datagram service */
161, /* SNMP */
162, /* SNMP traps */
520, /* RIP */
8089,/* Inktomi crud */
With 6.10, you can control whether or not the CSS creates an FCB for SNMP and DNS traffic. The command to issue is flow-state.
Traceroute is not quite so straight forward. Different platforms perform traceroute in different manners (UDP, ICMP, and so on), and it is not the port number alone which signifies that something is a traceroute packet.
The CSS does not set up flows for ICMP packets at all. The traceroute check says not to set up a flow if all of the following conditions exist:
Protocol is UDP.
The source port is less than 32769.
The destination port is less than 33434.
The UDP data length is less than 20.
There is a 1-byte sequence number in the UDP portion.
There is a 1-byte original TTL in the UDP portion.
If all of these conditions are true, the CSS does not set up a flow for the packet.
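The numeric parts of the screening test above can be sketched as a small function. The names are illustrative; the actual check lives in the fastpath microcode and additionally inspects the 1-byte sequence number and original TTL in the UDP payload.

```python
UDP_PROTO = 17  # IP protocol number for UDP

def looks_like_traceroute(proto, src_port, dst_port, udp_data_len):
    """Return True when a packet matches the traceroute heuristic,
    in which case the CSS does not set up a flow for it."""
    return (proto == UDP_PROTO
            and src_port < 32769
            and dst_port < 33434
            and udp_data_len < 20)
```

For example, a UDP probe from source port 32768 to destination port 33433 with a 12-byte UDP payload matches the heuristic, while any TCP packet does not.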
Q. How does the flow manager maintain FCB mapping information?
A. The flow manager has an exact copy of the FCB information that is programmed into the fastpath when a flow is mapped. If frames need to be forwarded and the fastpath mapping is not present, the flow manager can still forward the frames. This is possible because, after the fastpath notifies the flow manager that a flow has torn down, the flow structure (and all transform information) is put at the end of the flow free list. While it is in this list, and before it is reclaimed for another flow, it is accessible through a hash table called the spoof list. Any frames that arrive after the flow is eliminated in the fastpath can still be properly forwarded.
Once an FCB has been placed in the free-list, you cannot see it anymore when you issue the flow active-list or show flows commands.
The length of time that this information persists depends on the flow rate because the flow rate determines the amount of time that a free flow is going to take to go from the end of the free list to the beginning. A rough calculation is to take the number of flows (issue the flow statistics command and look for available flows) and divide by two since you have two flow structures per TCP connection. This assumes little or no UDP. Divide this number by the average flow rate per second. That will give you the number of seconds to reclaim the flow. When things are slow, this will be a big number. When you pass a lot of flows per second, this number will be smaller.
To determine how often the cache cycles out entries, use the following equation:
# of FCBs / 2 / flows per second
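Plugged into a short helper, the equation looks like this. The FCB count is taken from the CSS 11150 example in this document; the flow rate is an assumed value.

```python
def spoof_list_lifetime(total_fcbs, flows_per_second):
    """Rough seconds a freed FCB survives on the spoof list before
    it is reclaimed: (# of FCBs / 2) / flows per second.
    Assumes mostly TCP (two FCBs per connection)."""
    return (total_fcbs / 2) / flows_per_second

# 204398 FCBs (CSS 11150) at an assumed 500 flows/sec:
# roughly 204 seconds before a torn-down flow is recycled.
print(spoof_list_lifetime(204398, 500))
```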
Another way to determine how long an entry will remain in the cache is to issue the flow statistics command in debug mode, and make note of the total numbers on each port (on the CS800 this would be per-SFP). Wait a period of time, for example, 60 seconds, and issue the flow statistics command again. Add up the differences in the total number (how many FCBs you cycled through in the last minute). From this number, you can calculate how quickly you will cycle through 204398 FCBs on the CS150.
CS150(debug)# flow stat
Flow Manager Statistics:
                                            Cur    High     Avg
  UDP Flows per second                        0       0       0
  TCP Flows per second                        0       6       0
  Total Flows per second                      0       6       0
  Hits per second                             0       0       0
  Number of Allocated Flows (non-purged)      1
  Number of Free Flows                     4095
  Number of Flow Drops                        0
  Max Number of Flow Control Blocks      204398
  Number of Flows with SynAckPending          1
Accumulated Port Flow Statistics:
  Current Number of Active Flows              1
  Total Flow Accounting Reports received 243385
  Total Out of Sequence Packet Received       0

  Port       CE  Active    Total     Acct  TCP  UDP  Rst   FCBs
  #  1     1f00       0        4        4    0    0    0  15238
  #  2   401f00       0        0        0    0    0    0  15238
  # 12  2c01f00       1   243382   243381    1    0    0  14738
Q. How large is the internal cache (spoof list)?
A. This is dependent on the release (ECO release - it is not going to change for Sustaining builds) and CSS model. For example, a CSS 11150 running R4.0 B3 has 204398 FCBs as seen by issuing the flow statistics in debug mode. This is the size of the internal cache or spoof list. On a CS800, each SFP (6/1, 6/2, 9/1, 9/2) has a separate spoof list. Currently, on the CS800 there are ~128000 FCBs per SFP.
Q. Can I map more flows than a port has memory to support?
A. Yes, you can. This is not normally done because you usually have plenty of memory to support fastpath only flow mappings. If the fastpath runs out of FCBs, the flow is handled by slowpath using the slowpath FCB. Obviously, you do not want to run a large number of flows this way because the system will slow down. In the CSS 11800, you have up to four session processors to do this forwarding. The main thing to remember is that you are not strictly limited by the amount of memory in the fastpath.
Q. How are flow setups load balanced in the CSS 11800?
A. The transport layer source and destination ports are hashed together and used to construct an index that selects which SFP a flow gets mastered on. Because the distribution of these ports is relatively uniform, you see pretty even load balancing. On occasion, you may see one SFP with a much higher flow rate than others. This can be caused by traffic that always uses the same source and destination port numbers (for example, DNS).
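The idea can be sketched as follows. The actual microcode hash is not documented; XOR is a stand-in, and the SFP count is taken from the CS800 example in this document.

```python
NUM_SFPS = 4  # e.g. SFPs 6/1, 6/2, 9/1, 9/2 on a CS800

def select_sfp(src_port, dst_port):
    """Fold the transport-layer ports into an index that picks the
    mastering SFP. Uniform port distribution gives even balancing;
    a fixed port pair (such as DNS, 53 to 53) always lands on the
    same SFP, which can skew the load toward one SFP."""
    return (src_port ^ dst_port) % NUM_SFPS
```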
Q. How are flows torn down?
A. The fastpath will remove a flow structure when it sees a FIN or a RST bit set on an existing TCP connection. The FCB is maintained for a short time in a local cache to cache reverse path ACKs to the FIN/ACKs. Any remaining packets that get to the fastpath after this are sent to the mastering session processor where they can be forwarded or rejected as appropriate.
UDP flows are torn down by a timer process that asks the fastpath periodically (depending on the UDP protocol type) if there has been activity on the flow in a given interval. If there has not been, the flow is torn down. Because of the lag time in querying the fastpath about UDP flows, there are times where a large spike in UDP flows can weigh the box down. Garbage collection starts at this point to accelerate the cleanup, however, the system can slow down a bit because of this.
Flows are also torn down if the CSS does not see a proper three-way handshake for a Layer 2 (L2)-Layer 4 (L4) TCP connection, or does not see a content frame for a Layer 5 (L5) flow, within 15 seconds. This is done via a timer maintained on the flow. For L2-L4 flows, when the timer fires and the flow is active, the flow manager asks the fastpath whether it has seen the ACK associated with the three-way handshake on the client side of the flow. The fastpath keeps track of this information, and a message is sent to the flow manager with a status bit that indicates whether the ACK was seen. If the ACK has not been seen on the flow, the flow manager tears down the flow. For L2-L4 flows, the flow manager also sends a RST frame to the service to tear down the TCP state on the service. For L5 flows, this is not necessary since the connection was never made to the service. For L5 flows, the flow manager does not have to query the fastpath about content frames because it sees all the frames itself before completing the spoofing process on the connection. Flows are also torn down via garbage collection.
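The decision made when the 15-second setup timer fires can be modeled roughly as follows; the parameter names are illustrative, not actual CSS structures.

```python
def should_tear_down(layer, ack_seen, content_seen):
    """Decide teardown when the 15-second setup timer fires.
    L2-L4: tear down (and RST the service) if the fastpath never
    saw the client-side handshake ACK. L5: tear down if no content
    frame arrived; no RST is needed because the connection to the
    service was never made."""
    if layer == 5:
        return not content_seen
    return not ack_seen
```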
Q. How is flow garbage collected?
A. The flow manager has a timer task that wakes up every second. This timer task performs garbage collection in an interval that depends on the total number of flows on either a single port or the session processor as a whole. By default, garbage collection will run every eight seconds. Each time the garbage collection runs, it looks at a number of slots in a hash table of mapped flows. Each flow is checked to see if it is older than a certain number of seconds, which varies depending on protocol. If the flow is older than a given number of seconds, the fastpath is asked if there has been any activity on the flow within a protocol dependent number of seconds. Thus, you would expect to allow more "dead" time for a chat flow than an NFS flow. If the fastpath either cannot find the flow asked about or the flow has been idle for longer than the specified number of seconds, the flow is torn down by the fastpath.
While a protocol-dependent number of idle seconds is allowed for a flow, the actual time it takes to garbage collect a dead flow is that number of seconds plus the time it takes to get through the flow map. This could be as long as four extra minutes, or as short as exactly the timeout interval for a flow of a given protocol type. It is not strictly deterministic because of the way the collection algorithm works.
As the load increases on a given port or the session processor, the number of seconds between garbage collection intervals goes down first to four then to every two seconds. The number of slots looked at also doubles each time you pass a load threshold. The busier you get, the more cleanup efforts are made. Usually, this works pretty well to maintain resources.
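A sketch of this scaling behavior follows. The load thresholds are hypothetical (the document does not give them), and the idle timeouts are the protocol defaults listed later in this document.

```python
# Protocol-dependent idle timeouts (seconds), per the defaults
# given in this document.
IDLE_TIMEOUT = {"tcp": 15, "udp": 5, "nfs": 2, "chat": 180, "http": 8}

def gc_interval(active_flows, low=10000, high=50000):
    """Garbage collection runs every 8 seconds by default, dropping
    to 4 and then 2 seconds as the flow load crosses thresholds.
    The threshold values here are hypothetical placeholders."""
    if active_flows >= high:
        return 2
    if active_flows >= low:
        return 4
    return 8
```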
If the CSS ever runs short of buffers, it immediately tries to garbage collect, and it also takes any buffers queued to a flow (only relevant for content-aware processing) and returns them to the system buffer pool. This causes a retransmission for the client but maintains system integrity. It is viewed as a minor penalty, and it is also very rare.
For example, if the flow is set up in a hash bin that the garbage collector gets to just after the default timeout interval, and there has been no activity within that timeout interval, the flow will be removed. The flow is torn down via garbage collection in as little as 15 seconds.
When a flow is removed, it is placed at the end of the free list. This list contains all allocated flows (used or not) in the system. The flow remains in a list that is hashed via network tuples so that you can forward any frames until such time as the flow is reallocated off the front of the free list. Under these circumstances, you may see a connection be destroyed in as little as 15 + ( (total allocated FCBs / 2) / sustained flow rate per second) seconds. The flow gets torn down rapidly, however, it is around long enough to do useful work till it gets reallocated from the head of the free list.
Q. Can you explain idle and permanently mapped flows?
A. The following are the three basic categories of flows to be concerned with:
The client session is just routed through the CSS and hits no rule. This session can be idle indefinitely. The flow associated with the session may be garbage collected; however, it will be restarted mid-flow when the user sends new data.
The session hits a rule, however, it is only NATed (that is, it hits a Layer 3 (L3)/L4 rule and is not spoofed). In this case, the flow can be restarted mid-session for a period of time. The CSS has a flow for this session with the NAT information. It is garbage collected, but the flow can be restarted when the user sends new data, until the FCB is recycled. The number of FCBs available in the SFM memory is unit dependent. The heavier the load, the less time there is for the flow to be reformed.
The session hits a rule which is L5 or requires spoofing (for example, any rule with balance method domainhash requires spoofing). This session cannot be restarted. Once garbage collected, it cannot be reformed.
When idle flows are garbage collected and torn down, the portmapper (NAT table) entry is also removed. Once this flow information is aged out of the cache, there is currently no way to remap the flow through the portmapper, and the user's session will hang with various long-lived flow applications. Flow manager garbage collection happens after about 15 to 20 seconds, and gets more aggressive under heavy load. The following are two ways to ensure that a flow is not garbage collected:
Configure a permanent port. A user can configure permanent ports by issuing the flow permanent port1 <> command. The downside to configuring permanent ports is that flows that are not terminated normally will never be cleaned up. Some customers have to periodically remove the permanent ports from the CSS to allow garbage collection to occur.
Issue the flow reserve-clean 0 command. This causes the CSS to not garbage collect any flows for ports less than 24 (Telnet, SSH, FTP, and so on; flows that tend to be idle). This has the same downside as configuring a permanent port.
The maximum configurable number of permanent ports used to be four. Some customers did not feel that four permanent ports was enough to satisfy their needs. They requested an enhancement that provided more than four permanent ports and an increase in the flow idle timer on a per TCP port basis. The maximum number of configurable permanent ports was increased to ten in version 4.01/b5s, and to 20 with version 5.00(0.61) (bug CSCdy13359). The idle timers for the following TCP ports were set to 600 seconds:
Also, in version 4.01/9, the TCP timeout for garbage collection was increased to ten minutes for the following additional ports:
The following are general garbage collection default timeout values of interest:
TCP timeout: 15 seconds
UDP timeout: 5 seconds
NFS timeout: 2 seconds (Port 2049)
CHAT timeout: 180 seconds (Port 5050, 5190..5193)
HTTP timeout: 8 seconds (Port 80)
Any time a flow is garbage collected due to age, the CSS will keep a record of it in memory for as long as possible. This is limited by the free memory available and the flow rate in the environment. If a subsequent packet arrives belonging to a flow that is still cached in memory, the CSS can remap it back into the fastpath (resources permitting). The idea is that long-lived flows, such as Telnet, FTP control channels, or database connections, can be safely garbage collected to free up fastpath resources, yet can be mapped back into the fastpath when needed.
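This remap-from-cache behavior can be modeled as a tuple-keyed cache over the free list. This is a minimal sketch of the concept, not the actual CSS data structure.

```python
from collections import OrderedDict

class SpoofList:
    """Freed FCBs stay reachable by their network tuple until they
    are recycled off the front of the free list."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # flow tuple -> FCB, oldest first

    def free(self, flow_tuple, fcb):
        """A flow was torn down: append its FCB to the free list,
        recycling the oldest entry when the list is full."""
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)
        self.cache[flow_tuple] = fcb

    def lookup(self, flow_tuple):
        """A late packet arrived: return the cached FCB so the flow
        can be remapped into the fastpath, or None if recycled."""
        return self.cache.get(flow_tuple)
```

A torn-down flow remains findable (and remappable) until enough later teardowns push it off the front of the list, matching the lifetime estimate given earlier.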