Hacking and Other Thoughts

Mon, 30 Aug 2010

How GRO works

All modern device drivers should be doing two things, first they should use NAPI for interrupt mitigation plus simpler mutual exclusion (all RX code paths run in software interrupt context just like TX), and use the GRO NAPI interfaces for feeding packets into the network stack.

Like just about anything else in the networking, GRO is all about increasing performance. The idea is that we can accumulate consequetive packets (based upon protocol specific sequence number checks etc.) into one huge packet. Then process the whole group as one packet object. (in Network Algorithmics this would be principle P2c, Shift computation in time, Share expenses, batch)

GRO help significantly on everyday systems, but it helps even more strongly on machines making use of virtualization since bridging streams of packets is very common and GRO batching decreases the number of switching operations.

Each NAPI instance maintains a list of GRO packets we are trying to accumulate to, called napi->gro_list. The GRO layer dispatches to the network layer protocol that the packet is for. Each network layer that supports GRO implements both a ptype->gro_receive and a ptype->gro_complete method.

->gro_receive attempts to match the incoming skb with ones that have already been queued onto the ->gro_list At this time, the IP and TCP headers are popped from the front of the packets (from GRO's perspective, that actual normal skb packet header pointers are left alone). Also, the GRO'ability state of all packets in the GRO list and the new incoming SKB are updated.

Once we've committed to receiving a GRO skb, we invoke the ->gro_complete method. It is at this point that we make the collection of individual packets look truly like one huge one. Checksums are updated, as are various private GSO state flags in the head 'skb' given to the network stack.

We do not try to accumulate GRO packets infinitely. At the end of a NAPI poll quantum, we force flush the GRO packet list.

For ipv4 TCP there are various criteria for GRO matching.

Certain events cause the current GRO bunch to get flushed out. For example:

The most important attribute of GRO is that it preserves the received packets in their entirety, such that if we don't actually receive the packets locally (for example we want to bridge or route them) they can be perfectly and accurately reconstituted to the transmit path. This is because none of the packet headers are modified (they are entirely preserved) and since GRO requires completely regular packet streams for merging, the packet boundary points are known precisely as well. The GRO merged packet can be completely unraveled and it will mimmick exactly the incoming packet sequence.

GRO mainly the work of Herbert Xu. Various driver authors and others helped him tune and optimize the implementation.