More rain, more busy trains, and a quicker arrival at the conference venue since everyone knew the exact location this time.
The day started with Herbert Xu discussing GSO and his CRYPTO layer work. GSO helps a lot with the direct communication paths between nodes in a Xen system. Many useful results were obtained indirectly from the GSO work. For example, we got TCP/ECN and IPv6 TSO support essentially for free. I covered the GSO work in more detail in a previous blog entry.
In the Crypto layer Herbert is working on consolidation of the algorithm implementation interfaces. We have many ways to do things and the interfaces were starting to get out of control. The templating mechanism he will use to put algorithm instances together seems very interesting.
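To make the template idea a bit more concrete, here is a minimal Python sketch. It is not Herbert's kernel code, only an illustration of the pattern: one generic "hmac" template that wraps any hash implementation and yields a new algorithm instance, in the spirit of composite names such as hmac(sha1). The name make_hmac is hypothetical, and the result is cross-checked against Python's standard hmac module.

```python
import hashlib
import hmac as std_hmac


def make_hmac(hash_name):
    """Template: build an HMAC instance on top of any hashlib algorithm."""
    block_size = hashlib.new(hash_name).block_size

    def mac(key, message):
        # Keys longer than one block are hashed first, then zero-padded.
        if len(key) > block_size:
            key = hashlib.new(hash_name, key).digest()
        key = key.ljust(block_size, b"\x00")
        inner_pad = bytes(b ^ 0x36 for b in key)
        outer_pad = bytes(b ^ 0x5C for b in key)
        inner = hashlib.new(hash_name, inner_pad + message).digest()
        return hashlib.new(hash_name, outer_pad + inner).digest()

    return mac


# One template, several algorithm instances.
hmac_sha1 = make_hmac("sha1")
hmac_sha256 = make_hmac("sha256")

# Cross-check against the standard library implementation.
assert hmac_sha1(b"key", b"data") == std_hmac.new(b"key", b"data", "sha1").digest()
```

The point of the pattern is that the template is written once and each underlying algorithm only has to expose a single, uniform interface.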
Another goal of his crypto layer work is async crypto. Once we support hardware crypto offload devices, it will make sense to provide a user API to the crypto layer in the kernel. Since we currently have only software implementations, a userspace API to it makes no sense: userspace can run the same code and do it more efficiently by not having to call into the kernel.
James Morris gave his second set of presentations on OLPC, SELINUX MAC, and IPSEC support. The OLPC stuff was very interesting, because of the incredible constraints these systems must work under and the extreme environments in which the laptops might be used. These laptops will run over L2 meshed wireless networks, usually with a concentrator access point of some sort. The concentrator nodes can be connected to the internet or just provide local content in more rural areas.
Next the SELINUX netfilter modules were discussed, such as SECMARK. These can be used to mark packets based upon SELINUX rules. As well as being powerful, these facilities have the potential to simplify security configurations, which is one of the biggest hurdles in SELINUX currently.
Jamal Hadi Salim discussed some very interesting issues wrt. qdisc locking on transmit. Using some amazingly constructed animated figures, he showed the problem with two CPUs running in lockstep in the transmit qdisc path. If two CPU threads of control enter the qdisc insert, and the second gets into the insert while the first one gets into the device transmit, they can end up in lockstep, both trying to run the queue over and over when only one CPU needs to do the queue running.
Herbert gave a patch which fixed this up so that the first CPU that gets into the qdisc marks it "occupied", so that other CPUs will only add packets to the qdisc and not try to dequeue and transmit. A further enhancement Jamal made was the ability to dequeue multiple packets from the qdisc when a full TX queue of a device is restarted. It is commonly the case that on device TX queue wakeup many free slots are available, not just one, so this could help a lot.
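The "occupied" mark and the batched dequeue can be sketched as a small single-threaded Python model. This is only an illustration of the decision logic, not the actual patch: the class and function names are made up, and a real implementation would need an atomic test-and-set rather than a plain boolean.

```python
from collections import deque


class Qdisc:
    """Toy queue discipline; all names here are illustrative, not kernel APIs."""

    def __init__(self):
        self.queue = deque()
        self.running = False  # the "occupied" mark

    def enqueue(self, packet):
        self.queue.append(packet)

    def try_become_runner(self):
        # In the kernel this would be an atomic test-and-set; this
        # single-threaded sketch only models the decision.
        if self.running:
            return False
        self.running = True
        return True

    def run(self, free_tx_slots):
        """Dequeue up to free_tx_slots packets in one pass, then release."""
        sent = []
        while self.queue and len(sent) < free_tx_slots:
            sent.append(self.queue.popleft())
        self.running = False
        return sent


def xmit(qdisc, packet):
    """What each CPU does on transmit: always enqueue, run only if first."""
    qdisc.enqueue(packet)
    return qdisc.try_become_runner()


q = Qdisc()
first = xmit(q, "pkt-A")       # CPU 0 becomes the queue runner
second = xmit(q, "pkt-B")      # CPU 1 only enqueues and returns
sent = q.run(free_tx_slots=4)  # CPU 0 drains both packets in one pass
```

Here the second caller never touches the device transmit path, and the runner fills as many free TX slots as it can instead of sending one packet per wakeup.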
Thomas Graf gave a presentation on his netlink work, cleaning things up, and preparing the userland libnl for a long-awaited 1.0 release. He also went on to discuss an idea for route marking. You could mark a route in two ways: 1) during normal route installation 2) by interface address, since the kernel adds default routes for new interface addresses (for the network and broadcast). Then, an application can use a new socket option to specify which route marks lookups for that socket may use. He also described a way to say in which context a route is allowed to be used, so that a route could, for example, be valid only on "INPUT" lookups.
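As a rough sketch of how marked and context-restricted routes might behave in a lookup, here is a toy Python model. Everything in it is an assumption made for illustration: the Route fields, the idea that unmarked routes carry mark 0, and the lookup function are hypothetical and do not reflect any actual kernel interface or Thomas's design details.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    device: str
    mark: int = 0                  # assumption: unmarked routes carry mark 0
    context: Optional[str] = None  # None means usable in any context


def lookup(routes, dst, socket_marks, context):
    """Longest-prefix match over routes the socket is allowed to use."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        r for r in routes
        if r.mark in socket_marks
        and r.context in (None, context)
        and addr in ipaddress.ip_network(r.prefix)
    ]
    return max(
        candidates,
        key=lambda r: ipaddress.ip_network(r.prefix).prefixlen,
        default=None,
    )


routes = [
    Route("10.0.0.0/8", "eth0"),
    Route("10.1.0.0/16", "eth1", mark=7),
    Route("10.1.2.0/24", "eth2", mark=7, context="INPUT"),
]

plain = lookup(routes, "10.1.2.3", {0}, "OUTPUT")         # only unmarked routes
marked = lookup(routes, "10.1.2.3", {0, 7}, "OUTPUT")     # mark 7 also allowed
inbound = lookup(routes, "10.1.2.3", {0, 7}, "INPUT")     # INPUT-only route usable
```

The same destination resolves to three different devices depending on which marks the socket may use and in which context the lookup happens.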
Next he described a way to do route matching based upon branching rules with a "GOTO" operation. Only forward GOTOs would be allowed in the tree in order to prevent loops, but this would allow matching on many key components and a high level of rule sharing.
One nice consequence of Thomas's work is that we can force packets for a local address to still go over a real ethernet device. Some people do this for testing and the current kernel cannot do this. A very ugly patch keeps getting submitted which allows this but breaks a lot of other things. Thomas's ideas allow this to be supported cleanly.
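The forward-only GOTO restriction is easy to see in a small Python sketch. The rule format below (dictionaries with "match", "action", "target" and "table" keys) is invented for this example and is not the proposed kernel or netlink representation.

```python
def validate(rules):
    """Reject any GOTO that does not jump strictly forward."""
    for index, rule in enumerate(rules):
        if rule["action"] == "goto":
            target = rule["target"]
            if not index < target < len(rules):
                raise ValueError(f"rule {index}: goto {target} is not a forward jump")


def evaluate(rules, packet):
    """Walk the rules; the index only ever increases, so this terminates."""
    index = 0
    while index < len(rules):
        rule = rules[index]
        if all(packet.get(key) == value for key, value in rule["match"].items()):
            if rule["action"] == "goto":
                index = rule["target"]
                continue
            return rule["table"]
        index += 1
    return None


rules = [
    {"match": {"iif": "eth0"}, "action": "goto", "target": 2},
    {"match": {}, "action": "lookup", "table": "main"},
    {"match": {"tos": 8}, "action": "lookup", "table": "fast"},
    {"match": {}, "action": "lookup", "table": "eth0_default"},
]
validate(rules)

fast = evaluate(rules, {"iif": "eth0", "tos": 8})
fallback = evaluate(rules, {"iif": "eth0", "tos": 0})
other = evaluate(rules, {"iif": "eth1", "tos": 8})
```

Because every jump moves strictly forward, a lookup visits each rule at most once, and the rules after a GOTO target are shared by every branch that jumps there.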
Junji TAMATSUKURI discussed his work on the Internet2 land speed record testing using Linux's TCP stack. His graphs were very informative and we learned about some strange effects caused by SONET framing and ACKs. Two consecutive ACKs can be packed into a single SONET frame, so on receive the ACKs arrive back-to-back instead of properly spaced apart.
He had many impressive graphs showing that some serious performance problems have somehow been introduced recently over these long latency connections, especially with IPv6. I anticipate that we will work with these people a lot in the future, as this is very much an area of technology to improve upon.
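The ACK compression effect can be illustrated with a deliberately crude Python model. It assumes, purely for illustration, that an ACK is delivered only when the SONET frame carrying it completes; real packet-over-SONET behavior is more involved, and the send times below are made up. The 125 microsecond figure is the standard SONET frame period (8000 frames per second).

```python
import math

SONET_FRAME_US = 125  # SONET sends 8000 frames per second


def arrival_times(send_times_us, frame_us=SONET_FRAME_US):
    """Toy assumption: each ACK is delivered when its frame completes."""
    return [math.ceil(t / frame_us) * frame_us for t in send_times_us]


def gaps(times):
    """Spacing between consecutive events."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


sent = [10, 60, 110, 160]       # ACKs leave evenly spaced, 50 us apart
received = arrival_times(sent)  # the first three share one frame

sent_gaps = gaps(sent)
received_gaps = gaps(received)
```

In this model the evenly spaced ACKs show up as a burst with zero spacing followed by a long gap, which is the kind of pattern that disturbs ACK clocking at the sender.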
Finally, we ended with a guest speaker presentation from Paul Moore of HP. He discussed his implementation of NetLabel on Linux, which recently got integrated into the net-2.6.19 tree. Most interesting to me was when the SELINUX hooks execute to decide how to label packets sent on a socket, and how this all ties together. He also discussed future directions, including a potential RIPSO implementation, and how compliant the current CIPSO work is with the draft specification (it was never fully standardized even though it is an older technology).
Another enjoyable day of netconf!