Attendees:
David Miller
Soyoung Park
Stephen Hemminger
Jeff Kirsher
John Fastabend
Herbert Xu
Eric Biederman
Tom Herbert
Eric Dumazet
Scott Feldman
Rusty Russell
Jamal Hadi Salim
Pablo Neira Ayuso
Anna Schumaker
Florian Fainelli
Hannes Frederic Sowa
Eric W. Biederman
Chris Wright
John Linville
Johannes Berg
- Introductions
- SCTP consolidation with generic infrastructure.
- Lower Arnaldo's generic infrastructure work to SCTP some more.
- Who uses associations?
- Does not use inet hash
synchronize RCU, which is in recent merge w/ dynamic lookup done
Conversion of ehash tables to rhashtables
- eBPF
- 64-bit opcodes, LLVM --> C backend, more than networking
- pushback (code and design reviews needed at the community level)
- load arbitrary pointers (security concern)
- LLVM backend
- useful for tracing
- Pablo Netfilter Workshop report
- http://workshop.netfilter.org/2014/wiki/index.php/List_of_presentations
- Conntrack removal, Jesper
- OVS replacing bridging, shemminger
- 10 Gbit/s wirespeed, Jesper
- DPDK (concerns: sharing IP and port space with normal stack)
+ take another look at netchannels
+ people think userland development is "easier"
- offloads between hardware such as switches, software and vm layers
- nftables, compatibility layer, multidimensional keys - Pablo/Patrick
- OVS w/conntrack - Jesse Gross
- Tom Herbert - Offloading Encapsulations
- non-virtualization (GRE, source routing) vs. virtualization (VXLAN, NVGRE, etc.)
- UDP encapsulation ubiquitous
- Avoiding deep packet inspection for flow steering
- Idea: Set source port to hash of inner packet
- checksum offloading
+ multiple checksums in single packet (IP->UDP->GRE->IP->TCP)
+ Switch vendors want to avoid UDP checksums
+ Receive checksum overhaul
- CHECKSUM_COMPLETE (always works) vs. CHECKSUM_UNNECESSARY (stack allows two levels)
- Most NICs can provide CHECKSUM_UNNECESSARY for UDP packets
- if checksum is non-zero, derive CHECKSUM_COMPLETE when processing UDP packets
- after conversion, any encapsulated checksum is verified using skb->csum
+ TX checksum offloading
- one checksum is easy; two are not supported by the stack or NICs
- Remote Checksum Offload
+ Checksum only outer UDP packet on TX
+ like normal checksum offload except it's deferred to peer
+ Deduce both csums on receive
- 2 or more checksums
+ outer packet and inner transport packet
+ stack and NICs do not support
+ Alternative: Remote Checksum Offload
- GRO after GRE decap
- TSO/GSO
+ Partially generic support
+ works as long as no per-segment values reside in encap header
(such as sequence numbers, packet lengths)
- TSO/LRO to guest driver
+ Tx guest uses TSO interface, host kernel converts to TSO/GSO
+ On Rx, host probably uses GRO, converts to LRO to guest device
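The receive checksum notes above rely on one property of the Internet checksum: if the device supplies a ones-complement sum over the whole packet (the CHECKSUM_COMPLETE idea), the sum over an encapsulated part can be derived by subtracting the sum of the outer header bytes, without touching the payload again. The following is a minimal userspace sketch of that arithmetic only. All function names are hypothetical and are not the kernel's helpers, and the outer header is assumed to have even length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones-complement sum. */
static uint16_t toy_csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones-complement sum of a byte range, read as big-endian 16-bit words.
 * Sketch only: the 32-bit accumulator is sufficient for buffers up to 64 KiB. */
static uint16_t toy_csum_range(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8; /* pad an odd tail with a zero byte */
    return toy_csum_fold(sum);
}

/* Ones-complement subtraction: a - b is a + ~b. */
static uint16_t toy_csum_sub(uint16_t a, uint16_t b)
{
    return toy_csum_fold((uint32_t)a + (uint16_t)~b);
}

/* Given the sum over the complete packet, derive the sum over everything
 * after an outer header of outer_len bytes (assumed even in this sketch). */
static uint16_t toy_csum_pull_outer(uint16_t complete, const uint8_t *pkt,
                                    size_t outer_len)
{
    assert((outer_len & 1) == 0);
    return toy_csum_sub(complete, toy_csum_range(pkt, outer_len));
}
```

Calling `toy_csum_pull_outer()` once per decapsulation step yields the sum needed to validate each inner checksum, which is why a single whole-packet sum covers any number of nested checksums.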
- Johannes Berg - Wireless
- ARP Proxying
+ Power and air time saving
+ Snoops DHCP, ARP, NS/NA frames.
+ Implementation location: generic networking vs. bridge
- L2 Traffic Inspection and Filtering
- Wireless traffic bound for wireless medium currently forwarded
internally by 802.11 layer. Will have to change in order to
implement snooping for things like ARP proxying
+ BR_HAIRPIN_MODE
- GTK protected traffic (L3 unicast in L2 multicast), RFC 1122 broadcast check
+ since all stations share GTK key, any station can send multicast
GTK protected frames to anyone and appear to be the AP.
+ thus frames containing L3 unicast in L2 multicast packets, which
are GTK protected, should be dropped
+ RFC 1122 mandated rules on receive should already disallow this but
seem to be simply not implemented in ip_route_input_slow() yet.
+ Alternatively, parse L3 in wireless stack (or even in iptables rules?)
+ Previous attempt at using an skb bit "drop_unicast" rejected by Eric
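The receive-side rule discussed above (drop a frame that arrived as L2 multicast or broadcast but carries an L3 unicast destination) can be sketched for IPv4 as follows. This is an illustration under stated assumptions, not the code in ip_route_input_slow(): the function names are hypothetical, addresses are in host byte order, and the directed-broadcast test ignores /31 and /32 prefixes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* L2: the group bit (lowest bit of the first octet) marks multicast/broadcast. */
static bool toy_l2_is_group(const uint8_t mac[6])
{
    assert(mac != NULL);
    return (mac[0] & 0x01) != 0;
}

/* L3: true for IPv4 multicast (224.0.0.0/4), limited broadcast, or the
 * directed broadcast address of the receiving interface's subnet. */
static bool toy_ipv4_is_group(uint32_t daddr, uint32_t ifaddr, uint32_t netmask)
{
    if ((daddr & 0xf0000000u) == 0xe0000000u)
        return true;
    if (daddr == 0xffffffffu)
        return true;
    /* on-link and all host bits set */
    if (((daddr ^ ifaddr) & netmask) == 0 && (daddr | netmask) == 0xffffffffu)
        return true;
    return false;
}

/* Drop when the link-layer destination is a group address but the IP
 * destination is neither multicast nor broadcast. */
static bool toy_drop_l3_unicast_in_l2_group(const uint8_t dst_mac[6],
                                            uint32_t daddr, uint32_t ifaddr,
                                            uint32_t netmask)
{
    return toy_l2_is_group(dst_mac) &&
           !toy_ipv4_is_group(daddr, ifaddr, netmask);
}
```

With this check on the receive path, a station that forges a GTK-protected multicast frame containing a unicast IP packet would have that packet discarded before it reaches the transport layer.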
- Jamal Hadi Salim - Network Function Offloading
- Use/support existing tools (nftables/iptables, iproute2, route/ifconfig...)
- Linux APIs, netlink, etc., no vendor APIs/SDKs
- Bridging/switching, QoS, IPSEC, L3 forwarding, stateless ACL
- Capability probing becomes necessary due to disparate set of features
and capacities (TCAMs etc.)
- How vs. network-centric view of the world
- Challenge1: design toward a generic framework that caters to each driver
- Challenge2: limited open source drivers
- Challenge3: unresolved driver support issues get spilled to userspace for the kernel to support, as with the OpenWrt developers' open drivers
- QEMU virtual device coming soon so that prototyping of userspace
is possible
- on-going discussions: https://linux.cumulusnetworks.com/offload-discussion-1/
- David S. Miller - VLAN offload limits
- HW decap stored in SKB
- multiple HW decaps possible?
- --> no
- Want consistent handling of multiple tags
- Bonding
- very modular
- locking is a lot cleaner
- smaller code base
- Where are we with bonding and offloading?
- Eric Dumazet - IP VLAN
- Like MAC VLAN, but decapsulating on IP address
- supports IPv4 and IPv6
- David S. Miller - send batching
- ->ndo_start_xmit() takes one SKB at a time
- Extend to be able to queue multiple SKBs at a time
- Amortize the number of "trigger" events (writes to "TX start"
register, or calls into hypervisor)
- Transition path is important, there are 400+ implementations
of ->ndo_start_xmit()
- Add ->ndo_xmit_flush() op
- If implemented, semantics are that it must be called after
a sequence of ->ndo_start_xmit() invocations. The implication
is that ->ndo_start_xmit() does not kick the TX queue to
start processing the newly queued up SKBs, that's what the
new ->ndo_xmit_flush() operation does.
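The proposed split between queueing and kicking can be modelled in a few lines of userspace C. This is a toy model under stated assumptions, not a driver and not the kernel API: `toy_txq`, `toy_start_xmit()`, `toy_xmit_flush()` and `toy_send_batch()` are hypothetical names, and the "hardware" is assumed to consume every slot instantly when kicked.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 64

struct toy_txq {
    const void *ring[TOY_RING_SIZE];
    unsigned int head;      /* next slot to fill */
    unsigned int kicked;    /* slots already handed to the hardware */
    unsigned int doorbells; /* count of expensive trigger events */
};

/* Queue one packet without kicking. Returns 0, or -1 if the ring is full. */
static int toy_start_xmit(struct toy_txq *q, const void *pkt)
{
    if (q->head - q->kicked >= TOY_RING_SIZE)
        return -1;
    q->ring[q->head % TOY_RING_SIZE] = pkt;
    q->head++;
    return 0;
}

/* One trigger event for everything queued since the last flush. */
static void toy_xmit_flush(struct toy_txq *q)
{
    if (q->kicked == q->head)
        return; /* nothing new, no register write needed */
    q->kicked = q->head;
    q->doorbells++;
}

/* Caller side: a sequence of start_xmit calls followed by a single flush. */
static unsigned int toy_send_batch(struct toy_txq *q,
                                   const void *const *pkts, unsigned int n)
{
    unsigned int sent = 0;

    assert(q != NULL && pkts != NULL);
    while (sent < n && toy_start_xmit(q, pkts[sent]) == 0)
        sent++;
    toy_xmit_flush(q);
    return sent;
}
```

Sending eight packets through `toy_send_batch()` produces one doorbell write instead of eight, which is the amortization the notes describe.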
- Misc
- Batching of "same routing keyed" packets on RX to amortize
routing lookups, perhaps similarly on TX.
- TPACKET_V4, generically exporting RX ring into userspace.
Header will have description of RX queue descriptor format.
Should work on Intel, Mellanox, Solarflare NICs
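The RX batching idea under Misc (amortizing routing lookups across packets that share a routing key) might look roughly like the sketch below. It assumes the key is just the IPv4 destination address and that same-keyed packets arrive back to back. `toy_route_lookup()` is a stand-in with an invented result, and every name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_pkt {
    uint32_t daddr; /* routing key: destination address */
    int out_if;     /* filled in by the batch routine */
};

static unsigned int toy_lookup_calls; /* counts the expensive lookups */

/* Stand-in for a real routing lookup; the result is arbitrary. */
static int toy_route_lookup(uint32_t daddr)
{
    toy_lookup_calls++;
    return (int)(daddr >> 24);
}

/* Route a batch, reusing the previous result while the key is unchanged. */
static void toy_rx_batch(struct toy_pkt *pkts, size_t n)
{
    uint32_t last_key = 0;
    int last_if = -1;
    int have_last = 0;
    size_t i;

    assert(pkts != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!have_last || pkts[i].daddr != last_key) {
            last_key = pkts[i].daddr;
            last_if = toy_route_lookup(last_key);
            have_last = 1;
        }
        pkts[i].out_if = last_if;
    }
}
```

The saving depends entirely on how well packets with the same key cluster within a batch; interleaved flows would need the batch to be sorted or bucketed by key first.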