Several months ago the powerpc folks let us know about some limitations of the NAPI framework being hit by the EHEA driver they were writing. It turns out that many devices are beginning to have multiple queues, like EHEA does, and will thus hit these issues too.
Stephen came up with a patch that I've recently been trying to bring back to life and have ready for integration by 2.6.24.
The basic premise is that the NAPI context does not necessarily have a one-to-one relationship with the network device, yet that is the world view imposed by the current network driver interfaces and data structures. So what drivers currently do to "work around" this is allocate dummy network device structures for each of the device's queues, which serve merely as NAPI context cookies and are otherwise unused. It's horrific but it works, and with the current interfaces there is no better solution.
So the patch breaks out the NAPI-specific state into a separate structure, and all the NAPI bits work on this thing instead of (just a) network device. As a first approximation, Stephen's patch put a "struct napi_struct" in the network device so that the drivers could be ported over quickly.
Upon the first posting of my forward port to the current tree, Rusty
suggested that we should just rip out the napi_struct from the generic
network device data structure and put it into the driver privates.
They know how many they will need, and for multi-queue devices
that in-netdev instance will just be wasted space. So I did this
and began converting the drivers.
...a very long day passed...
Several drivers needed major surgery, as they were doing awful things
to keep track of multi-queue state. But those cases were resolved,
and a few drivers, such as the aforementioned EHEA driver, are now
fully converted to multi-NAPI.
The driver interrupt and ->poll() implementations look pretty clean
now:

	static int foo_poll(struct napi_struct *napi, int budget)
	{
		struct foo_private *fp = container_of(napi, struct foo_private, napi);
		struct net_device *dev = fp->dev;
		int rx;

		rx = process_rx_packets(fp, dev);
		process_tx_packets(fp, dev);

		if (rx < budget) {
			spin_lock_irq(&fp->lock);
			foo_enable_interrupts(fp);
			if (rx_packets_available(fp))
				foo_disable_interrupts(fp);
			else
				__netif_rx_complete(dev, napi);
			spin_unlock_irq(&fp->lock);
		}
		return rx;
	}

	static irqreturn_t foo_interrupt(int irq, void *dev_id)
	{
		struct foo_private *fp = dev_id;
		struct net_device *dev = fp->dev;
		unsigned long flags;

		spin_lock_irqsave(&fp->lock, flags);
		if (netif_rx_schedule_prep(dev, &fp->napi)) {
			foo_disable_interrupts(fp);
			__netif_rx_schedule(dev, &fp->napi);
		}
		spin_unlock_irqrestore(&fp->lock, flags);
		return IRQ_HANDLED;
	}

If you are familiar with the existing interfaces, you will recall
that the driver had to manage the queue quotas and other state by
hand in its ->poll() handler, but that is no more. The caller gets
the number of RX packets processed and does all of the quota
management.

What's left for a driver is that it must first shut down
all of its NAPI instances for a network device in its
close handler:

	static int foo_close(struct net_device *dev)
	{
		struct foo_private *fp = netdev_priv(dev);

		napi_disable(&fp->napi);
		...
		return 0;
	}

Next, it must initialize its NAPI instances in its probe
routine:

	netif_napi_init(&fp->napi, foo_poll, 64);

And finally it has to declare the NAPI instance in its
driver private:

	struct foo_private {
		...
		struct napi_struct napi;
		...
	};

That's pretty much it. If you have multiple RX queues, you
define a napi_struct in whatever data structure you use to
manage each queue. That structure also has a backpointer to the
main adapter private software state, which can thus also get
you to the "struct net_device".

There are some issues, such as netpoll, to deal with (it is
currently broken because it has some deep-rooted assumptions
that napi<-->net_device are one-to-one) but it's getting there.
I plan to post V4 later tonight before hitting bed, and perhaps work
on the netpoll issues tomorrow.
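
To make the multiple-RX-queue arrangement described above a bit more concrete, here is a minimal userspace sketch of one way it could be laid out. It is not kernel code: "struct net_device" and "struct napi_struct" are cut-down stand-ins, and foo_adapter, foo_queue, foo_queue_poll, foo_init_queues, FOO_NUM_QUEUES and the "pending" counter are all hypothetical names invented for illustration. The only point is to show how a ->poll() handler that is handed a napi_struct can get back to its own queue with container_of() and from there follow the backpointer to the adapter and the network device.

```c
/*
 * Userspace mock of a multi-queue layout. Every type and function here
 * is a hypothetical stand-in, not the kernel's actual definition.
 */
#include <assert.h>
#include <stddef.h>

#define FOO_NUM_QUEUES 4	/* assumed number of RX queues */

/* Cut-down stand-ins for the kernel structures. */
struct net_device {
	const char *name;
};

struct napi_struct {
	int (*poll)(struct napi_struct *napi, int budget);
	int weight;
};

/* Same idea as the kernel macro: member pointer -> enclosing structure. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct foo_adapter;

/* Per-queue state: one NAPI context plus a backpointer to the adapter. */
struct foo_queue {
	struct foo_adapter *adapter;
	int index;
	int pending;		/* mock count of packets waiting on this queue */
	struct napi_struct napi;
};

/* Main adapter private state, which in turn leads to the net_device. */
struct foo_adapter {
	struct net_device *dev;
	struct foo_queue queues[FOO_NUM_QUEUES];
};

/* Mock ->poll(): consume at most "budget" packets from this queue only. */
static int foo_queue_poll(struct napi_struct *napi, int budget)
{
	struct foo_queue *q = container_of(napi, struct foo_queue, napi);
	struct net_device *dev = q->adapter->dev;
	int rx = q->pending < budget ? q->pending : budget;

	assert(dev != NULL);	/* the backpointer chain must reach the device */
	q->pending -= rx;
	return rx;
}

/* Stand-in for probe-time setup of every per-queue NAPI instance. */
static void foo_init_queues(struct foo_adapter *fa, struct net_device *dev)
{
	int i;

	fa->dev = dev;
	for (i = 0; i < FOO_NUM_QUEUES; i++) {
		fa->queues[i].adapter = fa;
		fa->queues[i].index = i;
		fa->queues[i].pending = 0;
		fa->queues[i].napi.poll = foo_queue_poll;
		fa->queues[i].napi.weight = 64;
	}
}
```

A caller would set up the adapter once with foo_init_queues() and then invoke each queue's napi.poll with a budget. Each call touches only the queue that embeds that particular napi_struct, which is what makes the dummy per-queue network devices unnecessary.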