Skip to content
Commit f65aac16 authored by Matt Carlson's avatar Matt Carlson Committed by David S. Miller
Browse files

tg3: Improve small packet performance



smp_mb() inside tg3_tx_avail() is used twice in the normal
tg3_start_xmit() path (see illustration below).  The full memory
barrier is only necessary during race conditions with tx completion.
We can speed up the tx path by replacing smp_mb() in tg3_tx_avail()
with a compiler barrier.  The compiler barrier is to force the
compiler to fetch the tx_prod and tx_cons from memory.

In the race condition between tg3_start_xmit() and tg3_tx(),
we have the following situation:

tg3_start_xmit()                       tg3_tx()
    if (!tg3_tx_avail())
        BUG();

    ...

    if (!tg3_tx_avail())
        netif_tx_stop_queue();         update_tx_index();
        smp_mb();                      smp_mb();
        if (tg3_tx_avail())            if (netif_tx_queue_stopped() &&
            netif_tx_wake_queue();         tg3_tx_avail())

With smp_mb() removed from tg3_tx_avail(), we need to add smp_mb() to
tg3_start_xmit() as shown above to properly order netif_tx_stop_queue()
and tg3_tx_avail() to check the ring index.  If it is not strictly
ordered, the tx queue can be stopped forever.

This improves performance by about 3% with 2 ports running
bi-directional 64-byte packets.

Reviewed-by: default avatarBenjamin Li <benli@broadcom.com>
Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
Signed-off-by: default avatarMatt Carlson <mcarlson@broadcom.com>
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parent 67b284d4
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment