summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2010-10-04Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/Kconfig net/ipv4/tcp_timer.c
2010-10-03net: introduce DST_NOCACHE flagEric Dumazet
While doing stress tests with IP route cache disabled, and multi queue devices, I noticed a very high contention on one rwlock used in neighbour code. When many cpus are trying to send frames (possibly using a high performance multiqueue device) to the same neighbour, they fight for the neigh->lock rwlock in order to call neigh_hh_init(), and fight on hh->hh_refcnt (a pair of atomic_inc/atomic_dec_and_test()) But we dont need to call neigh_hh_init() for dst that are used only once. It costs four atomic operations at least, on two contended cache lines, plus the high contention on neigh->lock rwlock. Introduce a new dst flag, DST_NOCACHE, that is set when dst was not inserted in route cache. With the stress test bench, sending 160000000 frames on one neighbour, results are : Before patch: real 2m28.406s user 0m11.781s sys 36m17.964s After patch: real 1m26.532s user 0m12.185s sys 20m3.903s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03sctp: Fix break indentation in sctp_ioctl().David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03be2net: add multiple RX queue supportSathya Perla
This patch adds multiple RX queue support to be2net. There are upto 4 extra rx-queues per port into which TCP/UDP traffic can be hashed into. Some of the ethtool stats are now displayed on a per queue basis. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2010-10-03qeth: tagging with VLAN-ID 0Ursula Braun
This patch adapts qeth to handle tagged frames with VLAN-ID 0 and with or without priority information in the tag. It enables qeth to receive priority-tagged frames on a base interface, for example from z/OS, without configuring an additional VLAN interface. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03cxgb4: remove a bogus PCI function number checkDimitris Michailidis
Remove a bogus PCI function number check from the driver's .remove method that causes pci_release_regions not to be called for function 0 if additional functions are attached and one of them is used as primary. Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03drivers/atm/idt77252.c: Remove unnecessary error checkJulia Lawall
This code does not call deinit_card(card); in an error case, as done in other error-handling code in the same function. But actually, the called function init_sram can only return 0, so there is no need for the error check at all. init_sram is also given a void return type, and its single return statement at the end of the function is dropped. A simplified version of the sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ @r@ statement S1,S2,S3; constant C1,C2,C3; @@ *if (...) {... S1 return -C1;} ... *if (...) {... when != S1 return -C2;} ... *if (...) {... S1 return -C3;} // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03drivers-net-tulip-de4x5c-fix-copy-length-in-de4x5_ioctl-checkpatch-fixesAndrew Morton
ERROR: trailing statements should be on next line #23: FILE: drivers/net/tulip/de4x5.c:5477: + if (copy_to_user(ioc->data, tmp.lval, ioc->len)) return -EFAULT; total: 1 errors, 0 warnings, 8 lines checked ./patches/drivers-net-tulip-de4x5c-fix-copy-length-in-de4x5_ioctl.patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Dan Rosenberg <dan.j.rosenberg@gmail.com> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03sctp: Fix out-of-bounds reading in sctp_asoc_get_hmac()Dan Rosenberg
The sctp_asoc_get_hmac() function iterates through a peer's hmac_ids array and attempts to ensure that only a supported hmac entry is returned. The current code fails to do this properly - if the last id in the array is out of range (greater than SCTP_AUTH_HMAC_ID_MAX), the id integer remains set after exiting the loop, and the address of an out-of-bounds entry will be returned and subsequently used in the parent function, causing potentially ugly memory corruption. This patch resets the id integer to 0 on encountering an invalid id so that NULL will be returned after finishing the loop if no valid ids are found. Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03sctp: prevent reading out-of-bounds memoryDan Rosenberg
Two user-controlled allocations in SCTP are subsequently dereferenced as sockaddr structs, without checking if the dereferenced struct members fall beyond the end of the allocated chunk. There doesn't appear to be any information leakage here based on how these members are used and additional checking, but it's still worth fixing. [akpm@linux-foundation.org: remove unfashionable newlines, fix gmail tab->space conversion] Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03ipv4: correct IGMP behavior on v3 query during v2-compatibility modeDavid Stevens
A recent patch to allow IGMPv2 responses to IGMPv3 queries bypasses length checks for valid query lengths, incorrectly resets the v2_seen timer, and does not support IGMPv1. The following patch responds with a v2 report as required by IGMPv2 while correcting the other problems introduced by the patch. Signed-Off-By: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03ipmr: cleanupsEric Dumazet
Various code style cleanups Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03ipmr: RCU protection for mfc_cache_arrayEric Dumazet
Use RCU & RTNL protection for mfc_cache_array[] ipmr_cache_find() is called under rcu_read_lock(); Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03ipmr: RCU conversion of mroute_skEric Dumazet
Use RCU and RTNL to protect (struct mr_table)->mroute_sk Readers use RCU, writers use RTNL. ip_ra_control() already use an RCU grace period before ip_ra_destroy_rcu(), so we dont need synchronize_rcu() in mrtsock_destruct() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03ipmr: __pim_rcv() is called under rcu_read_lockEric Dumazet
No need to get a reference on reg_dev and release it, we are in a rcu_read_lock() protected section. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03gre: protocol table can be staticstephen hemminger
This table is only used in gre.c Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03netdev: Depend on INET before selecting INET_LROBen Hutchings
Since 'select' ignores dependencies, drivers that select INET_LRO must depend on INET. This fixes the broken configuration reported in <http://article.gmane.org/gmane.linux.kernel/825646>. Reported-by: Subrata Modak <subrata@linux.vnet.ibm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03Revert "ipv4: Make INET_LRO a bool instead of tristate."Ben Hutchings
This reverts commit e81963b180ac502fda0326edf059b1e29cdef1a2. LRO is now deprecated in favour of GRO, and only a few drivers use it, so it is desirable to build it as a module in distribution kernels. The original change to prevent building it as a module was made in an attempt to avoid the case where some dependents are set to y and some to m, and INET_LRO can be set to m rather than y. However, the Kconfig system will reliably set INET_LRO=y in this case. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03net: Fix the condition passed to sk_wait_event()Nagendra Tomar
This patch fixes the condition (3rd arg) passed to sk_wait_event() in sk_stream_wait_memory(). The incorrect check in sk_stream_wait_memory() causes the following soft lockup in tcp_sendmsg() when the global tcp memory pool has exhausted. >>> snip <<< localhost kernel: BUG: soft lockup - CPU#3 stuck for 11s! [sshd:6429] localhost kernel: CPU 3: localhost kernel: RIP: 0010:[sk_stream_wait_memory+0xcd/0x200] [sk_stream_wait_memory+0xcd/0x200] sk_stream_wait_memory+0xcd/0x200 localhost kernel: localhost kernel: Call Trace: localhost kernel: [sk_stream_wait_memory+0x1b1/0x200] sk_stream_wait_memory+0x1b1/0x200 localhost kernel: [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40 localhost kernel: [ipv6:tcp_sendmsg+0x6e6/0xe90] tcp_sendmsg+0x6e6/0xce0 localhost kernel: [sock_aio_write+0x126/0x140] sock_aio_write+0x126/0x140 localhost kernel: [xfs:do_sync_write+0xf1/0x130] do_sync_write+0xf1/0x130 localhost kernel: [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40 localhost kernel: [hrtimer_start+0xe3/0x170] hrtimer_start+0xe3/0x170 localhost kernel: [vfs_write+0x185/0x190] vfs_write+0x185/0x190 localhost kernel: [sys_write+0x50/0x90] sys_write+0x50/0x90 localhost kernel: [system_call+0x7e/0x83] system_call+0x7e/0x83 >>> snip <<< What is happening is, that the sk_wait_event() condition passed from sk_stream_wait_memory() evaluates to true for the case of tcp global memory exhaustion. This is because both sk_stream_memory_free() and vm_wait are true which causes sk_wait_event() to *not* call schedule_timeout(). Hence sk_stream_wait_memory() returns immediately to the caller w/o sleeping. This causes the caller to again try allocation, which again fails and again calls sk_stream_wait_memory(), and so on. [ Bug introduced by commit c1cbe4b7ad0bc4b1d98ea708a3fecb7362aa4088 ("[NET]: Avoid atomic xchg() for non-error case") -DaveM ] Signed-off-by: Nagendra Singh Tomar <tomer_iisc@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03net: Fix IPv6 PMTU disc. w/ asymmetric routesMaciej Żenczykowski
Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
2010-10-01enic: Update MAINTAINERSVasanthy Kolluri
Update MAINTAINERS list Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01enic: Make local functions staticVasanthy Kolluri
Make functions used locally in a file as static Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01enic: Remove dead codeVasanthy Kolluri
Removed code that is unused Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01neigh: reorder fields in struct neighbourEric Dumazet
On 64bit arches, there are two 32bit holes that we can remove. sizeof(struct neighbour) shrinks from 0xf8 to 0xf0 bytes Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01isdn/gigaset: improve bas_gigaset USB error reportingTilman Schmidt
Rephrase some USB error messages to make them clearer and more consistent. Downgrade some warning messages that may occur during normal operation to debug messages. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01isdn/gigaset: fix bas_gigaset interrupt read error handlingTilman Schmidt
Rework the handling of USB errors in interrupt input reads to clear halts correctly, delay URB resubmission after errors, limit retries, and improve error recovery. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01isdn/gigaset: unclog bas_gigaset AT response pipeTilman Schmidt
Recover from a lost HD_RECEIVEATDATA_ACK message by sending a zero-length HD_READ_ATMESSAGE command when ev_layer sends "+++". Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01isdn/gigaset: try USB reset for bas_gigaset error recoveryTilman Schmidt
In error_reset(), if sending HD_RESET_INTERRUPT_PIPE to the device fails, try performing an USB reset. Also correct an error in the leading comment. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01isdn/gigaset: bas_gigaset timer cleanupTilman Schmidt
Use setup_timer() and mod_timer() instead of direct assignment to timer structure members, simplify the argument of one timer routine, and make extra sure all timers are stopped during suspend. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01isdn/gigaset: drop obsolete debug optionTilman Schmidt
Remove the debug flag DEBUG_DRIVER and associated code. It doesn't serve any useful purpose anymore. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01isdn/gigaset: correct bas_gigaset rx buffer handlingTilman Schmidt
In transparent data reception, avoid a NULL pointer dereference in case an skbuff cannot be allocated, remove an inappropriate call to the HDLC flush routine, and correct the accounting of received bytes for continued buffers. Signed-off-by: Tilman Schmidt <tilman@imap.cc> CC: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01isdn/gigaset: fix bas_gigaset AT read error handlingTilman Schmidt
Rework the handling of USB errors in AT response reads to fix a possible infinite retry loop and a memory leak, and silence a few overly verbose kernel messages. Signed-off-by: Tilman Schmidt <tilman@imap.cc> CC: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01isdn/gigaset: bas_gigaset locking fixTilman Schmidt
Unlock cs->lock before calling error_hangup() which is marked "cs->lock must not be held". Signed-off-by: Tilman Schmidt <tilman@imap.cc> CC: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01tg3: Update version to 3.114Matt Carlson
This patch updates the tg3 version to 3.114. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01tg3: Add extend rx ring sizes for 5717 and 5719Matt Carlson
This patch increases the rx ring sizes for those asic revs that support them. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01tg3: Prepare for larger rx ring sizesMatt Carlson
This patch adds two new variables to track the size of the standard and jumbo rx producer ring sizes. The code is then pivoted to these variables from preprocessor constants. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01tg3: Futureproof the loopback testMatt Carlson
There are other multiqueue modes 5717 and 5719 devices can assume. This patch makes sure that the loopback test is safe, should those other modes be enabled in the future. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01tg3: Cleanup missing VPD partno sectionMatt Carlson
This patch cleans up the default VPD partno section. New entries for 5717 asic rev devices were also added. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01tg3: Remove 5724 device IDMatt Carlson
This product was never released to the public. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01tg3: 5719: Prevent tx data corruptionMatt Carlson
This patch enables a bit that prevents read DMA overflows and adjusts the txmbuf margin from the hardware default. The combination of these modifications prevents a tx data corruption issue we were seeing on the 5719. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01tg3: Fix potential netpoll crashMatt Carlson
Up until now the tg3 driver would call netif_napi_add() for the maximum number of NAPI instances the driver could use. The problem is that netpoll could call tg3_poll() on instances that are not active. The net effect is that the driver will crash attempting to dereference uninitialized pointers. The fix is to only allocate as many NAPI instances as the driver would use in tg3_open() and deleted them in tg3_close(). Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30ipv4: rcu conversion in ip_route_output_slowEric Dumazet
ip_route_output_slow() is enclosed in an rcu_read_lock() protected section, so that no references are taken/released on device, thanks to __ip_dev_find() & dev_get_by_index_rcu() Tested with ip route cache disabled, and a stress test : Before patch: elapsed time : real 1m38.347s user 0m11.909s sys 23m51.501s Profile: 13788.00 22.7% ip_route_output_slow [kernel] 7875.00 13.0% dst_destroy [kernel] 3925.00 6.5% fib_semantic_match [kernel] 3144.00 5.2% fib_rules_lookup [kernel] 3061.00 5.0% dst_alloc [kernel] 2276.00 3.7% rt_set_nexthop [kernel] 1762.00 2.9% fib_table_lookup [kernel] 1538.00 2.5% _raw_read_lock [kernel] 1358.00 2.2% ip_output [kernel] After patch: real 1m28.808s user 0m13.245s sys 20m37.293s 10950.00 17.2% ip_route_output_slow [kernel] 10726.00 16.9% dst_destroy [kernel] 5170.00 8.1% fib_semantic_match [kernel] 3937.00 6.2% dst_alloc [kernel] 3635.00 5.7% rt_set_nexthop [kernel] 2900.00 4.6% fib_rules_lookup [kernel] 2240.00 3.5% fib_table_lookup [kernel] 1427.00 2.2% _raw_read_lock [kernel] 1157.00 1.8% kmem_cache_alloc [kernel] Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30ipv4: introduce __ip_dev_find()Eric Dumazet
ip_dev_find(net, addr) finds a device given an IPv4 source address and takes a reference on it. Introduce __ip_dev_find(), taking a third argument, to optionally take the device reference. Callers not asking the reference to be taken should be in an rcu_read_lock() protected section. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30vlan: dont drop packets from unknown vlans in promiscuous modeEric Dumazet
Roger Luethi noticed packets for unknown VLANs getting silently dropped even in promiscuous mode. Check for promiscuous mode in __vlan_hwaccel_rx() and vlan_gro_common() before drops. As suggested by Patrick, mark such packets to have skb->pkt_type set to PACKET_OTHERHOST to make sure they are dropped by IP stack. Reported-by: Roger Luethi <rl@hellgate.ch> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30e1000e: 82579 performance improvementsBruce Allan
The initial support for 82579 was tuned poorly for performance. Adjust the packet buffer allocation appropriately for both standard and jumbo frames; and for jumbo frames increase the receive descriptor pre-fetch, disable adaptive interrupt moderation and set the DMA latency tolerance. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30e1000e: use hardware writeback batchingJesse Brandeburg
Most e1000e parts support batching writebacks. The problem with this is that when some of the TADV or TIDV timers are not set, Tx can sit forever. This is solved in this patch with write flushes using the Flush Partial Descriptors (FPD) bit in TIDV and RDTR. This improves bus utilization and removes partial writes on e1000e, particularly from 82571 parts in S5500 chipset based machines. Only ES2LAN and 82571/2 parts are included in this optimization, to reduce testing load. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30ixgbe: fix link issues and panic with shared interrupts for 82598Emil Tantilov
Fix possible panic/hang with shared Legacy interrupts by not enabling interrupts when interface is down. Also fixes an intermittent link by enabling LSC upon exit from ixgbe_intr() This patch adds flags to ixgbe_irq_enable() to allow for some flexibility when enabling interrupts. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30ipv4: __mkroute_output() speedupEric Dumazet
While doing stress tests with a disabled IP route cache, I found __mkroute_output() was touching three times in_device atomic refcount. Use RCU to touch it once to reduce cache line ping pongs. Before patch time to perform the test real 1m42.009s user 0m12.545s sys 25m0.726s Profile : 16109.00 26.4% ip_route_output_slow vmlinux 7434.00 12.2% dst_destroy vmlinux 3280.00 5.4% fib_rules_lookup vmlinux 3252.00 5.3% fib_semantic_match vmlinux 2622.00 4.3% fib_table_lookup vmlinux 2535.00 4.1% dst_alloc vmlinux 1750.00 2.9% _raw_read_lock vmlinux 1532.00 2.5% rt_set_nexthop vmlinux After patch real 1m36.503s user 0m12.977s sys 23m25.608s 14234.00 22.4% ip_route_output_slow vmlinux 8717.00 13.7% dst_destroy vmlinux 4052.00 6.4% fib_rules_lookup vmlinux 3951.00 6.2% fib_semantic_match vmlinux 3191.00 5.0% dst_alloc vmlinux 1764.00 2.8% fib_table_lookup vmlinux 1692.00 2.7% _raw_read_lock vmlinux 1605.00 2.5% rt_set_nexthop vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>