path: root/drivers
Age  Commit message  (Author)
2012-05-04  mISDN: Fix refcounting bug  (Karsten Keil)
Under some configs it was still not possible to unload the driver, because the module use count was screwed up. Signed-off-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-04  mISDN: Added PH_* state info to tei manager.  (Andreas Eversberg)
Tei manager reports current layer 1 state on creation. On state change it reports it to the socket interface. Signed-off-by: Andreas Eversberg <andreas@eversberg.eu> Signed-off-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-04  ixgbe: Update link flow control to correctly handle multiple packet buffer DCB  (Alexander Duyck)
This change updates the link flow control configuration so that we correctly set the link flow control settings for DCB. Previously we would have to call the fc_enable call 8 times, once for each packet buffer. If we move that logic into the fc_enable call itself we can avoid multiple unnecessary register writes. This change also corrects an issue in which we were only shifting the water marks for 82599 parts by 6 instead of 10. This was resulting in us only using 1/16 of the packet buffer when flow control was enabled. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
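The 1/16 figure follows from the shift difference: 2^(10-6) = 16. A minimal sketch of that arithmetic, assuming the water mark is specified in KB and converted to bytes by a left shift; the function names are hypothetical and do not reflect the driver's register layout:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a water mark given in KB to bytes.
 * A shift of 10 is the correct conversion (1 KB = 1 << 10 bytes). */
uint32_t water_mark_bytes(uint32_t kilobytes, unsigned int shift)
{
    assert(shift < 32);
    return kilobytes << shift;
}

/* Ratio between the correct (<< 10) and the too-small (<< 6) result. */
uint32_t shift_shortfall(uint32_t kilobytes)
{
    assert(kilobytes != 0);
    return water_mark_bytes(kilobytes, 10) / water_mark_bytes(kilobytes, 6);
}
```

For any non-zero water mark small enough not to overflow, the ratio is 16, which matches the "1/16 of the packet buffer" described in the commit message.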
2012-05-04  ixgbe: Reorder link flow control functions in ixgbe_common.c  (Alexander Duyck)
We can avoid many of the forward declarations found in ixgbe_common.c by just reordering things so this patch does that to help cleanup the code. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-04  ixgbe: Use __free_pages instead of put_page to release pages  (Alexander Duyck)
This change replaces the calls to put_page with calls to __free_pages. Since the FCoE code is able to access order 1 pages I thought it would be a good idea to change things over to using __free_pages since that is the preferred approach for freeing pages. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-04  ixgbe: Make ixgbe_fc_autoneg return void and always set current_mode  (Alexander Duyck)
This change makes it so that ixgbe_fc_autoneg is a void and always sets the current_mode. Previously if the link was down we would return an error, however there is no harm in simply treating a link down case as a case in which autoneg simply failed. This allows us to rely on the return value of the ixgbe_fc_enable call now since there should be no cases where it returns an error that would normally be ignored. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-04  ixgbe: Reorder the ring to q_vector mapping to improve performance  (Alexander Duyck)
This change reorders the mapping of rings to q_vectors in the case that the number of rings exceeds the number of q_vectors. Previously we would allocate the first R/N queues to the first q_vector where R is the number of rings and N is the number of q_vectors. Instead of doing this we can do a better job of interleaving the rings to the CPUs by assigning every Nth ring to the q_vector. The tables below illustrate this change for the R = 16, N = 4 case.

             Before patch        After patch
q_vector:    0   1   2   3       0   1   2   3
Rings:       0   4   8   12      0   1   2   3
             1   5   9   13      4   5   6   7
             2   6   10  14      8   9   10  11
             3   7   11  15      12  13  14  15

This should improve the performance for both DCB or ATR when the number of rings exceeds the number of q_vectors allocated by the adapter. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
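The two mappings described in this commit message can be sketched as two index computations. This is an illustrative model only, assuming R is an exact multiple of N; the function names are hypothetical and are not taken from the driver:

```c
#include <assert.h>

/* Before: contiguous blocks, so the first R/N rings go to q_vector 0. */
int q_vector_before(int ring, int rings, int q_vectors)
{
    assert(q_vectors > 0 && rings % q_vectors == 0);
    assert(ring >= 0 && ring < rings);
    return ring / (rings / q_vectors);
}

/* After: interleaved, so every Nth ring goes to the same q_vector. */
int q_vector_after(int ring, int rings, int q_vectors)
{
    assert(q_vectors > 0);
    assert(ring >= 0 && ring < rings);
    return ring % q_vectors;
}
```

With R = 16 and N = 4, ring 2 maps to q_vector 0 in the old scheme and to q_vector 2 in the new one, so consecutive rings are spread across q_vectors instead of being packed onto one.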
2012-05-04  ixgbe: Track instances of buffer available but no DMA resources present  (Alexander Duyck)
This change makes it so that we can track instances of where a packet was dropped due to a packet being received when there are no DMA buffers available in the ring. For some reason this was only being enabled with RSC, however it makes more sense to always have this feature on so that we can track any cases where we might drop a buffer due to an Rx ring being full. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-04  e1000e: initial support for i217  (Bruce Allan)
i217 is the next-generation LOM that will be available on systems with the Lynx Point Platform Controller Hub (PCH) chipset from Intel. This patch provides the initial support for the device. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-04  e1000e: Update driver version number  (Matthew Vick)
Version bump to 1.11.3-k. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  net/niu: remove one superfluous dma mask check  (Sebastian Andrzej Siewior)
The idea here seems to be to get a 44bit DMA mask working and if this fails it should fall back to a 32bit DMA mask. The dma_mask variable is assigned once to 44bit and never updated. pci_set_dma_mask() and pci_set_consistent_dma_mask() are both implemented as functions so there is no evil macro which might update dma_mask. Looking at the assembly, I see a call to dma_set_mask() followed by dma_supported() and then a jump past the second dma_set_mask(). The only way to get to the second dma_set_mask() call is by an error code in the first one. So I hereby remove the check since it looks superfluous. Please ignore the patch if there is black magic involved. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-03  ixgbevf: Update version string  (Greg Rose)
Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  ixgbevf: Make sure jumbo frames are set correctly after PF reset  (Greg Rose)
If the Physical Function (PF) resets after the VF has set jumbo frame MTU then the VF jumbo frame is overwritten. Make sure the VF driver always requests proper MTU size after reset synchronization. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  ixgbevf: Add support to recognize 100mb link speed  (Greg Rose)
The X540 10Gig controller is capable of linking at 100Mbits - add support for reporting that link speed. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Sibai Li <sibai.li@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  e1000e: Remove special case for 82573/82574 ASPM L1 disablement  (Chris Boot)
For the 82573, ASPM L1 gets disabled wholesale so this special-case code is not required. For the 82574 the previous patch does the same as for the 82573, disabling L1 on the adapter. Thus, this code is no longer required and can be removed. Signed-off-by: Chris Boot <bootc@bootc.net> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  e1000e: Disable ASPM L1 on 82574  (Chris Boot)
ASPM on the 82574 causes trouble. Currently the driver disables L0s for this NIC but only disables L1 if the MTU is >1500. This patch simply causes L1 to be disabled regardless of the MTU setting. Signed-off-by: Chris Boot <bootc@bootc.net> Cc: "Wyborny, Carolyn" <carolyn.wyborny@intel.com> Cc: Nix <nix@esperi.org.uk> Link: https://lkml.org/lkml/2012/3/19/362 Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  e1000e: Driver workaround for IPv6 Header Extension Erratum.  (Matthew Vick)
Previously, IPv6 extension header parsing was disabled for all devices supported by e1000e when using packet split mode. However, as per a silicon errata, only certain devices need this restriction and will need to disable IPv6 extension header parsing for all modes. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  e1000e: Resolve intermittent negotiation issue on 82574/82583.  (Matthew Vick)
For 82574 and 82583 devices, resolve an intermittent link issue where the link negotiates to 100Mbps rather than 1Gbps when powering off the PHY and powering on the PHY after several seconds. Signed-off-by: Matthew Vick <matthew.vick@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  e1000e: cleanup long [read|write]_reg_locked PHY ops function pointers  (Bruce Allan)
Calling the locked versions of the read/write PHY ops function pointers often produces excessively long lines. Shorten these as is done with the non-locked versions of the PHY register read/write functions. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  e1000e: suggest a possible workaround to a device hang on 82577/8  (Bruce Allan)
There is a known issue in the 82577 and 82578 device that can cause a hang in the device hardware during traffic stress; the current workaround in the driver is to disable transmit flow control by default. If the user enables transmit flow control and the device hang occurs, provide a message in the syslog suggesting to re-enable the workaround. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-03  ixgbe: Fix use after free on module remove  (Alexander Duyck)
While testing the TCP changes I had to fix an issue in order to be able to load and unload the module. The recent patch that added thermal sensor support added a use after free bug on module unload with an 82598 adapter in the system. To resolve the issue I have updated the code so that when we free the info_kobj we set it back to NULL. I suspect there are likely other bugs present, but I will leave that for another patch that can undergo more testing. I am submitting this directly to net-next since this fixes a fairly serious bug that will lock up the ixgbe module until the system is rebooted. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
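The fix pattern described here, clearing a pointer once the object behind it has been released, can be modelled in userspace. This is a sketch under stated assumptions: `struct adapter_model` and `release_info` are hypothetical names, and `free()` stands in for whatever release call the driver actually uses on info_kobj.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the adapter structure. */
struct adapter_model {
    void *info_kobj;   /* models the driver's info_kobj pointer */
};

/* Release the object and clear the pointer so that a second teardown
 * pass sees NULL instead of a dangling pointer. */
void release_info(struct adapter_model *adapter)
{
    assert(adapter != NULL);
    free(adapter->info_kobj);   /* free(NULL) is a no-op */
    adapter->info_kobj = NULL;
}
```

Calling `release_info()` twice on the same adapter is harmless in this model, whereas leaving the stale pointer in place would make the second call a double free.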
2012-05-02  be2net: Fix EEH error reset before a flash dump completes  (Somnath Kotur)
An EEH error can cause the FW to trigger a flash debug dump. Resetting the card while flash dump is in progress can cause it not to recover. Wait for it to finish before letting EEH flow to reset the card. Signed-off-by: Sathya Perla <Sathya.Perla@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-02  be2net: Record receive queue index in skb to aid RPS.  (Somnath Kotur)
Signed-off-by: Sarveshwar Bandi <Sarveshwar.Bandi@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-02  be2net: Fix to apply duplex value as unknown when link is down.  (Somnath Kotur)
Suggested-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com> Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-02  be2net: Fix to not set link speed for disabled functions of a UMC card  (Somnath Kotur)
This renders the interface view somewhat inconsistent from the Host OS POV considering the rest of the interfaces are showing their respective speeds based on the bandwidth assigned to them. Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-02  net/pasemi: fix compiler warning  (Stephen Rothwell)
Fix this compiler warning (on PowerPC) by not marking a parameter as const:

drivers/net/ethernet/pasemi/pasemi_mac.c: In function 'pasemi_mac_replenish_rx_ring':
drivers/net/ethernet/pasemi/pasemi_mac.c:646:3: warning: passing argument 1 of 'netdev_alloc_skb' discards qualifiers from pointer target type
include/linux/skbuff.h:1706:31: note: expected 'struct net_device *' but argument is of type 'const struct net_device *'

Cc: Olof Johansson <olof@lixom.net> Cc: Pradeep A. Dalvi <netdev@pradeepdalvi.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-02  bnx2x: fix handling single MSIX mode for 57710/57711  (Dmitry Kravkov)
Commit 30a5de7723a8a4211be02e94236e9167a424fd07 added the ability to use a single MSI-X vector, but lacks proper handling for 57710/57711 HW. Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-02  ixgbe: Reset max_vfs to zero when user request is out of range  (Greg Rose)
If the user request for the number of VFs in the max_vfs parameter is out of range then reset the value to the default value of zero. This makes the behavior of the ixgbe driver the same as for the igb driver. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Robert Garrett <robertx.e.garrett@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
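The behaviour described, falling back to the default of zero rather than clamping to the nearest valid value, might look like the following. This is a minimal sketch: `sanitize_max_vfs` and its `limit` parameter are hypothetical, and the real upper bound depends on the hardware.

```c
#include <assert.h>

/* Return the requested VF count if it is within [0, limit];
 * otherwise fall back to the default of 0 (no VFs). */
int sanitize_max_vfs(int requested, int limit)
{
    assert(limit >= 0);   /* the limit itself must be sane */
    if (requested < 0 || requested > limit)
        return 0;
    return requested;
}
```

Resetting to zero instead of clamping means an out-of-range request never silently enables more or fewer VFs than the user asked for.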
2012-05-02  ixgbe: Deny MACVLAN requests from VFs with admin set MAC  (Greg Rose)
If the host VMM administrator has set the virtual function device's MAC address then also deny VF requests for MACVLAN filters. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Tested-by: Garrett, Robert <robertx.e.garrett@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-02  ixgbe: add hwmon interface to export thermal data  (Don Skidmore)
Some of our adapters have thermal data available, this patch exports this data via hwmon sysfs interface. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-02  ixgbe: add support functions to access thermal data  (Don Skidmore)
Some 82599 adapters contain thermal data that we can get to via an i2c interface. These functions provide support to get at that data. A following patch will export this data. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-02  e1000e: fix .ndo_set_rx_mode for 82579  (Bruce Allan)
Secondary unicast and multicast addresses are added to the Receive Address registers (RAR) for most parts supported by the driver. For 82579, there is only one actual RAR and a number of Shared Receive Address registers (SHRAR) that are shared among the driver and f/w which can be reserved and write-protected by the f/w. On this device, use the SHRARs that are not taken by f/w for the additional addresses. Add a MAC ops function pointer infrastructure (similar to other MAC operations in the driver) for setting RARs, introduce a new rar_set function for 82579 and convert the existing code that sets RARs on other devices to a generic rar_set function. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
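The function-pointer approach described in this message can be illustrated with a small userspace model: one setter for parts where every address register is writable, and one for parts where firmware may have reserved some shared registers. All type and function names below are hypothetical; this is not the driver's actual ops table or register handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ADDR_LEN 6
#define MODEL_SLOTS    4

struct hw_model;

/* Per-MAC operations table; only the address setter is modelled. */
struct mac_ops_model {
    bool (*rar_set)(struct hw_model *hw, const uint8_t *addr,
                    unsigned int index);
};

struct hw_model {
    const struct mac_ops_model *ops;
    uint8_t slots[MODEL_SLOTS][MODEL_ADDR_LEN]; /* address registers */
    bool fw_locked[MODEL_SLOTS];                /* reserved by firmware */
};

/* Generic setter: every slot is writable by the driver. */
bool rar_set_generic(struct hw_model *hw, const uint8_t *addr,
                     unsigned int index)
{
    assert(index < MODEL_SLOTS);
    memcpy(hw->slots[index], addr, MODEL_ADDR_LEN);
    return true;
}

/* Shared-register setter: refuse slots the firmware write-protected. */
bool rar_set_shared(struct hw_model *hw, const uint8_t *addr,
                    unsigned int index)
{
    assert(index < MODEL_SLOTS);
    if (hw->fw_locked[index])
        return false;
    memcpy(hw->slots[index], addr, MODEL_ADDR_LEN);
    return true;
}
```

Callers go through `hw->ops->rar_set(...)` and never need to know which variant is installed, which is the point of routing the operation through an ops table.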
2012-05-02  e1000e: PHY initialization flow changes for 82577/8/9  (Bruce Allan)
The PHY initialization flows and assorted workarounds for 82577/8/9 done during driver load and resume from Sx should be the same yet they are not. Combine the current flows/workarounds into a common set of functions that are called during the different code paths. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-05-02  e1000e: workaround EEPROM configuration change on 82579  (Bruce Allan)
An update to the EEPROM on 82579 will extend a delay in hardware to fix an issue with WoL not working after a G3->S5 transition which is unrelated to the driver. However, this extended delay conflicts with nominal operation of the device when it is initialized by the driver and after every reset of the hardware (i.e. the driver starts configuring the device before the hardware is done with its own configuration work). The workaround for when the driver is in control of the device is to tell the hardware after every reset that the configuration delay should be the original shorter one. Some pre-existing variables are renamed generically to be re-used with new register accesses. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-04-30  atl1c: remove PHY polling from atl1c_change_mtu  (Huang, Xiong)
PHY polling code for FPGA is considered in every MDIO R/W API. no need to add additional code to atl1c_change_mtu. Signed-off-by: xiong <xiong@qca.qualcomm.com> Tested-by: David Liu <dwliu@qca.qaulcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  atl1c: Disable L0S when no cable link  (Huang, Xiong)
L0S might be unstable if no cable link, only enable it when link up. Signed-off-by: xiong <xiong@qca.qualcomm.com> Tested-by: Liu David <dwliu@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  atl1c: do MAC-reset when PHY link down  (Huang, Xiong)
There may be tx-skbs still pending in HW when the PHY link goes down. Resetting the MAC makes the DMA engine go to the start point and releases all pending skbs. Note: resetting the MAC will clear any interrupt status and mask. Signed-off-by: xiong <xiong@qca.qualcomm.com> Tested-by: Liu David <dwliu@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  atl1c: cancel task when interface closed  (Huang, Xiong)
common_task might be running while close routine is called, wait/cancel it. Signed-off-by: xiong <xiong@qca.qualcomm.com> Tested-by: Liu David <dwliu@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  atl1c: enlarge L1 response waiting timer  (Huang, Xiong)
The hardware incorrectly processes L0S/L1 entrance if the chipset/root responds after a specific/shorter timer, and causes a system hang. Enlarge the timeout value to avoid this issue. Signed-off-by: xiong <xiong@qca.qualcomm.com> Tested-by: Liu David <dwliu@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  atl1c: refine mac address related code  (Huang, Xiong)
On some platforms with EEPROM/OTP present, the BIOS could overwrite the NIC with a new MAC address, so the permanent MAC address should be taken from the BIOS. The address is restored when the driver is removed. Voltage raising isn't applicable for l1d. Replace swab32 with htonl for big/little endian platforms. Related registers are refined as well. Signed-off-by: xiong <xiong@qca.qualcomm.com> Tested-by: Liu David <dwliu@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
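The swab32-to-htonl change matters because an unconditional swap gives a different byte layout on big-endian and little-endian hosts, while htonl always yields big-endian (network) order. A userspace sketch of the difference, using the POSIX `htonl` from `<arpa/inet.h>`; `swap32_always` and `network_order_byte` are hypothetical helpers written for this illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Unconditional byte swap, modelled on what a swab32-style helper does
 * regardless of host endianness. */
uint32_t swap32_always(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Return byte `i` of the in-memory representation of htonl(v).
 * Byte 0 is the most significant byte of v on any host. */
uint8_t network_order_byte(uint32_t v, unsigned int i)
{
    uint32_t be = htonl(v);
    uint8_t bytes[4];

    assert(i < 4);
    memcpy(bytes, &be, sizeof(bytes));
    return bytes[i];
}
```

On a little-endian host the two produce the same memory layout, which is why the unconditional swap appears to work there; on a big-endian host htonl is a no-op while the unconditional swap reverses the bytes.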
2012-04-30  atl1c: remove code of closing register writable attribution  (Huang, Xiong)
The Close-action is done by atl1c_reset_pcie, remove it from atl1c_get_permanent_address. Signed-off-by: xiong <xiong@qca.qualcomm.com> Tested-by: Liu David <dwliu@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  atl1c: clear WoL status when reset pcie  (Huang, Xiong)
WoL status is read-clear and should be cleared when in S0 status. putting it in atl1c_reset_pcie is more suitable than in atl1c_get_permanent_address. Signed-off-by: xiong <xiong@qca.qualcomm.com> Tested-by: Liu David <dwliu@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  atl1c: add PHY link event(up/down) patch  (Huang, Xiong)
On some platforms the PHY settings need to change depending on the cable link status to get better stability. Signed-off-by: xiong <xiong@qca.qualcomm.com> Tested-by: Liu David <dwliu@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  atl1c: add workaround for issue of bit INTX-disable for MSI interrupt  (Huang, Xiong)
All supported devices have one issue that msi interrupt doesn't assert if pci command register bit (PCI_COMMAND_INTX_DISABLE) is set. Add workaround in drivers/pci/quirks.c Signed-off-by: xiong <xiong@qca.qualcomm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  bnx2x: remove some bloat  (Eric Dumazet)
Before doing skb->head_frag work on bnx2x driver, I found too much stuff was inlined in bnx2x/bnx2x_cmn.h for no good reason and made my work not very easy. Move some big functions out of this include file to the respective .c file. A lot of inline keywords are not needed at all in this huge driver.

   text    data     bss     dec     hex filename
 490083    1270      56  491409   77f91 bnx2x/bnx2x.ko.before
 484206    1270      56  485532   7689c bnx2x/bnx2x.ko

Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eilon Greenstein <eilong@broadcom.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  pch_gbe: reprogram multicast address register on reset  (RongQing.Li)
The reset logic after a Rx FIFO overrun will clear the programmed multicast addresses. This patch fixes the issue by reprogramming the registers after the reset. The commit eefc48b ("pch_gbe: reprogram multicast address register on reset") tried to fix this problem, but it introduced unnecessary code. In fact, all multicast addresses have been saved in netdev->mc, so we can call pch_gbe_set_multi() directly after reset_hw and reset_rx. This commit kills 50+ lines of code. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Takahiro Shimizu <tshimizu818@gmail.com> Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  tg3: provide frags as skb head  (Eric Dumazet)
This patch converts tg3 driver, one of our reference drivers, to use new build_skb() api in frag mode. Instead of using kmalloc() to allocate the memory block that will be used by build_skb() as skb->head, we use a page fragment. This is a followup of patch "net: allow skb->head to be a page fragment" This allows GRO, TCP coalescing, and splice() to be more efficient. Incidentally, this also removes SLUB slow path contention in kfree() Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  net: allow skb->head to be a page fragment  (Eric Dumazet)
skb->head is currently allocated from kmalloc(). This is convenient but has the drawback that the data cannot be converted to a page fragment if needed. We have three spots where it hurts:
1) GRO aggregation. When a linear skb must be appended to another skb, GRO uses the frag_list fallback, which is very inefficient since we keep all struct sk_buff around. So drivers enabling GRO but delivering linear skbs to the network stack aren't enabling full GRO power.
2) splice(socket -> pipe). We must copy the linear part to a page fragment. This kind of defeats the purpose of splice() (the zero copy claim).
3) TCP coalescing. Recently introduced, this permits grouping several contiguous segments into a single skb. This shortens queue lengths, saves kernel memory, and greatly reduces the probability of TCP collapses. This coalescing doesn't work on linear skbs (or we would need to copy data, which would be too slow).
Given all these issues, the following patch introduces the possibility of having skb->head be a fragment in itself. We use a new skb flag, skb->head_frag, to carry this information. build_skb() is changed to accept a frag_size argument. Drivers willing to provide a page fragment instead of kmalloc() data will set a non zero value, set to the fragment size. Then, in situations where we need to convert the skb head to a frag in itself, we can check if skb->head_frag is set and avoid the copies or various fallbacks we have. This means drivers currently using frags could be updated to avoid the current skb->head allocation and reduce their memory footprint (aka skb truesize); that's 512 or 1024 bytes saved per skb. This also makes bpf/netfilter faster since the 'first frag' will be part of the skb linear part, with no need to copy data.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Maciej Żenczykowski <maze@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Matt Carlson <mcarlson@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  forcedeth: add transmit timestamping support  (Willem de Bruijn)
Insert an skb_tx_timestamp call in both ndo_start_xmit routines. Tested to work for the nv_start_xmit_optimized case. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-30  bnx2x: add transmit timestamping support  (Willem de Bruijn)
Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eilon Greenstein <eilong@broadcom.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>