summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2013-06-13netlink: make compare exist all the timeGao feng
Commit da12c90e099789a63073fc82a19542ce54d4efb9 "netlink: Add compare function for netlink_table" only set compare at the time we create kernel netlink, and reset compare to NULL at the time we finially release netlink socket, but netlink_lookup wants the compare exist always. So we should set compare after we allocate nl_table, and never reset it. make comapre exist all the time. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking update from David Miller: 1) Fix dump iterator in nfnl_acct_dump() and ctnl_timeout_dump() to dump all objects properly, from Pablo Neira Ayuso. 2) xt_TCPMSS must use the default MSS of 536 when no MSS TCP option is present. Fix from Phil Oester. 3) qdisc_get_rtab() looks for an existing matching rate table and uses that instead of creating a new one. However, it's key matching is incomplete, it fails to check to make sure the ->data[] array is identical too. Fix from Eric Dumazet. 4) ip_vs_dest_entry isn't fully initialized before copying back to userspace, fix from Dan Carpenter. 5) Fix ubuf reference counting regression in vhost_net, from Jason Wang. 6) When sock_diag dumps a socket filter back to userspace, we have to translate it out of the kernel's internal representation first. From Nicolas Dichtel. 7) davinci_mdio holds a spinlock while calling pm_runtime, which sleeps. Fix from Sebastian Siewior. 8) Timeout check in sh_eth_check_reset is off by one, from Sergei Shtylyov. 9) If sctp socket init fails, we can NULL deref during cleanup. Fix from Daniel Borkmann. 10) netlink_mmap() does not propagate errors properly, from Patrick McHardy. 11) Disable powersave and use minstrel by default in ath9k. From Sujith Manoharan. 12) Fix a regression in that SOCK_ZEROCOPY is not set on tuntap sockets which prevents vhost from being able to use zerocopy. From Jason Wang. 13) Fix race between port lookup and TX path in team driver, from Jiri Pirko. 14) Missing length checks in bluetooth L2CAP packet parsing, from Johan Hedberg. 15) rtlwifi fails to connect to networking using any encryption method other than WPA2. Fix from Larry Finger. 16) Fix iwlegacy build due to incorrect CONFIG_* ifdeffing for power management stuff. From Yijing Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits) b43: stop format string leaking into error msgs ath9k: Use minstrel rate control by default Revert "ath9k_hw: Update rx gain initval to improve rx sensitivity" ath9k: Disable PowerSave by default net: wireless: iwlegacy: fix build error for il_pm_ops rtlwifi: Fix a false leak indication for PCI devices wl12xx/wl18xx: scan all 5ghz channels wl12xx: increase minimum singlerole firmware version required wl12xx: fix minimum required firmware version for wl127x multirole rtlwifi: rtl8192cu: Fix problem in connecting to WEP or WPA(1) networks mwifiex: debugfs: Fix out of bounds array access Bluetooth: Fix mgmt handling of power on failures Bluetooth: Fix missing length checks for L2CAP signalling PDUs Bluetooth: btmrvl: support Marvell Bluetooth device SD8897 Bluetooth: Fix checks for LE support on LE-only controllers team: fix checks in team_get_first_port_txable_rcu() team: move add to port list before port enablement team: check return value of team_get_port_by_index_rcu() for NULL tuntap: set SOCK_ZEROCOPY flag during open netlink: fix error propagation in netlink_mmap() ...
2013-06-13mac80211: fix TX aggregation TID struct leakJohannes Berg
Ben reports that kmemleak is saying TX aggregation TID structs are leaked. Given his workload, I suspect that they're leaked because stations are destroyed before their aggregation sessions get a chance to start. Fix this by simply freeing structs that are not used yet. Reported-by: Ben Greear <greearb@candelatech.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-12udp: fix two sparse errorsEric Dumazet
commit ba418fa357a7b3c ("soreuseport: UDP/IPv4 implementation") added following sparse errors : net/ipv4/udp.c:433:60: warning: cast from restricted __be16 net/ipv4/udp.c:433:60: warning: incorrect type in argument 1 (different base types) net/ipv4/udp.c:433:60: expected unsigned short [unsigned] [usertype] val net/ipv4/udp.c:433:60: got restricted __be16 [usertype] sport net/ipv4/udp.c:433:60: warning: cast from restricted __be16 net/ipv4/udp.c:433:60: warning: cast from restricted __be16 net/ipv4/udp.c:514:60: warning: cast from restricted __be16 net/ipv4/udp.c:514:60: warning: incorrect type in argument 1 (different base types) net/ipv4/udp.c:514:60: expected unsigned short [unsigned] [usertype] val net/ipv4/udp.c:514:60: got restricted __be16 [usertype] sport net/ipv4/udp.c:514:60: warning: cast from restricted __be16 net/ipv4/udp.c:514:60: warning: cast from restricted __be16 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12gro: remove a sparse errorEric Dumazet
Fix following sparse error : net/ipv4/af_inet.c:1410:59: warning: restricted __be16 degrades to integer added in commit db8caf3dbc77599 ("gro: should aggregate frames without DF") Reported-by: kbuild test robot <fengguang.wu@intel.com> From: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This pull request is intended for the 3.11 stream... One big highlight is the cw1200 driver the ST-E CW1100 & CW1200 WLAN chipsets. This one has been lingering for a while, lacking some review comments. Once started getting pulled into linux-next, it got a bit more attention and a number of improvements were made over the initial cut. No doubt there will be more changes ahead, but I think it is looking alright at this point. Along with that, there is the usual flurry of updates to the mac80211 core and the iwlwifi, mwifiex, ath9k, rt2x00, wil6210, and other drivers. A few of the highlights are some rt2x00 refactoring/cleanup by Gabor Juhos, some rt2800 hardware support enhancements by Stanislaw Gruszka, some iwlwifi power management updates from Alexander Bondar, some enhanced bcma SPROM support from Rafał Miłecki, and a variety of other things here and there. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12igmp: fix new sparse errorsEric Dumazet
Fix following sparse errors : net/ipv4/igmp.c:1222:25: warning: cast from restricted __be32 net/ipv4/igmp.c:1234:31: warning: incorrect type in assignment (different address spaces) net/ipv4/igmp.c:1234:31: expected struct ip_mc_list [noderef] <asn:4>*next_hash net/ipv4/igmp.c:1234:31: got struct ip_mc_list *<noident> net/ipv4/igmp.c:1250:31: warning: incorrect type in assignment (different address spaces) net/ipv4/igmp.c:1250:31: expected struct ip_mc_list [noderef] <asn:4>*next_hash net/ipv4/igmp.c:1250:31: got struct ip_mc_list *<noident> net/ipv4/igmp.c:2380:37: warning: cast from restricted __be32 These were added by commit e9897071350bd9 ("igmp: hash a hash table to speedup ip_check_mc_rcu()") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12Merge branch 'for-john' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Conflicts: drivers/net/wireless/iwlwifi/mvm/mac80211.c
2013-06-12Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless Conflicts: drivers/net/wireless/ath/ath9k/Kconfig net/mac80211/iface.c
2013-06-12Merge branch 'for-john' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
2013-06-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph fixes from Sage Weil: "There is a pair of fixes for double-frees in the recent bundle for 3.10, a couple of fixes for long-standing bugs (sleep while atomic and an endianness fix), and a locking fix that can be triggered when osds are going down" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: fix cleanup in rbd_add() rbd: don't destroy ceph_opts in rbd_add() ceph: ceph_pagelist_append might sleep while atomic ceph: add cpu_to_le32() calls when encoding a reconnect capability libceph: must hold mutex for reset_changed_osds()
2013-06-12Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2013-06-12Bluetooth: Fix mgmt handling of power on failuresJohan Hedberg
If hci_dev_open fails we need to ensure that the corresponding mgmt_set_powered command gets an appropriate response. This patch fixes the missing response by adding a new mgmt_set_powered_failed function that's used to indicate a power on failure to mgmt. Since a situation with the device being rfkilled may require special handling in user space the patch uses a new dedicated mgmt status code for this. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-12Bluetooth: Fix missing length checks for L2CAP signalling PDUsJohan Hedberg
There has been code in place to check that the L2CAP length header matches the amount of data received, but many PDU handlers have not been checking that the data received actually matches that expected by the specific PDU. This patch adds passing the length header to the specific handler functions and ensures that those functions fail cleanly in the case of an incorrect amount of data. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-12Bluetooth: Fix checks for LE support on LE-only controllersJohan Hedberg
LE-only controllers do not support extended features so any kind of host feature bit checks do not make sense for them. This patch fixes code used for both single-mode (LE-only) and dual-mode (BR/EDR/LE) to use the HCI_LE_ENABLED flag instead of the "Host LE supported" feature bit for LE support tests. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-12netfilter: xt_TCPMSS: Fix missing fragmentation handlingPhil Oester
Similar to commit bc6bcb59 ("netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond packet boundary"), add safe fragment handling to xt_TCPMSS. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-12netfilter: xt_TCPMSS: Fix IPv6 default MSS tooPhil Oester
As a followup to commit 409b545a ("netfilter: xt_TCPMSS: Fix violation of RFC879 in absence of MSS option"), John Heffner points out that IPv6 has a higher MTU than IPv4, and thus a higher minimum MSS. Update TCPMSS target to account for this, and update RFC comment. While at it, point to more recent reference RFC1122 instead of RFC879. Signed-off-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-12pktgen: ipv6: numa: consolidate skb allocation to pktgen_alloc_skbDaniel Borkmann
We currently allow for numa-node aware skb allocation only within the fill_packet_ipv4() path, but not in fill_packet_ipv6(). Consolidate that code to a common allocation helper to enable numa-node aware skb allocation for ipv6, and use it in both paths. This also makes both functions a bit more readable. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12net: udp4: move GSO functions to udp_offloadDaniel Borkmann
Similarly to TCP offloading and UDPv6 offloading, move all related UDPv4 functions to udp_offload.c to make things more explicit. Also, by this, we can make those functions static. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12igmp: remove unnecessary in_device member zeroingShawn Bohrer
ip_mc_init_dev() is passed a freshly kzalloc'd in_device so it is unnecessary to explicitly zero out the members. Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12igmp: hash a hash table to speedup ip_check_mc_rcu()Eric Dumazet
After IP route cache removal, multicast applications using a lot of multicast addresses hit a O(N) behavior in ip_check_mc_rcu() Add a per in_device hash table to get faster lookup. This hash table is created only if the number of items in mc_list is above 4. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12net_sched: htb: do not setup default rate estimatorsEric Dumazet
With a thousand htb classes, est_timer() spends ~5 million cpu cycles and throws out cpu cache, because each htb class has a default rate estimator (est 4sec 16sec). Most users do not use default rate estimators, so switch htb to not setup ones. Add a module parameter (htb_rate_est) so that users relying on this default rate estimator can revert the behavior. echo 1 >/sys/module/sch_htb/parameters/htb_rate_est Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12mac80211: Fix rate control mask matching callSimon Wunderlich
The order of parameters was mixed up, introduced in commit "mac80211: improve the rate control API" Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-12mac80211: abort CAC in stop_ap()Simon Wunderlich
When a CAC is running and stop_ap is called (e.g. when hostapd is killed while performing CAC), the CAC must be aborted immediately. Otherwise ieee80211_stop_ap() will try to stop it when it's too late - wdev->channel is already NULL and the abort event can not be generated. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-12mac80211: work around broken APs not including HT infoJohannes Berg
There are some APs, notably 2G/3G/4G Wifi routers, specifically the "Onda PN51T", "Vodafone PocketWiFi 2", "ZTE MF60" and a similar T-Mobile branded device [1] that erroneously don't include all the needed information in (re)association response frames. Work around this by assuming the information is the same as it was in the beacon or probe response and using the data from there instead. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=58881. [1] https://bbs.archlinux.org/viewtopic.php?pid=1277305 Note that this requires marking the first ieee802_11_parse_elems() argument const, otherwise we'd get a compiler warning. Cc: stable@vger.kernel.org Reported-and-tested-by: Michal Zajac <manwe@manwe.pl> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11net_sched: psched_ratecfg_precompute() improvementsEric Dumazet
Before allowing 64bits bytes rates, refactor psched_ratecfg_precompute() to get better comments and increased accuracy. rate_bps field is renamed to rate_bytes_ps, as we only have to worry about bytes per second. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/ath/ath9k/debug.c net/mac80211/iface.c
2013-06-11cfg80211: fix rtnl leak in wiphy dump error casesJohannes Berg
In two wiphy dump error cases, most often when the dump allocation must be increased, the RTNL is leaked. This quickly results in a complete system lockup. Release the RTNL correctly. Reported-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11nl80211: allow sending CMD_FRAME without specifying any frequencyAntonio Quartulli
Users may want to send a frame on the current channel without specifying it. This is particularly useful for the correct implementation of the IBSS/RSN support in wpa_supplicant which requires to receive and send AUTH frames. Make mgmt_tx pass a NULL channel to the driver if none has been specified by the user. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11mac80211: make mgmt_tx accept a NULL channelAntonio Quartulli
cfg80211 passes a NULL channel to mgmt_tx if the frame has to be sent on the one currently in use by the device. Make the implementation of mgmt_tx correctly handle this case. Fail if offchan is required. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> [fix RCU locking] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11cfg80211: fix VHT TDLS peer AID verificationJouni Malinen
I (Johannes) accidentally applied the first version of the patch ("Allow TDLS peer AID to be configured for VHT"). Now apply just the changes between v1 and v2 to get the AID verification and prefer the new attribute over the old one. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11{nl,mac,cfg}80211: Allow user to configure basic rates for meshAshok Nagarajan
Currently mesh uses mandatory rates as the default basic rates. Allow basic rates to be configured during mesh join. Basic rates are applied only if channel is also provided with mesh join command. Signed-off-by: Ashok Nagarajan <ashok@cozybit.com> [some whitespace fixes, refuse basic rates w/o channel] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11mac80211: expire mesh peers based on mesh configurationColleen Twitty
The time it takes to see the peer link expire may differ by a minute since sta_expire() is run once a minute as a mesh housekeeping task. Signed-off-by: Colleen Twitty <colleen@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11{nl,cfg}80211: make peer link expiration time configurableColleen Twitty
If a STA has a peer that it hasn't seen any tx activity from for a certain length of time, the peer link is expired. This means the inactive STA is removed from the list of peers and that STA is not considered a peer again unless it re-peers. Previously, this inactivity time was always 30 minutes. Now, add it to the mesh configuration and allow it to be configured. Retain 30 minutes as a default value. Signed-off-by: Colleen Twitty <colleen@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11mac80211: fix mesh deadlockThomas Pedersen
The patch "cfg80211/mac80211: use cfg80211 wdev mutex in mac80211" introduced several deadlocks by converting the ifmsh->mtx to wdev->mtx. Solve these by: 1. drop the cancel_work_sync() in ieee80211_stop_mesh(). Instead make the mesh work conditional on whether the mesh is running or not. 2. lock the mesh work with sdata_lock() to protect beacon updates and prevent races with wdev->mesh_id_len or cfg80211. Signed-off-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11netlink: fix error propagation in netlink_mmap()Patrick McHardy
Return the error if something went wrong instead of unconditionally returning 0. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11net_sched: add 64bit rate estimatorsEric Dumazet
struct gnet_stats_rate_est contains u32 fields, so the bytes per second field can wrap at 34360Mbit. Add a new gnet_stats_rate_est64 structure to get 64bit bps/pps fields, and switch the kernel to use this structure natively. This structure is dumped to user space as a new attribute : TCA_STATS_RATE_EST64 Old tc command will now display the capped bps (to 34360Mbit), instead of wrapped values, and updated tc command will display correct information. Old tc command output, after patch : eric:~# tc -s -d qd sh dev lo qdisc pfifo 8001: root refcnt 2 limit 1000p Sent 80868245400 bytes 1978837 pkt (dropped 0, overlimits 0 requeues 0) rate 34360Mbit 189696pps backlog 0b 0p requeues 0 This patch carefully reorganizes "struct Qdisc" layout to get optimal performance on SMP. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11net: sctp: fix NULL pointer dereference in socket destructionDaniel Borkmann
While stress testing sctp sockets, I hit the following panic: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp] PGD 7cead067 PUD 7ce76067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: sctp(F) libcrc32c(F) [...] CPU: 7 PID: 2950 Comm: acc Tainted: GF 3.10.0-rc2+ #1 Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011 task: ffff88007ce0e0c0 ti: ffff88007b568000 task.ti: ffff88007b568000 RIP: 0010:[<ffffffffa0490c4e>] [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp] RSP: 0018:ffff88007b569e08 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff88007db78a00 RCX: dead000000200200 RDX: ffffffffa049fdb0 RSI: ffff8800379baf38 RDI: 0000000000000000 RBP: ffff88007b569e18 R08: ffff88007c230da0 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff880077990d00 R14: 0000000000000084 R15: ffff88007db78a00 FS: 00007fc18ab61700(0000) GS:ffff88007fc60000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000020 CR3: 000000007cf9d000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e 0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e Call Trace: [<ffffffffa049fded>] sctp_destroy_sock+0x3d/0x80 [sctp] [<ffffffff8145b60e>] sk_common_release+0x1e/0xf0 [<ffffffff814df36e>] inet_create+0x2ae/0x350 [<ffffffff81455a6f>] __sock_create+0x11f/0x240 [<ffffffff81455bf0>] sock_create+0x30/0x40 [<ffffffff8145696c>] SyS_socket+0x4c/0xc0 [<ffffffff815403be>] ? do_page_fault+0xe/0x10 [<ffffffff8153cb32>] ? page_fault+0x22/0x30 [<ffffffff81544e02>] system_call_fastpath+0x16/0x1b Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48> 8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48 RIP [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp] RSP <ffff88007b569e08> CR2: 0000000000000020 ---[ end trace e0d71ec1108c1dd9 ]--- I did not hit this with the lksctp-tools functional tests, but with a small, multi-threaded test program, that heavily allocates, binds, listens and waits in accept on sctp sockets, and then randomly kills some of them (no need for an actual client in this case to hit this). Then, again, allocating, binding, etc, and then killing child processes. This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable'' is set. The cause for that is actually very simple: in sctp_endpoint_init() we enter the path of sctp_auth_init_hmacs(). There, we try to allocate our crypto transforms through crypto_alloc_hash(). In our scenario, it then can happen that crypto_alloc_hash() fails with -EINTR from crypto_larval_wait(), thus we bail out and release the socket via sk_common_release(), sctp_destroy_sock() and hit the NULL pointer dereference as soon as we try to access members in the endpoint during sctp_endpoint_free(), since endpoint at that time is still NULL. Now, if we have that case, we do not need to do any cleanup work and just leave the destruction handler. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11net: pass correct parameter to skb_headers_offset_update()Peter Pan(潘卫平)
Since commit 1a37e412a022(net: Use 16bits for *_headers fields of struct skbuff), skb->*_header are relative to skb->head, so copy_skb_header() should not call skb_headers_offset_update() now, and we should pass correct parameter to skb_headers_offset_update() in pskb_expand_head() and skb_copy_expand(). Signed-off-by: Weiping Pan <panweiping3@gmail.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11netlink: Add compare function for netlink_tableGao feng
As we know, netlink sockets are private resource of net namespace, they can communicate with each other only when they in the same net namespace. this works well until we try to add namespace support for other subsystems which use netlink. Don't like ipv4 and route table.., it is not suited to make these subsytems belong to net namespace, Such as audit and crypto subsystems,they are more suitable to user namespace. So we must have the ability to make the netlink sockets in same user namespace can communicate with each other. This patch adds a new function pointer "compare" for netlink_table, we can decide if the netlink sockets can communicate with each other through this netlink_table self-defined compare function. The behavior isn't changed if we don't provide the compare function for netlink_table. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11bridge: Add a flag to control unicast packet flood.Vlad Yasevich
Add a flag to control flood of unicast traffic. By default, flood is on and the bridge will flood unicast traffic if it doesn't know the destination. When the flag is turned off, unicast traffic without an FDB will not be forwarded to the specified port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11bridge: Add flag to control mac learning.Vlad Yasevich
Allow user to control whether mac learning is enabled on the port. By default, mac learning is enabled. Disabling mac learning will cause new dynamic FDB entries to not be created for a particular port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10sock_diag: fix filter code sent to userspaceNicolas Dichtel
Filters need to be translated to real BPF code for userland, like SO_GETFILTER. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10net: remove last caller of skb_tail_offset() and itselfCong Wang
Similar to the following commits: commit 00f97da17a0c8d656d0c9 (netpoll: fix position of network header) commit 525cebedb32a87fa48584 (pktgen: Fix position of ip and udp header) using skb_tail_offset() seems not correct since the offset is based on head pointer. With the last caller removed, skb_tail_offset() can be killed finally. Cc: Thomas Graf <tgraf@suug.ch> Cc: Daniel Borkmann <dborkmann@redhat.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10tcp: add low latency socket poll support.Eliezer Tamir
Adds low latency socket poll support for TCP. In tcp_v[46]_rcv() add a call to sk_mark_ll() to copy the napi_id from the skb to the sk. In tcp_recvmsg(), when there is no data in the socket we busy-poll. This is a good example of how to add busy-poll support to more protocols. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10udp: add low latency socket poll supportEliezer Tamir
Add upport for busy-polling on UDP sockets. In __udp[46]_lib_rcv add a call to sk_mark_ll() to copy the napi_id from the skb into the sk. This is done at the earliest possible moment, right after we identify which socket this skb is for. In __skb_recv_datagram When there is no data and the user tries to read we busy poll. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10net: add low latency socket pollEliezer Tamir
Adds an ndo_ll_poll method and the code that supports it. This method can be used by low latency applications to busy-poll Ethernet device queues directly from the socket code. sysctl_net_ll_poll controls how many microseconds to poll. Default is zero (disabled). Individual protocol support will be added by subsequent patches. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10net: add napi_id and hashEliezer Tamir
Adds a napi_id and a hashing mechanism to lookup a napi by id. This will be used by subsequent patches to implement low latency Ethernet device polling. Based on a code sample by Eric Dumazet. Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10Merge tag '9p-3.10-bug-fix-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull net/9p bug fix from Eric Van Hensbergen: "zero copy error fix" * tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p: Handle error in zero copy request correctly for 9p2000.u
2013-06-11netfilter: xt_TCPOPTSTRIP: don't use tcp_hdr()Pablo Neira Ayuso
In (bc6bcb5 netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond packet boundary), the use of tcp_hdr was introduced. However, we cannot assume that skb->transport_header is set for non-local packets. Cc: Florian Westphal <fw@strlen.de> Reported-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>