summaryrefslogtreecommitdiffstats
path: root/include/net
AgeCommit message (Collapse)Author
2012-07-10tcp: Maintain dynamic metrics in local cache.David S. Miller
Maintain a local hash table of TCP dynamic metrics blobs. Computed TCP metrics are no longer maintained in the route metrics. The table uses RCU and an extremely simple hash so that it has low latency and low overhead. A simple hash is legitimate because we only make metrics blobs for fully established connections. Some tweaking of the default hash table sizes, metric timeouts, and the hash chain length limit certainly could use some tweaking. But the basic design seems sound. With help from Eric Dumazet and Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Abstract back handling peer aliveness test into helper function.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Move dynamnic metrics handling into seperate file.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Fix crashes in fib_rules_tclass().David S. Miller
All paths assume, when CONFIG_IP_MULTIPLE_TABLES is enabled, that any successful call to fib_lookup() will initialize the fib_result->r value to something. We violated that expectation in the new fib_lookup() fast path. Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09NFC: Allow HCI driver to pre-open pipes to some gatesEric Lapuyade
Some NFC chips will statically create and open pipes for both standard and proprietary gates. The driver can now pass this information to HCI such that HCI will not attempt to create and open them, but will instead directly use the passed pipe ids. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2012-07-09NFC: Driver failure APIEric Lapuyade
This API should be used by drivers, HCI, SHDLC or NCI stacks to report an unrecoverable error. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2012-07-09NFC: Prepare asynchronous error management for driver and shdlcEric Lapuyade
Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2012-07-09cfg80211: use wdev in mgmt-tx/ROC APIsJohannes Berg
The management frame and remain-on-channel APIs will be needed in the P2P device abstraction, so move them over to the new wdev-based APIs. Userspace can still use both the interface index and wdev identifier for them so it's backward compatible, but for the P2P Device wdev it will be able to use the wdev identifier only. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-07-09nl80211: prepare for non-netdev wireless devsJohannes Berg
In order to support a P2P device abstraction and Bluetooth high-speed AMPs, we need to have a way to identify virtual interfaces that don't have a netdev associated. Do this by adding a NL80211_ATTR_WDEV attribute to identify a wdev which may or may not also be a netdev. To simplify things, use a 64-bit value with the high 32 bits being the wiphy index for this new wdev identifier in the nl80211 API. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-07-09mac80211: remove ieee80211_key_removedJohannes Berg
This API call was intended to be used by drivers if they want to optimize key handling by removing one key when another is added. Remove it since no driver is using it. If needed, it can always be added back. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-07-09netfilter: nf_ct_ecache: fix crash with multiple containers, one shutting downPablo Neira Ayuso
Hans reports that he's still hitting: BUG: unable to handle kernel NULL pointer dereference at 000000000000027c IP: [<ffffffff813615db>] netlink_has_listeners+0xb/0x60 PGD 0 Oops: 0000 [#3] PREEMPT SMP CPU 0 It happens when adding a number of containers with do: nfct_query(h, NFCT_Q_CREATE, ct); and most likely one namespace shuts down. this problem was supposed to be fixed by: 70e9942 netfilter: nf_conntrack: make event callback registration per-netns Still, it was missing one rcu_access_pointer to check if the callback is set or not. Reported-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-07-07Merge branch 'master' of git://1984.lsi.us.es/nf-nextDavid S. Miller
2012-07-05ipv4: Avoid overhead when no custom FIB rules are installed.David S. Miller
If the user hasn't actually installed any custom rules, or fiddled with the default ones, don't go through the whole FIB rules layer. It's just pure overhead. Instead do what we do with CONFIG_IP_MULTIPLE_TABLES disabled, check the individual tables by hand, one by one. Also, move fib_num_tclassid_users into the ipv4 network namespace. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05cfg80211: bitrate calculation for 60gVladimir Kondratiev
60g band uses different from .11n MCS scheme, so bitrate should be calculated differently Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-07-05{nl,cfg}80211: support high bitratesVladimir Kondratiev
Until now, a u16 value was used to represent bitrate value. With VHT bitrates this becomes too small. Introduce a new 32-bit bitrate attribute. nl80211 will report both the new and the old attribute, unless the bitrate doesn't fit into the old u16 attribute in which case only the new one will be reported. User space tools encouraged to prefer the 32-bit attribute, if available (since it won't be available on older kernels.) Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> [reword commit message and comments a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-07-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2012-07-05net: Kill dst->_neighbour, accessors, and final uses.David S. Miller
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05ipv6: Store route neighbour in rt6_info struct.David S. Miller
This makes for a simplified conversion away from dst_get_neighbour*(). All code outside of ipv6 will use neigh lookups via dst_neigh_lookup*(). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05net: Pass neighbours and dest address into NETEVENT_REDIRECT events.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05decnet: Use neighbours privately in dn_route struct.David S. Miller
This allows an easy conversion away from dst_get_neighbour*(). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05net: Add optional SKB arg to dst_ops->neigh_lookup().David S. Miller
Causes the handler to use the daddr in the ipv4/ipv6 header when the route gateway is unspecified (local subnet). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05net: Do delayed neigh confirmation.David S. Miller
When a dst_confirm() happens, mark the confirmation as pending in the dst. Then on the next packet out, when we have the neigh in-hand, do the update. This removes the dependency in dst_confirm() of dst's having an attached neigh. While we're here, remove the explicit 'dst' NULL check, all except 2 or 3 call sites ensure it's not NULL. So just fix those cases up. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05ipv4: Make neigh lookups directly in output packet path.David S. Miller
Do not use the dst cached neigh, we'll be getting rid of that. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-04netfilter: nf_conntrack: generalize nf_ct_l4proto_netPablo Neira Ayuso
This patch generalizes nf_ct_l4proto_net by splitting it into chunks and moving the corresponding protocol part to where it really belongs to. To clarify, note that we follow two different approaches to support per-net depending if it's built-in or run-time loadable protocol tracker. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-07-03mac80211: add TX prepare APIJohannes Berg
Some drivers require setup before being able to send management frames in managed mode, in particular in multi-channel cases. Introduce API to allow the drivers to do such setup while being able to sleep waiting for the setup to finish in the device. This isn't possible inside the TX call since that can't sleep. A future patch may also restructure the TX retry to wait for the driver to report the frame status, as suggested by Arik in http://mid.gmane.org/CA+XVXffKSEL6ZQPQ98x-zO-NL2=TNF1uN==mprRyUmAaRn254g@mail.gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-07-03mac80211: reduce IEEE80211_TX_MAX_RATESThomas Huehn
IEEE80211_TX_MAX_RATES can be reduced from 5 to 4 as there is no current hardware supporting a rate chain with 5 multi rate stages (mrr), so 4 mrr stages are sufficient. The memory that is freed within the ieee80211_tx_info struct will be used in the upcoming Transmission Power Control (TPC) implementation. Suggested-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> [reword commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-07-02mac80211: remove tx_frags driver callbackJohannes Berg
The implementation of tx_frags is buggy due to not handling queue stop, and there's no driver implementing it so remove it. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-07-02cfg80211: add 802.11ad (60gHz band) supportVladimir Kondratiev
Add enumerations for both cfg80211 and nl80211. This expands wiphy.bands etc. arrays. Extend channel <-> frequency translation to cover 60g band and modify the rate check logic since there are no legacy mandatory rates (only MCS is used.) Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-06-30sctp: be more restrictive in transport selection on bundled sacksNeil Horman
It was noticed recently that when we send data on a transport, its possible that we might bundle a sack that arrived on a different transport. While this isn't a major problem, it does go against the SHOULD requirement in section 6.4 of RFC 2960: An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK, etc.) to the same destination transport address from which it received the DATA or control chunk to which it is replying. This rule should also be followed if the endpoint is bundling DATA chunks together with the reply chunk. This patch seeks to correct that. It restricts the bundling of sack operations to only those transports which have moved the ctsn of the association forward since the last sack. By doing this we guarantee that we only bundle outbound saks on a transport that has received a chunk since the last sack. This brings us into stricter compliance with the RFC. Vlad had initially suggested that we strictly allow only sack bundling on the transport that last moved the ctsn forward. While this makes sense, I was concerned that doing so prevented us from bundling in the case where we had received chunks that moved the ctsn on multiple transports. In those cases, the RFC allows us to select any of the transports having received chunks to bundle the sack on. so I've modified the approach to allow for that, by adding a state variable to each transport that tracks weather it has moved the ctsn since the last sack. This I think keeps our behavior (and performance), close enough to our current profile that I think we can do this without a sysctl knob to enable/disable it. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yaseivch <vyasevich@gmail.com> CC: David S. Miller <davem@davemloft.net> CC: linux-sctp@vger.kernel.org Reported-by: Michele Baldessari <michele@redhat.com> Reported-by: sorin serban <sserban@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-30Bluetooth: Improve debugging messages for hci_connAndrei Emeltchenko
Improve debugging of hci_conn objects by: adding print to hci_conn refcounting, adding object spcifier when missing, change conn to hcon since conn is heavily used for l2cap_conn objects and this is misleading. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2012-06-29Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/brcm80211/brcmfmac/dhd_sdio.c
2012-06-29cfg80211/mac80211: remove .get_channelMichal Kazior
We do not need it anymore since cfg80211 tracks monitor channel and monitor channel type. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-06-29cfg80211: track monitor interfaces countMichal Kazior
Implements .set_monitor_enabled(wiphy, enabled). Notifies driver upon change of interface layout. If only monitor interfaces become present it is called with 2nd argument being true. If non-monitor interface appears then 2nd argument is false. Driver is notified only upon change. This makes it more obvious about the fact that cfg80211 supports single monitor channel. Once we implement multi-channel we don't want to allow setting monitor channel while other interface types are running. Otherwise it would be ambiguous once we start considering num_different_channels. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-06-29cfg80211: track ibss fixed channelMichal Kazior
IBSS may hop between channels. It is necessary to account this special case when considering interface combinations. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-06-29cfg80211: add channel tracking for AP and meshMichal Kazior
We need to know which channel is used by a running AP and mesh for channel context accounting and finding matching/active interface combination. STA/IBSS have current_bss already which allows us to check which channel a vif is tuned to. Non-fixed channel IBSS can be handled with additional changes. Monitor mode is going to be handled differently. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-06-29ipv4: Elide fib_validate_source() completely when possible.David S. Miller
If rpfilter is off (or the SKB has an IPSEC path) and there are not tclassid users, we don't have to do anything at all when fib_validate_source() is invoked besides setting the itag to zero. We monitor tclassid uses with a counter (modified only under RTNL and marked __read_mostly) and we protect the fib_validate_source() real work with a test against this counter and whether rpfilter is to be done. Having a way to know whether we need no tclassid processing or not also opens the door for future optimized rpfilter algorithms that do not perform full FIB lookups. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-29ipv6_tunnel: Allow receiving packets on the fallback tunnel if they pass ↵Ville Nuorvala
sanity checks At Facebook, we do Layer-3 DSR via IP-in-IP tunneling. Our load balancers wrap an extra IP header on incoming packets so they can be routed to the backend. In the v4 tunnel driver, when these packets fall on the default tunl0 device, the behavior is to decapsulate them and drop them back on the stack. So our setup is that tunl0 has the VIP and eth0 has (obviously) the backend's real address. In IPv6 we do the same thing, but the v6 tunnel driver didn't have this same behavior - if you didn't have an explicit tunnel setup, it would drop the packet. This patch brings that v4 feature to the v6 driver. The same IPv6 address checks are performed as with any normal tunnel, but as the fallback tunnel endpoint addresses are unspecified, the checks must be performed on a per-packet basis, rather than at tunnel configuration time. [Patch description modified by phil@ipom.com] Signed-off-by: Ville Nuorvala <ville.nuorvala@gmail.com> Tested-by: Phil Dibowitz <phil@ipom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28ipv4: Adjust in_dev handling in fib_validate_source()David S. Miller
Checking for in_dev being NULL is pointless. In fact, all of our callers have in_dev precomputed already, so just pass it in and remove the NULL checking. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28net: Use NLMSG_DEFAULT_SIZE in combination with nlmsg_new()Thomas Graf
Using NLMSG_GOODSIZE results in multiple pages being used as nlmsg_new() will automatically add the size of the netlink header to the payload thus exceeding the page limit. NLMSG_DEFAULT_SIZE takes this into account. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Sergey Lapin <slapin@ossfans.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org> Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org> Cc: Samuel Ortiz <sameo@linux.intel.com> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28tcp: pass fl6 to inet6_csk_route_req()Neal Cardwell
This commit changes inet_csk_route_req() so that it uses a pointer to a struct flowi6, rather than allocating its own on the stack. This brings its behavior in line with its IPv4 cousin, inet_csk_route_req(), and allows a follow-on patch to fix a dst leak. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28Merge remote-tracking branch 'wireless-next/master' into mac80211-nextJohannes Berg
2012-06-28cfg80211: allow advertising VHT capabilitiesMahesh Palivela
Allow drivers to advertise their VHT capabilities and export them to userspace via nl80211. Signed-off-by: Mahesh Palivela <maheshp@posedge.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-06-28ipv4: Kill rt->rt_spec_dst, no longer used.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28ipv4: Create and use fib_compute_spec_dst() helper.David S. Miller
The specific destination is the host we direct unicast replies to. Usually this is the original packet source address, but if we are responding to a multicast or broadcast packet we have to use something different. Specifically we must use the source address we would use if we were to send a packet to the unicast source of the original packet. The routing cache precomputes this value, but we want to remove that precomputation because it creates a hard dependency on the expensive rpfilter source address validation which we'd like to make cheaper. There are only three places where this matters: 1) ICMP replies. 2) pktinfo CMSG 3) IP options Now there will be no real users of rt->rt_spec_dst and we can simply remove it altogether. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28ipv4: Show that ip_send_reply() is purely unicast routine.David S. Miller
Rename it to ip_send_unicast_reply() and add explicit 'saddr' argument. This removed one of the few users of rt->rt_spec_dst. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28mac80211: don't expose ieee80211_add_srates_ie()Johannes Berg
This and ieee80211_add_ext_srates_ie() aren't exported, so can't be used by drivers anyway, but there's also no reason that they should be so make them private to mac80211 and use sdata instead of vif arguments. Acked-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-06-27ipv4: Kill early demux method return value.David S. Miller
It's completely unnecessary. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-27xfrm_user: Propagate netlink error codes properly.David S. Miller
Instead of using a fixed value of "-1" or "-EMSGSIZE", propagate what the nla_*() interfaces actually return. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-27Revert "ipv4: tcp: dont cache unconfirmed intput dst"David S. Miller
This reverts commit c074da2810c118b3812f32d6754bd9ead2f169e7. This change has several unwanted side effects: 1) Sockets will cache the DST_NOCACHE route in sk->sk_rx_dst and we'll thus never create a real cached route. 2) All TCP traffic will use DST_NOCACHE and never use the routing cache at all. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-27ipv4: tcp: dont cache unconfirmed intput dstEric Dumazet
DDOS synflood attacks hit badly IP route cache. On typical machines, this cache is allowed to hold up to 8 Millions dst entries, 256 bytes for each, for a total of 2GB of memory. rt_garbage_collect() triggers and tries to cleanup things. Eventually route cache is disabled but machine is under fire and might OOM and crash. This patch exploits the new TCP early demux, to set a nocache boolean in case incoming TCP frame is for a not yet ESTABLISHED or TIMEWAIT socket. This 'nocache' boolean is then used in case dst entry is not found in route cache, to create an unhashed dst entry (DST_NOCACHE) SYN-cookie-ACK sent use a similar mechanism (ipv4: tcp: dont cache output dst for syncookies), so after this patch, a machine is able to absorb a DDOS synflood attack without polluting its IP route cache. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>