summaryrefslogtreecommitdiffstats
path: root/include/net
AgeCommit message (Collapse)Author
2013-06-13ip_tunnel: remove __net_init/exit from exported functionsEric Dumazet
If CONFIG_NET_NS is not set then __net_init is the same as __init and __net_exit is the same as __exit. These functions will be removed from memory after the module loads or is removed. Functions that are exported for use by other functions should never be labeled for removal. Bug introduced by commit c54419321455631079c ("GRE: Refactor GRE tunneling code.") Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13mac80211: track AP's beacon rate and give it to the driverAlexander Bondar
Track the AP's beacon rate in the scan BSS data and in the interface configuration to let the drivers know which rate the AP is using. This information may be used by drivers, in our case to let the firmware optimise beacon RX. Signed-off-by: Alexander Bondar <alexander.bondar@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-13tcp: properly send new data in fast recovery in first RTTYuchung Cheng
Linux sends new unset data during disorder and recovery state if all (suspected) lost packets have been retransmitted ( RFC5681, section 3.2 step 1 & 2, RFC3517 section 4, NexSeg() Rule 2). One requirement is to keep the receive window about twice the estimated sender's congestion window (tcp_rcv_space_adjust()), assuming the fast retransmits repair the losses in the next round trip. But currently it's not the case on the first round trip in either normal or Fast Open connection, beucase the initial receive window is identical to (expected) sender's initial congestion window. The fix is to double it. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This pull request is intended for the 3.11 stream... One big highlight is the cw1200 driver the ST-E CW1100 & CW1200 WLAN chipsets. This one has been lingering for a while, lacking some review comments. Once started getting pulled into linux-next, it got a bit more attention and a number of improvements were made over the initial cut. No doubt there will be more changes ahead, but I think it is looking alright at this point. Along with that, there is the usual flurry of updates to the mac80211 core and the iwlwifi, mwifiex, ath9k, rt2x00, wil6210, and other drivers. A few of the highlights are some rt2x00 refactoring/cleanup by Gabor Juhos, some rt2800 hardware support enhancements by Stanislaw Gruszka, some iwlwifi power management updates from Alexander Bondar, some enhanced bcma SPROM support from Rafał Miłecki, and a variety of other things here and there. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12Merge branch 'for-john' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Conflicts: drivers/net/wireless/iwlwifi/mvm/mac80211.c
2013-06-12Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless Conflicts: drivers/net/wireless/ath/ath9k/Kconfig net/mac80211/iface.c
2013-06-12Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2013-06-12Bluetooth: Fix mgmt handling of power on failuresJohan Hedberg
If hci_dev_open fails we need to ensure that the corresponding mgmt_set_powered command gets an appropriate response. This patch fixes the missing response by adding a new mgmt_set_powered_failed function that's used to indicate a power on failure to mgmt. Since a situation with the device being rfkilled may require special handling in user space the patch uses a new dedicated mgmt status code for this. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-12ipv4: remove is_data also from ip_options documentation.Rami Rosen
commit ef722495c8867aacc1db0675a6737e5cf1e72e07 ( [IPV4]: Remove unused ip_options->is_data) removed the unused is_data member from ip_options struct. This patch removes is_data also from the documentation of the ip_options struct. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12net: udp4: move GSO functions to udp_offloadDaniel Borkmann
Similarly to TCP offloading and UDPv6 offloading, move all related UDPv4 functions to udp_offload.c to make things more explicit. Also, by this, we can make those functions static. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11net_sched: psched_ratecfg_precompute() improvementsEric Dumazet
Before allowing 64bits bytes rates, refactor psched_ratecfg_precompute() to get better comments and increased accuracy. rate_bps field is renamed to rate_bytes_ps, as we only have to worry about bytes per second. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/ath/ath9k/debug.c net/mac80211/iface.c
2013-06-11{nl,mac,cfg}80211: Allow user to configure basic rates for meshAshok Nagarajan
Currently mesh uses mandatory rates as the default basic rates. Allow basic rates to be configured during mesh join. Basic rates are applied only if channel is also provided with mesh join command. Signed-off-by: Ashok Nagarajan <ashok@cozybit.com> [some whitespace fixes, refuse basic rates w/o channel] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11{nl,cfg}80211: make peer link expiration time configurableColleen Twitty
If a STA has a peer that it hasn't seen any tx activity from for a certain length of time, the peer link is expired. This means the inactive STA is removed from the list of peers and that STA is not considered a peer again unless it re-peers. Previously, this inactivity time was always 30 minutes. Now, add it to the mesh configuration and allow it to be configured. Retain 30 minutes as a default value. Signed-off-by: Colleen Twitty <colleen@cozybit.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11net_sched: add 64bit rate estimatorsEric Dumazet
struct gnet_stats_rate_est contains u32 fields, so the bytes per second field can wrap at 34360Mbit. Add a new gnet_stats_rate_est64 structure to get 64bit bps/pps fields, and switch the kernel to use this structure natively. This structure is dumped to user space as a new attribute : TCA_STATS_RATE_EST64 Old tc command will now display the capped bps (to 34360Mbit), instead of wrapped values, and updated tc command will display correct information. Old tc command output, after patch : eric:~# tc -s -d qd sh dev lo qdisc pfifo 8001: root refcnt 2 limit 1000p Sent 80868245400 bytes 1978837 pkt (dropped 0, overlimits 0 requeues 0) rate 34360Mbit 189696pps backlog 0b 0p requeues 0 This patch carefully reorganizes "struct Qdisc" layout to get optimal performance on SMP. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10net: add low latency socket pollEliezer Tamir
Adds an ndo_ll_poll method and the code that supports it. This method can be used by low latency applications to busy-poll Ethernet device queues directly from the socket code. sysctl_net_ll_poll controls how many microseconds to poll. Default is zero (disabled). Individual protocol support will be added by subsequent patches. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07net: tcp: move GRO/GSO functions to tcp_offloadDaniel Borkmann
Would be good to make things explicit and move those functions to a new file called tcp_offload.c, thus make this similar to tcpv6_offload.c. While moving all related functions into tcp_offload.c, we can also make some of them static, since they are only used there. Also, add an explicit registration function. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-06Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Conflicts: net/netfilter/nf_log.c The conflict in nf_log.c is that in 'net' we added CONFIG_PROC_FS protection around foo_proc_entry() calls to fix a build failure, whereas in Pablo's tree a guard if() test around a call is remove_proc_entry() was removed. Trivially resolved. Pablo Neira Ayuso says: ==================== The following patchset contains the first batch of Netfilter/IPVS updates for your net-next tree, they are: * Three patches with improvements and code refactorization for nfnetlink_queue, from Florian Westphal. * FTP helper now parses replies without brackets, as RFC1123 recommends, from Jeff Mahoney. * Rise a warning to tell everyone about ULOG deprecation, NFLOG has been already in the kernel tree for long time and supersedes the old logging over netlink stub, from myself. * Don't panic if we fail to load netfilter core framework, just bail out instead, from myself. * Add cond_resched_rcu, used by IPVS to allow rescheduling while walking over big hashtables, from Simon Horman. * Change type of IPVS sysctl_sync_qlen_max sysctl to avoid possible overflow, from Zhang Yanfei. * Use strlcpy instead of strncpy to skip zeroing of already initialized area to write the extension names in ebtables, from Chen Gang. * Use already existing per-cpu notrack object from xt_CT, from Eric Dumazet. * Save explicit socket lookup in xt_socket now that we have early demux, also from Eric Dumazet. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Merge 'net' bug fixes into 'net-next' as we have patches that will build on top of them. This merge commit includes a change from Emil Goode (emilgoode@gmail.com) that fixes a warning that would have been introduced by this merge. Specifically it fixes the pingv6_ops method ipv6_chk_addr() to add a "const" to the "struct net_device *dev" argument and likewise update the dummy_ipv6_chk_addr() declaration. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-05wireless: fix kernel-docJohannes Berg
Some kernel-doc fixes for forgotten fields and renamed things. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-05mac80211: Use suitable semantics for beacon availability indicationAlexander Bondar
Currently beacon availability upon association is marked by have_beacon flag of assoc_data structure that becomes unavailable when association completes. However beacon availability indication is required also after association to inform a driver. Currently dtim_period parameter is used for this purpose. Move have_beacon flag to another structure, persistant throughout a interface's life cycle. Use suitable sematics for beacon availability indication. Signed-off-by: Alexander Bondar <alexander.bondar@intel.com> [fix another instance of BSS_CHANGED_DTIM_PERIOD in docs] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04transp_v6.h: style neateningJoe Perches
Use a more current code style. Remove extern from function prototypes. Align function arguments and reflow to 80 cols. Use network comment styles. Signed-off-by: Joe Perches <joe@perches.com> cc: Lorenzo Colitti <lorenzo@google.com>, Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04net: ipv6: Implement /proc/net/icmp6.Lorenzo Colitti
The format is based on /proc/net/icmp and /proc/net/{udp,raw}6. Compiles and displays reasonable results with CONFIG_IPV6={n,m,y} Couldn't figure out how to test without CONFIG_PROC_FS enabled. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04net: ipv4: make the ping /proc code AF-independentLorenzo Colitti
Introduce a ping_seq_afinfo structure (similar to its UDP equivalent) and use it to make some of the ping /proc functions address-family independent. Rename the remaining ping /proc functions from ping_* to ping_v4_*. Compiles and displays reasonable results with CONFIG_IPV6={n,m,y} Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04net: ipv6: Unify {raw,udp}6_sock_seq_show.Lorenzo Colitti
udp6_sock_seq_show and raw6_sock_seq_show are identical, except the UDP version displays ports and the raw version displays the protocol. Refactor most of the code in these two functions into a new common ip6_dgram_sock_seq_show function, in preparation for using it to display ICMPv6 sockets as well. Also reduce the indentation in parts of include/net/transp_v6.h to improve readability. Compiles and displays reasonable results with CONFIG_IPV6={n,m,y} Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04Clean up indentation in net/ipv6/transp_v6.hLorenzo Colitti
Reduce the indentation of most of the functions and make it a bit more consistent. This allows longer function and arg names to be consistently indented without wrapping. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04cfg80211: separate internal SME implementationJohannes Berg
The current internal SME implementation in cfg80211 is very mixed up with the MLME handling, which has been causing issues for a long time. There are three things that the implementation has to provide: * a basic SME implementation for nl80211's connect() call (for drivers implementing auth/assoc, which is really just mac80211) and wireless extensions * MLME events for the userspace SME * SME events (connected, disconnected etc.) for all different SME implementation possibilities (driver, cfg80211 and userspace) To achieve these goals it isn't necessary to track the software SME's connection status outside of it's state (which is the part that caused many issues.) Instead, track it only in the SME data (wdev->conn) and in the general case only track whether the wdev is connected or not (via wdev->current_bss.) Also separate the internal implementation to not have callbacks from the SME events, but rather call it from the API functions that the driver (or rather mac80211) calls. This separates the code better. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04cfg80211/mac80211: clean up cfg80211 SME APIsJohannes Berg
Do some cleanups in the cfg80211 SME APIs, which are only used by mac80211. Most of these functions get a frame passed, and there isn't really any reason to export multiple functions as cfg80211 can check the frame type instead, do that. Additionally, the API functions have confusing names like cfg80211_send_...() which was meant to indicate that it sends an event to userspace, but gets a bit confusing when there's both TX and RX and they're not all clearly labeled. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04mac80211: add a tx control flag to indicate PS-Poll/uAPSD responseFelix Fietkau
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-03cfg80211: take WoWLAN support information out of wiphy structJohannes Berg
There's no need to take up the space for devices that don't support WoWLAN, and most drivers can even make the support data static const (except where it's modified at runtime.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-03ipv4: use separate genid for next hop exceptionsTimo Teräs
commit 13d82bf5 (ipv4: Fix flushing of cached routing informations) added the support to flush learned pmtu information. However, using rt_genid is quite heavy as it is bumped on route add/change and multicast events amongst other places. These can happen quite often, especially if using dynamic routing protocols. While this is ok with routes (as they are just recreated locally), the pmtu information is learned from remote systems and the icmp notification can come with long delays. It is worthy to have separate genid to avoid excessive pmtu resets. Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-02net_sched: restore "overhead xxx" handlingEric Dumazet
commit 56b765b79 ("htb: improved accuracy at high rates") broke the "overhead xxx" handling, as well as the "linklayer atm" attribute. tc class add ... htb rate X ceil Y linklayer atm overhead 10 This patch restores the "overhead xxx" handling, for htb, tbf and act_police The "linklayer atm" thing needs a separate fix. Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vimalkumar <j.vimal@gmail.com> Cc: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31xfrm: force a garbage collection after deleting a policyPaul Moore
In some cases after deleting a policy from the SPD the policy would remain in the dst/flow/route cache for an extended period of time which caused problems for SELinux as its dynamic network access controls key off of the number of XFRM policy and state entries. This patch corrects this problem by forcing a XFRM garbage collection whenever a policy is sucessfully removed. Reported-by: Ondrej Moris <omoris@redhat.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-31iptunnel: specify protocol outside IP headerNicolas Dichtel
Before this patch, ip_tunnel_xmit() was using the field protocol from the IP header passed into argument. There is no functional change, this patch prepares the support of IPv4 over IPv4 for module sit. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-29cfg80211: remove cleanup_work kernel-docJohannes Berg
I evidently forgot this when removing the work itself. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-05-29cfg80211: support an active monitor interface flagFelix Fietkau
An active monitor interface is one that is used for communication (via injection). It is expected to ACK incoming unicast packets. This is useful for running various 802.11 testing utilities that associate to an AP via injection and manage the state in user space. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-05-28net: Correct comparisons and calculations using skb->tail and ↵Simon Horman
skb-transport_header This corrects an regression introduced by "net: Use 16bits for *_headers fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In that case skb->tail will be a pointer whereas skb->transport_header will be an offset from head. This is corrected by using wrappers that ensure that comparisons and calculations are always made using pointers. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-27cfg80211: make WoWLAN configuration available to driversJohannes Berg
Make the current WoWLAN configuration available to drivers at runtime. This isn't really useful for the normal WoWLAN behaviour and accessing it can also be racy, but drivers may use it for testing the WoWLAN device behaviour while the host stays up & running to observe the device. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-05-25net: ipv6: Add IPv6 support to the ping socket.Lorenzo Colitti
This adds the ability to send ICMPv6 echo requests without a raw socket. The equivalent ability for ICMPv4 was added in 2011. Instead of having separate code paths for IPv4 and IPv6, make most of the code in net/ipv4/ping.c dual-stack and only add a few IPv6-specific bits (like the protocol definition) to a new net/ipv6/ping.c. Hopefully this will reduce divergence and/or duplication of bugs in the future. Caveats: - Setting options via ancillary data (e.g., using IPV6_PKTINFO to specify the outgoing interface) is not yet supported. - There are no separate security settings for IPv4 and IPv6; everything is controlled by /proc/net/ipv4/ping_group_range. - The proc interface does not yet display IPv6 ping sockets properly. Tested with a patched copy of ping6 and using raw socket calls. Compiles and works with all of CONFIG_IPV6={n,m,y}. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-26ipvs: change type of netns_ipvs->sysctl_sync_qlen_maxZhang Yanfei
This member of struct netns_ipvs is calculated from nr_free_buffer_pages so change its type to unsigned long in case of overflow. Also, type of its related proc var sync_qlen_max and the return type of function sysctl_sync_qlen_max() should be changed to unsigned long, too. Besides, the type of ipvs_master_sync_state->sync_queue_len should be changed to unsigned long accordingly. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Julian Anastasov <ja@ssi.bg> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Simon Horman <horms@verge.net.au>
2013-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Merge net into net-next because some upcoming net-next changes build on top of bug fixes that went into net. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-25cfg80211/mac80211: use cfg80211 wdev mutex in mac80211Johannes Berg
Using separate locks in cfg80211 and mac80211 has always caused issues, for example having to unlock in places in mac80211 to call cfg80211, which even needed a framework to make cfg80211 calls after some functions returned etc. Additionally, I suspect some issues people have reported with the cfg80211 state getting confused could be due to such issues, when cfg80211 is asking mac80211 to change state but mac80211 is in the process of telling cfg80211 that the state changed (in another way.) Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-05-25cfg80211: vastly simplify lockingJohannes Berg
Virtually all code paths in cfg80211 already (need to) hold the RTNL. As such, there's little point in having another four mutexes for various parts of the code, they just cause lock ordering issues (and much of the time, the RTNL and a few of the others need thus be held.) Simplify all this by getting rid of the extra four mutexes and just use the RTNL throughout. Only a few code changes were needed to do this and we can get rid of a work struct for bonus points. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-05-25Merge remote-tracking branch 'mac80211/master' into mac80211-nextJohannes Berg
2013-05-24{cfg,mac}80211: move mandatory rates calculation to cfg80211Ashok Nagarajan
Move mandatory rates calculation to cfg80211, shared with non mac80211 drivers. Signed-off-by: Ashok Nagarajan <ashok@cozybit.com> [extend documentation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-05-24mac80211: add STBC flag for radiotapOleksij Rempel
Some chips can tell us if received frame was encoded with STBC or not. To make this information available in user space we can use updated radiotap specification: http://www.radiotap.org/defined-fields/MCS This patch will set number of STBC encoded spatial streams (Nss). The HAVE_STBC flag should be provided by driver. Signed-off-by: Oleksij Rempel <linux@rempel-privat.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-05-23netfilter: {ipt,ebt}_ULOG: rise warning on deprecationPablo Neira Ayuso
This target has been superseded by NFLOG. Spot a warning so we prepare removal in a couple of years. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-05-23netfilter: add nf_ipv6_ops hook to fix xt_addrtype with IPv6Florian Westphal
Quoting https://bugzilla.netfilter.org/show_bug.cgi?id=812: [ ip6tables -m addrtype ] When I tried to use in the nat/PREROUTING it messes up the routing cache even if the rule didn't matched at all. [..] If I remove the --limit-iface-in from the non-working scenario, so just use the -m addrtype --dst-type LOCAL it works! This happens when LOCAL type matching is requested with --limit-iface-in, and the default ipv6 route is via the interface the packet we test arrived on. Because xt_addrtype uses ip6_route_output, the ipv6 routing implementation creates an unwanted cached entry, and the packet won't make it to the real/expected destination. Silently ignoring --limit-iface-in makes the routing work but it breaks rule matching (--dst-type LOCAL with limit-iface-in is supposed to only match if the dst address is configured on the incoming interface; without --limit-iface-in it will match if the address is reachable via lo). The test should call ipv6_chk_addr() instead. However, this would add a link-time dependency on ipv6. There are two possible solutions: 1) Revert the commit that moved ipt_addrtype to xt_addrtype, and put ipv6 specific code into ip6t_addrtype. 2) add new "nf_ipv6_ops" struct to register pointers to ipv6 functions. While the former might seem preferable, Pablo pointed out that there are more xt modules with link-time dependeny issues regarding ipv6, so lets go for 2). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-05-20tcp: md5: remove spinlock usage in fast pathEric Dumazet
TCP md5 code uses per cpu variables but protects access to them with a shared spinlock, which is a contention point. [ tcp_md5sig_pool_lock is locked twice per incoming packet ] Makes things much simpler, by allocating crypto structures once, first time a socket needs md5 keys, and not deallocating them as they are really small. Next step would be to allow crypto allocations being done in a NUMA aware way. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-05-20net: ipv6: remove 'next' member from inet6_devDaniel Borkmann
The next pointer within the inet6_dev structure seems not to be used anywhere. So just remove it. Tested with allmodconfig on x86_64. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>