asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2010-10-04	netfilter: remove duplicated include	Nicolas Kaiser
	Remove duplicated include. Signed-off-by: Nicolas Kaiser <nikai@nikai.net> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-04	Merge branch 'master' of ↵	David S. Miller
	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/Kconfig net/ipv4/tcp_timer.c
2010-10-04	netfilter: ipt_LOG: add bufferisation to call printk() once	Eric Dumazet
	ipt_LOG & ip6t_LOG use lot of calls to printk() and use a lock in a hope several cpus wont mix their output in syslog. printk() being very expensive [1], its better to call it once, on a prebuilt and complete line. Also, with mixed IPv4 and IPv6 trafic, separate IPv4/IPv6 locks dont avoid garbage. I used an allocation of a 1024 bytes structure, sort of seq_printf() but with a fixed size limit. Use a static buffer if dynamic allocation failed. Emit a once time alert if buffer size happens to be too short. [1]: printk() has various features like printk_delay()... Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-04	netfilter: nf_nat: make find/put static	Stephen Hemminger
	The functions nf_nat_proto_find_get and nf_nat_proto_put are only used internally in nf_nat_core. This might break some out of tree NAT module. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-10-04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: vlan: dont drop packets from unknown vlans in promiscuous mode Phonet: Correct header retrieval after pskb_may_pull um: Proper Fix for f25c80a4: remove duplicate structure field initialization ip_gre: Fix dependencies wrt. ipv6. net-2.6: SYN retransmits: Add new parameter to retransmits_timed_out() iwl3945: queue the right work if the scan needs to be aborted mac80211: fix use-after-free
2010-10-04	IPVS: sip persistence engine	Simon Horman
	Add the SIP callid as a key for persistence. This allows multiple connections from the same IP address to be differentiated on the basis of the callid. When used in conjunction with the persistence mask, it allows connections from different IP addresses to be aggregated on the basis of the callid. It is envisaged that a persistence mask of 0.0.0.0 will be a useful setting. That is, ignore the source IP address when checking for persistence. It is envisaged that this option will be used in conjunction with one-packet scheduling. This only works with UDP and cannot be made to work with TCP within the current framework. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	IPVS: Fallback if persistence engine fails	Simon Horman
	Fall back to normal persistence handling if the persistence engine fails to recognise a packet. This way, at least the packet will go somewhere. It is envisaged that iptables could be used to block packets such if this is not desired although nf_conntrack_sip would likely need to be enhanced first. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	IPVS: Allow configuration of persistence engines	Simon Horman
	Allow the persistence engine of a virtual service to be set, edited and unset. This feature only works with the netlink user-space interface. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	IPVS: management of persistence engine modules	Simon Horman
	This is based heavily on the scheduler management code Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	IPVS: Add persistence engine data to /proc/net/ip_vs_conn	Simon Horman
	This shouldn't break compatibility with userspace as the new data is at the end of the line. I have confirmed that this doesn't break ipvsadm, the main (only?) user-space user of this data. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	IPVS: Add struct ip_vs_pe	Simon Horman
	Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	IPVS: ip_vs_{un,}bind_scheduler NULL arguments	Simon Horman
	In general NULL arguments aren't passed by the few callers that exist, so don't test for them. The exception is to make passing NULL to ip_vs_unbind_scheduler() a noop. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	IPVS: Allow null argument to ip_vs_scheduler_put()	Simon Horman
	This simplifies caller logic sightly. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	IPVS: Add struct ip_vs_conn_param	Simon Horman
	Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	IPVS: compact ip_vs_sched_persist()	Simon Horman
	Compact ip_vs_sched_persist() by setting up parameters and calling functions once. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	netfilter: nf_conntrack_sip: Add callid parser	Simon Horman
	Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-04	netfilter: nf_conntrack_sip: Allow ct_sip_get_header() to be called with a ↵	Simon Horman
	null ct argument Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julian Anastasov <ja@ssi.bg>
2010-10-03	net: introduce DST_NOCACHE flag	Eric Dumazet
	While doing stress tests with IP route cache disabled, and multi queue devices, I noticed a very high contention on one rwlock used in neighbour code. When many cpus are trying to send frames (possibly using a high performance multiqueue device) to the same neighbour, they fight for the neigh->lock rwlock in order to call neigh_hh_init(), and fight on hh->hh_refcnt (a pair of atomic_inc/atomic_dec_and_test()) But we dont need to call neigh_hh_init() for dst that are used only once. It costs four atomic operations at least, on two contended cache lines, plus the high contention on neigh->lock rwlock. Introduce a new dst flag, DST_NOCACHE, that is set when dst was not inserted in route cache. With the stress test bench, sending 160000000 frames on one neighbour, results are : Before patch: real 2m28.406s user 0m11.781s sys 36m17.964s After patch: real 1m26.532s user 0m12.185s sys 20m3.903s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	sctp: Fix break indentation in sctp_ioctl().	David S. Miller
	Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	Merge branch 'for-davem' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2010-10-03	sctp: Fix out-of-bounds reading in sctp_asoc_get_hmac()	Dan Rosenberg
	The sctp_asoc_get_hmac() function iterates through a peer's hmac_ids array and attempts to ensure that only a supported hmac entry is returned. The current code fails to do this properly - if the last id in the array is out of range (greater than SCTP_AUTH_HMAC_ID_MAX), the id integer remains set after exiting the loop, and the address of an out-of-bounds entry will be returned and subsequently used in the parent function, causing potentially ugly memory corruption. This patch resets the id integer to 0 on encountering an invalid id so that NULL will be returned after finishing the loop if no valid ids are found. Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	sctp: prevent reading out-of-bounds memory	Dan Rosenberg
	Two user-controlled allocations in SCTP are subsequently dereferenced as sockaddr structs, without checking if the dereferenced struct members fall beyond the end of the allocated chunk. There doesn't appear to be any information leakage here based on how these members are used and additional checking, but it's still worth fixing. [akpm@linux-foundation.org: remove unfashionable newlines, fix gmail tab->space conversion] Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	ipv4: correct IGMP behavior on v3 query during v2-compatibility mode	David Stevens
	A recent patch to allow IGMPv2 responses to IGMPv3 queries bypasses length checks for valid query lengths, incorrectly resets the v2_seen timer, and does not support IGMPv1. The following patch responds with a v2 report as required by IGMPv2 while correcting the other problems introduced by the patch. Signed-Off-By: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	ipmr: cleanups	Eric Dumazet
	Various code style cleanups Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	ipmr: RCU protection for mfc_cache_array	Eric Dumazet
	Use RCU & RTNL protection for mfc_cache_array[] ipmr_cache_find() is called under rcu_read_lock(); Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	ipmr: RCU conversion of mroute_sk	Eric Dumazet
	Use RCU and RTNL to protect (struct mr_table)->mroute_sk Readers use RCU, writers use RTNL. ip_ra_control() already use an RCU grace period before ip_ra_destroy_rcu(), so we dont need synchronize_rcu() in mrtsock_destruct() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	ipmr: __pim_rcv() is called under rcu_read_lock	Eric Dumazet
	No need to get a reference on reg_dev and release it, we are in a rcu_read_lock() protected section. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	gre: protocol table can be static	stephen hemminger
	This table is only used in gre.c Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	Revert "ipv4: Make INET_LRO a bool instead of tristate."	Ben Hutchings
	This reverts commit e81963b180ac502fda0326edf059b1e29cdef1a2. LRO is now deprecated in favour of GRO, and only a few drivers use it, so it is desirable to build it as a module in distribution kernels. The original change to prevent building it as a module was made in an attempt to avoid the case where some dependents are set to y and some to m, and INET_LRO can be set to m rather than y. However, the Kconfig system will reliably set INET_LRO=y in this case. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	net: Fix the condition passed to sk_wait_event()	Nagendra Tomar
	This patch fixes the condition (3rd arg) passed to sk_wait_event() in sk_stream_wait_memory(). The incorrect check in sk_stream_wait_memory() causes the following soft lockup in tcp_sendmsg() when the global tcp memory pool has exhausted. >>> snip <<< localhost kernel: BUG: soft lockup - CPU#3 stuck for 11s! [sshd:6429] localhost kernel: CPU 3: localhost kernel: RIP: 0010:[sk_stream_wait_memory+0xcd/0x200] [sk_stream_wait_memory+0xcd/0x200] sk_stream_wait_memory+0xcd/0x200 localhost kernel: localhost kernel: Call Trace: localhost kernel: [sk_stream_wait_memory+0x1b1/0x200] sk_stream_wait_memory+0x1b1/0x200 localhost kernel: [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40 localhost kernel: [ipv6:tcp_sendmsg+0x6e6/0xe90] tcp_sendmsg+0x6e6/0xce0 localhost kernel: [sock_aio_write+0x126/0x140] sock_aio_write+0x126/0x140 localhost kernel: [xfs:do_sync_write+0xf1/0x130] do_sync_write+0xf1/0x130 localhost kernel: [<ffffffff802557c0>] autoremove_wake_function+0x0/0x40 localhost kernel: [hrtimer_start+0xe3/0x170] hrtimer_start+0xe3/0x170 localhost kernel: [vfs_write+0x185/0x190] vfs_write+0x185/0x190 localhost kernel: [sys_write+0x50/0x90] sys_write+0x50/0x90 localhost kernel: [system_call+0x7e/0x83] system_call+0x7e/0x83 >>> snip <<< What is happening is, that the sk_wait_event() condition passed from sk_stream_wait_memory() evaluates to true for the case of tcp global memory exhaustion. This is because both sk_stream_memory_free() and vm_wait are true which causes sk_wait_event() to not call schedule_timeout(). Hence sk_stream_wait_memory() returns immediately to the caller w/o sleeping. This causes the caller to again try allocation, which again fails and again calls sk_stream_wait_memory(), and so on. [ Bug introduced by commit c1cbe4b7ad0bc4b1d98ea708a3fecb7362aa4088 ("[NET]: Avoid atomic xchg() for non-error case") -DaveM ] Signed-off-by: Nagendra Singh Tomar <tomer_iisc@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-03	net: Fix IPv6 PMTU disc. w/ asymmetric routes	Maciej Żenczykowski
	Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-01	nfsd: provide callbacks on svc_xprt deletion	J. Bruce Fields
	NFSv4.1 needs warning when a client tcp connection goes down, if that connection is being used as a backchannel, so that it can warn the client that it has lost the backchannel connection. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-10-01	nfsd4: remove spkm3	J. Bruce Fields
	Unfortunately, spkm3 never got very far; while interoperability with one other implementation was demonstrated at some point, problems were found with the spec that were deemed not worth fixing. The kernel code is useless on its own without nfs-utils patches which were never merged into nfs-utils, and were only ever available from citi.umich.edu. They appear not to have been updated since 2005. Therefore it seems safe to assume that this code has no users, and never will. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	sunrpc: fix race in new cache_wait code.	NeilBrown
	If we set up to wait for a cache item to be filled in, and then find that it is no longer pending, it could be that some other thread is in 'cache_revisit_request' and has moved our request to its 'pending' list. So when our setup_deferral calls cache_revisit_request it will find nothing to put on the pending list, and do nothing. We then return from cache_wait_req, thus leaving the 'sleeper' on-stack structure open to being corrupted by subsequent stack usage. However that 'sleeper' could still be on the 'pending' list that the other thread is looking at and so any corruption could cause it to behave badly. To avoid this race we simply take the same path as if the 'wait_for_completion_interruptible_timeout' was interrupted and if the sleeper is no longer on the list (which it won't be) we wait on the completion - which will ensure that any other cache_revisit_request will have let go of the sleeper. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	sunrpc: Create sockets in net namespaces	Pavel Emelyanov
	The context is already known in all the sock_create callers. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	net: Export __sock_create	Pavel Emelyanov
	Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	sunrpc: Tag rpc_xprt with net	Pavel Emelyanov
	The net is known from the xprt_create and this tagging will also give un the context in the conntection workers where real sockets are created. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	sunrpc: Add net to xprt_create	Pavel Emelyanov
	Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	sunrpc: Add net to rpc_create_args	Pavel Emelyanov
	Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	sunrpc: Pull net argument downto svc_create_socket	Pavel Emelyanov
	After this the socket creation in it knows the context. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	sunrpc: Add net argument to svc_create_xprt	Pavel Emelyanov
	Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	sunrpc: Factor out rpc_xprt freeing	Pavel Emelyanov
	Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	sunrpc: Factor out rpc_xprt allocation	Pavel Emelyanov
	Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-01	Merge branch 'master' of ↵	John W. Linville
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
2010-09-30	ipv4: rcu conversion in ip_route_output_slow	Eric Dumazet
	ip_route_output_slow() is enclosed in an rcu_read_lock() protected section, so that no references are taken/released on device, thanks to __ip_dev_find() & dev_get_by_index_rcu() Tested with ip route cache disabled, and a stress test : Before patch: elapsed time : real 1m38.347s user 0m11.909s sys 23m51.501s Profile: 13788.00 22.7% ip_route_output_slow [kernel] 7875.00 13.0% dst_destroy [kernel] 3925.00 6.5% fib_semantic_match [kernel] 3144.00 5.2% fib_rules_lookup [kernel] 3061.00 5.0% dst_alloc [kernel] 2276.00 3.7% rt_set_nexthop [kernel] 1762.00 2.9% fib_table_lookup [kernel] 1538.00 2.5% _raw_read_lock [kernel] 1358.00 2.2% ip_output [kernel] After patch: real 1m28.808s user 0m13.245s sys 20m37.293s 10950.00 17.2% ip_route_output_slow [kernel] 10726.00 16.9% dst_destroy [kernel] 5170.00 8.1% fib_semantic_match [kernel] 3937.00 6.2% dst_alloc [kernel] 3635.00 5.7% rt_set_nexthop [kernel] 2900.00 4.6% fib_rules_lookup [kernel] 2240.00 3.5% fib_table_lookup [kernel] 1427.00 2.2% _raw_read_lock [kernel] 1157.00 1.8% kmem_cache_alloc [kernel] Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30	ipv4: introduce __ip_dev_find()	Eric Dumazet
	ip_dev_find(net, addr) finds a device given an IPv4 source address and takes a reference on it. Introduce __ip_dev_find(), taking a third argument, to optionally take the device reference. Callers not asking the reference to be taken should be in an rcu_read_lock() protected section. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30	vlan: dont drop packets from unknown vlans in promiscuous mode	Eric Dumazet
	Roger Luethi noticed packets for unknown VLANs getting silently dropped even in promiscuous mode. Check for promiscuous mode in __vlan_hwaccel_rx() and vlan_gro_common() before drops. As suggested by Patrick, mark such packets to have skb->pkt_type set to PACKET_OTHERHOST to make sure they are dropped by IP stack. Reported-by: Roger Luethi <rl@hellgate.ch> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30	ipv4: __mkroute_output() speedup	Eric Dumazet
	While doing stress tests with a disabled IP route cache, I found __mkroute_output() was touching three times in_device atomic refcount. Use RCU to touch it once to reduce cache line ping pongs. Before patch time to perform the test real 1m42.009s user 0m12.545s sys 25m0.726s Profile : 16109.00 26.4% ip_route_output_slow vmlinux 7434.00 12.2% dst_destroy vmlinux 3280.00 5.4% fib_rules_lookup vmlinux 3252.00 5.3% fib_semantic_match vmlinux 2622.00 4.3% fib_table_lookup vmlinux 2535.00 4.1% dst_alloc vmlinux 1750.00 2.9% _raw_read_lock vmlinux 1532.00 2.5% rt_set_nexthop vmlinux After patch real 1m36.503s user 0m12.977s sys 23m25.608s 14234.00 22.4% ip_route_output_slow vmlinux 8717.00 13.7% dst_destroy vmlinux 4052.00 6.4% fib_rules_lookup vmlinux 3951.00 6.2% fib_semantic_match vmlinux 3191.00 5.0% dst_alloc vmlinux 1764.00 2.8% fib_table_lookup vmlinux 1692.00 2.7% _raw_read_lock vmlinux 1605.00 2.5% rt_set_nexthop vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30	Phonet: restore flow control credits when sending fails	Rémi Denis-Courmont
	This patch restores the below flow control patch submitted by Rémi Denis-Courmont, which accidentaly got lost due to Pipe controller patch on Phonet. commit 1a98214feef2221cd7c24b17cd688a5a9d85b2ea Author: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Date: Mon Aug 30 12:57:03 2010 +0000 Phonet: restore flow control credits when sending fails Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-30	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6