asmadeus/linux.git - The linux kernel

Age	Commit message	Author
2010-03-08	tcp: Fix tcp_make_synack()	Eric Dumazet
	Commit 4957faad (TCPCT part 1g: Responder Cookie => Initiator), part of TCP_COOKIE_TRANSACTION implementation, forgot to correctly size synack skb in case user data must be included. Many thanks to Mika Pentillä for spotting this error. Reported-by: Penttillä Mika <mika.penttila@ixonos.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-08	net: fix route cache rebuilds	Eric Dumazet
	We added automatic route cache rebuilding in commit 1080d709fb9d8cd43 but had to correct a few bugs. One of the assumptions of the original patch was that entries were kept sorted in a given way. This assumption is known to be wrong (commit 1ddbcb005c395518 gave an explanation of this and corrected a leak) and expensive to respect.
	Paweł Staszewski reported to me that one of his machines got its routing cache disabled after a few messages like:
	[ 2677.850065] Route hash chain too long!
	[ 2677.850080] Adjust your secret_interval!
	[82839.662993] Route hash chain too long!
	[82839.662996] Adjust your secret_interval!
	[155843.731650] Route hash chain too long!
	[155843.731664] Adjust your secret_interval!
	[155843.811881] Route hash chain too long!
	[155843.811891] Adjust your secret_interval!
	[155843.858209] vlan0811: 5 rebuilds is over limit, route caching disabled
	[155843.858212] Route hash chain too long!
	[155843.858213] Adjust your secret_interval!
	This is because rt_intern_hash() might be fooled when computing a chain length, because multiple entries with the same keys can differ because of TOS (or mark/oif) bits.
	In the rare case where the fast algorithm sees a chain that is too long, and before taking the expensive path, we call a helper function so as not to count duplicates of the same route that differ only in tos/mark/oif bits. This helper works with data already in the CPU cache and is not very expensive, despite its O(N^2) implementation.
	Paweł Staszewski successfully tested this patch on his loaded router.
	Reported-and-tested-by: Paweł Staszewski <pstaszewski@itcare.pl> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
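The duplicate-skipping count described above can be modelled outside the kernel. The following is only an illustrative Python sketch, not the kernel helper: the names `RouteKey`, `same_hash_inputs` and `slow_chain_length` are hypothetical, and the choice of `daddr`, `saddr` and `iif` as the fields that identify a route is an assumption (the commit message only says that duplicates differ in tos/mark/oif bits).

```python
from typing import NamedTuple


class RouteKey(NamedTuple):
    # Assumed identifying fields of a cached route (hypothetical model).
    daddr: str
    saddr: str
    iif: int
    # Fields that may differ between duplicates of the same route.
    tos: int = 0
    mark: int = 0
    oif: int = 0


def same_hash_inputs(a: RouteKey, b: RouteKey) -> bool:
    """True when two entries differ at most in tos/mark/oif."""
    return (a.daddr, a.saddr, a.iif) == (b.daddr, b.saddr, b.iif)


def slow_chain_length(chain: list) -> int:
    """Count entries, skipping duplicates of an earlier entry (O(N^2))."""
    length = 0
    for i, entry in enumerate(chain):
        # Only count an entry if no earlier entry describes the same route.
        if not any(same_hash_inputs(entry, prev) for prev in chain[:i]):
            length += 1
    return length
```

With a chain of four entries where three differ only in tos or mark, the naive length is 4 while `slow_chain_length` reports 2, which is the kind of difference that would keep a chain under the rebuild limit.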
2010-03-08	tcp: Add SNMP counters for backlog and min_ttl drops	Eric Dumazet
	Commit 6b03a53a (tcp: use limited socket backlog) added the possibility of dropping frames when backlog queue is full. Commit d218d111 (tcp: Generalized TTL Security Mechanism) added the possibility of dropping frames when TTL is under a given limit. This patch adds new SNMP MIB entries, named TCPBacklogDrop and TCPMinTTLDrop, published in /proc/net/netstat in the TcpExt: line:
	netstat -s | egrep "TCPBacklogDrop|TCPMinTTLDrop"
	TCPBacklogDrop: 0
	TCPMinTTLDrop: 0
	Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
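As a rough illustration of reading these counters without netstat, the sketch below parses text in the /proc/net/netstat layout, where each table is a pair of lines (counter names, then values) sharing a prefix such as `TcpExt:`. The function name `parse_netstat` and the sample values are made up for the example; on a real system the text would come from reading /proc/net/netstat.

```python
def parse_netstat(text: str) -> dict:
    """Parse /proc/net/netstat-style text into {prefix: {counter: value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    tables = {}
    # Lines come in pairs: a header line of names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        tables[prefix] = dict(zip(names.split(), map(int, numbers.split())))
    return tables


# Illustrative sample only; the numbers are invented.
SAMPLE = """\
TcpExt: SyncookiesSent TCPBacklogDrop TCPMinTTLDrop
TcpExt: 0 3 1
"""

counters = parse_netstat(SAMPLE)["TcpExt"]
print(counters["TCPBacklogDrop"], counters["TCPMinTTLDrop"])
```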
2010-03-05	net: backlog functions rename	Zhu Yi
	sk_add_backlog -> __sk_add_backlog sk_add_backlog_limited -> sk_add_backlog Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
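After this rename the plain name refers to the limit-checking variant and the double-underscore name to the unchecked one. A toy Python model of that split is sketched below; the `Sock` class, the byte-based limit and the exact comparison are assumptions made for illustration and do not reproduce the kernel's check.

```python
class Sock:
    """Toy socket with a byte-limited backlog (hypothetical model)."""

    def __init__(self, limit_bytes: int):
        self.backlog = []
        self.backlog_len = 0
        self.limit = limit_bytes
        self.drops = 0


def unchecked_add_backlog(sk: Sock, pkt: bytes) -> None:
    """Models __sk_add_backlog: queue the packet without any limit check."""
    sk.backlog.append(pkt)
    sk.backlog_len += len(pkt)


def add_backlog(sk: Sock, pkt: bytes) -> bool:
    """Models sk_add_backlog after the rename: refuse when the backlog is full."""
    if sk.backlog_len >= sk.limit:  # assumed form of the limit check
        sk.drops += 1
        return False
    unchecked_add_backlog(sk, pkt)
    return True
```

A caller that gets `False` back would drop the frame, which is the event the TCPBacklogDrop counter is meant to record for TCP.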
2010-03-05	udp: use limited socket backlog	Zhu Yi
	Make udp adapt to the limited socket backlog change. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-05	tcp: use limited socket backlog	Zhu Yi
	Make tcp adapt to the limited socket backlog change. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-04	gre: fix hard header destination address checking	Timo Teräs
	ipgre_header() can be called with zero daddr when the gre device is configured as a multipoint tunnel and still has the NOARP flag set (which is typically cleared by the userspace arp daemon). If the NOARP packets are not dropped, ipgre_tunnel_xmit() will take rt->rt_gateway (= NBMA IP) and use that for route lookup (and may lead to bogus xfrm acquires). The multicast address check is removed, as sending to a multicast group should be ok. In fact, if the gre device has a multicast address as destination, ipgre_header() is always called with a multicast address. Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-03	ipsec: Fix bogus bundle flowi	Herbert Xu
	When I merged the bundle creation code, I introduced a bogus flowi value in the bundle. Instead of being taken from the caller, it was set to the flow in the route object, which is totally different. The end result is that the bundles we created never match, and we instead end up with an ever growing bundle list. Thanks to Jamal for finding this problem. Reported-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-28	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/	David S. Miller
	Conflicts: drivers/firmware/iscsi_ibft.c
2010-02-26	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	David S. Miller
2010-02-26	netfilter: xtables: restore indentation	Jan Engelhardt
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-25	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	David S. Miller
2010-02-25	net: Add checking to rcu_dereference() primitives	Paul E. McKenney
	Update rcu_dereference() primitives to use new lockdep-based checking. The rcu_dereference() in __in6_dev_get() may be protected either by rcu_read_lock() or RTNL, per Eric Dumazet. The rcu_dereference() in __sk_free() is protected by the fact that it is never reached if an update could change it. Check for this by using rcu_dereference_check() to verify that the struct sock's ->sk_wmem_alloc counter is zero. Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-5-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-24	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	David S. Miller
2010-02-24	netfilter: xtables: reduce arguments to translate_table	Jan Engelhardt
	Just pass in the entire repl struct. In case of a new table (e.g. ip6t_register_table), the repldata has been previously filled with table->name and table->size already (in ip6t_alloc_initial_table). Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24	netfilter: xtables: optimize call flow around xt_ematch_foreach	Jan Engelhardt
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24	netfilter: xtables: replace XT_MATCH_ITERATE macro	Jan Engelhardt
	The macro is replaced by a list.h-like foreach loop. This makes the code more inspectable. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24	netfilter: xtables: optimize call flow around xt_entry_foreach	Jan Engelhardt
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-24	netfilter: xtables: replace XT_ENTRY_ITERATE macro	Jan Engelhardt
	The macro is replaced by a list.h-like foreach loop. This makes the code much more inspectable. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-22	xfrm: SA lookups signature with mark	Jamal Hadi Salim
	Pass the mark to all SA lookups to prepare them for when we add code to have them search. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-19	net: Fix sysctl restarts...	Eric W. Biederman
	Yuck. It turns out that when we restart sysctls we were restarting with the values already changed. Which unfortunately meant that the second time through we thought there was no change and skipped all kinds of work, despite the fact that there was indeed a change. I have fixed this the simplest way possible by restoring the changed values when we restart the sysctl write. One of my coworkers spotted this bug when, after disabling forwarding on an interface, pings were still forwarded. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-18	net: TCP thin dupack	Andreas Petlund
	This patch enables fast retransmissions after one dupACK for TCP if the stream is identified as thin. This will reduce latencies for thin streams that are not able to trigger fast retransmissions due to high packet interarrival time. This mechanism is only active if enabled by iocontrol or syscontrol and the stream is identified as thin. Signed-off-by: Andreas Petlund <apetlund@simula.no> Signed-off-by: David S. Miller <davem@davemloft.net>
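The decision described above can be sketched as a small function. This is a simplified Python model, not the kernel code: the function name is hypothetical, the "fewer than 4 packets in flight" test for a thin stream is an assumption (the commit message does not define thinness), and 3 is used as the conventional duplicate-ACK threshold for fast retransmit.

```python
def dupack_threshold(packets_in_flight: int,
                     thin_dupack_enabled: bool,
                     thin_limit: int = 4,        # assumed definition of "thin"
                     default_threshold: int = 3) -> int:
    """Return how many dupACKs should trigger a fast retransmission."""
    stream_is_thin = packets_in_flight < thin_limit
    # The mechanism needs both the opt-in and a stream identified as thin.
    if thin_dupack_enabled and stream_is_thin:
        return 1
    return default_threshold
```

A stream with only two packets in flight can never collect three dupACKs for a single loss, which is why lowering the threshold to one matters for it.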
2010-02-18	net: TCP thin linear timeouts	Andreas Petlund
	This patch will make TCP use only linear timeouts if the stream is thin. This will help to avoid the very high latencies that thin streams suffer because of exponential backoff. This mechanism is only active if enabled by iocontrol or syscontrol and the stream is identified as thin. A maximum of 6 linear timeouts is tried before exponential backoff is resumed. Signed-off-by: Andreas Petlund <apetlund@simula.no> Signed-off-by: David S. Miller <davem@davemloft.net>
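To make the effect on latency concrete, here is a hedged Python sketch of the resulting timeout sequence. It assumes that the first six timeouts all use the base RTO and that doubling starts from the base afterwards; the function name, the cap of 120 seconds and that exact reading of "6 linear timeouts" are assumptions for illustration.

```python
def rto_schedule(base_rto: float, count: int, thin_linear: bool = False,
                 linear_retries: int = 6, rto_max: float = 120.0) -> list:
    """Return the timer values armed for `count` successive timeouts."""
    schedule = []
    rto = base_rto
    for attempt in range(count):
        schedule.append(rto)
        if thin_linear and attempt + 1 < linear_retries:
            rto = base_rto                # stay linear: no backoff yet
        else:
            rto = min(rto * 2, rto_max)   # exponential backoff, capped
    return schedule


print(rto_schedule(1.0, 8, thin_linear=True))
print(rto_schedule(1.0, 8, thin_linear=False))
```

Under these assumptions, the sixth retransmission of a thin stream is sent after 6 seconds of accumulated timeouts instead of 63 seconds with pure exponential backoff from a 1 second RTO.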
2010-02-18	ipv6: drop unused "dev" arg of icmpv6_send()	Alexey Dobriyan
	Dunno, what was the idea, it wasn't used for a long time. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-18	netfilter: nf_defrag_ipv4: fix compilation error with NF_CONNTRACK=n	Patrick McHardy
	As reported by Randy Dunlap <randy.dunlap@oracle.com>, compilation of nf_defrag_ipv4 fails with:
	include/net/netfilter/nf_conntrack.h:94: error: field 'ct_general' has incomplete type
	include/net/netfilter/nf_conntrack.h:178: error: 'const struct sk_buff' has no member named 'nfct'
	include/net/netfilter/nf_conntrack.h:185: error: implicit declaration of function 'nf_conntrack_put'
	include/net/netfilter/nf_conntrack.h:294: error: 'const struct sk_buff' has no member named 'nfct'
	net/ipv4/netfilter/nf_defrag_ipv4.c:45: error: 'struct sk_buff' has no member named 'nfct'
	net/ipv4/netfilter/nf_defrag_ipv4.c:46: error: 'struct sk_buff' has no member named 'nfct'
	net/nf_conntrack.h must not be included with NF_CONNTRACK=n, add a few #ifdefs. Long term the header file should be fixed to be usable even with NF_CONNTRACK=n. Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-17	ipmr: remove useless checks from ipmr_device_event	Pavel Emelyanov
	The net being checked there is dev_net(dev) and thus this if is always false. Fits both net and net-next trees. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16	percpu: add __percpu sparse annotations to net	Tejun Heo
	Add __percpu sparse annotations to net. These annotations are to make sparse consider percpu variables to be in a different address space and warn if accessed without going through percpu accessors. This patch doesn't affect normal builds. The macro and type tricks around snmp stats make things a bit interesting. DEFINE/DECLARE_SNMP_STAT() macros mark the target field as __percpu and SNMP_UPD_PO_STATS() macro is updated accordingly. All snmp_mib_*() users which used to cast the argument to (void **) are updated to cast it to (void __percpu **). Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	David S. Miller
2010-02-16	net neigh: Decouple per interface neighbour table controls from binary sysctls	Eric W. Biederman
	Stop computing the number of neighbour table settings we have by counting the number of binary sysctls. This behaviour was silly and meant that we could not add another neighbour table setting without also adding another binary sysctl. Don't pass the binary sysctl path for neighbour table entries into neigh_sysctl_register. These parameters are no longer used and so are just dead code. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16	net ipv4: Decouple ipv4 interface parameters from binary sysctl numbers	Eric W. Biederman
	Stop using the binary sysctl enumeration in sysctl.h as an index into a per interface array. This leads to unnecessary binary sysctl number allocation, and a fragility in data structure and implementation because of unnecessary coupling. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16	tunnels: fix netns vs proto registration ordering	Alexey Dobriyan
	Same stuff as in ip_gre patch: receive hook can be called before netns setup is done, oopsing in net_generic(). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16	gre: fix netns vs proto registration ordering	Alexey Dobriyan
	The GRE protocol receive hook can be called right after protocol addition is done. If netns stuff is not yet initialized, we're going to oops in net_generic(). This is remotely oopsable if ip_gre is compiled as a module and a packet arrives at an unfortunate moment during module loading. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16	ipcomp: Avoid duplicate calls to ipcomp_destroy	Herbert Xu
	When ipcomp_tunnel_attach fails we will call ipcomp_destroy twice. This may lead to double-frees on certain structures. As there is no reason to explicitly call ipcomp_destroy, this patch removes it from ipcomp*.c and lets the standard xfrm_state destruction take place. This is based on the discovery and patch by Alexey Dobriyan. Tested-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-16	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	David S. Miller
2010-02-15	netfilter: nf_conntrack: add support for "conntrack zones"	Patrick McHardy
	Normally, each connection needs a unique identity. Conntrack zones allow a numerical zone to be specified using the CT target; connections in different zones can use the same identity. Example:
	iptables -t raw -A PREROUTING -i veth0 -j CT --zone 1
	iptables -t raw -A OUTPUT -o veth1 -j CT --zone 1
	Signed-off-by: Patrick McHardy <kaber@trash.net>
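One way to picture zones is as an extra component of the lookup key. The Python sketch below is a conceptual model only: `ConntrackTable`, the 5-tuple layout and the use of zone 0 as the default are assumptions for illustration, not the nf_conntrack data structures.

```python
class ConntrackTable:
    """Toy connection table keyed by (zone, tuple) (hypothetical model)."""

    def __init__(self):
        self._entries = {}

    def add(self, flow: tuple, zone: int = 0) -> None:
        key = (zone, flow)
        if key in self._entries:
            # Within one zone the identity must still be unique.
            raise ValueError("identity already in use in this zone")
        self._entries[key] = {"zone": zone, "flow": flow}

    def lookup(self, flow: tuple, zone: int = 0):
        return self._entries.get((zone, flow))


table = ConntrackTable()
flow = ("10.0.0.1", 5060, "10.0.0.2", 5060, "udp")
table.add(flow, zone=0)
table.add(flow, zone=1)  # same identity, different zone: accepted
```

Because the zone is part of the key, a lookup has to know the zone before it can find the connection, which is why the zone is attached in the raw table before connection tracking runs.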
2010-02-15	netfilter: nf_conntrack: pass template to l4proto ->error() handler	Patrick McHardy
	The error handlers might need the template to get the conntrack zone introduced in the next patches to perform a conntrack lookup. Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-15	netfilter: xtables: add const qualifiers	Jan Engelhardt
	This should make it easier to remove redundant arguments later. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-02-15	netfilter: xtables: constify args in compat copying functions	Jan Engelhardt
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-02-15	netfilter: iptables: remove unused function arguments	Jan Engelhardt
	Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-02-14	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	David S. Miller
	Conflicts: net/mac80211/rate.c
2010-02-12	udp: remove redundant variable	Gerrit Renker
	The variable 'copied' is used in udp_recvmsg() to emphasize that the passed 'len' is adjusted to fit the actual datagram length. But the same can be done by adjusting 'len' directly. This patch thus removes the indirection. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-12	inet: Remove bogus IGMPv3 report handling	Herbert Xu
	Currently we treat IGMPv3 reports as if they were IGMPv2/v1 reports. This is broken as IGMPv3 reports are formatted differently. So we end up suppressing a bogus multicast group (which should be harmless as long as the leading reserved field is zero). In fact, IGMPv3 does not allow membership report suppression so we should simply ignore IGMPv3 membership reports as a host. This patch does exactly that. I kept the case statement for it so people won't accidentally add it back thinking that we overlooked this case. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-11	netfilter: xtables: fix mangle tables	Alexey Dobriyan
	In POST_ROUTING hook, calling dev_net(in) is going to oops. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-11	netfilter: nf_nat_sip: add TCP support	Patrick McHardy
	Add support for mangling TCP SIP packets. Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-11	netfilter: nf_nat: support mangling a single TCP packet multiple times	Patrick McHardy
	nf_nat_mangle_tcp_packet() can currently only handle a single mangling per window because it only maintains two sequence adjustment positions: the one before the last adjustment and the one after. This patch makes sequence number adjustment tracking in nf_nat_mangle_tcp_packet() optional and allows a helper to manually update the offsets after the packet has been fully handled. Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-11	netfilter: nf_conntrack_sip: add TCP support	Patrick McHardy
	Add TCP support, which is mandated by RFC3261 for all SIP elements. SIP over TCP is similar to UDP, except that messages are delimited by Content-Length: headers and multiple messages may appear in one packet. Signed-off-by: Patrick McHardy <kaber@trash.net>
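The framing rule described above, where each message's body length is given by its Content-Length header and several messages can share one packet, can be sketched as follows. This is an illustrative Python model, not the conntrack helper: the function name is hypothetical, a missing header is treated as length 0, and header parsing is deliberately simplified.

```python
def split_sip_messages(payload: bytes) -> list:
    """Split a TCP payload into complete SIP messages using Content-Length."""
    messages = []
    pos = 0
    while pos < len(payload):
        head_end = payload.find(b"\r\n\r\n", pos)
        if head_end < 0:
            break  # headers are incomplete in this payload
        body_start = head_end + 4
        length = 0  # assumption: no Content-Length header means no body
        # Skip the start line, then scan the header lines.
        for line in payload[pos:head_end].split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            # "l" is the compact form of Content-Length in SIP.
            if name.strip().lower() in (b"content-length", b"l"):
                length = int(value.strip())
        end = body_start + length
        if end > len(payload):
            break  # body continues beyond this payload
        messages.append(payload[pos:end])
        pos = end
    return messages
```

The sketch stops at the first incomplete message and returns only the complete ones, leaving the question of how to buffer the remainder to the caller.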
2010-02-11	netfilter: nf_conntrack_sip: pass data offset to NAT functions	Patrick McHardy
	When using TCP multiple SIP messages might be present in a single packet. A following patch will parse them by setting the dptr to the beginning of each message. The NAT helper needs to reload the dptr value after mangling the packet however, so it needs to know the offset of the message to the beginning of the packet. Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-02-10	tcp: fix ICMP-RTO war	Damian Lukowski
	Make sure that TCP has a nonzero RTT estimation after the three-way handshake. Currently, a listening TCP has a value of 0 for srtt, rttvar and rto right after the three-way handshake is completed with TCP timestamps disabled. This will lead to corrupt RTO recalculation and a retransmission flood when RTO is recalculated on backoff reversion as introduced in "Revert RTO on ICMP destination unreachable" (f1ecd5d9e7366609d640ff4040304ea197fbc618). This behaviour can be provoked by connecting to a server which "responds first" (like SMTP) and rejecting every packet after the handshake with dest-unreachable, which will lead to softirq load on the server (up to 30% per socket in some tests). Thanks to Ilpo Jarvinen for providing debug patches and to Denys Fedoryshchenko for reporting and testing. Changes since v3: Removed bad characters in patchfile. Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-10	netfilter: xtables: generate initial table on-demand	Jan Engelhardt
	The static initial tables are pretty large, and after the net namespace has been instantiated, they just hang around for nothing. This commit removes them and creates tables on-demand at runtime when needed. Size shrinks by 7735 bytes (x86_64). Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
2010-02-10	netfilter: xtables: use xt_table for hook instantiation	Jan Engelhardt
	The respective xt_table structures already have most of the metadata needed for hook setup. Add a 'priority' field to struct xt_table so that xt_hook_link() can be called with a reduced number of arguments. Should we have more tables in the future, they come at no static cost (only runtime, as before) - space saved: 6807373->6806555. Signed-off-by: Jan Engelhardt <jengelh@medozas.de>