summaryrefslogtreecommitdiffstats
path: root/net/ipv6
AgeCommit message (Collapse)Author
2012-08-20gre: information leak in ip6_tnl_ioctl()Dan Carpenter
There is a one byte hole between p->hop_limit and p->flowinfo where stack memory is leaked to the user. This was introduced in c12b395a46 "gre: Support GRE over IPv6". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
2012-08-14gre: Support GRE over IPv6xeb@mail.ru
GRE over IPv6 implementation. Signed-off-by: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-09net: Loopback ifindex is constant nowPavel Emelyanov
As pointed out, there are places, that access net->loopback_dev->ifindex and after ifindex generation is made per-net this value becomes constant equals 1. So go ahead and introduce the LOOPBACK_IFINDEX constant and use it where appropriate. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-09time: jiffies_delta_to_clock_t() helper to the rescueEric Dumazet
Various /proc/net files sometimes report crazy timer values, expressed in clock_t units. This happens when an expired timer delta (expires - jiffies) is passed to jiffies_to_clock_t(). This function has an overflow in : return div_u64((u64)x * TICK_NSEC, NSEC_PER_SEC / USER_HZ); commit cbbc719fccdb8cb (time: Change jiffies_to_clock_t() argument type to unsigned long) only got around the problem. As we cant output negative values in /proc/net/tcp without breaking various tools, I suggest adding a jiffies_delta_to_clock_t() wrapper that caps the negative delta to a 0 value. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: hank <pyu@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-31Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge Andrew's second set of patches: - MM - a few random fixes - a couple of RTC leftovers * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (120 commits) rtc/rtc-88pm80x: remove unneed devm_kfree rtc/rtc-88pm80x: assign ret only when rtc_register_driver fails mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables tmpfs: distribute interleave better across nodes mm: remove redundant initialization mm: warn if pg_data_t isn't initialized with zero mips: zero out pg_data_t when it's allocated memcg: gix memory accounting scalability in shrink_page_list mm/sparse: remove index_init_lock mm/sparse: more checks on mem_section number mm/sparse: optimize sparse_index_alloc memcg: add mem_cgroup_from_css() helper memcg: further prevent OOM with too many dirty pages memcg: prevent OOM with too many dirty pages mm: mmu_notifier: fix freed page still mapped in secondary MMU mm: memcg: only check anon swapin page charges for swap cache mm: memcg: only check swap cache pages for repeated charging mm: memcg: split swapin charge function into private and public part mm: memcg: remove needless !mm fixup to init_mm when charging mm: memcg: remove unneeded shmem charge type ...
2012-07-31net: introduce sk_gfp_atomic() to allow addition of GFP flags depending on ↵Mel Gorman
the individual socket Introduce sk_gfp_atomic(), this function allows to inject sock specific flags to each sock related allocation. It is only used on allocation paths that may be required for writing pages back to network storage. [davem@davemloft.net: Use sk_gfp_atomic only when necessary] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David S. Miller <davem@davemloft.net> Cc: Neil Brown <neilb@suse.de> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Eric B Munson <emunson@mgebm.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Mel Gorman <mgorman@suse.de> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-31memcg: rename config variablesAndrew Morton
Sanity: CONFIG_CGROUP_MEM_RES_CTLR -> CONFIG_MEMCG CONFIG_CGROUP_MEM_RES_CTLR_SWAP -> CONFIG_MEMCG_SWAP CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -> CONFIG_MEMCG_SWAP_ENABLED CONFIG_CGROUP_MEM_RES_CTLR_KMEM -> CONFIG_MEMCG_KMEM [mhocko@suse.cz: fix missed bits] Cc: Glauber Costa <glommer@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-30net: TCP early demux cleanupEric Dumazet
early_demux() handlers should be called in RCU context, and as we use skb_dst_set_noref(skb, dst), caller must not exit from RCU context before dst use (skb_dst(skb)) or release (skb_drop(dst)) Therefore, rcu_read_lock()/rcu_read_unlock() pairs around ->early_demux() are confusing and not needed : Protocol handlers are already in an RCU read lock section. (__netif_receive_skb() does the rcu_read_lock() ) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-29ipv6: fix incorrect route 'expires' value passed to userspaceLi Wei
When userspace use RTM_GETROUTE to dump route table, with an already expired route entry, we always got an 'expires' value(2147157) calculated base on INT_MAX. The reason of this problem is in the following satement: rt->dst.expires - jiffies < INT_MAX gcc promoted the type of both sides of '<' to unsigned long, thus a small negative value would be considered greater than INT_MAX. With the help of Eric Dumazet, do the out of bound checks in rtnl_put_cacheinfo(), _after_ conversion to clock_t. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-26ipv6: Early TCP socket demuxEric Dumazet
This is the IPv6 missing bits for infrastructure added in commit 41063e9dd1195 (ipv4: Early TCP socket demux.) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-23tcp: dont drop MTU reduction indicationsEric Dumazet
ICMP messages generated in output path if frame length is bigger than mtu are actually lost because socket is owned by user (doing the xmit) One example is the ipgre_tunnel_xmit() calling icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)); We had a similar case fixed in commit a34a101e1e6 (ipv6: disable GSO on sockets hitting dst_allfrag). Problem of such fix is that it relied on retransmit timers, so short tcp sessions paid a too big latency increase price. This patch uses the tcp_release_cb() infrastructure so that MTU reduction messages (ICMP messages) are not lost, and no extra delay is added in TCP transmits. Reported-by: Maciej Żenczykowski <maze@google.com> Diagnosed-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Tore Anderson <tore@fud.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-20net: Document dst->obsolete better.David S. Miller
Add a big comment explaining how the field works, and use defines instead of magic constants for the values assigned to it. Suggested by Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-19net-tcp: Fast Open baseYuchung Cheng
This patch impelements the common code for both the client and server. 1. TCP Fast Open option processing. Since Fast Open does not have an option number assigned by IANA yet, it shares the experiment option code 254 by implementing draft-ietf-tcpm-experimental-options with a 16 bits magic number 0xF989. This enables global experiments without clashing the scarce(2) experimental options available for TCP. When the draft status becomes standard (maybe), the client should switch to the new option number assigned while the server supports both numbers for transistion. 2. The new sysctl tcp_fastopen 3. A place holder init function Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-18ipv6: add ipv6_addr_hash() helperEric Dumazet
Introduce ipv6_addr_hash() helper doing a XOR on all bits of an IPv6 address, with an optimized x86_64 version. Use it in flow dissector, as suggested by Andrew McGregor, to reduce hash collision probabilities in fq_codel (and other users of flow dissector) Use it in ip6_tunnel.c and use more bit shuffling, as suggested by David Laight, as existing hash was ignoring most of them. Use it in sunrpc and use more bit shuffling, using hash_32(). Use it in net/ipv6/addrconf.c, using hash_32() as well. As a cleanup, use it in net/ipv4/tcp_metrics.c Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrew McGregor <andrewmcgr@gmail.com> Cc: Dave Taht <dave.taht@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-18ipv6: fix inet6_csk_xmit()Eric Dumazet
We should provide to inet6_csk_route_socket a struct flowi6 pointer, so that net6_csk_xmit() works correctly instead of sending garbage. Also add some consts Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-17Merge branch 'nexthop_exceptions'David S. Miller
These patches implement the final mechanism necessary to really allow us to go without the route cache in ipv4. We need a place to have long-term storage of PMTU/redirect information which is independent of the routes themselves, yet does not get us back into a situation where we have to write to metrics or anything like that. For this we use an "next-hop exception" table in the FIB nexthops. The one thing I desperately want to avoid is having to create clone routes in the FIB trie for this purpose, because that is very expensive. However, I'm willing to entertain such an idea later if this current scheme proves to have downsides that the FIB trie variant would not have. In order to accomodate this any such scheme, we need to be able to produce a full flow key at PMTU/redirect time. That required an adjustment of the interface call-sites used to propagate these events. For a PMTU/redirect with a fully specified socket, we pass that socket and use it to produce the flow key. Otherwise we use a passed in SKB to formulate the key. There are two cases that need to be distinguished, ICMP message processing (in which case the IP header is at skb->data) and output packet processing (mostly tunnels, and in all such cases the IP header is at ip_hdr(skb)). We also have to make the code able to handle the case where the dst itself passed into the dst_ops->{update_pmtu,redirect} method is invalidated. This matters for calls from sockets that have cached that route. We provide a inet{,6} helper function for this purpose, and edit SCTP specially since it caches routes at the transport rather than socket level. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-17net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}()David S. Miller
This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-17ipv6: fix unappropriate errno returned for non-multicast addressLi Wei
We need to check the passed in multicast address and return appropriate errno(EINVAL) if it is not valid. And it's no need to walk through the ipv6_mc_list in this situation. Signed-off-by: Li Wei <lw@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16ipv6: fix RTPROT_RA markup of RA routes w/nexthopsDenis Ovsienko
Userspace implementations of network routing protocols sometimes need to tell RA-originated IPv6 routes from other kernel routes to make proper routing decisions. This makes most sense for RA routes with nexthops, namely, default routes and Route Information routes. The intended mean of preserving RA route origin in a netlink message is through indicating RTPROT_RA as protocol code. Function rt6_fill_node() tried to do that for default routes, but its test condition was taken wrong. This change is modeled after the original mailing list posting by Jeff Haran. It fixes the test condition for default route case and sets the same behaviour for Route Information case (both types use nexthops). Handling of the 3rd RA route type, Prefix Information, is left unchanged, as it stands for interface connected routes (without nexthops). Signed-off-by: Denis Ovsienko <infrastation@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-16ipv6: Add helper inet6_csk_update_pmtu().David S. Miller
This is the ipv6 version of inet_csk_update_pmtu(). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-14ipv6: Initialize the struct rt6_info behind the dst_enty fieldSteffen Klassert
We start initializing the struct rt6_info at the first field behind the struct dst_enty. This is error prone because it might leave a new field uninitialized. So start initializing the struct rt6_info right behind the dst_entry. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12net: Remove checks for dst_ops->redirect being NULL.David S. Miller
No longer necessary. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12net: Add dummy dst_ops->redirect method where needed.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().David S. Miller
And delete rt6_redirect(), since it is no longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Add redirect support to all protocol icmp error handlers.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Pull main logic of rt6_redirect() into rt6_do_redirect().David S. Miller
Hook it into dst_ops->redirect as well. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv6: Move bulk of redirect handling into rt6_redirect().David S. Miller
This sets things up so that we can have the protocol error handlers call down into the ipv6 route code for redirects just as ipv4 already does. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv6: Export ndisc option parsing from ndisc.cDavid S. Miller
This is going to be used internally by the rt6 redirect code. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11tcp: TCP Small QueuesEric Dumazet
This introduce TSQ (TCP Small Queues) TSQ goal is to reduce number of TCP packets in xmit queues (qdisc & device queues), to reduce RTT and cwnd bias, part of the bufferbloat problem. sk->sk_wmem_alloc not allowed to grow above a given limit, allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a given time. TSO packets are sized/capped to half the limit, so that we have two TSO packets in flight, allowing better bandwidth use. As a side effect, setting the limit to 40000 automatically reduces the standard gso max limit (65536) to 40000/2 : It can help to reduce latencies of high prio packets, having smaller TSO packets. This means we divert sock_wfree() to a tcp_wfree() handler, to queue/send following frames when skb_orphan() [2] is called for the already queued skbs. Results on my dev machines (tg3/ixgbe nics) are really impressive, using standard pfifo_fast, and with or without TSO/GSO. Without reduction of nominal bandwidth, we have reduction of buffering per bulk sender : < 1ms on Gbit (instead of 50ms with TSO) < 8ms on 100Mbit (instead of 132 ms) I no longer have 4 MBytes backlogged in qdisc by a single netperf session, and both side socket autotuning no longer use 4 Mbytes. As skb destructor cannot restart xmit itself ( as qdisc lock might be taken at this point ), we delegate the work to a tasklet. We use one tasklest per cpu for performance reasons. If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag. This flag is tested in a new protocol method called from release_sock(), to eventually send new segments. [1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable [2] skb_orphan() is usually called at TX completion time, but some drivers call it in their start_xmit() handler. These drivers should at least use BQL, or else a single TCP session can still fill the whole NIC TX ring, since TSQ will have no effect. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Fix (nearly-)kernel-doc comments for various functionsBen Hutchings
Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10rtnetlink: Remove ts/tsage args to rtnl_put_cacheinfo().David S. Miller
Nobody provides non-zero values any longer. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10inet: Kill FLOWI_FLAG_PRECOW_METRICS.David S. Miller
No longer needed. TCP writes metrics, but now in it's own special cache that does not dirty the route metrics. Therefore there is no longer any reason to pre-cow metrics in this way. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10inet: Minimize use of cached route inetpeer.David S. Miller
Only use it in the absolutely required cases: 1) COW'ing metrics 2) ipv4 PMTU 3) ipv4 redirects Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10inet: Remove ->get_peer() method.David S. Miller
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Move timestamps from inetpeer to metrics cache.David S. Miller
With help from Lin Ming. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Abstract back handling peer aliveness test into helper function.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-07Merge branch 'master' of git://1984.lsi.us.es/nf-nextDavid S. Miller
2012-07-06ipv6: fix a bad cast in ip6_dst_lookup_tail()Eric Dumazet
Fix a bug in ip6_dst_lookup_tail(), where typeof(dst) is "struct dst_entry **", not "struct dst_entry *" Reported-by: Fengguang Wu <wfg@linux.intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05ipv6: remove redundant declarationsEric Dumazet
remove redundant declarations, they belong in include/net/tcp.h Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05ipv6: Initialize the neighbour pointer of rt6_info on allocationSteffen Klassert
git commit 97cac082 (ipv6: Store route neighbour in rt6_info struct) added a neighbour pointer to rt6_info. Currently we don't initialize this pointer at allocation time. We assume this pointer to be valid if it is not a null pointer, so initialize it on allocation. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05ipv6: remove unnecessary codes in tcp_ipv6.cRongQing.Li
opt always equals np->opts, so it is meaningless to define opt, and check if opt does not equal np->opts and then try to free opt. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05ipv6: Store route neighbour in rt6_info struct.David S. Miller
This makes for a simplified conversion away from dst_get_neighbour*(). All code outside of ipv6 will use neigh lookups via dst_neigh_lookup*(). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05net: Pass neighbours and dest address into NETEVENT_REDIRECT events.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05net: Add optional SKB arg to dst_ops->neigh_lookup().David S. Miller
Causes the handler to use the daddr in the ipv4/ipv6 header when the route gateway is unspecified (local subnet). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-05net: Do delayed neigh confirmation.David S. Miller
When a dst_confirm() happens, mark the confirmation as pending in the dst. Then on the next packet out, when we have the neigh in-hand, do the update. This removes the dependency in dst_confirm() of dst's having an attached neigh. While we're here, remove the explicit 'dst' NULL check, all except 2 or 3 call sites ensure it's not NULL. So just fix those cases up. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-04netfilter: nf_conntrack: generalize nf_ct_l4proto_netPablo Neira Ayuso
This patch generalizes nf_ct_l4proto_net by splitting it into chunks and moving the corresponding protocol part to where it really belongs to. To clarify, note that we follow two different approaches to support per-net depending if it's built-in or run-time loadable protocol tracker. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-06-29ipv6_tunnel: Allow receiving packets on the fallback tunnel if they pass ↵Ville Nuorvala
sanity checks At Facebook, we do Layer-3 DSR via IP-in-IP tunneling. Our load balancers wrap an extra IP header on incoming packets so they can be routed to the backend. In the v4 tunnel driver, when these packets fall on the default tunl0 device, the behavior is to decapsulate them and drop them back on the stack. So our setup is that tunl0 has the VIP and eth0 has (obviously) the backend's real address. In IPv6 we do the same thing, but the v6 tunnel driver didn't have this same behavior - if you didn't have an explicit tunnel setup, it would drop the packet. This patch brings that v4 feature to the v6 driver. The same IPv6 address checks are performed as with any normal tunnel, but as the fallback tunnel endpoint addresses are unspecified, the checks must be performed on a per-packet basis, rather than at tunnel configuration time. [Patch description modified by phil@ipom.com] Signed-off-by: Ville Nuorvala <ville.nuorvala@gmail.com> Tested-by: Phil Dibowitz <phil@ipom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28tcp: plug dst leak in tcp_v6_conn_request()Neal Cardwell
The code in tcp_v6_conn_request() was implicitly assuming that tcp_v6_send_synack() would take care of dst_release(), much as tcp_v4_send_synack() already does. This resulted in tcp_v6_conn_request() leaking a dst if sysctl_tw_recycle is enabled. This commit restructures tcp_v6_send_synack() so that it accepts a dst pointer and takes care of releasing the dst that is passed in, to plug the leak and avoid future surprises by bringing the IPv6 behavior in line with the IPv4 side. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28tcp: use inet6_csk_route_req() in tcp_v6_send_synack()Neal Cardwell
With the recent change (earlier in this patch series) to set flowi6_oif to treq->iif in inet6_csk_route_req(), the dst lookup in these two functions is now identical, so tcp_v6_send_synack() can now just call inet6_csk_route_req(), to reduce code duplication and keep things closer to the IPv4 side, which is structured this way. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>