summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2012-07-12ipv6: Add redirect support to all protocol icmp error handlers.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-12ipv6: Pull main logic of rt6_redirect() into rt6_do_redirect().David S. Miller
Hook it into dst_ops->redirect as well. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv6: Move bulk of redirect handling into rt6_redirect().David S. Miller
This sets things up so that we can have the protocol error handlers call down into the ipv6 route code for redirects just as ipv4 already does. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv6: Export ndisc option parsing from ndisc.cDavid S. Miller
This is going to be used internally by the rt6 redirect code. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Kill ip_rt_redirect().David S. Miller
No longer needed, as the protocol handlers now all properly propagate the redirect back into the routing code. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Add redirect support to all protocol icmp error handlers.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect.David S. Miller
All of the redirect acceptance policy is now contained within. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Rearrange arguments to ip_rt_redirect()David S. Miller
Pass in the SKB rather than just the IP addresses, so that policy and other aspects can reside in ip_rt_redirect() rather then icmp_redirect(). Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Pull redirect instantiation out into a helper function.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Deliver ICMP redirects to sockets too.David S. Miller
And thus, we can remove the ping_err() hack. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11ipv4: Pull icmp socket delivery out into a helper function.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11tcp: TCP Small QueuesEric Dumazet
This introduce TSQ (TCP Small Queues) TSQ goal is to reduce number of TCP packets in xmit queues (qdisc & device queues), to reduce RTT and cwnd bias, part of the bufferbloat problem. sk->sk_wmem_alloc not allowed to grow above a given limit, allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a given time. TSO packets are sized/capped to half the limit, so that we have two TSO packets in flight, allowing better bandwidth use. As a side effect, setting the limit to 40000 automatically reduces the standard gso max limit (65536) to 40000/2 : It can help to reduce latencies of high prio packets, having smaller TSO packets. This means we divert sock_wfree() to a tcp_wfree() handler, to queue/send following frames when skb_orphan() [2] is called for the already queued skbs. Results on my dev machines (tg3/ixgbe nics) are really impressive, using standard pfifo_fast, and with or without TSO/GSO. Without reduction of nominal bandwidth, we have reduction of buffering per bulk sender : < 1ms on Gbit (instead of 50ms with TSO) < 8ms on 100Mbit (instead of 132 ms) I no longer have 4 MBytes backlogged in qdisc by a single netperf session, and both side socket autotuning no longer use 4 Mbytes. As skb destructor cannot restart xmit itself ( as qdisc lock might be taken at this point ), we delegate the work to a tasklet. We use one tasklest per cpu for performance reasons. If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag. This flag is tested in a new protocol method called from release_sock(), to eventually send new segments. [1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable [2] skb_orphan() is usually called at TX completion time, but some drivers call it in their start_xmit() handler. These drivers should at least use BQL, or else a single TCP session can still fill the whole NIC TX ring, since TSQ will have no effect. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11tcp: Fix out of bounds access to tcpm_valsAlexander Duyck
The recent patch "tcp: Maintain dynamic metrics in local cache." introduced an out of bounds access due to what appears to be a typo. I believe this change should resolve the issue by replacing the access to RTAX_CWND with TCP_METRIC_CWND. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11bridge: fix endianLi RongQing
mld->mld_maxdelay is net endian, so we should use ntohs, not htons CC: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/batman-adv/bridge_loop_avoidance.c net/batman-adv/bridge_loop_avoidance.h net/batman-adv/soft-interface.c net/mac80211/mlme.c With merge help from Antonio Quartulli (batman-adv) and Stephen Rothwell (drivers/net/usb/qmi_wwan.c). The net/mac80211/mlme.c conflict seemed easy enough, accounting for a conversion to some new tracing macros. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Fix memory leak - vlan_info structAmir Hanania
In driver reload test there is a memory leak. The structure vlan_info was not freed when the driver was removed. It was not released since the nr_vids var is one after last vlan was removed. The nr_vids is one, since vlan zero is added to the interface when the interface is being set, but the vlan zero is not deleted at unregister. Fix - delete vlan zero when we unregister the device. Signed-off-by: Amir Hanania <amir.hanania@intel.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Included changes: - fix a bug generated by the wrong interaction between the GW feature and the Bridge Loop Avoidance
2012-07-10net: Fix non-kernel-doc comments with kernel-doc start markerBen Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Fix (nearly-)kernel-doc comments for various functionsBen Hutchings
Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Properly define functions with no parametersBen Hutchings
Defining a function with no parameters as 'T foo()' is the deprecated K&R style, and is not strictly equivalent to defining it as 'T foo(void)'. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Remove inetpeer from routes.David S. Miller
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Calling ->cow_metrics() now is a bug.David S. Miller
Nothing every writes to ipv4 metrics any longer. PMTU is stored in rt->rt_pmtu. Dynamic TCP metrics are stored in a special TCP metrics cache, completely outside of the routes. Therefore ->cow_metrics() can simply nothing more than a WARN_ON trigger so we can catch anyone who tries to add new writes to ipv4 route metrics. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Kill dst_copy_metrics() call from ipv4_blackhole_route().David S. Miller
Blackhole routes have a COW metrics operation that returns NULL always, therefore this dst_copy_metrics() call did absolutely nothing. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Enforce max MTU metric at route insertion time.David S. Miller
Rather than at every struct rtable creation. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10ipv4: Maintain redirect and PMTU info in struct rtable again.David S. Miller
Maintaining this in the inetpeer entries was not the right way to do this at all. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10rtnetlink: Remove ts/tsage args to rtnl_put_cacheinfo().David S. Miller
Nobody provides non-zero values any longer. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10inet: Kill FLOWI_FLAG_PRECOW_METRICS.David S. Miller
No longer needed. TCP writes metrics, but now in it's own special cache that does not dirty the route metrics. Therefore there is no longer any reason to pre-cow metrics in this way. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10inet: Minimize use of cached route inetpeer.David S. Miller
Only use it in the absolutely required cases: 1) COW'ing metrics 2) ipv4 PMTU 3) ipv4 redirects Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10inet: Remove ->get_peer() method.David S. Miller
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Remove tw->tw_peerDavid S. Miller
No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Move timestamps from inetpeer to metrics cache.David S. Miller
With help from Lin Ming. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Don't report route RTT metric value in cache dumps.David S. Miller
We don't maintain it dynamically any longer, so reporting it would be extremely misleading. Report zero instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Maintain dynamic metrics in local cache.David S. Miller
Maintain a local hash table of TCP dynamic metrics blobs. Computed TCP metrics are no longer maintained in the route metrics. The table uses RCU and an extremely simple hash so that it has low latency and low overhead. A simple hash is legitimate because we only make metrics blobs for fully established connections. Some tweaking of the default hash table sizes, metric timeouts, and the hash chain length limit certainly could use some tweaking. But the basic design seems sound. With help from Eric Dumazet and Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Abstract back handling peer aliveness test into helper function.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10tcp: Move dynamnic metrics handling into seperate file.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09net/rxrpc/ar-peer.c: remove invalid reference to list iterator variableJulia Lawall
If list_for_each_entry, etc complete a traversal of the list, the iterator variable ends up pointing to an address at an offset from the list head, and not a meaningful structure. Thus this value should not be used after the end of the iterator. This seems to be a copy-paste bug from a previous debugging message, and so the meaningless value is just deleted. This problem was found using Coccinelle (http://coccinelle.lip6.fr/). Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09net: cgroup: fix out of bounds accessesEric Dumazet
dev->priomap is allocated by extend_netdev_table() called from update_netdev_tables(). And this is only called if write_priomap() is called. But if write_priomap() is not called, it seems we can have out of bounds accesses in cgrp_destroy(), read_priomap() & skb_update_prio() With help from Gao Feng Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2012-07-09mac80211: destroy assoc_data correctly if assoc failsEliad Peller
If association failed due to internal error (e.g. no supported rates IE), we call ieee80211_destroy_assoc_data() with assoc=true, while we actually reject the association. This results in the BSSID not being zeroed out. After passing assoc=false, we no longer have to call sta_info_destroy_addr() explicitly. While on it, move the "associated" message after the assoc_success check. Cc: stable@vger.kernel.org [3.4+] Signed-off-by: Eliad Peller <eliad@wizery.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-07-09NFC: Prevent NULL deref when getting socket nameSasha Levin
llcp_sock_getname can be called without a device attached to the nfc_llcp_sock. This would lead to the following BUG: [ 362.341807] BUG: unable to handle kernel NULL pointer dereference at (null) [ 362.341815] IP: [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0 [ 362.341818] PGD 31b35067 PUD 30631067 PMD 0 [ 362.341821] Oops: 0000 [#627] PREEMPT SMP DEBUG_PAGEALLOC [ 362.341826] CPU 3 [ 362.341827] Pid: 7816, comm: trinity-child55 Tainted: G D W 3.5.0-rc4-next-20120628-sasha-00005-g9f23eb7 #479 [ 362.341831] RIP: 0010:[<ffffffff836258e5>] [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0 [ 362.341832] RSP: 0018:ffff8800304fde88 EFLAGS: 00010286 [ 362.341834] RAX: 0000000000000000 RBX: ffff880033cb8000 RCX: 0000000000000001 [ 362.341835] RDX: ffff8800304fdec4 RSI: ffff8800304fdec8 RDI: ffff8800304fdeda [ 362.341836] RBP: ffff8800304fdea8 R08: 7ebcebcb772b7ffb R09: 5fbfcb9c35bdfd53 [ 362.341838] R10: 4220020c54326244 R11: 0000000000000246 R12: ffff8800304fdec8 [ 362.341839] R13: ffff8800304fdec4 R14: ffff8800304fdec8 R15: 0000000000000044 [ 362.341841] FS: 00007effa376e700(0000) GS:ffff880035a00000(0000) knlGS:0000000000000000 [ 362.341843] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 362.341844] CR2: 0000000000000000 CR3: 0000000030438000 CR4: 00000000000406e0 [ 362.341851] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 362.341856] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 362.341858] Process trinity-child55 (pid: 7816, threadinfo ffff8800304fc000, task ffff880031270000) [ 362.341858] Stack: [ 362.341862] ffff8800304fdea8 ffff880035156780 0000000000000000 0000000000001000 [ 362.341865] ffff8800304fdf78 ffffffff83183b40 00000000304fdec8 0000006000000000 [ 362.341868] ffff8800304f0027 ffffffff83729649 ffff8800304fdee8 ffff8800304fdf48 [ 362.341869] Call Trace: [ 362.341874] [<ffffffff83183b40>] sys_getpeername+0xa0/0x110 [ 362.341877] [<ffffffff83729649>] ? _raw_spin_unlock_irq+0x59/0x80 [ 362.341882] [<ffffffff810f342b>] ? do_setitimer+0x23b/0x290 [ 362.341886] [<ffffffff81985ede>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 362.341889] [<ffffffff8372a539>] system_call_fastpath+0x16/0x1b [ 362.341921] Code: 84 00 00 00 00 00 b8 b3 ff ff ff 48 85 db 74 54 66 41 c7 04 24 27 00 49 8d 7c 24 12 41 c7 45 00 60 00 00 00 48 8b 83 28 05 00 00 <8b> 00 41 89 44 24 04 0f b6 83 41 05 00 00 41 88 44 24 10 0f b6 [ 362.341924] RIP [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0 [ 362.341925] RSP <ffff8800304fde88> [ 362.341926] CR2: 0000000000000000 [ 362.341928] ---[ end trace 6d450e935ee18bf3 ]--- Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-07-09mac80211: correct size the argument to kzalloc in minstrel_htThomas Huehn
msp has type struct minstrel_ht_sta_priv not struct minstrel_ht_sta. (This incorporates the fixup originally posted as "mac80211: fix kzalloc memory corruption introduced in minstrel_ht". -- JWL) Reported-by: Fengguang Wu <wfg@linux.intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-07-09Merge branch 'master' of git://1984.lsi.us.es/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== * One to get the timeout special parameter for the SET target back working (this was introduced while trying to fix another bug in 3.4) from Jozsef Kadlecsik. * One crash fix if containers and nf_conntrack are used reported by Hans Schillstrom by myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09netfilter: ipset: timeout fixing bug broke SET target special timeout valueJozsef Kadlecsik
The patch "127f559 netfilter: ipset: fix timeout value overflow bug" broke the SET target when no timeout was specified. Reported-by: Jean-Philippe Menil <jean-philippe.menil@univ-nantes.fr> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-07-09cgroup: fix panic in netprio_cgroupGao feng
we set max_prioidx to the first zero bit index of prioidx_map in function get_prioidx. So when we delete the low index netprio cgroup and adding a new netprio cgroup again,the max_prioidx will be set to the low index. when we set the high index cgroup's net_prio.ifpriomap,the function write_priomap will call update_netdev_tables to alloc memory which size is sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1), so the size of array that map->priomap point to is max_prioidx +1, which is low than what we actually need. fix this by adding check in get_prioidx,only set max_prioidx when max_prioidx low than the new prioidx. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09small cleanup in ax25_addr_parse()Dan Carpenter
The comments were wrong here because "AX25_MAX_DIGIS" is 8 but the comments say 6. Also I've changed the "7" to "AX25_ADDR_LEN". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-09netem: add limitation to reordered packetsEric Dumazet
Fix two netem bugs : 1) When a frame was dropped by tfifo_enqueue(), drop counter was incremented twice. 2) When reordering is triggered, we enqueue a packet without checking queue limit. This can OOM pretty fast when this is repeated enough, since skbs are orphaned, no socket limit can help in this situation. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mark Gordon <msg@google.com> Cc: Andreas Terzis <aterzis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-08sctp: refactor sctp_packet_append_chunk and clenup some memory leaksNeil Horman
While doing some recent work on sctp sack bundling I noted that sctp_packet_append_chunk was pretty inefficient. Specifially, it was called recursively while trying to bundle auth and sack chunks. Because of that we call sctp_packet_bundle_sack and sctp_packet_bundle_auth a total of 4 times for every call to sctp_packet_append_chunk, knowing that at least 3 of those calls will do nothing. So lets refactor sctp_packet_bundle_auth to have an outer part that does the attempted bundling, and an inner part that just does the chunk appends. This saves us several calls per iteration that we just don't need. Also, noticed that the auth and sack bundling fail to free the chunks they allocate if the append fails, so make sure we add that in Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: linux-sctp@vger.kernel.org Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-08ieee802154: verify packet size before trying to allocate itSasha Levin
Currently when sending data over datagram, the send function will attempt to allocate any size passed on from the userspace. We should make sure that this size is checked and limited. We'll limit it to the MTU of the device, which is checked later anyway. Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>