summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2012-09-04mac80211: Do not check for valid hw_queues for P2P_DEVICEIlan Peer
A P2P Device interface does not have a netdev, and is not expected to be used for transmitting data, so there is no need to assign hw queues for it. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-09-03openvswitch: Increase maximum number of datapath ports.Pravin B Shelar
Use hash table to store ports of datapath. Allow 64K ports per switch. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-03openvswitch: Fix FLOW_BUFSIZE definition.Jesse Gross
The vlan encapsulation fields in the maximum flow defintion were never updated when the representation changed before upstreaming. In theory this could cause a kernel panic when a maximum length flow is used. In practice this has never happened (to my knowledge) because skb allocations are padded out to a cache line so you would need the right combination of flow and packet being sent to userspace. Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-03Merge branch 'master' of git://1984.lsi.us.es/nf-nextDavid S. Miller
2012-09-03fq_codel: dont reinit flow stateEric Dumazet
When fq_codel builds a new flow, it should not reset codel state. Codel algo needs to get previous values (lastcount, drop_next) to get proper behavior. Signed-off-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Dave Taht <dave.taht@bufferbloat.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-03tcp: use PRR to reduce cwin in CWR stateYuchung Cheng
Use proportional rate reduction (PRR) algorithm to reduce cwnd in CWR state, in addition to Recovery state. Retire the current rate-halving in CWR. When losses are detected via ACKs in CWR state, the sender enters Recovery state but the cwnd reduction continues and does not restart. Rename and refactor cwnd reduction functions since both CWR and Recovery use the same algorithm: tcp_init_cwnd_reduction() is new and initiates reduction state variables. tcp_cwnd_reduction() is previously tcp_update_cwnd_in_recovery(). tcp_ends_cwnd_reduction() is previously tcp_complete_cwr(). The rate halving functions and logic such as tcp_cwnd_down(), tcp_min_cwnd(), and the cwnd moderation inside tcp_enter_cwr() are removed. The unused parameter, flag, in tcp_cwnd_reduction() is also removed. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-03tcp: move tcp_update_cwnd_in_recoveryYuchung Cheng
To prepare replacing rate halving with PRR algorithm in CWR state. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-03tcp: move tcp_enter_cwr()Yuchung Cheng
To prepare replacing rate halving with PRR algorithm in CWR state. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-03sctp: Don't charge for data in sndbuf again when transmitting packetThomas Graf
SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be waken up by __sctp_write_space(). Buffer space charged via sctp_set_owner_w() is released in sctp_wfree() which calls __sctp_write_space() directly. Buffer space charged via skb_set_owner_w() is released via sock_wfree() which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set. sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets. Therefore if sctp_packet_transmit() manages to queue up more than sndbuf bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is interrupted by a signal. This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ... Charging for the data twice does not make sense in the first place, it leads to overcharging sndbuf by a factor 2. Therefore this patch only charges a single byte in wmem_alloc when transmitting an SCTP packet to ensure that the socket stays alive until the packet has been released. This means that control chunks are no longer accounted for in wmem_alloc which I believe is not a problem as skb->truesize will typically lead to overcharging anyway and thus compensates for any control overhead. Signed-off-by: Thomas Graf <tgraf@suug.ch> CC: Vlad Yasevich <vyasevic@redhat.com> CC: Neil Horman <nhorman@tuxdriver.com> CC: David Miller <davem@davemloft.net> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-03net: sock_edemux() should take care of timewait socketsEric Dumazet
sock_edemux() can handle either a regular socket or a timewait socket Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextPablo Neira Ayuso
This merges (3f509c6 netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation) to Patrick McHardy's IPv6 NAT changes.
2012-09-03netfilter: properly annotate ipv4_netfilter_{init,fini}()Jan Beulich
Despite being just a few bytes of code, they should still have proper annotations. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-03netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_queue()Michael Wang
Since 'list_for_each_continue_rcu' has already been replaced by 'list_for_each_entry_continue_rcu', pass 'list_head' to nf_queue() as a parameter can not benefit us any more. This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of nf_queue() and __nf_queue() to save code. Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-03netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_iterate()Michael Wang
Since 'list_for_each_continue_rcu' has already been replaced by 'list_for_each_entry_continue_rcu', pass 'list_head' to nf_iterate() as a parameter can not benefit us any more. This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of nf_iterate() to save code. Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-03netfilter: remove xt_NOTRACKCong Wang
It was scheduled to be removed for a long time. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: netfilter@vger.kernel.org Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-03netfilter: nf_conntrack: add nf_ct_timeout_lookupPablo Neira Ayuso
This patch adds the new nf_ct_timeout_lookup function to encapsulate the timeout policy attachment that is called in the nf_conntrack_in path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-03netfilter: xt_CT: refactorize xt_ct_tg_checkPablo Neira Ayuso
This patch adds xt_ct_set_helper and xt_ct_set_timeout to reduce the size of xt_ct_tg_check. This aims to improve code mantainability by splitting xt_ct_tg_check in smaller chunks. Suggested by Eric Dumazet. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-03netfilter: xt_socket: fix compilation warnings with gcc 4.7Pablo Neira Ayuso
This patch fixes compilation warnings in xt_socket with gcc-4.7. In file included from net/netfilter/xt_socket.c:22:0: net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’: include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:265:16: note: ‘sport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:265:9: note: ‘dport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:264:27: note: ‘saddr’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:264:19: note: ‘daddr’ was declared here In file included from net/netfilter/xt_socket.c:22:0: net/netfilter/xt_socket.c: In function ‘socket_match.isra.4’: include/net/netfilter/nf_tproxy_core.h:75:2: warning: ‘protocol’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:113:5: note: ‘protocol’ was declared here In file included from include/net/tcp.h:37:0, from net/netfilter/xt_socket.c:17: include/net/inet_hashtables.h:356:45: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:112:16: note: ‘sport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:106:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:112:9: note: ‘dport’ was declared here In file included from include/net/tcp.h:37:0, from net/netfilter/xt_socket.c:17: include/net/inet_hashtables.h:356:15: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:111:16: note: ‘saddr’ was declared here In file included from include/net/tcp.h:37:0, from net/netfilter/xt_socket.c:17: include/net/inet_hashtables.h:356:15: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:111:9: note: ‘daddr’ was declared here In file included from net/netfilter/xt_socket.c:22:0: net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’: include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:268:16: note: ‘sport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:268:9: note: ‘dport’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:267:27: note: ‘saddr’ was declared here In file included from net/netfilter/xt_socket.c:22:0: include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized] net/netfilter/xt_socket.c:267:19: note: ‘daddr’ was declared here Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-09-02openvswitch: Fix typoJoe Stringer
Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Jesse Gross <jesse@nicira.com>
2012-09-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) NLA_PUT* --> nla_put_* conversion got one case wrong in nfnetlink_log, fix from Patrick McHardy. 2) Missed error return check in ipw2100 driver, from Julia Lawall. 3) PMTU updates in ipv4 were setting the expiry time incorrectly, fix from Eric Dumazet. 4) SFC driver erroneously reversed src and dst when reporting filters via ethtool. 5) Memory leak in CAN protocol and wrong setting of IRQF_SHARED in sja1000 can platform driver, from Alexey Khoroshilov and Sven Schmitt. 6) Fix multicast traffic scaling regression in ipv4_dst_destroy, only take the lock when we really need to. From Eric Dumazet. 7) Fix non-root process spoofing in netlink, from Pablo Neira Ayuso. 8) CWND reduction in TCP is done incorrectly during non-SACK recovery, fix from Yuchung Cheng. 9) Revert netpoll change, and fix what was actually a driver specific problem. From Amerigo Wang. This should cure bootup hangs with netconsole some people reported. 10) Fix xen-netfront invoking __skb_fill_page_desc() with a NULL page pointer. From Ian Campbell. 11) SIP NAT fix for expectiontation creation, from Pablo Neira Ayuso. 12) __ip_rt_update_pmtu() needs RCU locking, from Eric Dumazet. 13) Fix usbnet deadlock on resume, can't use GFP_KERNEL in this situation. From Oliver Neukum. 14) The davinci ethernet driver triggers an OOPS on removal because it frees an MDIO object before unregistering it. Fix from Bin Liu. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) net: qmi_wwan: add several new Gobi devices fddi: 64 bit bug in smt_add_para() net: ethernet: fix kernel OOPS when remove davinci_mdio module net/xfrm/xfrm_state.c: fix error return code net: ipv6: fix error return code net: qmi_wwan: new device: Foxconn/Novatel E396 usbnet: fix deadlock in resume cs89x0 : packet reception not working netfilter: nf_conntrack: fix racy timer handling with reliable events bnx2x: Correct the ndo_poll_controller call bnx2x: Move netif_napi_add to the open call ipv4: must use rcu protection while calling fib_lookup bnx2x: fix 57840_MF pci id net: ipv4: ipmr_expire_timer causes crash when removing net namespace e1000e: DoS while TSO enabled caused by link partner with small MSS l2tp: avoid to use synchronize_rcu in tunnel free function gianfar: fix default tx vlan offload feature flag netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX netfilter: nfnetlink_log: fix error return code in init path ...
2012-09-016lowpan: handle NETDEV_UNREGISTER eventAlan Ott
Before, it was impossible to remove a wpan device which had lowpan attached to it. Signed-off-by: Alan Ott <alan@signal11.us> Signed-off-by: David S. Miller <davem@tempietto.lan>
2012-09-016lowpan: Make a copy of skb's delivered to 6lowpanAlan Ott
Since lowpan_process_data() modifies the skb (by calling skb_pull()), we need our own copy so that it doesn't affect the data received by other protcols (in this case, af_ieee802154). Signed-off-by: Alan Ott <alan@signal11.us> Signed-off-by: David S. Miller <davem@tempietto.lan>
2012-09-01treewide: fix comment/printk/variable typosAnatol Pomozov
Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-08-31tcp: TCP Fast Open Server - main code pathJerry Chu
This patch adds the main processing path to complete the TFO server patches. A TFO request (i.e., SYN+data packet with a TFO cookie option) first gets processed in tcp_v4_conn_request(). If it passes the various TFO checks by tcp_fastopen_check(), a child socket will be created right away to be accepted by applications, rather than waiting for the 3WHS to finish. In additon to the use of TFO cookie, a simple max_qlen based scheme is put in place to fend off spoofed TFO attack. When a valid ACK comes back to tcp_rcv_state_process(), it will cause the state of the child socket to switch from either TCP_SYN_RECV to TCP_ESTABLISHED, or TCP_FIN_WAIT1 to TCP_FIN_WAIT2. At this time retransmission will resume for any unack'ed (data, FIN,...) segments. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31tcp: TCP Fast Open Server - support TFO listenersJerry Chu
This patch builds on top of the previous patch to add the support for TFO listeners. This includes - 1. allocating, properly initializing, and managing the per listener fastopen_queue structure when TFO is enabled 2. changes to the inet_csk_accept code to support TFO. E.g., the request_sock can no longer be freed upon accept(), not until 3WHS finishes 3. allowing a TCP_SYN_RECV socket to properly poll() and sendmsg() if it's a TFO socket 4. properly closing a TFO listener, and a TFO socket before 3WHS finishes 5. supporting TCP_FASTOPEN socket option 6. modifying tcp_check_req() to use to check a TFO socket as well as request_sock 7. supporting TCP's TFO cookie option 8. adding a new SYN-ACK retransmit handler to use the timer directly off the TFO socket rather than the listener socket. Note that TFO server side will not retransmit anything other than SYN-ACK until the 3WHS is completed. The patch also contains an important function "reqsk_fastopen_remove()" to manage the somewhat complex relation between a listener, its request_sock, and the corresponding child socket. See the comment above the function for the detail. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31tcp: TCP Fast Open Server - header & support functionsJerry Chu
This patch adds all the necessary data structure and support functions to implement TFO server side. It also documents a number of flags for the sysctl_tcp_fastopen knob, and adds a few Linux extension MIBs. In addition, it includes the following: 1. a new TCP_FASTOPEN socket option an application must call to supply a max backlog allowed in order to enable TFO on its listener. 2. A number of key data structures: "fastopen_rsk" in tcp_sock - for a big socket to access its request_sock for retransmission and ack processing purpose. It is non-NULL iff 3WHS not completed. "fastopenq" in request_sock_queue - points to a per Fast Open listener data structure "fastopen_queue" to keep track of qlen (# of outstanding Fast Open requests) and max_qlen, among other things. "listener" in tcp_request_sock - to point to the original listener for book-keeping purpose, i.e., to maintain qlen against max_qlen as part of defense against IP spoofing attack. 3. various data structure and functions, many in tcp_fastopen.c, to support server side Fast Open cookie operations, including /proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31ipv6: remove some deadcodeSorin Dumitru
__ipv6_regen_rndid no longer returns anything other than 0 so there's no point in verifying what it returns Signed-off-by: Sorin Dumitru <sdumitru@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31net/xfrm/xfrm_state.c: fix error return codeJulia Lawall
Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31net: ipv6: fix error return codeJulia Lawall
Initialize return variable before exiting on an error path. The initial initialization of the return variable is also dropped, because that value is never used. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31net: fix documentation of skb_needs_linearize().Rami Rosen
skb_needs_linearize() does not check highmem DMA as it does not call illegal_highdma() anymore, so there is no need to mention highmem DMA here. (Indeed, ~NETIF_F_SG flag, which is checked in skb_needs_linearize(), can be set when illegal_highdma() returns true, and we are assured that illegal_highdma() is invoked prior to skb_needs_linearize() as skb_needs_linearize() is a static method called only once. But ~NETIF_F_SG can be set not only there in this same invocation path. It can also be set when can_checksum_protocol() returns false). see commit 02932ce9e2c136e6fab2571c8e0dd69ae8ec9853, Convert skb_need_linearize() to use precomputed features. Signed-off-by: Rami Rosen <rosenr@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31ipv4: Minor logic clean-up in ipv4_mtuAlexander Duyck
In ipv4_mtu there is some logic where we are testing for a non-zero value and a timer expiration, then setting the value to zero, and then testing if the value is zero we set it to a value based on the dst. Instead of bothering with the extra steps it is easier to just cleanup the logic so that we set it to the dst based value if it is zero or if the timer has expired. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
2012-08-31net:atm:fix up ENOIOCTLCMD error handlingWanlong Gao
At commit 07d106d0, Linus pointed out that ENOIOCTLCMD should be translated as ENOTTY to user mode. Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31openvswitch: using kfree_rcu() to simplify the codeWei Yongjun
The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31af_unix: fix shutdown parameter checkingXi Wang
Return -EINVAL rather than 0 given an invalid "mode" parameter. Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31decnet: fix shutdown parameter checkingXi Wang
The allowed value of "how" is SHUT_RD/SHUT_WR/SHUT_RDWR (0/1/2), rather than SHUTDOWN_MASK (3). Signed-off-by: Xi Wang <xi.wang@gmail.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Merge the 'net' tree to get the recent set of netfilter bug fixes in order to assist with some merge hassles Pablo is going to have to deal with for upcoming changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31Merge branch 'master' of git://1984.lsi.us.es/nfDavid S. Miller
2012-08-31netfilter: nf_conntrack: fix racy timer handling with reliable eventsPablo Neira Ayuso
Existing code assumes that del_timer returns true for alive conntrack entries. However, this is not true if reliable events are enabled. In that case, del_timer may return true for entries that were just inserted in the dying list. Note that packets / ctnetlink may hold references to conntrack entries that were just inserted to such list. This patch fixes the issue by adding an independent timer for event delivery. This increases the size of the ecache extension. Still we can revisit this later and use variable size extensions to allocate this area on demand. Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30ipv4: must use rcu protection while calling fib_lookupEric Dumazet
Following lockdep splat was reported by Pavel Roskin : [ 1570.586223] =============================== [ 1570.586225] [ INFO: suspicious RCU usage. ] [ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted [ 1570.586229] ------------------------------- [ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage! [ 1570.586233] [ 1570.586233] other info that might help us debug this: [ 1570.586233] [ 1570.586236] [ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0 [ 1570.586238] 2 locks held by Chrome_IOThread/4467: [ 1570.586240] #0: (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0 [ 1570.586253] #1: (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270 [ 1570.586260] [ 1570.586260] stack backtrace: [ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98 [ 1570.586265] Call Trace: [ 1570.586271] [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130 [ 1570.586275] [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270 [ 1570.586278] [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0 [ 1570.586282] [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90 [ 1570.586285] [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80 [ 1570.586290] [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0 [ 1570.586293] [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0 [ 1570.586296] [<ffffffff814f2c35>] release_sock+0x55/0xa0 [ 1570.586300] [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50 [ 1570.586305] [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230 [ 1570.586308] [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40 [ 1570.586312] [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0 [ 1570.586315] [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0 [ 1570.586320] [<ffffffff814ec435>] do_sock_write+0xc5/0xe0 [ 1570.586323] [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80 [ 1570.586328] [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0 [ 1570.586332] [<ffffffff8114c5a5>] vfs_write+0x165/0x180 [ 1570.586335] [<ffffffff8114c805>] sys_write+0x45/0x90 [ 1570.586340] [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Pavel Roskin <proski@gnu.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-30net: ipv4: ipmr_expire_timer causes crash when removing net namespaceFrancesco Ruggeri
When tearing down a net namespace, ipv4 mr_table structures are freed without first deactivating their timers. This can result in a crash in run_timer_softirq. This patch mimics the corresponding behaviour in ipv6. Locking and synchronization seem to be adequate. We are about to kfree mrt, so existing code should already make sure that no other references to mrt are pending or can be created by incoming traffic. The functions invoked here do not cause new references to mrt or other race conditions to be created. Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive. Both ipmr_expire_process (whose completion we may have to wait in del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock or other synchronizations when needed, and they both only modify mrt. Tested in Linux 3.4.8. Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-30netpoll: provide an IP ident in UDP framesEric Dumazet
Let's fill IP header ident field with a meaningful value, it might help some setups. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-30l2tp: avoid to use synchronize_rcu in tunnel free functionxeb@mail.ru
Avoid to use synchronize_rcu in l2tp_tunnel_free because context may be atomic. Signed-off-by: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-30netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectationPablo Neira Ayuso
We're hitting bug while trying to reinsert an already existing expectation: kernel BUG at kernel/timer.c:895! invalid opcode: 0000 [#1] SMP [...] Call Trace: <IRQ> [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack] [<ffffffff812d423a>] ? in4_pton+0x72/0x131 [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip] [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip] [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip] [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip] We have to remove the RTP expectation if the RTCP expectation hits EBUSY since we keep trying with other ports until we succeed. Reported-by: Rafal Fitt <rafalf@aplusc.com.pl> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30net: dev: fix the incorrect hold of net namespace's lo deviceGao feng
When moving a net device from one net namespace to another net namespace,dev_change_net_namespace calls NETDEV_DOWN event,so the original net namespace's dst entries which beloned to this net device will be put into dst_garbage list. then dev_change_net_namespace will set this net device's net to the new net namespace. If we unregister this net device's driver, this will trigger the NETDEV_UNREGISTER_FINAL event, dst_ifdown will be called, and get this net device's dst entries from dst_garbage list, put these entries' dev to the new net namespace's lo device. It's not what we want,actually we need these dst entries hold the original net namespace's lo device,this incorrect device holding will trigger emg message like below. unregister_netdevice: waiting for lo to become free. Usage count = 1 so we should call NETDEV_UNREGISTER_FINAL event in dev_change_net_namespace too,in order to make sure dst entries already in the dst_garbage list, we need rcu_barrier before we call NETDEV_UNREGISTER_FINAL event. With help form Eric Dumazet. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-30netfilter: nfnetlink_log: fix error return code in init pathJulia Lawall
Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30netfilter: ctnetlink: fix error return code in init pathJulia Lawall
Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30ipvs: fix error return codeJulia Lawall
Initialize return variable before exiting on an error path. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> ( if@p1 (\(ret < 0\|ret != 0\)) { ... return ret; } | ret@p1 = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-08-30netfilter: ip6tables: add stateless IPv6-to-IPv6 Network Prefix Translation ↵Patrick McHardy
target Signed-off-by: Patrick McHardy <kaber@trash.net>
2012-08-30netfilter: nf_nat: support IPv6 in TFTP NAT helperPablo Neira Ayuso
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2012-08-30netfilter: nf_nat: support IPv6 in IRC NAT helperPablo Neira Ayuso
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>