summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2014-01-16sctp: remove the unnecessary assignmentwangweidong
When go the right path, the status is 0, no need to assign it again. So just remove the assignment. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net_sched: act: pick a different type for act_xtWANG Cong
In tcf_register_action() we check either ->type or ->kind to see if there is an existing action registered, but ipt action registers two actions with same type but different kinds. They should have different types too. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Included change: - properly format already existing kerneldoc Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net_sched: act: use tcf_hash_release() in net/sched/act_police.cWANG Cong
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net: add NETDEV_PRECHANGEMTU to notify before mtu change happensVeaceslav Falico
Currently, if a device changes its mtu, first the change happens (invloving all the side effects), and after that the NETDEV_CHANGEMTU is sent so that other devices can catch up with the new mtu. However, if they return NOTIFY_BAD, then the change is reverted and error returned. This is a really long and costy operation (sometimes). To fix this, add NETDEV_PRECHANGEMTU notification which is called prior to any change actually happening, and if any callee returns NOTIFY_BAD - the change is aborted. This way we're skipping all the playing with apply/revert the mtu. CC: "David S. Miller" <davem@davemloft.net> CC: Jiri Pirko <jiri@resnulli.us> CC: Eric Dumazet <edumazet@google.com> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Cong Wang <amwang@redhat.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net: Check skb->rxhash in gro_receiveTom Herbert
When initializing a gro_list for a packet, first check the rxhash of the incoming skb against that of the skb's in the list. This should be a very strong inidicator of whether the flow is going to be matched, and potentially allows a lot of other checks to be short circuited. Use skb_hash_raw so that we don't force the hash to be calculated. Tested by running netperf 200 TCP_STREAMs between two machines with GRO, HW rxhash, and 1G. Saw no performance degration, slight reduction of time in dev_gro_receive. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16packet: use percpu mmap tx frame pending refcountDaniel Borkmann
In PF_PACKET's packet mmap(), we can avoid using one atomic_inc() and one atomic_dec() call in skb destructor and use a percpu reference count instead in order to determine if packets are still pending to be sent out. Micro-benchmark with [1] that has been slightly modified (that is, protcol = 0 in socket(2) and bind(2)), example on a rather crappy testing machine; I expect it to scale and have even better results on bigger machines: ./packet_mm_tx -s7000 -m7200 -z700000 em1, avg over 2500 runs: With patch: 4,022,015 cyc Without patch: 4,812,994 cyc time ./packet_mm_tx -s64 -c10000000 em1 > /dev/null, stable: With patch: real 1m32.241s user 0m0.287s sys 1m29.316s Without patch: real 1m38.386s user 0m0.265s sys 1m35.572s In function tpacket_snd(), it is okay to use packet_read_pending() since in fast-path we short-circuit the condition already with ph != NULL, since we have next frames to process. In case we have MSG_DONTWAIT, we also do not execute this path as need_wait is false here anyway, and in case of _no_ MSG_DONTWAIT flag, it is okay to call a packet_read_pending(), because when we ever reach that path, we're done processing outgoing frames anyway and only look if there are skbs still outstanding to be orphaned. We can stay lockless in this percpu counter since it's acceptable when we reach this path for the sum to be imprecise first, but we'll level out at 0 after all pending frames have reached the skb destructor eventually through tx reclaim. When people pin a tx process to particular CPUs, we expect overflows to happen in the reference counter as on one CPU we expect heavy increase; and distributed through ksoftirqd on all CPUs a decrease, for example. As David Laight points out, since the C language doesn't define the result of signed int overflow (i.e. rather than wrap, it is allowed to saturate as a possible outcome), we have to use unsigned int as reference count. The sum over all CPUs when tx is complete will result in 0 again. The BUG_ON() in tpacket_destruct_skb() we can remove as well. It can _only_ be set from inside tpacket_snd() path and we made sure to increase tx_ring.pending in any case before we called po->xmit(skb). So testing for tx_ring.pending == 0 is not too useful. Instead, it would rather have been useful to test if lower layers didn't orphan the skb so that we're missing ring slots being put back to TP_STATUS_AVAILABLE. But such a bug will be caught in user space already as we end up realizing that we do not have any TP_STATUS_AVAILABLE slots left anymore. Therefore, we're all set. Btw, in case of RX_RING path, we do not make use of the pending member, therefore we also don't need to use up any percpu memory here. Also note that __alloc_percpu() already returns a zero-filled percpu area, so initialization is done already. [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16packet: don't unconditionally schedule() in case of MSG_DONTWAITDaniel Borkmann
In tpacket_snd(), when we've discovered a first frame that is not in status TP_STATUS_SEND_REQUEST, and return a NULL buffer, we exit the send routine in case of MSG_DONTWAIT, since we've finished traversing the mmaped send ring buffer and don't care about pending frames. While doing so, we still unconditionally call an expensive schedule() in the packet_current_frame() "error" path, which is unnecessary in this case since it's enough to just quit the function. Also, in case MSG_DONTWAIT is not set, we should rather test for need_resched() first and do schedule() only if necessary since meanwhile pending frames could already have finished processing and called skb destructor. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16packet: improve socket create/bind latency in some casesDaniel Borkmann
Most people acquire PF_PACKET sockets with a protocol argument in the socket call, e.g. libpcap does so with htons(ETH_P_ALL) for all its sockets. Most likely, at some point in time a subsequent bind() call will follow, e.g. in libpcap with ... memset(&sll, 0, sizeof(sll)); sll.sll_family = AF_PACKET; sll.sll_ifindex = ifindex; sll.sll_protocol = htons(ETH_P_ALL); ... as arguments. What happens in the kernel is that already in socket() syscall, we install a proto hook via register_prot_hook() if our protocol argument is != 0. Yet, in bind() we're almost doing the same work by doing a unregister_prot_hook() with an expensive synchronize_net() call in case during socket() the proto was != 0, plus follow-up register_prot_hook() with a bound device to it this time, in order to limit traffic we get. In the case when the protocol and user supplied device index (== 0) does not change from socket() to bind(), we can spare us doing the same work twice. Similarly for re-binding to the same device and protocol. For these scenarios, we can decrease create/bind latency from ~7447us (sock-bind-2 case) to ~89us (sock-bind-1 case) with this patch. Alternatively, for the first case, if people care, they should simply create their sockets with proto == 0 argument and define the protocol during bind() as this saves a call to synchronize_net() as well (sock-bind-3 case). In all other cases, we're tied to user space behaviour we must not change, also since a bind() is not strictly required. Thus, we need the synchronize_net() to make sure no asynchronous packet processing paths still refer to the previous elements of po->prot_hook. In case of mmap()ed sockets, the workflow that includes bind() is socket() -> setsockopt(<ring>) -> bind(). In that case, a pair of {__unregister, register}_prot_hook is being called from setsockopt() in order to install the new protocol receive handler. Thus, when we call bind and can skip a re-hook, we have already previously installed the new handler. For fanout, this is handled different entirely, so we should be good. Timings on an i7-3520M machine: * sock-bind-1: 89 us * sock-bind-2: 7447 us * sock-bind-3: 75 us sock-bind-1: socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=all(0), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 sock-bind-2: socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=lo(1), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 sock-bind-3: socket(PF_PACKET, SOCK_RAW, 0) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=lo(1), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net/ipv4: don't use module_init in non-modular gre_offloadPaul Gortmaker
Recent commit 438e38fadca2f6e57eeecc08326c8a95758594d4 ("gre_offload: statically build GRE offloading support") added new module_init/module_exit calls to the gre_offload.c file. The file is obj-y and can't be anything other than built-in. Currently it can never be built modular, so using module_init as an alias for __initcall can be somewhat misleading. Fix this up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be a worse thing. We also make the inclusion explicit. Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of device_initcall directly in this change means that the runtime impact is zero -- it will remain at level 6 in initcall ordering. As for the module_exit, rather than replace it with __exitcall, we simply remove it, since it appears only UML does anything with those, and even for UML, there is no relevant cleanup to be done here. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16net: eth_type_trans() should use skb_header_pointer()Eric Dumazet
eth_type_trans() can read uninitialized memory as drivers do not necessarily pull more than 14 bytes in skb->head before calling it. As David suggested, we can use skb_header_pointer() to fix this without breaking some drivers that might not expect eth_type_trans() pulling 2 additional bytes. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables Pablo Neira Ayuso says: ==================== This small batch contains several Netfilter fixes for your net-next tree, more specifically: * Fix compilation warning in nft_ct in NF_CONNTRACK_MARK is not set, from Kristian Evensen. * Add dependency to IPV6 for NF_TABLES_INET. This one has been reported by the several robots that are testing .config combinations, from Paul Gortmaker. * Fix default base chain policy setting in nf_tables, from myself. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16neigh: use NEIGH_VAR_INIT in ndo_neigh_setup functions.Jiri Pirko
When ndo_neigh_setup is called, the bitfield used by NEIGH_VAR_SET is not initialized yet. This might cause confusion for the people who use NEIGH_VAR_SET in ndo_neigh_setup. So rather introduce NEIGH_VAR_INIT for usage in ndo_neigh_setup. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15ipv6 addrconf: don't cleanup prefix route for IFA_F_NOPREFIXROUTEThomas Haller
Refactor the deletion/update of prefix routes when removing an address. Now also consider IFA_F_NOPREFIXROUTE and if there is an address present with this flag, to not cleanup the route. Instead, assume that userspace is taking care of this route. Also perform the same cleanup, when userspace changes an existing address to add NOPREFIXROUTE (to an address that didn't have this flag). This is done because when the address was added, a prefix route was created for it. Since the user now wants to handle this route by himself, we cleanup this route. This cleanup of the route is not totally robust. There is no guarantee, that the route we are about to delete was really the one added by the kernel. This behavior does not change by the patch, and in practice it should work just fine. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15ipv6 addrconf: add IFA_F_NOPREFIXROUTE flag to suppress creation of IP6 routesThomas Haller
When adding/modifying an IPv6 address, the userspace application needs a way to suppress adding a prefix route. This is for example relevant together with IFA_F_MANAGERTEMPADDR, where userspace creates autoconf generated addresses, but depending on on-link, no route for the prefix should be added. Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15Revert "batman-adv: drop dependency against CRC16"David S. Miller
This reverts commit 12afc36e38b3b6a0ec9bda71632c2285e7fdbab2. The dependency is actually still necessary. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15sctp: create helper function to enable|disable sackdelaywangweidong
add sctp_spp_sackdelay_{enable|disable} helper function for avoiding code duplication. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15ipv6: move IPV6_TCLASS_SHIFT into ipv6.h and define a helperLi RongQing
Two places defined IPV6_TCLASS_SHIFT, so we should move it into ipv6.h, and use this macro as possible. And define ip6_tclass helper to return tclass Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15net: move 6lowpan compression code to separate moduleDmitry Eremin-Solenikov
IEEE 802.15.4 and Bluetooth networking stacks share 6lowpan compression code. Instead of introducing Makefile/Kconfig hacks, build this code as a separate module referenced from both ieee802154 and bluetooth modules. This fixes the following build error observed in some kernel configurations: net/built-in.o: In function `header_create': 6lowpan.c:(.text+0x166149): undefined reference to `lowpan_header_compress' net/built-in.o: In function `bt_6lowpan_recv': (.text+0x166b3c): undefined reference to `lowpan_process_data' Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Dmitry Eremin-Solenikov <dmitry_eremin@mentor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15net: rename sysfs symlinks on device name changeVeaceslav Falico
Currently, we don't rename the upper/lower_ifc symlinks in /sys/class/net/*/ , which might result stale/duplicate links/names. Fix this by adding netdev_adjacent_rename_links(dev, oldname) which renames all the upper/lower interface's links to dev from the upper/lower_oldname to the new name. We don't need a rollback because only we control these symlinks and if we fail to rename them - sysfs will anyway complain. Reported-by: Ding Tianhong <dingtianhong@huawei.com> CC: Ding Tianhong <dingtianhong@huawei.com> CC: "David S. Miller" <davem@davemloft.net> CC: Eric Dumazet <edumazet@google.com> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Cong Wang <amwang@redhat.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15net: add sysfs helpers for netdev_adjacent logicVeaceslav Falico
They clean up the code a bit and can be used further. CC: Ding Tianhong <dingtianhong@huawei.com> CC: "David S. Miller" <davem@davemloft.net> CC: Eric Dumazet <edumazet@google.com> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Cong Wang <amwang@redhat.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16batman-adv: use consistent kerneldoc styleSimon Wunderlich
Reported-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2014-01-15neigh: split lines for NEIGH_VAR_SET so they are not too longJiri Pirko
introduced by: commit 1f9248e5606afc6485255e38ad57bdac08fa7711 "neigh: convert parms to an array" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15netfilter: nft_ct: fix compilation warning if NF_CONNTRACK_MARK is not setKristian Evensen
net/netfilter/nft_ct.c: In function 'nft_ct_set_eval': net/netfilter/nft_ct.c:136:6: warning: unused variable 'value' [-Wunused-variable] Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-14tcp: do not export tcp_gso_segment() and tcp_gro_receive()Eric Dumazet
tcp_gso_segment() and tcp_gro_receive() no longer need to be exported. IPv4 and IPv6 offloads are statically linked. Note that tcp_gro_complete() is still used by bnx2x, unfortunately. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14net: nl80211: __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue
As __cfg80211_rdev_from_attrs(), nl80211_dump_wiphy_parse() and nl80211_set_wiphy() are all under rtnl_lock protection, __dev_get_by_index() instead of dev_get_by_index() should be used to find interface handler in them allowing us to avoid to change interface reference counter. Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14can: use __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue
As cgw_create_job() is always under rtnl_lock protection, __dev_get_by_index() instead of dev_get_by_index() should be used to find interface handler in it having us avoid to change interface reference counter. Cc: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14caif: __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue
The following call chains indicate that chnl_net_open() is under rtnl_lock protection as __dev_open() is protected by rtnl_lock. So if __dev_get_by_index() instead of dev_get_by_index() is used to find interface handler in it, this would help us avoid to change interface reference counter. __dev_open() chnl_net_open() Cc: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14batman-adv: use __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue
The following call chains indicate that batadv_is_on_batman_iface() is always under rtnl_lock protection as call_netdevice_notifier() is protected by rtnl_lock. So if __dev_get_by_index() rather than dev_get_by_index() is used to find interface handler in it, this would help us avoid to change interface reference counter. call_netdevice_notifier() batadv_hard_if_event() batadv_hardif_add_interface() batadv_is_valid_iface() batadv_is_on_batman_iface() Cc: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14decnet: use __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue
The following call chain we can identify that dn_cache_getroute() is protected under rtnl_lock. So if we use __dev_get_by_index() instead of dev_get_by_index() to find interface handlers in it, this would help us avoid to change interface reference counter. rtnetlink_rcv() rtnl_lock() netlink_rcv_skb() dn_cache_getroute() rtnl_unlock() Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14dcb: use __dev_get_by_name instead of dev_get_by_name to find interfaceYing Xue
The following call chain indicates that dcb_doit() is protected under rtnl_lock. So if we use __dev_get_by_name() instead of dev_get_by_name() to find interface handlers in it, this would help us avoid to change interface reference counter. rtnetlink_rcv() rtnl_lock() netlink_rcv_skb() dcb_doit() rtnl_unlock() Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14IPv6: move the anycast_src_echo_reply sysctl to netns_sysctl_ipv6FX Le Bail
This change move anycast_src_echo_reply sysctl with other ipv6 sysctls. Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14sctp: remove a redundant NULL checkDan Carpenter
It confuses Smatch when we check "sinit" for NULL and then non-NULL and that causes a false positive warning later. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14tipc: spelling fixesstephen hemminger
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14ipv6: addrconf spelling fixesstephen hemminger
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14net: Spelling s/transmition/transmission/Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14net: replace macros net_random and net_srandom with direct calls to prandomAruna-Hewapathirane
This patch removes the net_random and net_srandom macros and replaces them with direct calls to the prandom ones. As new commits only seem to use prandom_u32 there is no use to keep them around. This change makes it easier to grep for users of prandom_u32. Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14ipv6: copy traffic class from ping request to replyHannes Frederic Sowa
Suggested-by: Simon Schneider <simon-schneider@gmx.net> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14ipv4: register igmp_notifier even when !CONFIG_PROC_FSWANG Cong
We still need this notifier even when we don't config PROC_FS. It should be rare to have a kernel without PROC_FS, so just for completeness. Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14net: Add trace events for all receive entry points, exposing more skb fieldsBen Hutchings
The existing net/netif_rx and net/netif_receive_skb trace events provide little information about the skb, nor do they indicate how it entered the stack. Add trace events at entry of each of the exported functions, including most fields that are likely to be interesting for debugging driver datapath behaviour. Split netif_rx() and netif_receive_skb() so that internal calls are not traced. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14net: Add net_dev_start_xmit trace event, exposing more skb fieldsBen Hutchings
The existing net/net_dev_xmit trace event provides little information about the skb that has been passed to the driver, and it is not simple to add more since the skb may already have been freed at the point the event is emitted. Add a separate trace event before the skb is passed to the driver, including most fields that are likely to be interesting for debugging driver datapath behaviour. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14net: Fix indentation in dev_hard_start_xmit()Ben Hutchings
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2014-01-14net: add skb_checksum_setupPaul Durrant
This patch adds a function to set up the partial checksum offset for IP packets (and optionally re-calculate the pseudo-header checksum) into the core network code. The implementation was previously private and duplicated between xen-netback and xen-netfront, however it is not xen-specific and is potentially useful to any network driver. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14netfilter: Add dependency on IPV6 for NF_TABLES_INETPaul Gortmaker
Commit 1d49144c0aa ("netfilter: nf_tables: add "inet" table for IPv4/IPv6") allows creation of non-IPV6 enabled .config files that will fail to configure/link as follows: warning: (NF_TABLES_INET) selects NF_TABLES_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && NF_TABLES) warning: (NF_TABLES_INET) selects NF_TABLES_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && NF_TABLES) warning: (NF_TABLES_INET) selects NF_TABLES_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && NF_TABLES) net/built-in.o: In function `nft_reject_eval': nft_reject.c:(.text+0x651e8): undefined reference to `nf_ip6_checksum' nft_reject.c:(.text+0x65270): undefined reference to `ip6_route_output' nft_reject.c:(.text+0x656c4): undefined reference to `ip6_dst_hoplimit' make: *** [vmlinux] Error 1 Since the feature is to allow for a mixed IPV4 and IPV6 table, it seems sensible to make it depend on IPV6. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-13bridge: move br_net_exit() to br.cWANG Cong
And it can become static. Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Conflicts: net/xfrm/xfrm_policy.c Steffen Klassert says: ==================== This pull request has a merge conflict between commits be7928d20bab ("net: xfrm: xfrm_policy: fix inline not at beginning of declaration") and da7c224b1baa ("net: xfrm: xfrm_policy: silence compiler warning") from the net-next tree and commit 2f3ea9a95c58 ("xfrm: checkpatch erros with inline keyword position") from the ipsec-next tree. The version from net-next can be used, like it is done in linux-next. 1) Checkpatch cleanups, from Weilong Chen. 2) Fix lockdep complaints when pktgen is used with IPsec, from Fan Du. 3) Update pktgen to allow any combination of IPsec transport/tunnel mode and AH/ESP/IPcomp type, from Fan Du. 4) Make pktgen_dst_metrics static, Fengguang Wu. 5) Compile fix for pktgen when CONFIG_XFRM is not set, from Fan Du. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait socketsNeal Cardwell
Fix inet_diag_dump_icsk() to reflect the fact that both TCP_TIME_WAIT and TCP_FIN_WAIT2 connections are represented by inet_timewait_sock (not just TIME_WAIT), and for such sockets the tw_substate field holds the real state, which can be either TCP_TIME_WAIT or TCP_FIN_WAIT2. This brings the inet_diag state-matching code in line with the field it uses to populate idiag_state. This is also analogous to the info exported in /proc/net/tcp, where get_tcp4_sock() exports sk->sk_state and get_timewait4_sock() exports tw->tw_substate. Before fixing this, (a) neither "ss -nemoi" nor "ss -nemoi state fin-wait-2" would return a socket in TCP_FIN_WAIT2; and (b) "ss -nemoi state time-wait" would also return sockets in state TCP_FIN_WAIT2. This is an old bug that predates 05dbc7b ("tcp/dccp: remove twchain"). Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller
Included changes: - drop dependency against CRC16 - move to new release version - add size check at compile time for packet structs - update copyright years in every file - implement new bonding/interface alternation feature Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13net: make dev_set_mtu() honor notification return codeVeaceslav Falico
Currently, after changing the MTU for a device, dev_set_mtu() calls NETDEV_CHANGEMTU notification, however doesn't verify it's return code - which can be NOTIFY_BAD - i.e. some of the net notifier blocks refused this change, and continues nevertheless. To fix this, verify the return code, and if it's an error - then revert the MTU to the original one, notify again and pass the error code. CC: Jiri Pirko <jiri@resnulli.us> CC: "David S. Miller" <davem@davemloft.net> CC: Eric Dumazet <edumazet@google.com> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>