summaryrefslogtreecommitdiffstats
path: root/net/ipv4/ipmr.c
AgeCommit message (Collapse)Author
2013-03-28net-next: replace obsolete NLMSG_* with type safe nlmsg_*Hong zhi guo
Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-26GRE: Refactor GRE tunneling code.Pravin B Shelar
Following patch refactors GRE code into ip tunneling code and GRE specific code. Common tunneling code is moved to ip_tunnel module. ip_tunnel module is written as generic library which can be used by different tunneling implementations. ip_tunnel module contains following components: - packet xmit and rcv generic code. xmit flow looks like (gre_xmit/ipip_xmit)->ip_tunnel_xmit->ip_local_out. - hash table of all devices. - lookup for tunnel devices. - control plane operations like device create, destroy, ioctl, netlink operations code. - registration for tunneling modules, like gre, ipip etc. - define single pcpu_tstats dev->tstats. - struct tnl_ptk_info added to pass parsed tunnel packet parameters. ipip.h header is renamed to ip_tunnel.h Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18net: proc: change proc_net_remove to remove_proc_entryGao feng
proc_net_remove is only used to remove proc entries that under /proc/net,it's not a general function for removing proc entries of netns. if we want to remove some proc entries which under /proc/net/stat/, we still need to call remove_proc_entry. this patch use remove_proc_entry to replace proc_net_remove. we can remove proc_net_remove after this patch. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18net: proc: change proc_net_fops_create to proc_createGao feng
Right now, some modules such as bonding use proc_create to create proc entries under /proc/net/, and other modules such as ipv4 use proc_net_fops_create. It looks a little chaos.this patch changes all of proc_net_fops_create to proc_create. we can remove proc_net_fops_create after this patch. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-22ipmr: fix sparse warning when testing origin or groupNicolas Dichtel
mfc_mcastgrp and mfc_origin are __be32, thus we need to convert INADDR_ANY. Because INADDR_ANY is 0, this patch just fix sparse warnings. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-21mcast: add multicast proxy support (IPv4 and IPv6)Nicolas Dichtel
This patch add the support of proxy multicast, ie being able to build a static multicast tree. It adds the support of (*,*) and (*,G) entries. The user should define an (*,*) entry which is not used for real forwarding. This entry defines the upstream in iif and contains all interfaces from the static tree in its oifs. It will be used to forward packet upstream when they come from an interface belonging to the static tree. Hence, the user should define (*,G) entries to build its static tree. Note that upstream interface must be part of oifs: packets are sent to all oifs interfaces except the input interface. This ensures to always join the whole static tree, even if the packet is not coming from the upstream interface. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-04ipmr: advertise new mfc entries via rtnlNicolas Dichtel
This patch allows to monitor mfc activities via rtnetlink. To avoid parsing two times the mfc oifs, we use maxvif to allocate the rtnl msg, thus we may allocate some superfluous space. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-04ipmr/ip6mr: allow to get unresolved cache via netlinkNicolas Dichtel
/proc/net/ip[6]_mr_cache allows to get all mfc entries, even if they are put in the unresolved list (mfc[6]_unres_queue). But only the table RT_TABLE_DEFAULT is displayed. This patch adds the parsing of the unresolved list when the dump is made via rtnetlink, hence each table can be checked. In IPv6, we set rtm_type in ip6mr_fill_mroute(), because in case of unresolved mfc __ip6mr_fill_mroute() will not set it. In IPv4, it is already done. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-04ipmr/ip6mr: report origin of mfc entry into rtnl msgNicolas Dichtel
A mfc entry can be static or not (added via the mroute_sk socket). The patch reports MFC_STATIC flag into rtm_protocol by setting rtm_protocol to RTPROT_STATIC or RTPROT_MROUTED. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-04ipmr/ip6mr: advertise mfc stats via rtnetlinkNicolas Dichtel
These statistics can be checked only via /proc/net/ip_mr_cache or SIOCGETSGCNT[_IN6] and thus only for the table RT_TABLE_DEFAULT. Advertising them via rtnetlink allows to get statistics for all cache entries, whatever the table is. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-04netconf: advertise mc_forwarding statusNicolas Dichtel
This patch advertise the MC_FORWARDING status for IPv4 and IPv6. This field is readonly, only multicast engine in the kernel updates it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-26net: ipmr: limit MRT_TABLE identifiersEric Dumazet
Name of pimreg devices are built from following format : char name[IFNAMSIZ]; // IFNAMSIZ == 16 sprintf(name, "pimreg%u", mrt->id); We must therefore limit mrt->id to 9 decimal digits or risk a buffer overflow and a crash. Restrict table identifiers in [0 ... 999999999] interval. Reported-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-25ipv4/ipmr and ipv6/ip6mr: Convert int mroute_do_<foo> to boolJoe Perches
Save a few bytes per table by convert mroute_do_assert and mroute_do_pim from int to bool. Remove !! as the compiler does that when assigning int to bool. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-25ipv4: ipmr: various fixes and cleanupsEric Dumazet
1) ip_mroute_setsockopt() & ip_mroute_getsockopt() should not access/set raw_sk(sk)->ipmr_table before making sure the socket is a raw socket, and protocol is IGMP 2) MRT_INIT should return -EINVAL if optlen != sizeof(int), not -ENOPROTOOPT 3) MRT_ASSERT & MRT_PIM should validate optlen 4) " (v) ? 1 : 0 " can be written as " !!v " Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-18net: Allow userns root to control ipv4Eric W. Biederman
Allow an unpriviled user who has created a user namespace, and then created a network namespace to effectively use the new network namespace, by reducing capable(CAP_NET_ADMIN) and capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns, CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls. Settings that merely control a single network device are allowed. Either the network device is a logical network device where restrictions make no difference or the network device is hardware NIC that has been explicity moved from the initial network namespace. In general policy and network stack state changes are allowed while resource control is left unchanged. Allow creating raw sockets. Allow the SIOCSARP ioctl to control the arp cache. Allow the SIOCSIFFLAG ioctl to allow setting network device flags. Allow the SIOCSIFADDR ioctl to allow setting a netdevice ipv4 address. Allow the SIOCSIFBRDADDR ioctl to allow setting a netdevice ipv4 broadcast address. Allow the SIOCSIFDSTADDR ioctl to allow setting a netdevice ipv4 destination address. Allow the SIOCSIFNETMASK ioctl to allow setting a netdevice ipv4 netmask. Allow the SIOCADDRT and SIOCDELRT ioctls to allow adding and deleting ipv4 routes. Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for adding, changing and deleting gre tunnels. Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for adding, changing and deleting ipip tunnels. Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for adding, changing and deleting ipsec virtual tunnel interfaces. Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC, MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing sockets. Allow setting and receiving IPOPT_CIPSO, IP_OPT_SEC, IP_OPT_SID and arbitrary ip options. Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option. Allow setting the IP_TRANSPARENT ipv4 socket option. Allow setting the TCP_REPAIR socket option. Allow setting the TCP_CONGESTION socket option. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-10-06sections: fix section conflicts in netAndi Kleen
Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-10netlink: Rename pid to portid to avoid confusionEric W. Biederman
It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Merge the 'net' tree to get the recent set of netfilter bug fixes in order to assist with some merge hassles Pablo is going to have to deal with for upcoming changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-30net: ipv4: ipmr_expire_timer causes crash when removing net namespaceFrancesco Ruggeri
When tearing down a net namespace, ipv4 mr_table structures are freed without first deactivating their timers. This can result in a crash in run_timer_softirq. This patch mimics the corresponding behaviour in ipv6. Locking and synchronization seem to be adequate. We are about to kfree mrt, so existing code should already make sure that no other references to mrt are pending or can be created by incoming traffic. The functions invoked here do not cause new references to mrt or other race conditions to be created. Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive. Both ipmr_expire_process (whose completion we may have to wait in del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock or other synchronizations when needed, and they both only modify mrt. Tested in Linux 3.4.8. Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-09net: Loopback ifindex is constant nowPavel Emelyanov
As pointed out, there are places, that access net->loopback_dev->ifindex and after ifindex generation is made per-net this value becomes constant equals 1. So go ahead and introduce the LOOPBACK_IFINDEX constant and use it where appropriate. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-20ipv4: Kill rt->rt_oifDavid S. Miller
Never actually used. It was being set on output routes to the original OIF specified in the flow key used for the lookup. Adjust the only user, ipmr_rt_fib_lookup(), for greater correctness of the flowi4_oif and flowi4_iif values, thanks to feedback from Julian Anastasov. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-20ipv4: Remove 'rt_mark' from 'struct rtable'David Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-10net: Fix (nearly-)kernel-doc comments for various functionsBen Hutchings
Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-27ipmr: Do not use RTA_PUT() macrosThomas Graf
Also fix a needless skb tailroom check for a 4 bytes area after after each rtnexthop block. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-07snmp: fix OutOctets counter to include forwarded datagramsVincent Bernat
RFC 4293 defines ipIfStatsOutOctets (similar definition for ipSystemStatsOutOctets): The total number of octets in IP datagrams delivered to the lower layers for transmission. Octets from datagrams counted in ipIfStatsOutTransmits MUST be counted here. And ipIfStatsOutTransmits: The total number of IP datagrams that this entity supplied to the lower layers for transmission. This includes datagrams generated locally and those forwarded by this entity. Therefore, IPSTATS_MIB_OUTOCTETS must be incremented when incrementing IPSTATS_MIB_OUTFORWDATAGRAMS. IP_UPD_PO_STATS is not used since ipIfStatsOutRequests must not include forwarded datagrams: The total number of IP datagrams that local IP user-protocols (including ICMP) supplied to IP in requests for transmission. Note that this counter does not include any datagrams counted in ipIfStatsOutForwDatagrams. Signed-off-by: Vincent Bernat <bernat@luffy.cx> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-15net: Convert net_ratelimit uses to net_<level>_ratelimitedJoe Perches
Standardize the net core ratelimited logging functions. Coalesce formats, align arguments. Change a printk then vprintk sequence to use printf extension %pV. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2012-04-02ipv4: Stop using NLA_PUT*().David S. Miller
These macros contain a hidden goto, and are thus extremely error prone and make code hard to audit. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-28Remove all #inclusions of asm/system.hDavid Howells
Remove all #inclusions of asm/system.h preparatory to splitting and killing it. Performed with the following command: perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *` Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-11net: Convert printks to pr_<level>Joe Perches
Use a more current kernel messaging style. Convert a printk block to print_hex_dump. Coalesce formats, align arguments. Use %s, __func__ instead of embedding function names. Some messages that were prefixed with <foo>_close are now prefixed with <foo>_fini. Some ah4 and esp messages are now not prefixed with "ip ". The intent of this patch is to later add something like #define pr_fmt(fmt) "IPv4: " fmt. to standardize the output messages. Text size is trivially reduced. (x86-32 allyesconfig) $ size net/ipv4/built-in.o* text data bss dec hex filename 887888 31558 249696 1169142 11d6f6 net/ipv4/built-in.o.new 887934 31558 249800 1169292 11d78c net/ipv4/built-in.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-12net: reintroduce missing rcu_assign_pointer() callsEric Dumazet
commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER) did a lot of incorrect changes, since it did a complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x, y). We miss needed barriers, even on x86, when y is not NULL. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29ipv4: remove useless codes in ipmr_device_event()RongQing.Li
Commit 7dc00c82 added a 'notify' parameter for vif_delete() to distinguish whether to unregister the device. When notify=1 means we does not need to unregister the device, so calling unregister_netdevice_many is useless. Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-31net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modulesPaul Gortmaker
These files are non modular, but need to export symbols using the macros now living in export.h -- call out the include so that things won't break when we remove the implicit presence of module.h from everywhere. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-08-12net: cleanup some rcu_dereference_rawEric Dumazet
RCU api had been completed and rcu_access_pointer() or rcu_dereference_protected() are better than generic rcu_dereference_raw() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-02rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTERStephen Hemminger
When assigning a NULL value to an RCU protected pointer, no barrier is needed. The rcu_assign_pointer, used to handle that but will soon change to not handle the special case. Convert all rcu_assign_pointer of NULL value. //smpl @@ expression P; @@ - rcu_assign_pointer(P, NULL) + RCU_INIT_POINTER(P, NULL) // </smpl> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-23ipv4: use RT_TOS after some rt_tos conversionsJulian Anastasov
rt_tos was changed to iph->tos but it must be filtered by RT_TOS Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09rtnetlink: Compute and store minimum ifinfo dump sizeGreg Rose
The message size allocated for rtnl ifinfo dumps was limited to a single page. This is not enough for additional interface info available with devices that support SR-IOV and caused a bug in which VF info would not be displayed if more than approximately 40 VFs were created per interface. Implement a new function pointer for the rtnl_register service that will calculate the amount of data required for the ifinfo dump and allocate enough data to satisfy the request. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-05-04ipv4: Pass explicit saddr/daddr args to ipmr_get_route().David S. Miller
This eliminates the need to use rt->rt_{src,dst}. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-03ipv4: Make caller provide on-stack flow key to ip_route_output_ports().David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-03ipv4: Rework ipmr_rt_fib_lookup() flow key initialization.David S. Miller
Use information from the skb as much as possible, currently this means daddr, saddr, and TOS. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-22inet: constify ip headers and in6_addrEric Dumazet
Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12ipv4: Use flowi4 in ipmr code.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12net: Put flowi_* prefix on AF independent members of struct flowiDavid S. Miller
I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12ipv4: Create and use route lookup helpers.David S. Miller
The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-09ipv4: Lookup multicast routes by rtable using helper.David S. Miller
Create a common helper for this operation, since we do it identically in three spots. Suggested by Eric Dumazet. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-04ipv4: Remove flowi from struct rtable.David S. Miller
The only necessary parts are the src/dst addresses, the interface indexes, the TOS, and the mark. The rest is unnecessary bloat, which amounts to nearly 50 bytes on 64-bit. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02ipv4: Make output route lookup return rtable directly.David S. Miller
Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-03net: Support compat SIOCGETVIFCNT ioctl in ipv4.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-03net: Fix bug in compat SIOCGETSGCNT handling.David S. Miller
Commit 709b46e8d90badda1898caea50483c12af178e96 ("net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT") added the correct plumbing to handle SIOCGETSGCNT properly. However, whilst definiting a proper "struct compat_sioc_sg_req" it isn't actually used in ipmr_compat_ioctl(). Correct this oversight. Signed-off-by: David S. Miller <davem@davemloft.net>