summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2010-12-07dccp: Policy-based packet dequeueing infrastructureTomasz Grobelny
This patch adds a generic infrastructure for policy-based dequeueing of TX packets and provides two policies: * a simple FIFO policy (which is the default) and * a priority based policy (set via socket options). Both policies honour the tx_qlen sysctl for the maximum size of the write queue (can be overridden via socket options). The priority policy uses skb->priority internally to assign an u32 priority identifier, using the same ranking as SO_PRIORITY. The skb->priority field is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary data using cmsg(3), the patch also provides the requisite parsing routines. Signed-off-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2010-12-06__in_dev_get_rtnl() can use rtnl_dereference()Eric Dumazet
If caller holds RTNL, we dont need a memory barrier (smp_read_barrier_depends) included in rcu_dereference(). Just use rtnl_dereference() to properly document the assertions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-06filter: add SKF_AD_RXHASH and SKF_AD_CPUEric Dumazet
Add SKF_AD_RXHASH and SKF_AD_CPU to filter ancillary mechanism, to be able to build advanced filters. This can help spreading packets on several sockets with a fast selection, after RPS dispatch to N cpus for example, or to catch a percentage of flows in one queue. tcpdump -s 500 "cpu = 1" : [0] ld CPU [1] jeq #1 jt 2 jf 3 [2] ret #500 [3] ret #0 # take 12.5 % of flows (average) tcpdump -s 1000 "rxhash & 7 = 2" : [0] ld RXHASH [1] and #7 [2] jeq #2 jt 3 jf 4 [3] ret #1000 [4] ret #0 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Rui <wirelesser@gmail.com> Acked-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-06usbnet: changes for upcoming cdc_ncm driverAlexey Orishko
Changes: include/linux/usb/usbnet.h: - a new flag to indicate driver's capability to accumulate IP packets in Tx direction and extract several packets from single skb in Rx direction. drivers/net/usb/usbnet.c: - the procedure of counting packets in usbnet was updated due to the accumulating of IP packets in the driver - no short packets are sent if indicated by the flag in driver_info structure Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-06tg3: Move EEE definitions into mdio.hMatt Carlson
In commit 52b02d04c801fff51ca49ad033210846d1713253 entitled "tg3: Add EEE support", Ben Hutchings had commented that the EEE advertisement register will be in a standard location. This patch moves that definition into mdio.h and changes the code to use it. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02tipc: Remove obsolete native API files and exportsAllan Stephens
As part of the removal of TIPC's native API support it is no longer necessary for TIPC to export symbols for routines that can be called by kernel-based applications, nor for it to have header files that kernel-based applications can include to access the declarations for those routines. This commit eliminates the exporting of symbols by TIPC and migrates the contents of each obsolete native API include file into its corresponding non-native API equivalent. The code which was migrated in this commit was migrated intact, in that there are no technical changes combined with the relocation. Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02net: kill unused macros from head fileShan Wei
These macros have been defined for several years since v2.6.12-rc2(tracing by git), but never be used. So remove them. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02net: snmp: fix the wrong ICMP_MIB_MAX valueShan Wei
__ICMP_MIB_MAX is equal to the total number of icmp mib, So no need to add 1. This wastes 4/8 bytes memory. Change it to be same as ICMP6_MIB_MAX, TCP_MIB_MAX, UDP_MIB_MAX. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02ipv6: Create inet6_csk_route_req().David S. Miller
Brother of ipv4's inet_csk_route_req(). Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-02ipv6: Add rt6_get_peer() helper.David S. Miller
To go along side ipv4's rt_get_peer(). Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01timewait_sock: Create and use getpeer op.David S. Miller
The only thing AF-specific about remembering the timestamp for a time-wait TCP socket is getting the peer. Abstract that behind a new timewait_sock_ops vector. Support for real IPV6 sockets is not filled in yet, but curiously this makes timewait recycling start to work for v4-mapped ipv6 sockets. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01inetpeer: Fix incorrect comment about inetpeer struct size.David S. Miller
Now with ipv6 support it is no longer less than 64 bytes. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01inetpeer: Kill use of inet_peer_address_t typedef.David S. Miller
They are verboten these days. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-01net sched: use xps information for qdisc NUMA affinityEric Dumazet
Allocate qdisc memory according to NUMA properties of cpus included in xps map. To be effective, qdisc should be (re)setup after changes of /sys/class/net/eth<n>/queues/tx-<n>/xps_cpus I added a numa_node field in struct netdev_queue, containing NUMA node if all cpus included in xps_cpus share same node, else -1. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30inet: Turn ->remember_stamp into ->get_peer in connection AF ops.David S. Miller
Then we can make a completely generic tcp_remember_stamp() that uses ->get_peer() as a helper, minimizing the AF specific code and minimizing the eventual code duplication when we implement the ipv6 side of TW recycling. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30ipv6: Add infrastructure to bind inet_peer objects to routes.David S. Miller
They are only allowed on cached ipv6 routes. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30inetpeer: Add inet_getpeer_v6()David S. Miller
Now that all of the infrastructure is in place, we can add the ipv6 shorthand for peer creation. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30inetpeer: Make inet_getpeer() take an inet_peer_adress_t pointer.David S. Miller
And make an inet_getpeer_v4() helper, update callers. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-30inetpeer: Introduce inet_peer_address_t.David S. Miller
Currently only the v4 aspect is used, but this will change. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-29Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2010-11-29xps: add __rcu annotationsEric Dumazet
Avoid sparse warnings : add __rcu annotations and use rcu_dereference_protected() where necessary. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-29sctp: kill unused macros in head fileShan Wei
1. SCTP_CMD_NUM_VERBS,SCTP_CMD_MAX These two macros have never been used for several years since v2.6.12-rc2. 2.sctp_port_rover,sctp_port_alloc_lock The commit 063930 abandoned global variables of port_rover and port_alloc_lock, but still keep two macros to refer to them. So, remove them now. commit 06393009000779b00a558fd2f280882cc7dc2008 Author: Stephen Hemminger <shemminger@linux-foundation.org> Date: Wed Oct 10 17:30:18 2007 -0700 [SCTP]: port randomization Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28xps: Add CONFIG_XPSTom Herbert
This patch adds XPS_CONFIG option to enable and disable XPS. This is done in the same manner as RPS_CONFIG. This is also fixes build failure in XPS code when SMP is not enabled. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28xfrm: fix gre key endianessTimo Teräs
fl->fl_gre_key is network byte order contrary to fl->fl_icmp_*. Make xfrm_flowi_{s|d}port return network byte order values for gre key too. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28X25 remove bkl in subscription ioctlsandrew hendry
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28net: add netif_tx_queue_frozen_or_stoppedEric Dumazet
When testing struct netdev_queue state against FROZEN bit, we also test XOFF bit. We can test both bits at once and save some cycles. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-28sctp: kill unused macro definitionShan Wei
These macros have been existed for several years since v2.6.12-rc2. But they never be used. So remove them now. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-27rtnl: make link af-specific updates atomicThomas Graf
As David pointed out correctly, updates to af-specific attributes are currently not atomic. If multiple changes are requested and one of them fails, previous updates may have been applied already leaving the link behind in a undefined state. This patch splits the function parse_link_af() into two functions validate_link_af() and set_link_at(). validate_link_af() is placed to validate_linkmsg() check for errors as early as possible before any changes to the link have been made. set_link_af() is called to commit the changes later. This method is not fail proof, while it is currently sufficient to make set_link_af() inerrable and thus 100% atomic, the validation function method will not be able to detect all error scenarios in the future, there will likely always be errors depending on states which are f.e. not protected by rtnl_mutex and thus may change between validation and setting. Also, instead of silently ignoring unknown address families and config blocks for address families which did not register a set function the errors EAFNOSUPPORT respectively EOPNOSUPPORT are returned to avoid comitting 4 out of 5 update requests without notifying the user. Signed-off-by: Thomas Graf <tgraf@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
2010-11-24cfg80211: allow using CQM event to notify packet lossJohannes Berg
This adds the ability for drivers to use CQM events to notify about packet loss for specific stations (which could be the AP for the managed mode case). Since the threshold might be determined by the driver (it isn't passed in right now) it will be passed out of the driver to userspace in the event. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24cfg80211: Add documentation for antenna opsBruno Randolf
Signed-off-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24cfg80211/mac80211: improve ad-hoc multicast rate handlingFelix Fietkau
- store the multicast rate as an index instead of the rate value (reduces cpu overhead in a hotpath) - validate the rate values (must match a bitrate in at least one sband) Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2010-11-24Revert "nl80211/mac80211: Report signal average"John W. Linville
This reverts commit 86107fd170bc379869250eb7e1bd393a3a70e8ae. This patch inadvertantly changed the userland ABI. Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-24xps: Transmit Packet SteeringTom Herbert
This patch implements transmit packet steering (XPS) for multiqueue devices. XPS selects a transmit queue during packet transmission based on configuration. This is done by mapping the CPU transmitting the packet to a queue. This is the transmit side analogue to RPS-- where RPS is selecting a CPU based on receive queue, XPS selects a queue based on the CPU (previously there was an XPS patch from Eric Dumazet, but that might more appropriately be called transmit completion steering). Each transmit queue can be associated with a number of CPUs which will use the queue to send packets. This is configured as a CPU mask on a per queue basis in: /sys/class/net/eth<n>/queues/tx-<n>/xps_cpus The mappings are stored per device in an inverted data structure that maps CPUs to queues. In the netdevice structure this is an array of num_possible_cpu structures where each structure holds and array of queue_indexes for queues which that CPU can use. The benefits of XPS are improved locality in the per queue data structures. Also, transmit completions are more likely to be done nearer to the sending thread, so this should promote locality back to the socket on free (e.g. UDP). The benefits of XPS are dependent on cache hierarchy, application load, and other factors. XPS would nominally be configured so that a queue would only be shared by CPUs which are sharing a cache, the degenerative configuration woud be that each CPU has it's own queue. Below are some benchmark results which show the potential benfit of this patch. The netperf test has 500 instances of netperf TCP_RR test with 1 byte req. and resp. bnx2x on 16 core AMD XPS (16 queues, 1 TX queue per CPU) 1234K at 100% CPU No XPS (16 queues) 996K at 100% CPU Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24xps: Improvements in TX queue selectionTom Herbert
In dev_pick_tx, don't do work in calculating queue index or setting the index in the sock unless the device has more than one queue. This allows the sock to be set only with a queue index of a multi-queue device which is desirable if device are stacked like in a tunnel. We also allow the mapping of a socket to queue to be changed. To maintain in order packet transmission a flag (ooo_okay) has been added to the sk_buff structure. If a transport layer sets this flag on a packet, the transmit queue can be changed for the socket. Presumably, the transport would set this if there was no possbility of creating OOO packets (for instance, there are no packets in flight for the socket). This patch includes the modification in TCP output for setting this flag. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24scm: lower SCM_MAX_FDEric Dumazet
Lower SCM_MAX_FD from 255 to 253 so that allocations for scm_fp_list are halved. (commit f8d570a4 added two pointers in this structure) scm_fp_dup() should not copy whole structure (and trigger kmemcheck warnings), but only the used part. While we are at it, only allocate needed size. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24ipv6: mcast: RCU conversionEric Dumazet
ipv6_sk_mc_lock rwlock becomes a spinlock. readers (inet6_mc_check()) now takes rcu_read_lock() instead of read lock. Writers dont need to disable BH anymore. struct ipv6_mc_socklist objects are reclaimed after one RCU grace period. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-24stmmac: add init/exit callback in plat_stmmacenet_data structGiuseppe CAVALLARO
This patch adds in the plat_stmmacenet_data the init and exit callbacks that can be used for invoking specific platform functions. For example, on ST targets, these call the PAD manager functions to set PIO lines and syscfg registers. The patch removes the stmmac_claim_resource only used on STM Kernels as well. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-22cfg80211: Fix regulatory bug with multiple cards and delaysLuis R. Rodriguez
When two cards are connected with the same regulatory domain if CRDA had a delayed response then cfg80211's own set regulatory domain would still be the world regulatory domain. There was a bug on cfg80211's logic such that it assumed that once you pegged a request as the last request it was already the currently set regulatory domain. This would mean we would race setting a stale regulatory domain to secondary cards which had the same regulatory domain since the alpha2 would match. We fix this by processing each regulatory request atomically, and only move on to the next one once we get it fully processed. In the case CRDA is not present we will simply world roam. This issue is only present when you have a slow system and the CRDA processing is delayed. Because of this it is not a known regression. Without this fix when a delay is present with CRDA the second card would end up with an intersected regulatory domain and not allow it to use the channels it really is designed for. When two cards with two different regulatory domains were inserted you'd end up rejecting the second card's regulatory domain request. This fails with mac80211_hswim's regtest=2 (two requests, same alpha2) and regtest=3 (two requests, different alpha2) module parameter options. This was reproduced and tested against mac80211_hwsim using this CRDA delayer: #!/bin/bash echo $COUNTRY >> /tmp/log sleep 2 /sbin/crda.orig And these regulatory tests: modprobe mac80211_hwsim regtest=2 modprobe mac80211_hwsim regtest=3 Reported-by: Mark Mentovai <mark@moxienet.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Tested-by: Mark Mentovai <mark@moxienet.com> Tested-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-22ssb: b43-pci-bridge: Add new vendor for BCM4318Daniel Klaffenbach
Add new vendor for Broadcom 4318. Signed-off-by: Daniel Klaffenbach <danielklaffenbach@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@kernel.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-22macvlan: Introduce 'passthru' mode to takeover the underlying deviceSridhar Samudrala
With the current default 'vepa' mode, a KVM guest using virtio with macvtap backend has the following limitations. - cannot change/add a mac address on the guest virtio-net - cannot create a vlan device on the guest virtio-net - cannot enable promiscuous mode on guest virtio-net To address these limitations, this patch introduces a new mode called 'passthru' when creating a macvlan device which allows takeover of the underlying device and passing it to a guest using virtio with macvtap backend. Only one macvlan device is allowed in passthru mode and it inherits the mac address from the underlying device and sets it in promiscuous mode to receive and forward all the packets. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> ------------------------------------------------------------------------- Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-21netns: let net_generic take pointer-to-const argsJan Engelhardt
This commit is same in nature as v2.6.37-rc1-755-g3654654; the network namespace itself is not modified when calling net_generic, so the parameter can be const. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-19Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bonding/bond_main.c net/core/net-sysfs.c net/ipv6/addrconf.c
2010-11-19filter: optimize sk_run_filterEric Dumazet
Remove pc variable to avoid arithmetic to compute fentry at each filter instruction. Jumps directly manipulate fentry pointer. As the last instruction of filter[] is guaranteed to be a RETURN, and all jumps are before the last instruction, we dont need to check filter bounds (number of instructions in filter array) at each iteration, so we remove it from sk_run_filter() params. On x86_32 remove f_k var introduced in commit 57fe93b374a6b871 (filter: make sure filters dont read uninitialized memory) Note : We could use a CONFIG_ARCH_HAS_{FEW|MANY}_REGISTERS in order to avoid too many ifdefs in this code. This helps compiler to use cpu registers to hold fentry and A accumulator. On x86_32, this saves 401 bytes, and more important, sk_run_filter() runs much faster because less register pressure (One less conditional branch per BPF instruction) # size net/core/filter.o net/core/filter_pre.o text data bss dec hex filename 2948 0 0 2948 b84 net/core/filter.o 3349 0 0 3349 d15 net/core/filter_pre.o on x86_64 : # size net/core/filter.o net/core/filter_pre.o text data bss dec hex filename 5173 0 0 5173 1435 net/core/filter.o 5224 0 0 5224 1468 net/core/filter_pre.o Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-18Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2010-11-18nl80211/mac80211: Report signal averageBruno Randolf
Extend nl80211 to report an exponential weighted moving average (EWMA) of the signal value. Since the signal value usually fluctuates between different packets, an average can be more useful than the value of the last packet. This uses the recently added generic EWMA library function. Signed-off-by: Bruno Randolf <br1@einfach.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-18lib: Add generic exponentially weighted moving average (EWMA) functionBruno Randolf
This adds generic functions for calculating Exponentially Weighted Moving Averages (EWMA). This implementation makes use of a structure which keeps the EWMA parameters and a scaled up internal representation to reduce rounding errors. The original idea for this implementation came from the rt2x00 driver (rt2x00link.c). I would like to use it in several places in the mac80211 and ath5k code and I hope it can be useful in many other places in the kernel code. Signed-off-by: Bruno Randolf <br1@einfach.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-18net: move definitions of BPF_S_* to net/core/filter.cChangli Gao
BPF_S_* are used internally, should not be exposed to the others. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-18net: Fix duplicate volatile warning.Tetsuo Handa
jiffies is defined as "volatile". extern unsigned long volatile __jiffy_data jiffies; ACCESS_ONCE() uses "volatile". As a result, some compilers warn duplicate `volatile' for ACCESS_ONCE(jiffies). Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>