summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2010-11-17netfilter: allow hooks to pass error code back up the stackEric Paris
SELinux would like to pass certain fatal errors back up the stack. This patch implements the generic netfilter support for this functionality. Based-on-patch-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16net: reorder struct sock fieldsEric Dumazet
Right now, fields in struct sock are not optimally ordered, because each path (RX softirq, TX completion, RX user, TX user) has to touch fields that are contained in many different cache lines. The really critical thing is to shrink number of cache lines that are used at RX softirq time : CPU handling softirqs for a device can receive many frames per second for many sockets. If load is too big, we can drop frames at NIC level. RPS or multiqueue cards can help, but better reduce latency if possible. This patch starts with UDP protocol, then additional patches will try to reduce latencies of other ones as well. At RX softirq time, fields of interest for UDP protocol are : (not counting ones in inet struct for the lookup) Read/Written: sk_refcnt (atomic increment/decrement) sk_rmem_alloc & sk_backlog.len (to check if there is room in queues) sk_receive_queue sk_backlog (if socket locked by user program) sk_rxhash sk_forward_alloc sk_drops Read only: sk_rcvbuf (sk_rcvqueues_full()) sk_filter sk_wq sk_policy[0] sk_flags Additional notes : - sk_backlog has one hole on 64bit arches. We can fill it to save 8 bytes. - sk_backlog is used only if RX sofirq handler finds the socket while locked by user. - sk_rxhash is written only once per flow. - sk_drops is written only if queues are full Final layout : [1] One section grouping all read/write fields, but placing rxhash and sk_backlog at the end of this section. [2] One section grouping all read fields in RX handler (sk_filter, sk_rcv_buf, sk_wq) [3] Section used by other paths I'll post a patch on its own to put sk_refcnt at the end of struct sock_common so that it shares same cache line than section [1] New offsets on 64bit arch : sizeof(struct sock)=0x268 offsetof(struct sock, sk_refcnt) =0x10 offsetof(struct sock, sk_lock) =0x48 offsetof(struct sock, sk_receive_queue)=0x68 offsetof(struct sock, sk_backlog)=0x80 offsetof(struct sock, sk_rmem_alloc)=0x80 offsetof(struct sock, sk_forward_alloc)=0x98 offsetof(struct sock, sk_rxhash)=0x9c offsetof(struct sock, sk_rcvbuf)=0xa4 offsetof(struct sock, sk_drops) =0xa0 offsetof(struct sock, sk_filter)=0xa8 offsetof(struct sock, sk_wq)=0xb0 offsetof(struct sock, sk_policy)=0xd0 offsetof(struct sock, sk_flags) =0xe0 Instead of : sizeof(struct sock)=0x270 offsetof(struct sock, sk_refcnt) =0x10 offsetof(struct sock, sk_lock) =0x50 offsetof(struct sock, sk_receive_queue)=0xc0 offsetof(struct sock, sk_backlog)=0x70 offsetof(struct sock, sk_rmem_alloc)=0xac offsetof(struct sock, sk_forward_alloc)=0x10c offsetof(struct sock, sk_rxhash)=0x128 offsetof(struct sock, sk_rcvbuf)=0x4c offsetof(struct sock, sk_drops) =0x16c offsetof(struct sock, sk_filter)=0x198 offsetof(struct sock, sk_wq)=0x88 offsetof(struct sock, sk_policy)=0x98 offsetof(struct sock, sk_flags) =0x130 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16udp: use atomic_inc_not_zero_hintEric Dumazet
UDP sockets refcount is usually 2, unless an incoming frame is going to be queued in receive or backlog queue. Using atomic_inc_not_zero_hint() permits to reduce latency, because processor issues less memory transactions. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16macvlan: lockless tx pathEric Dumazet
macvlan is a stacked device, like tunnels. We should use the lockless mechanism we are using in tunnels and loopback. This patch completely removes locking in TX path. tx stat counters are added into existing percpu stat structure, renamed from rx_stats to pcpu_stats. Note : this reverts commit 2c11455321f37 (macvlan: add multiqueue capability) Note : rx_errors converted to a 32bit counter, like tx_dropped, since they dont need 64bit range. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Ben Greear <greearb@candelatech.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16netlink: let nlmsg and nla functions take pointer-to-const argsJan Engelhardt
The changed functions do not modify the NL messages and/or attributes at all. They should use const (similar to strchr), so that callers which have a const nlmsg/nlattr around can make use of them without casting. While at it, constify a data array. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2010-11-15netdev: add rcu annotations to receive handler hookstephen hemminger
Suggested by Eric's bridge RCU changes. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15bridge: add proper RCU annotation to should_route_hookEric Dumazet
Add br_should_route_hook_t typedef, this is the only way we can get a clean RCU implementation for function pointer. Move route_hook to location where it is used. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15include/net/caif/cfctrl.h: Remove unnecessary semicolonsJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15include/linux/if_macvlan.h: Remove unnecessary semicolonsJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6David S. Miller
2010-11-15net: Simplify RX queue allocationTom Herbert
This patch move RX queue allocation to alloc_netdev_mq and freeing of the queues to free_netdev (symmetric to TX queue allocation). Each kobject RX queue takes a reference to the queue's device so that the device can't be freed before all the kobjects have been released-- this obviates the need for reference counts specific to RX queues. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15xfrm: use gre key as flow upper protocol infoTimo Teräs
The GRE Key field is intended to be used for identifying an individual traffic flow within a tunnel. It is useful to be able to have XFRM policy selector matches to have different policies for different GRE tunnels. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15bitops: Provide generic sign_extend32 functionAndreas Herrmann
This patch moves code out from wireless drivers where two different functions are defined in three code locations for the same purpose and provides a common function to sign extend a 32-bit value. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-15wl1271: ref_clock cosmetic changesGery Kahn
Cosmetic cleanup for ref_clock code while configured by board. Signed-off-by: Gery Kahn <geryk@ti.com> Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
2010-11-15cfg80211: fix disabling channels based on hintsLuis R. Rodriguez
After a module loads you will have loaded the world roaming regulatory domain or a custom regulatory domain. Further regulatory hints are welcomed and should be respected unless the regulatory hint is coming from a country IE as the IEEE spec allows for a country IE to be a subset of what is allowed by the local regulatory agencies. So disable all channels that do not fit a regulatory domain sent from a unless the hint is from a country IE and the country IE had no information about the band we are currently processing. This fixes a few regulatory issues, for example for drivers that depend on CRDA and had no 5 GHz freqencies allowed were not properly disabling 5 GHz at all, furthermore it also allows users to restrict devices further as was intended. If you recieve a country IE upon association we will also disable the channels that are not allowed if the country IE had at least one channel on the respective band we are procesing. This was the original intention behind this design but it was completely overlooked... Cc: David Quan <david.quan@atheros.com> Cc: Jouni Malinen <jouni.malinen@atheros.com> cc: Easwar Krishnan <easwar.krishnan@atheros.com> Cc: stable@kernel.org Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-15cfg80211: fix allowing country IEs for WIPHY_FLAG_STRICT_REGULATORYLuis R. Rodriguez
We should be enabling country IE hints for WIPHY_FLAG_STRICT_REGULATORY even if we haven't yet recieved regulatory domain hint for the driver if it needed one. Without this Country IEs are not passed on to drivers that have set WIPHY_FLAG_STRICT_REGULATORY, today this is just all Atheros chipset drivers: ath5k, ath9k, ar9170, carl9170. This was part of the original design, however it was completely overlooked... Cc: Easwar Krishnan <easwar.krishnan@atheros.com> Cc: stable@kernel.org Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-15rfkill: remove dead codeStephen Hemminger
The following code is defined but never used. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-11-15offloading: Force software GSO for multiple vlan tags.Jesse Gross
We currently use vlan_features to check for TSO support if there is a vlan tag. However, it's quite likely that the NIC is not able to do TSO when there is an arbitrary number of tags. Therefore if there is more than one tag (in-band or out-of-band), fall back to software emulation. Signed-off-by: Jesse Gross <jesse@nicira.com> CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-15dccp ccid-2: Schedule Sync as out-of-band mechanismGerrit Renker
The problem with Ack Vectors is that i) their length is variable and can in principle grow quite large, ii) it is hard to predict exactly how large they will be. Due to the second point it seems not a good idea to reduce the MPS; in particular when on average there is enough room for the Ack Vector and an increase in length is momentarily due to some burst loss, after which the Ack Vector returns to its normal/average length. The solution taken by this patch is to subtract a minimum-expected Ack Vector length from the MPS, and to defer any larger Ack Vectors onto a separate Sync - but only if indeed there is no space left on the skb. This patch provides the infrastructure to schedule Sync-packets for transporting (urgent) out-of-band data. Its signalling is quicker than scheduling an Ack, since it does not need to wait for new application data. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
2010-11-14Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2010-11-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (66 commits) can-bcm: fix minor heap overflow gianfar: Do not call device_set_wakeup_enable() under a spinlock ipv6: Warn users if maximum number of routes is reached. docs: Add neigh/gc_thresh3 and route/max_size documentation. axnet_cs: fix resume problem for some Ax88790 chip ipv6: addrconf: don't remove address state on ifdown if the address is being kept tcp: Don't change unlocked socket state in tcp_v4_err(). x25: Prevent crashing when parsing bad X.25 facilities cxgb4vf: add call to Firmware to reset VF State. cxgb4vf: Fail open if link_start() fails. cxgb4vf: flesh out PCI Device ID Table ... cxgb4vf: fix some errors in Gather List to skb conversion cxgb4vf: fix bug in Generic Receive Offload cxgb4vf: don't implement trivial (and incorrect) ndo_select_queue() ixgbe: Look inside vlan when determining offload protocol. bnx2x: Look inside vlan when determining checksum proto. vlan: Add function to retrieve EtherType from vlan packets. virtio-net: init link state correctly ucc_geth: Fix deadlock ucc_geth: Do not bring the whole IF down when TX failure. ...
2010-11-12Merge branch 'usb-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (28 commits) Revert "USB: xhci: Use GFP_ATOMIC under spin_lock" USB: ohci-jz4740: Fix spelling in MODULE_ALIAS UWB: Return UWB_RSV_ALLOC_NOT_FOUND rather than crashing on NULL dereference if kzalloc fails usb: core: fix information leak to userland usb: misc: iowarrior: fix information leak to userland usb: misc: sisusbvga: fix information leak to userland usb: subtle increased memory usage in u_serial USB: option: fix when the driver is loaded incorrectly for some Huawei devices. USB: xhci: Use GFP_ATOMIC under spin_lock usb: gadget: goku_udc: add registered flag bit, fixing build USB: ehci/mxc: compile fix USB: Fix FSL USB driver on non Open Firmware systems USB: the development of the usb tree is now in git usb: musb: fail unaligned DMA transfers on v1.8 and above USB: ftdi_sio: add device IDs for Milkymist One JTAG/serial usb.h: fix ioctl kernel-doc info usb: musb: gadget: kill duplicate code in musb_gadget_queue() usb: musb: Fix handling of spurious SESSREQ usb: musb: fix kernel oops when loading musb_hdrc module for the 2nd time USB: musb: blackfin: push clkin value to platform resources ...
2010-11-12Merge branch 'tty-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 * 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: n_gsm: Fix length handling n_gsm: Copy n2 over when configuring via ioctl interface serial: bfin_5xx: grab port lock before making port termios changes serial: bfin_5xx: disable CON_PRINTBUFFER for consoles serial: bfin_5xx: remove redundant SSYNC to improve TX speed serial: bfin_5xx: always include DMA headers vcs: make proper usage of the poll flags amiserial: Remove unused variable icount 8250: Fix tcsetattr to avoid ioctl(TIOCMIWAIT) hang tty_ldisc: Fix BUG() on hangup TTY: restore tty_ldisc_wait_idle SERIAL: blacklist si3052 chip drivers/serial/bfin_5xx.c: Fix line continuation defects tty: prevent DOS in the flush_to_ldisc 8250: add support for Kouwell KW-L221N-2 nozomi: Fix warning from the previous TIOCGCOUNT changes tty: fix warning in synclink driver tty: Fix formatting in tty.h tty: the development tree is now done in git
2010-11-12igmp: RCU conversion of in_dev->mc_listEric Dumazet
in_dev->mc_list is protected by one rwlock (in_dev->mc_list_lock). This can easily be converted to a RCU protection. Writers hold RTNL, so mc_list_lock is removed, not replaced by a spinlock. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Cypher Wu <cypher.w@gmail.com> Cc: Américo Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-12vlan: Add function to retrieve EtherType from vlan packets.Hao Zheng
Depending on how a packet is vlan tagged (i.e. hardware accelerated or not), the encapsulated protocol is stored in different locations. This provides a consistent method of accessing that protocol, which is needed by drivers, security checks, etc. Signed-off-by: Hao Zheng <hzheng@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6David S. Miller
2010-11-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: do not pass injected events back to the originating handler Input: pcf8574_keypad - fix error handling in pcf8574_kp_probe Input: acecad - fix a memory leak in usb_acecad_probe error path Input: atkbd - add 'terminal' parameter for IBM Terminal keyboards Input: i8042 - add Sony VAIOs to MUX blacklist kgdboc: reset input devices (keyboards) when exiting debugger Input: export input_reset_device() for use in KGDB Input: adp5588-keys - unify common header defines
2010-11-12Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (27 commits) block: remove unused copy_io_context() Documentation: remove anticipatory scheduler info block: remove REQ_HARDBARRIER ioprio: rcu_read_lock/unlock protect find_task_by_vpid call (V2) ioprio: fix RCU locking around task dereference block: ioctl: fix information leak to userland block: read i_size with i_size_read() cciss: fix proc warning on attempt to remove non-existant directory bio: take care not overflow page count when mapping/copying user data block: limit vec count in bio_kmalloc() and bio_alloc_map_data() block: take care not to overflow when calculating total iov length block: check for proper length of iov entries in blk_rq_map_user_iov() cciss: remove controllers supported by hpsa cciss: use usleep_range not msleep for small sleeps cciss: limit commands allocated on reset_devices cciss: Use kernel provided PCI state save and restore functions cciss: fix board status waiting code drbd: Removed checks for REQ_HARDBARRIER on incomming BIOs drbd: REQ_HARDBARRIER -> REQ_FUA transition for meta data accesses drbd: Removed the BIO_RW_BARRIER support form the receiver/epoch code ...
2010-11-12Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf, amd: Use kmalloc_node(,__GFP_ZERO) for northbridge structure allocation perf_events: Fix time tracking in samples perf trace: update usage perf trace: update Documentation with new perf trace variants perf trace: live-mode command-line cleanup perf trace record: handle commands correctly perf record: make the record options available outside perf record perf trace scripting: remove system-wide param from shell scripts perf trace scripting: fix some small memory leaks and missing error checks perf: Fix usages of profile_cpu in builtin-top.c to use cpu_list perf, ui: Eliminate stack-smashing protection compiler complaint
2010-11-12Merge branch 'drm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (39 commits) drm/ttm: Be consistent on ttm_bo_init() failures drm/radeon/kms: Fix retrying ttm_bo_init() after it failed once. drm/radeon/kms: fix thermal sensor reporting on rv6xx drm/radeon/kms: fix bugs in ddc and cd path router code drm/radeon/kms: add support for clock/data path routers drm: vmwgfx: fix information leak to userland drivers/gpu: Use vzalloc drm/vmwgfx: Fix oops on failing bo pin drm/ttm: Remove the CAP_SYS_ADMIN requirement for bo pinning drm/ttm: Make sure a sync object doesn't disappear while we use it drm/radeon/kms: don't disable shared encoders on pre-DCE3 display blocks drivers/gpu/drm: Update WARN uses drivers/gpu/drm/vmwgfx: Fix k.alloc switched arguments DRM: ignore invalid EDID extensions drm/radeon/kms: make the connector code less verbose drm/ttm: remove failed ttm binding error printout drm/ttm: Add a barrier when unreserving drm/ttm: Remove mm init error printouts and checks drm/ttm: Remove pointless list_empty check drm/ttm: Use private locks for the default bo range manager ...
2010-11-12Merge branch 'for-2.6.37' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: locks: remove dead lease error-handling code locks: fix leak on merging leases nfsd4: fix 4.1 connection registration race
2010-11-12backlight: add low threshold to pwm backlightArun Murthy
The intensity of the backlight can be varied from a range of max_brightness to zero. Though most, if not all the pwm based backlight devices start flickering at lower brightness value. And also for each device there exists a brightness value below which the backlight appears to be turned off though the value is not equal to zero. If the range of brightness for a device is from zero to max_brightness. A graph is plotted for brightness Vs intensity for the pwm based backlight device has to be a linear graph. intensity | / | / | / |/ --------- 0 max_brightness But pratically on measuring the above we note that the intensity of backlight goes to zero(OFF) when the value in not zero almost nearing to zero(some x%). so the graph looks like intensity | / | / | / | | ------------ 0 x max_brightness In order to overcome this drawback knowing this x% i.e nothing but the low threshold beyond which the backlight is off and will have no effect, the brightness value is being offset by the low threshold value(retaining the linearity of the graph). Now the graph becomes intensity | / | / | / | / ------------- 0 max_brightness With this for each and every digit increment in the brightness from zero there is a change in the intensity of backlight. Devices having this behaviour can set the low threshold brightness(lth_brightness) and pass the same as platform data else can have it as zero. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Arun Murthy <arun.murthy@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Acked-by: Richard Purdie <rpurdie@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-12leds: driver for National Semiconductors LP5523 chipSamu Onkalo
LP5523 chip is nine channel led driver with programmable engines. Driver provides support for that chip for direct access via led class or via programmable engines. Signed-off-by: Samu Onkalo <samu.p.onkalo@nokia.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Jean Delvare <khali@linux-fr.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-12leds: driver for National Semiconductor LP5521 chipSamu Onkalo
This patchset provides support for LP5521 and LP5523 LED driver chips from National Semicondutor. Both drivers supports programmable engines and naturally LED class features. Documentation is provided as a part of the patchset. I created "leds" subdirectory under Documentation. Perhaps the rest of the leds* documentation should be moved there. Datasheets are freely available at National Semiconductor www pages. This patch: LP5521 chip is three channel led driver with programmable engines. Driver provides support for that chip for direct access via led class or via programmable engines. Signed-off-by: Samu Onkalo <samu.p.onkalo@nokia.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Jean Delvare <khali@linux-fr.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-12led-class: always implement blinkingJohannes Berg
Currently, blinking LEDs can be awkward because it is not guaranteed that all LEDs implement blinking. The trigger that wants it to blink then needs to implement its own timer solution. Rather than require that, add led_blink_set() API that triggers can use. This function will attempt to use hw blinking, but if that fails implements a timer for it. To stop blinking again, brightness_set() also needs to be wrapped into API that will stop the software blink. As a result of this, the timer trigger becomes a very trivial one, and hopefully we can finally see triggers using blinking as well because it's always easy to use. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Richard Purdie <rpurdie@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-12radix-tree: fix RCU bugNick Piggin
Salman Qazi describes the following radix-tree bug: In the following case, we get can get a deadlock: 0. The radix tree contains two items, one has the index 0. 1. The reader (in this case find_get_pages) takes the rcu_read_lock. 2. The reader acquires slot(s) for item(s) including the index 0 item. 3. The non-zero index item is deleted, and as a consequence the other item is moved to the root of the tree. The place where it used to be is queued for deletion after the readers finish. 3b. The zero item is deleted, removing it from the direct slot, it remains in the rcu-delayed indirect node. 4. The reader looks at the index 0 slot, and finds that the page has 0 ref count 5. The reader looks at it again, hoping that the item will either be freed or the ref count will increase. This never happens, as the slot it is looking at will never be updated. Also, this slot can never be reclaimed because the reader is holding rcu_read_lock and is in an infinite loop. The fix is to re-use the same "indirect" pointer case that requires a slot lookup retry into a general "retry the lookup" bit. Signed-off-by: Nick Piggin <npiggin@kernel.dk> Reported-by: Salman Qazi <sqazi@google.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-12Restrict unprivileged access to kernel syslogDan Rosenberg
The kernel syslog contains debugging information that is often useful during exploitation of other vulnerabilities, such as kernel heap addresses. Rather than futilely attempt to sanitize hundreds (or thousands) of printk statements and simultaneously cripple useful debugging functionality, it is far simpler to create an option that prevents unprivileged users from reading the syslog. This patch, loosely based on grsecurity's GRKERNSEC_DMESG, creates the dmesg_restrict sysctl. When set to "0", the default, no restrictions are enforced. When set to "1", only users with CAP_SYS_ADMIN can read the kernel syslog via dmesg(8) or other mechanisms. [akpm@linux-foundation.org: explain the config option in kernel.txt] Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Eugene Teo <eugeneteo@kernel.org> Acked-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-12include/linux/highmem.h needs hardirq.hCatalin Marinas
Commit 3e4d3af501cc ("mm: stack based kmap_atomic()") introduced the kmap_atomic_idx_push() function which warns on in_irq() with CONFIG_DEBUG_HIGHMEM enabled. This patch includes linux/hardirq.h for the in_irq definition. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-12atomic: add atomic_inc_not_zero_hint()Eric Dumazet
Followup of perf tools session in Netfilter WorkShop 2010 In the network stack we make high usage of atomic_inc_not_zero() in contexts we know the probable value of atomic before increment (2 for udp sockets for example) Using a special version of atomic_inc_not_zero() giving this hint can help processor to use less bus transactions. On x86 (MESI protocol) for example, this avoids entering Shared state, because "lock cmpxchg" issues an RFO (Read For Ownership) akpm: Adds a new include/linux/atomic.h. This means that new code should henceforth include linux/atomic.h and not asm/atomic.h. The presence of include/linux/atomic.h will in fact cause checkpatch.pl to warn about use of asm/atomic.h. The new include/linux/atomic.h becomes the place where arch-neutral atomic_t code should be placed. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-12include/linux/resource.h needs types.hJean Delvare
Fix the following warning: usr/include/linux/resource.h:49: found __[us]{8,16,32,64} type without #include <linux/types.h> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-12netfilter: NF_HOOK_COND has wrong conditionalEric Paris
The NF_HOOK_COND returns 0 when it shouldn't due to what I believe to be an error in the code as the order of operations is not what was intended. C will evalutate == before =. Which means ret is getting set to the bool result, rather than the return value of the function call. The code says if (ret = function() == 1) when it meant to say: if ((ret = function()) == 1) Normally the compiler would warn, but it doesn't notice it because its a actually complex conditional and so the wrong code is wrapped in an explict set of () [exactly what the compiler wants you to do if this was intentional]. Fixing this means that errors when netfilter denies a packet get propagated back up the stack rather than lost. Problem introduced by commit 2249065f (netfilter: get rid of the grossness in netfilter.h). Signed-off-by: Eric Paris <eparis@redhat.com> Cc: stable@kernel.org Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-11-11ipv4: Make rt->fl.iif tests lest obscure.David S. Miller
When we test rt->fl.iif against zero, we're seeing if it's an output or an input route. Make that explicit with some helper functions. Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-11net: get rid of rtable->idevEric Dumazet
It seems idev field in struct rtable has no special purpose, but adding extra atomic ops. We hold refcounts on the device itself (using percpu data, so pretty cheap in current kernel). infiniband case is solved using dst.dev instead of idev->dev Removal of this field means routing without route cache is now using shared data, percpu data, and only potential contention is a pair of atomic ops on struct neighbour per forwarded packet. About 5% speedup on routing test. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-11neigh: reorder struct neighbourEric Dumazet
It is important to move nud_state outside of the often modified cache line (because of refcnt), to reduce false sharing in neigh_event_send() This is a followup of commit 0ed8ddf4045f (neigh: Protect neigh->ha[] with a seqlock) This gives a 7% speedup on routing test with IP route cache disabled. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-11block: remove unused copy_io_context()Jens Axboe
Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-10perf_events: Fix time tracking in samplesStephane Eranian
This patch corrects time tracking in samples. Without this patch both time_enabled and time_running are bogus when user asks for PERF_SAMPLE_READ. One uses PERF_SAMPLE_READ to sample the values of other counters in each sample. Because of multiplexing, it is necessary to know both time_enabled, time_running to be able to scale counts correctly. In this second version of the patch, we maintain a shadow copy of ctx->time which allows us to compute ctx->time without calling update_context_time() from NMI context. We avoid the issue that update_context_time() must always be called with ctx->lock held. We do not keep shadow copies of the other event timings because if the lead event is overflowing then it is active and thus it's been scheduled in via event_sched_in() in which case neither tstamp_stopped, tstamp_running can be modified. This timing logic only applies to samples when PERF_SAMPLE_READ is used. Note that this patch does not address timing issues related to sampling inheritance between tasks. This will be addressed in a future patch. With this patch, the libpfm4 example task_smpl now reports correct counts (shown on 2.4GHz Core 2): $ task_smpl -p 2400000000 -e unhalted_core_cycles:u,instructions_retired:u,baclears noploop 5 noploop for 5 seconds IIP:0x000000004006d6 PID:5596 TID:5596 TIME:466,210,211,430 STREAM_ID:33 PERIOD:2,400,000,000 ENA=1,010,157,814 RUN=1,010,157,814 NR=3 2,400,000,254 unhalted_core_cycles:u (33) 2,399,273,744 instructions_retired:u (34) 53,340 baclears (35) Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4cc6e14b.1e07e30a.256e.5190@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-10net: avoid limits overflowEric Dumazet
Robin Holt tried to boot a 16TB machine and found some limits were reached : sysctl_tcp_mem[2], sysctl_udp_mem[2] We can switch infrastructure to use long "instead" of "int", now atomic_long_t primitives are available for free. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Robin Holt <holt@sgi.com> Reviewed-by: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-10block: remove REQ_HARDBARRIERChristoph Hellwig
REQ_HARDBARRIER is dead now, so remove the leftovers. What's left at this point is: - various checks inside the block layer. - sanity checks in bio based drivers. - now unused bio_empty_barrier helper. - Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's dead for a while, but Xen really needs to sort out it's barrier situaton. - setting of ordered tags in uas - dead code copied from old scsi drivers. - scsi different retry for barriers - it's dead and should have been removed when flushes were converted to FS requests. - blktrace handling of barriers - removed. Someone who knows blktrace better should add support for REQ_FLUSH and REQ_FUA, though. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-10Merge branch 'for-2.6.37/drivers' into for-linusJens Axboe
Conflicts: drivers/block/cciss.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>