summaryrefslogtreecommitdiffstats
path: root/lib
AgeCommit message (Collapse)Author
2013-10-31lib/scatterlist.c: don't flush_kernel_dcache_page on slab pageMing Lei
Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper functions") introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page() on pages in SG list after these pages are written to. Unfortunately, the commit may introduce a potential bug: - Before sending some SCSI commands, kmalloc() buffer may be passed to block layper, so flush_kernel_dcache_page() can see a slab page finally - According to cachetlb.txt, flush_kernel_dcache_page() is only called on "a user page", which surely can't be a slab page. - ARCH's implementation of flush_kernel_dcache_page() may use page mapping information to do optimization so page_mapping() will see the slab page, then VM_BUG_ON() is triggered. Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled, and this patch fixes the bug by adding test of '!PageSlab(miter->page)' before calling flush_kernel_dcache_page(). Signed-off-by: Ming Lei <ming.lei@canonical.com> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Tested-by: Simon Baatz <gmbnomis@gmail.com> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Tejun Heo <tj@kernel.org> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> [3.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-29Kconfig: make KOBJECT_RELEASE debugging require timer debuggingLinus Torvalds
Without the timer debugging, the delayed kobject release will just result in undebuggable oopses if it triggers any latent bugs. That doesn't actually help debugging at all. So make DEBUG_KOBJECT_RELEASE depend on DEBUG_OBJECTS_TIMERS to avoid having people enable one without the other. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-16percpu_refcount: export symbolsMatias Bjorling
Export the interface to be used within modules. Signed-off-by: Matias Bjorling <m@bjorling.me> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-10kobject: show debug info on delayed kobject releaseFengguang Wu
Useful for locating buggy drivers on kernel oops. It may add dozens of new lines to boot dmesg. DEBUG_KOBJECT_RELEASE is hopefully only enabled in debug kernels (like maybe the Fedora rawhide one, or at developers), so being a bit more verbose is likely ok. Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking changes from David Miller: 1) Multiply in netfilter IPVS can overflow when calculating destination weight. From Simon Kirby. 2) Use after free fixes in IPVS from Julian Anastasov. 3) SFC driver bug fixes from Daniel Pieczko. 4) Memory leak in pcan_usb_core failure paths, from Alexey Khoroshilov. 5) Locking and encapsulation fixes to serial line CAN driver, from Andrew Naujoks. 6) Duplex and VF handling fixes to bnx2x driver from Yaniv Rosner, Eilon Greenstein, and Ariel Elior. 7) In lapb, if no other packets are outstanding, T1 timeouts actually stall things and no packet gets sent. Fix from Josselin Costanzi. 8) ICMP redirects should not make it to the socket error queues, from Duan Jiong. 9) Fix bugs in skge DMA mapping error handling, from Nikulas Patocka. 10) Fix setting of VLAN priority field on via-rhine driver, from Roget Luethi. 11) Fix TX stalls and VLAN promisc programming in be2net driver from Ajit Khaparde. 12) Packet padding doesn't get handled correctly in new usbnet SG support code, from Ming Lei. 13) Fix races in netdevice teardown wrt. network namespace closing. From Eric W. Biederman. 14) Fix potential missed initialization of net_secret if not TCP connections are openned. From Eric Dumazet. 15) Cinterion PLXX product ID in qmi_wwan driver is wrong, from Aleksander Morgado. 16) skb_cow_head() can change skb->data and thus packet header pointers, don't use stale ip_hdr reference in ip_tunnel code. 17) Backend state transition handling fixes in xen-netback, from Paul Durrant. 18) Packet offset for AH protocol is handled wrong in flow dissector, from Eric Dumazet. 19) Taking down an fq packet scheduler instance can leave stale packets in the queues, fix from Eric Dumazet. 20) Fix performance regressions introduced by TCP Small Queues. From Eric Dumazet. 21) IPV6 GRE tunneling code calculates max_headroom incorrectly, from Hannes Frederic Sowa. 22) Multicast timer handlers in ipv4 and ipv6 can be the last and final reference to the ipv4/ipv6 specific network device state, so use the reference put that will check and release the object if the reference hits zero. From Salam Noureddine. 23) Fix memory corruption in ip_tunnel driver, and use skb_push() instead of __skb_push() so that similar bugs are less hard to find. From Steffen Klassert. 24) Add forgotten hookup of rtnl_ops in SIT and ip6tnl drivers, from Nicolas Dichtel. 25) fq scheduler doesn't accurately rate limit in certain circumstances, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits) pkt_sched: fq: rate limiting improvements ip6tnl: allow to use rtnl ops on fb tunnel sit: allow to use rtnl ops on fb tunnel ip_tunnel: Remove double unregister of the fallback device ip_tunnel_core: Change __skb_push back to skb_push ip_tunnel: Add fallback tunnels to the hash lists ip_tunnel: Fix a memory corruption in ip_tunnel_xmit qlcnic: Fix SR-IOV configuration ll_temac: Reset dma descriptors indexes on ndo_open skbuff: size of hole is wrong in a comment ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put ipv4 igmp: use in_dev_put in timer handlers instead of __in_dev_put ethernet: moxa: fix incorrect placement of __initdata tag ipv6: gre: correct calculation of max_headroom powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file Revert "powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file" bonding: Fix broken promiscuity reference counting issue tcp: TSQ can use a dynamic limit dm9601: fix IFF_ALLMULTI handling pkt_sched: fq: qdisc dismantle fixes ...
2013-09-28lockref: use arch_mutex_cpu_relax() in CMPXCHG_LOOP()Heiko Carstens
Make use of arch_mutex_cpu_relax() so architectures can override the default cpu_relax() semantics. This is especially useful for s390, where cpu_relax() means that we yield() the current (virtual) cpu and therefore is very expensive, and would contradict the whole purpose of the lockless cmpxchg loop. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2013-09-27sysfs: Allow mounting without CONFIG_NETEric W. Biederman
In kobj_ns_current_may_mount the default should be to allow the mount. The test is only for a single kobj_ns_type at a time, and unless there is a reason to prevent it the mounting sysfs should be allowed. Subsystems that are not registered can't have are not involved so can't have a reason to prevent mounting sysfs. This is a bug-fix to commit 7dc5dbc879bd ("sysfs: Restrict mounting sysfs") that came in via the userns tree during the 3.12 merge window. Reported-and-tested-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-27lockref: allow relaxed cmpxchg64 variant for lockless updatesWill Deacon
The 64-bit cmpxchg operation on the lockref is ordered by virtue of hazarding between the cmpxchg operation and the reference count manipulation. On weakly ordered memory architectures (such as ARM), it can be of great benefit to omit the barrier instructions where they are not needed. This patch moves the lockless lockref code over to a cmpxchg64_relaxed operation, which doesn't provide barrier semantics. If the operation isn't defined, we simply #define it as the usual 64-bit cmpxchg macro. Cc: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-20lib: introduce upper case hex ascii helpersAndre Naujoks
To be able to use the hex ascii functions in case sensitive environments the array hex_asc_upper[] and the needed functions for hex_byte_pack_upper() are introduced. Signed-off-by: Andre Naujoks <nautsch2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-20lockref: use cmpxchg64 explicitly for lockless updatesWill Deacon
The cmpxchg() function tends not to support 64-bit arguments on 32-bit architectures. This could be either due to use of unsigned long arguments (like on ARM) or lack of instruction support (cmpxchgq on x86). However, these architectures may implement a specific cmpxchg64() function to provide 64-bit cmpxchg support instead. Since the lockref code requires a 64-bit cmpxchg and relies on the architecture selecting ARCH_USE_CMPXCHG_LOCKREF, move to using cmpxchg64 instead of cmpxchg and allow 32-bit architectures to make use of the lockless lockref implementation. Cc: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-13Merge branch 'genirq' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull generic hardirq option removal from Martin Schwidefsky: "All architectures now use generic hardirqs, s390 has been last to switch. With that the code under !CONFIG_GENERIC_HARDIRQS and the related HAVE_GENERIC_HARDIRQS and GENERIC_HARDIRQS config options can be removed. Yay!" * 'genirq' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: Remove GENERIC_HARDIRQ config option
2013-09-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto fixes from Herbert Xu: "This fixes a 7+ year race condition in the crypto API that causes sporadic crashes when multiple threads load the same algorithm. It also fixes the crct10dif algorithm again to prevent boot failures on systems where the initramfs tool ignores module softdeps" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: crct10dif - Add fallback for broken initrds crypto: api - Fix race condition in larval lookup
2013-09-13Remove GENERIC_HARDIRQ config optionMartin Schwidefsky
After the last architecture switched to generic hard irqs the config options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code for !CONFIG_GENERIC_HARDIRQS can be removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-09-12Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Lots of activity again this round for I/O performance optimizations (per-cpu IDA pre-allocation for vhost + iscsi/target), and the addition of new fabric independent features to target-core (COMPARE_AND_WRITE + EXTENDED_COPY). The main highlights include: - Support for iscsi-target login multiplexing across individual network portals - Generic Per-cpu IDA logic (kent + akpm + clameter) - Conversion of vhost to use per-cpu IDA pre-allocation for descriptors, SGLs and userspace page pointer list - Conversion of iscsi-target + iser-target to use per-cpu IDA pre-allocation for descriptors - Add support for generic COMPARE_AND_WRITE (AtomicTestandSet) emulation for virtual backend drivers - Add support for generic EXTENDED_COPY (CopyOffload) emulation for virtual backend drivers. - Add support for fast memory registration mode to iser-target (Vu) The patches to add COMPARE_AND_WRITE and EXTENDED_COPY support are of particular significance, which make us the first and only open source target to support the full set of VAAI primitives. Currently Linux clients are lacking upstream support to actually utilize these primitives. However, with server side support now in place for folks like MKP + ZAB working on the client, this logic once reserved for the highest end of storage arrays, can now be run in VMs on their laptops" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (50 commits) target/iscsi: Bump versions to v4.1.0 target: Update copyright ownership/year information to 2013 iscsi-target: Bump default TCP listen backlog to 256 target: Fix >= v3.9+ regression in PR APTPL + ALUA metadata write-out iscsi-target; Bump default CmdSN Depth to 64 iscsi-target: Remove unnecessary wait_for_completion in iscsi_get_thread_set iscsi-target: Add thread_set->ts_activate_sem + use common deallocate iscsi-target: Fix race with thread_pre_handler flush_signals + ISCSI_THREAD_SET_DIE target: remove unused including <linux/version.h> iser-target: introduce fast memory registration mode (FRWR) iser-target: generalize rdma memory registration and cleanup iser-target: move rdma wr processing to a shared function target: Enable global EXTENDED_COPY setup/release target: Add Third Party Copy (3PC) bit in INQUIRY response target: Enable EXTENDED_COPY setup in spc_parse_cdb target: Add support for EXTENDED_COPY copy offload emulation target: Avoid non-existent tg_pt_gp_mem in target_alua_state_check target: Add global device list for EXTENDED_COPY target: Make helpers non static for EXTENDED_COPY command setup target: Make spc_parse_naa_6h_vendor_specific non static ...
2013-09-12crypto: crct10dif - Add fallback for broken initrdsHerbert Xu
Unfortunately, even with a softdep some distros fail to include the necessary modules in the initrd. Therefore this patch adds a fallback path to restore existing behaviour where we cannot load the new crypto crct10dif algorithm. In order to do this, the underlying crct10dif has been split out from the crypto implementation so that it can be used on the fallback path. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-09-11lz4: fix compression/decompression signedness mismatchSergey Senozhatsky
LZ4 compression and decompression functions require different in signedness input/output parameters: unsigned char for compression and signed char for decompression. Change decompression API to require "(const) unsigned char *". Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Kyungsik Lee <kyungsik.lee@lge.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yann Collet <yann.collet.73@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11lib/radix-tree.c: make radix_tree_node_alloc() work correctly within interruptJan Kara
With users of radix_tree_preload() run from interrupt (block/blk-ioc.c is one such possible user), the following race can happen: radix_tree_preload() ... radix_tree_insert() radix_tree_node_alloc() if (rtp->nr) { ret = rtp->nodes[rtp->nr - 1]; <interrupt> ... radix_tree_preload() ... radix_tree_insert() radix_tree_node_alloc() if (rtp->nr) { ret = rtp->nodes[rtp->nr - 1]; And we give out one radix tree node twice. That clearly results in radix tree corruption with different results (usually OOPS) depending on which two users of radix tree race. We fix the problem by making radix_tree_node_alloc() always allocate fresh radix tree nodes when in interrupt. Using preloading when in interrupt doesn't make sense since all the allocations have to be atomic anyway and we cannot steal nodes from process-context users because some users rely on radix_tree_insert() succeeding after radix_tree_preload(). in_interrupt() check is somewhat ugly but we cannot simply key off passed gfp_mask as that is acquired from root_gfp_mask() and thus the same for all preload users. Another part of the fix is to avoid node preallocation in radix_tree_preload() when passed gfp_mask doesn't allow waiting. Again, preallocation in such case doesn't make sense and when preallocation would happen in interrupt we could possibly leak some allocated nodes. However, some users of radix_tree_preload() require following radix_tree_insert() to succeed. To avoid unexpected effects for these users, radix_tree_preload() only warns if passed gfp mask doesn't allow waiting and we provide a new function radix_tree_maybe_preload() for those users which get different gfp mask from different call sites and which are prepared to handle radix_tree_insert() failure. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11rbtree: allow tests to run as builtinCody P Schafer
No reason require rbtree test code to be a module, allow it to be builtin (streamlines my development process) Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11rbtree_test: add test for postorder iterationCody P Schafer
Just check that we examine all nodes in the tree for the postorder iteration. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11rbtree: add postorder iteration functionsCody P Schafer
Postorder iteration yields all of a node's children prior to yielding the node itself, and this particular implementation also avoids examining the leaf links in a node after that node has been yielded. In what I expect will be its most common usage, postorder iteration allows the deletion of every node in an rbtree without modifying the rbtree nodes (no _requirement_ that they be nulled) while avoiding referencing child nodes after they have been "deleted" (most commonly, freed). I have only updated zswap to use this functionality at this point, but numerous bits of code (most notably in the filesystem drivers) use a hand rolled postorder iteration that NULLs child links as it traverses the tree. Each of those instances could be replaced with this common implementation. 1 & 2 add rbtree postorder iteration functions. 3 adds testing of the iteration to the rbtree runtime tests 4 allows building the rbtree runtime tests as builtins 5 updates zswap. This patch: Add postorder iteration functions for rbtree. These are useful for safely freeing an entire rbtree without modifying the tree at all. Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11lib/decompressors: fix "no limit" output buffer lengthAlexandre Courbot
When decompressing into memory, the output buffer length is set to some arbitrarily high value (0x7fffffff) to indicate the output is, virtually, unlimited in size. The problem with this is that some platforms have their physical memory at high physical addresses (0x80000000 or more), and that the output buffer address and its "unlimited" length cannot be added without overflowing. An example of this can be found in inflate_fast(): /* next_out is the output buffer address */ out = strm->next_out - OFF; /* avail_out is the output buffer size. end will overflow if the output * address is >= 0x80000104 */ end = out + (strm->avail_out - 257); This has huge consequences on the performance of kernel decompression, since the following exit condition of inflate_fast() will be always true: } while (in < last && out < end); Indeed, "end" has overflowed and is now always lower than "out". As a result, inflate_fast() will return after processing one single byte of input data, and will thus need to be called an unreasonably high number of times. This probably went unnoticed because kernel decompression is fast enough even with this issue. Nonetheless, adjusting the output buffer length in such a way that the above pointer arithmetic never overflows results in a kernel decompression that is about 3 times faster on affected machines. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Tested-by: Jon Medhurst <tixy@linaro.org> Cc: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11lib/crc32: update the comments of crc32_{be,le}_generic()Gu Zheng
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11lib/genalloc.c: correct dev_get_gen_pool documentationEmilio López
The documentation mentions a "name" parameter, which does not exist. This commit removes such mention from the function documentation. Signed-off-by: Emilio López <emilio@elopez.com.ar> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11lib/genalloc.c: convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...)Joe Perches
Use the helper function instead of __GFP_ZERO. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11lib/genalloc.c: fix overflow of ending address of memory chunkJoonyoung Shim
In struct gen_pool_chunk, end_addr means the end address of memory chunk (inclusive), but in the implementation it is treated as address + size of memory chunk (exclusive), so it points to the address plus one instead of correct ending address. The ending address of memory chunk plus one will cause overflow on the memory chunk including the last address of memory map, e.g. when starting address is 0xFFF00000 and size is 0x100000 on 32bit machine, ending address will be 0x100000000. Use correct ending address like starting address + size - 1. [akpm@linux-foundation.org: add comment to struct gen_pool_chunk:end_addr] Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-10Merge tag 'dm-3.12-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device-mapper updates from Mike Snitzer: "Add the ability to collect I/O statistics on user-defined regions of a device-mapper device. This dm-stats code required the reintroduction of a div64_u64_rem() helper, but as a separate method that doesn't slow down div64_u64() -- especially on 32-bit systems. Allow the error target to replace request-based DM devices (e.g. multipath) in addition to bio-based DM devices. Various other small code fixes and improvements to thin-provisioning, DM cache and the DM ioctl interface" * tag 'dm-3.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm stripe: silence a couple sparse warnings dm: add statistics support dm thin: always return -ENOSPC if no_free_space is set dm ioctl: cleanup error handling in table_load dm ioctl: increase granularity of type_lock when loading table dm ioctl: prevent rename to empty name or uuid dm thin: set pool read-only if breaking_sharing fails block allocation dm thin: prefix pool error messages with pool device name dm: allow error target to replace bio-based and request-based targets math64: New separate div64_u64_rem helper dm space map: optimise sm_ll_dec and sm_ll_inc dm btree: prefetch child nodes when walking tree for a dm_btree_del dm btree: use pop_frame in dm_btree_del to cleanup code dm cache: eliminate holes in cache structure dm cache: fix stacking of geometry limits dm thin: fix stacking of geometry limits dm thin: add data block size limits to Documentation dm cache: add data block size limits to code and Documentation dm cache: document metadata device is exclussive to a cache dm: stop using WQ_NON_REENTRANT
2013-09-10Merge tag 'md/3.12' of git://neil.brown.name/mdLinus Torvalds
Pull md update from Neil Brown: "Headline item is multithreading for RAID5 so that more IO/sec can be supported on fast (SSD) devices. Also TILE-Gx SIMD suppor for RAID6 calculations and an assortment of bug fixes" * tag 'md/3.12' of git://neil.brown.name/md: raid5: only wakeup necessary threads md/raid5: flush out all pending requests before proceeding with reshape. md/raid5: use seqcount to protect access to shape in make_request. raid5: sysfs entry to control worker thread number raid5: offload stripe handle to workqueue raid5: fix stripe release order raid5: make release_stripe lockless md: avoid deadlock when dirty buffers during md_stop. md: Don't test all of mddev->flags at once. md: Fix apparent cut-and-paste error in super_90_validate raid6/test: replace echo -e with printf RAID: add tilegx SIMD implementation of raid6 md: fix safe_mode buglet. md: don't call md_allow_write in get_bitmap_file.
2013-09-09idr: Percpu idaKent Overstreet
Percpu frontend for allocating ids. With percpu allocation (that works), it's impossible to guarantee it will always be possible to allocate all nr_tags - typically, some will be stuck on a remote percpu freelist where the current job can't get to them. We do guarantee that it will always be possible to allocate at least (nr_tags / 2) tags - this is done by keeping track of which and how many cpus have tags on their percpu freelists. On allocation failure if enough cpus have tags that there could potentially be (nr_tags / 2) tags stuck on remote percpu freelists, we then pick a remote cpu at random to steal from. Note that there's no cpu hotplug notifier - we don't care, because steal_tags() will eventually get the down cpu's tags. We _could_ satisfy more allocations if we had a notifier - but we'll still meet our guarantees and it's absolutely not a correctness issue, so I don't think it's worth the extra code. From akpm: "It looks OK to me (that's as close as I get to an ack :)) v6 changes: - Add #include <linux/cpumask.h> to include/linux/percpu_ida.h to make alpha/arc builds happy (Fengguang) - Move second (cpu >= nr_cpu_ids) check inside of first check scope in steal_tags() (akpm + nab) v5 changes: - Change percpu_ida->cpus_have_tags to cpumask_t (kmo + akpm) - Add comment for percpu_ida_cpu->lock + ->nr_free (kmo + akpm) - Convert steal_tags() to use cpumask_weight() + cpumask_next() + cpumask_first() + cpumask_clear_cpu() (kmo + akpm) - Add comment for alloc_global_tags() (kmo + akpm) - Convert percpu_ida_alloc() to use cpumask_set_cpu() (kmo + akpm) - Convert percpu_ida_free() to use cpumask_set_cpu() (kmo + akpm) - Drop percpu_ida->cpus_have_tags allocation in percpu_ida_init() (kmo + akpm) - Drop percpu_ida->cpus_have_tags kfree in percpu_ida_destroy() (kmo + akpm) - Add comment for percpu_ida_alloc @ gfp (kmo + akpm) - Move to percpu_ida.c + percpu_ida.h (kmo + akpm + nab) v4 changes: - Fix tags.c reference in percpu_ida_init (akpm) Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-09-09Merge tag 'arc-v3.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC changes from Vineet Gupta: - ARC MM changes: - preparation for MMUv4 (accomodate new PTE bits, new cmds) - Rework the ASID allocation algorithm to remove asid-mm reverse map - Boilerplate code consolidation in Exception Handlers - Disable FRAME_POINTER for ARC - Unaligned Access Emulation for Big-Endian from Noam - Bunch of fixes (udelay, missing accessors) from Mischa * tag 'arc-v3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: fix new Section mismatches in build (post __cpuinit cleanup) Kconfig.debug: Add FRAME_POINTER anti-dependency for ARC ARC: Fix __udelay calculation ARC: remove console_verbose() from setup_arch() ARC: Add read*_relaxed to asm/io.h ARC: Handle un-aligned user space access in BE. ARC: [ASID] Track ASID allocation cycles/generations ARC: [ASID] activate_mm() == switch_mm() ARC: [ASID] get_new_mmu_context() to conditionally allocate new ASID ARC: [ASID] Refactor the TLB paranoid debug code ARC: [ASID] Remove legacy/unused debug code ARC: No need to flush the TLB in early boot ARC: MMUv4 preps/3 - Abstract out TLB Insert/Delete ARC: MMUv4 preps/2 - Reshuffle PTE bits ARC: MMUv4 preps/1 - Fold PTE K/U access flags ARC: Code cosmetics (Nothing semantical) ARC: Entry Handler tweaks: Optimize away redundant IRQ_DISABLE_SAVE ARC: Exception Handlers Code consolidation ARC: Add some .gitignore entries
2013-09-07lockref: add ability to mark lockrefs "dead"Linus Torvalds
The only actual current lockref user (dcache) uses zero reference counts even for perfectly live dentries, because it's a cache: there may not be any users, but that doesn't mean that we want to throw away the dentry. At the same time, the dentry cache does have a notion of a truly "dead" dentry that we must not even increment the reference count of, because we have pruned it and it is not valid. Currently that distinction is not visible in the lockref itself, and the dentry cache validation uses "lockref_get_or_lock()" to either get a new reference to a dentry that already had existing references (and thus cannot be dead), or get the dentry lock so that we can then verify the dentry and increment the reference count under the lock if that verification was successful. That's all somewhat complicated. This adds the concept of being "dead" to the lockref itself, by simply using a count that is negative. This allows a usage scenario where we can increment the refcount of a dentry without having to validate it, and pushing the special "we killed it" case into the lockref code. The dentry code itself doesn't actually use this yet, and it's probably too late in the merge window to do that code (the dentry_kill() code with its "should I decrement the count" logic really is pretty complex code), but let's introduce the concept at the lockref level now. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-07lockref: fix docbook argument namesLinus Torvalds
The code got rewritten, but the comments got copied as-is from older versions, and as a result the argument name in the comment didn't actually match the code any more. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace changes from Eric Biederman: "This is an assorted mishmash of small cleanups, enhancements and bug fixes. The major theme is user namespace mount restrictions. nsown_capable is killed as it encourages not thinking about details that need to be considered. A very hard to hit pid namespace exiting bug was finally tracked and fixed. A couple of cleanups to the basic namespace infrastructure. Finally there is an enhancement that makes per user namespace capabilities usable as capabilities, and an enhancement that allows the per userns root to nice other processes in the user namespace" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Kill nsown_capable it makes the wrong thing easy capabilities: allow nice if we are privileged pidns: Don't have unshare(CLONE_NEWPID) imply CLONE_THREAD userns: Allow PR_CAPBSET_DROP in a user namespace. namespaces: Simplify copy_namespaces so it is clear what is going on. pidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup sysfs: Restrict mounting sysfs userns: Better restrictions on when proc and sysfs can be mounted vfs: Don't copy mount bind mounts of /proc/<pid>/ns/mnt between namespaces kernel/nsproxy.c: Improving a snippet of code. proc: Restrict mounting the proc filesystem vfs: Lock in place mounts from more privileged users
2013-09-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto update from Herbert Xu: "Here is the crypto update for 3.12: - Added MODULE_SOFTDEP to allow pre-loading of modules. - Reinstated crct10dif driver using the module softdep feature. - Allow via rng driver to be auto-loaded. - Split large input data when necessary in nx. - Handle zero length messages correctly for GCM/XCBC in nx. - Handle SHA-2 chunks bigger than block size properly in nx. - Handle unaligned lengths in omap-aes. - Added SHA384/SHA512 to omap-sham. - Added OMAP5/AM43XX SHAM support. - Added OMAP4 TRNG support. - Misc fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (66 commits) Reinstate "crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform framework" hwrng: via - Add MODULE_DEVICE_TABLE crypto: fcrypt - Fix bitoperation for compilation with clang crypto: nx - fix SHA-2 for chunks bigger than block size crypto: nx - fix GCM for zero length messages crypto: nx - fix XCBC for zero length messages crypto: nx - fix limits to sg lists for AES-CCM crypto: nx - fix limits to sg lists for AES-XCBC crypto: nx - fix limits to sg lists for AES-GCM crypto: nx - fix limits to sg lists for AES-CTR crypto: nx - fix limits to sg lists for AES-CBC crypto: nx - fix limits to sg lists for AES-ECB crypto: nx - add offset to nx_build_sg_lists() padata - Register hotcpu notifier after initialization padata - share code between CPU_ONLINE and CPU_DOWN_FAILED, same to CPU_DOWN_PREPARE and CPU_UP_CANCELED hwrng: omap - reorder OMAP TRNG driver code crypto: omap-sham - correct dma burst size crypto: omap-sham - Enable Polling mode if DMA fails crypto: tegra-aes - bitwise vs logical and crypto: sahara - checking the wrong variable ...
2013-09-07Reinstate "crypto: crct10dif - Wrap crc_t10dif function all to use crypto ↵Herbert Xu
transform framework" This patch reinstates commits 67822649d7305caf3dd50ed46c27b99c94eff996 39761214eefc6b070f29402aa1165f24d789b3f7 0b95a7f85718adcbba36407ef88bba0a7379ed03 31d939625a9a20b1badd2d4e6bf6fd39fa523405 2d31e518a42828df7877bca23a958627d60408bc Now that module softdeps are in the kernel we can use that to resolve the boot issue which cause the revert. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2013-09-05Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "This set includes adding support for Neon acceleration of RAID6 XOR code from Ard Biesheuvel, cache flushing and barrier updates from Will Deacon, and a cleanup to the ARM debug code which reduces the amount of code by about 500 lines. A few other cleanups, such as constifying the machine descriptors which already shouldn't be written to, cleaning up the printing of the L2 cache size" * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (55 commits) ARM: 7826/1: debug: support debug ll on hisilicon soc ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo ARM: 7829/1: Add ".text.unlikely" and ".text.hot" to arm unwind tables ARM: 7828/1: ARMv7-M: implement restart routine common to all v7-M machines ARM: 7827/1: highbank: fix debug uart virtual address for LPAE ARM: 7823/1: errata: workaround Cortex-A15 erratum 773022 ARM: 7806/1: allow DEBUG_UNCOMPRESS for Tegra ARM: 7793/1: debug: use generic option for ep93xx PL10x debug port ARM: debug: move SPEAr debug to generic PL01x code ARM: debug: move davinci debug to generic 8250 code ARM: debug: move keystone debug to generic 8250 code ARM: debug: remove DEBUG_ROCKCHIP_UART ARM: debug: provide generic option choices for 8250 and PL01x ports ARM: debug: move PL01X debug include into arch/arm/include/debug/ ARM: debug: provide PL01x debug uart phys/virt address configuration options ARM: debug: add support for word accesses to debug/8250.S ARM: debug: move 8250 debug include into arch/arm/include/debug/ ARM: debug: provide 8250 debug uart phys/virt address configuration options ARM: debug: provide 8250 debug uart register shift configuration option ARM: debug: provide 8250 debug uart flow control configuration option ...
2013-09-05Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 1 from Al Viro: "Unfortunately, this merge window it'll have a be a lot of small piles - my fault, actually, for not keeping #for-next in anything that would resemble a sane shape ;-/ This pile: assorted fixes (the first 3 are -stable fodder, IMO) and cleanups + %pd/%pD formats (dentry/file pathname, up to 4 last components) + several long-standing patches from various folks. There definitely will be a lot more (starting with Miklos' check_submount_and_drop() series)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) direct-io: Handle O_(D)SYNC AIO direct-io: Implement generic deferred AIO completions add formats for dentry/file pathnames kvm eventfd: switch to fdget powerpc kvm: use fdget switch fchmod() to fdget switch epoll_ctl() to fdget switch copy_module_from_fd() to fdget git simplify nilfs check for busy subtree ibmasmfs: don't bother passing superblock when not needed don't pass superblock to hypfs_{mkdir,create*} don't pass superblock to hypfs_diag_create_files don't pass superblock to hypfs_vm_create_files() oprofile: get rid of pointless forward declarations of struct super_block oprofilefs_create_...() do not need superblock argument oprofilefs_mkdir() doesn't need superblock argument don't bother with passing superblock to oprofile_create_stats_files() oprofile: don't bother with passing superblock to ->create_files() don't bother passing sb to oprofile_create_files() coh901318: don't open-code simple_read_from_buffer() ...
2013-09-05Merge branches 'debug-choice', 'devel-stable' and 'misc' into for-linusRussell King
2013-09-05Kconfig.debug: Add FRAME_POINTER anti-dependency for ARCVineet Gupta
Frame pointer on ARC doesn't serve the conventional purpose of stack unwinding due to the typical way ABI designates it's usage. Thus it's explicit usage on ARC is discouraged (gcc is free to use it, for some tricky stack frames even if -fomit-frame-pointer). Hence no point enabling it for ARC. References: http://www.spinics.net/lists/kernel/msg1593937.html Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "Paul E. McKenney" <paul.mckenney@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Michel Lespinasse <walken@google.com> Cc: linux-kernel@vger.kernel.org
2013-09-04Merge tag 'stable/for-linus-3.12-rc0-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen updates from Konrad Rzeszutek Wilk: "A couple of features and a ton of bug-fixes. There is also some maintership changes. Jeremy is enjoying the full-time work at the startup and as much as he would love to help - he can't find the time. I have a bunch of other things that I promised to work on - paravirt diet, get SWIOTLB working everywhere, etc, but haven't been able to find the time. As such both David Vrabel and Boris Ostrovsky have graciously volunteered to help with the maintership role. They will keep the lid on regressions, bug-fixes, etc. I will be in the background to help - but eventually there will be less of me doing the Xen GIT pulls and more of them. Stefano is still doing the ARM/ARM64 and will continue on doing so. Features: - Xen Trusted Platform Module (TPM) frontend driver - with the backend in MiniOS. - Scalability improvements in event channel. - Two extra Xen co-maintainers (David, Boris) and one going away (Jeremy) Bug-fixes: - Make the 1:1 mapping work during early bootup on selective regions. - Add scratch page to balloon driver to deal with unexpected code still holding on stale pages. - Allow NMIs on PV guests (64-bit only) - Remove unnecessary TLB flush in M2P code. - Fixes duplicate callbacks in Xen granttable code. - Fixes in PRIVCMD_MMAPBATCH ioctls to allow retries - Fix for events being lost due to rescheduling on different VCPUs. - More documentation" * tag 'stable/for-linus-3.12-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (23 commits) hvc_xen: Remove unnecessary __GFP_ZERO from kzalloc drivers/xen-tpmfront: Fix compile issue with missing option. xen/balloon: don't set P2M entry for auto translated guest xen/evtchn: double free on error Xen: Fix retry calls into PRIVCMD_MMAPBATCH*. xen/pvhvm: Initialize xen panic handler for PVHVM guests xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping xen: fix ARM build after 6efa20e4 MAINTAINERS: Remove Jeremy from the Xen subsystem. xen/events: document behaviour when scanning the start word for events x86/xen: during early setup, only 1:1 map the ISA region x86/xen: disable premption when enabling local irqs swiotlb-xen: replace dma_length with sg_dma_len() macro swiotlb: replace dma_length with sg_dma_len() macro xen/balloon: set a mapping for ballooned out pages xen/evtchn: improve scalability by using per-user locks xen/p2m: avoid unneccesary TLB flush in m2p_remove_override() MAINTAINERS: Add in two extra co-maintainers of the Xen tree. MAINTAINERS: Update the Xen subsystem's with proper mailing list. xen: replace strict_strtoul() with kstrtoul() ...
2013-09-04Merge branch 'x86-asmlinkage-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/asmlinkage changes from Ingo Molnar: "As a preparation for Andi Kleen's LTO patchset (link time optimizations using GCC's -flto which build time optimization has steadily increased in quality over the past few years and might eventually be usable for the kernel too) this tree includes a handful of preparatory patches that make function calling convention annotations consistent again: - Mark every function without arguments (or 64bit only) that is used by assembly code with asmlinkage() - Mark every function with parameters or variables that is used by assembly code as __visible. For the vanilla kernel this has documentation, consistency and debuggability advantages, for the time being" * 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asmlinkage: Fix warning in xen asmlinkage change x86, asmlinkage, vdso: Mark vdso variables __visible x86, asmlinkage, power: Make various symbols used by the suspend asm code visible x86, asmlinkage: Make dump_stack visible x86, asmlinkage: Make 64bit checksum functions visible x86, asmlinkage, paravirt: Add __visible/asmlinkage to xen paravirt ops x86, asmlinkage, apm: Make APM data structure used from assembler visible x86, asmlinkage: Make syscall tables visible x86, asmlinkage: Make several variables used from assembler/linker script visible x86, asmlinkage: Make kprobes code visible and fix assembler code x86, asmlinkage: Make various syscalls asmlinkage x86, asmlinkage: Make 32bit/64bit __switch_to visible x86, asmlinkage: Make _*_start_kernel visible x86, asmlinkage: Make all interrupt handlers asmlinkage / __visible x86, asmlinkage: Change dotraplinkage into __visible on 32bit x86: Fix sys_call_table type in asm/syscall.h
2013-09-04Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "Main RCU changes this cycle were: - Full-system idle detection. This is for use by Frederic Weisbecker's adaptive-ticks mechanism. Its purpose is to allow the timekeeping CPU to shut off its tick when all other CPUs are idle. - Miscellaneous fixes. - Improved rcutorture test coverage. - Updated RCU documentation" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) nohz_full: Force RCU's grace-period kthreads onto timekeeping CPU nohz_full: Add full-system-idle state machine jiffies: Avoid undefined behavior from signed overflow rcu: Simplify _rcu_barrier() processing rcu: Make rcutorture emit online failures if verbose rcu: Remove unused variable from rcu_torture_writer() rcu: Sort rcutorture module parameters rcu: Increase rcutorture test coverage rcu: Add duplicate-callback tests to rcutorture doc: Fix memory-barrier control-dependency example rcu: Update RTFP documentation nohz_full: Add full-system-idle arguments to API nohz_full: Add full-system idle states and variables nohz_full: Add per-CPU idle-state tracking nohz_full: Add rcu_dyntick data for scalable detection of all-idle state nohz_full: Add Kconfig parameter for scalable detection of all-idle state nohz_full: Add testing information to documentation rcu: Eliminate unused APIs intended for adaptive ticks rcu: Select IRQ_WORK from TREE_PREEMPT_RCU rculist: list_first_or_null_rcu() should use list_entry_rcu() ...
2013-09-04add formats for dentry/file pathnamesAl Viro
New formats: %p[dD][234]?. The next pointer is interpreted as struct dentry * or struct file * resp. ('d' => dentry, 'D' => file) and the last component(s) of pathname are printed (%pd => just the last one, %pd2 => the last two, etc.) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-09-03Merge tag 'pm+acpi-3.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: 1) ACPI-based PCI hotplug (ACPIPHP) subsystem rework and introduction of Intel Thunderbolt support on systems that use ACPI for signalling Thunderbolt hotplug events. This also should make ACPIPHP work in some cases in which it was known to have problems. From Rafael J Wysocki, Mika Westerberg and Kirill A Shutemov. 2) ACPI core code cleanups and dock station support cleanups from Jiang Liu and Rafael J Wysocki. 3) Fixes for locking problems related to ACPI device hotplug from Rafael J Wysocki. 4) ACPICA update to version 20130725 includig fixes, cleanups, support for more than 256 GPEs per GPE block and a change to make the ACPI PM Timer optional (we've seen systems without the PM Timer in the field already). One of the fixes, related to the DeRefOf operator, is necessary to prevent some Windows 8 oriented AML from causing problems to happen. From Bob Moore, Lv Zheng, and Jung-uk Kim. 5) Removal of the old and long deprecated /proc/acpi/event interface and related driver changes from Thomas Renninger. 6) ACPI and Xen changes to make the reduced hardware sleep work with the latter from Ben Guthro. 7) ACPI video driver cleanups and a blacklist of systems that should not tell the BIOS that they are compatible with Windows 8 (or ACPI backlight and possibly other things will not work on them). From Felipe Contreras. 8) Assorted ACPI fixes and cleanups from Aaron Lu, Hanjun Guo, Kuppuswamy Sathyanarayanan, Lan Tianyu, Sachin Kamat, Tang Chen, Toshi Kani, and Wei Yongjun. 9) cpufreq ondemand governor target frequency selection change to reduce oscillations between min and max frequencies (essentially, it causes the governor to choose target frequencies proportional to load) from Stratos Karafotis. 10) cpufreq fixes allowing sysfs attributes file permissions to be preserved over suspend/resume cycles Srivatsa S Bhat. 11) Removal of Device Tree parsing for CPU device nodes from multiple cpufreq drivers that required some changes related to of_get_cpu_node() to be made in a few architectures and in the driver core. From Sudeep KarkadaNagesha. 12) cpufreq core fixes and cleanups related to mutual exclusion and driver module references from Viresh Kumar, Lukasz Majewski and Rafael J Wysocki. 13) Assorted cpufreq fixes and cleanups from Amit Daniel Kachhap, Bartlomiej Zolnierkiewicz, Hanjun Guo, Jingoo Han, Joseph Lo, Julia Lawall, Li Zhong, Mark Brown, Sascha Hauer, Stephen Boyd, Stratos Karafotis, and Viresh Kumar. 14) Fixes to prevent race conditions in coupled cpuidle from happening from Colin Cross. 15) cpuidle core fixes and cleanups from Daniel Lezcano and Tuukka Tikkanen. 16) Assorted cpuidle fixes and cleanups from Daniel Lezcano, Geert Uytterhoeven, Jingoo Han, Julia Lawall, Linus Walleij, and Sahara. 17) System sleep tracing changes from Todd E Brandt and Shuah Khan. 18) PNP subsystem conversion to using struct dev_pm_ops for power management from Shuah Khan. * tag 'pm+acpi-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (217 commits) cpufreq: Don't use smp_processor_id() in preemptible context cpuidle: coupled: fix race condition between pokes and safe state cpuidle: coupled: abort idle if pokes are pending cpuidle: coupled: disable interrupts after entering safe state ACPI / hotplug: Remove containers synchronously driver core / ACPI: Avoid device hot remove locking issues cpufreq: governor: Fix typos in comments cpufreq: governors: Remove duplicate check of target freq in supported range cpufreq: Fix timer/workqueue corruption due to double queueing ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT ACPI / thermal: Add check of "_TZD" availability and evaluating result cpufreq: imx6q: Fix clock enable balance ACPI: blacklist win8 OSI for buggy laptops cpufreq: tegra: fix the wrong clock name cpuidle: Change struct menu_device field types cpuidle: Add a comment warning about possible overflow cpuidle: Fix variable domains in get_typical_interval() cpuidle: Fix menu_device->intervals type cpuidle: CodingStyle: Break up multiple assignments on single line cpuidle: Check called function parameter in get_typical_interval() ...
2013-09-03lockref: Relax in cmpxchg loopLuck, Tony
While we are likley to succeed and break out of this loop, it isn't guaranteed. We should be power and thread friendly if we do have to go around for a second (or third, or more) attempt. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-03Merge tag 'driver-core-3.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core patches from Greg KH: "Here's the big driver core pull request for 3.12-rc1. Lots of tiny changes here fixing up the way sysfs attributes are created, to try to make drivers simpler, and fix a whole class race conditions with creations of device attributes after the device was announced to userspace. All the various pieces are acked by the different subsystem maintainers" * tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (119 commits) firmware loader: fix pending_fw_head list corruption drivers/base/memory.c: introduce help macro to_memory_block dynamic debug: line queries failing due to uninitialized local variable sysfs: sysfs_create_groups returns a value. debugfs: provide debugfs_create_x64() when disabled rbd: convert bus code to use bus_groups firmware: dcdbas: use binary attribute groups sysfs: add sysfs_create/remove_groups for when SYSFS is not enabled driver core: add #include <linux/sysfs.h> to core files. HID: convert bus code to use dev_groups Input: serio: convert bus code to use drv_groups Input: gameport: convert bus code to use drv_groups driver core: firmware: use __ATTR_RW() driver core: core: use DEVICE_ATTR_RO driver core: bus: use DRIVER_ATTR_WO() driver core: create write-only attribute macros for devices and drivers sysfs: create __ATTR_WO() driver-core: platform: convert bus code to use dev_groups workqueue: convert bus code to use dev_groups MEI: convert bus code to use dev_groups ...
2013-09-03Merge branch 'rcu/next' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: " * Update RCU documentation. These were posted to LKML at https://lkml.org/lkml/2013/8/19/611. * Miscellaneous fixes. These were posted to LKML at https://lkml.org/lkml/2013/8/19/619. * Full-system idle detection. This is for use by Frederic Weisbecker's adaptive-ticks mechanism. Its purpose is to allow the timekeeping CPU to shut off its tick when all other CPUs are idle. These were posted to LKML at https://lkml.org/lkml/2013/8/19/648. * Improve rcutorture test coverage. These were posted to LKML at https://lkml.org/lkml/2013/8/19/675. " Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-02lockref: implement lockless reference count updates using cmpxchg()Linus Torvalds
Instead of taking the spinlock, the lockless versions atomically check that the lock is not taken, and do the reference count update using a cmpxchg() loop. This is semantically identical to doing the reference count update protected by the lock, but avoids the "wait for lock" contention that you get when accesses to the reference count are contended. Note that a "lockref" is absolutely _not_ equivalent to an atomic_t. Even when the lockref reference counts are updated atomically with cmpxchg, the fact that they also verify the state of the spinlock means that the lockless updates can never happen while somebody else holds the spinlock. So while "lockref_put_or_lock()" looks a lot like just another name for "atomic_dec_and_lock()", and both optimize to lockless updates, they are fundamentally different: the decrement done by atomic_dec_and_lock() is truly independent of any lock (as long as it doesn't decrement to zero), so a locked region can still see the count change. The lockref structure, in contrast, really is a *locked* reference count. If you hold the spinlock, the reference count will be stable and you can modify the reference count without using atomics, because even the lockless updates will see and respect the state of the lock. In order to enable the cmpxchg lockless code, the architecture needs to do three things: (1) Make sure that the "arch_spinlock_t" and an "unsigned int" can fit in an aligned u64, and have a "cmpxchg()" implementation that works on such a u64 data type. (2) define a helper function to test for a spinlock being unlocked ("arch_spin_value_unlocked()") (3) select the "ARCH_USE_CMPXCHG_LOCKREF" config variable in its Kconfig file. This enables it for x86-64 (but not 32-bit, we'd need to make sure cmpxchg() turns into the proper cmpxchg8b in order to enable it for 32-bit mode). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-02lockref: uninline lockref helper functionsLinus Torvalds
They aren't very good to inline, since they already call external functions (the spinlock code), and we're going to create rather more complicated versions of them that can do the reference count updates locklessly. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-28sysfs: Restrict mounting sysfsEric W. Biederman
Don't allow mounting sysfs unless the caller has CAP_SYS_ADMIN rights over the net namespace. The principle here is if you create or have capabilities over it you can mount it, otherwise you get to live with what other people have mounted. Instead of testing this with a straight forward ns_capable call, perform this check the long and torturous way with kobject helpers, this keeps direct knowledge of namespaces out of sysfs, and preserves the existing sysfs abstractions. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2013-08-28dynamic debug: line queries failing due to uninitialized local variablejbaron@akamai.com
Settings of the form, 'line x module y +p', can fail arbitrarily due to an uninitialized local variable. With this patch results are consistent, as expected. Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>