summaryrefslogtreecommitdiffstats
path: root/drivers
AgeCommit message (Collapse)Author
2014-12-13Merge branch 'for-3.19/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver core update from Jens Axboe: "This is the pull request for the core block IO changes for 3.19. Not a huge round this time, mostly lots of little good fixes: - Fix a bug in sysfs blktrace interface causing a NULL pointer dereference, when enabled/disabled through that API. From Arianna Avanzini. - Various updates/fixes/improvements for blk-mq: - A set of updates from Bart, mostly fixing buts in the tag handling. - Cleanup/code consolidation from Christoph. - Extend queue_rq API to be able to handle batching issues of IO requests. NVMe will utilize this shortly. From me. - A few tag and request handling updates from me. - Cleanup of the preempt handling for running queues from Paolo. - Prevent running of unmapped hardware queues from Ming Lei. - Move the kdump memory limiting check to be in the correct location, from Shaohua. - Initialize all software queues at init time from Takashi. This prevents a kobject warning when CPUs are brought online that weren't online when a queue was registered. - Single writeback fix for I_DIRTY clearing from Tejun. Queued with the core IO changes, since it's just a single fix. - Version X of the __bio_add_page() segment addition retry from Maurizio. Hope the Xth time is the charm. - Documentation fixup for IO scheduler merging from Jan. - Introduce (and use) generic IO stat accounting helpers for non-rq drivers, from Gu Zheng. - Kill off artificial limiting of max sectors in a request from Christoph" * 'for-3.19/core' of git://git.kernel.dk/linux-block: (26 commits) bio: modify __bio_add_page() to accept pages that don't start a new segment blk-mq: Fix uninitialized kobject at CPU hotplugging blktrace: don't let the sysfs interface remove trace from running list blk-mq: Use all available hardware queues blk-mq: Micro-optimize bt_get() blk-mq: Fix a race between bt_clear_tag() and bt_get() blk-mq: Avoid that __bt_get_word() wraps multiple times blk-mq: Fix a use-after-free blk-mq: prevent unmapped hw queue from being scheduled blk-mq: re-check for available tags after running the hardware queue blk-mq: fix hang in bt_get() blk-mq: move the kdump check to blk_mq_alloc_tag_set blk-mq: cleanup tag free handling blk-mq: use 'nr_cpu_ids' as highest CPU ID count for hwq <-> cpu map blk: introduce generic io stat accounting help function blk-mq: handle the single queue case in blk_mq_hctx_next_cpu genhd: check for int overflow in disk_expand_part_tbl() blk-mq: add blk_mq_free_hctx_request() blk-mq: export blk_mq_free_request() blk-mq: use get_cpu/put_cpu instead of preempt_disable/preempt_enable ...
2014-12-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto update from Herbert Xu: - The crypto API is now documented :) - Disallow arbitrary module loading through crypto API. - Allow get request with empty driver name through crypto_user. - Allow speed testing of arbitrary hash functions. - Add caam support for ctr(aes), gcm(aes) and their derivatives. - nx now supports concurrent hashing properly. - Add sahara support for SHA1/256. - Add ARM64 version of CRC32. - Misc fixes. * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (77 commits) crypto: tcrypt - Allow speed testing of arbitrary hash functions crypto: af_alg - add user space interface for AEAD crypto: qat - fix problem with coalescing enable logic crypto: sahara - add support for SHA1/256 crypto: sahara - replace tasklets with kthread crypto: sahara - add support for i.MX53 crypto: sahara - fix spinlock initialization crypto: arm - replace memset by memzero_explicit crypto: powerpc - replace memset by memzero_explicit crypto: sha - replace memset by memzero_explicit crypto: sparc - replace memset by memzero_explicit crypto: algif_skcipher - initialize upon init request crypto: algif_skcipher - removed unneeded code crypto: algif_skcipher - Fixed blocking recvmsg crypto: drbg - use memzero_explicit() for clearing sensitive data crypto: drbg - use MODULE_ALIAS_CRYPTO crypto: include crypto- module prefix in template crypto: user - add MODULE_ALIAS crypto: sha-mb - remove a bogus NULL check crytpo: qat - Fix 64 bytes requests ...
2014-12-13Merge branch 'akpm' (second patch-bomb from Andrew)Linus Torvalds
Merge second patchbomb from Andrew Morton: - the rest of MM - misc fs fixes - add execveat() syscall - new ratelimit feature for fault-injection - decompressor updates - ipc/ updates - fallocate feature creep - fsnotify cleanups - a few other misc things * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (99 commits) cgroups: Documentation: fix trivial typos and wrong paragraph numberings parisc: percpu: update comments referring to __get_cpu_var percpu: update local_ops.txt to reflect this_cpu operations percpu: remove __get_cpu_var and __raw_get_cpu_var macros fsnotify: remove destroy_list from fsnotify_mark fsnotify: unify inode and mount marks handling fallocate: create FAN_MODIFY and IN_MODIFY events mm/cma: make kmemleak ignore CMA regions slub: fix cpuset check in get_any_partial slab: fix cpuset check in fallback_alloc shmdt: use i_size_read() instead of ->i_size ipc/shm.c: fix overly aggressive shmdt() when calls span multiple segments ipc/msg: increase MSGMNI, remove scaling ipc/sem.c: increase SEMMSL, SEMMNI, SEMOPM ipc/sem.c: change memory barrier in sem_lock() to smp_rmb() lib/decompress.c: consistency of compress formats for kernel image decompress_bunzip2: off by one in get_next_block() usr/Kconfig: make initrd compression algorithm selection not expert fault-inject: add ratelimit option ratelimit: add initialization macro ...
2014-12-13zram: use DEVICE_ATTR_[RW|RO|WO] to define zram sys device attributeGanesh Mahendran
In current zram, we use DEVICE_ATTR() to define sys device attributes. SO, we need to set (S_IRUGO | S_IWUSR) permission and other arguments manually. Linux already provids the macro DEVICE_ATTR_[RW|RO|WO] to define sys device attribute. It is simple and readable. This patch uses kernel defined macro DEVICE_ATTR_[RW|RO|WO] to define zram device attribute. Signed-off-by: Ganesh Mahendran <opensource.ganesh@gmail.com> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13mm/zram: correct ZRAM_ZERO flag bit positionMahendran Ganesh
In struct zram_table_entry, the element *value* contains obj size and obj zram flags. Bit 0 to bit (ZRAM_FLAG_SHIFT - 1) represent obj size, and bit ZRAM_FLAG_SHIFT to the highest bit of unsigned long represent obj zram_flags. So the first zram flag(ZRAM_ZERO) should be from ZRAM_FLAG_SHIFT instead of (ZRAM_FLAG_SHIFT + 1). This patch fixes this cosmetic issue. Also fix a typo, "page in now accessed" -> "page is now accessed" Signed-off-by: Mahendran Ganesh <opensource.ganesh@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Weijie Yang <weijie.yang@samsung.com> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13zram: implement rw_page operation of zramkaram.lee
This patch implements rw_page operation for zram block device. I implemented the feature in zram and tested it. Test bed was the G2, LG electronic mobile device, whtich has msm8974 processor and 2GB memory. With a memory allocation test program consuming memory, the system generates swap. Operating time of swap_write_page() was measured. -------------------------------------------------- | | operating time | improvement | | | (20 runs average) | | -------------------------------------------------- |with patch | 1061.15 us | +2.4% | -------------------------------------------------- |without patch| 1087.35 us | | -------------------------------------------------- Each test(with paged_io,with BIO) result set shows normal distribution and has equal variance. I mean the two values are valid result to compare. I can say operation with paged I/O(without BIO) is faster 2.4% with confidence level 95%. [minchan@kernel.org: make rw_page opeartion return 0] [minchan@kernel.org: rely on the bi_end_io for zram_rw_page fails] [sergey.senozhatsky@gmail.com: code cleanup] [minchan@kernel.org: add comment] Signed-off-by: karam.lee <karam.lee@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: <seungho1.park@lge.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13zram: change parameter from vaild_io_request()karam.lee
This patch changes parameter of valid_io_request for common usage. The purpose of valid_io_request() is to determine if bio request is valid or not. This patch use I/O start address and size instead of a BIO parameter for common usage. Signed-off-by: karam.lee <karam.lee@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: <seungho1.park@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13zram: remove bio parameter from zram_bvec_rw()karam.lee
Recently rw_page block device operation has been added. This patchset implements rw_page operation for zram block device and does some clean-up. This patch (of 3): Remove an unnecessary parameter(bio) from zram_bvec_rw() and zram_bvec_read(). zram_bvec_read() doesn't use a bio parameter, so remove it. zram_bvec_rw() calls a read/write operation not using bio, so a rw parameter replaces a bio parameter. Signed-off-by: karam.lee <karam.lee@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: <seungho1.park@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13mm: vmscan: invoke slab shrinkers from shrink_zone()Johannes Weiner
The slab shrinkers are currently invoked from the zonelist walkers in kswapd, direct reclaim, and zone reclaim, all of which roughly gauge the eligible LRU pages and assemble a nodemask to pass to NUMA-aware shrinkers, which then again have to walk over the nodemask. This is redundant code, extra runtime work, and fairly inaccurate when it comes to the estimation of actually scannable LRU pages. The code duplication will only get worse when making the shrinkers cgroup-aware and requiring them to have out-of-band cgroup hierarchy walks as well. Instead, invoke the shrinkers from shrink_zone(), which is where all reclaimers end up, to avoid this duplication. Take the count for eligible LRU pages out of get_scan_count(), which considers many more factors than just the availability of swap space, like zone_reclaimable_pages() currently does. Accumulate the number over all visited lruvecs to get the per-zone value. Some nodes have multiple zones due to memory addressing restrictions. To avoid putting too much pressure on the shrinkers, only invoke them once for each such node, using the class zone of the allocation as the pivot zone. For now, this integrates the slab shrinking better into the reclaim logic and gets rid of duplicative invocations from kswapd, direct reclaim, and zone reclaim. It also prepares for cgroup-awareness, allowing memcg-capable shrinkers to be added at the lruvec level without much duplication of both code and runtime work. This changes kswapd behavior, which used to invoke the shrinkers for each zone, but with scan ratios gathered from the entire node, resulting in meaningless pressure quantities on multi-zone nodes. Zone reclaim behavior also changes. It used to shrink slabs until the same amount of pages were shrunk as were reclaimed from the LRUs. Now it merely invokes the shrinkers once with the zone's scan ratio, which makes the shrinkers go easier on caches that implement aging and would prefer feeding back pressure from recently used slab objects to unused LRU pages. [vdavydov@parallels.com: assure class zone is populated] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13memory-hotplug: remove redundant call of page_to_pfnZhang Zhen
This is just a small optimization. The start_pfn can be obtained directly by phys_index << PFN_SECTION_SHIFT. So the call of page_to_pfn() is redundant and remove it. Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com> Acked-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Dave Hansen <dave@sr71.net> Cc: Wang Nan <wangnan0@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13iommu/amd: use handle_mm_fault directlyJesse Barnes
This could be useful for debug in the future if we want to track major/minor faults more closely, and also avoids the put_page trick we used with gup. In order to do this, we also track the task struct in the PASID state structure. This lets us update the appropriate task stats after the fault has been handled, and may aid with debug in the future as well. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Tested-by: Oded Gabbay <oded.gabbay@amd.com> Cc: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13rtc: snvs: fix build with CONFIG_PM_SLEEP disabledGuenter Roeck
Commit 7654e9d4fd8f ("drivers/rtc/rtc-snvs: fix suspend/resume") replaces SIMPLE_DEV_PM_OPS with direct declaration of snvs_rtc_pm_ops, but does so outside #ifdef CONFIG_PM_SLEEP. This causes the driver build to fail if CONFIG_PM_SLEEP is not configured. Fixes: 7654e9d4fd8f ("drivers/rtc/rtc-snvs: fix suspend/resume") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Cc: Sanchayan Maity <maitysanchayan@gmail.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13irqchip: gicv3-its: Fix ITT allocationMarc Zyngier
When issuing a MAPD command, one of the parameters passed to the ITS is the number of EventID bits used to index the per-device Interrupt Translation Table (ITT). Crucially, this is the number of bits *minus one*. This has two consequences: - The size of the ITT has to be a strict power of two, no matter how many different events the device is actually going to generate. - It is impossible to express an ITT with a single entry, as you would have to tell the ITS to "use zero bit from the EventID", and that clashes with "minus one" above. Fix this by allocating the ITT with the number of vectors rounded up to the next power of two, with a minimum of two entries. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Yun Wu (Abel) <wuyun.wu@huawei.com> Cc: Robert Richter <robert.richter@caviumnetworks.com> Cc: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-12-13irqchip: gicv3-its: Move some alloc/free code to activate/deactivateMarc Zyngier
The ITS code could do a bit less in the alloc/free paths, and a bit more in the activate/deactivate methods, giving a better separation between software allocation and HW programing. Suggested-by: Wuyun Wu (Abel) <wuyun.wu@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Yun Wu (Abel) <wuyun.wu@huawei.com> Cc: Robert Richter <robert.richter@caviumnetworks.com> Cc: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-12-13irqchip: gicv3-its: Fix domain free in multi-MSI caseMarc Zyngier
Fix stupid thinko on the path freeing the interrupts, where only the first interrupt would get reset, and none of the others. This should only affect multi-MSI allocations. Reported-by: Wuyun Wu (Abel) <wuyun.wu@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Robert Richter <robert.richter@caviumnetworks.com> Cc: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-12-12iscsi-target: Drop left-over bogus iscsi_np->tpg_npNicholas Bellinger
This patch drops the left-over iscsi_np->tpg_np pointer, now that iser-target PI is able to dynamically allocate PI contexts per I/O, instead of needing to determine support using a TPG attribute with this bogus reference. Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Fix wc->wr_id cast warningNicholas Bellinger
CC [M] drivers/infiniband/ulp/isert/ib_isert.o drivers/infiniband/ulp/isert/ib_isert.c: In function ‘isert_cq_comp_err’: drivers/infiniband/ulp/isert/ib_isert.c:1979:42: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Remove code duplicationSagi Grimberg
- Fall-through in switch case instead in do_control_comp. - Move rkey invalidation to a function. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Adjust log levels and prettify some printsSagi Grimberg
debug_level 1 (warn): Include warning messages. debug_level 2 (info): Include relevant info for control plane. debug_level 3 (debug): Include relevant info in the IO path. Also, added/removed some logging messages. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Use debug_level parameter to control logging levelSagi Grimberg
Personal preference, easier control of the log level with a single modparam which can be changed dynamically. Allows better saparation of control and IO plains. Replaced throughout ib_isert.c: s/pr_debug/isert_dbg/g s/pr_info/isert_info/g s/pr_warn/isert_warn/g s/pr_err/isert_err/g Plus nit checkpatch warning change. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Fix logout sequenceSagi Grimberg
We don't want to wait for conn_logout_comp from isert_comp_wq context as this blocks further completions from being processed. Instead we wait for it conditionally (if logout response was actually posted) in wait_conn. This wait should normally happen immediately as it occurs after we consumed all the completions (including flush errors) and conn_logout_comp should have been completed. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Don't wait for session commands from completion contextSagi Grimberg
Might result in a deadlock where completion context waits for session commands release where the later might need a final completion for it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Reduce CQ lock contention by batch pollingSagi Grimberg
In order to reduce the contention on CQ locking (present in some LLDDs) we poll in batches of 16 work completion items. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Introduce isert_poll_budgetSagi Grimberg
In case the CQ is packed with completions, we can't just hog the CPU forever. Poll until a sufficient budget (currently hard-coded to 64k completions) and if budget is exhausted, bailout and give a chance to other threads. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Remove an atomic operation from the IO pathSagi Grimberg
In order to know that we consumed all the connection completions we maintain atomic post_send_buf_count for each IO post send. But we can know that if we post a "beacon" (zero length RECV work request) after we move the QP into error state and the target does not serve any new IO. When we consume it, we know we finished all the connection completion and we can go ahead and destroy stuff. In error completion handler we now just need to check for ISERT_BEACON_WRID to arrive and then wait for session commands to cleanup and complete conn_wait_comp_err. We reserve another CQ and QP entries to fit the zero length post recv. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Remove redundant call to isert_conn_terminateSagi Grimberg
We are calling session reinstatement, wait_conn will start connection termination. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Use single CQ for TX and RXSagi Grimberg
Using TX and RX CQs attached to the same vector might create a throttling effect coming from the serial processing of a work-queue. Use one CQ instead, it will do better in interrupt processing and it provides a simpler code. Also, We get rid of redundant isert_rx_wq. Next we can remove the atomic post_send_buf_count from the IO path. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Centralize completion elements to a contextSagi Grimberg
A pre-step before going to a single CQ. Also this makes the code a little more simple to read. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Cast wr_id with uintptr_t instead of unsinged longSagi Grimberg
Nit, uintptr_t is designed for pointer casting, use it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Unite error completion handler for RX and TXSagi Grimberg
As a pre-step to a single CQ, we unite the error completion handlers to a single handler. This patch does not change any functionality. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Remove interrupt coalescingSagi Grimberg
It is disabled at the moment, we will get that back in once the target is more stable. This reverts commit 95b60f0 "Add support for completion interrupt coalescing" Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Work-around live target stack shutdown resource cleanupSagi Grimberg
Currently we have no way to tell that the target stack is in shutdown sequence. In case we have open connections, the initiator immediately attempts to reconnect in a DDOS attack style, so we may end up terminating the iser enabled network portal while it's np_accept_list still have pending connections. The workaround is simply release all the connections in the list. A proper fix will be to start shutdown sequence by shutting the network portal to avoid initiator immediate reconnect attempts. But the temporary work around seems to work at this point, so I think we can do this for now... Reported-by: Slava Shwartsman <valyushash@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iscsi,iser-target: Expose supported protection ops according to t10_piSagi Grimberg
iSER will report supported protection operations based on the tpg attribute t10_pi settings and HCA PI offload capabilities. If the HCA does not support PI offload or tpg attribute t10_pi is not set, we fall to SW PI mode. In order to do that, we move iscsit_get_sup_prot_ops after connection tpg assignment. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.14+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Fix NULL dereference in SW mode DIFSagi Grimberg
Fallback to software mode DIF if HCA does not support PI (without crashing obviously). It is still possible to run with backend protection and an unprotected frontend, so looking at the command prot_op is not enough. Check device PI capability on a per-IO basis (isert_prot_cmd inline static) to determine if we need to handle protection information. Trace: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffffa037f8b1>] isert_reg_sig_mr+0x351/0x3b0 [ib_isert] Call Trace: [<ffffffff812b003a>] ? swiotlb_map_sg_attrs+0x7a/0x130 [<ffffffffa038184d>] isert_reg_rdma+0x2fd/0x370 [ib_isert] [<ffffffff8108f2ec>] ? idle_balance+0x6c/0x2c0 [<ffffffffa0382b68>] isert_put_datain+0x68/0x210 [ib_isert] [<ffffffffa02acf5b>] lio_queue_data_in+0x2b/0x30 [iscsi_target_mod] [<ffffffffa02306eb>] target_complete_ok_work+0x21b/0x310 [target_core_mod] [<ffffffff8106ece2>] process_one_work+0x182/0x3b0 [<ffffffff8106fda0>] worker_thread+0x120/0x3c0 [<ffffffff8106fc80>] ? maybe_create_worker+0x190/0x190 [<ffffffff8107594e>] kthread+0xce/0xf0 [<ffffffff81075880>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff8159a22c>] ret_from_fork+0x7c/0xb0 [<ffffffff81075880>] ? kthread_freezable_should_stop+0x70/0x70 Reported-by: Slava Shwartsman <valyushash@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.14+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Allocate PI contexts dynamicallySagi Grimberg
This patch converts to allocate PI contexts dynamically in order avoid a potentially bogus np->tpg_np and associated NULL pointer dereference in isert_connect_request() during iser-target endpoint shutdown with multiple network portals. Also, there is really no need to allocate these at connection establishment since it is not guaranteed that all the IOs on that connection will be to a PI formatted device. We can do it in a lazy fashion so the initial burst will have a transient slow down, but very fast all IOs will allocate a PI context. Squashed: iser-target: Centralize PI context handling code Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.14+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Fix implicit termination of connectionsSagi Grimberg
In situations such as bond failover, The new session establishment implicitly invokes the termination of the old connection. So, we don't want to wait for the old connection wait_conn to completely terminate before we accept the new connection and post a login response. The solution is to deffer the comp_wait completion and the conn_put to a work so wait_conn will effectively be non-blocking (flush errors are assumed to come very fast). We allocate isert_release_wq with WQ_UNBOUND and WQ_UNBOUND_MAX_ACTIVE to spread the concurrency of release works. Reported-by: Slava Shwartsman <valyushash@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Handle ADDR_CHANGE event for listener cm_idSagi Grimberg
The np listener cm_id will also get ADDR_CHANGE event upcall (in case it is bound to a specific IP). Handle it correctly by creating a new cm_id and implicitly destroy the old one. Since this is the second event a listener np cm_id may encounter, we move the np cm_id event handling to a routine. Squashed: iser-target: Move cma_id setup to a function Reported-by: Slava Shwartsman <valyushash@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Fix connected_handler + teardown flow raceSagi Grimberg
Take isert_conn pointer from cm_id->qp->qp_context. This will allow us to know that the cm_id context is always the network portal. This will make the cm_id event check (connection or network portal) more reliable. In order to avoid a NULL dereference in cma_id->qp->qp_context we destroy the qp after we destroy the cm_id (and make the dereference safe). session stablishment/teardown sequences can happen in parallel, we should take into account that connected_handler might race with connection teardown flow. Also, protect isert_conn->conn_device->active_qps decrement within the error patch during QP creation failure and the normal teardown path in isert_connect_release(). Squashed: iser-target: Decrement completion context active_qps in error flow Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Parallelize CM connection establishmentSagi Grimberg
There is no point in accepting a new CM request only when we are completely done with the last iscsi login. Instead we accept immediately, this will also cause the CM connection to reach connected state and the initiator is allowed to send the first login. We mark that we got the initial login and let iscsi layer pick it up when it gets there. This reduces the parallel login sequence by a factor of more then 4 (and more for multi-login) and also prevents the initiator (who does all logins in parallel) from giving up on login timeout expiration. In order to support multiple login requests sequence (CHAP) we call isert_rx_login_req from isert_rx_completion insead of letting isert_get_login_rx call it. Squashed: iser-target: Use kref_get_unless_zero in connected_handler iser-target: Acquire conn_mutex when changing connection state iser-target: Reject connect request in failure path Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iser-target: Fix flush + disconnect completion handlingSagi Grimberg
ISER_CONN_UP state is not sufficient to know if we should wait for completion of flush errors and disconnected_handler event. Instead, split it to 2 states: - ISER_CONN_UP: Got to CM connected phase, This state indicates that we need to wait for a CM disconnect event before going to teardown. - ISER_CONN_FULL_FEATURE: Got to full feature phase after we posted login response, This state indicates that we posted recv buffers and we need to wait for flush completions before going to teardown. Also avoid deffering disconnected handler to a work, and handle it within disconnected handler. More work here is needed to handle DEVICE_REMOVAL event correctly (cleanup all resources). Squashed: iser-target: Don't deffer disconnected handler to a work Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-12iscsi,iser-target: Initiate termination only onceSagi Grimberg
Since commit 0fc4ea701fcf ("Target/iser: Don't put isert_conn inside disconnected handler") we put the conn kref in isert_wait_conn, so we need .wait_conn to be invoked also in the error path. Introduce call to isert_conn_terminate (called under lock) which transitions the connection state to TERMINATING and calls rdma_disconnect. If the state is already teminating, just bail out back (temination started). Also, make sure to destroy the connection when getting a connect error event if didn't get to connected (state UP). Same for the handling of REJECTED and UNREACHABLE cma events. Squashed: iscsi-target: Add call to wait_conn in establishment error flow Reported-by: Slava Shwartsman <valyushash@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Cc: <stable@vger.kernel.org> # v3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-12-13Merge branches 'thermal-core-fix', 'thermal-soc' and 'thermal-int340x' into nextZhang Rui
2014-12-12mtd: tests: abort torturetest on erase errorsBrian Norris
The torture test should quit once it actually induces an error in the flash. This step was accidentally removed during refactoring. Without this fix, the torturetest just continues infinitely, or until the maximum cycle count is reached. e.g.: ... [ 7619.218171] mtd_test: error -5 while erasing EB 100 [ 7619.297981] mtd_test: error -5 while erasing EB 100 [ 7619.377953] mtd_test: error -5 while erasing EB 100 [ 7619.457998] mtd_test: error -5 while erasing EB 100 [ 7619.537990] mtd_test: error -5 while erasing EB 100 ... Fixes: 6cf78358c94f ("mtd: mtd_torturetest: use mtd_test helpers") Signed-off-by: Brian Norris <computersforpeace@gmail.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: <stable@vger.kernel.org>
2014-12-12mtd: physmap_of: fix potential NULL dereferenceArd Biesheuvel
On device remove, when testing the cmtd field of an of_flash struct to decide whether it is a concatenated device or not, we get a false positive on cmtd == NULL, and dereference it subsequently. This may occur if of_flash_remove() is called from the cleanup path of of_flash_probe(). Instead, test for NULL first, and only then perform the test for a concatenated device. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2014-12-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull another networking update from David Miller: "Small follow-up to the main merge pull from the other day: 1) Alexander Duyck's DMA memory barrier patch set. 2) cxgb4 driver fixes from Karen Xie. 3) Add missing export of fixed_phy_register() to modules, from Mark Salter. 4) DSA bug fixes from Florian Fainelli" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits) net/macb: add TX multiqueue support for gem linux/interrupt.h: remove the definition of unused tasklet_hi_enable jme: replace calls to redundant function net: ethernet: davicom: Allow to select DM9000 for nios2 net: ethernet: smsc: Allow to select SMC91X for nios2 cxgb4: Add support for QSA modules libcxgbi: fix freeing skb prematurely cxgb4i: use set_wr_txq() to set tx queues cxgb4i: handle non-pdu-aligned rx data cxgb4i: additional types of negative advice cxgb4/cxgb4i: set the max. pdu length in firmware cxgb4i: fix credit check for tx_data_wr cxgb4i: fix tx immediate data credit check net: phy: export fixed_phy_register() fib_trie: Fix trie balancing issue if new node pushes down existing node vlan: Add ability to always enable TSO/UFO r8169:update rtl8168g pcie ephy parameter net: dsa: bcm_sf2: force link for all fixed PHY devices fm10k/igb/ixgbe: Use dma_rmb on Rx descriptor reads r8169: Use dma_rmb() and dma_wmb() for DescOwn checks ...
2014-12-13mmc: atmel-mci: use SET_RUNTIME_PM_OPS() macroLudovic Desroches
The currently used SET_PM_RUNTIME_PM_OPS() macro is defined to the SET_RUNTIME_PM_OPS() macro. Convert to the later, since that's the proper one to use. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-13PM / Kconfig: Replace PM_RUNTIME with PM in dependenciesRafael J. Wysocki
After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is selected) PM_RUNTIME is always set if PM is set, so Kconfig options depending on CONFIG_PM_RUNTIME may now be changed to depend on CONFIG_PM. Replace PM_RUNTIME with PM in Kconfig dependencies throughout the tree. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Felipe Balbi <balbi@ti.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Tejun Heo <tj@kernel.org>
2014-12-13phy / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PMRafael J. Wysocki
After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks depending on CONFIG_PM_RUNTIME may now be changed to depend on CONFIG_PM. Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere under drivers/phy/. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2014-12-13video / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PMRafael J. Wysocki
After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks depending on CONFIG_PM_RUNTIME may now be changed to depend on CONFIG_PM. The alternative of CONFIG_PM_SLEEP and CONFIG_PM_RUNTIME may be replaced with CONFIG_PM too. Make these changes in 2 files under drivers/video/. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Acked-by: Jingoo Han <jg1.han@samsung.com>
2014-12-13tty / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PMRafael J. Wysocki
After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks depending on CONFIG_PM_RUNTIME may now be changed to depend on CONFIG_PM. Replace CONFIG_PM_RUNTIME with CONFIG_PM everywhere under drivers/tty/. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>