summaryrefslogtreecommitdiffstats
path: root/include/linux
AgeCommit message (Collapse)Author
2015-02-11mm: memcontrol: default hierarchy interface for memoryJohannes Weiner
Introduce the basic control files to account, partition, and limit memory using cgroups in default hierarchy mode. This interface versioning allows us to address fundamental design issues in the existing memory cgroup interface, further explained below. The old interface will be maintained indefinitely, but a clearer model and improved workload performance should encourage existing users to switch over to the new one eventually. The control files are thus: - memory.current shows the current consumption of the cgroup and its descendants, in bytes. - memory.low configures the lower end of the cgroup's expected memory consumption range. The kernel considers memory below that boundary to be a reserve - the minimum that the workload needs in order to make forward progress - and generally avoids reclaiming it, unless there is an imminent risk of entering an OOM situation. - memory.high configures the upper end of the cgroup's expected memory consumption range. A cgroup whose consumption grows beyond this threshold is forced into direct reclaim, to work off the excess and to throttle new allocations heavily, but is generally allowed to continue and the OOM killer is not invoked. - memory.max configures the hard maximum amount of memory that the cgroup is allowed to consume before the OOM killer is invoked. - memory.events shows event counters that indicate how often the cgroup was reclaimed while below memory.low, how often it was forced to reclaim excess beyond memory.high, how often it hit memory.max, and how often it entered OOM due to memory.max. This allows users to identify configuration problems when observing a degradation in workload performance. An overcommitted system will have an increased rate of low boundary breaches, whereas increased rates of high limit breaches, maximum hits, or even OOM situations will indicate internally overcommitted cgroups. For existing users of memory cgroups, the following deviations from the current interface are worth pointing out and explaining: - The original lower boundary, the soft limit, is defined as a limit that is per default unset. As a result, the set of cgroups that global reclaim prefers is opt-in, rather than opt-out. The costs for optimizing these mostly negative lookups are so high that the implementation, despite its enormous size, does not even provide the basic desirable behavior. First off, the soft limit has no hierarchical meaning. All configured groups are organized in a global rbtree and treated like equal peers, regardless where they are located in the hierarchy. This makes subtree delegation impossible. Second, the soft limit reclaim pass is so aggressive that it not just introduces high allocation latencies into the system, but also impacts system performance due to overreclaim, to the point where the feature becomes self-defeating. The memory.low boundary on the other hand is a top-down allocated reserve. A cgroup enjoys reclaim protection when it and all its ancestors are below their low boundaries, which makes delegation of subtrees possible. Secondly, new cgroups have no reserve per default and in the common case most cgroups are eligible for the preferred reclaim pass. This allows the new low boundary to be efficiently implemented with just a minor addition to the generic reclaim code, without the need for out-of-band data structures and reclaim passes. Because the generic reclaim code considers all cgroups except for the ones running low in the preferred first reclaim pass, overreclaim of individual groups is eliminated as well, resulting in much better overall workload performance. - The original high boundary, the hard limit, is defined as a strict limit that can not budge, even if the OOM killer has to be called. But this generally goes against the goal of making the most out of the available memory. The memory consumption of workloads varies during runtime, and that requires users to overcommit. But doing that with a strict upper limit requires either a fairly accurate prediction of the working set size or adding slack to the limit. Since working set size estimation is hard and error prone, and getting it wrong results in OOM kills, most users tend to err on the side of a looser limit and end up wasting precious resources. The memory.high boundary on the other hand can be set much more conservatively. When hit, it throttles allocations by forcing them into direct reclaim to work off the excess, but it never invokes the OOM killer. As a result, a high boundary that is chosen too aggressively will not terminate the processes, but instead it will lead to gradual performance degradation. The user can monitor this and make corrections until the minimal memory footprint that still gives acceptable performance is found. In extreme cases, with many concurrent allocations and a complete breakdown of reclaim progress within the group, the high boundary can be exceeded. But even then it's mostly better to satisfy the allocation from the slack available in other groups or the rest of the system than killing the group. Otherwise, memory.max is there to limit this type of spillover and ultimately contain buggy or even malicious applications. - The original control file names are unwieldy and inconsistent in many different ways. For example, the upper boundary hit count is exported in the memory.failcnt file, but an OOM event count has to be manually counted by listening to memory.oom_control events, and lower boundary / soft limit events have to be counted by first setting a threshold for that value and then counting those events. Also, usage and limit files encode their units in the filename. That makes the filenames very long, even though this is not information that a user needs to be reminded of every time they type out those names. To address these naming issues, as well as to signal clearly that the new interface carries a new configuration model, the naming conventions in it necessarily differ from the old interface. - The original limit files indicate the state of an unset limit with a very high number, and a configured limit can be unset by echoing -1 into those files. But that very high number is implementation and architecture dependent and not very descriptive. And while -1 can be understood as an underflow into the highest possible value, -2 or -10M etc. do not work, so it's not inconsistent. memory.low, memory.high, and memory.max will use the string "infinity" to indicate and set the highest possible value. [akpm@linux-foundation.org: use seq_puts() for basic strings] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Greg Thelen <gthelen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11mm: page_counter: pull "-1" handling out of page_counter_memparse()Johannes Weiner
The unified hierarchy interface for memory cgroups will no longer use "-1" to mean maximum possible resource value. In preparation for this, make the string an argument and let the caller supply it. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Greg Thelen <gthelen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11vmscan: force scan offline memory cgroupsVladimir Davydov
Since commit b2052564e66d ("mm: memcontrol: continue cache reclaim from offlined groups") pages charged to a memory cgroup are not reparented when the cgroup is removed. Instead, they are supposed to be reclaimed in a regular way, along with pages accounted to online memory cgroups. However, an lruvec of an offline memory cgroup will sooner or later get so small that it will be scanned only at low scan priorities (see get_scan_count()). Therefore, if there are enough reclaimable pages in big lruvecs, pages accounted to offline memory cgroups will never be scanned at all, wasting memory. Fix this by unconditionally forcing scanning dead lruvecs from kswapd. [akpm@linux-foundation.org: fix build] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11mm: microoptimize zonelist operationsVlastimil Babka
next_zones_zonelist() returns a zoneref pointer, as well as a zone pointer via extra parameter. Since the latter can be trivially obtained by dereferencing the former, the overhead of the extra parameter is unjustified. This patch thus removes the zone parameter from next_zones_zonelist(). Both callers happen to be in the same header file, so it's simple to add the zoneref dereference inline. We save some bytes of code size. add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-105 (-105) function old new delta nr_free_zone_pages 129 115 -14 __alloc_pages_nodemask 2300 2285 -15 get_page_from_freelist 2652 2576 -76 add/remove: 0/0 grow/shrink: 1/0 up/down: 10/0 (10) function old new delta try_to_compact_pages 569 579 +10 Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Minchan Kim <minchan@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11mm: reduce try_to_compact_pages parametersVlastimil Babka
Expand the usage of the struct alloc_context introduced in the previous patch also for calling try_to_compact_pages(), to reduce the number of its parameters. Since the function is in different compilation unit, we need to move alloc_context definition in the shared mm/internal.h header. With this change we get simpler code and small savings of code size and stack usage: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-27 (-27) function old new delta __alloc_pages_direct_compact 283 256 -27 add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-13 (-13) function old new delta try_to_compact_pages 582 569 -13 Stack usage of __alloc_pages_direct_compact goes from 24 to none (per scripts/checkstack.pl). Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Minchan Kim <minchan@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11mm/hugetlb: take page table lock in follow_huge_pmd()Naoya Horiguchi
We have a race condition between move_pages() and freeing hugepages, where move_pages() calls follow_page(FOLL_GET) for hugepages internally and tries to get its refcount without preventing concurrent freeing. This race crashes the kernel, so this patch fixes it by moving FOLL_GET code for hugepages into follow_huge_pmd() with taking the page table lock. This patch intentionally removes page==NULL check after pte_page. This is justified because pte_page() never returns NULL for any architectures or configurations. This patch changes the behavior of follow_huge_pmd() for tail pages and then tail pages can be pinned/returned. So the caller must be changed to properly handle the returned tail pages. We could have a choice to add the similar locking to follow_huge_(addr|pud) for consistency, but it's not necessary because currently these functions don't support FOLL_GET flag, so let's leave it for future development. Here is the reproducer: $ cat movepages.c #include <stdio.h> #include <stdlib.h> #include <numaif.h> #define ADDR_INPUT 0x700000000000UL #define HPS 0x200000 #define PS 0x1000 int main(int argc, char *argv[]) { int i; int nr_hp = strtol(argv[1], NULL, 0); int nr_p = nr_hp * HPS / PS; int ret; void **addrs; int *status; int *nodes; pid_t pid; pid = strtol(argv[2], NULL, 0); addrs = malloc(sizeof(char *) * nr_p + 1); status = malloc(sizeof(char *) * nr_p + 1); nodes = malloc(sizeof(char *) * nr_p + 1); while (1) { for (i = 0; i < nr_p; i++) { addrs[i] = (void *)ADDR_INPUT + i * PS; nodes[i] = 1; status[i] = 0; } ret = numa_move_pages(pid, nr_p, addrs, nodes, status, MPOL_MF_MOVE_ALL); if (ret == -1) err("move_pages"); for (i = 0; i < nr_p; i++) { addrs[i] = (void *)ADDR_INPUT + i * PS; nodes[i] = 0; status[i] = 0; } ret = numa_move_pages(pid, nr_p, addrs, nodes, status, MPOL_MF_MOVE_ALL); if (ret == -1) err("move_pages"); } return 0; } $ cat hugepage.c #include <stdio.h> #include <sys/mman.h> #include <string.h> #define ADDR_INPUT 0x700000000000UL #define HPS 0x200000 int main(int argc, char *argv[]) { int nr_hp = strtol(argv[1], NULL, 0); char *p; while (1) { p = mmap((void *)ADDR_INPUT, nr_hp * HPS, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0); if (p != (void *)ADDR_INPUT) { perror("mmap"); break; } memset(p, 0, nr_hp * HPS); munmap(p, nr_hp * HPS); } } $ sysctl vm.nr_hugepages=40 $ ./hugepage 10 & $ ./movepages 10 $(pgrep -f hugepage) Fixes: e632a938d914 ("mm: migrate: add hugepage migration code to move_pages()") Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by: Hugh Dickins <hughd@google.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Steve Capper <steve.capper@linaro.org> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11mm: fix typo of MIGRATE_RESERVE in commentBaoquan He
Found it when I want to jump to the definition of MIGRATE_RESERVE ctags. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11mm: memcontrol: track move_lock state internallyJohannes Weiner
The complexity of memcg page stat synchronization is currently leaking into the callsites, forcing them to keep track of the move_lock state and the IRQ flags. Simplify the API by tracking it in the memcg. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11swap: remove unused mem_cgroup_uncharge_swapcache declarationVladimir Davydov
The body of this function was removed by commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API"). Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11mm:add KPF_ZERO_PAGE flag for /proc/kpageflagsWang, Yalin
Add KPF_ZERO_PAGE flag for zero_page, so that userspace processes can detect zero_page in /proc/kpageflags, and then do memory analysis more accurately. Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11mm: add VM_BUG_ON_PAGE() to page_mapcount()Wang, Yalin
Add VM_BUG_ON_PAGE() for slab pages. _mapcount is an union with slab struct in struct page, so we must avoid accessing _mapcount if this page is a slab page. Also remove the unneeded bracket. Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11mm: add fields for compound destructor and order into struct pageKirill A. Shutemov
Currently, we use lru.next/lru.prev plus cast to access or set destructor and order of compound page. Let's replace it with explicit fields in struct page. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Jerome Marchand <jmarchan@redhat.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11f2fs: introduce macros to convert bytes and blocks in f2fsJaegeuk Kim
This patch adds two macros for transition between byte and block offsets. Currently, f2fs only supports 4KB blocks, so use the default size for now. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11f2fs: split UMOUNT and FASTBOOT flagsJaegeuk Kim
This patch adds FASTBOOT flag into checkpoint as follows. - CP_UMOUNT_FLAG is set when system is umounted. - CP_FASTBOOT_FLAG is set when intermediate checkpoint having node summaries was done. So, if you get CP_UMOUNT_FLAG from checkpoint, the system was umounted cleanly. Instead, if there was sudden-power-off, you can get CP_FASTBOOT_FLAG or nothing. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-11net: Infrastructure for CHECKSUM_PARTIAL with remote checsum offloadTom Herbert
This patch adds infrastructure so that remote checksum offload can set CHECKSUM_PARTIAL instead of calling csum_partial and writing the modfied checksum field. Add skb_remcsum_adjust_partial function to set an skb for using CHECKSUM_PARTIAL with remote checksum offload. Changed skb_remcsum_process and skb_gro_remcsum_process to take a boolean argument to indicate if checksum partial can be set or the checksum needs to be modified using the normal algorithm. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11net: Use more bit fields in napi_gro_cbTom Herbert
This patch moves the free and same_flow fields to be bit fields (2 and 1 bit sized respectively). This frees up some space for u16's. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11net: Clarify meaning of CHECKSUM_PARTIAL for receive pathTom Herbert
The current meaning of CHECKSUM_PARTIAL for validating checksums is that _all_ checksums in the packet are considered valid. However, in the manner that CHECKSUM_PARTIAL is set only the checksum at csum_start+csum_offset and any preceding checksums may be considered valid. If there are checksums in the packet after csum_offset it is possible they have not been verfied. This patch changes CHECKSUM_PARTIAL logic in skb_csum_unnecessary and __skb_gro_checksum_validate_needed to only considered checksums referring to csum_start and any preceding checksums (with starting offset before csum_start) to be verified. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11net: Fix remcsum in GRO path to not change packetTom Herbert
Remote checksum offload processing is currently the same for both the GRO and non-GRO path. When the remote checksum offload option is encountered, the checksum field referred to is modified in the packet. So in the GRO case, the packet is modified in the GRO path and then the operation is skipped when the packet goes through the normal path based on skb->remcsum_offload. There is a problem in that the packet may be modified in the GRO path, but then forwarded off host still containing the remote checksum option. A remote host will again perform RCO but now the checksum verification will fail since GRO RCO already modified the checksum. To fix this, we ensure that GRO restores a packet to it's original state before returning. In this model, when GRO processes a remote checksum option it still changes the checksum per the algorithm but on return from lower layer processing the checksum is restored to its original value. In this patch we add define gro_remcsum structure which is passed to skb_gro_remcsum_process to save offset and delta for the checksum being changed. After lower layer processing, skb_gro_remcsum_cleanup is called to restore the checksum before returning from GRO. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11treewide: Remove unnecessary SSB_DEVTABLE_END macroJoe Perches
Use the normal {} instead of a macro to terminate an array. Remove the macro too. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11treewide: Remove unnecessary BCMA_CORETABLE_END macroJoe Perches
Use the normal {} instead of a macro to terminate an array. Remove the macro too. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11Merge tag 'pinctrl-v3.20-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pincontrol updates from Linus Walleij: :This is the bulk of pin control changes for the v3.20 cycle: Framework changes and enhancements: - Passing -DDEBUG recursively to subdir drivers so we get debug messages properly turned on. - Infer map type from DT property in the groups parsing code in the generic pinconfig code. - Support for custom parameter passing in generic pin config. This is used when you are using the generic pin config, but want to add a few custom properties that no other driver will use. New drivers: - Driver for the Xilinx Zynq - Driver for the AmLogic Meson SoCs New features in drivers: - Sleep support (suspend/resume) for the Cherryview driver - mvebeu a38x can now mux a UART on pins MPP19 and MPP20 - Migrated the qualcomm driver to generic pin config handling of extended config options in the core code. - Support BUS1 and AUDIO in the Exynos pin controller. - Add some missing functions in the sun6i driver. - Add support for the A31S variant in the sun6i driver. - EMEv2 support in the Renesas PFC driver. - Add support for Qualcomm MSM8916 in the qcom driver. Deleted features - Drop support for the SiRF Marco that was never released to the market. - Drop SH7372 support as the support for this platform is removed from the kernel" * tag 'pinctrl-v3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (40 commits) sh-pfc: emev2 - Fix mangled author name pinctrl: cherryview: Configure HiZ pins to be input when requested as GPIOs pinctrl: imx25: fix numbering for pins pinctrl: pinctrl-imx: don't use invalid value of conf_reg pinctrl: qcom: delete pin_config_get/set pinconf operations pinctrl: qcom: Add msm8916 pinctrl driver DT: pinctrl: Document Qualcomm MSM8916 pinctrl binding pinctrl: qcom: increase variable size for register offsets pinctrl: hide PCONFDUMP in #ifdef pinctrl: rockchip: Only mask interrupts; never disable pinctrl: zynq: Fix usb0 pins pinctrl: sh-pfc: sh7372: Remove DT binding documentation pinctrl: sh-pfc: sh7372: Remove PFC support sh-pfc: Add emev2 pinmux support sh-pfc: add macro to define pinmux without function pinctrl: add driver for Amlogic Meson SoCs staging: drivers: pinctrl: Fixed checkpatch.pl warnings pinctrl: exynos: Add AUDIO pin controller for exynos7 sh-pfc: r8a7790: add MLB+ pin group sh-pfc: r8a7791: add MLB+ pin group ...
2015-02-11Merge tag 'gpio-v3.20-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO changes from Linus Walleij: "This is the GPIO bulk changes for the v3.20 series: GPIOLIB core changes: - Create and use of_mm_gpiochip_remove() for removing memory-mapped OF GPIO chips - GPIO MMIO library suppports bgpio_set_multiple for switching several lines at once, a feature merged in the last cycle. New drivers: - New driver for the APM X-gene standby GPIO controller - New driver for the Fujitsu MB86S7x GPIO controller Cleanups: - Moved rcar driver to use gpiolib irqchip - Moxart converted to the GPIO MMIO library - GE driver converted to GPIO MMIO library - Move sx150x to irqdomain - Move max732x to irqdomain - Move vx855 to use managed resources - Move dwapb to use managed resources - Clean tc3589x from platform data - Clean stmpe driver to use device tree only probe New subtypes: - sx1506 support in the sx150x driver - Quark 1000 SoC support in the SCH driver - Support X86 in the Xilinx driver - Support PXA1928 in the PXA driver Extended drivers: - max732x supports device tree probe - sx150x supports device tree probe Various minor cleanups and bug fixes" * tag 'gpio-v3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (61 commits) gpio: kconfig: replace PPC_OF with PPC gpio: pxa: add PXA1928 gpio type support dt/bindings: gpio: add compatible string for marvell,pxa1928-gpio gpio: pxa: remove mach IRQ includes gpio: max732x: use an inline function for container cast gpio: use sizeof() instead of hardcoded values gpio: max732x: add set_multiple function gpio: sch: Consolidate similar algorithms gpio: tz1090-pdc: Use resource_size to fix off-by-one resource size calculation gpio: ge: Convert to use devm_kstrdup gpio: correctly use const char * const gpio: sx150x: fixup OF support gpio: mpc8xxx: Use of_mm_gpiochip_remove gpio: Add Fujitsu MB86S7x GPIO driver gpio: mpc8xxx: Convert to platform device interface. gpio: zevio: Use of_mm_gpiochip_remove gpio: gpio-mm-lantiq: Use of_mm_gpiochip_remove gpio: gpio-mm-lantiq: Use of_property_read_u32 gpio: gpio-mm-lantiq: Do not replicate code gpio :gpio-mm-lantiq: Use devm_kzalloc ...
2015-02-11Merge tag 'mmc-v3.20-1' of git://git.linaro.org/people/ulf.hansson/mmcLinus Torvalds
Pull MMC updates from Ulf Hansson: "MMC core: - Support for MMC power sequences. - SDIO function devicetree subnode parsing. - Refactor the hardware reset routines and enable it for SD cards. - Various code quality improvements, especially for slot-gpio. MMC host: - dw_mmc: Various fixes and cleanups. - dw_mmc: Convert to mmc_send_tuning(). - moxart: Fix probe logic. - sdhci: Various fixes and cleanups - sdhci: Asynchronous request handling support. - sdhci-pxav3: Various fixes and cleanups. - sdhci-tegra: Fixes for T114, T124 and T132. - rtsx: Various fixes and cleanups. - rtsx: Support for SDIO. - sdhi/tmio: Refactor and cleanup of header files. - omap_hsmmc: Use slot-gpio and common MMC DT parser. - Make all hosts to deal with errors from mmc_of_parse(). - sunxi: Various fixes and cleanups. - sdhci: Support for Fujitsu SDHCI controller f_sdh30" * tag 'mmc-v3.20-1' of git://git.linaro.org/people/ulf.hansson/mmc: (117 commits) mmc: sdhci-s3c: solve problem with sleeping in atomic context mmc: pwrseq: add driver for emmc hardware reset mmc: moxart: fix probe logic mmc: core: Invoke mmc_pwrseq_post_power_on() prior MMC_POWER_ON state mmc: pwrseq_simple: Add optional reference clock support mmc: pwrseq: Document optional clock for the simple power sequence mmc: pwrseq_simple: Extend to support more pins mmc: pwrseq: Document that simple sequence support more than one GPIO mmc: Add hardware dependencies for sdhci-pxav3 and sdhci-pxav2 mmc: sdhci-pxav3: Modify clock settings for the SDR50 and DDR50 modes mmc: sdhci-pxav3: Extend binding with SDIO3 conf reg for the Armada 38x mmc: sdhci-pxav3: Fix Armada 38x controller's caps according to erratum ERR-7878951 mmc: sdhci-pxav3: Fix SDR50 and DDR50 capabilities for the Armada 38x flavor mmc: sdhci: switch voltage before sdhci_set_ios in runtime resume mmc: tegra: Write xfer_mode, CMD regs in together mmc: Resolve BKOPS compatability issue mmc: sdhci-pxav3: fix setting of pdata->clk_delay_cycles mmc: dw_mmc: rockchip: remove incorrect __exit_p() mmc: dw_mmc: exynos: remove incorrect __exit_p() mmc: Fix menuconfig alignment of MMC_SDHCI_* options ...
2015-02-11Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull first round of SCSI updates from James Bottomley: "This is the usual grab bag of driver updates (hpsa, storvsc, mp2sas, megaraid_sas, ses) plus an assortment of minor updates. There's also an update to ufs which adds new phy drivers and finally a new logging infrastructure for SCSI" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (114 commits) scsi_logging: return void for dev_printk() functions scsi: print single-character strings with seq_putc scsi: merge consecutive seq_puts calls scsi: replace seq_printf with seq_puts aha152x: replace seq_printf with seq_puts advansys: replace seq_printf with seq_puts scsi: remove SPRINTF macro sg: remove an unused variable hpsa: Use local workqueues instead of system workqueues hpsa: add in P840ar controller model name hpsa: add in gen9 controller model names hpsa: detect and report failures changing controller transport modes hpsa: shorten the wait for the CISS doorbell mode change ack hpsa: refactor duplicated scan completion code into a new routine hpsa: move SG descriptor set-up out of hpsa_scatter_gather() hpsa: do not use function pointers in fast path command submission hpsa: print CDBs instead of kernel virtual addresses for uncommon errors hpsa: do not use a void pointer for scsi_cmd field of struct CommandList hpsa: return failed from device reset/abort handlers hpsa: check for ctlr lockup after command allocation in main io path ...
2015-02-11block: remove unused function blk_bio_map_sgChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-02-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "The first round of updates for the input subsystem. A few new drivers (power button handler for AXP20x PMIC, tps65218 power button driver, sun4i keys driver, regulator haptic driver, NI Ettus Research USRP E3x0 button, Alwinner A10/A20 PS/2 controller). Updates to Synaptics and ALPS touchpad drivers (with more to come later), brand new Focaltech PS/2 support, update to Cypress driver to handle Gen5 (in addition to Gen3) devices, and number of other fixups to various drivers as well as input core" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (54 commits) Input: elan_i2c - fix wrong %p extension Input: evdev - do not queue SYN_DROPPED if queue is empty Input: gscps2 - fix MODULE_DEVICE_TABLE invocation Input: synaptics - use dmax in input_mt_assign_slots Input: pxa27x_keypad - remove unnecessary ARM includes Input: ti_am335x_tsc - replace delta filtering with median filtering ARM: dts: AM335x: Make charge delay a DT parameter for TSC Input: ti_am335x_tsc - read charge delay from DT Input: ti_am335x_tsc - remove udelay in interrupt handler Input: ti_am335x_tsc - interchange touchscreen and ADC steps Input: MT - add support for balanced slot assignment Input: drv2667 - remove wrong and unneeded drv2667-haptics modalias Input: drv260x - remove wrong and unneeded drv260x-haptics modalias Input: cap11xx - remove wrong and unneeded cap11xx modalias Input: sun4i-ts - add support for touchpanel controller on A31 Input: serio - add support for Alwinner A10/A20 PS/2 controller Input: gtco - use sign_extend32() for sign extension Input: elan_i2c - verify firmware signature applying it Input: elantech - remove stale comment from Kconfig Input: cyapa - off by one in cyapa_update_fw_store() ...
2015-02-11Merge tag 'fbdev-3.20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux Pull fbdev changes from Tomi Valkeinen: - omapdss: add DRA7xxx SoC support - fbdev: support DMT (Display Monitor Timing) calculation * tag 'fbdev-3.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (40 commits) omapfb: Return error code when applying overlay settings fails OMAPDSS: DPI: DRA7xx support OMAPDSS: HDMI: Add DRA7xx support OMAPDSS: DISPC: program dispc polarities to control module OMAPDSS: DISPC: Add DRA7xx support OMAPDSS: Add Video PLLs for DRA7xx OMAPDSS: Add functions for external control of PLL OMAPDSS: DSS: Add DRA7xx base support Doc/DT: Add DT binding doc for DRA7xx DSS OMAPDSS: add define for DRA7xx HW version OMAPDSS: encoder-tpd12s015: Fix race issue with LS_OE OMAPDSS: OMAP5: fix digit output's allowed mgrs OMAPDSS: constify port arrays OMAPDSS: PLL: add dss_pll_wait_reset_done() OMAPDSS: Add enum dss_pll_id video: fbdev: fix sys_copyarea video/mmpfb: allow modular build fb: via: turn gpiolib and i2c selects into dependencies fbdev: ssd1307fb: return proper error code if write command fails fbdev: fix CVT vertical front and back porch values ...
2015-02-11Merge tag 'media/v3.20-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - Some documentation updates and a few new pixel formats - Stop btcx-risc abuse by cx88 and move it to bt8xx driver - New platform driver: am437x - New webcam driver: toptek - New remote controller hardware protocols added to img-ir driver - Removal of a few very old drivers that relies on old kABIs and are for very hard to find hardware: parallel port webcam drivers (bw-qcam, c-cam, pms and w9966), tlg2300, Video In/Out for SGI (vino) - Removal of the USB Telegent driver (tlg2300). The company that developed this driver has long gone and the hardware is hard to find. As it relies on a legacy set of kABI symbols and nobody seems to care about it, remove it. - several improvements at rtl2832 driver - conversion on cx28521 and au0828 to use videobuf2 (VB2) - several improvements, fixups and board additions * tag 'media/v3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (321 commits) [media] dvb_net: Convert local hex dump to print_hex_dump_debug [media] dvb_net: Use standard debugging facilities [media] dvb_net: Use vsprintf %pM extension to print Ethernet addresses [media] staging: lirc_serial: adjust boolean assignments [media] stb0899: use sign_extend32() for sign extension [media] si2168: add support for 1.7MHz bandwidth [media] si2168: return error if set_frontend is called with invalid parameters [media] lirc_dev: avoid potential null-dereference [media] mn88472: simplify bandwidth registers setting code [media] dvb: tc90522: re-add symbol-rate report [media] lmedm04: add read snr, signal strength and ber call backs [media] lmedm04: Create frontend call back for read status [media] lmedm04: create frontend callbacks for signal/snr/ber/ucblocks [media] lmedm04: Fix usb_submit_urb BOGUS urb xfer, pipe 1 != type 3 in interrupt urb [media] lmedm04: Increase Interupt due time to 200 msec [media] cx88-dvb: whitespace cleanup [media] rtl28xxu: properly initialize pdata [media] rtl2832: declare functions as static [media] rtl2830: declare functions as static [media] rtl2832_sdr: add kernel-doc comments for platform_data ...
2015-02-11Merge tag 'for-v3.20' of git://git.infradead.org/battery-2.6Linus Torvalds
Pull power supply and reset changes from Sebastian Reichel: "New drivers: - charger driver for Maxim 77693 - battery gauge driver for LTC 2941/2943 - battery gauge driver for RT5033 - reset driver for R-Mobile platforms Convert drivers to restart handler framework: - arm-versatile - at91 - st-poweroff Misc: - remove deprecated sun6i reboot driver - use alarmtimer instead of rtc in charger-manager - misc fixes" * tag 'for-v3.20' of git://git.infradead.org/battery-2.6: (48 commits) power_supply: 88pm860x: Fix leaked power supply on probe fail power/reset: restart-poweroff: Remove arm dependencies power/reset: st-poweroff: Fix misleading Kconfig description power/reset: st-poweroff: Register with kernel restart handler power/reset: Remove sun6i reboot driver power/reset: at91: Register with kernel restart handler power/reset: arm-versatile: Register with kernel restart handler power: test_power: Use enum as index for array of supplies Add devicetree binding documentation for the LTC2941/LTC2943 driver Add LTC2941/LTC2943 Battery Gauge Driver power/reset: brcmstb: Add support for old 65nm chips power/reset: brcmstb: Use the DT "compatible" string to indicate bit positions power/reset: brcmstb: Make the driver buildable on MIPS power: charger-manager: Use alarmtimer for battery monitoring in suspend. power/reset: at91-poweroff: Fix error handling and other compiler warnings bq27x00_battery: Call power_supply_changed only when capacity changed bq27x00_battery: fix register offset for bq27425 power: max14577: Remove SYSFS dependency from Kconfig power: bq24190_charger: suppress build warning power: reset: Add reset driver for R-Mobile platforms ...
2015-02-11lguest: remove NOTIFY call and eventfd facility.Rusty Russell
Disappointing, as this was kind of neat (especially getting to use RCU to manage the address -> eventfd mapping). But now the devices are PCI handled in userspace, we get rid of both the NOTIFY hypercall and the interface to connect an eventfd. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11lguest: remove lguest bus definitions from header.Rusty Russell
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11lguest: add infrastructure for userspace to deliver a trap to the guest.Rusty Russell
This is required for instruction emulation to move to userspace. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11lguest: write more information to userspace about pending traps.Rusty Russell
This is preparation for userspace handling MMIO and ioport accesses. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-11lguest: add operations to get/set a register from the Launcher.Rusty Russell
We use the ptrace API struct, and we currently don't let them set anything but the normal registers (we'd have to filter the others). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) More iov_iter conversion work from Al Viro. [ The "crypto: switch af_alg_make_sg() to iov_iter" commit was wrong, and this pull actually adds an extra commit on top of the branch I'm pulling to fix that up, so that the pre-merge state is ok. - Linus ] 2) Various optimizations to the ipv4 forwarding information base trie lookup implementation. From Alexander Duyck. 3) Remove sock_iocb altogether, from CHristoph Hellwig. 4) Allow congestion control algorithm selection via routing metrics. From Daniel Borkmann. 5) Make ipv4 uncached route list per-cpu, from Eric Dumazet. 6) Handle rfs hash collisions more gracefully, also from Eric Dumazet. 7) Add xmit_more support to r8169, e1000, and e1000e drivers. From Florian Westphal. 8) Transparent Ethernet Bridging support for GRO, from Jesse Gross. 9) Add BPF packet actions to packet scheduler, from Jiri Pirko. 10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer. 11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman Kwok. 12) More sanely handle out-of-window dupacks, which can result in serious ACK storms. From Neal Cardwell. 13) Various rhashtable bug fixes and enhancements, from Herbert Xu, Patrick McHardy, and Thomas Graf. 14) Support xmit_more in be2net, from Sathya Perla. 15) Group Policy extensions for vxlan, from Thomas Graf. 16) Remove Checksum Offload support for vxlan, from Tom Herbert. 17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From Vlad Yasevich. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits) crypto: fix af_alg_make_sg() conversion to iov_iter ipv4: Namespecify TCP PMTU mechanism i40e: Fix for stats init function call in Rx setup tcp: don't include Fast Open option in SYN-ACK on pure SYN-data openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set ipv6: Make __ipv6_select_ident static ipv6: Fix fragment id assignment on LE arches. bridge: Fix inability to add non-vlan fdb entry net: Mellanox: Delete unnecessary checks before the function call "vunmap" cxgb4: Add support in cxgb4 to get expansion rom version via ethtool ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version net: dsa: Remove redundant phy_attach() IB/mlx4: Reset flow support for IB kernel ULPs IB/mlx4: Always use the correct port for mirrored multicast attachments net/bonding: Fix potential bad memory access during bonding events tipc: remove tipc_snprintf tipc: nl compat add noop and remove legacy nl framework tipc: convert legacy nl stats show to nl compat tipc: convert legacy nl net id get to nl compat tipc: convert legacy nl net id set to nl compat ...
2015-02-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull live patching infrastructure from Jiri Kosina: "Let me provide a bit of history first, before describing what is in this pile. Originally, there was kSplice as a standalone project that implemented stop_machine()-based patching for the linux kernel. This project got later acquired, and the current owner is providing live patching as a proprietary service, without any intentions to have their implementation merged. Then, due to rising user/customer demand, both Red Hat and SUSE started working on their own implementation (not knowing about each other), and announced first versions roughly at the same time [1] [2]. The principle difference between the two solutions is how they are making sure that the patching is performed in a consistent way when it comes to different execution threads with respect to the semantic nature of the change that is being introduced. In a nutshell, kPatch is issuing stop_machine(), then looking at stacks of all existing processess, and if it decides that the system is in a state that can be patched safely, it proceeds insterting code redirection machinery to the patched functions. On the other hand, kGraft provides a per-thread consistency during one single pass of a process through the kernel and performs a lazy contignuous migration of threads from "unpatched" universe to the "patched" one at safe checkpoints. If interested in a more detailed discussion about the consistency models and its possible combinations, please see the thread that evolved around [3]. It pretty quickly became obvious to the interested parties that it's absolutely impractical in this case to have several isolated solutions for one task to co-exist in the kernel. During a dedicated Live Kernel Patching track at LPC in Dusseldorf, all the interested parties sat together and came up with a joint aproach that would work for both distro vendors. Steven Rostedt took notes [4] from this meeting. And the foundation for that aproach is what's present in this pull request. It provides a basic infrastructure for function "live patching" (i.e. code redirection), including API for kernel modules containing the actual patches, and API/ABI for userspace to be able to operate on the patches (look up what patches are applied, enable/disable them, etc). It's relatively simple and minimalistic, as it's making use of existing kernel infrastructure (namely ftrace) as much as possible. It's also self-contained, in a sense that it doesn't hook itself in any other kernel subsystem (it doesn't even touch any other code). It's now implemented for x86 only as a reference architecture, but support for powerpc, s390 and arm is already in the works (adding arch-specific support basically boils down to teaching ftrace about regs-saving). Once this common infrastructure gets merged, both Red Hat and SUSE have agreed to immediately start porting their current solutions on top of this, abandoning their out-of-tree code. The plan basically is that each patch will be marked by flag(s) that would indicate which consistency model it is willing to use (again, the details have been sketched out already in the thread at [3]). Before this happens, the current codebase can be used to patch a large group of secruity/stability problems the patches for which are not too complex (in a sense that they don't introduce non-trivial change of function's return value semantics, they don't change layout of data structures, etc) -- this corresponds to LEAVE_FUNCTION && SWITCH_FUNCTION semantics described at [3]. This tree has been in linux-next since December. [1] https://lkml.org/lkml/2014/4/30/477 [2] https://lkml.org/lkml/2014/7/14/857 [3] https://lkml.org/lkml/2014/11/7/354 [4] http://linuxplumbersconf.org/2014/wp-content/uploads/2014/10/LPC2014_LivePatching.txt [ The core code is introduced by the three commits authored by Seth Jennings, which got a lot of changes incorporated during numerous respins and reviews of the initial implementation. All the followup commits have materialized only after public tree has been created, so they were not folded into initial three commits so that the public tree doesn't get rebased ]" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: add missing newline to error message livepatch: rename config to CONFIG_LIVEPATCH livepatch: fix uninitialized return value livepatch: support for repatching a function livepatch: enforce patch stacking semantics livepatch: change ARCH_HAVE_LIVE_PATCHING to HAVE_LIVE_PATCHING livepatch: fix deferred module patching order livepatch: handle ancient compilers with more grace livepatch: kconfig: use bool instead of boolean livepatch: samples: fix usage example comments livepatch: MAINTAINERS: add git tree location livepatch: use FTRACE_OPS_FL_IPMODIFY livepatch: move x86 specific ftrace handler code to arch/x86 livepatch: samples: add sample live patching module livepatch: kernel: add support for live patching livepatch: kernel: add TAINT_LIVEPATCH
2015-02-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID updates from Jiri Kosina: "Updates for HID code - improveements of Logitech HID++ procotol implementation, from Benjamin Tissoires - support for composite RMI devices, from Andrew Duggan - new driver for BETOP controller, from Huang Bo - fixup for conflicting mapping in HID core between PC-101/103/104 and PC-102/105 keyboards from David Herrmann - new hardware support and fixes in Wacom driver, from Ping Cheng - assorted small fixes and device ID additions all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (33 commits) HID: wacom: add support for Cintiq 27QHD and 27QHD touch HID: wacom: consolidate input capability settings for pen and touch HID: wacom: make sure touch arbitration is applied consistently HID: pidff: Fix initialisation forMicrosoft Sidewinder FF Pro 2 HID: hyperv: match wait_for_completion_timeout return type HID: wacom: Report ABS_MISC event for Cintiq Companion Hybrid HID: Use Kbuild idiom in Makefiles HID: do not bind to Microchip Pick16F1454 HID: hid-lg4ff: use DEVICE_ATTR_RW macro HID: hid-lg4ff: fix sysfs attribute permission HID: wacom: peport In Range event according to the spec HID: wacom: process invalid Cintiq and Intuos data in wacom_intuos_inout() HID: rmi: Add support for the touchpad in the Razer Blade 14 laptop HID: rmi: Support touchpads with external buttons HID: rmi: Use hid_report_len to compute the size of reports HID: logitech-hidpp: store the name of the device in struct hidpp HID: microsoft: add support for Japanese Surface Type Cover 3 HID: fixup the conflicting keyboard mappings quirk HID: apple: fix battery support for the 2009 ANSI wireless keyboard HID: fix Kconfig text ...
2015-02-10Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: "Bite-sized chunks this time, to avoid the MTA ratelimiting woes. - fs/notify updates - ocfs2 - some of MM" That laconic "some MM" is mainly the removal of remap_file_pages(), which is a big simplification of the VM, and which gets rid of a *lot* of random cruft and special cases because we no longer support the non-linear mappings that it used. From a user interface perspective, nothing has changed, because the remap_file_pages() syscall still exists, it's just done by emulating the old behavior by creating a lot of individual small mappings instead of one non-linear one. The emulation is slower than the old "native" non-linear mappings, but nobody really uses or cares about remap_file_pages(), and simplifying the VM is a big advantage. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (78 commits) memcg: zap memcg_slab_caches and memcg_slab_mutex memcg: zap memcg_name argument of memcg_create_kmem_cache memcg: zap __memcg_{charge,uncharge}_slab mm/page_alloc.c: place zone_id check before VM_BUG_ON_PAGE check mm: hugetlb: fix type of hugetlb_treat_as_movable variable mm, hugetlb: remove unnecessary lower bound on sysctl handlers"? mm: memory: merge shared-writable dirtying branches in do_wp_page() mm: memory: remove ->vm_file check on shared writable vmas xtensa: drop _PAGE_FILE and pte_file()-related helpers x86: drop _PAGE_FILE and pte_file()-related helpers unicore32: drop pte_file()-related helpers um: drop _PAGE_FILE and pte_file()-related helpers tile: drop pte_file()-related helpers sparc: drop pte_file()-related helpers sh: drop _PAGE_FILE and pte_file()-related helpers score: drop _PAGE_FILE and pte_file()-related helpers s390: drop pte_file()-related helpers parisc: drop _PAGE_FILE and pte_file()-related helpers openrisc: drop _PAGE_FILE and pte_file()-related helpers nios2: drop _PAGE_FILE and pte_file()-related helpers ...
2015-02-10Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota interface unification and misc cleanups from Jan Kara: "The first part of the series unifying XFS and VFS quota interfaces. This part unifies turning quotas on and off so quota-tools and xfs_quota can be used to manage any filesystem. This is useful so that userspace doesn't have to distinguish which filesystem it is working with. As a result we can then easily reuse tests for project quotas in XFS for ext4. This also contains minor cleanups and fixes for udf, isofs, and ext3" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (23 commits) udf: remove bool assignment to 0/1 udf: use bool for done quota: Store maximum space limit in bytes quota: Remove quota_on_meta callback ocfs2: Use generic helpers for quotaon and quotaoff ext4: Use generic helpers for quotaon and quotaoff quota: Add ->quota_{enable,disable} callbacks for VFS quotas quota: Wire up ->quota_{enable,disable} callbacks into Q_QUOTA{ON,OFF} quota: Split ->set_xstate callback into two xfs: Remove some pointless quota checks xfs: Remove some useless flags tests xfs: Remove useless test quota: Verify flags passed to Q_SETINFO quota: Cleanup flags definitions ocfs2: Move OLQF_CLEAN flag out of generic quota flags quota: Don't store flags for v2 quota format jbd: drop jbd_ENOSYS debug udf: destroy sbi mutex in put_super udf: Check length of extended attributes and allocation descriptors udf: Remove repeated loads blocksize ...
2015-02-10Merge tag 'locks-v3.20-1' of git://git.samba.org/jlayton/linuxLinus Torvalds
Pull file locking related changes #1 from Jeff Layton: "This patchset contains a fairly major overhaul of how file locks are tracked within the inode. Rather than a single list, we now create a per-inode "lock context" that contains individual lists for the file locks, and a new dedicated spinlock for them. There are changes in other trees that are based on top of this set so it may be easiest to pull this in early" * tag 'locks-v3.20-1' of git://git.samba.org/jlayton/linux: locks: update comments that refer to inode->i_flock locks: consolidate NULL i_flctx checks in locks_remove_file locks: keep a count of locks on the flctx lists locks: clean up the lm_change prototype locks: add a dedicated spinlock to protect i_flctx lists locks: remove i_flock field from struct inode locks: convert lease handling to file_lock_context locks: convert posix locks to file_lock_context locks: move flock locks to file_lock_context ceph: move spinlocking into ceph_encode_locks_to_buffer and ceph_count_locks locks: add a new struct file_locking_context pointer to struct inode locks: have locks_release_file use flock_lock_file to release generic flock locks locks: add new struct list_head to struct file_lock
2015-02-10Merge tag 'pm+acpi-3.20-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "We have a few new features this time, including a new SFI-based cpufreq driver, a new devfreq driver for Tegra Activity Monitor, a new devfreq class for providing its governors with raw utilization data and a new ACPI driver for AMD SoCs. Still, the majority of changes here are reworks of existing code to make it more straightforward or to prepare it for implementing new features on top of it. The primary example is the rework of ACPI resources handling from Jiang Liu, Thomas Gleixner and Lv Zheng with support for IOAPIC hotplug implemented on top of it, but there is quite a number of changes of this kind in the cpufreq core, ACPICA, ACPI EC driver, ACPI processor driver and the generic power domains core code too. The most active developer is Viresh Kumar with his cpufreq changes. Specifics: - Rework of the core ACPI resources parsing code to fix issues in it and make using resource offsets more convenient and consolidation of some resource-handing code in a couple of places that have grown analagous data structures and code to cover the the same gap in the core (Jiang Liu, Thomas Gleixner, Lv Zheng). - ACPI-based IOAPIC hotplug support on top of the resources handling rework (Jiang Liu, Yinghai Lu). - ACPICA update to upstream release 20150204 including an interrupt handling rework that allows drivers to install raw handlers for ACPI GPEs which then become entirely responsible for the given GPE and the ACPICA core code won't touch it (Lv Zheng, David E Box, Octavian Purdila). - ACPI EC driver rework to fix several concurrency issues and other problems related to events handling on top of the ACPICA's new support for raw GPE handlers (Lv Zheng). - New ACPI driver for AMD SoCs analogous to the LPSS (Low-Power Subsystem) driver for Intel chips (Ken Xue). - Two minor fixes of the ACPI LPSS driver (Heikki Krogerus, Jarkko Nikula). - Two new blacklist entries for machines (Samsung 730U3E/740U3E and 510R) where the native backlight interface doesn't work correctly while the ACPI one does (Hans de Goede). - Rework of the ACPI processor driver's handling of idle states to make the code more straightforward and less bloated overall (Rafael J Wysocki). - Assorted minor fixes related to ACPI and SFI (Andreas Ruprecht, Andy Shevchenko, Hanjun Guo, Jan Beulich, Rafael J Wysocki, Yaowei Bai). - PCI core power management modification to avoid resuming (some) runtime-suspended devices during system suspend if they are in the right states already (Rafael J Wysocki). - New SFI-based cpufreq driver for Intel platforms using SFI (Srinidhi Kasagar). - cpufreq core fixes, cleanups and simplifications (Viresh Kumar, Doug Anderson, Wolfram Sang). - SkyLake CPU support and other updates for the intel_pstate driver (Kristen Carlson Accardi, Srinivas Pandruvada). - cpufreq-dt driver cleanup (Markus Elfring). - Init fix for the ARM big.LITTLE cpuidle driver (Sudeep Holla). - Generic power domains core code fixes and cleanups (Ulf Hansson). - Operating Performance Points (OPP) core code cleanups and kernel documentation update (Nishanth Menon). - New dabugfs interface to make the list of PM QoS constraints available to user space (Nishanth Menon). - New devfreq driver for Tegra Activity Monitor (Tomeu Vizoso). - New devfreq class (devfreq_event) to provide raw utilization data to devfreq governors (Chanwoo Choi). - Assorted minor fixes and cleanups related to power management (Andreas Ruprecht, Krzysztof Kozlowski, Rickard Strandqvist, Pavel Machek, Todd E Brandt, Wonhong Kwon). - turbostat updates (Len Brown) and cpupower Makefile improvement (Sriram Raghunathan)" * tag 'pm+acpi-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (151 commits) tools/power turbostat: relax dependency on APERF_MSR tools/power turbostat: relax dependency on invariant TSC Merge branch 'pci/host-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into acpi-resources tools/power turbostat: decode MSR_*_PERF_LIMIT_REASONS tools/power turbostat: relax dependency on root permission ACPI / video: Add disable_native_backlight quirk for Samsung 510R ACPI / PM: Remove unneeded nested #ifdef USB / PM: Remove unneeded #ifdef and associated dead code intel_pstate: provide option to only use intel_pstate with HWP ACPI / EC: Add GPE reference counting debugging messages ACPI / EC: Add query flushing support ACPI / EC: Refine command storm prevention support ACPI / EC: Add command flushing support. ACPI / EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag ACPI: add AMD ACPI2Platform device support for x86 system ACPI / table: remove duplicate NULL check for the handler of acpi_table_parse() ACPI / EC: Update revision due to raw handler mode. ACPI / EC: Reduce ec_poll() by referencing the last register access timestamp. ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode. ACPICA: Events: Enable APIs to allow interrupt/polling adaptive request based GPE handling model ...
2015-02-10Merge tag 'pci-v3.20-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "Enumeration - Move domain assignment from arm64 to generic code (Lorenzo Pieralisi) - ARM: Remove artificial dependency on pci_sys_data domain (Lorenzo Pieralisi) - ARM: Move to generic PCI domains (Lorenzo Pieralisi) - Generate uppercase hex for modalias var in uevent (Ricardo Ribalda Delgado) - Add and use generic config accessors on ARM, PowerPC (Rob Herring) Resource management - Free resources on failure in of_pci_get_host_bridge_resources() (Lorenzo Pieralisi) - Fix infinite loop with ROM image of size 0 (Michel Dänzer) PCI device hotplug - Handle surprise add even if surprise removal isn't supported (Bjorn Helgaas) Virtualization - Mark AMD/ATI VGA devices that don't reset on D3hot->D0 transition (Alex Williamson) - Add DMA alias quirk for Adaptec 3405 (Alex Williamson) - Add Wellsburg (X99) to Intel PCH root port ACS quirk (Alex Williamson) - Add ACS quirk for Emulex NICs (Vasundhara Volam) MSI - Fail MSI-X mappings if there's no space assigned to MSI-X BAR (Yijing Wang) Freescale Layerscape host bridge driver - Fix platform_no_drv_owner.cocci warnings (Julia Lawall) NVIDIA Tegra host bridge driver - Remove unnecessary tegra_pcie_fixup_bridge() (Lucas Stach) Renesas R-Car host bridge driver - Fix error handling of irq_of_parse_and_map() (Dmitry Torokhov) TI Keystone host bridge driver - Fix error handling of irq_of_parse_and_map() (Dmitry Torokhov) - Fix misspelling of current function in debug output (Julia Lawall) Xilinx AXI host bridge driver - Fix harmless format string warning (Arnd Bergmann) Miscellaneous - Use standard parsing functions for ASPM sysfs setters (Chris J Arges) - Add pci_device_to_OF_node() stub for !CONFIG_OF (Kevin Hao) - Delete unnecessary NULL pointer checks (Markus Elfring) - Add and use defines for PCIe Max_Read_Request_Size (Rafał Miłecki) - Include clk.h instead of clk-private.h (Stephen Boyd)" * tag 'pci-v3.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits) PCI: Add pci_device_to_OF_node() stub for !CONFIG_OF PCI: xilinx: Convert to use generic config accessors PCI: xgene: Convert to use generic config accessors PCI: tegra: Convert to use generic config accessors PCI: rcar: Convert to use generic config accessors PCI: generic: Convert to use generic config accessors powerpc/powermac: Convert PCI to use generic config accessors powerpc/fsl_pci: Convert PCI to use generic config accessors ARM: ks8695: Convert PCI to use generic config accessors ARM: sa1100: Convert PCI to use generic config accessors ARM: integrator: Convert PCI to use generic config accessors PCI: versatile: Add DT-based ARM Versatile PB PCIe host driver ARM: dts: versatile: add PCI controller binding of/pci: Free resources on failure in of_pci_get_host_bridge_resources() PCI: versatile: Add DT docs for ARM Versatile PB PCIe driver PCI: Fail MSI-X mappings if there's no space assigned to MSI-X BAR r8169: use PCI define for Max_Read_Request_Size [SCSI] esas2r: use PCI define for Max_Read_Request_Size tile: use PCI define for Max_Read_Request_Size rapidio/tsi721: use PCI define for Max_Read_Request_Size ...
2015-02-10memcg: zap memcg_slab_caches and memcg_slab_mutexVladimir Davydov
mem_cgroup->memcg_slab_caches is a list of kmem caches corresponding to the given cgroup. Currently, it is only used on css free in order to destroy all caches corresponding to the memory cgroup being freed. The list is protected by memcg_slab_mutex. The mutex is also used to protect kmem_cache->memcg_params->memcg_caches arrays and synchronizes kmem_cache_destroy vs memcg_unregister_all_caches. However, we can perfectly get on without these two. To destroy all caches corresponding to a memory cgroup, we can walk over the global list of kmem caches, slab_caches, and we can do all the synchronization stuff using the slab_mutex instead of the memcg_slab_mutex. This patch therefore gets rid of the memcg_slab_caches and memcg_slab_mutex. Apart from this nice cleanup, it also: - assures that rcu_barrier() is called once at max when a root cache is destroyed or a memory cgroup is freed, no matter how many caches have SLAB_DESTROY_BY_RCU flag set; - fixes the race between kmem_cache_destroy and kmem_cache_create that exists, because memcg_cleanup_cache_params, which is called from kmem_cache_destroy after checking that kmem_cache->refcount=0, releases the slab_mutex, which gives kmem_cache_create a chance to make an alias to a cache doomed to be destroyed. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10memcg: zap memcg_name argument of memcg_create_kmem_cacheVladimir Davydov
Instead of passing the name of the memory cgroup which the cache is created for in the memcg_name_argument, let's obtain it immediately in memcg_create_kmem_cache. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10memcg: zap __memcg_{charge,uncharge}_slabVladimir Davydov
They are simple wrappers around memcg_{charge,uncharge}_kmem, so let's zap them and call these functions directly. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10mm: hugetlb: fix type of hugetlb_treat_as_movable variableAndrey Ryabinin
hugetlb_treat_as_movable declared as unsigned long, but proc_dointvec() used for parsing it: static struct ctl_table vm_table[] = { ... { .procname = "hugepages_treat_as_movable", .data = &hugepages_treat_as_movable, .maxlen = sizeof(int), .mode = 0644, .proc_handler = proc_dointvec, }, This seems harmless, but it's better to use int type here. Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Manfred Spraul <manfred@colorfullife.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10mm: remove rest usage of VM_NONLINEAR and pte_file()Kirill A. Shutemov
One bit in ->vm_flags is unused now! Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10mm: replace vma->sharead.linear with vma->sharedKirill A. Shutemov
After removing vma->shared.nonlinear we have only one member of vma->shared union, which doesn't make much sense. This patch drops the union and move struct vma->shared.linear to vma->shared. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10rmap: drop support of non-linear mappingsKirill A. Shutemov
We don't create non-linear mappings anymore. Let's drop code which handles them in rmap. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10mm: drop vm_ops->remap_pages and generic_file_remap_pages() stubKirill A. Shutemov
Nobody uses it anymore. [akpm@linux-foundation.org: fix filemap_xip.c] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>