summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2013-04-15ioatdma: skip silicon bug workaround for pq_align for cb3.3Dave Jiang
The alignment workaround is only necessary for cb3.2 or earlier platforms. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <djbw@fb.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15ioatdma: Removing PQ val disable for cb3.3Dave Jiang
The PQ Val ops work on the newer hardware so we should actually provide support for it and remove the disabling bits. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <djbw@fb.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15ioatdma: skip legacy reset bits since v3.3 plattform doesn't need itDave Jiang
Make it so only 3.2 and earlier platform need the PCI config register clearings since this implementation does not have the registers. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <djbw@fb.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15ioatdma: channel reset scheme fixup on Intel Atom S1200 platformsDave Jiang
The Intel Atom S1200 family ioatdma changed the channel reset behavior. It does a reset similar to PCI FLR by resetting all the MSIX registers. We have to re-init msix interrupts because of this. This workaround is only specific to this platform and is not expected to carry over to the later generations. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <djbw@fb.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15ioatdma: Add 64bit chansts register read for ioat v3.3.Dave Jiang
The channel status register for v3.3 is now 64bit. Use readq if available on v3.3 platforms. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <djbw@fb.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15ioatdma: Adding PCI IDs for Intel Atom S1200 product family ioatdma devicesDave Jiang
These should be good for the IOAT DMA devices on the Intel Atom S1269, S1279, and S1289 platforms. We are also adding IOAT v3.3 definition for the new DMA engine. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <djbw@fb.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15ioatdma: Adding Haswell devid for ioatdmaDave Jiang
Adding Haswell PCI device IDs for ioatdma and simplify the detection of certain Xeon CPUs that has alignment bugs so that modifications can be changed at a single place going forward. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <djbw@fb.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmaengine: OMAP: Register SDMA controller with Device Tree DMA driverJon Hunter
If the device-tree blob is present during boot, then register the SDMA controller with the device-tree DMA driver so that we can use device-tree to look-up DMA client information. Signed-off-by: Jon Hunter <jon-hunter@ti.com> Reviewed-by: Felipe Balbi <balbi@ti.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dw_dmac: remove unnecessary ENODEV checkAndy Shevchenko
If CONFIG_OF is not set the of_node of the device will always be NULL. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmaengine: dw_dmac: simplify master selectionArnd Bergmann
The patch to add the common DMA binding added a dummy dw_dma_slave structure into the dw_dma_chan structure in order to configure the masters correctly. It turns out that this can be simplified if we pick the DMA masters in the dwc_alloc_chan_resources function instead and save them in the dw_dma_chan structure directly. This could be simplified further once all users that today use dw_dma_slave for configuration get converted to device tree based setup instead. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dw_dmac: rename DT related methods to reflect their belongingAndy Shevchenko
Since we will have not only DT cases in future let's rename DT related methods to reflect their belonging. The rename was done as follows: struct dw_dma_filter_args -> struct dw_dma_of_filter_args dw_dma_generic_filter() -> dw_dma_of_filter() dw_dma_xlate() -> dw_dma_of_xlate() dw_dma_id_table -> dw_dma_of_id_table There is no functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dw_dmac: fix style of the commentsAndy Shevchenko
Let's use capital letter as a first one in the comments. There is no functional changes. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dma: Make the 'mask' parameter of __dma_request_channel constLars-Peter Clausen
The 'mask' parameter is not modified in __dma_request_channel and really shouldn't be. Make this explicit by making the parameter const. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmaengine:sirf:take clock and enable it while probingBarry Song
there is hardcode which enabled the clock of dmaengine before, this patch takes the clock by standard clock API and enable it in probe. Signed-off-by: Barry Song <Baohua.Song@csr.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dw_dmac: don't wait for FIFO_EMPTY endlessly in dwc_chan_pauseAndy Shevchenko
When we pause the channel after transfer is completed we might stuck in the dwc_chan_pause() because the FIFO_EMPTY flag will never be asserted. To avoid the endless loop we introduce a timeout here (*). The proper solution is to somehow get the residue in FIFO and avoid busyloop when transfer is done, but this task is not simple and fast. Unfortunately we can't use cpu_relax() in conjunction with jiffies checker, due to we have interrupts disabled by spin_lock_irqsave() and there is a big chance that no interrupts will come to update the jiffies.. (*) The worst case is AHB write * FIFO size / hclk = 5.12 us, where AHB write = 2 cycles, hclk = 100 MHz, burst size = 1 byte, FIFO size = 256 bytes. The proposed 40us timeout might be considered as a big one, though we enter to that state only when we have the transfer already completed. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dma: imx-dma: Remove redundant NULL check before kfreeSyam Sidhardhan
kfree on NULL pointer is a no-op. Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dma: ipu: ipu_idmac: Fix section mismatchFabio Estevam
Since commit 84c1e63c12 (dma: Remove erroneous __exit and __exit_p() references) the following section mismatch happens: WARNING: drivers/built-in.o(.text+0x20f94): Section mismatch in reference from the function ipu_remove() to the function .exit.text:ipu_idmac_exit() The function ipu_remove() references a function in an exit section. Often the function ipu_idmac_exit() has valid usage outside the exit section and the fix is to remove the __exit annotation of ipu_idmac_exit. Remove the '__exit' annotation from ipu_idmac_exit in order to fix it. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Acked-by: Maxin B. John <maxin.john@enea.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dma: tegra: assume CONFIG_OFStephen Warren
Tegra only supports, and always enables, device tree. Remove all ifdefs and runtime checks for DT support from the driver. Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dma: pl330: Convert to devm_ioremap_resource()Sachin Kamat
Use the newly introduced devm_ioremap_resource() instead of devm_request_and_ioremap() which provides more consistent error handling. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Reviewed-by: Thierry Reding <thierry.reding@avionic-design.de> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: append verify result to resultsAndy Shevchenko
Comparison between buffers is stored to the dedicated structure. Note that the verify result is now accessible only via file 'results' in the debugfs. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: gather test results in the linked listAndy Shevchenko
The patch provides a storage for the test results in the linked list. The gathered data could be used after test is done. The new file 'results' represents gathered data of the in progress test. The messages collected are printed to the kernel log as well. Example of output: % cat /sys/kernel/debug/dmatest/results dma0chan0-copy0: #1: No errors with src_off=0x7bf dst_off=0x8ad len=0x3fea (0) The message format is unified across the different types of errors. A number in the parens represents additional information, e.g. error code, error counter, or status. Note that the buffer comparison is done in the old way, i.e. data is not collected and just printed out. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: define MAX_ERROR_COUNT constantAndy Shevchenko
Its meaning is to limit amount of error messages to be printed out when buffer mismatch is occured. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: return actual state in 'run' fileAndy Shevchenko
The following command should return actual state of the test. % cat /sys/kernel/debug/dmatest/run To wait for test done the user may perform a busy loop that checks the state. % while [ $(cat /sys/kernel/debug/dmatest/run) = "Y" ] > do > echo -n "." > sleep 1 > done > echo Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: run test via debugfsAndy Shevchenko
Instead of doing modprobe dmatest ... modprobe -r dmatest we allow user to run tests interactively. The dmatest could be built as module or inside kernel. Let's consider those cases. 1. When dmatest is built as a module... After mounting debugfs and loading the module, the /sys/kernel/debug/dmatest folder with nodes will be created. They are the same as module parameters with addition of the 'run' node that controls run and stop phases of the test. Note that in this case test will not run on load automatically. Example of usage: % echo dma0chan0 > /sys/kernel/debug/dmatest/channel % echo 2000 > /sys/kernel/debug/dmatest/timeout % echo 1 > /sys/kernel/debug/dmatest/iterations % echo 1 > /sys/kernel/debug/dmatest/run After a while you will start to get messages about current status or error like in the original code. Note that running a new test will stop any in progress test. 2. When built-in in the kernel... The module parameters that is supplied to the kernel command line will be used for the first performed test. After user gets a control, the test could be interrupted or re-run with same or different parameters. For the details see the above section "1. When dmatest is built as a module..." In both cases the module parameters are used as initial values for the test case. You always could check them at run-time by running % grep -H . /sys/module/dmatest/parameters/* Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: split test parameters to separate structureAndy Shevchenko
Better to keep test parameters separate from internal variables. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: move dmatest_channels and nr_channels to dmatest_infoAndy Shevchenko
We don't need to have them global and later we would like to protect access to them as well. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: create dmatest_info to keep test parametersAndy Shevchenko
The proposed change will remove usage of the module parameters as global variables. In future it helps to run different test cases sequentially. The patch introduces the run_threaded_test() and stop_threaded_test() functions that could be used later outside of dmatest_init, dmatest_exit scope. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: allocate memory for pq_coefs from heapAndy Shevchenko
This will help in future to hide a global variable usage. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dmatest: cancel thread immediately when asked forAndy Shevchenko
If user have the timeout alike issues and wants to cancel the thread immediately, the current call of wait_event_freezable_timeout is preventing to this until timeout is expired. Thus, user will experience the unnecessary delays. Adding kthread_should_stop() check inside wait_event_freezable_timeout() solves that. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15ioatdma: allow all channels to have irq coalescing supportDave Jiang
Looks like only the RAID channels are allowed to have irq coalescing support in the existing code. Fixing that. The ioat3 cleanup code can handle memcpy ops anyways Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <djbw@fb.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15ioatdma: make debug output more readableDave Jiang
Making OP field a hex instead of integer to make it more readable. Also add the dump out of the NEXT field. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Acked-by: Dan Williams <djbw@fb.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dma: Remove erroneous __exit and __exit_p() referencesMaxin B. John
Removing the annotation with __exit and referencing with __exit_p() present in dma driver module remove hooks. Part of the __devexit and __devexit_p() purge. Signed-off-by: Maxin B. John <maxin.john@enea.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15dma: timb_dma: Fix compiler warningMaxin B. John
Fix this compiler warning: warning: 'td_remove' defined but not used [-Wunused-function] Signed-off-by: Maxin B. John <maxin.john@enea.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15pch_dma: Use GFP_ATOMIC because called from interrupt contextTomoya MORINAGA
pdc_desc_get() is called from pd_prep_slave_sg, and the function is called from interrupt context(e.g. Uart driver "pch_uart.c"). In fact, I saw kernel error message. So, GFP_ATOMIC must be used not GFP_NOIO. Signed-off-by: Tomoya MORINAGA <tomoya.rohm@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-15DMA: PL330: allow submitting 2 requests at a timeJassi Brar
Fix the logic to allow mc programming of second transfer after first has been done, by removing immediate return upon success and iterating until we detect QFull or DMAC dying. Reported-by: Alvaro Moran <dirac3000@gmail.com> Tested-by: Alvaro Moran <dirac3000@gmail.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2013-04-14Linux 3.9-rc7v3.9-rc7Linus Torvalds
2013-04-14Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Flush lazy MMU when DEBUG_PAGEALLOC is set x86/mm/cpa/selftest: Fix false positive in CPA self test x86/mm/cpa: Convert noop to functional fix x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates
2013-04-14Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc fixlets" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/cputime: Fix accounting on multi-threaded processes sched/debug: Fix sd->*_idx limit range avoiding overflow sched_clock: Prevent 64bit inatomicity on 32bit systems sched: Convert BUG_ON()s in try_to_wake_up_local() to WARN_ON_ONCE()s
2013-04-14Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixlets" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix error return code ftrace: Fix strncpy() use, use strlcpy() instead of strncpy() perf: Fix strncpy() use, use strlcpy() instead of strncpy() perf: Fix strncpy() use, always make sure it's NUL terminated perf: Fix ring_buffer perf_output_space() boundary calculation perf/x86: Fix uninitialized pt_regs in intel_pmu_drain_bts_buffer()
2013-04-14Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "One fix for a hotplug locking regressions, and one fix for an oops if you unplug the monitor at an inopportune moment on the udl device." * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/fb-helper: Fix locking in drm_fb_helper_hotplug_event udl: handle EDID failure properly.
2013-04-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu fix from Greg Ungerer: "This contains only a single compilation fix for ColdFire m68k targets that use local non-GPIOLIB support." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: define a local gpio_request_one() function
2013-04-14Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds
Pull watchdog fix from Wim Van Sebroeck: "It will fix compile errors for the at91rm9200_wdt driver" * git://www.linux-watchdog.org/linux-watchdog: watchdog: Revert the AT91RM9200_WATCHDOG dependency
2013-04-14Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull one more btrfs fix from Chris Mason: "This has a recent fix from Josef for our tree log replay code. It fixes problems where the inode counter for the number of bytes in the file wasn't getting updated properly during fsync replay. The commit did get rebased this morning, but it was only to clean up the subject line. The code hasn't changed." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: make sure nbytes are right after log replay
2013-04-14Merge tag 'trace-fixes-v3.9-rc-v3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fixes from Steven Rostedt: "Namhyung Kim found and fixed a bug that can crash the kernel by simply doing: echo 1234 | tee -a /sys/kernel/debug/tracing/set_ftrace_pid Luckily, this can only be done by root, but still is a nasty bug." * tag 'trace-fixes-v3.9-rc-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section tracing: Fix possible NULL pointer dereferences
2013-04-14Add file_ns_capable() helper function for open-time capability checkingLinus Torvalds
Nothing is using it yet, but this will allow us to delay the open-time checks to use time, without breaking the normal UNIX permission semantics where permissions are determined by the opener (and the file descriptor can then be passed to a different process, or the process can drop capabilities). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-14watchdog: Revert the AT91RM9200_WATCHDOG dependencyNicolas Ferre
Compiling the at91rm9200_wdt.c driver without at91rm9200 support was leading to several errors: drivers/built-in.o: In function `at91_wdt_close': at91_adc.c:(.text+0xc9fe4): undefined reference to `at91_st_base' drivers/built-in.o: In function `at91_wdt_write': at91_adc.c:(.text+0xca004): undefined reference to `at91_st_base' drivers/built-in.o: In function `at91wdt_shutdown': at91_adc.c:(.text+0xca01c): undefined reference to `at91_st_base' drivers/built-in.o: In function `at91wdt_suspend': at91_adc.c:(.text+0xca038): undefined reference to `at91_st_base' drivers/built-in.o: In function `at91_wdt_open': at91_adc.c:(.text+0xca0cc): undefined reference to `at91_st_base' drivers/built-in.o:at91_adc.c:(.text+0xca2c8): more undefined references to `at91_st_base' follow So, reverting the modification of the "depends" Kconfig line introduced by patch a6a1bcd37 (watchdog: at91rm9200: add DT support) seems to be the good solution. Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2013-04-13vfs: Revert spurious fix to spinning prevention in prune_icache_sbSuleiman Souhlal
Revert commit 62a3ddef6181 ("vfs: fix spinning prevention in prune_icache_sb"). This commit doesn't look right: since we are looking at the tail of the list (sb->s_inode_lru.prev) if we want to skip an inode, we should put it back at the head of the list instead of the tail, otherwise we will keep spinning on it. Discovered when investigating why prune_icache_sb came top in perf reports of a swapping load. Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-13kobject: fix kset_find_obj() race with concurrent last kobject_put()Linus Torvalds
Anatol Pomozov identified a race condition that hits module unloading and re-loading. To quote Anatol: "This is a race codition that exists between kset_find_obj() and kobject_put(). kset_find_obj() might return kobject that has refcount equal to 0 if this kobject is freeing by kobject_put() in other thread. Here is timeline for the crash in case if kset_find_obj() searches for an object tht nobody holds and other thread is doing kobject_put() on the same kobject: THREAD A (calls kset_find_obj()) THREAD B (calls kobject_put()) splin_lock() atomic_dec_return(kobj->kref), counter gets zero here ... starts kobject cleanup .... spin_lock() // WAIT thread A in kobj_kset_leave() iterate over kset->list atomic_inc(kobj->kref) (counter becomes 1) spin_unlock() spin_lock() // taken // it does not know that thread A increased counter so it remove obj from list spin_unlock() vfree(module) // frees module object with containing kobj // kobj points to freed memory area!! kobject_put(kobj) // OOPS!!!! The race above happens because module.c tries to use kset_find_obj() when somebody unloads module. The module.c code was introduced in commit 6494a93d55fa" Anatol supplied a patch specific for module.c that worked around the problem by simply not using kset_find_obj() at all, but rather than make a local band-aid, this just fixes kset_find_obj() to be thread-safe using the proper model of refusing the get a new reference if the refcount has already dropped to zero. See examples of this proper refcount handling not only in the kref documentation, but in various other equivalent uses of this pattern by grepping for atomic_inc_not_zero(). [ Side note: the module race does indicate that module loading and unloading is not properly serialized wrt sysfs information using the module mutex. That may require further thought, but this is the correct fix at the kobject layer regardless. ] Reported-analyzed-and-tested-by: Anatol Pomozov <anatol.pomozov@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-13Btrfs: make sure nbytes are right after log replayJosef Bacik
While trying to track down a tree log replay bug I noticed that fsck was always complaining about nbytes not being right for our fsynced file. That is because the new fsync stuff doesn't wait for ordered extents to complete, so the inodes nbytes are not necessarily updated properly when we log it. So to fix this we need to set nbytes to whatever it is on the inode that is on disk, so when we replay the extents we can just add the bytes that are being added as we replay the extent. This makes it work for the case that we have the wrong nbytes or the case that we logged everything and nbytes is actually correct. With this I'm no longer getting nbytes errors out of btrfsck. Cc: stable@vger.kernel.org Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-04-12x86-32: Fix possible incomplete TLB invalidate with PAE pagetablesDave Hansen
This patch attempts to fix: https://bugzilla.kernel.org/show_bug.cgi?id=56461 The symptom is a crash and messages like this: chrome: Corrupted page table at address 34a03000 *pdpt = 0000000000000000 *pde = 0000000000000000 Bad pagetable: 000f [#1] PREEMPT SMP Ingo guesses this got introduced by commit 611ae8e3f520 ("x86/tlb: enable tlb flush range support for x86") since that code started to free unused pagetables. On x86-32 PAE kernels, that new code has the potential to free an entire PMD page and will clear one of the four page-directory-pointer-table (aka pgd_t entries). The hardware aggressively "caches" these top-level entries and invlpg does not actually affect the CPU's copy. If we clear one we *HAVE* to do a full TLB flush, otherwise we might continue using a freed pmd page. (note, we do this properly on the population side in pud_populate()). This patch tracks whenever we clear one of these entries in the 'struct mmu_gather', and ensures that we follow up with a full tlb flush. BTW, I disassembled and checked that: if (tlb->fullmm == 0) and if (!tlb->fullmm && !tlb->need_flush_all) generate essentially the same code, so there should be zero impact there to the !PAE case. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Artem S Tashkinov <t.artem@mailcity.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>