summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2012-09-21x86, smap: Reduce the SMAP overhead for signal handlingH. Peter Anvin
Signal handling contains a bunch of accesses to individual user space items, which causes an excessive number of STAC and CLAC instructions. Instead, let get/put_user_try ... get/put_user_catch() contain the STAC and CLAC instructions. This means that get/put_user_try no longer nests, and furthermore that it is no longer legal to use user space access functions other than __get/put_user_ex() inside those blocks. However, these macros are x86-specific anyway and are only used in the signal-handling paths; a simple reordering of moving the larger subroutine calls out of the try...catch blocks resolves that problem. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-12-git-send-email-hpa@linux.intel.com
2012-09-21x86, smap: A page fault due to SMAP is an oopsH. Peter Anvin
If we get a page fault due to SMAP, trigger an oops rather than spinning forever. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-11-git-send-email-hpa@linux.intel.com
2012-09-21x86, smap: Turn on Supervisor Mode Access PreventionH. Peter Anvin
If Supervisor Mode Access Prevention is available and not disabled by the user, turn it on. Also fix the expansion of SMEP (Supervisor Mode Execution Prevention.) Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-10-git-send-email-hpa@linux.intel.com
2012-09-21x86, smap: Add STAC and CLAC instructions to control user space accessH. Peter Anvin
When Supervisor Mode Access Prevention (SMAP) is enabled, access to userspace from the kernel is controlled by the AC flag. To make the performance of manipulating that flag acceptable, there are two new instructions, STAC and CLAC, to set and clear it. This patch adds those instructions, via alternative(), when the SMAP feature is enabled. It also adds X86_EFLAGS_AC unconditionally to the SYSCALL entry mask; there is simply no reason to make that one conditional. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-9-git-send-email-hpa@linux.intel.com
2012-09-21x86, uaccess: Merge prototypes for clear_user/__clear_userH. Peter Anvin
The prototypes for clear_user() and __clear_user() are identical in the 32- and 64-bit headers. No functionality change. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-8-git-send-email-hpa@linux.intel.com
2012-09-21x86, smap: Add a header file with macros for STAC/CLACH. Peter Anvin
The STAC/CLAC instructions are only available with SMAP, but on the other hand they aren't needed if SMAP is not available, or before we start to run userspace, so construct them as alternatives which start out as noops and are enabled by the alternatives mechanism. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-7-git-send-email-hpa@linux.intel.com
2012-09-21x86, alternative: Add header guards to <asm/alternative-asm.h>H. Peter Anvin
Add header guards to protect <asm/alternative-asm.h> against multiple inclusion. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-6-git-send-email-hpa@linux.intel.com
2012-09-21x86, alternative: Use .pushsection/.popsectionH. Peter Anvin
.section/.previous doesn't nest. Use .pushsection/.popsection in <asm/alternative.h> so that they can be properly nested. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-5-git-send-email-hpa@linux.intel.com
2012-09-21x86, smap: Add CR4 bit for SMAPH. Peter Anvin
Add X86_CR4_SMAP to <asm/processor-flags.h>. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-4-git-send-email-hpa@linux.intel.com
2012-09-21x86-32, mm: The WP test should be done on a kernel pageH. Peter Anvin
PAGE_READONLY includes user permission, but this is a page used exclusively by the kernel; use PAGE_KERNEL_RO instead. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-3-git-send-email-hpa@linux.intel.com
2012-09-09x86, cpufeature: Add feature bit for SMAPH. Peter Anvin
Add CPUID feature bit for Supervisor Mode Access Prevention (SMAP). Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/n/tip-ethzcr5nipikl6hd5q8ssepq@git.kernel.org
2012-09-01Linux 3.6-rc4v3.6-rc4Linus Torvalds
2012-09-01time: Move ktime_t overflow checking into timespec_valid_strictJohn Stultz
Andreas Bombe reported that the added ktime_t overflow checking added to timespec_valid in commit 4e8b14526ca7 ("time: Improve sanity checking of timekeeping inputs") was causing problems with X.org because it caused timeouts larger then KTIME_T to be invalid. Previously, these large timeouts would be clamped to KTIME_MAX and would never expire, which is valid. This patch splits the ktime_t overflow checking into a new timespec_valid_strict function, and converts the timekeeping codes internal checking to use this more strict function. Reported-and-tested-by: Andreas Bombe <aeb@debian.org> Cc: Zhouping Liu <zliu@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-31Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM bugfixes from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: fix KVM_GET_MSR for PV EOI kvm: Fix nonsense handling of compat ioctl
2012-08-31Merge tag 'parisc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6 Pull PARISC fixes from James Bottomley: "This is a set of two bug fixes. One is the ATOMIC problem which is now causing a compile failure in certain situations. The other is mishandling of PER_LINUX32 which may also cause user visible effects. Signed-off-by: James Bottomley <JBottomley@Parallels.com>" * tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6: [PARISC] fix personality flag check in copy_thread() [PARISC] Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts
2012-08-31Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "A couple of s390 bug fixes for 3.5-rc4" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/32: Don't clobber personality flags on exec s390/smp: add missing smp_store_status() for !SMP s390/dasd: fix ioctl return value s390: Always use "long" for ssize_t to match size_t
2012-08-30Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "A bunch of scattered fixes ati/intel/nouveau, couple of core ones, nothing too shocking or different." * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm: Add EDID_QUIRK_FORCE_REDUCED_BLANKING for ASUS VW222S gma500: Consider CRTC initially active. drm/radeon: fix dig encoder selection on DCE61 drm/radeon: fix double free in radeon_gpu_reset drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740 drm/radeon: rework panel mode setup drm/radeon/atom: powergating fixes for DCE6 drm/radeon/atom: rework DIG modesetting on DCE3+ drm/radeon: don't disable plls that are in use by other crtcs drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700 drm/radeon: initialize tracked CS state drm/radeon: fix reading CB_COLORn_MASK from the CS drm/nvc0/copy: check PUNITS to determine which copy engines are disabled i915: Quirk no_lvds on Gigabyte GA-D525TUD ITX motherboard drm/i915: Use the correct size of the GTT for placing the per-process entries drm: Check for invalid cursor flags drm: Initialize object type when using DRM_MODE() macro drm/i915: fix color order for BGR formats on IVB drm/i915: fix wrong order of parameters in port checking functions
2012-08-30s390/32: Don't clobber personality flags on execHeiko Carstens
In native 32 bit mode the personality flags were not correctly inherited. This is the s390 version of 59e4c3a2 "powerpc/32: Don't clobber personality flags on exec". Reported-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-08-30drm: Add EDID_QUIRK_FORCE_REDUCED_BLANKING for ASUS VW222SPaul Menzel
Connecting an ASUS VW222S [1] over VGA a garbled screen is shown with vertical stripes in the top half. In commit bc42aabc [2] commit bc42aabc6a01b92b0f961d65671564e0e1cd7592 Author: Adam Jackson <ajax@redhat.com> Date: Wed May 23 16:26:54 2012 -0400 drm/edid/quirks: ViewSonic VA2026w Adam Jackson added the quirk `EDID_QUIRK_FORCE_REDUCED_BLANKING` which is also needed for this ASUS monitor. All log files and output from `xrandr` is included in the referenced Bugzilla report #17629. Please note that this monitor only has a VGA (D-Sub) connector [1]. [1] http://www.asus.com/Display/LCD_Monitors/VW222S/ [2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=bc42aabc6a01b92b0f961d65671564e0e1cd7592 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=17629 Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net> Cc: <dri-devel@lists.freedesktop.org> Cc: Adam Jackson <ajax@redhat.com> Cc: Ian Pilcher <arequipeno@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-08-30Merge branch 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-fixes Alex writes: Highlights: - fix a gart regression on older IGP chips - more MSAA fixes - fix a double free in gpu reset code - modesetting fixes - trinity dig encoder fix. * 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux: drm/radeon: fix dig encoder selection on DCE61 drm/radeon: fix double free in radeon_gpu_reset drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740 drm/radeon: rework panel mode setup drm/radeon/atom: powergating fixes for DCE6 drm/radeon/atom: rework DIG modesetting on DCE3+ drm/radeon: don't disable plls that are in use by other crtcs drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700 drm/radeon: initialize tracked CS state drm/radeon: fix reading CB_COLORn_MASK from the CS
2012-08-30gma500: Consider CRTC initially active.Forest Bond
[this one ideally should make 3.6 - it fixes the very annoying mode setting bug] This causes the pipe to be forced off prior to initial mode set, which roughly mirrors the behavior of the i915 driver. It fixes initial mode setting on my Intel DN2800MT (Cedarview) board. Without it, mode setting triggers an out-of-range error from the monitor for most modes, but only on initial configuration (i.e. they can be configured successfully from userspace after that). Signed-off-by: Forest Bond <forest.bond@rapidrollout.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-08-29drm/radeon: fix dig encoder selection on DCE61Alex Deucher
Was using the DCE41 code which was wrong. Fixes blank displays on a number of Trinity systems. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2012-08-29Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "I've split out the big send/receive update from my last pull request and now have just the fixes in my for-linus branch. The send/recv branch will wander over to linux-next shortly though. The largest patches in this pull are Josef's patches to fix DIO locking problems and his patch to fix a crash during balance. They are both well tested. The rest are smaller fixes that we've had queued. The last rc came out while I was hacking new and exciting ways to recover from a misplaced rm -rf on my dev box, so these missed rc3." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (25 commits) Btrfs: fix that repair code is spuriously executed for transid failures Btrfs: fix ordered extent leak when failing to start a transaction Btrfs: fix a dio write regression Btrfs: fix deadlock with freeze and sync V2 Btrfs: revert checksum error statistic which can cause a BUG() Btrfs: remove superblock writing after fatal error Btrfs: allow delayed refs to be merged Btrfs: fix enospc problems when deleting a subvol Btrfs: fix wrong mtime and ctime when creating snapshots Btrfs: fix race in run_clustered_refs Btrfs: don't run __tree_mod_log_free_eb on leaves Btrfs: increase the size of the free space cache Btrfs: barrier before waitqueue_active Btrfs: fix deadlock in wait_for_more_refs btrfs: fix second lock in btrfs_delete_delayed_items() Btrfs: don't allocate a seperate csums array for direct reads Btrfs: do not strdup non existent strings Btrfs: do not use missing devices when showing devname Btrfs: fix that error value is changed by mistake Btrfs: lock extents as we map them in DIO ...
2012-08-29Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds
Pull watchdog fixes from Wim Van Sebroeck: "This will fix a warning for watchdog-test.c and it will remove a duplicate include of delay.h" * git://www.linux-watchdog.org/linux-watchdog: watchdog: da9052: Remove duplicate inclusion of delay.h watchdog: fix watchdog-test.c build warning
2012-08-29mm, slab: lock the correct nodelist after reenabling irqsDavid Rientjes
cache_grow() can reenable irqs so the cpu (and node) can change, so ensure that we take list_lock on the correct nodelist. This fixes an issue with commit 072bb0aa5e06 ("mm: sl[au]b: add knowledge of PFMEMALLOC reserve pages") where list_lock for the wrong node was taken after growing the cache. Reported-and-tested-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-08-29drm/radeon: fix double free in radeon_gpu_resetChristian König
radeon_ring_restore is freeing the memory for the saved ring data. We need to remember that, otherwise we try to restore the ring data again on the next try. Additional to that it shouldn't try the reset infinitely if we have saved ring data. Signed-off-by: Christian König <deathsimple@vodafone.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-29drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740Jerome Glisse
It seems some of those IGP dislike non dma32 page despite what documentation says. Fix regression since we allowed non dma32 pages. It seems it only affect some revision of those IGP chips as we don't know which one just force dma32 for all of them. https://bugzilla.redhat.com/show_bug.cgi?id=785375 Signed-off-by: Jerome Glisse <jglisse@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-29drm/radeon: rework panel mode setupAlex Deucher
Adjust the panel mode setup to match the behavior of the vbios. Rather than checking for specific bridge chip ids, just check the eDP configuration register. This saves extra aux transactions and works across DP bridge chips without requiring additional per chip id checking. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-29drm/radeon/atom: powergating fixes for DCE6Alex Deucher
Power gating is per crtc pair, but the powergating registers should be called individually. The hw handles power up/down properly. The pair is powered up if either crtc in the pair is powered up and the pair is not powered down until both crtcs in the pair are powered down. This simplifies programming and should save additional power as the previous code never actually power gated the crtc pair. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2012-08-29drm/radeon/atom: rework DIG modesetting on DCE3+Alex Deucher
The ordering is important and the current drm code wasn't cutting it for modern DIG encoders. We need to have information about crtc before setting up the encoders so I've shifted the ordering a bit. Probably we'll need a full rework akin to danvet's recent intel patchs. This patch fixes numerous issues with DP bridge chips and makes link training much more reliable. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2012-08-29drm/radeon: don't disable plls that are in use by other crtcsAlex Deucher
Some plls are shared for DP. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-08-29drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700Marek Olšák
Checking of the second colorbuffer was skipped on r700, because CB_TARGET_MASK was 0xf. With r600, CB_TARGET_MASK is changed to 0xff, so we must set the number of samples of the second colorbuffer to 1 in order to pass the CS checker. The DRM version is bumped, because RESOLVE_BOX is always rejected without this fix on r600. Signed-off-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-29drm/radeon: initialize tracked CS stateMarek Olšák
This should help catch uninitialized registers and reject commands because of that. Signed-off-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-29drm/radeon: fix reading CB_COLORn_MASK from the CSMarek Olšák
Signed-off-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-08-29watchdog: da9052: Remove duplicate inclusion of delay.hSachin Kamat
delay.h header file was included twice. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-08-29watchdog: fix watchdog-test.c build warningRandy Dunlap
Fix compiler warning by making the function static: Documentation/watchdog/src/watchdog-test.c:34:6: warning: no previous prototype for 'term' Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-08-29Merge branch 'drm-intel-fixes' of ↵Dave Airlie
git://people.freedesktop.org/~danvet/drm-intel into drm-fixes Daniel writes: "Just a few smaller things: - Fix up a pipe vs. plane confusion from a refactoring, fixes a regression from 3.1 (Anhua Xu). - Fix ivb sprite pixel formats (Vijay). - Fixup ppgtt pde placement for machines where the Bios artifically limits the availbale gtt space in the name of ... product differentiation (Chris). This fixes an oops. - Yet another no_lvds quirk entry." * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: i915: Quirk no_lvds on Gigabyte GA-D525TUD ITX motherboard drm/i915: Use the correct size of the GTT for placing the per-process entries drm/i915: fix color order for BGR formats on IVB drm/i915: fix wrong order of parameters in port checking functions
2012-08-29Merge branch 'drm-nouveau-fixes' of ↵Dave Airlie
git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes Ben says its just a single fix to avoid the wrong pcopy units being used. * 'drm-nouveau-fixes' of git://anongit.freedesktop.org/git/nouveau/linux-2.6: drm/nvc0/copy: check PUNITS to determine which copy engines are disabled
2012-08-29drm/nvc0/copy: check PUNITS to determine which copy engines are disabledBen Skeggs
On some Fermi chipsets (NVCE particularly) PCOPY1 doesn't exist. And if what I've seen on Kepler is true of Fermi too, chipsets of the same type can have different PCOPY units available. This should fix a v3.5 regression reported by a number of people effecting suspend/resume on NVC8/NVCE chipsets. Cc: stable@vger.kernel.org [3.5] Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2012-08-28Btrfs: fix that repair code is spuriously executed for transid failuresStefan Behrens
If verify_parent_transid() fails for all mirrors, the current code calls repair_io_failure() anyway which means: - that the disk block is rewritten without repairing anything and - that a kernel log message is printed which misleadingly claims that a read error was corrected. This is an example: parent transid verify failed on 615015833600 wanted 110423 found 110424 parent transid verify failed on 615015833600 wanted 110423 found 110424 btrfs read error corrected: ino 1 off 615015833600 (dev /dev/...) It is wrong to ignore the results from verify_parent_transid() and to call repair_eb_io_failure() when the verification of the transids failed. This commit fixes the issue. Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28Btrfs: fix ordered extent leak when failing to start a transactionLiu Bo
We cannot just return error before freeing ordered extent and releasing reserved space when we fail to start a transacion. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28Btrfs: fix a dio write regressionLiu Bo
This bug is introduced by commit 3b8bde746f6f9bd36a9f05f5f3b6e334318176a9 (Btrfs: lock extents as we map them in DIO). In dio write, we should unlock the section which we didn't do IO on in case that we fall back to buffered write. But we need to not only unlock the section but also cleanup reserved space for the section. This bug was found while running xfstests 133, with this 133 no longer complains. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28Btrfs: fix deadlock with freeze and sync V2Josef Bacik
We can deadlock with freeze right now because we unconditionally start a transaction in our ->sync_fs() call. To fix this just check and see if we have a running transaction to commit. This saves us from the deadlock because at this point we'll have the umount sem for the sb so we're safe from freezes coming in after we've done our check. With this patch the freeze xfstests no longer deadlocks. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28Btrfs: revert checksum error statistic which can cause a BUG()Stefan Behrens
Commit 442a4f6308e694e0fa6025708bd5e4e424bbf51c added btrfs device statistic counters for detected IO and checksum errors to Linux 3.5. The statistic part that counts checksum errors in end_bio_extent_readpage() can cause a BUG() in a subfunction: "kernel BUG at fs/btrfs/volumes.c:3762!" That part is reverted with the current patch. However, the counting of checksum errors in the scrub context remains active, and the counting of detected IO errors (read, write or flush errors) in all contexts remains active. Cc: stable <stable@vger.kernel.org> # 3.5 Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-08-28Btrfs: remove superblock writing after fatal errorStefan Behrens
With commit acce952b0, btrfs was changed to flag the filesystem with BTRFS_SUPER_FLAG_ERROR and switch to read-only mode after a fatal error happened like a write I/O errors of all mirrors. In such situations, on unmount, the superblock is written in btrfs_error_commit_super(). This is done with the intention to be able to evaluate the error flag on the next mount. A warning is printed in this case during the next mount and the log tree is ignored. The issue is that it is possible that the superblock points to a root that was not written (due to write I/O errors). The result is that the filesystem cannot be mounted. btrfsck also does not start and all the other btrfs-progs tools fail to start as well. However, mount -o recovery is working well and does the right things to recover the filesystem (i.e., don't use the log root, clear the free space cache and use the next mountable root that is stored in the root backup array). This patch removes the writing of the superblock when BTRFS_SUPER_FLAG_ERROR is set, and removes the handling of the error flag in the mount function. These lines can be used to reproduce the issue (using /dev/sdm): SCRATCH_DEV=/dev/sdm SCRATCH_MNT=/mnt echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup create foo ls -alLF /dev/mapper/foo mkfs.btrfs /dev/mapper/foo mount /dev/mapper/foo $SCRATCH_MNT echo bar > $SCRATCH_MNT/foo sync echo 0 25165824 error | dmsetup reload foo dmsetup resume foo ls -alF $SCRATCH_MNT touch $SCRATCH_MNT/1 ls -alF $SCRATCH_MNT sleep 35 echo 0 25165824 linear $SCRATCH_DEV 0 | dmsetup reload foo dmsetup resume foo sleep 1 umount $SCRATCH_MNT btrfsck /dev/mapper/foo dmsetup remove foo Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-08-28Btrfs: allow delayed refs to be mergedJosef Bacik
Daniel Blueman reported a bug with fio+balance on a ramdisk setup. Basically what happens is the balance relocates a tree block which will drop the implicit refs for all of its children and adds a full backref. Once the block is relocated we have to add the implicit refs back, so when we cow the block again we add the implicit refs for its children back. The problem comes when the original drop ref doesn't get run before we add the implicit refs back. The delayed ref stuff will specifically prefer ADD operations over DROP to keep us from freeing up an extent that will have references to it, so we try to add the implicit ref before it is actually removed and we panic. This worked fine before because the add would have just canceled the drop out and we would have been fine. But the backref walking work needs to be able to freeze the delayed ref stuff in time so we have this ever increasing sequence number that gets attached to all new delayed ref updates which makes us not merge refs and we run into this issue. So to fix this we need to merge delayed refs. So everytime we run a clustered ref we need to try and merge all of its delayed refs. The backref walking stuff locks the delayed ref head before processing, so if we have it locked we are safe to merge any refs inside of the sequence number. If there is no sequence number we can merge all refs. Doing this not only fixes our bug but keeps the delayed ref code from adding and removing useless refs and batching together multiple refs into one search instead of one search per delayed ref, which will really help our commit times. I ran this with Daniels test and 276 and I haven't seen any problems. Thanks, Reported-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
2012-08-28Btrfs: fix enospc problems when deleting a subvolJosef Bacik
Subvol delete is a special kind of awful where we use the global reserve to cover the ENOSPC requirements. The problem is once we're done removing everything we do a btrfs_update_inode(), which by default will try to do the delayed update stuff which will use it's own reserve. There will be no space in this reserve and we'll return ENOSPC. So instead use btrfs_update_inode_fallback() which will just fallback to updating the inode item in the case of enospc. This is fine because the global reserve covers the space requirements for this. With this patch I can now delete a subvol on a problem image Dave Sterba sent me. Thanks, Reported-by: David Sterba <dave@jikos.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28Btrfs: fix wrong mtime and ctime when creating snapshotsMiao Xie
When we created a new snapshot, the mtime and ctime of its parent directory were not updated. Fix it. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28Btrfs: fix race in run_clustered_refsArne Jansen
With commit commit d1270cd91f308c9d22b2804720c36ccd32dbc35e Author: Arne Jansen <sensille@gmx.net> Date: Tue Sep 13 15:16:43 2011 +0200 Btrfs: put back delayed refs that are too new I added a window where the delayed_ref's head->ref_mod code can diverge from the sum of the remaining refs, because we release the head->mutex in the middle. This leads to btrfs_lookup_extent_info returning wrong numbers. This patch fixes this by adjusting the head's ref_mod with each delayed ref we run. Signed-off-by: Arne Jansen <sensille@gmx.net> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2012-08-28Btrfs: don't run __tree_mod_log_free_eb on leavesChris Mason
When we split a leaf, we may end up inserting a new root on top of that leaf. The reflog code was incorrectly assuming the old root was always a node. This makes sure we skip over leaves. Signed-off-by: Chris Mason <chris.mason@fusionio.com>