Age | Commit message (Collapse) | Author |
|
* pm-sleep:
PM / Hibernate: Implement compat_ioctl for /dev/snapshot
|
|
This allows uswsusp built for i386 to run on an x86_64 kernel (tested
with Debian package version 1.0+20110509-2).
References: http://bugs.debian.org/502816
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|
* pm-freezer:
PM / Freezer: fix return value of freezable_schedule_timeout_killable()
|
|
...it should return the return code from schedule_timeout_killable(),
not the one from freezer_count().
All of the current callers ignore the return code so the bug is
harmless but it's worth fixing.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|
* pm-domains:
PM / shmobile: Allow the A4R domain to be turned off at run time
PM / input / touchscreen: Make st1232 use device PM QoS constraints
PM / QoS: Introduce dev_pm_qos_add_ancestor_request()
PM / shmobile: Remove the stay_on flag from SH7372's PM domains
PM / shmobile: Don't include SH7372's INTCS in syscore suspend/resume
PM / shmobile: Add support for the sh7372 A4S power domain / sleep mode
ARM: S3C64XX: Implement basic power domain support
PM / shmobile: Use common always on power domain governor
PM / Domains: Provide an always on power domain governor
PM / Domains: Fix default system suspend/resume operations
PM / Domains: Make it possible to assign names to generic PM domains
PM / Domains: fix compilation failure for CONFIG_PM_GENERIC_DOMAINS unset
PM / Domains: Automatically update overoptimistic latency information
PM / Domains: Add default power off governor function (v4)
PM / Domains: Add device stop governor function (v4)
PM / Domains: Rework system suspend callback routines (v2)
PM / Domains: Introduce "save/restore state" device callbacks
PM / Domains: Make it possible to use per-device domain callbacks
|
|
* pm-runtime:
PM / Runtime: Use device PM QoS constraints (v2)
|
|
* pm-sleep: (51 commits)
PM: Drop generic_subsys_pm_ops
PM / Sleep: Remove forward-only callbacks from AMBA bus type
PM / Sleep: Remove forward-only callbacks from platform bus type
PM: Run the driver callback directly if the subsystem one is not there
PM / Sleep: Make pm_op() and pm_noirq_op() return callback pointers
PM / Sleep: Merge internal functions in generic_ops.c
PM / Sleep: Simplify generic system suspend callbacks
PM / Hibernate: Remove deprecated hibernation snapshot ioctls
PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()
PM / Sleep: Recommend [un]lock_system_sleep() over using pm_mutex directly
PM / Sleep: Replace mutex_[un]lock(&pm_mutex) with [un]lock_system_sleep()
PM / Sleep: Make [un]lock_system_sleep() generic
PM / Sleep: Use the freezer_count() functions in [un]lock_system_sleep() APIs
PM / Freezer: Remove the "userspace only" constraint from freezer[_do_not]_count()
PM / Hibernate: Replace unintuitive 'if' condition in kernel/power/user.c with 'else'
Freezer / sunrpc / NFS: don't allow TASK_KILLABLE sleeps to block the freezer
PM / Sleep: Unify diagnostic messages from device suspend/resume
ACPI / PM: Do not save/restore NVS on Asus K54C/K54HR
PM / Hibernate: Remove deprecated hibernation test modes
PM / Hibernate: Thaw processes in SNAPSHOT_CREATE_IMAGE ioctl test path
...
Conflicts:
kernel/kmod.c
|
|
* pm-devfreq:
PM/Devfreq: Add Exynos4-bus device DVFS driver for Exynos4210/4212/4412.
PM / devfreq: separate error paths from successful path
|
|
* pm-misc:
CPU: Add right qualifiers for alloc_frozen_cpus() and cpu_hotplug_pm_sync_init()
PM / Usermodehelper: Cleanup remnants of usermodehelper_pm_callback()
|
|
After adding PM QoS constraints for the I2C controller in the A4R
domain, that domain can be allowed to be turned off and on by runtime
PM, so remove the "always on" governor from it.
However, the A4R domain has to be "on" when suspend_device_irqs() and
resume_device_irqs() are executed during system suspend and resume,
respectively, so that those functions don't crash while accessing the
INTCS. For this reason, add a PM notifier to the SH7372 PM code and
make it restore power to A4R before system suspend and remove power
from all unused PM domains after system resume.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
|
|
Make the st1232 driver use dev_pm_qos_add_ancestor_request() to
add a device PM QoS latency constraint for the controller it
depends on, so that the controller won't go into an overly deep
low-power state when the touchscreen has to be particularly
responsive (e.g. when the user moves his or her finger on it).
This change is based on a prototype patch from Guennadi Liakhovetski.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|
Some devices, like the I2C controller on SH7372, are not
necessary for providing power to their children or forwarding
wakeup signals (and generally interrupts) from them. They are
only needed by their children when there's some data to transfer,
so they may be suspended for the majority of time and resumed
on demand, when the children have data to send or receive. For this
purpose, however, their power.ignore_children flags have to be set,
or the PM core wouldn't allow them to be suspended while their
children were active.
Unfortunately, in some situations it may take too much time to
resume such devices so that they can assist their children in
transferring data. For example, if such a device belongs to a PM
domain which goes to the "power off" state when that device is
suspended, it may take too much time to restore power to the
domain in response to the request from one of the device's
children. In that case, if the parent's resume time is critical,
the domain should stay in the "power on" state, although it still may
be desirable to power manage the parent itself (e.g. by manipulating
its clock).
In general, device PM QoS may be used to address this problem.
Namely, if the device's children added PM QoS latency constraints
for it, they would be able to prevent it from being put into an
overly deep low-power state. However, in some cases the devices
needing to be serviced are not the immediate children of a
"children-ignoring" device, but its grandchildren or even less
direct descendants. In those cases, the entity wanting to add a
PM QoS request for a given device's ancestor that ignores its
children will have to find it in the first place, so introduce a new
helper function that may be used to achieve that. This function,
dev_pm_qos_add_ancestor_request(), will search for the first
ancestor of the given device whose power.ignore_children flag is
set and will add a device PM QoS latency request for that ancestor
on behalf of the caller. The request added this way may be removed
with the help of dev_pm_qos_remove_request() in the future, like
any other device PM QoS latency request.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|
SH7372 uses two independent mechanisms for ensuring that power
domains will never be turned off: the stay_on flag and the "always
on" domain governor. Moreover, the "always on" governor is only taken
into accout by runtime PM code paths, while the stay_on flag affects
all attempts to turn the given domain off. Thus setting the stay_on
flag causes the "always on" governor to be unnecessary, which is
quite confusing.
However, the stay_on flag is currently only set for two domains: A3SP
and A4S. Moreover, it only is set for the A3SP domain if
console_suspend_enabled is set, so stay_on won't be necessary for
that domain any more if console_suspend_enabled is checked directly
in its .suspend() routine. [This requires domain .suspend() to
return a result, but that is a minor modification.] Analogously,
stay_on won't be necessary for the A4S domain if it's .suspend()
routine always returns an error code.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
|
|
Since the SH7372's INTCS in included into syscore suspend/resume,
which causes the chip to be accessed when PM domains have been
turned off during system suspend, the A4R domain containing the
INTCS has to stay on during system sleep, which is suboptimal
from the power consumption point of view.
For this reason, add a new INTC flag, skip_syscore_suspend, to mark
the INTCS for intc_suspend() and intc_resume(), so that they don't
touch it. This allows the A4R domain to be turned off during
system suspend and the INTCS state is resrored during system
resume by the A4R's "power on" code.
Suggested-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
|
|
The sh7372 contains a power domain named A4S which in turn
contains power domains for both I/O Devices and CPU cores.
At this point only System wide Suspend-to-RAM is supported,
but the the hardware can also support CPUIdle. With more
efforts in the future CPUIdle can work with bot A4S and A3SM.
Tested on the sh7372 Mackerel board.
[rjw: Rebased on top of the current linux-pm tree.]
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
vmwgfx: fix incorrect VRAM size check in vmw_kms_fb_create()
drm/radeon/kms: bail on BTC parts if MC ucode is missing
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
VFS: Fix race between CPU hotplug and lglocks
|
|
for linus: writeback reason binary tracing format fix
* tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
writeback: show writeback reason with __print_symbolic
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kconfig: adapt update-po-config to new UML layout
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] omap3isp: Fix crash caused by subdevs now having a pointer to devnodes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: call d_instantiate after all ops are setup
Btrfs: fix worker lock misuse in find_worker
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix MSIQ HV call ordering in pci_sun4v_msiq_build_irq().
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
netfilter: xt_connbytes: handle negation correctly
net: relax rcvbuf limits
rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt()
net: introduce DST_NOPEER dst flag
mqprio: Avoid panic if no options are provided
bridge: provide a mtu() method for fake_dst_ops
|
|
|
|
"! --connbytes 23:42" should match if the packet/byte count is not in range.
As there is no explict "invert match" toggle in the match structure,
userspace swaps the from and to arguments
(i.e., as if "--connbytes 42:23" were given).
However, "what <= 23 && what >= 42" will always be false.
Change things so we use "||" in case "from" is larger than "to".
This change may look like it breaks backwards compatibility when "to" is 0.
However, older iptables binaries will refuse "connbytes 42:0",
and current releases treat it to mean "! --connbytes 0:42",
so we should be fine.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This closes races where btrfs is calling d_instantiate too soon during
inode creation. All of the callers of btrfs_add_nondir are updated to
instantiate after the inode is fully setup in memory.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
|
|
Dan Carpenter noticed that we were doing a double unlock on the worker
lock, and sometimes picking a worker thread without the lock held.
This fixes both errors.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
|
|
skb->truesize might be big even for a small packet.
Its even bigger after commit 87fb4b7b533 (net: more accurate skb
truesize) and big MTU.
We should allow queueing at least one packet per receiver, even with a
low RCVBUF setting.
Reported-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Setting a large rps_flow_cnt like (1 << 30) on 32-bit platform will
cause a kernel oops due to insufficient bounds checking.
if (count > 1<<30) {
/* Enforce a limit to prevent overflow */
return -EINVAL;
}
count = roundup_pow_of_two(count);
table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count));
Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as:
... + (count * sizeof(struct rps_dev_flow))
where sizeof(struct rps_dev_flow) is 8. (1 << 30) * 8 will overflow
32 bits.
This patch replaces the magic number (1 << 30) with a symbolic bound.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Chris Boot reported crashes occurring in ipv6_select_ident().
[ 461.457562] RIP: 0010:[<ffffffff812dde61>] [<ffffffff812dde61>]
ipv6_select_ident+0x31/0xa7
[ 461.578229] Call Trace:
[ 461.580742] <IRQ>
[ 461.582870] [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2
[ 461.589054] [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155
[ 461.595140] [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b
[ 461.601198] [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e
[nf_conntrack_ipv6]
[ 461.608786] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
[ 461.614227] [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543
[ 461.620659] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
[ 461.626440] [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a
[bridge]
[ 461.633581] [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459
[ 461.639577] [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76
[bridge]
[ 461.646887] [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f
[bridge]
[ 461.653997] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
[ 461.659473] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
[ 461.665485] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
[ 461.671234] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
[ 461.677299] [<ffffffffa0379215>] ?
nf_bridge_update_protocol+0x20/0x20 [bridge]
[ 461.684891] [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
[ 461.691520] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
[ 461.697572] [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56
[bridge]
[ 461.704616] [<ffffffffa0379031>] ?
nf_bridge_push_encap_header+0x1c/0x26 [bridge]
[ 461.712329] [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95
[bridge]
[ 461.719490] [<ffffffffa037900a>] ?
nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
[ 461.727223] [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
[ 461.734292] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
[ 461.739758] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge]
[ 461.746203] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
[ 461.751950] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge]
[ 461.758378] [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56
[bridge]
This is caused by bridge netfilter special dst_entry (fake_rtable), a
special shared entry, where attaching an inetpeer makes no sense.
Problem is present since commit 87c48fa3b46 (ipv6: make fragment
identifications less predictable)
Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and
__ip_select_ident() fallback to the 'no peer attached' handling.
Reported-by: Chris Boot <bootc@bootc.net>
Tested-by: Chris Boot <bootc@bootc.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Userspace may not provide TCA_OPTIONS, in fact tc currently does
so not do so if no arguments are specified on the command line.
Return EINVAL instead of panicing.
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 618f9bc74a039da76 (net: Move mtu handling down to the protocol
depended handlers) forgot the bridge netfilter case, adding a NULL
dereference in ip_fragment().
Reported-by: Chris Boot <bootc@bootc.net>
CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
* 'for-linus' of git://neil.brown.name/md:
md/bitmap: It is OK to clear bits during recovery.
md: don't give up looking for spares on first failure-to-add
md/raid5: ensure correct assessment of drives during degraded reshape.
md/linear: fix hot-add of devices to linear arrays.
|
|
commit d0a4bb492772ce5c4bdfba3744a99ed6f6fb238f introduced a
regression which is annoying but fairly harmless.
When writing to an array that is undergoing recovery (a spare
in being integrated into the array), writing to the array will
set bits in the bitmap, but they will not be cleared when the
write completes.
For bits covering areas that have not been recovered yet this is not a
problem as the recovery will clear the bits. However bits set in
already-recovered region will stay set and never be cleared.
This doesn't risk data integrity. The only negatives are:
- next time there is a crash, more resyncing than necessary will
be done.
- the bitmap doesn't look clean, which is confusing.
While an array is recovering we don't want to update the
'events_cleared' setting in the bitmap but we do still want to clear
bits that have very recently been set - providing they were written to
the recovering device.
So split those two needs - which previously both depended on 'success'
and always clear the bit of the write went to all devices.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
Before performing a recovery we try to remove any spares that
might not be working, then add any that might have become relevant.
Currently we abort on the first spare that cannot be added.
This is a false optimisation.
It is conceivable that - depending on rules in the personality - a
subsequent spare might be accepted.
Also the loop does other things like count the available spares and
reset the 'recovery_offset' value.
If we abort early these might not happen properly.
So remove the early abort.
In particular if you have an array what is undergoing recovery and
which has extra spares, then the recovery may not restart after as
reboot as the could of 'spares' might end up as zero.
Reported-by: Anssi Hannula <anssi.hannula@iki.fi>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
While reshaping a degraded array (as when reshaping a RAID0 by first
converting it to a degraded RAID4) we currently get confused about
which devices are in_sync. In most cases we get it right, but in the
region that is being reshaped we need to treat non-failed devices as
in-sync when we have the data but haven't actually written it out yet.
Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
commit d70ed2e4fafdbef0800e73942482bb075c21578b
broke hot-add to a linear array.
After that commit, metadata if not written to devices until they
have been fully integrated into the array as determined by
saved_raid_disk. That patch arranged to clear that field after
a recovery completed.
However for linear arrays, there is no recovery - the integration is
instantaneous. So we need to explicitly clear the saved_raid_disk
field.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
This silently was working for many years and stopped working on
Niagara-T3 machines.
We need to set the MSIQ to VALID before we can set it's state to IDLE.
On Niagara-T3, setting the state to IDLE first was causing HV_EINVAL
errors. The hypervisor documentation says, rather ambiguously, that
the MSIQ must be "initialized" before one can set the state.
I previously understood this to mean merely that a successful setconf()
operation has been performed on the MSIQ, which we have done at this
point. But it seems to also mean that it has been set VALID too.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: Fix usb/isp1760 build on sparc
usb: gadget: epautoconf: do not change number of streams
usb: dwc3: core: fix cached revision on our structure
usb: musb: fix reset issue with full speed device
|
|
* 'upstream-linus' of git://github.com/jgarzik/libata-dev:
pata_of_platform: Add missing CONFIG_OF_IRQ dependency.
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
|
|
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit e133e737 didn't correctly fix the integer overflow issue.
- unsigned int required_size;
+ u64 required_size;
...
required_size = mode_cmd->pitch * mode_cmd->height;
- if (unlikely(required_size > dev_priv->vram_size)) {
+ if (unlikely(required_size > (u64) dev_priv->vram_size)) {
Note that both pitch and height are u32. Their product is still u32 and
would overflow before being assigned to required_size. A correct way is
to convert pitch and height to u64 before the multiplication.
required_size = (u64)mode_cmd->pitch * (u64)mode_cmd->height;
This patch calls the existing vmw_kms_validate_mode_vram() for
validation.
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Reviewed-and-tested-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
We already do this for cayman, need to also do it for
BTC parts. The default memory and voltage setup is not
adequate for advanced operation. Continuing will
result in an unusable display.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@kernel.org
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Currently, the *_global_[un]lock_online() routines are not at all synchronized
with CPU hotplug. Soft-lockups detected as a consequence of this race was
reported earlier at https://lkml.org/lkml/2011/8/24/185. (Thanks to Cong Meng
for finding out that the root-cause of this issue is the race condition
between br_write_[un]lock() and CPU hotplug, which results in the lock states
getting messed up).
Fixing this race by just adding {get,put}_online_cpus() at appropriate places
in *_global_[un]lock_online() is not a good option, because, then suddenly
br_write_[un]lock() would become blocking, whereas they have been kept as
non-blocking all this time, and we would want to keep them that way.
So, overall, we want to ensure 3 things:
1. br_write_lock() and br_write_unlock() must remain as non-blocking.
2. The corresponding lock and unlock of the per-cpu spinlocks must not happen
for different sets of CPUs.
3. Either prevent any new CPU online operation in between this lock-unlock, or
ensure that the newly onlined CPU does not proceed with its corresponding
per-cpu spinlock unlocked.
To achieve all this:
(a) We introduce a new spinlock that is taken by the *_global_lock_online()
routine and released by the *_global_unlock_online() routine.
(b) We register a callback for CPU hotplug notifications, and this callback
takes the same spinlock as above.
(c) We maintain a bitmap which is close to the cpu_online_mask, and once it is
initialized in the lock_init() code, all future updates to it are done in
the callback, under the above spinlock.
(d) The above bitmap is used (instead of cpu_online_mask) while locking and
unlocking the per-cpu locks.
The callback takes the spinlock upon the CPU_UP_PREPARE event. So, if the
br_write_lock-unlock sequence is in progress, the callback keeps spinning,
thus preventing the CPU online operation till the lock-unlock sequence is
complete. This takes care of requirement (3).
The bitmap that we maintain remains unmodified throughout the lock-unlock
sequence, since all updates to it are managed by the callback, which takes
the same spinlock as the one taken by the lock code and released only by the
unlock routine. Combining this with (d) above, satisfies requirement (2).
Overall, since we use a spinlock (mentioned in (a)) to prevent CPU hotplug
operations from racing with br_write_lock-unlock, requirement (1) is also
taken care of.
By the way, it is to be noted that a CPU offline operation can actually run
in parallel with our lock-unlock sequence, because our callback doesn't react
to notifications earlier than CPU_DEAD (in order to maintain our bitmap
properly). And this means, since we use our own bitmap (which is stale, on
purpose) during the lock-unlock sequence, we could end up unlocking the
per-cpu lock of an offline CPU (because we had locked it earlier, when the
CPU was online), in order to satisfy requirement (2). But this is harmless,
though it looks a bit awkward.
Debugged-by: Cong Meng <mc@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: Add a flow_cache_flush_deferred function
ipv4: reintroduce route cache garbage collector
net: have ipconfig not wait if no dev is available
sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
asix: new device id
davinci-cpdma: fix locking issue in cpdma_chan_stop
sctp: fix incorrect overflow check on autoclose
r8169: fix Config2 MSIEnable bit setting.
llc: llc_cmsg_rcv was getting called after sk_eat_skb.
net: bpf_jit: fix an off-one bug in x86_64 cond jump target
iwlwifi: update SCD BC table for all SCD queues
Revert "Bluetooth: Revert: Fix L2CAP connection establishment"
Bluetooth: Clear RFCOMM session timer when disconnecting last channel
Bluetooth: Prevent uninitialized data access in L2CAP configuration
iwlwifi: allow to switch to HT40 if not associated
iwlwifi: tx_sync only on PAN context
mwifiex: avoid double list_del in command cancel path
ath9k: fix max phy rate at rate control init
nfc: signedness bug in __nci_request()
iwlwifi: do not set the sequence control bit is not needed
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: atmel/ac97c: using software reset instead hardware reset if not available
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: Include linux/io.h to jz4740-adc
mfd: Use request_threaded_irq for twl4030-irq instead of irq_set_chained_handler
mfd: Base interrupt for twl4030-irq must be one-shot
mfd: Handle tps65910 clear-mask correctly
mfd: add #ifdef CONFIG_DEBUG_FS guard for ab8500_debug_resources
mfd: Fix twl-core oops while calling twl_i2c_* for unbound driver
mfd: include linux/module.h for ab5500-debugfs
mfd: Update wm8994 active device checks for WM1811
mfd: Set tps6586x bits if new value is different from the old one
mfd: Set da903x bits if new value is different from the old one
mfd: Set adp5520 bits if new value is different from the old one
mfd: Add missed free_irq in da903x_remove
|
|
lockdep reports a deadlock in jfs because a special inode's rw semaphore
is taken recursively. The mapping's gfp mask is GFP_NOFS, but is not
used when __read_cache_page() calls add_to_page_cache_lru().
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|