Age | Commit message (Collapse) | Author |
|
Wakeup statistics used by Android are slightly different from what we
have in wakeup sources at the moment and there aren't any known
users of those statistics other than Android, so modify them to make
it easier for Android to switch to wakeup sources.
This removes the struct wakeup_source's hit_cout field, which is very
rough and therefore not very useful, and adds two new fields,
wakeup_count and expire_count. The first one tracks how many times
the wakeup source is activated with events_check_enabled set (which
roughly corresponds to the situations when a system power transition
to a sleep state is in progress and would be aborted by this wakeup
source if it were the only active one at that time) and the second
one is the number of times the wakeup source has been activated with
a timeout that expired.
Additionally, the last_time field is now updated when the wakeup
source is deactivated too (previously it was only updated during
the wakeup source's activation), which seems to be what Android does
with the analogous counter for wakelocks.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
|
|
This new driver registers the NXP ISP1301 chip via the I2C subsystem. The chip
is the USB transceiver shared by ohci-nxp, lpc32xx_udc (gadget) and
isp1301_omap.
ISP1301 is a very low-level driver that primarily separates out the I2C client
registration of the ISP1301 chip (including instantiation via DT), used by
other drivers, and declares the chip's registers. It's only a helper driver for
some OHCI and USB device drivers. The driver can be considered as a register
set extension of ohci-nxp, lpc32xx-udc and isp1301_omap, which in turn know
best what to do with the low level functionality (individual ISP1301 registers
and timing, see the different initialization strategies in those drivers).
Those drivers previously internally duplicated ISP1301 register definitions
which is solved by this new isp1301 driver. The ISP1301 registers exposed via
isp1301.h can be accessed by other drivers using it with standard i2c_smbus_*()
accesses.
Following patches let the respective USB host and gadget drivers use this
driver, instead of duplicating ISP1301 handling.
Signed-off-by: Roland Stigge <stigge@antcom.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
On some HCDs usb_unlink_urb() can directly call the
completion handler. That limits the spinlocks that can
be taken in the handler to locks not held while calling
usb_unlink_urb()
To prevent a race with resubmission, this patch exposes
usbcore's infrastructure for blocking submission, uses it
and so drops the lock without causing a race in usbhid.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
fix kernel doc typos in function names
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add ECN (Explicit Congestion Notification) marking capability to netem
tc qdisc add dev eth0 root netem drop 0.5 ecn
Instead of dropping packets, try to ECN mark them.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
remove useless casts and rename variables for less confusion.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
L2TPv3 defines an IP encapsulation packet format where data is carried
directly over IP (no UDP). The kernel already has support for L2TP IP
encapsulation over IPv4 (l2tp_ip). This patch introduces support for
L2TP IP encapsulation over IPv6.
The implementation is derived from ipv6/raw and ipv4/l2tp_ip.
Signed-off-by: Chris Elston <celston@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds support for unmanaged L2TPv3 tunnels over IPv6 using
the netlink API. We already support unmanaged L2TPv3 tunnels over
IPv4. A patch to iproute2 to make use of this feature will be
submitted separately.
Signed-off-by: Chris Elston <celston@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Checkpatch warns about the use of __attribute__((packed)). So use the
recommended __packed syntax instead.
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
drivers/infiniband/ulp/srp/ib_srp.c #defines pr_fmt() PFX fmt, but PFX
is not #defined until after <linux/*> headers are included.
This results in a bad expansion of the pr_warn() in the stub function.
2084c2084
< printk("<4>" PFX "dyndbg supported only in " "CONFIG_DYNAMIC_DEBUG builds\n")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The core branch is behind driver commits that we want to build
on for 3.5, hence I'm pulling in a later -rc.
Linux 3.4-rc5
Conflicts:
Documentation/feature-removal-schedule.txt
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If we are in the middle of an I2C transfer we need to deny suspend
of the AB8500 core. Implement an atomic reference counter for the
I2C operations to make sure we don't do this.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Reviewed-by: Mattias Wallin <mattias.wallin@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
The AB8505 and AB9540 has extended support for micro USB
resistance detection, used for detecting chargers. Let's
register resources for this resource. Let's also split off the
separate codec device for AB9540.
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Switch the driver over to device group handling. By adding the
HID_GROUP_MULTITOUCH group to hid-core, hid-generic will no longer
match multitouch devices. By adding the HID_GROUP_MULTITOUCH entry to
the device list, hid-multitouch will match all unknown multitouch
devices, and udev will automatically load the module.
Since HID_QUIRK_MULTITOUCH never gets set, the special quirks handling
can be removed. Since all HID MT devices have HID_DG_CONTACTID, they
can be removed from the hid_have_special_driver list.
With this patch, the unknown device ids are no longer NULL, so the code
is modified to check for the generic entry instead.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Devices that do not have a special driver are handled by the generic
driver. This patch does the same thing using device groups; Instead of
forcing a particular driver, the appropriate driver is picked up by
udev. As a consequence, one can now move a device from generic to
specific handling by a simple rebind. By adding a new device id to the
generic driver, the same thing can be done in reverse.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Most HID drivers do not need to know what bus driver is in use.
A generic group driver can drive any hid device, and the device
list should not need to be duplicated for each new bus.
This patch adds wildcard matching to the HID bus, simplifying device
list handling for group drivers.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
In order to allow the report descriptor to influence the hid device
properties, one needs to parse the descriptor early, without reference
to any driver. Scan the descriptor for group information during device
add, before the device has been broadcast to userland. The device
modalias will contain group information which can be used to
differentiate between modules. For starters, just handle the generic
group.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
HID devices are only partially presented to userland. Hotplugged
devices emit events containing a modalias based on the basic bus,
vendor and product entities. However, in practise a hid device can
depend on details such as a single usb interface or a particular item
in a report descriptor.
This patch adds a device group to the hid device id, and broadcasts it
using uevent and the device modalias. The module alias generation is
modified to match. As a consequence, a device with a non-zero group
will be processed by the corresponding group driver instead of by the
generic hid driver.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Acked-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
The low-level driver can read the report descriptor, but it cannot
determine driver-specific changes to it. The hid core can fixup
and parse the report descriptor during driver attach, but does
not have direct access to the descriptor when doing so.
To be able to handle attach/detach of hid drivers properly,
a semantic change to hid_parse_report() is needed. This function has
been used in two ways, both as descriptor reader in the ll drivers and
as a parsor in the probe of the drivers. This patch splits the usage
by introducing hid_open_report(), and modifies the hid_parse() macro
to call hid_open_report() instead. The only usage of hid_parse_report()
is then to read and store the device descriptor. As a consequence, we
can handle the report fixups automatically inside the hid core.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Tested-by: Nikolai Kondrashov <spbnick@gmail.com>
Tested-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
Adding support for device sleep through the external input control
signal "SLEEP".
Changing the SLEEP signal state can switch the device into SLEEP and
ACTIVE state.
Also adding sleep configuration for different resources so that they
should be keep on during sleep state of device.
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
The mfd/asic3 driver does not set the ds1wm_driver_data clock_rate field
before passing the structure to the DS1WM w1 busmaster driver.
This was not noticed before commit 26a6afb, because ds1wm_find_divisor()
unintentionally returned the correct divisor when a zero clock_rate was
passed in. However after that commit DS1WM fails a zero clock_rate:
ds1wm ds1wm: no suitable divisor for 0Hz clock
This patch sets the ds1wm_driver_data clock_rate field.
Signed-off-by: Paul Parsons <lost.distance@yahoo.com>
Acked-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
This driver currently creates resources for use by a forthcoming ICH
chipset GPIO driver. It could be expanded to create the resources for
converting the esb2rom (mtd) and iTCO_wdt (wdt), and potentially more,
drivers to use the mfd model.
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
GRO can check if skb to be merged has its skb->head mapped to a page
fragment, instead of a kmalloc() area.
We 'upgrade' skb->head as a fragment in itself
This avoids the frag_list fallback, and permits to build true GRO skb
(one sk_buff and up to 16 fragments), using less memory.
This reduces number of cache misses when user makes its copy, since a
single sk_buff is fetched.
This is a followup of patch "net: allow skb->head to be a page fragment"
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
skb->head is currently allocated from kmalloc(). This is convenient but
has the drawback the data cannot be converted to a page fragment if
needed.
We have three spots were it hurts :
1) GRO aggregation
When a linear skb must be appended to another skb, GRO uses the
frag_list fallback, very inefficient since we keep all struct sk_buff
around. So drivers enabling GRO but delivering linear skbs to network
stack aren't enabling full GRO power.
2) splice(socket -> pipe).
We must copy the linear part to a page fragment.
This kind of defeats splice() purpose (zero copy claim)
3) TCP coalescing.
Recently introduced, this permits to group several contiguous segments
into a single skb. This shortens queue lengths and save kernel memory,
and greatly reduce probabilities of TCP collapses. This coalescing
doesnt work on linear skbs (or we would need to copy data, this would be
too slow)
Given all these issues, the following patch introduces the possibility
of having skb->head be a fragment in itself. We use a new skb flag,
skb->head_frag to carry this information.
build_skb() is changed to accept a frag_size argument. Drivers willing
to provide a page fragment instead of kmalloc() data will set a non zero
value, set to the fragment size.
Then, on situations we need to convert the skb head to a frag in itself,
we can check if skb->head_frag is set and avoid the copies or various
fallbacks we have.
This means drivers currently using frags could be updated to avoid the
current skb->head allocation and reduce their memory footprint (aka skb
truesize). (thats 512 or 1024 bytes saved per skb). This also makes
bpf/netfilter faster since the 'first frag' will be part of skb linear
part, no need to copy data.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch implements the directed yield hypercall found on other
System z hypervisors. It delegates execution time to the virtual cpu
specified in the instruction's parameter.
Useful to avoid long spinlock waits in the guest.
Christian Borntraeger: moved common code in virt/kvm/
Signed-off-by: Konstantin Weitz <WEITZKON@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
|
|
struct nfs_direct_req can't compile when struct pnfs_ds_commit_info
is undefined.
Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Fred Isaman <iisaman@netapp.com>
|
|
This patch adds support for Cirrus Logic CS42L52 Low Power Stereo Codec
Signed-off-by: Brian Austin <brian.austin@cirrus.com>
Signed-off-by: Georgi Vlaev <joe@nucleusys.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of SAS and SATA fixes; there are one or two longstanding
bug fixes, but most of this is regression fixes."
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
[SCSI] libfc: update mfs boundry checking
[SCSI] Revert "[SCSI] libsas: fix sas port naming"
[SCSI] libsas: fix false positive 'device attached' conditions
[SCSI] libsas, libata: fix start of life for a sas ata_port
[SCSI] libsas: fix ata_eh clobbering ex_phys via smp_ata_check_ready
[SCSI] libsas: unify domain_device sas_rphy lifetimes
[SCSI] libsas: fix sas_get_port_device regression
[SCSI] libsas: fix sas_find_bcast_phy() in the presence of 'vacant' phys
[SCSI] libsas: introduce sas_work to fix sas_drain_work vs sas_queue_work
[SCSI] libata: Pass correct DMA device to scsi host
[SCSI] scsi_lib: use correct DMA device in __scsi_alloc_queue
|
|
More recent versions of the UEFI spec have added new attributes for
variables. Add them.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Some devices does not support bulk read and write operations, for them
we have series of single write and read operations.
Signed-off-by: Anthony Olech <Anthony.Olech@diasemi.com>
Signed-off-by: Ashish Jangam <ashish.jangam@kpitcummins.com>
[Fixed coding style, don't check use_single_rw before assign --broonie ]
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
|
|
A PCIe downstream port is a P2P bridge. Its secondary interface is
a link that should lead only to device 0 (unless ARI is enabled)[1], so
we don't probe for non-zero device numbers.
Some Stratus ftServer systems have a PCIe downstream port (02:00.0) that
leads to both an upstream port (03:00.0) and a downstream port (03:01.0),
and 03:01.0 has important devices below it:
[0000:02]-+-00.0-[03-3c]--+-00.0-[04-09]--...
\-01.0-[0a-0d]--+-[USB]
+-[NIC]
+-...
Previously, we didn't enumerate device 03:01.0, so USB and the network
didn't work. This patch adds a DMI quirk to scan all device numbers,
not just 0, below a downstream port.
Based on a patch by Prarit Bhargava.
[1] PCIe spec r3.0, sec 7.3.1
CC: Myron Stowe <mstowe@redhat.com>
CC: Don Dutile <ddutile@redhat.com>
CC: James Paradis <james.paradis@stratus.com>
CC: Matthew Wilcox <matthew.r.wilcox@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
We need a hook to release host bridge resources allocated when creating
root bus.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
<asm-generic/statfs.h> is exported to userspace, so using
BITS_PER_LONG is invalid. We need to use __BITS_PER_LONG instead.
This is kernel bugzilla 43165.
Reported-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1335465916-16965-1-git-send-email-hpa@linux.intel.com
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: <stable@vger.kernel.org>
|
|
Use that device for pci_root_bus bridge pointer.
Use pci_release_bus_bridge_dev() to release allocated pci_host_bridge in
remove path.
Use root bus bridge pointer to get host bridge pointer instead of searching
host bridge list. That leaves the host bridge list unused, so remove it.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
This introduces a fake module param $module.dyndbg. Its based upon
Thomas Renninger's $module.ddebug boot-time debugging patch from
https://lkml.org/lkml/2010/9/15/397
The 'fake' module parameter is provided for all modules, whether or
not they need it. It is not explicitly added to each module, but is
implemented in callbacks invoked from parse_args.
For builtin modules, dynamic_debug_init() now directly calls
parse_args(..., &ddebug_dyndbg_boot_params_cb), to process the params
undeclared in the modules, just after the ddebug tables are processed.
While its slightly weird to reprocess the boot params, parse_args() is
already called repeatedly by do_initcall_levels(). More importantly,
the dyndbg queries (given in ddebug_query or dyndbg params) cannot be
activated until after the ddebug tables are ready, and reusing
parse_args is cleaner than doing an ad-hoc parse. This reparse would
break options like inc_verbosity, but they probably should be params,
like verbosity=3.
ddebug_dyndbg_boot_params_cb() handles both bare dyndbg (aka:
ddebug_query) and module-prefixed dyndbg params, and ignores all other
parameters. For example, the following will enable pr_debug()s in 4
builtin modules, in the order given:
dyndbg="module params +p; module aio +p" module.dyndbg=+p pci.dyndbg
For loadable modules, parse_args() in load_module() calls
ddebug_dyndbg_module_params_cb(). This handles bare dyndbg params as
passed from modprobe, and errors on other unknown params.
Note that modprobe reads /proc/cmdline, so "modprobe foo" grabs all
foo.params, strips the "foo.", and passes these to the kernel.
ddebug_dyndbg_module_params_cb() is again called for the unknown
params; it handles dyndbg, and errors on others. The "doing" arg
added previously contains the module name.
For non CONFIG_DYNAMIC_DEBUG builds, the stub function accepts
and ignores $module.dyndbg params, other unknowns get -ENOENT.
If no param value is given (as in pci.dyndbg example above), "+p" is
assumed, which enables all pr_debug callsites in the module.
The dyndbg fake parameter is not shown in /sys/module/*/parameters,
thus it does not use any resources. Changes to it are made via the
control file.
Also change pr_info in ddebug_exec_queries to vpr_info,
no need to see it all the time.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
CC: Thomas Renninger <trenn@suse.de>
CC: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add a 3rd arg, named "doing", to unknown-options callbacks invoked
from parse_args(). The arg is passed as:
"Booting kernel" from start_kernel(),
initcall_level_names[i] from do_initcall_level(),
mod->name from load_module(), via parse_args(), parse_one()
parse_args() already has the "name" parameter, which is renamed to
"doing" to better reflect current uses 1,2 above. parse_args() passes
it to an altered parse_one(), which now passes it down into the
unknown option handler callbacks.
The mod->name will be needed to handle dyndbg for loadable modules,
since params passed by modprobe are not qualified (they do not have a
"$modname." prefix), and by the time the unknown-param callback is
called, the module name is not otherwise available.
Minor tweaks:
Add param-name to parse_one's pr_debug(), current message doesnt
identify the param being handled, add it.
Add a pr_info to print current level and level_name of the initcall,
and number of registered initcalls at that level. This adds 7 lines
to dmesg output, like:
initlevel:6=device, 172 registered initcalls
Drop "parameters" from initcall_level_names[], its unhelpful in the
pr_info() added above. This array is passed into parse_args() by
do_initcall_level().
CC: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This commit implements an SRCU state machine in support of call_srcu().
The state machine is preemptible, light-weight, and single-threaded,
minimizing synchronization overhead. In particular, there is no longer
any need for synchronize_srcu() to be guarded by a mutex.
Expedited processing is handled, at least in the absence of concurrent
grace-period operations on that same srcu_struct structure, by having
the synchronize_srcu_expedited() thread take on the role of the
workqueue thread for one iteration.
There is a reasonable probability that a given SRCU callback will
be invoked on the same CPU that registered it, however, there is no
guarantee. Concurrent SRCU grace-period primitives can cause callbacks
to be executed elsewhere, even in absence of CPU-hotplug operations.
Callbacks execute in process context, but under the influence of
local_bh_disable(), so it is illegal to sleep in an SRCU callback
function.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The old srcu_barrier() macro is now unused. This commit removes it so
that it may be used for the SRCU flavor of rcu_barrier(), which will in
turn be needed to allow the upcoming call_srcu() to be used from within
modules.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
This commit implements a variant of Peter's algorithm, which may be found
at https://lkml.org/lkml/2012/2/1/119.
o Make the checking lock-free to enable parallel checking.
Parallel checking is required when (1) the original checking
task is preempted for a long time, (2) sychronize_srcu_expedited()
starts during an ongoing SRCU grace period, or (3) we wish to
avoid acquiring a lock.
o Since the checking is lock-free, we avoid a mutex in state machine
for call_srcu().
o Remove the SRCU_REF_MASK and remove the coupling with the flipping.
This might allow us to remove the preempt_disable() in future
versions, though such removal will need great care because it
rescinds the one-old-reader-per-CPU guarantee.
o Remove a smp_mb(), simplify the comments and make the smp_mb() pairs
more intuitive.
Inspired-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The purpose of the upper bit of SRCU's per-CPU counters is to guarantee
that no reasonable series of srcu_read_lock() and srcu_read_unlock()
operations can return the value of the counter to its original value.
This guarantee is require only after the index has been switched to
the other set of counters, so at most one srcu_read_lock() can affect
a given CPU's counter. The number of srcu_read_unlock() operations
on a given counter is limited to the number of tasks in the system,
which given the Linux kernel's current structure is limited to far less
than 2^30 on 32-bit systems and far less than 2^62 on 64-bit systems.
(Something about a limited number of bytes in the kernel's address space.)
Therefore, if srcu_read_lock() increments the upper bits, then
srcu_read_unlock() need not do so. In this case, an srcu_read_lock() and
an srcu_read_unlock() will flip the lower bit of the upper field of the
counter. An unreasonably large additional number of srcu_read_unlock()
operations would be required to return the counter to its initial value,
thus preserving the guarantee.
This commit takes this approach, which further allows it to shrink
the size of the upper field to one bit, making the number of
srcu_read_unlock() operations required to return the counter to its
initial value even more unreasonable than before.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
|
The current implementation of synchronize_srcu_expedited() can cause
severe OS jitter due to its use of synchronize_sched(), which in turn
invokes try_stop_cpus(), which causes each CPU to be sent an IPI.
This can result in severe performance degradation for real-time workloads
and especially for short-interation-length HPC workloads. Furthermore,
because only one instance of try_stop_cpus() can be making forward progress
at a given time, only one instance of synchronize_srcu_expedited() can
make forward progress at a time, even if they are all operating on
distinct srcu_struct structures.
This commit, inspired by an earlier implementation by Peter Zijlstra
(https://lkml.org/lkml/2012/1/31/211) and by further offline discussions,
takes a strictly algorithmic bits-in-memory approach. This has the
disadvantage of requiring one explicit memory-barrier instruction in
each of srcu_read_lock() and srcu_read_unlock(), but on the other hand
completely dispenses with OS jitter and furthermore allows SRCU to be
used freely by CPUs that RCU believes to be idle or offline.
The update-side implementation handles the single read-side memory
barrier by rechecking the per-CPU counters after summing them and
by running through the update-side state machine twice.
This implementation has passed moderate rcutorture testing on both
x86 and Power. Also updated to use this_cpu_ptr() instead of per_cpu_ptr(),
as suggested by Peter Zijlstra.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
|
Denys Fedoryshchenko reported frequent crashes on a proxy server and kindly
provided a lockdep report that explains it all :
[ 762.903868]
[ 762.903880] =================================
[ 762.903890] [ INFO: inconsistent lock state ]
[ 762.903903] 3.3.4-build-0061 #8 Not tainted
[ 762.904133] ---------------------------------
[ 762.904344] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[ 762.904542] squid/1603 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 762.904542] (key#3){+.?...}, at: [<c0232cc4>]
__percpu_counter_sum+0xd/0x58
[ 762.904542] {IN-SOFTIRQ-W} state was registered at:
[ 762.904542] [<c0158b84>] __lock_acquire+0x284/0xc26
[ 762.904542] [<c01598e8>] lock_acquire+0x71/0x85
[ 762.904542] [<c0349765>] _raw_spin_lock+0x33/0x40
[ 762.904542] [<c0232c93>] __percpu_counter_add+0x58/0x7c
[ 762.904542] [<c02cfde1>] sk_clone_lock+0x1e5/0x200
[ 762.904542] [<c0303ee4>] inet_csk_clone_lock+0xe/0x78
[ 762.904542] [<c0315778>] tcp_create_openreq_child+0x1b/0x404
[ 762.904542] [<c031339c>] tcp_v4_syn_recv_sock+0x32/0x1c1
[ 762.904542] [<c031615a>] tcp_check_req+0x1fd/0x2d7
[ 762.904542] [<c0313f77>] tcp_v4_do_rcv+0xab/0x194
[ 762.904542] [<c03153bb>] tcp_v4_rcv+0x3b3/0x5cc
[ 762.904542] [<c02fc0c4>] ip_local_deliver_finish+0x13a/0x1e9
[ 762.904542] [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
[ 762.904542] [<c02fc652>] ip_local_deliver+0x41/0x45
[ 762.904542] [<c02fc4d1>] ip_rcv_finish+0x31a/0x33c
[ 762.904542] [<c02fc539>] NF_HOOK.clone.11+0x46/0x4d
[ 762.904542] [<c02fc857>] ip_rcv+0x201/0x23e
[ 762.904542] [<c02daa3a>] __netif_receive_skb+0x319/0x368
[ 762.904542] [<c02dac07>] netif_receive_skb+0x4e/0x7d
[ 762.904542] [<c02dacf6>] napi_skb_finish+0x1e/0x34
[ 762.904542] [<c02db122>] napi_gro_receive+0x20/0x24
[ 762.904542] [<f85d1743>] e1000_receive_skb+0x3f/0x45 [e1000e]
[ 762.904542] [<f85d3464>] e1000_clean_rx_irq+0x1f9/0x284 [e1000e]
[ 762.904542] [<f85d3926>] e1000_clean+0x62/0x1f4 [e1000e]
[ 762.904542] [<c02db228>] net_rx_action+0x90/0x160
[ 762.904542] [<c012a445>] __do_softirq+0x7b/0x118
[ 762.904542] irq event stamp: 156915469
[ 762.904542] hardirqs last enabled at (156915469): [<c019b4f4>]
__slab_alloc.clone.58.clone.63+0xc4/0x2de
[ 762.904542] hardirqs last disabled at (156915468): [<c019b452>]
__slab_alloc.clone.58.clone.63+0x22/0x2de
[ 762.904542] softirqs last enabled at (156915466): [<c02ce677>]
lock_sock_nested+0x64/0x6c
[ 762.904542] softirqs last disabled at (156915464): [<c0349914>]
_raw_spin_lock_bh+0xe/0x45
[ 762.904542]
[ 762.904542] other info that might help us debug this:
[ 762.904542] Possible unsafe locking scenario:
[ 762.904542]
[ 762.904542] CPU0
[ 762.904542] ----
[ 762.904542] lock(key#3);
[ 762.904542] <Interrupt>
[ 762.904542] lock(key#3);
[ 762.904542]
[ 762.904542] *** DEADLOCK ***
[ 762.904542]
[ 762.904542] 1 lock held by squid/1603:
[ 762.904542] #0: (sk_lock-AF_INET){+.+.+.}, at: [<c03055c0>]
lock_sock+0xa/0xc
[ 762.904542]
[ 762.904542] stack backtrace:
[ 762.904542] Pid: 1603, comm: squid Not tainted 3.3.4-build-0061 #8
[ 762.904542] Call Trace:
[ 762.904542] [<c0347b73>] ? printk+0x18/0x1d
[ 762.904542] [<c015873a>] valid_state+0x1f6/0x201
[ 762.904542] [<c0158816>] mark_lock+0xd1/0x1bb
[ 762.904542] [<c015876b>] ? mark_lock+0x26/0x1bb
[ 762.904542] [<c015805d>] ? check_usage_forwards+0x77/0x77
[ 762.904542] [<c0158bf8>] __lock_acquire+0x2f8/0xc26
[ 762.904542] [<c0159b8e>] ? mark_held_locks+0x5d/0x7b
[ 762.904542] [<c0159cf6>] ? trace_hardirqs_on+0xb/0xd
[ 762.904542] [<c0158dd4>] ? __lock_acquire+0x4d4/0xc26
[ 762.904542] [<c01598e8>] lock_acquire+0x71/0x85
[ 762.904542] [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
[ 762.904542] [<c0349765>] _raw_spin_lock+0x33/0x40
[ 762.904542] [<c0232cc4>] ? __percpu_counter_sum+0xd/0x58
[ 762.904542] [<c0232cc4>] __percpu_counter_sum+0xd/0x58
[ 762.904542] [<c02cebc4>] __sk_mem_schedule+0xdd/0x1c7
[ 762.904542] [<c02d178d>] ? __alloc_skb+0x76/0x100
[ 762.904542] [<c0305e8e>] sk_wmem_schedule+0x21/0x2d
[ 762.904542] [<c0306370>] sk_stream_alloc_skb+0x42/0xaa
[ 762.904542] [<c0306567>] tcp_sendmsg+0x18f/0x68b
[ 762.904542] [<c031f3dc>] ? ip_fast_csum+0x30/0x30
[ 762.904542] [<c0320193>] inet_sendmsg+0x53/0x5a
[ 762.904542] [<c02cb633>] sock_aio_write+0xd2/0xda
[ 762.904542] [<c015876b>] ? mark_lock+0x26/0x1bb
[ 762.904542] [<c01a1017>] do_sync_write+0x9f/0xd9
[ 762.904542] [<c01a2111>] ? file_free_rcu+0x2f/0x2f
[ 762.904542] [<c01a17a1>] vfs_write+0x8f/0xab
[ 762.904542] [<c01a284d>] ? fget_light+0x75/0x7c
[ 762.904542] [<c01a1900>] sys_write+0x3d/0x5e
[ 762.904542] [<c0349ec9>] syscall_call+0x7/0xb
[ 762.904542] [<c0340000>] ? rp_sidt+0x41/0x83
Bug is that sk_sockets_allocated_read_positive() calls
percpu_counter_sum_positive() without BH being disabled.
This bug was added in commit 180d8cd942ce33
(foundations of per-cgroup memory pressure controlling.), since previous
code was using percpu_counter_read_positive() which is IRQ safe.
In __sk_mem_schedule() we dont need the precise count of allocated
sockets and can revert to previous behavior.
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Sined-off-by: Eric Dumazet <edumazet@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
clean up some space-before-tabs problems.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
Change order of init so netns init is ready
when register ioctl and netlink.
Ver2
Whitespace fixes and __init added.
Reported-by: "Ryan O'Hara" <rohara@redhat.com>
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
ip_vs_create_timeout_table() can return NULL
All functions protocol init_netns is affected of this patch.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Commit b6787242f327 ("HID: hidraw: add proper error handling to raw event
reporting") forgot to update the static inline version of
hidraw_report_event() for the case when CONFIG_HIDRAW is unset. Fix that
up.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
These are new requests introduced by USB 3.0
Specification. Gadget controllers should implement
them.
Signed-off-by: Felipe Balbi <balbi@ti.com>
|
|
This option has been deprecated for many years now, and no userspace
tools use it anymore, so it should be safe to finally remove it.
Reported-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|