summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2010-10-26sgi-xp: incoming XPC channel messages can come in after the channel's ↵Robin Holt
partition structures have been torn down Under some workloads, some channel messages have been observed being delayed on the sending side past the point where the receiving side has been able to tear down its partition structures. This condition is already detected in xpc_handle_activate_IRQ_uv(), but that information is not given to xpc_handle_activate_mq_msg_uv(). As a result, xpc_handle_activate_mq_msg_uv() assumes the structures still exist and references them, causing a NULL-pointer deref. Signed-off-by: Robin Holt <holt@sgi.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26um: fix global timer issue when using CONFIG_NO_HZRichard Weinberger
This fixes a issue which was introduced by fe2cc53e ("uml: track and make up lost ticks"). timeval_to_ns() returns long long and not int. Due to that UML's timer did not work properlt and caused timer freezes. Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26mm, page-allocator: do not check the state of a non-existant buddy during freeMel Gorman
There is a bug in commit 6dda9d55 ("page allocator: reduce fragmentation in buddy allocator by adding buddies that are merging to the tail of the free lists") that means a buddy at order MAX_ORDER is checked for merging. A page of this order never exists so at times, an effectively random piece of memory is being checked. Alan Curry has reported that this is causing memory corruption in userspace data on a PPC32 platform (http://lkml.org/lkml/2010/10/9/32). It is not clear why this is happening. It could be a cache coherency problem where pages mapped in both user and kernel space are getting different cache lines due to the bad read from kernel space (http://lkml.org/lkml/2010/10/13/179). It could also be that there are some special registers being io-remapped at the end of the memmap array and that a read has special meaning on them. Compiler bugs have been ruled out because the assembly before and after the patch looks relatively harmless. This patch fixes the problem by ensuring we are not reading a possibly invalid location of memory. It's not clear why the read causes corruption but one way or the other it is a buggy read. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Corrado Zoccolo <czoccolo@gmail.com> Reported-by: Alan Curry <pacman@kosh.dhis.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26types.h: move misplaced commentAndrew Morton
This comment landed in the wrong place. Cc: Andi Kleen <andi@firstfloor.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David Miller <davem@davemloft.net> Cc: Eric Paris <eparis@redhat.com> Cc: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26mm: fix return value of scan_lru_pages in memory unplugKAMEZAWA Hiroyuki
scan_lru_pages returns pfn. So, it's type should be "unsigned long" not "int". Note: I guess this has been work until now because memory hotplug tester's machine has not very big memory.... physical address < 32bit << PAGE_SHIFT. Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iboe', 'ipoib', ↵Roland Dreier
'misc', 'mlx4', 'nes', 'qib' and 'srp' into for-next
2010-10-26IB/qib: clean up properly if pci_set_consistent_dma_mask() failsJason Gunthorpe
Clean up properly if pci_set_consistent_dma_mask() fails. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-26IB/qib: Allow driver to load if PCIe AER failsRalph Campbell
Some PCIe root complex chip sets don't support advanced error reporting. Allow the driver to load OK if pci_enable_pcie_error_reporting() fails. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-26IB/qib: Fix uninitialized pointer if CONFIG_PCI_MSI not setRalph Campbell
If CONFIG_PCI_MSI is not set, and a QLE7140 is present, the pointer "dd" is uninitialized. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-26IB/qib: Fix extra log level in qib_early_err()Jason Gunthorpe
Noticed this odd looking thing in dmesg: ib_qib 0000:02:00.0: <3>ib_qib: Unable to enable pcie error reporting: -5 which is due to a bad use of dev_info. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-26x86: allocate space within a region top-downBjorn Helgaas
Request that allocate_resource() use available space from high addresses first, rather than the default of using low addresses first. The most common place this makes a difference is when we move or assign new PCI device resources. Low addresses are generally scarce, so it's better to use high addresses when possible. This follows Windows practice for PCI allocation. Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c42 Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-26x86: update iomem_resource end based on CPU physical address capabilitiesBjorn Helgaas
The iomem_resource map reflects the available physical address space. We statically initialize the end to -1, i.e., 0xffffffff_ffffffff, but of course we can only use as much as the CPU can address. This patch updates the end based on the CPU capabilities, so we don't mistakenly allocate space that isn't usable, as we're likely to do when allocating from the top-down. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-26x86/PCI: allocate space from the end of a region, not the beginningBjorn Helgaas
Allocate from the end of a region, not the beginning. For example, if we need to allocate 0x800 bytes for a device on bus 0000:00 given these resources: [mem 0xbff00000-0xdfffffff] PCI Bus 0000:00 [mem 0xc0000000-0xdfffffff] PCI Bus 0000:02 the available space at [mem 0xbff00000-0xbfffffff] is passed to the alignment callback (pcibios_align_resource()). Prior to this patch, we would put the new 0x800 byte resource at the beginning of that available space, i.e., at [mem 0xbff00000-0xbff007ff]. With this patch, we put it at the end, at [mem 0xbffff800-0xbfffffff]. Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c41 Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-26PCI: allocate bus resources from the top downBjorn Helgaas
Allocate space from the highest-address PCI bus resource first, then work downward. Previously, we looked for space in PCI host bridge windows in the order we discovered the windows. For example, given the following windows (discovered via an ACPI _CRS method): pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff] pci_root PNP0A03:00: host bridge window [mem 0x000c0000-0x000effff] pci_root PNP0A03:00: host bridge window [mem 0x000f0000-0x000fffff] pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xf7ffffff] pci_root PNP0A03:00: host bridge window [mem 0xff980000-0xff980fff] pci_root PNP0A03:00: host bridge window [mem 0xff97c000-0xff97ffff] pci_root PNP0A03:00: host bridge window [mem 0xfed20000-0xfed9ffff] we attempted to allocate from [mem 0x000a0000-0x000bffff] first, then [mem 0x000c0000-0x000effff], and so on. With this patch, we allocate from [mem 0xff980000-0xff980fff] first, then [mem 0xff97c000-0xff97ffff], [mem 0xfed20000-0xfed9ffff], etc. Allocating top-down follows Windows practice, so we're less likely to trip over BIOS defects in the _CRS description. On the machine above (a Dell T3500), the [mem 0xbff00000-0xbfffffff] region doesn't actually work and is likely a BIOS defect. The symptom is that we move the AHCI controller to 0xbff00000, which leads to "Boot has failed, sleeping forever," a BUG in ahci_stop_engine(), or some other boot failure. Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c43 Reference: https://bugzilla.redhat.com/show_bug.cgi?id=620313 Reference: https://bugzilla.redhat.com/show_bug.cgi?id=629933 Reported-by: Brian Bloniarz <phunge0@hotmail.com> Reported-and-tested-by: Stefan Becker <chemobejk@gmail.com> Reported-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-26resources: support allocating space within a region from the top downBjorn Helgaas
Allocate space from the top of a region first, then work downward, if an architecture desires this. When we allocate space from a resource, we look for gaps between children of the resource. Previously, we always looked at gaps from the bottom up. For example, given this: [mem 0xbff00000-0xf7ffffff] PCI Bus 0000:00 [mem 0xbff00000-0xbfffffff] gap -- available [mem 0xc0000000-0xdfffffff] PCI Bus 0000:02 [mem 0xe0000000-0xf7ffffff] gap -- available we attempted to allocate from the [mem 0xbff00000-0xbfffffff] gap first, then the [mem 0xe0000000-0xf7ffffff] gap. With this patch an architecture can choose to allocate from the top gap [mem 0xe0000000-0xf7ffffff] first. We can't do this across the board because iomem_resource.end is initialized to 0xffffffff_ffffffff on 64-bit architectures, and most machines can't address the entire 64-bit physical address space. Therefore, we only allocate top-down if the arch requests it by clearing "resource_alloc_from_bottom". Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-26resources: handle overflow when aligning start of available areaBjorn Helgaas
If tmp.start is near ~0, ALIGN(tmp.start) may overflow, which would make us think there's more available space than there really is. We would likely return something that conflicts with a previous resource, which would cause a failure when allocate_resource() requests the newly- allocated region. Reference: https://bugzilla.redhat.com/show_bug.cgi?id=646027 Reported-by: Fabrice Bellet <fabrice@bellet.info> Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-26resources: ensure callback doesn't allocate outside available spaceBjorn Helgaas
The alignment callback returns a proposed location, which may have been adjusted to avoid ISA aliases or for other architecture-specific reasons. We already had a check ("tmp.start < tmp.end") to make sure the callback doesn't return an area that extends past the available area. This patch reworks the check to make sure it doesn't return an area that extends either below or above the available area. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-26resources: factor out resource_clip() to simplify find_resource()Bjorn Helgaas
This factors out the min/max clipping to simplify find_resource(). No functional change. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-26resources: add a default alignf to simplify find_resource()Bjorn Helgaas
This removes a test from find_resource(), which is getting cluttered. No functional change. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-26x86, uv: Enable Westmere support on SGI UVRuss Anderson
Enable Westmere support on SGI UV. The UV initialization code is dependent on the APICID bits. Westmere-EX uses different APIC bit mapping than Nehalem-EX. This code reads the apic shift value from a UV MMR to do the proper bit decoding to determint the pnode. Signed-off-by: Russ Anderson <rja@sgi.com> LKML-Reference: <20101026212728.GB15071@sgi.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-10-26RDMA/cxgb4: Remove unnecessary KERN_<level> useJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-26RDMA/cxgb3: Remove unnecessary KERN_<level> useJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-26intel_idle: do not use the LAPIC timer for ATOM C2Len Brown
If we use the LAPIC timer during ATOM C2 on some nvidia chisets, the system stalls. https://bugzilla.kernel.org/show_bug.cgi?id=21032 Signed-off-by: Len Brown <len.brown@intel.com>
2010-10-26IPv6: Temp addresses are immediately deleted.Glenn Wurster
There is a bug in the interaction between ipv6_create_tempaddr and addrconf_verify. Because ipv6_create_tempaddr uses the cstamp and tstamp from the public address in creating a private address, if we have not received a router advertisement in a while, tstamp + temp_valid_lft might be < now. If this happens, the new address is created inside ipv6_create_tempaddr, then the loop within addrconf_verify starts again and the address is immediately deleted. We are left with no temporary addresses on the interface, and no more will be created until the public IP address is updated. To avoid this, set the expiry time to be the minimum of the time left on the public address or the config option PLUS the current age of the public interface. Signed-off-by: Glenn Wurster <gwurster@scs.carleton.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26IPv6: Create temporary address if none exists.Glenn Wurster
If privacy extentions are enabled, but no current temporary address exists, then create one when we get a router advertisement. Signed-off-by: Glenn Wurster <gwurster@scs.carleton.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26xen: initialize cpu masks for pv guests in xen_smp_initStefano Stabellini
Pv guests don't have ACPI and need the cpu masks to be set correctly as early as possible so we call xen_fill_possible_map from xen_smp_init. On the other hand the initial domain supports ACPI so in this case we skip xen_fill_possible_map and rely on it. However Xen might limit the number of cpus usable by the domain, so we filter those masks during smp initialization using the VCPUOP_is_up hypercall. It is important that the filtering is done before xen_setup_vcpu_info_placement. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2010-10-26ASoC: fsl - fix build error in pcm030-audio-fabric.cLiam Girdwood
Fix build error:- sound/soc/fsl/pcm030-audio-fabric.c:27:33: fatal error: sound/soc-of-simple.h: No such file or directory Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2010-10-26sound/oss/sb_ess.c: delete double assignmentJulia Lawall
Delete successive assignments to the same location. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression i; @@ *i = ...; i = ...; // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2010-10-26perf python scripting: Add futex-contention scriptArnaldo Carvalho de Melo
The equivalent to this SystemTAP script: http://sourceware.org/systemtap/wiki/WSFutexContention [root@doppio ~]# perf trace futex-contention Press control+C to stop and show the summary ^Cnpviewer.bin[15242] lock 7f0a8be19104 contended 29 times, 72806 avg ns npviewer.bin[15242] lock 7f0a8be19130 contended 2 times, 1355 avg ns synergyc[17245] lock f127f4 contended 1 times, 1830569 avg ns firefox[15116] lock 7f2b7238af0c contended 168 times, 1230390 avg ns synergyc[17245] lock f2fc20 contended 1 times, 33149 avg ns npviewer.bin[15255] lock 7f0a8be19074 contended 155 times, 73047 avg ns npviewer.bin[15255] lock 7f0a8be190a0 contended 127 times, 7088 avg ns synergyc[17247] lock f12854 contended 1 times, 46741 avg ns synergyc[17245] lock f12610 contended 1 times, 7358 avg ns [root@doppio ~]# Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-10-26Merge branch 'misc' into releaseLen Brown
2010-10-26Merge branch 'acpi-mmio' into releaseLen Brown
Conflicts: drivers/acpi/osl.c Signed-off-by: Len Brown <len.brown@intel.com>
2010-10-26fib_hash: fix rcu sparse and logical errorsEric Dumazet
While fixing CONFIG_SPARSE_RCU_POINTER errors, I had to fix accesses to fz->fz_hash for real. - &fz->fz_hash[fn_hash(f->fn_key, fz)] + rcu_dereference(fz->fz_hash) + fn_hash(f->fn_key, fz) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26fib: fix fib_nl_newrule()Eric Dumazet
Some panic reports in fib_rules_lookup() show a rule could have a NULL pointer as a next pointer in the rules_list. This can actually happen because of a bug in fib_nl_newrule() : It checks if current rule is the destination of unresolved gotos. (Other rules have gotos to this about to be inserted rule) Problem is it does the resolution of the gotos before the rule is inserted in the rules_list (and has a valid next pointer) Fix this by moving the rules_list insertion before the changes on gotos. A lockless reader can not any more follow a ctarget pointer, unless destination is ready (has a valid next pointer) Reported-by: Oleg A. Arkhangelsky <sysoleg@yandex.ru> Reported-by: Joe Buehler <aspam@cox.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26NTLM auth and sign - Use kernel crypto apis to calculate hashes and smb ↵Shirish Pargaonkar
signatures Use kernel crypto sync hash apis insetead of cifs crypto functions. The calls typically corrospond one to one except that insead of key init, setkey is used. Use crypto apis to generate smb signagtures also. Use hmac-md5 to genereate ntlmv2 hash, ntlmv2 response, and HMAC (CR1 of ntlmv2 auth blob. User crypto apis to genereate signature and to verify signature. md5 hash is used to calculate signature. Use secondary key to calculate signature in case of ntlmssp. For ntlmv2 within ntlmssp, during signature calculation, only 16 bytes key (a nonce) stored within session key is used. during smb signature calculation. For ntlm and ntlmv2 without extended security, 16 bytes key as well as entire response (24 bytes in case of ntlm and variable length in case of ntlmv2) is used for smb signature calculation. For kerberos, there is no distinction between key and response. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-10-26Merge branch 'ima-memory-use-fixes'Linus Torvalds
* ima-memory-use-fixes: IMA: fix the ToMToU logic IMA: explicit IMA i_flag to remove global lock on inode_delete IMA: drop refcnt from ima_iint_cache since it isn't needed IMA: only allocate iint when needed IMA: move read counter into struct inode IMA: use i_writecount rather than a private counter IMA: use inode->i_lock to protect read and write counters IMA: convert internal flags from long to char IMA: use unsigned int instead of long for counters IMA: drop the inode opencount since it isn't needed for operation IMA: use rbtree instead of radix tree for inode information cache
2010-10-26IMA: fix the ToMToU logicEric Paris
Current logic looks like this: rc = ima_must_measure(NULL, inode, MAY_READ, FILE_CHECK); if (rc < 0) goto out; if (mode & FMODE_WRITE) { if (inode->i_readcount) send_tomtou = true; goto out; } if (atomic_read(&inode->i_writecount) > 0) send_writers = true; Lets assume we have a policy which states that all files opened for read by root must be measured. Lets assume the file has permissions 777. Lets assume that root has the given file open for read. Lets assume that a non-root process opens the file write. The non-root process will get to ima_counts_get() and will check the ima_must_measure(). Since it is not supposed to measure it will goto out. We should check the i_readcount no matter what since we might be causing a ToMToU voilation! This is close to correct, but still not quite perfect. The situation could have been that root, which was interested in the mesurement opened and closed the file and another process which is not interested in the measurement is the one holding the i_readcount ATM. This is just overly strict on ToMToU violations, which is better than not strict enough... Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: explicit IMA i_flag to remove global lock on inode_deleteEric Paris
Currently for every removed inode IMA must take a global lock and search the IMA rbtree looking for an associated integrity structure. Instead we explicitly mark an inode when we add an integrity structure so we only have to take the global lock and do the removal if it exists. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: drop refcnt from ima_iint_cache since it isn't neededEric Paris
Since finding a struct ima_iint_cache requires a valid struct inode, and the struct ima_iint_cache is supposed to have the same lifetime as a struct inode (technically they die together but don't need to be created at the same time) we don't have to worry about the ima_iint_cache outliving or dieing before the inode. So the refcnt isn't useful. Just get rid of it and free the structure when the inode is freed. Signed-off-by: Eric Paris <eapris@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: only allocate iint when neededEric Paris
IMA always allocates an integrity structure to hold information about every inode, but only needed this structure to track the number of readers and writers currently accessing a given inode. Since that information was moved into struct inode instead of the integrity struct this patch stops allocating the integrity stucture until it is needed. Thus greatly reducing memory usage. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: move read counter into struct inodeEric Paris
IMA currently allocated an inode integrity structure for every inode in core. This stucture is about 120 bytes long. Most files however (especially on a system which doesn't make use of IMA) will never need any of this space. The problem is that if IMA is enabled we need to know information about the number of readers and the number of writers for every inode on the box. At the moment we collect that information in the per inode iint structure and waste the rest of the space. This patch moves those counters into the struct inode so we can eventually stop allocating an IMA integrity structure except when absolutely needed. This patch does the minimum needed to move the location of the data. Further cleanups, especially the location of counter updates, may still be possible. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: use i_writecount rather than a private counterEric Paris
IMA tracks the number of struct files which are holding a given inode readonly and the number which are holding the inode write or r/w. It needs this information so when a new reader or writer comes in it can tell if this new file will be able to invalidate results it already made about existing files. aka if a task is holding a struct file open RO, IMA measured the file and recorded those measurements and then a task opens the file RW IMA needs to note in the logs that the old measurement may not be correct. It's called a "Time of Measure Time of Use" (ToMToU) issue. The same is true is a RO file is opened to an inode which has an open writer. We cannot, with any validity, measure the file in question since it could be changing. This patch attempts to use the i_writecount field to track writers. The i_writecount field actually embeds more information in it's value than IMA needs but it should work for our purposes and allow us to shrink the struct inode even more. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: use inode->i_lock to protect read and write countersEric Paris
Currently IMA used the iint->mutex to protect the i_readcount and i_writecount. This patch uses the inode->i_lock since we are going to start using in inode objects and that is the most appropriate lock. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: convert internal flags from long to charEric Paris
The IMA flags is an unsigned long but there is only 1 flag defined. Lets save a little space and make it a char. This packs nicely next to the array of u8's. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: use unsigned int instead of long for countersEric Paris
Currently IMA uses 2 longs in struct inode. To save space (and as it seems impossible to overflow 32 bits) we switch these to unsigned int. The switch to unsigned does require slightly different checks for underflow, but it isn't complex. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: drop the inode opencount since it isn't needed for operationEric Paris
The opencount was used to help debugging to make sure that everything which created a struct file also correctly made the IMA calls. Since we moved all of that into the VFS this isn't as necessary. We should be able to get the same amount of debugging out of just the reader and write count. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26IMA: use rbtree instead of radix tree for inode information cacheEric Paris
The IMA code needs to store the number of tasks which have an open fd granting permission to write a file even when IMA is not in use. It needs this information in order to be enabled at a later point in time without losing it's integrity garantees. At the moment that means we store a little bit of data about every inode in a cache. We use a radix tree key'd on the inode's memory address. Dave Chinner pointed out that a radix tree is a terrible data structure for such a sparse key space. This patch switches to using an rbtree which should be more efficient. Bug report from Dave: "I just noticed that slabtop was reporting an awfully high usage of radix tree nodes: OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME 4200331 2778082 66% 0.55K 144839 29 2317424K radix_tree_node 2321500 2060290 88% 1.00K 72581 32 2322592K xfs_inode 2235648 2069791 92% 0.12K 69864 32 279456K iint_cache That is, 2.7M radix tree nodes are allocated, and the cache itself is consuming 2.3GB of RAM. I know that the XFS inodei caches are indexed by radix tree node, but for 2 million cached inodes that would mean a density of 1 inode per radix tree node, which for a system with 16M inodes in the filsystems is an impossibly low density. The worst I've seen in a production system like kernel.org is about 20-25% density, which would mean about 150-200k radix tree nodes for that many inodes. So it's not the inode cache. So I looked up what the iint_cache was. It appears to used for storing per-inode IMA information, and uses a radix tree for indexing. It uses the *address* of the struct inode as the indexing key. That means the key space is extremely sparse - for XFS the struct inode addresses are approximately 1000 bytes apart, which means the closest the radix tree index keys get is ~1000. Which means that there is a single entry per radix tree leaf node, so the radix tree is using roughly 550 bytes for every 120byte structure being cached. For the above example, it's probably wasting close to 1GB of RAM...." Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26drivers/atm/eni.c: Remove multiple uses of KERN_<level>Joe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26NTLM auth and sign - Define crypto hash functions and create and send keys ↵Shirish Pargaonkar
needed for key exchange Mark dependency on crypto modules in Kconfig. Defining per structures sdesc and cifs_secmech which are used to store crypto hash functions and contexts. They are stored per smb connection and used for all auth mechs to genereate hash values and signatures. Allocate crypto hashing functions, security descriptiors, and respective contexts when a smb/tcp connection is established. Release them when a tcp/smb connection is taken down. md5 and hmac-md5 are two crypto hashing functions that are used throught the life of an smb/tcp connection by various functions that calcualte signagure and ntlmv2 hash, HMAC etc. structure ntlmssp_auth is defined as per smb connection. ntlmssp_auth holds ciphertext which is genereated by rc4/arc4 encryption of secondary key, a nonce using ntlmv2 session key and sent in the session key field of the type 3 message sent by the client during ntlmssp negotiation/exchange A key is exchanged with the server if client indicates so in flags in type 1 messsage and server agrees in flag in type 2 message of ntlmssp negotiation. If both client and agree, a key sent by client in type 3 message of ntlmssp negotiation in the session key field. The key is a ciphertext generated off of secondary key, a nonce, using ntlmv2 hash via rc4/arc4. Signing works for ntlmssp in this patch. The sequence number within the server structure needs to be zero until session is established i.e. till type 3 packet of ntlmssp exchange of a to be very first smb session on that smb connection is sent. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-10-26tg3: Do not call device_set_wakeup_enable() under spin_lock_bhRafael J. Wysocki
The tg3 driver calls device_set_wakeup_enable() under spin_lock_bh, which causes a problem to happen after the recent core power management changes, because this function can sleep now. Fix this by moving the device_set_wakeup_enable() call out of the spin_lock_bh-protected area. Reported-by: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-26Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6