summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2013-07-25selinux: fix problems in netnode when BUG() is compiled outPaul Moore
When the BUG() macro is disabled at compile time it can cause some problems in the SELinux netnode code: invalid return codes and uninitialized variables. This patch fixes this by making sure we take some corrective action after the BUG() macro. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: use a helper function to determine seclabelEric Paris
Use a helper to determine if a superblock should have the seclabel flag rather than doing it in the function. I'm going to use this in the security server as well. Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: pass a superblock to security_fs_useEric Paris
Rather than passing pointers to memory locations, strings, and other stuff just give up on the separation and give security_fs_use the superblock. It just makes the code easier to read (even if not easier to reuse on some other OS) Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: do not handle seclabel as a special flagEric Paris
Instead of having special code around the 'non-mount' seclabel mount option just handle it like the mount options. Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: change sbsec->behavior to shortEric Paris
We only have 6 options, so char is good enough, but use a short as that packs nicely. This shrinks the superblock_security_struct just a little bit. Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: renumber the superblock optionsEric Paris
Just to make it clear that we have mount time options and flags, separate them. Since I decided to move the non-mount options above above 0x10, we need a short instead of a char. (x86 padding says this takes up no additional space as we have a 3byte whole in the structure) Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: do all flags twiddling in one placeEric Paris
Currently we set the initialize and seclabel flag in one place. Do some unrelated printk then we unset the seclabel flag. Eww. Instead do the flag twiddling in one place in the code not seperated by unrelated printk. Also don't set and unset the seclabel flag. Only set it if we need to. Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: rename SE_SBLABELSUPP to SBLABEL_MNTEric Paris
Just a flag rename as we prepare to make it not so special. Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: use define for number of bits in the mnt flags maskEric Paris
We had this random hard coded value of '8' in the code (I put it there) for the number of bits to check for mount options. This is stupid. Instead use the #define we already have which tells us the number of mount options. Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: make it harder to get the number of mnt opts wrongEric Paris
Instead of just hard coding a value, use the enum to out benefit. Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: remove crazy contortions around procEric Paris
We check if the fsname is proc and if so set the proc superblock security struct flag. We then check if the flag is set and use the string 'proc' for the fsname instead of just using the fsname. What's the point? It's always proc... Get rid of the useless conditional. Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: fix selinuxfs policy file on big endian systemsEric Paris
The /sys/fs/selinux/policy file is not valid on big endian systems like ppc64 or s390. Let's see why: static int hashtab_cnt(void *key, void *data, void *ptr) { int *cnt = ptr; *cnt = *cnt + 1; return 0; } static int range_write(struct policydb *p, void *fp) { size_t nel; [...] /* count the number of entries in the hashtab */ nel = 0; rc = hashtab_map(p->range_tr, hashtab_cnt, &nel); if (rc) return rc; buf[0] = cpu_to_le32(nel); rc = put_entry(buf, sizeof(u32), 1, fp); So size_t is 64 bits. But then we pass a pointer to it as we do to hashtab_cnt. hashtab_cnt thinks it is a 32 bit int and only deals with the first 4 bytes. On x86_64 which is little endian, those first 4 bytes and the least significant, so this works out fine. On ppc64/s390 those first 4 bytes of memory are the high order bits. So at the end of the call to hashtab_map nel has a HUGE number. But the least significant 32 bits are all 0's. We then pass that 64 bit number to cpu_to_le32() which happily truncates it to a 32 bit number and does endian swapping. But the low 32 bits are all 0's. So no matter how many entries are in the hashtab, big endian systems always say there are 0 entries because I screwed up the counting. The fix is easy. Use a 32 bit int, as the hashtab_cnt expects, for nel. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Paul Moore <pmoore@redhat.com>
2013-07-25SELinux: Enable setting security contexts on rootfs inodes.Stephen Smalley
rootfs (ramfs) can support setting of security contexts by userspace due to the vfs fallback behavior of calling the security module to set the in-core inode state for security.* attributes when the filesystem does not provide an xattr handler. No xattr handler required as the inodes are pinned in memory and have no backing store. This is useful in allowing early userspace to label individual files within a rootfs while still providing a policy-defined default via genfs. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: Increase ebitmap_node size for 64-bit configurationWaiman Long
Currently, the ebitmap_node structure has a fixed size of 32 bytes. On a 32-bit system, the overhead is 8 bytes, leaving 24 bytes for being used as bitmaps. The overhead ratio is 1/4. On a 64-bit system, the overhead is 16 bytes. Therefore, only 16 bytes are left for bitmap purpose and the overhead ratio is 1/2. With a 3.8.2 kernel, a boot-up operation will cause the ebitmap_get_bit() function to be called about 9 million times. The average number of ebitmap_node traversal is about 3.7. This patch increases the size of the ebitmap_node structure to 64 bytes for 64-bit system to keep the overhead ratio at 1/4. This may also improve performance a little bit by making node to node traversal less frequent (< 2) as more bits are available in each node. Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25SELinux: Reduce overhead of mls_level_isvalid() function callWaiman Long
While running the high_systime workload of the AIM7 benchmark on a 2-socket 12-core Westmere x86-64 machine running 3.10-rc4 kernel (with HT on), it was found that a pretty sizable amount of time was spent in the SELinux code. Below was the perf trace of the "perf record -a -s" of a test run at 1500 users: 5.04% ls [kernel.kallsyms] [k] ebitmap_get_bit 1.96% ls [kernel.kallsyms] [k] mls_level_isvalid 1.95% ls [kernel.kallsyms] [k] find_next_bit The ebitmap_get_bit() was the hottest function in the perf-report output. Both the ebitmap_get_bit() and find_next_bit() functions were, in fact, called by mls_level_isvalid(). As a result, the mls_level_isvalid() call consumed 8.95% of the total CPU time of all the 24 virtual CPUs which is quite a lot. The majority of the mls_level_isvalid() function invocations come from the socket creation system call. Looking at the mls_level_isvalid() function, it is checking to see if all the bits set in one of the ebitmap structure are also set in another one as well as the highest set bit is no bigger than the one specified by the given policydb data structure. It is doing it in a bit-by-bit manner. So if the ebitmap structure has many bits set, the iteration loop will be done many times. The current code can be rewritten to use a similar algorithm as the ebitmap_contains() function with an additional check for the highest set bit. The ebitmap_contains() function was extended to cover an optional additional check for the highest set bit, and the mls_level_isvalid() function was modified to call ebitmap_contains(). With that change, the perf trace showed that the used CPU time drop down to just 0.08% (ebitmap_contains + mls_level_isvalid) of the total which is about 100X less than before. 0.07% ls [kernel.kallsyms] [k] ebitmap_contains 0.05% ls [kernel.kallsyms] [k] ebitmap_get_bit 0.01% ls [kernel.kallsyms] [k] mls_level_isvalid 0.01% ls [kernel.kallsyms] [k] find_next_bit The remaining ebitmap_get_bit() and find_next_bit() functions calls are made by other kernel routines as the new mls_level_isvalid() function will not call them anymore. This patch also improves the high_systime AIM7 benchmark result, though the improvement is not as impressive as is suggested by the reduction in CPU time spent in the ebitmap functions. The table below shows the performance change on the 2-socket x86-64 system (with HT on) mentioned above. +--------------+---------------+----------------+-----------------+ | Workload | mean % change | mean % change | mean % change | | | 10-100 users | 200-1000 users | 1100-2000 users | +--------------+---------------+----------------+-----------------+ | high_systime | +0.1% | +0.9% | +2.6% | +--------------+---------------+----------------+-----------------+ Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25selinux: remove the BUG_ON() from selinux_skb_xfrm_sid()Paul Moore
Remove the BUG_ON() from selinux_skb_xfrm_sid() and propogate the error code up to the caller. Also check the return values in the only caller function, selinux_skb_peerlbl_sid(). Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25selinux: cleanup the XFRM headerPaul Moore
Remove the unused get_sock_isec() function and do some formatting fixes. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25selinux: cleanup selinux_xfrm_decode_session()Paul Moore
Some basic simplification. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25selinux: cleanup some comment and whitespace issues in the XFRM codePaul Moore
Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25selinux: cleanup selinux_xfrm_sock_rcv_skb() and selinux_xfrm_postroute_last()Paul Moore
Some basic simplification and comment reformatting. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25selinux: cleanup selinux_xfrm_policy_lookup() and ↵Paul Moore
selinux_xfrm_state_pol_flow_match() Do some basic simplification and comment reformatting. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25selinux: cleanup and consolidate the XFRM alloc/clone/delete/free codePaul Moore
The SELinux labeled IPsec code state management functions have been long neglected and could use some cleanup and consolidation. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-07-25lsm: split the xfrm_state_alloc_security() hook implementationPaul Moore
The xfrm_state_alloc_security() LSM hook implementation is really a multiplexed hook with two different behaviors depending on the arguments passed to it by the caller. This patch splits the LSM hook implementation into two new hook implementations, which match the LSM hooks in the rest of the kernel: * xfrm_state_alloc * xfrm_state_alloc_acquire Also included in this patch are the necessary changes to the SELinux code; no other LSMs are affected. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-06-30Linux 3.10v3.10Linus Torvalds
2013-06-30Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull another powerpc fix from Benjamin Herrenschmidt: "I mentioned that while we had fixed the kernel crashes, EEH error recovery didn't always recover... It appears that I had a fix for that already in powerpc-next (with a stable CC). I cherry-picked it today and did a few tests and it seems that things now work quite well. The patch is also pretty simple, so I see no reason to wait before merging it." * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/eeh: Fix fetching bus for single-dev-PE
2013-06-30Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of seven bug fixes. Several fcoe fixes for locking problems, initiator issues and a VLAN API change, all of which could eventually lead to data corruption, one fix for a qla2xxx locking problem which could lead to multiple completions of the same request (and subsequent data corruption) and a use after free in the ipr driver. Plus one minor MAINTAINERS file update" (only six bugfixes in this pull, since I had already pulled the fcoe API fix directly from Robert Love) * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: [SCSI] ipr: Avoid target_destroy accessing memory after it was freed [SCSI] qla2xxx: Fix for locking issue between driver ISR and mailbox routines MAINTAINERS: Fix fcoe mailing list libfc: extend ex_lock to protect all of fc_seq_send libfc: Correct check for initiator role libfcoe: Fix Conflicting FCFs issue in the fabric
2013-06-30powerpc/eeh: Fix fetching bus for single-dev-PEGavin Shan
While running Linux as guest on top of phyp, we possiblly have PE that includes single PCI device. However, we didn't return its PCI bus correctly and it leads to failure on recovery from EEH errors for single-dev-PE. The patch fixes the issue. Cc: <stable@vger.kernel.org> # v3.7+ Cc: Steve Best <sbest@us.ibm.com> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-29Merge branch 'merge' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "We discovered some breakage in our "EEH" (PCI Error Handling) code while doing error injection, due to a couple of regressions. One of them is due to a patch (37f02195bee9 "powerpc/pci: fix PCI-e devices rescan issue on powerpc platform") that, in hindsight, I shouldn't have merged considering that it caused more problems than it solved. Please pull those two fixes. One for a simple EEH address cache initialization issue. The other one is a patch from Guenter that I had originally planned to put in 3.11 but which happens to also fix that other regression (a kernel oops during EEH error handling and possibly hotplug). With those two, the couple of test machines I've hammered with error injection are remaining up now. EEH appears to still fail to recover on some devices, so there is another problem that Gavin is looking into but at least it's no longer crashing the kernel." * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/pci: Improve device hotplug initialization powerpc/eeh: Add eeh_dev to the cache during boot
2013-06-29ARM: dt: Only print warning, not WARN() on bad cpu map in device treeOlof Johansson
Due to recent changes and expecations of proper cpu bindings, there are now cases for many of the in-tree devicetrees where a WARN() will hit on boot due to badly formatted /cpus nodes. Downgrade this to a pr_warn() to be less alarmist, since it's not a new problem. Tested on Arndale, Cubox, Seaboard and Panda ES. Panda hits the WARN without this, the others do not. Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-30powerpc/pci: Improve device hotplug initializationGuenter Roeck
Commit 37f02195b (powerpc/pci: fix PCI-e devices rescan issue on powerpc platform) fixes a problem with interrupt and DMA initialization on hot plugged devices. With this commit, interrupt and DMA initialization for hot plugged devices is handled in the pci device enable function. This approach has a couple of drawbacks. First, it creates two code paths for device initialization, one for hot plugged devices and another for devices known during the initial PCI scan. Second, the initialization code for hot plugged devices is only called when the device is enabled, ie typically in the probe function. Also, the platform specific setup code is called each time pci_enable_device() is called, not only once during device discovery, meaning it is actually called multiple times, once for devices discovered during the initial scan and again each time a driver is re-loaded. The visible result is that interrupt pins are only assigned to hot plugged devices when the device driver is loaded. Effectively this changes the PCI probe API, since pci_dev->irq and the device's dma configuration will now only be valid after pci_enable() was called at least once. A more subtle change is that platform specific PCI device setup is moved from device discovery into the driver's probe function, more specifically into the pci_enable_device() call. To fix the inconsistencies, add new function pcibios_add_device. Call pcibios_setup_device from pcibios_setup_bus_devices if device setup is not complete, and from pcibios_add_device if bus setup is complete. With this change, device setup code is moved back into device initialization, and called exactly once for both static and hot plugged devices. [ This also fixes a regression introduced by the above patch which causes dev->irq to be overwritten under some cirumstances after MSIs have been enabled for the device which leads to crashes due to the MSI core "hijacking" dev->irq to store the base MSI number and not the LSI. --BenH ] Cc: Yuanquan Chen <Yuanquan.Chen@freescale.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hiroo Matsumoto <matsumoto.hiroo@jp.fujitsu.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
Pull crypto fix from Herbert Xu: "This fixes a crash in the crypto layer exposed by an SCTP test tool" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: algboss - Hold ref count on larval
2013-06-29Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm/qxl fix from Dave Airlie: "Bad me forgot an access check, possible security issue, but since this is the first kernel with it, should be fine to just put it in now" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/qxl: add missing access check for execbuffer ioctl
2013-06-29Fix: kernel/ptrace.c: ptrace_peek_siginfo() missing __put_user() validationMathieu Desnoyers
This __put_user() could be used by unprivileged processes to write into kernel memory. The issue here is that even if copy_siginfo_to_user() fails, the error code is not checked before __put_user() is executed. Luckily, ptrace_peek_siginfo() has been added within the 3.10-rc cycle, so it has not hit a stable release yet. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Andrey Vagin <avagin@openvz.org> Cc: Roland McGrath <roland@redhat.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Pedro Alves <palves@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-06-29Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fix from Sage Weil: "This is a recently spotted regression in the snapshot behavior... It turns out several tests weren't being run in the nightlies so this took a while to spot" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: send snapshot context with writes
2013-06-29Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull ubifs fixes from Al Viro: "A couple of ubifs readdir/lseek race fixes. Stable fodder, really nasty..." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: UBIFS: fix a horrid bug UBIFS: prepare to fix a horrid bug
2013-06-29Merge tag 'for-linus-20130628' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-mn10300 Pull two MN10300 fixes from David Howells: "The first fixes a problem with passing arrays rather than pointers to get_user() where __typeof__ then wants to declare and initialise an array variable which gcc doesn't like. The second fixes a problem whereby putting mem=xxx into the kernel command line causes init=xxx to get an incorrect value." * tag 'for-linus-20130628' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-mn10300: mn10300: Use early_param() to parse "mem=" parameter mn10300: Allow to pass array name to get_user()
2013-06-29Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "Correct an ordering issue in the tick broadcast code. I really wish we'd get compensation for pain and suffering for each line of code we write to work around dysfunctional timer hardware." * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick: Fix tick_broadcast_pending_mask not cleared
2013-06-29Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Ingo Molnar: "One more fix for a recently discovered bug" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Disable monitoring on setuid processes for regular users
2013-06-29UBIFS: fix a horrid bugArtem Bityutskiy
Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are in the middle of 'ubifs_readdir()'. This means that 'file->private_data' can be freed while 'ubifs_readdir()' uses it, and this is a very bad bug: not only 'ubifs_readdir()' can return garbage, but this may corrupt memory and lead to all kinds of problems like crashes an security holes. This patch fixes the problem by using the 'file->f_version' field, which '->llseek()' always unconditionally sets to zero. We set it to 1 in 'ubifs_readdir()' and whenever we detect that it became 0, we know there was a seek and it is time to clear the state saved in 'file->private_data'. I tested this patch by writing a user-space program which runds readdir and seek in parallell. I could easily crash the kernel without these patches, but could not crash it with these patches. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29UBIFS: prepare to fix a horrid bugArtem Bityutskiy
Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are in the middle of 'ubifs_readdir()'. First of all, this means that 'file->private_data' can be freed while 'ubifs_readdir()' uses it. But this particular patch does not fix the problem. This patch is only a preparation, and the fix will follow next. In this patch we make 'ubifs_readdir()' stop using 'file->f_pos' directly, because 'file->f_pos' can be changed by '->llseek()' at any point. This may lead 'ubifs_readdir()' to returning inconsistent data: directory entry names may correspond to incorrect file positions. So here we introduce a local variable 'pos', read 'file->f_pose' once at very the beginning, and then stick to 'pos'. The result of this is that when 'ubifs_dir_llseek()' changes 'file->f_pos' while we are in the middle of 'ubifs_readdir()', the latter "wins". Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-28mn10300: Use early_param() to parse "mem=" parameterAkira Takeuchi
This fixes the problem that "init=" options may not be passed to kernel correctly. parse_mem_cmdline() of mn10300 arch gets rid of "mem=" string from redboot_command_line. Then init_setup() parses the "init=" options from static_command_line, which is a copy of redboot_command_line, and keeps the pointer to the init options in execute_command variable. Since the commit 026cee0 upstream (params: <level>_initcall-like kernel parameters), static_command_line becomes overwritten by saved_command_line at do_initcall_level(). Notice that saved_command_line is a command line which includes "mem=" string. As a result, execute_command may point to weird string by the length of "mem=" parameter. I noticed this problem when using the command line like this: mem=128M console=ttyS0,115200 init=/bin/sh Here is the processing flow of command line parameters. start_kernel() setup_arch(&command_line) parse_mem_cmdline(cmdline_p) * strcpy(boot_command_line, redboot_command_line); * Remove "mem=xxx" from redboot_command_line. * *cmdline_p = redboot_command_line; setup_command_line(command_line) <-- command_line is redboot_command_line * strcpy(saved_command_line, boot_command_line) * strcpy(static_command_line, command_line) parse_early_param() strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE); parse_early_options(tmp_cmdline); parse_args("early options", cmdline, NULL, 0, 0, 0, do_early_param); parse_args("Booting ..", static_command_line, ...); init_setup() <-- save the pointer in execute_command rest_init() kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND); At this point, execute_command points to "/bin/sh" string. kernel_init() kernel_init_freeable() do_basic_setup() do_initcalls() do_initcall_level() (*) strcpy(static_command_line, saved_command_line); Here, execute_command gets to point to "200" string !! Signed-off-by: David Howells <dhowells@redhat.com>
2013-06-28mn10300: Allow to pass array name to get_user()Akira Takeuchi
This fixes the following compile error: CC block/scsi_ioctl.o block/scsi_ioctl.c: In function 'sg_scsi_ioctl': block/scsi_ioctl.c:449: error: invalid initializer Signed-off-by: David Howells <dhowells@redhat.com>
2013-06-28drm/qxl: add missing access check for execbuffer ioctlDave Airlie
Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-06-28powerpc/eeh: Add eeh_dev to the cache during bootThadeu Lima de Souza Cascardo
commit f8f7d63fd96ead101415a1302035137a866f8998 ("powerpc/eeh: Trace eeh device from I/O cache") broke EEH on pseries for devices that were present during boot and have not been hotplugged/DLPARed. eeh_check_failure will get the eeh_dev from the cache, and will get NULL. eeh_addr_cache_build adds the addresses to the cache, but eeh_dev for the giving pci_device is not set yet. Just reordering the call to eeh_addr_cache_insert_dev works fine. The ordering is similar to the one in eeh_add_device_late. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-27rbd: send snapshot context with writesJosh Durgin
Sending the right snapshot context with each write is required for snapshots to work. Due to the ordering of calls, the snapshot context is never set for any requests. This causes writes to the current version of the image to be reflected in all snapshots, which are supposed to be read-only. This happens because rbd_osd_req_format_write() sets the snapshot context based on obj_request->img_request. At this point, however, obj_request->img_request has not been set yet, to the snapshot context is set to NULL. Fix this by moving rbd_img_obj_request_add(), which sets obj_request->img_request, before the osd request formatting calls. This resolves: http://tracker.ceph.com/issues/5465 Reported-by: Karol Jurak <karol.jurak@gmail.com> Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@linaro.org>
2013-06-26Merge tag 'fcoe1' into fixesJames Bottomley
This patch fixes a critical bug that was introduced in 3.9 related to VLAN tagging FCoE frames.
2013-06-26Merge tag 'fcoe' into fixesJames Bottomley
3.10 fixes
2013-06-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Found via trinity: If you connect up an ipv6 socket to an ipv4 mapped address then an ipv6 one, sendmsg() can croak because ip6_sk_dst_check() assumes the route cached in the socket is an ipv6 one. In this case there is an ipv4 route attached, so it gets stomped on. Reported by Dave Jones and Hannes Frederic Sowa, fixed by Eric Dumazet. 2) AF_KEY notifications leak some kernel memory to userspace, fix from Mathias Krause. 3) DLCI calls __dev_get_by_name() without proper locking, and dlci_del doesn't validate that the device being deleted is actually a DLCI one. Fixes from Li Zefan. 4) Length check on bluetooth l2cap information responses is wrong, each response type has a different lenth, so we should make sure it's in a given range rather than enforce one single valid length. From Jaganath Kanakkassery. 5) Receive FIFO overflow is really easy to trigger in stress scenerios in the sh_eth driver, but the event isn't being handled properly at all. Specifically, the mask of error interrupts doesn't include the event so we never clear it, resulting in the driver becomming wedged processing an interrupt that never gets cleared. Fix from Sergei Shtylyov. 6) qlcnic sleeps while holding a spinlock, use mdelay() instead of msleep(). From Shahed Shaikh. 7) Missing curly braces causes SIP netfilter NAT module to always drop packets. Fix from Balazs Peter Odor. 8) ipt_ULOG in netfilter passes the wrong value to timer setup, causing the timer to dereference crap when it fires. Fix from Gao Feng. 9) Missing RCU protection around txq->axq_acq traversal in ath_txq_schedule(). Fix from Felix Fietkau. 10) Idle state transition test in ath9k_htc_config() is reversed, fix from Sujith Manoharan. 11) IPV6 forwarding handles unicast Router Alert packets incorrectly. It tests the wrong option state. Previously opt->ra being non-zero indicated a router alert marking in the SKB, but now it's indicated by a bit in opt->flags. Fix from YOSHIFUJI Hideaki. 12) SKB leak in GRE tunnel GSO handling, from Eric Dumazet. 13) get_user_pages_fast() error handling in TUN and MACVTAP use the same local variable for the base index and the loop iterator for page traversal, oops! Fix from Michael S Tsirkin. 14) ipv6_get_lladdr() can fail, and we must therefore check it's return value in inet6_set_iftoken(). For from Hannes Frederic Sowa. 15) If you change an interface name and meanwhile can sneak in something that looks up the name (like SO_BINDTODEVICE or SIOCGIFNAME) we can deadlock with CONFIG_PREEMPT=n. Fix this by providing a helper function that properly uses raw_seqcount_begin(). From Nicolas Schichan. 16) Chain noise calibration test is inverted in iwlwifi, fix from Nikolay Martynov. 17) Properly set TX iwlwifi descriptor flags for back requests. Fix from Emmanuel Grumbach. 18) We can't assume skb_transport_header() is set in xt_TCPOPTSTRAP module, fix from Pablo Neira Ayuso. 19) Some crummy APs don't provide the proper High Throughput info in association response frames. Add a workaround by assume we'll use whatever is in the beacon/probe. Fix from Johannes Berg. 20) mac80211 call to rate_idx_match_mask() swaps two arguments (mask and channel width). Fix from Simon Wunderlich. 21) xt_TCPMSS (like xt_TCPOPTSTRAP) must not try to handle fragmented frames. Fix from Phil Oester. 22) Fix rate control regression causing iwlwifi/iwlegacy chips to use 1Mbit/s on pre-11n networks. From Moshe Benji and Stanslaw Gruszka. 23) Disable brcmsmac power-save functions, they cause regressions. From Arend van Spriel. 24) Enforce a sane minimum MTU in l2cap_build_cmd() otherwise we can easily crash. Fix from Anderson Lizardo. 25) If a learning packet arrives during vxlan_stop() we crash, easily fixed by checking netif_running(). From Stephen Hemminger. 26) Static vxlan FDB entries should not be migrated, also from Stephen. 27) skb_clone() failures not handled in vxlan_xmit(), oops. Also from Stephen. 28) Add minimal driver for AR816x/AR817x ethernet chips, from Johannes Berg. 29) Fix regression in userspace VLAN acceleration control, added by the 802.1ad support changes. Fix from Fernando Luis Vazquez Cao. 30) Interval selection for MLD queries in the bridging code was reversed. Fix from Linus Lüssing. 31) ipv6's ndisc_send_redirect() erroneously writes to the packet we received not the packet we are building to send out. Fix from Matthias Schiffer. 32) Don't free netdev before unregistering it, in usb_8dev can driver. From Marc Kleine-Budde. 33) Fix nl80211 attribute buffer races, from Johannes Berg. 34) Although netlink_diag.h is under uapi/ it isn't present in Kbuild. From Stephen Hemminger. 35) Wrong address and family passed to MD5 key lookups in TCP, from Aydin Arik. 36) phy_type attribute created by SFC driver should not be writable. From Ben Hutchings. 37) Receive/Transmit queue allocations in pxa168_eth and mv643xx_eth should use kzalloc(). Otherwise if setup fails half-way, we'll dereference garbage when trying to teardown the rings. From Lubomir Rintel. 38) Fix double-allocation of dst (resulting in unfreeable net device) in ipv6's init_loopback(). From Gao Feng. 39) Fix fragmentation handling SKB leak in netfilter conntrack, we were freeing the wrong skb pointer. From Phil Oester. 40) Don't report "-1" (SPEED_UNKNOWN) in bond_miimon_commit(), from Nikolay Aleksandrov. 41) davinci_cpdma doesn't check for DMA mapping errors, letting the device scribble to random addresses. From Sebastian Siewior. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits) dlci: validate the net device in dlci_del() dlci: acquire rtnl_lock before calling __dev_get_by_name() af_key: fix info leaks in notify messages ipv6: ip6_sk_dst_check() must not assume ipv6 dst net: fix kernel deadlock with interface rename and netdev name retrieval. net/tg3: Avoid delay during MMIO access ipv6: check return value of ipv6_get_lladdr macvtap: fix recovery from gup errors tun: fix recovery from gup errors gre: fix a possible skb leak ipv6: Process unicast packet with Router Alert by checking flag in skb. ath9k_htc: Handle IDLE state transition properly ath9k: fix an RCU issue in calling ieee80211_get_tx_rates netfilter: ipt_ULOG: fix incorrect setting of ulog timer netfilter: ctnetlink: send event when conntrack label was modified netfilter: nf_nat_sip: fix mangling qlcnic: Do not sleep while holding spinlock drivers: net: cpsw: fix compilation error with cpsw driver tcp: doc : fix the syncookies default value sh_eth: fix misreporting of transmit abort ...
2013-06-26Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull i915 drm fixes from Dave Airlie: "These should be the last two fixes for i915, one is for a fence leak killing X on some older GPUs, and one is a late regression partial revert for an swiotlb/xen/i915 interaction, Konrad has promised to figure out the proper answer, and this patch is the best thing to do at this stage to avoid regressing" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/i915: make compact dma scatter lists creation work with SWIOTLB backend. drm/i915: Restore fences after resume and GPU resets
2013-06-26dlci: validate the net device in dlci_del()Zefan Li
We triggered an oops while running trinity with 3.4 kernel: BUG: unable to handle kernel paging request at 0000000100000d07 IP: [<ffffffffa0109738>] dlci_ioctl+0xd8/0x2d4 [dlci] PGD 640c0d067 PUD 0 Oops: 0000 [#1] PREEMPT SMP CPU 3 ... Pid: 7302, comm: trinity-child3 Not tainted 3.4.24.09+ 40 Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA RIP: 0010:[<ffffffffa0109738>] [<ffffffffa0109738>] dlci_ioctl+0xd8/0x2d4 [dlci] ... Call Trace: [<ffffffff8137c5c3>] sock_ioctl+0x153/0x280 [<ffffffff81195494>] do_vfs_ioctl+0xa4/0x5e0 [<ffffffff8118354a>] ? fget_light+0x3ea/0x490 [<ffffffff81195a1f>] sys_ioctl+0x4f/0x80 [<ffffffff81478b69>] system_call_fastpath+0x16/0x1b ... It's because the net device is not a dlci device. Reported-by: Li Jinyue <lijinyue@huawei.com> Signed-off-by: Li Zefan <lizefan@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>