asmadeus/linux.git - The linux kernel

Age	Commit message (Collapse)	Author
2014-07-24	CAPABILITIES: remove undefined caps from all processes	Eric Paris
	This is effectively a revert of 7b9a7ec565505699f503b4fcf61500dceb36e744 plus fixing it a different way... We found, when trying to run an application from an application which had dropped privs that the kernel does security checks on undefined capability bits. This was ESPECIALLY difficult to debug as those undefined bits are hidden from /proc/$PID/status. Consider a root application which drops all capabilities from ALL 4 capability sets. We assume, since the application is going to set eff/perm/inh from an array that it will clear not only the defined caps less than CAP_LAST_CAP, but also the higher 28ish bits which are undefined future capabilities. The BSET gets cleared differently. Instead it is cleared one bit at a time. The problem here is that in security/commoncap.c::cap_task_prctl() we actually check the validity of a capability being read. So any task which attempts to 'read all things set in bset' followed by 'unset all things set in bset' will not even attempt to unset the undefined bits higher than CAP_LAST_CAP. So the 'parent' will look something like: CapInh: 0000000000000000 CapPrm: 0000000000000000 CapEff: 0000000000000000 CapBnd: ffffffc000000000 All of this 'should' be fine. Given that these are undefined bits that aren't supposed to have anything to do with permissions. But they do... So lets now consider a task which cleared the eff/perm/inh completely and cleared all of the valid caps in the bset (but not the invalid caps it couldn't read out of the kernel). We know that this is exactly what the libcap-ng library does and what the go capabilities library does. They both leave you in that above situation if you try to clear all of you capapabilities from all 4 sets. If that root task calls execve() the child task will pick up all caps not blocked by the bset. The bset however does not block bits higher than CAP_LAST_CAP. So now the child task has bits in eff which are not in the parent. These are 'meaningless' undefined bits, but still bits which the parent doesn't have. The problem is now in cred_cap_issubset() (or any operation which does a subset test) as the child, while a subset for valid cap bits, is not a subset for invalid cap bits! So now we set durring commit creds that the child is not dumpable. Given it is 'more priv' than its parent. It also means the parent cannot ptrace the child and other stupidity. The solution here: 1) stop hiding capability bits in status This makes debugging easier! 2) stop giving any task undefined capability bits. it's simple, it you don't put those invalid bits in CAP_FULL_SET you won't get them in init and you won't get them in any other task either. This fixes the cap_issubset() tests and resulting fallout (which made the init task in a docker container untraceable among other things) 3) mask out undefined bits when sys_capset() is called as it might use ~0, ~0 to denote 'all capabilities' for backward/forward compatibility. This lets 'capsh --caps="all=eip" -- -c /bin/bash' run. 4) mask out undefined bit when we read a file capability off of disk as again likely all bits are set in the xattr for forward/backward compatibility. This lets 'setcap all+pe /bin/bash; /bin/bash' run Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Andrew Vagin <avagin@openvz.org> Cc: Andrew G. Morgan <morgan@kernel.org> Cc: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: Kees Cook <keescook@chromium.org> Cc: Steve Grubb <sgrubb@redhat.com> Cc: Dan Walsh <dwalsh@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-07-24	Merge tag 'keys-next-20140722' of ↵	James Morris
	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next
2014-07-24	commoncap: don't alloc the credential unless needed in cap_task_prctl	Tetsuo Handa
	In function cap_task_prctl(), we would allocate a credential unconditionally and then check if we support the requested function. If not we would release this credential with abort_creds() by using RCU method. But on some archs such as powerpc, the sys_prctl is heavily used to get/set the floating point exception mode. So the unnecessary allocating/releasing of credential not only introduce runtime overhead but also do cause OOM due to the RCU implementation. This patch removes abort_creds() from cap_task_prctl() by calling prepare_creds() only when we need to modify it. Reported-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Paul Moore <paul@paul-moore.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-07-22	Merge branch 'keys-fixes' into keys-next	David Howells
	Signed-off-by: David Howells <dhowells@redhat.com>
2014-07-22	Merge remote-tracking branch 'integrity/next-with-keys' into keys-next	David Howells
	Signed-off-by: David Howells <dhowells@redhat.com>
2014-07-22	KEYS: request_key_auth: Provide key preparsing	David Howells
	Provide key preparsing for the request_key_auth key type so that we can make preparsing mandatory. This does nothing as this type can only be set up internally to the kernel. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Jeff Layton <jlayton@primarydata.com>
2014-07-22	KEYS: keyring: Provide key preparsing	David Howells
	Provide key preparsing in the keyring so that we can make preparsing mandatory. For keyrings, however, only an empty payload is permitted. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Jeff Layton <jlayton@primarydata.com>
2014-07-22	KEYS: big_key: Use key preparsing	David Howells
	Make use of key preparsing in the big key type so that quota size determination can take place prior to keyring locking when a key is being added. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com>
2014-07-22	KEYS: user: Use key preparsing	David Howells
	Make use of key preparsing in user-defined and logon keys so that quota size determination can take place prior to keyring locking when a key is being added. Also the idmapper key types need to change to match as they use the user-defined key type routines. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Jeff Layton <jlayton@primarydata.com>
2014-07-22	KEYS: Call ->free_preparse() even after ->preparse() returns an error	David Howells
	Call the ->free_preparse() key type op even after ->preparse() returns an error as it does cleaning up type stuff. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-22	KEYS: Allow expiry time to be set when preparsing a key	David Howells
	Allow a key type's preparsing routine to set the expiry time for a key. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-22	KEYS: struct key_preparsed_payload should have two payload pointers	David Howells
	struct key_preparsed_payload should have two payload pointers to correspond with those in struct key. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-19	Merge tag 'seccomp-3.17' of ↵	James Morris
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into next
2014-07-19	Merge branch 'next' of git://git.infradead.org/users/pcmoore/selinux into next	James Morris

2014-07-18	sched: move no_new_privs into new atomic flags	Kees Cook
	Since seccomp transitions between threads requires updates to the no_new_privs flag to be atomic, the flag must be part of an atomic flag set. This moves the nnp flag into a separate task field, and introduces accessors. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18	KEYS: Provide a generic instantiation function	David Howells
	Provide a generic instantiation function for key types that use the preparse hook. This makes it easier to prereserve key quota before keyrings get locked to retain the new key. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-17	KEYS: Allow special keys (eg. DNS results) to be invalidated by CAP_SYS_ADMIN	David Howells
	Special kernel keys, such as those used to hold DNS results for AFS, CIFS and NFS and those used to hold idmapper results for NFS, used to be 'invalidateable' with key_revoke(). However, since the default permissions for keys were reduced: Commit: 96b5c8fea6c0861621051290d705ec2e971963f1 KEYS: Reduce initial permissions on keys it has become impossible to do this. Add a key flag (KEY_FLAG_ROOT_CAN_INVAL) that will permit a key to be invalidated by root. This should not be used for system keyrings as the garbage collector will try and remove any invalidate key. For system keyrings, KEY_FLAG_ROOT_CAN_CLEAR can be used instead. After this, from userspace, keyctl_invalidate() and "keyctl invalidate" can be used by any possessor of CAP_SYS_ADMIN (typically root) to invalidate DNS and idmapper keys. Invalidated keys are immediately garbage collected and will be immediately rerequested if needed again. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Steve Dickson <steved@redhat.com>
2014-07-17	ima: define '.ima' as a builtin 'trusted' keyring	Mimi Zohar
	Require all keys added to the IMA keyring be signed by an existing trusted key on the system trusted keyring. Changelog v6: - remove ifdef CONFIG_IMA_TRUSTED_KEYRING in C code - Dmitry - update Kconfig dependency and help - select KEYS_DEBUG_PROC_KEYS - Dmitry Changelog v5: - Move integrity_init_keyring() to init_ima() - Dmitry - reset keyring[id] on failure - Dmitry Changelog v1: - don't link IMA trusted keyring to user keyring Changelog: - define stub integrity_init_keyring() function (reported-by Fengguang Wu) - differentiate between regular and trusted keyring names. - replace printk with pr_info (D. Kasatkin) - only make the IMA keyring a trusted keyring (reported-by D. Kastatkin) - define stub integrity_init_keyring() definition based on CONFIG_INTEGRITY_SIGNATURE, not CONFIG_INTEGRITY_ASYMMETRIC_KEYS. (reported-by Jim Davis) Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Acked-by: David Howells <dhowells@redhat.com>
2014-07-17	KEYS: special dot prefixed keyring name bug fix	Mimi Zohar
	Dot prefixed keyring names are supposed to be reserved for the kernel, but add_key() calls key_get_type_from_user(), which incorrectly verifies the 'type' field, not the 'description' field. This patch verifies the 'description' field isn't dot prefixed, when creating a new keyring, and removes the dot prefix test in key_get_type_from_user(). Changelog v6: - whitespace and other cleanup Changelog v5: - Only prevent userspace from creating a dot prefixed keyring, not regular keys - Dmitry Reported-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: David Howells <dhowells@redhat.com>
2014-07-17	ima: provide double buffering for hash calculation	Dmitry Kasatkin
	The asynchronous hash API allows initiating a hash calculation and then performing other tasks, while waiting for the hash calculation to complete. This patch introduces usage of double buffering for simultaneous hashing and reading of the next chunk of data from storage. Changes in v3: - better comments Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17	ima: introduce multi-page collect buffers	Dmitry Kasatkin
	Use of multiple-page collect buffers reduces: 1) the number of block IO requests 2) the number of asynchronous hash update requests Second is important for HW accelerated hashing, because significant amount of time is spent for preparation of hash update operation, which includes configuring acceleration HW, DMA engine, etc... Thus, HW accelerators are more efficient when working on large chunks of data. This patch introduces usage of multi-page collect buffers. Buffer size can be specified using 'ahash_bufsize' module parameter. Default buffer size is 4096 bytes. Changes in v3: - kernel parameter replaced with module parameter Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17	ima: use ahash API for file hash calculation	Dmitry Kasatkin
	Async hash API allows the use of HW acceleration for hash calculation. It may give significant performance gain and/or reduce power consumption, which might be very beneficial for battery powered devices. This patch introduces hash calculation using ahash API. ahash performance depends on the data size and the particular HW. Depending on the specific system, shash performance may be better. This patch defines 'ahash_minsize' module parameter, which is used to define the minimal file size to use with ahash. If this minimum file size is not set or the file is smaller than defined by the parameter, shash will be used. Changes in v3: - kernel parameter replaced with module parameter - pr_crit replaced with pr_crit_ratelimited - more comment changes - Mimi Changes in v2: - ima_ahash_size became as ima_ahash - ahash pre-allocation moved out from __init code to be able to use ahash crypto modules. Ahash allocated once on the first use. - hash calculation falls back to shash if ahash allocation/calculation fails - complex initialization separated from variable declaration - improved comments Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17	audit: fix dangling keywords in integrity ima message output	Richard Guy Briggs
	Replace spaces in op keyword labels in log output since userspace audit tools can't parse orphaned keywords. Reported-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17	ima: delay template descriptor lookup until use	Dmitry Kasatkin
	process_measurement() always calls ima_template_desc_current(), including when an IMA policy has not been defined. This patch delays template descriptor lookup until action is determined. Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17	ima: remove unnecessary i_mutex locking from ima_rdwr_violation_check()	Dmitry Kasatkin
	Before 2.6.39 inode->i_readcount was maintained by IMA. It was not atomic and protected using spinlock. For 2.6.39, i_readcount was converted to atomic and maintaining was moved VFS layer. Spinlock for some unclear reason was replaced by i_mutex. After analyzing the code, we came to conclusion that i_mutex locking is unnecessary, especially when an IMA policy has not been defined. This patch removes i_mutex locking from ima_rdwr_violation_check(). Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17	Merge branch 'stable-3.16' of git://git.infradead.org/users/pcmoore/selinux ↵	James Morris
	into next
2014-07-10	selinux: fix the default socket labeling in sock_graft()	Paul Moore
	The sock_graft() hook has special handling for AF_INET, AF_INET, and AF_UNIX sockets as those address families have special hooks which label the sock before it is attached its associated socket. Unfortunately, the sock_graft() hook was missing a default approach to labeling sockets which meant that any other address family which made use of connections or the accept() syscall would find the returned socket to be in an "unlabeled" state. This was recently demonstrated by the kcrypto/AF_ALG subsystem and the newly released cryptsetup package (cryptsetup v1.6.5 and later). This patch preserves the special handling in selinux_sock_graft(), but adds a default behavior - setting the sock's label equal to the associated socket - which resolves the problem with AF_ALG and presumably any other address family which makes use of accept(). Cc: stable@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com> Tested-by: Milan Broz <gmazyland@gmail.com>
2014-06-26	selinux: reduce the number of calls to synchronize_net() when flushing caches	Paul Moore
	When flushing the AVC, such as during a policy load, the various network caches are also flushed, with each making a call to synchronize_net() which has shown to be expensive in some cases. This patch consolidates the network cache flushes into a single AVC callback which only calls synchronize_net() once for each AVC cache flush. Reported-by: Jaejyn Shin <flagon22bass@gmail.com> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-23	selinux: no recursive read_lock of policy_rwlock in security_genfs_sid()	Waiman Long
	With the introduction of fair queued rwlock, recursive read_lock() may hang the offending process if there is a write_lock() somewhere in between. With recursive read_lock checking enabled, the following error was reported: ============================================= [ INFO: possible recursive locking detected ] 3.16.0-rc1 #2 Tainted: G E --------------------------------------------- load_policy/708 is trying to acquire lock: (policy_rwlock){.+.+..}, at: [<ffffffff8125b32a>] security_genfs_sid+0x3a/0x170 but task is already holding lock: (policy_rwlock){.+.+..}, at: [<ffffffff8125b48c>] security_fs_use+0x2c/0x110 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(policy_rwlock); lock(policy_rwlock); This patch fixes the occurrence of recursive read_lock() of policy_rwlock by adding a helper function __security_genfs_sid() which requires caller to take the lock before calling it. The security_fs_use() was then modified to call the new helper function. Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-19	selinux: fix a possible memory leak in cond_read_node()	Namhyung Kim
	The cond_read_node() should free the given node on error path as it's not linked to p->cond_list yet. This is done via cond_node_destroy() but it's not called when next_entry() fails before the expr loop. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-19	selinux: simple cleanup for cond_read_node()	Namhyung Kim
	The node->cur_state and len can be read in a single call of next_entry(). And setting len before reading is a dead write so can be eliminated. Signed-off-by: Namhyung Kim <namhyung@kernel.org> (Minor tweak to the length parameter in the call to next_entry()) Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-18	security: Used macros from compiler.h instead of __attribute__((...))	Gideon Israel Dsouza
	To increase compiler portability there is <linux/compiler.h> which provides convenience macros for various gcc constructs. Eg: __packed for __attribute__((packed)). This patch is part of a large task I've taken to clean the gcc specific attributes and use the the macros instead. Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-18	selinux: introduce str_read() helper	Namhyung Kim
	There're some code duplication for reading a string value during policydb_read(). Add str_read() helper to fix it. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-17	SELinux: use ARRAY_SIZE	Himangi Saraogi
	ARRAY_SIZE is more concise to use when the size of an array is divided by the size of its type or the size of its first element. The Coccinelle semantic patch that makes this change is as follows: // <smpl> @@ type T; T[] E; @@ - (sizeof(E)/sizeof(E[...])) + ARRAY_SIZE(E) // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-17	Merge tag 'v3.15' into next	Paul Moore
	Linux 3.15
2014-06-13	Merge branch 'serge-next-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security Pull more security layer updates from Serge Hallyn: "A few more commits had previously failed to make it through security-next into linux-next but this week made it into linux-next. At least commit "ima: introduce ima_kernel_read()" was deemed critical by Mimi to make this merge window. This is a temporary tree just for this request. Mimi has pointed me to some previous threads about keeping maintainer trees at the previous release, which I'll certainly do for anything long-term, after talking with James" * 'serge-next-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security: ima: introduce ima_kernel_read() evm: prohibit userspace writing 'security.evm' HMAC value ima: check inode integrity cache in violation check ima: prevent unnecessary policy checking evm: provide option to protect additional SMACK xattrs evm: replace HMAC version with attribute mask ima: prevent new digsig xattr from being replaced
2014-06-12	ima: introduce ima_kernel_read()	Dmitry Kasatkin
	Commit 8aac62706 "move exit_task_namespaces() outside of exit_notify" introduced the kernel opps since the kernel v3.10, which happens when Apparmor and IMA-appraisal are enabled at the same time. ---------------------------------------------------------------------- [ 106.750167] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 106.750221] IP: [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.750241] PGD 0 [ 106.750254] Oops: 0000 [#1] SMP [ 106.750272] Modules linked in: cuse parport_pc ppdev bnep rfcomm bluetooth rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel snd_hda_codec_hdmi kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd snd_hda_codec_realtek dcdbas snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi psmouse snd_seq microcode serio_raw snd_timer snd_seq_device snd soundcore video lpc_ich coretemp mac_hid lp parport mei_me mei nbd hid_generic e1000e usbhid ahci ptp hid libahci pps_core [ 106.750658] CPU: 6 PID: 1394 Comm: mysqld Not tainted 3.13.0-rc7-kds+ #15 [ 106.750673] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A08 09/19/2012 [ 106.750689] task: ffff8800de804920 ti: ffff880400fca000 task.ti: ffff880400fca000 [ 106.750704] RIP: 0010:[<ffffffff811ec7da>] [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.750725] RSP: 0018:ffff880400fcba60 EFLAGS: 00010286 [ 106.750738] RAX: 0000000000000000 RBX: 0000000000000100 RCX: ffff8800d51523e7 [ 106.750764] RDX: ffffffffffffffea RSI: ffff880400fcba34 RDI: ffff880402d20020 [ 106.750791] RBP: ffff880400fcbae0 R08: 0000000000000000 R09: 0000000000000001 [ 106.750817] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800d5152300 [ 106.750844] R13: ffff8803eb8df510 R14: ffff880400fcbb28 R15: ffff8800d51523e7 [ 106.750871] FS: 0000000000000000(0000) GS:ffff88040d200000(0000) knlGS:0000000000000000 [ 106.750910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 106.750935] CR2: 0000000000000018 CR3: 0000000001c0e000 CR4: 00000000001407e0 [ 106.750962] Stack: [ 106.750981] ffffffff813434eb ffff880400fcbb20 ffff880400fcbb18 0000000000000000 [ 106.751037] ffff8800de804920 ffffffff8101b9b9 0001800000000000 0000000000000100 [ 106.751093] 0000010000000000 0000000000000002 000000000000000e ffff8803eb8df500 [ 106.751149] Call Trace: [ 106.751172] [<ffffffff813434eb>] ? aa_path_name+0x2ab/0x430 [ 106.751199] [<ffffffff8101b9b9>] ? sched_clock+0x9/0x10 [ 106.751225] [<ffffffff8134a68d>] aa_path_perm+0x7d/0x170 [ 106.751250] [<ffffffff8101b945>] ? native_sched_clock+0x15/0x80 [ 106.751276] [<ffffffff8134aa73>] aa_file_perm+0x33/0x40 [ 106.751301] [<ffffffff81348c5e>] common_file_perm+0x8e/0xb0 [ 106.751327] [<ffffffff81348d78>] apparmor_file_permission+0x18/0x20 [ 106.751355] [<ffffffff8130c853>] security_file_permission+0x23/0xa0 [ 106.751382] [<ffffffff811c77a2>] rw_verify_area+0x52/0xe0 [ 106.751407] [<ffffffff811c789d>] vfs_read+0x6d/0x170 [ 106.751432] [<ffffffff811cda31>] kernel_read+0x41/0x60 [ 106.751457] [<ffffffff8134fd45>] ima_calc_file_hash+0x225/0x280 [ 106.751483] [<ffffffff8134fb52>] ? ima_calc_file_hash+0x32/0x280 [ 106.751509] [<ffffffff8135022d>] ima_collect_measurement+0x9d/0x160 [ 106.751536] [<ffffffff810b552d>] ? trace_hardirqs_on+0xd/0x10 [ 106.751562] [<ffffffff8134f07c>] ? ima_file_free+0x6c/0xd0 [ 106.751587] [<ffffffff81352824>] ima_update_xattr+0x34/0x60 [ 106.751612] [<ffffffff8134f0d0>] ima_file_free+0xc0/0xd0 [ 106.751637] [<ffffffff811c9635>] __fput+0xd5/0x300 [ 106.751662] [<ffffffff811c98ae>] ____fput+0xe/0x10 [ 106.751687] [<ffffffff81086774>] task_work_run+0xc4/0xe0 [ 106.751712] [<ffffffff81066fad>] do_exit+0x2bd/0xa90 [ 106.751738] [<ffffffff8173c958>] ? retint_swapgs+0x13/0x1b [ 106.751763] [<ffffffff8106780c>] do_group_exit+0x4c/0xc0 [ 106.751788] [<ffffffff81067894>] SyS_exit_group+0x14/0x20 [ 106.751814] [<ffffffff8174522d>] system_call_fastpath+0x1a/0x1f [ 106.751839] Code: c3 0f 1f 44 00 00 55 48 89 e5 e8 22 fe ff ff 5d c3 0f 1f 44 00 00 55 65 48 8b 04 25 c0 c9 00 00 48 8b 80 28 06 00 00 48 89 e5 5d <48> 8b 40 18 48 39 87 c0 00 00 00 0f 94 c0 c3 0f 1f 80 00 00 00 [ 106.752185] RIP [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.752214] RSP <ffff880400fcba60> [ 106.752236] CR2: 0000000000000018 [ 106.752258] ---[ end trace 3c520748b4732721 ]--- ---------------------------------------------------------------------- The reason for the oops is that IMA-appraisal uses "kernel_read()" when file is closed. kernel_read() honors LSM security hook which calls Apparmor handler, which uses current->nsproxy->mnt_ns. The 'guilty' commit changed the order of cleanup code so that nsproxy->mnt_ns was not already available for Apparmor. Discussion about the issue with Al Viro and Eric W. Biederman suggested that kernel_read() is too high-level for IMA. Another issue, except security checking, that was identified is mandatory locking. kernel_read honors it as well and it might prevent IMA from calculating necessary hash. It was suggested to use simplified version of the function without security and locking checks. This patch introduces special version ima_kernel_read(), which skips security and mandatory locking checking. It prevents the kernel oops to happen. Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org>
2014-06-12	evm: prohibit userspace writing 'security.evm' HMAC value	Mimi Zohar
	Calculating the 'security.evm' HMAC value requires access to the EVM encrypted key. Only the kernel should have access to it. This patch prevents userspace tools(eg. setfattr, cp --preserve=xattr) from setting/modifying the 'security.evm' HMAC value directly. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org>
2014-06-12	ima: check inode integrity cache in violation check	Dmitry Kasatkin
	When IMA did not support ima-appraisal, existance of the S_IMA flag clearly indicated that the file was measured. With IMA appraisal S_IMA flag indicates that file was measured and/or appraised. Because of this, when measurement is not enabled by the policy, violations are still reported. To differentiate between measurement and appraisal policies this patch checks the inode integrity cache flags. The IMA_MEASURED flag indicates whether the file was actually measured, while the IMA_MEASURE flag indicates whether the file should be measured. Unfortunately, the IMA_MEASURED flag is reset to indicate the file needs to be re-measured. Thus, this patch checks the IMA_MEASURE flag. This patch limits the false positive violation reports, but does not fix it entirely. The IMA_MEASURE/IMA_MEASURED flags are indications that, at some point in time, the file opened for read was in policy, but might not be in policy now (eg. different uid). Other changes would be needed to further limit false positive violation reports. Changelog: - expanded patch description based on conversation with Roberto (Mimi) Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12	ima: prevent unnecessary policy checking	Dmitry Kasatkin
	ima_rdwr_violation_check is called for every file openning. The function checks the policy even when violation condition is not met. It causes unnecessary policy checking. This patch does policy checking only if violation condition is met. Changelog: - check writecount is greater than zero (Mimi) Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12	evm: provide option to protect additional SMACK xattrs	Dmitry Kasatkin
	Newer versions of SMACK introduced following security xattrs: SMACK64EXEC, SMACK64TRANSMUTE and SMACK64MMAP. To protect these xattrs, this patch includes them in the HMAC calculation. However, for backwards compatibility with existing labeled filesystems, including these xattrs needs to be configurable. Changelog: - Add SMACK dependency on new option (Mimi) Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12	evm: replace HMAC version with attribute mask	Dmitry Kasatkin
	Using HMAC version limits the posibility to arbitrarily add new attributes such as SMACK64EXEC to the hmac calculation. This patch replaces hmac version with attribute mask. Desired attributes can be enabled with configuration parameter. It allows to build kernels which works with previously labeled filesystems. Currently supported attribute is 'fsuuid' which is equivalent of the former version 2. Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12	ima: prevent new digsig xattr from being replaced	Mimi Zohar
	Even though a new xattr will only be appraised on the next access, set the DIGSIG flag to prevent a signature from being replaced with a hash on file close. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next	Linus Torvalds
	Pull networking updates from David Miller: 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov. 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston. 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork. 4) BPF now has a "random" opcode, from Chema Gonzalez. 5) Add more BPF documentation and improve test framework, from Daniel Borkmann. 6) Support TCP fastopen over ipv6, from Daniel Lee. 7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia. 8) Support software TSO in fec driver too, from Nimrod Andy. 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli. 10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu. 11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti. 12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell. 13) Support busy polling in SCTP, from Neal Horman. 14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru. 15) Bridge promisc mode handling improvements from Vlad Yasevich. 16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits) rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 tcp: fixing TLP's FIN recovery net: fec: Add software TSO support net: fec: Add Scatter/gather support net: fec: Increase buffer descriptor entry number net: fec: Factorize feature setting net: fec: Enable IP header hardware checksum net: fec: Factorize the .xmit transmit function bridge: fix compile error when compiling without IPv6 support bridge: fix smatch warning / potential null pointer dereference via-rhine: fix full-duplex with autoneg disable bnx2x: Enlarge the dorq threshold for VFs bnx2x: Check for UNDI in uncommon branch bnx2x: Fix 1G-baseT link bnx2x: Fix link for KR with swapped polarity lane sctp: Fix sk_ack_backlog wrap-around problem net/core: Add VF link state control policy net/fsl: xgmac_mdio is dependent on OF_MDIO net/fsl: Make xgmac_mdio read error message useful net_sched: drr: warn when qdisc is not work conserving ...
2014-06-10	Merge branch 'serge-next-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security Pull security layer updates from Serge Hallyn: "This is a merge of James Morris' security-next tree from 3.14 to yesterday's master, plus four patches from Paul Moore which are in linux-next, plus one patch from Mimi" * 'serge-next-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security: ima: audit log files opened with O_DIRECT flag selinux: conditionally reschedule in hashtab_insert while loading selinux policy selinux: conditionally reschedule in mls_convert_context while loading selinux policy selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES selinux: Report permissive mode in avc: denied messages. Warning in scanf string typing Smack: Label cgroup files for systemd Smack: Verify read access on file open - v3 security: Convert use of typedef ctl_table to struct ctl_table Smack: bidirectional UDS connect check Smack: Correctly remove SMACK64TRANSMUTE attribute SMACK: Fix handling value==NULL in post setxattr bugfix patch for SMACK Smack: adds smackfs/ptrace interface Smack: unify all ptrace accesses in the smack Smack: fix the subject/object order in smack_ptrace_traceme() Minor improvement of 'smack_sb_kern_mount' smack: fix key permission verification KEYS: Move the flags representing required permission to linux/key.h
2014-06-09	Merge branch 'for-3.16' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "A lot of activities on cgroup side. Heavy restructuring including locking simplification took place to improve the code base and enable implementation of the unified hierarchy, which currently exists behind a __DEVEL__ mount option. The core support is mostly complete but individual controllers need further work. To explain the design and rationales of the the unified hierarchy Documentation/cgroups/unified-hierarchy.txt is added. Another notable change is css (cgroup_subsys_state - what each controller uses to identify and interact with a cgroup) iteration update. This is part of continuing updates on css object lifetime and visibility. cgroup started with reference count draining on removal way back and is now reaching a point where csses behave and are iterated like normal refcnted objects albeit with some complexities to allow distinguishing the state where they're being deleted. The css iteration update isn't taken advantage of yet but is planned to be used to simplify memcg significantly" * 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (77 commits) cgroup: disallow disabled controllers on the default hierarchy cgroup: don't destroy the default root cgroup: disallow debug controller on the default hierarchy cgroup: clean up MAINTAINERS entries cgroup: implement css_tryget() device_cgroup: use css_has_online_children() instead of has_children() cgroup: convert cgroup_has_live_children() into css_has_online_children() cgroup: use CSS_ONLINE instead of CGRP_DEAD cgroup: iterate cgroup_subsys_states directly cgroup: introduce CSS_RELEASED and reduce css iteration fallback window cgroup: move cgroup->serial_nr into cgroup_subsys_state cgroup: link all cgroup_subsys_states in their sibling lists cgroup: move cgroup->sibling and ->children into cgroup_subsys_state cgroup: remove cgroup->parent device_cgroup: remove direct access to cgroup->children memcg: update memcg_has_children() to use css_next_child() memcg: remove tasks/children test from mem_cgroup_force_empty() cgroup: remove css_parent() cgroup: skip refcnting on normal root csses and cgrp_dfl_root self css cgroup: use cgroup->self.refcnt for cgroup refcnting ...
2014-06-03	ima: audit log files opened with O_DIRECT flag	Mimi Zohar
	Files are measured or appraised based on the IMA policy. When a file, in policy, is opened with the O_DIRECT flag, a deadlock occurs. The first attempt at resolving this lockdep temporarily removed the O_DIRECT flag and restored it, after calculating the hash. The second attempt introduced the O_DIRECT_HAVELOCK flag. Based on this flag, do_blockdev_direct_IO() would skip taking the i_mutex a second time. The third attempt, by Dmitry Kasatkin, resolves the i_mutex locking issue, by re-introducing the IMA mutex, but uncovered another problem. Reading a file with O_DIRECT flag set, writes directly to userspace pages. A second patch allocates a user-space like memory. This works for all IMA hooks, except ima_file_free(), which is called on __fput() to recalculate the file hash. Until this last issue is addressed, do not 'collect' the measurement for measuring, appraising, or auditing files opened with the O_DIRECT flag set. Based on policy, permit or deny file access. This patch defines a new IMA policy rule option named 'permit_directio'. Policy rules could be defined, based on LSM or other criteria, to permit specific applications to open files with the O_DIRECT flag set. Changelog v1: - permit or deny file access based IMA policy rules Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Cc: <stable@vger.kernel.org>
2014-06-03	selinux: conditionally reschedule in hashtab_insert while loading selinux policy	Dave Jones
	After silencing the sleeping warning in mls_convert_context() I started seeing similar traces from hashtab_insert. Do a cond_resched there too. Signed-off-by: Dave Jones <davej@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-03	selinux: conditionally reschedule in mls_convert_context while loading ↵	Dave Jones
	selinux policy On a slow machine (with debugging enabled), upgrading selinux policy may take a considerable amount of time. Long enough that the softlockup detector gets triggered. The backtrace looks like this.. > BUG: soft lockup - CPU#2 stuck for 23s! [load_policy:19045] > Call Trace: > [<ffffffff81221ddf>] symcmp+0xf/0x20 > [<ffffffff81221c27>] hashtab_search+0x47/0x80 > [<ffffffff8122e96c>] mls_convert_context+0xdc/0x1c0 > [<ffffffff812294e8>] convert_context+0x378/0x460 > [<ffffffff81229170>] ? security_context_to_sid_core+0x240/0x240 > [<ffffffff812221b5>] sidtab_map+0x45/0x80 > [<ffffffff8122bb9f>] security_load_policy+0x3ff/0x580 > [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100 > [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80 > [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100 > [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50 > [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80 > [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100 > [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50 > [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100 > [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe > [<ffffffff8109c82d>] ? trace_hardirqs_on_caller+0xfd/0x1c0 > [<ffffffff81279a2e>] ? trace_hardirqs_on_thunk+0x3a/0x3f > [<ffffffff810d28a8>] ? rcu_irq_exit+0x68/0xb0 > [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe > [<ffffffff8121e947>] sel_write_load+0xa7/0x770 > [<ffffffff81139633>] ? vfs_write+0x1c3/0x200 > [<ffffffff81210e8e>] ? security_file_permission+0x1e/0xa0 > [<ffffffff8113952b>] vfs_write+0xbb/0x200 > [<ffffffff811581c7>] ? fget_light+0x397/0x4b0 > [<ffffffff81139c27>] SyS_write+0x47/0xa0 > [<ffffffff8153bde4>] tracesys+0xdd/0xe2 Stephen Smalley suggested: > Maybe put a cond_resched() within the ebitmap_for_each_positive_bit() > loop in mls_convert_context()? That seems to do the trick. Tested by downgrading and re-upgrading selinux-policy-targeted. Signed-off-by: Dave Jones <davej@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-03	selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES	Paul Moore
	We presently prevent processes from using setexecon() to set the security label of exec()'d processes when NO_NEW_PRIVS is enabled by returning an error; however, we silently ignore setexeccon() when exec()'ing from a nosuid mounted filesystem. This patch makes things a bit more consistent by returning an error in the setexeccon()/nosuid case. Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>