summaryrefslogtreecommitdiffstats
path: root/fs/ceph
AgeCommit message (Collapse)Author
2010-05-17ceph: name bdi ceph-%d instead of major:minorSage Weil
The bdi_setup_and_register() helper doesn't help us since we bdi_init() in create_client() and bdi_register() only when sget() succeeds. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: skip mds sync on forced unmountSage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: adjust masked struct_v variable namesSage Weil
Reported-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: clean up mount options, ->show_options()Sage Weil
Ensure all options are included in /proc/mounts. Some cleanup. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: set dn offset when splicedSage Weil
We want to assign an offset when the dentry goes from null to linked, which is always done by splice_dentry(). Notably, we should NOT assign an offset when a dentry is first created and is still null. BUG if we try to splice a non-null dentry (we shouldn't). Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: don't clobber i_max_offset on already complete dirSage Weil
This can screw up offsets assigned to new dentries and break dcache readdir results. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: skip set_dentry_offset work if directory not I_COMPLETESage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: set next_offset on readdir finishSage Weil
Set next_offset to 2 (always 2!), not 0, on readdir finish. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: listxattr should compare version by >=Henry C Chang
If the version hasn't changed, don't rebuild the index. Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: fix xattr dangling pointer / double freeSage Weil
If we use the xattr_blob, clear the pointer so we don't release the memory at the bottom of the fuction. Reported-by: Henry C Chang <henry_c_chang@tcloudcomputing.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: close messenger raceSage Weil
Simplify messenger locking, and close race between ceph_con_close() setting the CLOSED bit and con_work() checking the bit, then taking the mutex. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: name msgpools; useful error messagesSage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: fix memory leak due to possible dentry init raceSage Weil
Free dentry_info in error path. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: include auth method in error messagesSage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: osdtimeout=0 for now timeoutSage Weil
Allow the osd reset timeout to be disabled. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: d_obtain_alias() returns ERR_PTR()Dan Carpenter
d_obtain_alias() doesn't return NULL, it returns an ERR_PTR(). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: wake up mount thread when getting osdmapYehuda Sadeh
Now that the mount thread waits for the osdmap, it needs to be awaken. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2010-05-17ceph: remove unused #includesHuang Weiyi
Remove unused #include's in fs/ceph/super.c Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: wait for both monmap and osdmap when opening sessionSage Weil
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2010-05-17ceph: clean up connection resetSage Weil
Reset out_keepalive_pending and peer_global_seq, and drop unused var. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: simplify ceph_msg_newSage Weil
We only need to pass in front_len. Callers can attach any other payload pieces (middle, data) as they see fit. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: make ceph_msg_new return NULL on failure; clean up, fix callersSage Weil
Returning ERR_PTR(-ENOMEM) is useless extra work. Return NULL on failure instead, and fix up the callers (about half of which were wrong anyway). Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: rewrite msgpool using mempool_tSage Weil
Since we don't need to maintain large pools of messages, we can just use the standard mempool_t. We maintain a msgpool 'wrapper' because we need the mempool_t* in the alloc function, and mempool gives us only pool_data. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: use ceph_sb_to_client instead of ceph_clientCheng Renquan
ceph_sb_to_client and ceph_client are really identical, we need to dump one; while function ceph_client is confusing with "struct ceph_client", ceph_sb_to_client's definition is more clear; so we'd better switch all call to ceph_sb_to_client. -static inline struct ceph_client *ceph_client(struct super_block *sb) -{ - return sb->s_fs_info; -} Signed-off-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: handle kzalloc() failureCheng Renquan
Signed-off-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: drop unnecessary msgpool for mon_client subscribe_ackSage Weil
Preallocate a single message to reuse instead. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: drop unnecessary msgpool for mon_client auth_replySage Weil
Preallocate a single reply message that we can reuse instead. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: clean up statfsSage Weil
Avoid unnecessary msgpool. Preallocate reply. Fix use-after-free race. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: fix theoretically possible double-put on connectionSage Weil
This would only trigger if we bailed out before resetting r_con_filling_msg because the server reply was corrupt (oversized). Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: cleanup: remove dead codeDan Carpenter
"xattr" is never NULL here. We took care of that in the previous if statement block. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: reduce build_path debug outputSage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: use __page_cache_alloc and add_to_page_cache_lruYehuda Sadeh
Following Nick Piggin patches in btrfs, pagecache pages should be allocated with __page_cache_alloc, so they obey pagecache memory policies. Also, using add_to_page_cache_lru instead of using a private pagevec where applicable. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: update for removal of kref_setStephen Rothwell
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: simplify page setup for incoming dataSage Weil
Drop largely useless helper __prepare_pages(), and simplify sanity checks. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: invalidate affected dentry leases on aborted requestsSage Weil
If we abort a request, we return to caller, but the request may still complete. And if we hold the dir FILE_EXCL bit, we may not release a lease when sending a request. A simple un-tar, control-c, un-tar again will reproduce the bug (manifested as a 'Cannot open: File exists'). Ensure we invalidate affected dentry leases (as well dir I_COMPLETE) so we don't have valid (but incorrect) leases. Do the same, consistently, at other sites where I_COMPLETE is similarly cleared. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: fix race between aborted requests and fill_traceSage Weil
When we abort requests we need to prevent fill_trace et al from doing anything that relies on locks held by the VFS caller. This fixes a race between the reply handler and the abort code, ensuring that continue holding the dir mutex until the reply handler completes. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-17ceph: clean up mds reply, error handlingSage Weil
We would occasionally BUG out in the reply handler because r_reply was nonzero, due to a race with ceph_mdsc_do_request temporarily setting r_reply to an ERR_PTR value. This is unnecessary, messy, and also wrong in the EIO case. Clean up by consistently using r_err for errors and r_reply for messages. Also fix the abort logic to trigger consistently for all errors that return to the caller early (e.g., EIO from timeout case). If an abort races with a reply, use the result from the reply. Also fix locking for r_err, r_reply update in the reply handler. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11ceph: preserve seq # on requeued messages after transient transport errorsSage Weil
If the tcp connection drops and we reconnect to reestablish a stateful session (with the mds), we need to resend previously sent (and possibly received) messages with the _same_ seq # so that they can be dropped on the other end if needed. Only assign a new seq once after the message is queued. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11ceph: fix cap removal racesSage Weil
The iterate_session_caps helper traverses the session caps list and tries to grab an inode reference. However, the __ceph_remove_cap was clearing the inode backpointer _before_ removing itself from the session list, causing a null pointer dereference. Clear cap->ci under protection of s_cap_lock to avoid the race, and to tightly couple the list and backpointer state. Use a local flag to indicate whether we are releasing the cap, as cap->session may be modified by a racing thread in iterate_session_caps. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11ceph: zero unused message header, footer fieldsSage Weil
We shouldn't leak any prior memory contents to other parties. And random data, particularly in the 'version' field, can cause problems down the line. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11ceph: fix locking for waking session requests after reconnectSage Weil
The session->s_waiting list is protected by mdsc->mutex, not s_mutex. This was causing (rare) s_waiting list corruption. Fix errors paths too, while we're here. A more thorough cleanup of this function is coming soon. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11ceph: resubmit requests on pg mapping change (not just primary change)Sage Weil
OSD requests need to be resubmitted on any pg mapping change, not just when the pg primary changes. Resending only when the primary changes results in occasional 'hung' requests during osd cluster recovery or rebalancing. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11ceph: fix open file counting on snapped inodes when mds returns no capsSage Weil
It's possible the MDS will not issue caps on a snapped inode, in which case an open request may not __ceph_get_fmode(), botching the open file counting. (This is actually a server bug, but the client shouldn't BUG out in this case.) Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-11ceph: unregister osd request on failureSage Weil
The osd request wasn't being unregistered when the osd returned a failure code, even though the result was returned to the caller. This would cause it to eventually time out, and then crash the kernel when it tried to resend the request using a stale page vector. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-05ceph: don't use writeback_control in writepages completionSage Weil
The ->writepages writeback_control is not still valid in the writepages completion. We were touching it solely to adjust pages_skipped when there was a writeback error (EIO, ENOSPC, EPERM due to bad osd credentials), causing an oops in the writeback code shortly thereafter. Updating pages_skipped on error isn't correct anyway, so let's just rip out this (clearly broken) code to pass the wbc to the completion. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-04ceph: unregister bdi before kill_anon_super releases device nameSage Weil
Unregister and destroy the bdi in put_super, after mount is r/o, but before put_anon_super releases the device name. For symmetry, bdi_destroy in destroy_client (we bdi_init in create_client). Only set s_bdi if bdi_register succeeds, since we use it to decide whether to bdi_unregister. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-03ceph: remove bad auth_x kmem_cacheSage Weil
It's useless, since our allocations are already a power of 2. And it was allocated per-instance (not globally), which caused a name collision when we tried to mount a second file system with auth_x enabled. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-03ceph: fix lockless caps checkSage Weil
The __ variant requires caller to hold i_lock. Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-03ceph: clear dir complete, invalidate dentry on replayed renameSage Weil
If a rename operation is resent to the MDS following an MDS restart, the client does not get a full reply (containing the resulting metadata) back. In that case, a ceph_rename() needs to compensate by doing anything useful that fill_inode() would have, like d_move(). It also needs to invalidate the dentry (to workaround the vfs_rename_dir() bug) and clear the dir complete flag, just like fill_trace(). Signed-off-by: Sage Weil <sage@newdream.net>
2010-05-03ceph: fix direct io truncate offsetSage Weil
truncate_inode_pages_range wants the end offset to align with the last byte in a page. Signed-off-by: Sage Weil <sage@newdream.net>