Age | Commit message (Collapse) | Author |
|
Add a sanity check for the fs_info as we will dereference it, similar to
what the 'store features' handler does.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We don't want to trigger the change on a read-only filesystem, similar
to what the label handler does.
Signed-off-by: David Sterba <dsterba@suse.cz>
|
|
The key variable occupies 17 bytes, the key_start is used once, we can
simply reuse existing 'key' for that purpose. As the key is not a simple
type, compiler doest not do it on itself.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The size of root item is more than 400 bytes, which is quite a lot of
stack space. As we do IO from inside the subvolume ioctls, we should
keep the stack usage low in case the filesystem is on top of other
layers (NFS, device mapper, iscsi, etc).
Reviewed-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We're going to use the argument multiple times later.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
uuid_mutex
When the replace target fails, the target device will be taken
out of fs device list, scratch + update_dev_time and freed. However
we could do the scratch + update_dev_time and free part after the
device has been taken out of device list, so that we don't have to
hold the device_list_mutex and uuid_mutex locks.
Reported issue:
[ 5375.718845] ======================================================
[ 5375.718846] [ INFO: possible circular locking dependency detected ]
[ 5375.718849] 4.4.5-scst31x-debug-11+ #40 Not tainted
[ 5375.718849] -------------------------------------------------------
[ 5375.718851] btrfs-health/4662 is trying to acquire lock:
[ 5375.718861] (sb_writers){.+.+.+}, at: [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
[ 5375.718862]
[ 5375.718862] but task is already holding lock:
[ 5375.718907] (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa028263c>] btrfs_destroy_dev_replace_tgtdev+0x3c/0x150 [btrfs]
[ 5375.718907]
[ 5375.718907] which lock already depends on the new lock.
[ 5375.718907]
[ 5375.718908]
[ 5375.718908] the existing dependency chain (in reverse order) is:
[ 5375.718911]
[ 5375.718911] -> #3 (&fs_devs->device_list_mutex){+.+.+.}:
[ 5375.718917] [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.718921] [<ffffffff81633949>] mutex_lock_nested+0x69/0x3c0
[ 5375.718940] [<ffffffffa0219bf6>] btrfs_show_devname+0x36/0x210 [btrfs]
[ 5375.718945] [<ffffffff81267079>] show_vfsmnt+0x49/0x150
[ 5375.718948] [<ffffffff81240b07>] m_show+0x17/0x20
[ 5375.718951] [<ffffffff81246868>] seq_read+0x2d8/0x3b0
[ 5375.718955] [<ffffffff8121df28>] __vfs_read+0x28/0xd0
[ 5375.718959] [<ffffffff8121e806>] vfs_read+0x86/0x130
[ 5375.718962] [<ffffffff8121f4c9>] SyS_read+0x49/0xa0
[ 5375.718966] [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 5375.718968]
[ 5375.718968] -> #2 (namespace_sem){+++++.}:
[ 5375.718971] [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.718974] [<ffffffff81635199>] down_write+0x49/0x80
[ 5375.718977] [<ffffffff81243593>] lock_mount+0x43/0x1c0
[ 5375.718979] [<ffffffff81243c13>] do_add_mount+0x23/0xd0
[ 5375.718982] [<ffffffff81244afb>] do_mount+0x27b/0xe30
[ 5375.718985] [<ffffffff812459dc>] SyS_mount+0x8c/0xd0
[ 5375.718988] [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 5375.718991]
[ 5375.718991] -> #1 (&sb->s_type->i_mutex_key#5){+.+.+.}:
[ 5375.718994] [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.718996] [<ffffffff81633949>] mutex_lock_nested+0x69/0x3c0
[ 5375.719001] [<ffffffff8122d608>] path_openat+0x468/0x1360
[ 5375.719004] [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
[ 5375.719007] [<ffffffff8121da7b>] do_sys_open+0x12b/0x210
[ 5375.719010] [<ffffffff8121db7e>] SyS_open+0x1e/0x20
[ 5375.719013] [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 5375.719015]
[ 5375.719015] -> #0 (sb_writers){.+.+.+}:
[ 5375.719018] [<ffffffff810d97ca>] __lock_acquire+0x17ba/0x1ae0
[ 5375.719021] [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.719026] [<ffffffff810d3bef>] percpu_down_read+0x4f/0xa0
[ 5375.719028] [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
[ 5375.719031] [<ffffffff81242eb4>] mnt_want_write+0x24/0x50
[ 5375.719035] [<ffffffff8122ded2>] path_openat+0xd32/0x1360
[ 5375.719037] [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
[ 5375.719040] [<ffffffff8121d8a4>] file_open_name+0xe4/0x130
[ 5375.719043] [<ffffffff8121d923>] filp_open+0x33/0x60
[ 5375.719073] [<ffffffffa02776a6>] update_dev_time+0x16/0x40 [btrfs]
[ 5375.719099] [<ffffffffa02825be>] btrfs_scratch_superblocks+0x4e/0x90 [btrfs]
[ 5375.719123] [<ffffffffa0282665>] btrfs_destroy_dev_replace_tgtdev+0x65/0x150 [btrfs]
[ 5375.719150] [<ffffffffa02c6c80>] btrfs_dev_replace_finishing+0x6b0/0x990 [btrfs]
[ 5375.719175] [<ffffffffa02c729e>] btrfs_dev_replace_start+0x33e/0x540 [btrfs]
[ 5375.719199] [<ffffffffa02c7f58>] btrfs_auto_replace_start+0xf8/0x140 [btrfs]
[ 5375.719222] [<ffffffffa02464e6>] health_kthread+0x246/0x490 [btrfs]
[ 5375.719225] [<ffffffff810a70df>] kthread+0xef/0x110
[ 5375.719229] [<ffffffff81637d2f>] ret_from_fork+0x3f/0x70
[ 5375.719230]
[ 5375.719230] other info that might help us debug this:
[ 5375.719230]
[ 5375.719233] Chain exists of:
[ 5375.719233] sb_writers --> namespace_sem --> &fs_devs->device_list_mutex
[ 5375.719233]
[ 5375.719234] Possible unsafe locking scenario:
[ 5375.719234]
[ 5375.719234] CPU0 CPU1
[ 5375.719235] ---- ----
[ 5375.719236] lock(&fs_devs->device_list_mutex);
[ 5375.719238] lock(namespace_sem);
[ 5375.719239] lock(&fs_devs->device_list_mutex);
[ 5375.719241] lock(sb_writers);
[ 5375.719241]
[ 5375.719241] *** DEADLOCK ***
[ 5375.719241]
[ 5375.719243] 4 locks held by btrfs-health/4662:
[ 5375.719266] #0: (&fs_info->health_mutex){+.+.+.}, at: [<ffffffffa0246303>] health_kthread+0x63/0x490 [btrfs]
[ 5375.719293] #1: (&fs_info->dev_replace.lock_finishing_cancel_unmount){+.+.+.}, at: [<ffffffffa02c6611>] btrfs_dev_replace_finishing+0x41/0x990 [btrfs]
[ 5375.719319] #2: (uuid_mutex){+.+.+.}, at: [<ffffffffa0282620>] btrfs_destroy_dev_replace_tgtdev+0x20/0x150 [btrfs]
[ 5375.719343] #3: (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa028263c>] btrfs_destroy_dev_replace_tgtdev+0x3c/0x150 [btrfs]
[ 5375.719343]
[ 5375.719343] stack backtrace:
[ 5375.719347] CPU: 2 PID: 4662 Comm: btrfs-health Not tainted 4.4.5-scst31x-debug-11+ #40
[ 5375.719348] Hardware name: Supermicro SYS-6018R-WTRT/X10DRW-iT, BIOS 1.0c 01/07/2015
[ 5375.719352] 0000000000000000 ffff880856f73880 ffffffff813529e3 ffffffff826182a0
[ 5375.719354] ffffffff8260c090 ffff880856f738c0 ffffffff810d667c ffff880856f73930
[ 5375.719357] ffff880861f32b40 ffff880861f32b68 0000000000000003 0000000000000004
[ 5375.719357] Call Trace:
[ 5375.719363] [<ffffffff813529e3>] dump_stack+0x85/0xc2
[ 5375.719366] [<ffffffff810d667c>] print_circular_bug+0x1ec/0x260
[ 5375.719369] [<ffffffff810d97ca>] __lock_acquire+0x17ba/0x1ae0
[ 5375.719373] [<ffffffff810f606d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 5375.719376] [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.719378] [<ffffffff812214f7>] ? __sb_start_write+0xb7/0xf0
[ 5375.719383] [<ffffffff810d3bef>] percpu_down_read+0x4f/0xa0
[ 5375.719385] [<ffffffff812214f7>] ? __sb_start_write+0xb7/0xf0
[ 5375.719387] [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
[ 5375.719389] [<ffffffff81242eb4>] mnt_want_write+0x24/0x50
[ 5375.719393] [<ffffffff8122ded2>] path_openat+0xd32/0x1360
[ 5375.719415] [<ffffffffa02462a0>] ? btrfs_congested_fn+0x180/0x180 [btrfs]
[ 5375.719418] [<ffffffff810f606d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 5375.719420] [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
[ 5375.719423] [<ffffffff810f615d>] ? rcu_read_lock_sched_held+0x6d/0x80
[ 5375.719426] [<ffffffff81201a9b>] ? kmem_cache_alloc+0x26b/0x5d0
[ 5375.719430] [<ffffffff8122e7d4>] ? getname_kernel+0x34/0x120
[ 5375.719433] [<ffffffff8121d8a4>] file_open_name+0xe4/0x130
[ 5375.719436] [<ffffffff8121d923>] filp_open+0x33/0x60
[ 5375.719462] [<ffffffffa02776a6>] update_dev_time+0x16/0x40 [btrfs]
[ 5375.719485] [<ffffffffa02825be>] btrfs_scratch_superblocks+0x4e/0x90 [btrfs]
[ 5375.719506] [<ffffffffa0282665>] btrfs_destroy_dev_replace_tgtdev+0x65/0x150 [btrfs]
[ 5375.719530] [<ffffffffa02c6c80>] btrfs_dev_replace_finishing+0x6b0/0x990 [btrfs]
[ 5375.719554] [<ffffffffa02c6b23>] ? btrfs_dev_replace_finishing+0x553/0x990 [btrfs]
[ 5375.719576] [<ffffffffa02c729e>] btrfs_dev_replace_start+0x33e/0x540 [btrfs]
[ 5375.719598] [<ffffffffa02c7f58>] btrfs_auto_replace_start+0xf8/0x140 [btrfs]
[ 5375.719621] [<ffffffffa02464e6>] health_kthread+0x246/0x490 [btrfs]
[ 5375.719641] [<ffffffffa02463d8>] ? health_kthread+0x138/0x490 [btrfs]
[ 5375.719661] [<ffffffffa02462a0>] ? btrfs_congested_fn+0x180/0x180 [btrfs]
[ 5375.719663] [<ffffffff810a70df>] kthread+0xef/0x110
[ 5375.719666] [<ffffffff810a6ff0>] ? kthread_create_on_node+0x200/0x200
[ 5375.719669] [<ffffffff81637d2f>] ret_from_fork+0x3f/0x70
[ 5375.719672] [<ffffffff810a6ff0>] ? kthread_create_on_node+0x200/0x200
[ 5375.719697] ------------[ cut here ]------------
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reported-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The "sizeof(*arg->clone_sources) * arg->clone_sources_count" expression
can overflow. It causes several static checker warnings. It's all
under CAP_SYS_ADMIN so it's not that serious but lets silence the
warnings.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Since mixed block groups accounting isn't byte-accurate and f_bree is an
unsigned integer, it could overflow. Avoid this.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Suggested-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Metadata for mixed block is already accounted in total data and should not
be counted as part of the free metadata space.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=114281
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Currently, we don't allow the user to try and rebalance to a dup profile
on a multi-device filesystem. In most cases, this is a perfectly sensible
restriction as raid1 uses the same amount of space and provides better
protection.
However, when reshaping a multi-device filesystem down to a single device
filesystem, this requires the user to convert metadata and system chunks
to single profile before deleting devices, and then convert again to dup,
which leaves a period of time where metadata integrity is reduced.
This patch removes the single-device-only restriction from converting to
dup profile to remove this potential data integrity reduction.
Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Move the op exclusivity check before the other code (same as in
ADD_DEV).
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
It seems to be long time unused, since 2008 and
6885f308b5570 ("Btrfs: Misc 2.6.25 updates").
Propagating the removal touches some code but has no functional effect.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Creates helper fucntion as needed by the device delete
and replace operations. Also now it checks if the next
device being assigned is an active device.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Yauhen reported in the ML that s_bdev is null at mount, and
s_bdev gets updated to some device when missing device is
replaced, as because bdev is null for missing device, things
gets matched up. Fix this by checking if s_bdev is set. I
didn't want to completely remove updating s_bdev because
the future multi device support at vfs layer may need it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reported-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
ta-da!
The main issue is the lack of down_write_killable(), so the places
like readdir.c switched to plain inode_lock(); once killable
variants of rwsem primitives appear, that'll be dealt with.
lockdep side also might need more work
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The rest of work.xattr stuff isn't needed for this branch
|
|
Also drop the newline from the message.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The kiocb already has the new position, so use that. The only interesting
case is AIO, where we currently don't bother updating ki_pos. We're about
to free the kiocb after we're done, so we might as well update it to make
everyone's life simpler.
While we're at it also return the bytes written argument passed in if
we were successful so that the boilerplate error switch code in the
callers can go away.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This will allow us to do per-I/O sync file writes, as required by a lot
of fileservers or storage targets.
XXX: Will need a few additional audits for O_DSYNC
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Including blkdev_direct_IO and dax_do_io. It has to be ki_pos to actually
work, so eliminate the superflous argument.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Single caller passes GFP_NOFS. We can get rid of the
gfpflags_allow_blocking checks as NOFS can block but does not recurse to
filesystem through reclaim.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Similar to __clear_extent_bit, do not fail if the state preallocation
fails as we might not need it. One less BUG_ON.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Single caller passes GFP_NOFS.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Single caller passes GFP_NOFS.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Single caller passes GFP_NOFS.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Callers pass GFP_NOFS and tests pass GFP_KERNEL, but using NOFS there
does not hurt. No need to pass the flags around.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Callers pass GFP_NOFS. No need to pass the flags around.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Callers pass GFP_NOFS. No need to pass the flags around.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Callers pass GFP_NOFS and GFP_KERNEL. No need to pass the flags around.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
All callers pass GFP_NOFS.
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The BTRFS_IOC_SEARCH_TREE ioctl returns file system items directly
to userspace. In order to decode them, full type information is required.
Create a new header, btrfs_tree to contain these since most users won't
need them.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
struct btrfs_ioctl_defrag_range_args is used by the BTRFS_IOC_DEFRAG_RANGE
ioctl.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The BTRFS_BALANCE_* flags are used by struct btrfs_ioctl_balance_args.flags
and btrfs_ioctl_balance_args.{data,meta,sys}.flags in the BTRFS_IOC_BALANCE
ioctl.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The compat/compat_ro/incompat feature flags are used by the feature set/get
ioctls.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The BTRFS_QGROUP_LIMIT_* flags are required to tell the kernel which
fields are valid when using the BTRFS_IOC_QGROUP_LIMIT ioctl.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
BTRFS_LABEL_SIZE is required to define the BTRFS_IOC_GET_FSLABEL and
BTRFS_IOC_SET_FSLABEL ioctls.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
A refactor patch, and avoids user input verification in the
btrfs_dev_replace_start(), and so this function can be reused.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Local variable fs_info, contains root->fs_info, use it.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Rename BTRFS_DEVICE_BY_ID so it's more descriptive that we specify the
device by id, it'll be part of the public API. The mask of supported
flags is also renamed, only for internal use.
The error code for unknown flags is EOPNOTSUPP, fixed.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
For clarity how we are going to find the device, let's call it a device
specifier, devspec for short. Also rename the arguments that are a
leftover from previous function purpose.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
We should avoid duplicating the device constraints, let's use the
btrfs_raid_array in btrfs_check_raid_min_devices.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|