bcachefs.git

Age	Commit message	Author
2019-04-03	bcachefs: delete some debug code	Kent Overstreet

2019-04-03	bcachefs: add missing include	Kent Overstreet

2019-04-03	bcachefs: BCH_NAME_MAX	Kent Overstreet
	also fix some dirent bugs
2019-04-03	bcachefs: optimize __bch2_btree_iter_relock()	Kent Overstreet
	bch2_btree_node_relock() and __bch2_btree_iter_relock() are now only used for relocking, not upgrading or downgrading locks, so we can split out bch2_btree_node_upgrade() and slim down the fast path.
2019-04-03	bcachefs: btree_node_lock_increment()	Kent Overstreet

2019-04-03	bcachefs: btree_iter_get_locks()	Kent Overstreet

2019-04-03	bcachefs: bch2_btree_iter_upgrade()/downgrade()	Kent Overstreet
	Replaces bch2_btree_iter_set_locks_want() - also add more assertions Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-04-03	bcachefs: Fix a minor memory leak	Kent Overstreet

2019-04-03	bcachefs: Fix a bug in the str_hash code	Kent Overstreet
	fixes b0f3e786995cb3b12975503f963e469db5a4f09b
2019-04-03	bcachefs: bch_sb_field_clean	Kent Overstreet
	Implement a superblock field so we don't have to read the journal after a clean shutdown (and more importantly, we can verify what we find in the journal after a clean shutdown) Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-04-03	bcachefs: Make some improvements to the journal shutdown code	Kent Overstreet

2019-04-03	bcachefs: split out recovery.c	Kent Overstreet

2019-04-03	bcachefs: btree gc refactoring	Kent Overstreet

2019-04-03	bcachefs: fix a btree iter traverse error path	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-04-03	bcachefs: btree perf/unit tests	Kent Overstreet
	The sysfs interface is crap and will be changed Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-04-03	bcachefs: fix missing bch_crc_bytes entries	Kent Overstreet

2019-04-03	bcachefs: add a discard mount option	Kent Overstreet
	Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-04-03	bcachefs: fix a minor fsync bug	Kent Overstreet

2019-04-03	bcachefs: drop locks when needed in bch2_btree_node_get_sibling()	Kent Overstreet

2019-04-03	bcachefs: btree iter refactoring	Kent Overstreet

2019-04-03	bcachefs: fix a fun truncate bug	Kent Overstreet
	truncate was leaving extents past the end of i_size. Turns out, it was doing so because it thought it wasn't shrinking the file when it was, and it thought it wasn't shrinking because i_size had gotten screwed up - the in memory i_size was smaller than the on disk i_size, which is never supposed to happen. Also turns out, the thing that was screwing up i_size was truncate - specifically, the error path when the filemap_write_and_wait_range() call fails. Besides fixing truncate itself, this patch also fixes and makes rigorous a lot of the locking pertaining to i_size and ei_inode (the cached on disk inode in bch_inode_info). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-04-03	bcachefs: add a journal_seq_verify debug option	Kent Overstreet

2019-04-03	bcachefs: implement BTREE_INSERT_NOUNLOCK	Kent Overstreet
	BTREE_INSERT_NOUNLOCK means after a successful btree update, do not drop any locks (e.g. while merging nodes). This is going to be used to fix some locking primarily related to bi_size in bch_inode_info.
2019-04-03	bcachefs: use BTREE_ITER_END more consistently	Kent Overstreet

2019-04-03	bcachefs: better bch2_strtoh()	Kent Overstreet

2019-04-03	bcachefs: Fix a spurious error in fsck	Kent Overstreet
	If fsck finds an unreachable directory, it could just be because we crashed between deleting the dirent and deleting the inode, since that isn't done atomically yet - it's only a real error if the directory isn't empty
2019-04-03	bcachefs: don't use BTREE_INSERT_NOWAIT when we're not supposed to	Kent Overstreet
	was causing spurious journal replay failures
2019-04-03	bcachefs: fix missing btree_iter_set_dirty() call	Kent Overstreet

2019-04-03	bcachefs: btree allocation deadlock fix	Kent Overstreet

2019-04-03	bcachefs: fix another minor locking bug	Kent Overstreet

2019-04-03	bcachefs: fix error path in fallocate	Kent Overstreet

2019-04-03	bcachefs: tighten up reserve sizes	Kent Overstreet

2019-04-03	bcachefs: fix device sysfs links	Kent Overstreet

2019-04-03	bcachefs: fcollapse works on block granularity, not page	Kent Overstreet

2019-04-03	bcachefs: fix an error path in fcollapse	Kent Overstreet

2019-04-03	bcachefs: fix SGID + acls	Kent Overstreet

2019-04-03	bcachefs: fix dio write when faulting in from file we're writing to	Kent Overstreet

2019-04-03	bcachefs: drop some dead code	Kent Overstreet

2019-04-03	bcachefs: kill bch2_read_string_list()	Kent Overstreet

2019-04-03	bcachefs: Initial commit	Kent Overstreet
	Fork of drivers/md/bcache Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-04-03	cifs: convert to add_to_page_cache()	Kent Overstreet

2019-04-03	fs: factor out d_mark_tmpfile()	Kent Overstreet
	New helper for bcachefs - bcachefs doesn't want the inode_dec_link_count() call that d_tmpfile does, it handles i_nlink on its own atomically with other btree updates Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-04-03	fs: insert_inode_locked2()	Kent Overstreet
	New helper for bcachefs, so that when we race inserting an inode we can atomically grab a ref to the inode already in the inode cache. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-04-03	mm: pagecache add lock	Kent Overstreet
	Add a per address space lock around adding pages to the pagecache - making it possible for fallocate INSERT_RANGE/COLLAPSE_RANGE to work correctly, and also hopefully making truncate and dio a bit saner. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2019-03-01	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge misc fixes from Andrew Morton: "2 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: hugetlbfs: fix races and page leaks during migration kasan: turn off asan-stack for clang-8 and earlier
2019-03-01	hugetlbfs: fix races and page leaks during migration	Mike Kravetz
	hugetlb pages should only be migrated if they are 'active'. The routines set/clear_page_huge_active() modify the active state of hugetlb pages. When a new hugetlb page is allocated at fault time, set_page_huge_active is called before the page is locked. Therefore, another thread could race and migrate the page while it is being added to page table by the fault code. This race is somewhat hard to trigger, but can be seen by strategically adding udelay to simulate worst case scheduling behavior. Depending on 'how' the code races, various BUG()s could be triggered. To address this issue, simply delay the set_page_huge_active call until after the page is successfully added to the page table. Hugetlb pages can also be leaked at migration time if the pages are associated with a file in an explicitly mounted hugetlbfs filesystem. For example, consider a two node system with 4GB worth of huge pages available. A program mmaps a 2G file in a hugetlbfs filesystem. It then migrates the pages associated with the file from one node to another. When the program exits, huge page counts are as follows: node0 1024 free_hugepages 1024 nr_hugepages node1 0 free_hugepages 1024 nr_hugepages Filesystem Size Used Avail Use% Mounted on nodev 4.0G 2.0G 2.0G 50% /var/opt/hugepool That is as expected. 2G of huge pages are taken from the free_hugepages counts, and 2G is the size of the file in the explicitly mounted filesystem. If the file is then removed, the counts become: node0 1024 free_hugepages 1024 nr_hugepages node1 1024 free_hugepages 1024 nr_hugepages Filesystem Size Used Avail Use% Mounted on nodev 4.0G 2.0G 2.0G 50% /var/opt/hugepool Note that the filesystem still shows 2G of pages used, while there actually are no huge pages in use. The only way to 'fix' the filesystem accounting is to unmount the filesystem. If a hugetlb page is associated with an explicitly mounted filesystem, this information is contained in the page_private field.
At migration time, this information is not preserved. To fix, simply transfer page_private from old to new page at migration time if necessary. There is a related race with removing a huge page from a file and migration. When a huge page is removed from the pagecache, the page_mapping() field is cleared, yet page_private remains set until the page is actually freed by free_huge_page(). A page could be migrated while in this state. However, since page_mapping() is not set the hugetlbfs specific routine to transfer page_private is not called and we leak the page count in the filesystem. To fix that, check for this condition before migrating a huge page. If the condition is detected, return EBUSY for the page. Link: http://lkml.kernel.org/r/74510272-7319-7372-9ea6-ec914734c179@oracle.com Link: http://lkml.kernel.org/r/20190212221400.3512-1-mike.kravetz@oracle.com Fixes: bcc54222309c ("mm: hugetlb: introduce page_huge_active") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: <stable@vger.kernel.org> [mike.kravetz@oracle.com: v2] Link: http://lkml.kernel.org/r/7534d322-d782-8ac6-1c8d-a8dc380eb3ab@oracle.com [mike.kravetz@oracle.com: update comment and changelog] Link: http://lkml.kernel.org/r/420bcfd6-158b-38e4-98da-26d0cd85bd01@oracle.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-28	Merge tag 'for-linus-5.0-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux	Linus Torvalds
	Pull orangefs fixlet from Mike Marshall: "Remove two un-needed BUG_ONs" * tag 'for-linus-5.0-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: orangefs: remove two un-needed BUG_ONs...
2019-02-25	afs: Fix manually set volume location server list	David Howells
	When a cell with a volume location server list is added manually by echoing the details into /proc/net/afs/cells, a record is added but the flag saying it has been looked up isn't set. This causes the VL server rotation code to wait forever, with the top of /proc/pid/stack looking like: afs_select_vlserver+0x3a6/0x6f3 afs_vl_lookup_vldb+0x4b/0x92 afs_create_volume+0x25/0x1b9 ... with the thread stuck in afs_start_vl_iteration() waiting for AFS_CELL_FL_NO_LOOKUP_YET to be cleared. Fix this by clearing AFS_CELL_FL_NO_LOOKUP_YET when setting up a record if that record's details were supplied manually. Fixes: 0a5143f2f89c ("afs: Implement VL server rotation") Reported-by: Dave Botsch <dwb7@cornell.edu> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-25	Revert "x86/fault: BUG() when uaccess helpers fault on kernel addresses"	Linus Torvalds
	This reverts commit 9da3f2b74054406f87dff7101a569217ffceb29b. It was well-intentioned, but wrong. Overriding the exception tables for instructions for random reasons is just wrong, and that is what the new code did. It caused problems for tracing, and it caused problems for strncpy_from_user(), because the new checks made perfectly valid use cases break, rather than catch things that did bad things. Unchecked user space accesses are a problem, but that's not a reason to add invalid checks that then people have to work around with silly flags (in this case, that 'kernel_uaccess_faults_ok' flag, which is just an odd way to say "this commit was wrong" and was sprinkled into random places to hide the wrongness). The real fix to unchecked user space accesses is to get rid of the special "let's not check __get_user() and __put_user() at all" logic. Make __{get|put}_user() be just aliases to the regular {get|put}_user() functions, and make it impossible to access user space without having the proper checks in place. The raison d'être of the special double-underscore versions used to be that the range check was expensive, and if you did multiple user accesses, you'd do the range check up front (like the signal frame handling code, for example). But SMAP (on x86) and PAN (on ARM) have made that optimization pointless, because the _real_ expense is the "set CPU flag to allow user space access". So let's not break the valid cases to catch invalid cases that shouldn't even exist. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kees Cook <keescook@chromium.org> Cc: Tobin C. Harding <tobin@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-21	Merge tag 'ceph-for-5.0-rc8' of git://github.com/ceph/ceph-client	Linus Torvalds
	Pull ceph fixes from Ilya Dryomov: "Two bug fixes for old issues, both marked for stable" * tag 'ceph-for-5.0-rc8' of git://github.com/ceph/ceph-client: ceph: avoid repeatedly adding inode to mdsc->snap_flush_list libceph: handle an empty authorize reply