path: root/include
Age | Commit message | Author
2013-06-17 | block: Bio cancellation [aio-ida] | Kent Overstreet
If a bio is associated with a kiocb, allow it to be cancelled. This is accomplished by adding a pointer to a kiocb in struct bio, and when we go to dequeue a request we check if its bio has been cancelled - if so, we end the request with -ECANCELED. We don't currently try to cancel bios if IO has already been started - that'd require a per bio callback function, and a way to find all the outstanding bios for a given kiocb. Such a mechanism may or may not be added in the future but this patch tries to start simple. Currently this can only be triggered with aio and io_cancel(), but the mechanism can be used for sync io too. It can also be used for bios created by stacking drivers, and bio clones in general - when cloning a bio, if the bi_iocb pointer is copied as well the clone will then be cancellable. bio_clone() could be modified to do this, but hasn't been in this patch because all the bio_clone() users would need to be audited to make sure that it's safe. We can't blindly make e.g. raid5 writes cancellable without the knowledge of the md code. Initial patch by Anatol Pomazau (anatol@google.com). Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org>
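The dequeue-time check described in this commit message can be pictured with a small userspace model. This is only a sketch, not the kernel code: the model_* types and functions are hypothetical stand-ins for struct kiocb, struct bio and struct request, and only the bi_iocb field name and the -ECANCELED result are taken from the message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct kiocb, struct bio and struct request. */
struct model_kiocb {
	bool cancelled;              /* set when the owner asks for cancellation */
};

struct model_bio {
	struct model_kiocb *bi_iocb; /* NULL when no kiocb is attached */
};

struct model_request {
	struct model_bio *bio;
	int status;                  /* 0, or a negative errno once ended */
	bool ended;
};

/* A bio counts as cancelled only if it carries a kiocb marked cancelled. */
static bool model_bio_cancelled(const struct model_bio *bio)
{
	return bio && bio->bi_iocb && bio->bi_iocb->cancelled;
}

/*
 * Called when a request is about to be handed to the driver.
 * Returns true if the request should be issued, false if it was
 * ended early with -ECANCELED because its bio had been cancelled.
 */
static bool model_dequeue(struct model_request *rq)
{
	if (model_bio_cancelled(rq->bio)) {
		rq->status = -ECANCELED;
		rq->ended = true;
		return false;
	}
	return true;
}
```

In this model a clone becomes cancellable simply by copying the bi_iocb pointer, which matches the cloning remark in the message; nothing is attempted once the request has been issued.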
2013-06-17 | aio: Allow cancellation without a cancel callback, new kiocb lookup | Kent Overstreet
This patch does a couple of things:
* Allows cancellation of any kiocb, even if the driver doesn't implement a ki_cancel callback function. This will be used for block layer cancellation - there, implementing a callback is problematic, but we can implement useful cancellation by just checking if the kiocb has been marked as cancelled when it goes to dequeue the request.
* Implements a new lookup mechanism for cancellation. Previously, to cancel a kiocb we had to look it up in a linked list, and kiocbs were added to the linked list lazily. But if any kiocb is cancellable, the lazy list adding no longer works, so we need a new mechanism. This is done by allocating kiocbs out of a (lazily allocated) array of pages, which means we can refer to the kiocbs (and iterate over them) with small integers - we use the percpu tag allocation code for allocating individual kiocbs.
Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org>
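The "array of pages indexed by small integers" idea can be sketched in userspace as follows. Everything here is an assumption made for illustration (the model_* names, the 4096-byte page size, the four-page limit); the real patch takes its ids from the percpu tag allocator, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size */
#define MODEL_NR_PAGES  4u      /* assumed upper bound on pages */

/* Hypothetical stand-in for struct kiocb. */
struct model_kiocb {
	int in_use;
	int cancelled;
};

#define KIOCBS_PER_PAGE (MODEL_PAGE_SIZE / sizeof(struct model_kiocb))

struct model_ctx {
	struct model_kiocb *pages[MODEL_NR_PAGES]; /* allocated lazily */
};

/*
 * Map a small integer id to its kiocb slot, allocating the backing
 * page on first use. Returns NULL if the id is out of range or the
 * page cannot be allocated.
 */
static struct model_kiocb *model_kiocb_from_id(struct model_ctx *ctx, unsigned id)
{
	size_t page = id / KIOCBS_PER_PAGE;
	size_t slot = id % KIOCBS_PER_PAGE;

	if (page >= MODEL_NR_PAGES)
		return NULL;
	if (!ctx->pages[page]) {
		ctx->pages[page] = calloc(KIOCBS_PER_PAGE, sizeof(struct model_kiocb));
		if (!ctx->pages[page])
			return NULL;
	}
	return &ctx->pages[page][slot];
}

/* Walk every allocated page and mark in-use kiocbs as cancelled. */
static unsigned model_cancel_all(struct model_ctx *ctx)
{
	unsigned n = 0;

	for (size_t p = 0; p < MODEL_NR_PAGES; p++) {
		if (!ctx->pages[p])
			continue;
		for (size_t s = 0; s < KIOCBS_PER_PAGE; s++) {
			if (ctx->pages[p][s].in_use) {
				ctx->pages[p][s].cancelled = 1;
				n++;
			}
		}
	}
	return n;
}
```

Because every kiocb lives at a computable position, no list insertion is needed at submission time, and cancellation can find all of them by iterating over ids.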
2013-06-17 | block, aio: batch completion for bios/kiocbs | Kent Overstreet
When completing a kiocb, there's some fixed overhead from touching the ring buffer of the kioctx the kiocb belongs to. Some newer high end block devices can complete multiple IOs per interrupt, much like many network interfaces have been able to for some time. This plumbs through infrastructure so we can take advantage of multiple completions at the interrupt level, and complete multiple kiocbs at the same time. Drivers have to be converted to take advantage of this, but it's a simple change and the next patches will convert a few drivers.
To use it, an interrupt handler (or any code that completes bios or requests) declares and initializes a struct batch_complete:
    struct batch_complete batch;
    batch_complete_init(&batch);
Then, instead of calling bio_endio(), it calls bio_endio_batch(bio, err, &batch). This just adds the bio to a list in the batch_complete.
At the end, it calls
    batch_complete(&batch);
This completes all the bios at once, building up a list of kiocbs; then the list of kiocbs is completed all at once.
[akpm@linux-foundation.org: fix warning] [akpm@linux-foundation.org: fs/aio.c needs bio.h, move bio_endio_batch() declaration somewhere rational] [akpm@linux-foundation.org: fix warnings] [minchan@kernel.org: fix build error due to bio_endio_batch] [akpm@linux-foundation.org: fix tracepoint in batch_complete()] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
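The init / add / complete pattern from this message can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: the model_* names are hypothetical, the batch is a plain singly linked list, and the second stage (gathering kiocbs and completing them together) is reduced to a counter.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct bio. */
struct model_bio {
	struct model_bio *next;
	int error;
	int completed;
};

/* Hypothetical stand-in for struct batch_complete: a FIFO of bios. */
struct model_batch {
	struct model_bio *head;
	struct model_bio **tail;
};

static void model_batch_init(struct model_batch *batch)
{
	batch->head = NULL;
	batch->tail = &batch->head;
}

/* Models bio_endio_batch(): record the result and queue the bio. */
static void model_bio_endio_batch(struct model_bio *bio, int err,
				  struct model_batch *batch)
{
	bio->error = err;
	bio->next = NULL;
	*batch->tail = bio;
	batch->tail = &bio->next;
}

/*
 * Models batch_complete(): complete every queued bio in one pass and
 * return how many were completed. The batch is left empty.
 */
static unsigned model_batch_complete(struct model_batch *batch)
{
	unsigned n = 0;
	struct model_bio *bio = batch->head;

	while (bio) {
		struct model_bio *next = bio->next;

		bio->completed = 1;
		n++;
		bio = next;
	}
	model_batch_init(batch);
	return n;
}
```

The point of the pattern is that the per-completion fixed cost (in the kernel, touching the kioctx ring buffer) is paid once per batch rather than once per bio.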
2013-06-17 | block: prep work for batch completion | Kent Overstreet
Add a struct batch_complete * argument to bi_end_io; infrastructure to make use of it comes in the next patch. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2013-06-17 | aio: convert the ioctx list to radix tree | Kent Overstreet
On Wed, Jun 12, 2013 at 11:14:40AM -0700, Kent Overstreet wrote:
> On Mon, Apr 15, 2013 at 02:40:55PM +0300, Octavian Purdila wrote:
> > When using a large number of threads performing AIO operations the
> > IOCTX list may get a significant number of entries which will cause
> > significant overhead. For example, when running this fio script:
> >
> > rw=randrw; size=256k ;directory=/mnt/fio; ioengine=libaio; iodepth=1
> > blocksize=1024; numjobs=512; thread; loops=100
> >
> > on an EXT2 filesystem mounted on top of a ramdisk we can observe up to
> > 30% CPU time spent by lookup_ioctx:
> >
> > 32.51% [guest.kernel] [g] lookup_ioctx
> > 9.19% [guest.kernel] [g] __lock_acquire.isra.28
> > 4.40% [guest.kernel] [g] lock_release
> > 4.19% [guest.kernel] [g] sched_clock_local
> > 3.86% [guest.kernel] [g] local_clock
> > 3.68% [guest.kernel] [g] native_sched_clock
> > 3.08% [guest.kernel] [g] sched_clock_cpu
> > 2.64% [guest.kernel] [g] lock_release_holdtime.part.11
> > 2.60% [guest.kernel] [g] memcpy
> > 2.33% [guest.kernel] [g] lock_acquired
> > 2.25% [guest.kernel] [g] lock_acquire
> > 1.84% [guest.kernel] [g] do_io_submit
> >
> > This patchs converts the ioctx list to a radix tree. For a performance
> > comparison the above FIO script was run on a 2 sockets 8 core
> > machine. This are the results (average and %rsd of 10 runs) for the
> > original list based implementation and for the radix tree based
> > implementation:
> >
> > cores       1          2         4         8         16        32
> > list        109376 ms  69119 ms  35682 ms  22671 ms  19724 ms  16408 ms
> > %rsd        0.69%      1.15%     1.17%     1.21%     1.71%     1.43%
> > radix       73651 ms   41748 ms  23028 ms  16766 ms  15232 ms  13787 ms
> > %rsd        1.19%      0.98%     0.69%     1.13%     0.72%     0.75%
> > % of radix
> > relative    66.12%     65.59%    66.63%    72.31%    77.26%    83.66%
> > to list
> >
> > To consider the impact of the patch on the typical case of having
> > only one ctx per process the following FIO script was run:
> >
> > rw=randrw; size=100m ;directory=/mnt/fio; ioengine=libaio; iodepth=1
> > blocksize=1024; numjobs=1; thread; loops=100
> >
> > on the same system and the results are the following:
> >
> > list        58892 ms
> > %rsd        0.91%
> > radix       59404 ms
> > %rsd        0.81%
> > % of radix
> > relative    100.87%
> > to list
>
> So, I was just doing some benchmarking/profiling to get ready to send
> out the aio patches I've got for 3.11 - and it looks like your patch is
> causing a ~1.5% throughput regression in my testing :/
... <snip>
I've got an alternate approach for fixing this wart in lookup_ioctx()... Instead of using an rbtree, just use the reserved id in the ring buffer header to index an array pointing the ioctx. It's not finished yet, and it needs to be tidied up, but is most of the way there.
-ben
--
"Thought is the essence of where you are now."
--
And, a rework of Ben's code, but this was entirely his idea
-Kent
 fs/aio.c                 | 80 ++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/mm_types.h |  5 ++
 kernel/fork.c            |  4 ++
 3 files changed, 81 insertions(+), 8 deletions(-)
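The alternate approach (use an id stored in the ring header to index an array of ioctx pointers) can be sketched as a constant-time lookup with a validity check. The model_* types, the fixed table size and the helper names are assumptions for illustration; the message does not show the real fs/aio.c code.

```c
#include <stddef.h>

#define MODEL_TABLE_SIZE 8u  /* assumed fixed table size */

/* Hypothetical stand-in for the ring buffer header, which holds the id. */
struct model_ring {
	unsigned id;
};

/* Hypothetical stand-in for struct kioctx. */
struct model_ioctx {
	struct model_ring *ring;
};

struct model_table {
	struct model_ioctx *slots[MODEL_TABLE_SIZE];
};

/* Install ctx in the first free slot and record the slot in its ring header. */
static int model_ioctx_add(struct model_table *t, struct model_ioctx *ctx)
{
	for (unsigned i = 0; i < MODEL_TABLE_SIZE; i++) {
		if (!t->slots[i]) {
			t->slots[i] = ctx;
			ctx->ring->id = i;
			return (int)i;
		}
	}
	return -1;
}

/*
 * Constant-time lookup: read the id from the ring header, bounds-check
 * it, then confirm the slot really belongs to this ring, since the id
 * in the header may be stale or corrupted.
 */
static struct model_ioctx *model_lookup_ioctx(const struct model_table *t,
					      const struct model_ring *ring)
{
	unsigned id = ring->id;
	struct model_ioctx *ctx;

	if (id >= MODEL_TABLE_SIZE)
		return NULL;
	ctx = t->slots[id];
	return (ctx && ctx->ring == ring) ? ctx : NULL;
}
```

Compared with walking a list, or even descending a tree, the lookup cost no longer depends on how many contexts the process has created.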
2013-06-17 | aio: Kill ki_dtor | Kent Overstreet
sock_aio_dtor() is dead code - and stuff that does need to do cleanup can simply do it before calling aio_complete(). Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Theodore Ts'o <tytso@mit.edu>
2013-06-17 | aio: Kill ki_users | Kent Overstreet
The kiocb refcount is only needed for cancellation - to ensure a kiocb isn't freed while a ki_cancel callback is running. But if we restrict ki_cancel callbacks to not block (which they currently don't), we can simply drop the refcount. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Theodore Ts'o <tytso@mit.edu>
2013-06-17 | aio: Kill unneeded kiocb members | Kent Overstreet
The old aio retry infrastructure needed to save the various arguments to aio operations. But with the retry infrastructure gone, we can trim struct kiocb quite a bit. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Theodore Ts'o <tytso@mit.edu>
2013-06-17 | aio: Kill aio_rw_vect_retry() | Kent Overstreet
This code doesn't serve any purpose anymore, since the aio retry infrastructure has been removed. This change should be safe because aio_read/write are also used for synchronous IO, and called from do_sync_read()/do_sync_write() - and there's no looping done in the sync case (the read and write syscalls). Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org>
2013-06-17 | aio: io_cancel() no longer returns the io_event | Kent Overstreet
Originally, io_cancel() was documented to return the io_event if cancellation succeeded - the io_event wouldn't be delivered via the ring buffer like it normally would. But this isn't what the implementation was actually doing; the only driver implementing cancellation, the usb gadget code, never returned an io_event in its cancel function. And aio_complete() was recently changed to no longer suppress event delivery if the kiocb had been cancelled. This gets rid of the unused io_event argument to kiocb_cancel() and kiocb->ki_cancel(), and changes io_cancel() to return -EINPROGRESS if kiocb->ki_cancel() returned success. Also tweak the refcounting in kiocb_cancel() to make more sense. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org>
2013-06-17 | idr: Percpu ida | Kent Overstreet
Percpu frontend for allocating ids. With percpu allocation (that works), it's impossible to guarantee it will always be possible to allocate all nr_tags - typically, some will be stuck on a remote percpu freelist where the current job can't get to them. We do guarantee that it will always be possible to allocate at least (nr_tags / 2) tags - this is done by keeping track of which and how many cpus have tags on their percpu freelists. On allocation failure if enough cpus have tags that there could potentially be (nr_tags / 2) tags stuck on remote percpu freelists, we then pick a remote cpu at random to steal from. Note that the synchronization is _definitely_ tricky - we're using xchg()/cmpxchg() on the percpu lists, to synchronize between steal_tags(). The alternative would've been adding a spinlock to protect the percpu freelists, but that would've required some different tricky code to avoid deadlock because of the lock ordering. Note that there's no cpu hotplug notifier - we don't care, because steal_tags() will eventually get the down cpu's tags. We _could_ satisfy more allocations if we had a notifier - but we'll still meet our guarantees and it's absolutely not a correctness issue, so I don't think it's worth the extra code. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
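The allocation order described here (local freelist first, then the shared pool, then stealing from a remote cpu) can be shown with a deliberately simplified, single-threaded model. Everything in it is an assumption for illustration: the model_* names, the fixed sizes, and the round-robin choice of victim (the message says the real code picks a remote cpu at random). The xchg()/cmpxchg() synchronization and the (nr_tags / 2) accounting are not modelled at all.

```c
#include <assert.h>
#include <string.h>

#define MODEL_NR_CPUS 4
#define MODEL_NR_TAGS 16

struct model_freelist {
	unsigned nr;
	int tags[MODEL_NR_TAGS];
};

struct model_tag_pool {
	struct model_freelist global;               /* shared pool */
	struct model_freelist cpu[MODEL_NR_CPUS];   /* per-cpu freelists */
};

static void model_pool_init(struct model_tag_pool *p)
{
	memset(p, 0, sizeof(*p));
	for (int i = 0; i < MODEL_NR_TAGS; i++)
		p->global.tags[p->global.nr++] = i;
}

/* Move every tag from the first non-empty remote freelist to this cpu. */
static void model_steal_tags(struct model_tag_pool *p, int cpu)
{
	struct model_freelist *local = &p->cpu[cpu];

	for (int i = 1; i < MODEL_NR_CPUS; i++) {
		struct model_freelist *remote = &p->cpu[(cpu + i) % MODEL_NR_CPUS];

		if (!remote->nr)
			continue;
		while (remote->nr)
			local->tags[local->nr++] = remote->tags[--remote->nr];
		return;
	}
}

/* Returns a tag, or -1 if none can be found anywhere. */
static int model_tag_alloc(struct model_tag_pool *p, int cpu)
{
	struct model_freelist *local = &p->cpu[cpu];

	if (!local->nr && p->global.nr)
		local->tags[local->nr++] = p->global.tags[--p->global.nr];
	if (!local->nr)
		model_steal_tags(p, cpu);
	return local->nr ? local->tags[--local->nr] : -1;
}

/* Freed tags go back to the freelist of the cpu that frees them. */
static void model_tag_free(struct model_tag_pool *p, int cpu, int tag)
{
	struct model_freelist *local = &p->cpu[cpu];

	assert(local->nr < MODEL_NR_TAGS);
	local->tags[local->nr++] = tag;
}
```

The model also shows why tags can get "stuck": a tag freed on one cpu stays on that cpu's freelist until some other cpu runs dry and steals it.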
2013-06-17 | idr: Rewrite ida | Kent Overstreet
This is a new, from scratch implementation of ida that should be simpler, faster and more space efficient. Two primary reasons for the rewrite:
* A future patch will reimplement idr on top of this ida implementation + radix trees. Once that's done, the end result will be ~1k fewer lines of code, much simpler and easier to understand and it should be quite a bit faster.
* The performance improvements and addition of ganged allocation should make ida more suitable for use by a percpu id/tag allocator, which would then act as a frontend to this allocator.
The old ida implementation was done with the idr data structures - this was IMO backwards. I'll soon be reimplementing idr on top of this new ida implementation and radix trees - using a separate dedicated data structure for the free ID bitmap should actually make idr faster, and the end result is _significantly_ less code. This implementation conceptually isn't that different from the old one - it's a tree of bitmaps, where one bit in a given node indicates whether or not there are free bits in a child node. The main difference (and advantage) over the old version is that the tree isn't implemented with pointers - it's implemented in an array, like how heaps are implemented, which both gives better space efficiency and will be faster since there's no pointer chasing. This does mean that the entire bitmap is stored in one contiguous memory allocation - and as currently implemented we won't be able to allocate _quite_ as many ids as with the previous implementation. I don't expect this to be an issue in practice since anywhere this is used, an id corresponds to a struct allocation somewhere else - we can't allocate an unbounded number of ids, we'll run out of memory somewhere else eventually, and I expect that to be the limiting factor in practice. 
If a user/use case does come up where this matters I can add some sharding (or perhaps add a separate big_ida implementation) - but the extra complexity would adversely affect performance for the users that don't need > millions of ids, so I intend to leave the implementation as is until if and when this becomes an issue. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Fengguang Wu <fengguang.wu@intel.com>
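A toy version of the "tree of bitmaps stored in an array" layout may help. This is a sketch with assumptions chosen for brevity: a fixed two-level tree with fanout 8 (so 64 ids), hypothetical model_* names, and the opposite bit polarity from the message (a set bit in the root means the child is full, which lets a zeroed structure start out empty).

```c
#include <stdint.h>

#define MODEL_FANOUT   8
#define MODEL_NR_NODES (1 + MODEL_FANOUT)  /* one root plus eight leaves */

/*
 * nodes[0] is the root; nodes[1 + k] is leaf k, as in an array-backed heap.
 * Leaf bit set   = that id is allocated.
 * Root bit k set = leaf k has no free ids left.
 */
struct model_ida {
	uint8_t nodes[MODEL_NR_NODES];
};

static int model_first_zero_bit(uint8_t word)
{
	for (int i = 0; i < MODEL_FANOUT; i++)
		if (!(word & (1u << i)))
			return i;
	return -1;
}

/* Returns the lowest free id, or -1 when all 64 ids are in use. */
static int model_ida_alloc(struct model_ida *ida)
{
	int k = model_first_zero_bit(ida->nodes[0]);
	uint8_t *leaf;
	int bit;

	if (k < 0)
		return -1;
	leaf = &ida->nodes[1 + k];
	bit = model_first_zero_bit(*leaf);  /* non-negative: root says leaf k has room */
	*leaf = (uint8_t)(*leaf | (1u << bit));
	if (*leaf == 0xff)
		ida->nodes[0] = (uint8_t)(ida->nodes[0] | (1u << k));
	return k * MODEL_FANOUT + bit;
}

static void model_ida_free(struct model_ida *ida, int id)
{
	int k = id / MODEL_FANOUT;
	int bit = id % MODEL_FANOUT;

	ida->nodes[1 + k] = (uint8_t)(ida->nodes[1 + k] & ~(1u << bit));
	ida->nodes[0] = (uint8_t)(ida->nodes[0] & ~(1u << k));  /* leaf has room again */
}
```

Child positions are computed from the parent index instead of being followed through pointers, which is the space and speed argument made in the message; the whole structure is also one contiguous allocation, which is the size limitation it mentions.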
2013-06-17 | idr: Rename ida_simple_get() -> ida_alloc_range() | Kent Overstreet
The old ida interfaces that didn't do locking have been removed, the "simple" distinction doesn't make sense anymore. Also, add an ida_alloc() wrapper that doesn't take the start and end parameters. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: David Airlie <airlied@linux.ie> Cc: Jean Delvare <khali@linux-fr.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jonathan Cameron <jic23@cam.ac.uk> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Cc: Jens Taprogge <jens.taprogge@taprogge.org> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Ohad Ben-Cohen <ohad@wizery.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: Benny Halevy <bhalevy@tonian.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Evgeniy Polyakov <zbr@ioremap.net> Cc: Wim Van Sebroeck <wim@iguana.be> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: "David S. 
Miller" <davem@davemloft.net> Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org> Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org> Cc: Samuel Ortiz <sameo@linux.intel.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Varun Sethi <Varun.Sethi@freescale.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Alan Cox <alan@linux.intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org>
2013-06-17 | idr: Convert code to ida_simple_get() | Kent Overstreet
A lot of drivers were open coding ida_simple_get() - by converting them we can get rid of (crappy) ida_pre_get(), ida_get_new(), ida_remove() interfaces. The ida_simple_*() interfaces do their own locking, which means we can remove a fair amount of code. Additionally, some code was open coding ida_get_cyclic() (c.f. idr_alloc_cyclic()) - after digging into the git logs this seems to have entirely been a performance optimization. This patch removes the cyclic allocation - cyclic allocation as performance optimization is rather sketchy, and in a couple patches we're reimplementing ida from scratch and the new implementation should be a good deal faster. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: David Airlie <airlied@linux.ie> Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: Benny Halevy <bhalevy@tonian.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Li Zefan <lizefan@huawei.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: "Paul E. 
McKenney" <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dave Jones <davej@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Varun Sethi <Varun.Sethi@freescale.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Alan Cox <alan@linux.intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Mauro Carvalho Chehab <mchehab@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
2013-06-03 | percpu: implement generic percpu refcounting | Kent Overstreet
This implements a refcount with similar semantics to atomic_inc()/atomic_dec_and_test() - but percpu. It also implements two stage shutdown, as we need it to tear down the percpu counts. Before dropping the initial refcount, you must call percpu_ref_kill(); this puts the refcount in "shutting down mode" and switches back to a single atomic refcount with the appropriate barriers (synchronize_rcu()). It's also legal to call percpu_ref_kill() multiple times - it only returns true once, so callers don't have to reimplement shutdown synchronization. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: coding-style tweak] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Tejun Heo <tj@kernel.org>
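The two-stage shutdown can be illustrated with a single-threaded model. It is only a sketch: the model_* names and the fixed cpu count are hypothetical, plain longs stand in for the percpu and atomic counters, and the barriers (synchronize_rcu()) that make the real mode switch safe are not modelled.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

struct model_percpu_ref {
	long shared;                /* the single count used after kill */
	long pcpu[MODEL_NR_CPUS];   /* per-cpu deltas used before kill */
	bool dead;
};

/* Starts with one reference: the initial ref held by the owner. */
static void model_ref_init(struct model_percpu_ref *ref)
{
	ref->shared = 1;
	ref->dead = false;
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		ref->pcpu[i] = 0;
}

static void model_ref_get(struct model_percpu_ref *ref, int cpu)
{
	if (!ref->dead)
		ref->pcpu[cpu]++;
	else
		ref->shared++;
}

/*
 * Returns true only when the last reference is dropped, which can
 * only be observed after kill has folded the per-cpu deltas.
 */
static bool model_ref_put(struct model_percpu_ref *ref, int cpu)
{
	if (!ref->dead) {
		ref->pcpu[cpu]--;   /* one cpu's delta may go negative */
		return false;
	}
	return --ref->shared == 0;
}

/*
 * Switch to the single shared count by folding in every per-cpu delta.
 * Returns true the first time it is called and false afterwards.
 */
static bool model_ref_kill(struct model_percpu_ref *ref)
{
	if (ref->dead)
		return false;
	ref->dead = true;
	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		ref->shared += ref->pcpu[i];
		ref->pcpu[i] = 0;
	}
	return true;
}
```

While the ref is live no single counter knows the true total, which is why a put cannot report "last reference" until kill has been called; a get on one cpu and the matching put on another simply cancel out when the deltas are summed.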
2013-06-03 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux | Linus Torvalds
Pull s390 fixes from Martin Schwidefsky:
"Recent bug fixes, one of them touches a common code file. It adds two #ifndef/#endif pairs to asm-generic/io.h to be able to override xlate_dev_kmem_ptr and xlate_dev_mem_ptr."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pgtable: Fix gmap notifier address
  s390/dasd: fix handling of gone paths
  s390/pgtable: Fix check for pgste/storage key handling
  arch: s390: appldata: using strncpy() and strnlen() instead of sprintf()
  s390/smp: lost IPIs on cpu hotplug
  kernel: Fix s390 absolute memory access for /dev/mem
  s390/dma: do not call debug_dma after free
2013-06-03 | Merge branch 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup | Linus Torvalds
Pull cgroup fixes from Tejun Heo:
- Fix for yet another xattr bug which may lead to NULL deref.
- A subtle bug in for_each_descendant_pre(). This bug requires quite specific conditions to trigger and isn't too likely to actually happen in the wild, but maybe that just makes it that much more nastier.
- A warning message added for silly cgroup re-mount (not -o remount, but unmount followed by mount) behavior.
* 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: warn about mismatching options of a new mount of an existing hierarchy
  cgroup: fix a subtle bug in descendant pre-order walk
  cgroup: initialize xattr before calling d_instantiate()
2013-06-01 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending | Linus Torvalds
Pull scsi target fixes from Nicholas Bellinger:
"The highlights include:
- Re-instate sess->wait_list in target_wait_for_sess_cmds() for active I/O shutdown handling in fabrics using se_cmd->cmd_kref
- Make ib_srpt call target_sess_cmd_list_set_waiting() during session shutdown
- Fix FILEIO off-by-one READ_CAPACITY bug for !S_ISBLK export
- Fix iscsi-target login error heap buffer overflow (Kees)
- Fix iscsi-target active I/O shutdown handling regression in v3.10-rc1
A big thanks to Kees Cook for fixing a long standing login error buffer overflow bug. All patches are CC'ed to stable with the exception of the v3.10-rc1 specific regression + other minor target cleanup."
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Fix iscsit_free_cmd() se_cmd->cmd_kref shutdown handling
  target: Propigate up ->cmd_kref put return via transport_generic_free_cmd
  iscsi-target: fix heap buffer overflow on error
  target/file: Fix off-by-one READ_CAPACITY bug for !S_ISBLK export
  ib_srpt: Call target_sess_cmd_list_set_waiting during shutdown_session
  target: Re-instate sess_wait_list for target_wait_for_sess_cmds
  target: Remove unused wait_for_tasks bit in target_wait_for_sess_cmds
2013-06-01 | Merge tag 'fbdev-for-3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/plagnioj/linux-fbdev | Linus Torvalds
Pull fbdev fixes from Jean-Christophe PLAGNIOL-VILLARD:
"This contains some small fixes
- Atmel LCDC: fix blank the backlight on remove
- ps3fb: fix compile warning
- OMAPDSS: Fix crash with DT boot"
* tag 'fbdev-for-3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/plagnioj/linux-fbdev:
  atmel_lcdfb: blank the backlight on remove
  trivial: atmel_lcdfb: add missing error message
  OMAPDSS: Fix crash with DT boot
  fbdev/ps3fb: fix compile warning
2013-06-01 | Merge tag 'please-pull-aertracefix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras | Linus Torvalds
Pull aer error logging fix from Tony Luck:
"Can't call pci_get_domain_bus_and_slot() from interrupt context"
* tag 'please-pull-aertracefix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  aerdrv: Move cper_print_aer() call out of interrupt context
2013-05-31 | target: Propigate up ->cmd_kref put return via transport_generic_free_cmd | Nicholas Bellinger
Go ahead and propagate up the ->cmd_kref put return value from target_put_sess_cmd() -> transport_release_cmd() -> transport_put_cmd() -> transport_generic_free_cmd(). This is useful for certain fabrics when determining the active I/O shutdown case with SCF_ACK_KREF where a final target_put_sess_cmd() is still required by the caller. Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-05-31 | Merge tag 'stable/for-linus-3.10-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen | Linus Torvalds
Pull Xen fixes from Konrad Rzeszutek Wilk:
- Use proper error paths
- Clean up APIC IPI usage (incorrect arguments)
- Delay XenBus frontend resume if backend (xenstored) is not running
- Fix build error with various combinations of CONFIG_
* tag 'stable/for-linus-3.10-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xenbus_client.c: correct exit path for xenbus_map_ring_valloc_hvm
  xen-pciback: more uses of cached MSI-X capability offset
  xen: Clean up apic ipi interface
  xenbus: save xenstore local status for later use
  xenbus: delay xenbus frontend resume if xenstored is not running
  xmem/tmem: fix 'undefined variable' build error.
2013-05-30 | aerdrv: Move cper_print_aer() call out of interrupt context | Lance Ortiz
The following warning was seen on 3.9 when a corrected PCIe error was being handled by the AER subsystem. WARNING: at .../drivers/pci/search.c:214 pci_get_dev_by_id+0x8a/0x90() This occurred because a call to pci_get_domain_bus_and_slot() was added to cper_print_pcie() to setup for the call to cper_print_aer(). The warning showed up because cper_print_pcie() is called in an interrupt context and pci_get* functions are not supposed to be called in that context. The solution is to move the cper_print_aer() call out of the interrupt context and into aer_recover_work_func() to avoid any warnings when calling pci_get* functions. Signed-off-by: Lance Ortiz <lance.ortiz@hp.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-05-29 | target: Re-instate sess_wait_list for target_wait_for_sess_cmds | Nicholas Bellinger
Switch back to pre commit 1c7b13fe652 list splicing logic for active I/O shutdown with tcm_qla2xxx + ib_srpt fabrics. The original commit was done under the incorrect assumption that it's safe to walk se_sess->sess_cmd_list unprotected in target_wait_for_sess_cmds() after sess->sess_tearing_down = 1 has been set by target_sess_cmd_list_set_waiting() during session shutdown. So instead of adding sess->sess_cmd_lock protection around sess->sess_cmd_list during target_wait_for_sess_cmds(), switch back to sess->sess_wait_list to allow wait_for_completion() + TFO->release_cmd() to occur without having to walk ->sess_cmd_list after the list_splice. Also add a check to exit if target_sess_cmd_list_set_waiting() has already been called, and add a WARN_ON to check for any fabric bug where new se_cmds are added to sess->sess_cmd_list after sess->sess_tearing_down = 1 has already been set. Cc: Joern Engel <joern@logfs.org> Cc: Roland Dreier <roland@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-05-29 | xenbus: delay xenbus frontend resume if xenstored is not running | Aurelien Chartier
If the xenbus frontend is located in a domain running xenstored, the device resume is hanging because it is happening before the process resume. This patch adds extra logic to the resume code to check if we are the domain running xenstored and delay the resume if needed. Signed-off-by: Aurelien Chartier <aurelien.chartier@citrix.com> [Changes in v2: - Instead of bypassing the resume, process it in a workqueue] [Changes in v3: - Add a struct work in xenbus_device to avoid dynamic allocation - Several small code fixes] [Changes in v4: - Use a dedicated workqueue] [Changes in v5: - Move create_workqueue error handling to xenbus_frontend_dev_resume] Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-05-29Merge branch 'fbdev-3.10-fixes' of git://gitorious.org/linux-omap-dss2/linux ↵Jean-Christophe PLAGNIOL-VILLARD
into linux-fbdev/for-3.10-fixes Pull Tomi fixes for ps3fb and omap2 Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
2013-05-25Merge tag 'pm+acpi-3.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: - Additional CPU ID for the intel_pstate driver from Dirk Brandewie. - More cpufreq fixes related to ARM big.LITTLE support and locking from Viresh Kumar. - VIA C7 cpufreq build fix from Rafał Bilski. - ACPI power management fix making it possible to use device power states regardless of the CONFIG_PM setting from Rafael J Wysocki. - New ACPI video blacklist item from Bastian Triller. * tag 'pm+acpi-3.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / video: Add "Asus UL30A" to ACPI video detect blacklist cpufreq: arm_big_little_dt: Instantiate as platform_driver cpufreq: arm_big_little_dt: Register driver only if DT has valid data cpufreq / e_powersaver: Fix linker error when ACPI processor is a module cpufreq / intel_pstate: Add additional supported CPU ID cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT ACPI / PM: Allow device power states to be used for CONFIG_PM unset
2013-05-25Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds
Pull slave-dma fixes from Vinod Koul: "We have two patches from Andy & Rafael fixing the Lynxpoint dma" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: ACPI / LPSS: register clock device for Lynxpoint DMA properly dma: acpi-dma: parse CSRT to extract additional resources
2013-05-24Merge branch 'akpm' (incoming from Andrew Morton)Linus Torvalds
Merge fixes from Andrew Morton: "A bunch of fixes and one simple fbdev driver which missed the merge window because people were still talking about it (to no great effect)." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (30 commits) aio: fix kioctx not being freed after cancellation at exit time mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas drivers/rtc/rtc-max8998.c: check for pdata presence before dereferencing ocfs2: goto out_unlock if ocfs2_get_clusters_nocache() failed in ocfs2_fiemap() random: fix accounting race condition with lockless irq entropy_count update drivers/char/random.c: fix priming of last_data mm/memory_hotplug.c: fix printk format warnings nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary drivers/block/brd.c: fix brd_lookup_page() race fbdev: FB_GOLDFISH should depend on HAS_DMA drivers/rtc/rtc-pl031.c: pass correct pointer to free_irq() auditfilter.c: fix kernel-doc warnings aio: fix io_getevents documentation revert "selftest: add simple test for soft-dirty bit" drivers/leds/leds-ot200.c: fix error caused by shifted mask mm/THP: use pmd_populate() to update the pmd with pgtable_t pointer linux/kernel.h: fix kernel-doc warning mm compaction: fix of improper cache flush in migration code rapidio/tsi721: fix bug in MSI interrupt handling hfs: avoid crash in hfs_bnode_create ...
2013-05-24Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "We didn't have any fixes sent up for -rc2, so this is a slightly larger batch. A bit all over the place platform-wise; OMAP, at91, marvell, renesas, sunxi, ux500, etc. I tried to summarize highlights but there isn't a whole lot to point out. Lots of little things fixed all over. A couple of defconfig updates due to new/changing options." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (44 commits) ARM: at91/sama5: fix incorrect PMC pcr div definition ARM: at91/dt: fix macb pinctrl_macb_rmii_mii_alt definition ARM: at91: at91sam9n12: move external irq declatation to DT ARM: shmobile: marzen: Use error values in usb_power_* ARM: tegra: defconfig fixes ARM: nomadik: fix IRQ assignment for SMC ethernet ARM: vt8500: Add missing NULL terminator in dt_compat clk: tegra: add ac97 controller clock clk: tegra: remove USB from clk init table ARM: dts: mvebu: Fix wrong the address reg value for the L2-cache node ARM: plat-orion: Fix num_resources and id for ge10 and ge11 ARM: OMAP2+: hwmod: Remove sysc slave idle and auto idle apis SERIAL: OMAP: Remove the slave idle handling from the driver ARM: OMAP2+: serial: Remove the un-used slave idle hooks ARM: OMAP2+: hwmod-data: UART IP needs software control to manage sidle modes ARM: OMAP2+: hwmod: Add a new flag to handle SIDLE in SWSUP only in active ARM: OMAP2+: hwmod: Fix sidle programming in _enable_sysc()/_idle_sysc() arm: mvebu: fix the 'ranges' property to handle PCIe ARM: mvebu: select ARCH_REQUIRE_GPIOLIB for mvebu platform ARM: AM33XX: Add missing .clkdm_name to clkdiv32k_ick clock ...
2013-05-24linux/kernel.h: fix kernel-doc warningRandy Dunlap
Fix kernel-doc warning in <linux/kernel.h>: Warning(include/linux/kernel.h:590): No description found for parameter 'ip' scripts/kernel-doc cannot handle macros, functions, or function prototypes between the function or macro that is being documented and its definition, so move these prototypes above the function that is being documented. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-24wait: fix false timeouts when using wait_event_timeout()Imre Deak
Many callers of wait_event_timeout() and wait_event_interruptible_timeout() expect that the return value will be positive if the specified condition becomes true before the timeout elapses. However, at the moment this isn't guaranteed. If the wake-up handler is delayed enough, the time remaining until timeout will be calculated as 0 - and passed back as a return value - even if the condition became true before the timeout has passed. Fix this by returning at least 1 if the condition becomes true. This semantic is in line with what wait_for_completion_timeout() does; see commit bb10ed09 ("sched: fix wait_for_completion_timeout() spurious failure under heavy load"). Daniel said "We have 3 instances of this bug in drm/i915. One case even where we switch between the interruptible and not interruptible wait_event_timeout variants, foolishly presuming they have the same semantics. I very much like this." One such bug is reported at https://bugs.freedesktop.org/show_bug.cgi?id=64133 Signed-off-by: Imre Deak <imre.deak@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Jens Axboe <axboe@kernel.dk> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dave Jones <davej@redhat.com> Cc: Lukas Czerner <lczerner@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
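The return-value rule described above can be modelled in plain userspace C. This is only a sketch of the semantics, not the kernel macro: the helper name timeout_result() and its two parameters are hypothetical, chosen for illustration.

```c
#include <stdbool.h>

/*
 * Userspace model of the return-value fixup, not the kernel macro itself.
 * 'remaining' stands for the time left (in jiffies) when the waiter finally
 * runs; 'condition' is the awaited condition evaluated at that moment.
 */
static long timeout_result(long remaining, bool condition)
{
	/* Never report 0 ("timed out") if the condition actually holds. */
	if (remaining == 0 && condition)
		return 1;

	/* Otherwise pass the remaining time through unchanged. */
	return remaining;
}
```

With this rule a caller can treat any positive return value as "condition became true" and 0 as a genuine timeout, which is what the callers mentioned in the commit message already assumed.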
2013-05-24rapidio: add enumeration/discovery start from user spaceAlexandre Bounine
Add RapidIO enumeration/discovery start from user space. Starting from user space makes it possible to defer the RapidIO fabric scan until all participating endpoints are initialized, avoiding a mandatory synchronized start of all endpoints (which may be challenging in systems with a large number of RapidIO endpoints). Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Andre van Herk <andre.van.herk@Prodrive.nl> Cc: Micha Nelissen <micha.nelissen@Prodrive.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-24rapidio: make enumeration/discovery configurableAlexandre Bounine
Systems that use RapidIO fabric may need to implement their own enumeration and discovery methods which are better suited to the needs of a target application. The following set of patches is intended to simplify the process of introducing new RapidIO fabric enumeration/discovery methods. The first patch offers the ability to add new RapidIO enumeration/discovery methods using kernel configuration options. This new configuration option mechanism allows selecting statically linked or modular enumeration/discovery method(s) from the list of existing methods, or using external module(s). This patch also updates the currently existing enumeration/discovery code to be used as a statically linked or modular method. The corresponding configuration option is named "Basic enumeration/discovery" method. This is the only configuration option available today but new methods are expected to be introduced after adoption of the provided patches. The second patch addresses a long-standing complaint of RapidIO subsystem users regarding the fabric enumeration/discovery start sequence. The existing implementation offers only a boot-time enumeration/discovery start, which requires a synchronized boot of all endpoints in the RapidIO network. While it works for small closed configurations with a limited number of endpoints, using this approach in systems with a large number of endpoints is quite challenging. To eliminate the requirement for a synchronized start, the second patch introduces RapidIO enumeration/discovery start from user space. For compatibility with the existing RapidIO subsystem implementation, automatic boot-time enumeration/discovery start can be configured in by specifying the "rio-scan.scan=1" command line parameter if the statically linked basic enumeration method is selected. This patch: Rework to implement RapidIO enumeration/discovery method selection combined with the ability to use enumeration/discovery as a kernel module.
This patch adds ability to introduce new RapidIO enumeration/discovery methods using kernel configuration options. Configuration option mechanism allows to select statically linked or modular enumeration/discovery method from the list of existing methods or use external modules. If a modular enumeration/discovery is selected each RapidIO mport device can have its own method attached to it. The existing enumeration/discovery code was updated to be used as statically linked or modular method. This configuration option is named "Basic enumeration/discovery" method. Several common routines have been moved from rio-scan.c to make them available to other enumeration methods and reduce number of exported symbols. Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Andre van Herk <andre.van.herk@Prodrive.nl> Cc: Micha Nelissen <micha.nelissen@Prodrive.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "It's been a while since my last pull request so quite a few fixes have piled up." Indeed. 1) Fix nf_{log,queue} compilation with PROC_FS disabled, from Pablo Neira Ayuso. 2) Fix data corruption on some tg3 chips with TSO enabled, from Michael Chan. 3) Fix double insertion of VLAN tags in be2net driver, from Sarveshwar Bandi. 4) Don't have TCP's MD5 support pass > PAGE_SIZE page offsets in scatter-gather entries into the crypto layer, the crypto layer can't handle that. From Eric Dumazet. 5) Fix lockdep splat in 802.1Q MRP code, also from Eric Dumazet. 6) Fix OOPS in netfilter log module when called from conntrack, from Hans Schillstrom. 7) FEC driver needs to use netif_tx_{lock,unlock}_bh() rather than the non-BH disabling variants. From Fabio Estevam. 8) TCP GSO can generate out-of-order packets, fix from Eric Dumazet. 9) vxlan driver doesn't update 'used' field of fdb entries when it should, from Sridhar Samudrala. 10) ipv6 should use kzalloc() to allocate inet6 socket cork options, otherwise we can OOPS in ip6_cork_release(). From Eric Dumazet. 11) Fix races in bonding set mode, from Nikolay Aleksandrov. 12) Fix checksum generation regression added by "r8169: fix 8168evl frame padding.", from Francois Romieu. 13) ip_gre can look at stale SKB data pointer, fix from Eric Dumazet. 14) Fix checksum handling when GSO is enabled in bnx2x driver with certain chips, from Yuval Mintz. 15) Fix double free in batman-adv, from Martin Hundebøll. 16) Fix device startup synchronization with firmware in tg3 driver, from Nithin Sujit. 17) perf networking dropmonitor doesn't work at all due to mixed up trace parameter ordering, from Ben Hutchings. 18) Fix proportional rate reduction handling in tcp_ack(), from Nandita Dukkipati. 19) IPSEC layer doesn't return an error when a valid state is detected, causing an OOPS. Fix from Timo Teräs. 
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (85 commits) be2net: bug fix on returning an invalid nic descriptor tcp: xps: fix reordering issues net: Revert unused variable changes. xfrm: properly handle invalid states as an error virtio_net: enable napi for all possible queues during open tcp: bug fix in proportional rate reduction. net: ethernet: sun: drop unused variable net: ethernet: korina: drop unused variable net: ethernet: apple: drop unused variable qmi_wwan: Added support for Cinterion's PLxx WWAN Interface perf: net_dropmonitor: Remove progress indicator perf: net_dropmonitor: Use bisection in symbol lookup perf: net_dropmonitor: Do not assume ordering of dictionaries perf: net_dropmonitor: Fix symbol-relative addresses perf: net_dropmonitor: Fix trace parameter order net: fec: use a more proper compatible string for MVF type device qlcnic: Fix updating netdev->features qlcnic: remove netdev->trans_start updates within the driver qlcnic: Return proper error codes from probe failure paths tg3: Update version to 3.132 ...
2013-05-24cgroup: fix a subtle bug in descendant pre-order walkTejun Heo
When cgroup_next_descendant_pre() initiates a walk, it checks whether the subtree root has any children and returns NULL if it has none. Later code assumes that the subtree isn't empty. This is broken because the subtree may become empty in between, which can lead to the traversal escaping the subtree by walking to the sibling of the subtree root. There's no reason to have the early exit path. Remove it along with the later assumption that the subtree isn't empty. This simplifies the code a bit and fixes the subtle bug. While at it, fix the comment of cgroup_for_each_descendant_pre() which was incorrectly referring to ->css_offline() instead of ->css_online(). Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: stable@vger.kernel.org
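A pre-order walk without the early exit can be sketched on a generic tree. This is a simplified userspace illustration, not the cgroup code: struct node, its three pointer fields and next_descendant_pre() are hypothetical names, and the RCU and locking concerns of the real walker are left out.

```c
#include <stddef.h>

/* Hypothetical minimal tree node; the real cgroup structures differ. */
struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/*
 * Return the node that follows 'pos' in a pre-order walk of the subtree
 * rooted at 'root', or NULL when the walk is finished. Pass pos == NULL
 * to start; the root itself is not visited.
 */
static struct node *next_descendant_pre(struct node *pos, struct node *root)
{
	/* Starting the walk: behave as if the root had just been visited. */
	if (!pos)
		pos = root;

	/* Descend to the first child if there is one. */
	if (pos->first_child)
		return pos->first_child;

	/* No child: climb until a sibling exists, but never past the root. */
	while (pos != root) {
		if (pos->next_sibling)
			return pos->next_sibling;
		pos = pos->parent;
	}

	return NULL;
}
```

Because the loop condition is tested before any sibling is looked at, an empty subtree simply yields NULL and the walk never reaches the root's own sibling, with no separate emptiness check needed.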
2013-05-23Merge tag 'pci-v3.10-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Here are some more fixes for v3.10. The Moorestown update broke Intel Medfield devices, so I reverted it. The acpiphp change fixes a regression: we broke hotplug notifications to host bridges when we split acpiphp into the host-bridge related part and the endpoint-related part. Moorestown Revert "x86/pci/mrst: Use configuration mechanism 1 for 00:00.0, 00:02.0, 00:03.0" Hotplug PCI: acpiphp: Re-enumerate devices when host bridge receives Bus Check" * tag 'pci-v3.10-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "x86/pci/mrst: Use configuration mechanism 1 for 00:00.0, 00:02.0, 00:03.0" PCI: acpiphp: Re-enumerate devices when host bridge receives Bus Check
2013-05-23Merge tag 'tty-3.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg Kroah-Hartman: "Here are some tty / serial driver fixes for 3.10-rc2. Nothing huge, although the rocket driver fix looks large, it's just moving the code around to fix the reported build issues in it. Other than that, this has the fix for the oft-reported lockdep warning from the vt layer, as well as some other needed bugfixes. All of these have been in linux-next for a while" * tag 'tty-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: mxser: Fix build warning introduced by dfc7b837c7f9 (Re: linux-next: build warning after merge of the tty.current tree) tty: mxser: fix usage of opmode_ioaddr serial: 8250_dw: add ACPI ID for Intel BayTrail TTY: Fix tty miss restart after we turn off flow-control tty/vt: Fix vc_deallocate() lock order TTY: ehv_bytechan: add missing platform_driver_unregister() when module exit TTY: rocket, fix more no-PCI warnings serial: mcf: missing uart_unregister_driver() on error in mcf_init() tty: serial: mpc5xxx: fix error handing in mpc52xx_uart_init() serial: samsung: add missing platform_driver_unregister() when module exit serial: pl011: protect attribute read from NULL platform data struct tty: nwpserial: Pass correct pointer to free_irq() serial: 8250_dw: Add valid clk pointer check
2013-05-23Merge tag 'usb-3.10-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg Kroah-Hartman: "Here are a number of tiny USB bugfixes / new device ids for 3.10-rc2 The majority of these are USB gadget fixes, but they are all small. Other than that, some USB host controller fixes, and USB serial driver fixes for problems reported with them. Also hopefully a fixed up USB_OTG Kconfig dependency, that one seems to be almost impossible to get right for all of the different platforms these days." * tag 'usb-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (56 commits) USB: cxacru: potential underflow in cxacru_cm_get_array() USB: ftdi_sio: Add support for Newport CONEX motor drivers USB: option: add device IDs for Dell 5804 (Novatel E371) WWAN card usb: ohci: fix goto wrong tag in err case usb: isp1760-if: fix memleak when platform_get_resource fail usb: ehci-s5p: fix memleak when fallback to pdata USB: serial: clean up chars_in_buffer USB: ti_usb_3410_5052: fix chars_in_buffer overhead USB: io_ti: fix chars_in_buffer overhead USB: ftdi_sio: fix chars_in_buffer overhead USB: ftdi_sio: clean up get_modem_status USB: serial: add generic wait_until_sent implementation USB: serial: add wait_until_sent operation USB: set device dma_mask without reference to global data USB: Blacklisted Cinterion's PLxx WWAN Interface usb: option: Add Telewell TW-LTE 4G USB: EHCI: remove bogus #error USB: reset resume quirk needed by a hub USB: usb-stor: realtek_cr: Fix compile error usb, chipidea: fix link error when USB_EHCI_HCD is a module ...
2013-05-23OMAPDSS: Fix crash with DT bootTomi Valkeinen
When booting with DT, there's a crash when omapfb is probed. This is caused by the fact that omapdss+DT is not yet supported, and thus omapdss is not probed at all. On the other hand, omapfb is always probed. When omapfb tries to use omapdss, there's a NULL pointer dereference crash. The same error should most likely happen with omapdrm and omap_vout also. To fix this, add an "initialized" state to omapdss. When omapdss has been probed, it's marked as initialized. omapfb, omapdrm and omap_vout check this state when they are probed to see that omapdss is actually there. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
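The "initialized" state described above amounts to a flag set by the core driver's probe and checked by each dependent driver's probe. The following is a minimal userspace sketch of that pattern; every name in it (core_probe(), core_is_initialized(), client_probe()) is hypothetical, and returning -ENODEV is an assumption, since the commit message does not say which error the dependent drivers report.

```c
#include <stdbool.h>
#include <errno.h>

/* Set once the core driver (omapdss in the commit message) has probed. */
static bool core_initialized;

/* Stand-in for the core driver's probe routine. */
static void core_probe(void)
{
	core_initialized = true;
}

/* Query used by dependent drivers before they touch the core. */
static bool core_is_initialized(void)
{
	return core_initialized;
}

/* Stand-in for the probe of a dependent driver such as omapfb. */
static int client_probe(void)
{
	/* Bail out early instead of dereferencing state that was never set up. */
	if (!core_is_initialized())
		return -ENODEV;	/* error code is an assumption */

	return 0;
}
```

The check turns a NULL pointer dereference into an ordinary probe failure, which is the behaviour the commit message asks of omapfb, omapdrm and omap_vout.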
2013-05-22Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds
Pull MIPS update from Ralf Baechle: - Fix a build error if <linux/printk.h> is included without <linux/linkage.h> having been included before. - Cleanup and fix the damage done by the generic idle loop patch. - A kprobes fix that brings the MIPS code in line with what other architectures have been doing for quite a while already. - Wire up the native getdents64(2) syscall for 64 bit - for some reason it was only for the compat ABIs. This has been reported to cause an application issue. This turned out bigger than I meant but the wait instruction support code was driving me nuts. * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: N64: Wire getdents64(2) kprobes/mips: Fix to check double free of insn slot MIPS: Idle: Break r4k_wait into two functions and fix it. MIPS: Idle: Do address fiddlery in helper functions. MIPS: Idle: Consolidate all declarations in <asm/idle.h>. MIPS: Idle: Don't call local_irq_disable() in cpu_wait() implementations. MIPS: Idle: Re-enable irqs at the end of r3081, au1k and loongson2 cpu_wait. MIPS: Idle: Make call of function pointer readable. MIPS: Idle: Consistently reformat inline assembler. MIPS: Idle: cleaup SMTC idle hook as per Linux coding style. MIPS: Consolidate idle loop / WAIT instruction support in a single file. MIPS: clock.h: Remove declaration of cpu_wait. Add include dependencies to <linux/printk.h>. MIPS: Rewrite pfn_valid to work in modules, too.
2013-05-22Merge tag 'omap-fixes-a-for-3.10-rc' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into fixes From Paul Walmsley: Fix the OMAP serial driver to work correctly on OMAP4 when booting with DT. * tag 'omap-fixes-a-for-3.10-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending: ARM: OMAP2+: hwmod: Remove sysc slave idle and auto idle apis SERIAL: OMAP: Remove the slave idle handling from the driver ARM: OMAP2+: serial: Remove the un-used slave idle hooks ARM: OMAP2+: hwmod-data: UART IP needs software control to manage sidle modes ARM: OMAP2+: hwmod: Add a new flag to handle SIDLE in SWSUP only in active ARM: OMAP2+: hwmod: Fix sidle programming in _enable_sysc()/_idle_sysc() Signed-off-by: Olof Johansson <olof@lixom.net>
2013-05-22Merge tag 'mfd-fixes-3.10-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes Pull mfd fixes from Samuel Ortiz: "This is the first batch of MFD fixes for 3.10. It's bigger than I would like, most of it is due to the big ab/db8500 merge that went through during the 3.10 merge window. So we have: - Some build fixes for the tps65912 and ab8500 drivers. - A couple of build fixes for the si476x driver with pre 4.3 gcc compilers. - A few runtime breakage fixes (probe failures or oopses) for the ab8500 and db8500 drivers. - Some sparse or regular gcc warning fixes for the si476x, ab8500 and cros_ec drivers." * tag 'mfd-fixes-3.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes: mfd: ab8500-sysctrl: Let sysctrl driver work without pdata mfd: db8500-prcmu: Update stored DSI PLL divider value mfd: ab8500-sysctrl: Always enable pm_power_off handler mfd: ab8500-core: Pass GPADC compatible string to MFD core mfd: db8500-prcmu: Supply the pdata_size attribute for db8500-thermal mfd: ab8500-core: Use the correct driver name when enabling gpio/pinctrl mfd: ab8500: Pass AB8500 IRQ to debugfs code by resource mfd: ab8500-gpadc: Suppress 'ignoring regulator_enable() return value' warning mfd: ab8500-sysctrl: Set sysctrl_dev during probe mfd: ab8500-sysctrl: Fix sparse warning mfd: abx500-core: Fix sparse warning mfd: ab8500: Debugfs code depends on gpadc mfd: si476x: Use get_unaligned_be16() for unaligned be16 loads mfd: cros_ec_spi: Use %z to format pointer differences mfd: si476x: Do not use binary constants mfd: tps65912: Select MFD_CORE
2013-05-22Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio fixes from Rusty Russell: "A build fix and a uapi exposure fix. The build fix is later than I liked, but my first version broke linux-next due to overzealous header clean." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: virtio_console: fix uapi header Hoist memcpy_fromiovec/memcpy_toiovec into lib/
2013-05-22kernel: Fix s390 absolute memory access for /dev/memMichael Holzheu
On s390 the prefix page and absolute zero pages are not correctly returned when reading /dev/mem. The reason is that the s390 asm/io.h file includes the asm-generic/io.h file which then defines xlate_dev_mem_ptr() and therefore overwrites the s390 specific version that does the correct swap operation for prefix and absolute zero pages. The problem is a regression that was introduced with git commit cd248341 (s390/pci: base support). To fix the problem add "#ifndef xlate_dev_mem_ptr" in asm-generic/io.h and "#define xlate_dev_mem_ptr" in asm/io.h. This ensures that the s390 version is used. For completeness also add the "#ifndef" construct for xlate_dev_kmem_ptr(). Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
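The "#ifndef"/"#define" arrangement described above can be illustrated in a single file, with comments marking where each header's content would sit. This is a sketch under assumptions: the function body and its return values are made up for demonstration, and the real s390 routine returns a pointer rather than an int.

```c
/* ---- stand-in for the architecture's asm/io.h ---- */

/*
 * Architecture-specific version. The real s390 code would do the swap for
 * prefix and absolute zero pages here; this model just reports that it ran.
 */
static inline int xlate_dev_mem_ptr(unsigned long addr)
{
	(void)addr;
	return 1;	/* 1 = architecture version was used */
}

/* Tell the generic header that an override exists. */
#define xlate_dev_mem_ptr xlate_dev_mem_ptr

/* ---- stand-in for asm-generic/io.h, included afterwards ---- */

/* The generic fallback is only defined when no override was announced. */
#ifndef xlate_dev_mem_ptr
#define xlate_dev_mem_ptr(p) 0	/* 0 = generic version was used */
#endif
```

Defining a macro with the same name as the function is harmless to callers, because a self-referential macro expands to itself once, and it gives the generic header something to test with #ifndef.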
2013-05-22Add include dependencies to <linux/printk.h>.Ralf Baechle
If <linux/linkage.h> has not been included before <linux/printk.h>, a build error like the below one will result: CC arch/mips/kernel/idle.o In file included from arch/mips/kernel/idle.c:17:0: include/linux/printk.h:109:1: error: data definition has no type or storage class [-Werror] include/linux/printk.h:109:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int] include/linux/printk.h:110:1: error: ‘format’ attribute only applies to function types [-Werror=attributes] include/linux/printk.h:110:1: error: expected ‘,’ or ‘;’ before ‘int’ include/linux/printk.h:114:1: error: data definition has no type or storage class [-Werror] include/linux/printk.h:114:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int] include/linux/printk.h:115:1: error: ‘format’ attribute only applies to function types [-Werror=attributes] include/linux/printk.h:115:1: error: expected ‘,’ or ‘;’ before ‘int’ include/linux/printk.h:117:1: error: data definition has no type or storage class [-Werror] include/linux/printk.h:117:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int] include/linux/printk.h:118:1: error: ‘format’ attribute only applies to function types [-Werror=attributes] include/linux/printk.h:118:1: error: ‘__cold__’ attribute ignored [-Werror=attributes] include/linux/printk.h:118:1: error: expected ‘,’ or ‘;’ before ‘asmlinkage’ include/linux/printk.h:122:1: error: data definition has no type or storage class [-Werror] include/linux/printk.h:122:1: error: type defaults to ‘int’ in declaration of ‘asmlinkage’ [-Werror=implicit-int] include/linux/printk.h:123:1: error: ‘format’ attribute only applies to function types [-Werror=attributes] include/linux/printk.h:123:1: error: ‘__cold__’ attribute ignored [-Werror=attributes] include/linux/printk.h:123:1: error: expected ‘,’ or ‘;’ before ‘int’ In file included from include/linux/kernel.h:14:0, from include/linux/sched.h:15, from 
arch/mips/kernel/idle.c:18: include/linux/dynamic_debug.h: In function ‘ddebug_dyndbg_module_param_cb’: include/linux/dynamic_debug.h:124:3: error: implicit declaration of function ‘printk’ [-Werror=implicit-function-declaration] Fixed by including <linux/linkage.h>. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-05-22ACPI / PM: Allow device power states to be used for CONFIG_PM unsetRafael J. Wysocki
Currently, drivers/acpi/device_pm.c depends on CONFIG_PM and all of the functions defined in there are replaced with static inline stubs if that option is unset. However, CONFIG_PM means, roughly, "runtime PM or suspend/hibernation support" and some of those functions are useful regardless of that. For example, they are used by the ACPI fan driver for controlling fans and acpi_device_set_power() is called during device removal. Moreover, device initialization may depend on setting device power states properly. For these reasons, make the routines manipulating ACPI device power states defined in drivers/acpi/device_pm.c available for CONFIG_PM unset too. Reported-by: Zhang Rui <rui.zhang@intel.com> Reported-and-tested-by: Michel Lespinasse <walken@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 3.9+ <stable@vger.kernel.org>
2013-05-21Merge branch 'drm-radeon-sun-hainan' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull radeon sun/hainan support from Dave Airlie: "Since I know it's outside the merge window, but since this is new hw I thought I'd try and provoke the new hw exception, it just fills in the blanks in the driver for the new AMD sun and hainan chipsets." * 'drm-radeon-sun-hainan' of git://people.freedesktop.org/~airlied/linux: drm/radeon: add Hainan pci ids drm/radeon: add golden register settings for Hainan (v2) drm/radeon: sun/hainan chips do not have UVD (v2) drm/radeon: track which asics have UVD drm/radeon: radeon-asic updates for Hainan drm/radeon: fill in ucode loading support for Hainan drm/radeon: don't touch DCE or VGA regs on Hainan (v3) drm/radeon: fill in GPU init for Hainan (v2) drm/radeon: add chip family for Hainan
2013-05-20target: Remove unused wait_for_tasks bit in target_wait_for_sess_cmdsJoern Engel
Drop unused transport_wait_for_tasks() check in target_wait_for_sess_cmds shutdown code, and convert tcm_qla2xxx + ib_srpt fabric drivers. Cc: Joern Engel <joern@logfs.org> Cc: Roland Dreier <roland@kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-05-20Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem