summaryrefslogtreecommitdiff
path: root/kernel/rcu/tree.c
AgeCommit message (Collapse)Author
2015-03-12rcu: Move rcu_report_unblock_qs_rnp() to common codePaul E. McKenney
The rcu_report_unblock_qs_rnp() function is invoked when the last task blocking the current grace period exits its outermost RCU read-side critical section. Previously, this was called only from rcu_read_unlock_special(), and was therefore defined only when CONFIG_RCU_PREEMPT=y. However, this function will be invoked even when CONFIG_RCU_PREEMPT=n once CPU-hotplug operations are processed only at the beginnings of RCU grace periods. The reason for this change is that the last task on a given leaf rcu_node structure's ->blkd_tasks list might well exit its RCU read-side critical section between the time that recent CPU-hotplug operations were applied and when the new grace period was initialized. This situation could result in RCU waiting forever on that leaf rcu_node structure, because if all that structure's CPUs were already offline, there would be no quiescent-state events to drive that structure's part of the grace period. This commit therefore moves rcu_report_unblock_qs_rnp() to common code that is built unconditionally so that the quiescent-state-forcing code can clean up after this situation, avoiding the grace-period stall. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Remove event tracing from rcu_cpu_notify(), used by offline CPUsPaul E. McKenney
Offline CPUs cannot safely invoke trace events, but such CPUs do execute within rcu_cpu_notify(). Therefore, this commit removes the trace events from rcu_cpu_notify(). These trace events are for utilization, against which rcu_cpu_notify() execution time should be negligible. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Provide diagnostic option to slow down grace-period initializationPaul E. McKenney
Grace-period initialization normally proceeds quite quickly, so that it is very difficult to reproduce races against grace-period initialization. This commit therefore allows grace-period initialization to be artificially slowed down, increasing race-reproduction probability. A pair of new Kconfig parameters are provided, CONFIG_RCU_TORTURE_TEST_SLOW_INIT to enable the slowdowns, and CONFIG_RCU_TORTURE_TEST_SLOW_INIT_DELAY to specify the number of jiffies of slowdown to apply. A boot-time parameter named rcutree.gp_init_delay allows boot-time delay to be specified. By default, no delay will be applied even if CONFIG_RCU_TORTURE_TEST_SLOW_INIT is set. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Detect stalls caused by failure to propagate up rcu_node treePaul E. McKenney
If all CPUs have passed through quiescent states, then stalls might be due to starvation of the grace-period kthread or to failure to propagate the quiescent states up the rcu_node combining tree. The current stall warning messages do not differentiate, so this commit adds a printout of the root rcu_node structure's ->qsmask field. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Put all orphan-callback-related code under same commentPaul E. McKenney
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-11rcu: Consolidate offline-CPU callback initializationPaul E. McKenney
Currently, both rcu_cleanup_dead_cpu() and rcu_send_cbs_to_orphanage() initialize the outgoing CPU's callback list. However, only rcu_cleanup_dead_cpu() invokes rcu_send_cbs_to_orphanage(), and it does so unconditionally, which means that only one of these initializations is required. This commit therefore consolidates the callback-list initialization with the rest of the callback handling in rcu_send_cbs_to_orphanage(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-03rcu: Remove redundant check of cpu_online()Yao Dongdong
Because invoke_cpu_core() checks whether the current CPU is online, there is no need for __call_rcu_core() to redundantly check it. There should not be any performance degradation because the called function is visible to the compiler. This commit therefore removes the redundant check. Signed-off-by: Yao Dongdong <yaodongdong@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-03rcu: Get rcu_sched_force_quiescent_state() where it belongsPaul E. McKenney
The very similar functions rcu_force_quiescent_state(), rcu_bh_force_quiescent_state(), and rcu_sched_force_quiescent_state() are supposed to be together, but have drifted apart. This commit restores rcu_sched_force_quiescent_state() to its rightful place. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-03rcu: Use IS_ENABLED() to CONFIG_RCU_FANOUT_EXACT #ifdefPaul E. McKenney
This commit uses IS_ENABLED() to remove the #ifdef from the rcu_init_levelspread() functions. No effect on executable code. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-03rcu: Move early boot callback tests earlierPaul E. McKenney
Because callbacks can now be posted quite early in boot, move the early boot callback tests to precede RCU initialization. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-03-03rcu: Move early-boot callbacks to no-CBs lists for no-CBs CPUsPaul E. McKenney
When a CPU is first determined to be a no-CBs CPUs, this commit causes any early boot callbacks to be moved to the no-CBs callback list, allowing them to be invoked. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-02-26rcu: Tighten up affinity and check for sysidlePaul E. McKenney
If the RCU grace-period kthread invoking rcu_sysidle_check_cpu() happens to be running on the tick_do_timer_cpu initially, then rcu_bind_gp_kthread() won't bind it. This kthread might then migrate before invoking rcu_gp_fqs(), which will trigger the WARN_ON_ONCE() in rcu_sysidle_check_cpu(). This commit therefore makes rcu_bind_gp_kthread() do the binding even if the kthread is currently on the same CPU. Because this incurs added overhead, this commit also causes each RCU grace-period kthread to invoke rcu_bind_gp_kthread() once at boot rather than at the beginning of each grace period. And as long as rcu_bind_gp_kthread() is being modified, this commit eliminates its #ifdef. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-02-26rcu: Fixes to NO_HZ_FULL sysidle accountingPaul E. McKenney
On second and subsequent passes through quiescent-state forcing, the isidle variable was initialized to false, which would prevent full sysidle state from being reached if a grace period needed more than one round of quiescent-state forcing (which most should not). However, the check for offline CPUs in the quiescent-state forcing main loop had the wrong sense, which could prevent CPUs from ever entering full sysidle state. This commit fixes both of these bugs. Given that sysidle is not yet wired up, this has no effect in old kernels, but might have proven frustrating had anyone attempted to wire it up. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-02-26rcu: Update from rcu_expedited variable to rcu_gp_is_expedited()Paul E. McKenney
This commit updates open-coded tests of the rcu_expedited variable to instead use rcu_gp_is_expedited(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-02-26rcu: Fix a couple of typos in rcu_all_qs() comment headerPaul E. McKenney
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-02-26rcu: Avoid clobbering early boot callbacksPaul E. McKenney
When a CPU comes online, it initializes its callback list. This is a bad thing if this is the first time that the CPU has come online and if that CPU has early boot callbacks. This commit therefore avoid initializing the callback list if there are callbacks present, in which case the initial call_rcu() did the initialization for us. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-02-26rcu: Prevent early-boot RCU callbacks from splattingPaul E. McKenney
Currently, a call_rcu() that precedes rcu_init() will splat due to the callback lists not having yet been initialized. This commit causes the first such callback to initialize the boot CPU's RCU callback list. Note that this commit does not change rcu_init()-time initialization, which means that the callback will be discarded at rcu_init() time. Fixing this is the job of later commits. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-02-26rcu: Wire ->rda pointers at compile timePaul E. McKenney
This commit wires up the rcu_state structures' ->rda pointers to the per-CPU rcu_data structures at compile time, thus ensuring that this linkage is present at early boot, in turn allowing posting of callbacks before rcu_init() is executed. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-02-26rcu: Abstract default callback-list initialization from init_callback_list()Paul E. McKenney
In preparation for early-boot posting of callbacks, this commit abstracts initialization of the default (non-no-CB) callbacks list from the init_callback_list() function into a new init_default_callback_list() function. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-02-25rcu_tree: Avoid touching rnp->completed when a new GP is startedLai Jiangshan
In rcu_gp_init(), rnp->completed equals to rsp->completed in THEORY, we don't need to touch it normally. If something goes wrong, it will complain and fixup rnp->completed and avoid oops. This commit thus avoids the normal needless store to rnp->completed. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-15Merge branches 'doc.2015.01.07a', 'fixes.2015.01.15a', ↵Paul E. McKenney
'preempt.2015.01.06a', 'srcu.2015.01.06a', 'stall.2015.01.16a' and 'torture.2015.01.11a' into HEAD doc.2015.01.07a: Documentation updates. fixes.2015.01.15a: Miscellaneous fixes. preempt.2015.01.06a: Changes to handling of lists of preempted tasks. srcu.2015.01.06a: SRCU updates. stall.2015.01.16a: RCU CPU stall-warning updates and fixes. torture.2015.01.11a: RCU torture-test updates and fixes.
2015-01-15rcu: Add GP-kthread-starvation checks to CPU stall warningsPaul E. McKenney
This commit adds a message that is printed if the relevant grace-period kthread has not been able to run for the two seconds preceding the stall warning. (The two seconds is double the maximum interval between successive bouts of quiescent-state forcing.) Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-15rcu: Make cond_resched_rcu_qs() apply to normal RCU flavorsPaul E. McKenney
Although cond_resched_rcu_qs() only applies to TASKS_RCU, it is used in places where it would be useful for it to apply to the normal RCU flavors, rcu_preempt, rcu_sched, and rcu_bh. This is especially the case for workloads that aggressively overload the system, particularly those that generate large numbers of RCU updates on systems running NO_HZ_FULL CPUs. This commit therefore communicates quiescent states from cond_resched_rcu_qs() to the normal RCU flavors. Note that it is unfortunately necessary to leave the old ->passed_quiesce mechanism in place to allow quiescent states that apply to only one flavor to be recorded. (Yes, we could decrement ->rcu_qs_ctr_snap in that case, but that is not so good for debugging of RCU internals.) In addition, if one of the RCU flavor's grace period has stalled, this will invoke rcu_momentary_dyntick_idle(), resulting in a heavy-weight quiescent state visible from other CPUs. Reported-by: Sasha Levin <sasha.levin@oracle.com> Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Merge commit from Sasha Levin fixing a bug where __this_cpu() was used in preemptible code. ]
2015-01-15rcu: Optionally run grace-period kthreads at real-time priorityPaul E. McKenney
Recent testing has shown that under heavy load, running RCU's grace-period kthreads at real-time priority can improve performance (according to 0day test robot) and reduce the incidence of RCU CPU stall warnings. However, most systems do just fine with the default non-realtime priorities for these kthreads, and it does not make sense to expose the entire user base to any risk stemming from this change, given that this change is of use only to a few users running extremely heavy workloads. Therefore, this commit allows users to specify realtime priorities for the grace-period kthreads, but leaves them running SCHED_OTHER by default. The realtime priority may be specified at build time via the RCU_KTHREAD_PRIO Kconfig parameter, or at boot time via the rcutree.kthread_prio parameter. Either way, 0 says to continue the default SCHED_OTHER behavior and values from 1-99 specify that priority of SCHED_FIFO behavior. Note that a value of 0 is not permitted when the RCU_BOOST Kconfig parameter is specified. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcutorture: Check from beginning to end of grace periodPaul E. McKenney
Currently, rcutorture's Reader Batch checks measure from the end of the previous grace period to the end of the current one. This commit tightens up these checks by measuring from the start and end of the same grace period. This involves adding rcu_batches_started() and friends corresponding to the existing rcu_batches_completed() and friends. We leave SRCU alone for the moment, as it does not yet have a way of tracking both ends of its grace periods. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-10rcu: Make _batches_completed() functions return unsigned longPaul E. McKenney
Long ago, the various ->completed fields were of type long, but now are unsigned long due to signed-integer-overflow concerns. However, the various _batches_completed() functions remained of type long, even though their only purpose in life is to return the corresponding ->completed field. This patch cleans this up by changing these functions' return types to unsigned long. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Handle gpnum/completed wrap while dyntick idlePaul E. McKenney
Subtle race conditions can result if a CPU stays in dyntick-idle mode long enough for the ->gpnum and ->completed fields to wrap. For example, consider the following sequence of events: o CPU 1 encounters a quiescent state while waiting for grace period 5 to complete, but then enters dyntick-idle mode. o While CPU 1 is in dyntick-idle mode, the grace-period counters wrap around so that the grace period number is now 4. o Just as CPU 1 exits dyntick-idle mode, grace period 4 completes and grace period 5 begins. o The quiescent state that CPU 1 passed through during the old grace period 5 looks like it applies to the new grace period 5. Therefore, the new grace period 5 completes without CPU 1 having passed through a quiescent state. This could clearly be a fatal surprise to any long-running RCU read-side critical section that happened to be running on CPU 1 at the time. At one time, this was not a problem, given that it takes significant time for the grace-period counters to overflow even on 32-bit systems. However, with the advent of NO_HZ_FULL and SMP embedded systems, arbitrarily long idle periods are now becoming quite feasible. It is therefore time to close this race. This commit therefore avoids this race condition by having the quiescent-state forcing code detect when a CPU is falling too far behind, and setting a new rcu_data field ->gpwrap when this happens. Whenever this new ->gpwrap field is set, the CPU's ->gpnum and ->completed fields are known to be untrustworthy, and can be ignored, along with any associated quiescent states. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Improve diagnostics for spurious RCU CPU stall warningsPaul E. McKenney
The current RCU CPU stall warning code will print "Stall ended before state dump start" any time that the stall-warning code is triggered on a CPU that has already reported a quiescent state for the current grace period and if all quiescent states have been reported for the current grace period. However, a true stall can result in these symptoms, for example, by preventing RCU's grace-period kthreads from ever running This commit therefore checks for this condition, reporting the end of the stall only if one of the grace-period counters has actually advanced. Otherwise, it reports the last time that the grace-period kthread made meaningful progress. (In normal situations, the grace-period kthread should make meaningful progress at least every jiffies_till_next_fqs jiffies.) Reported-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Miroslav Benes <mbenes@suse.cz>
2015-01-06rcu: Make RCU_CPU_STALL_INFO include number of fqs attemptsPaul E. McKenney
One way that an RCU CPU stall warning can happen is if the grace-period kthread is not allowed to execute. One proxy for this kthread's forward progress is the number of force-quiescent-state (fqs) scans. This commit therefore adds the number of fqs scans to the RCU CPU stall warning printouts when CONFIG_RCU_CPU_STALL_INFO=y. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Remove redundant callback-list initializationPaul E. McKenney
The RCU callback lists are initialized in both rcu_boot_init_percpu_data() and rcu_init_percpu_data(). The former is intended for initializing immutable data, so this commit removes the initialization from rcu_boot_init_percpu_data() and leaves it in rcu_init_percpu_data(). This change prepares for permitting callbacks to be queued very early in boot. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Don't scan root rcu_node structure for stalled tasksPaul E. McKenney
Now that blocked tasks are no longer migrated to the root rcu_node structure, there is no need to scan the root rcu_node structure for blocked tasks stalling the current grace period. This commit therefore removes this scan. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Note quiescent state when CPU goes offlinePaul E. McKenney
The rcu_cleanup_dead_cpu() function (called after a CPU has gone completely offline) has not reported a quiescent state because there was probably at least one synchronize_rcu() between the time the CPU went offline and the CPU_DEAD notifier, and this would have detected the CPU's offline state via quiescent-state forcing. However, the plan is for CPUs to take themselves offline, at which point it makes sense for them to report their own quiescent state. This commit makes this change in preparation for the new CPU-hotplug setup. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Don't initiate RCU priority boosting on root rcu_nodePaul E. McKenney
Because there is no longer any preempted tasks on the root rcu_node, and because there is no longer ever an rcub kthread for the root rcu_node, this commit drops the code in force_qs_rnp() that attempts to awaken the non-existent root rcub kthread. This is strictly a performance enhancement, removing a root rcu_node ->lock acquisition and release along with some tests in rcu_initiate_boost(), ending with the test that notes that there is no rcub kthread. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Shorten irq-disable region in rcu_cleanup_dead_cpu()Paul E. McKenney
Now that we are not migrating callbacks, there is no need to hold the ->orphan_lock across the the ->qsmaskinit bit-clearing process. This commit therefore releases ->orphan_lock immediately after adopting the orphaned RCU callbacks. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Don't migrate blocked tasks even if all corresponding CPUs offlinePaul E. McKenney
When the last CPU associated with a given leaf rcu_node structure goes offline, something must be done about the tasks queued on that rcu_node structure. Each of these tasks has been preempted on one of the leaf rcu_node structure's CPUs while in an RCU read-side critical section that it have not yet exited. Handling these tasks is the job of rcu_preempt_offline_tasks(), which migrates them from the leaf rcu_node structure to the root rcu_node structure. Unfortunately, this migration has to be done one task at a time because each tasks allegiance must be shifted from the original leaf rcu_node to the root, so that future attempts to deal with these tasks will acquire the root rcu_node structure's ->lock rather than that of the leaf. Worse yet, this migration must be done with interrupts disabled, which is not so good for realtime response, especially given that there is no bound on the number of tasks on a given rcu_node structure's list. (OK, OK, there is a bound, it is just that it is unreasonably large, especially on 64-bit systems.) This was not considered a problem back when rcu_preempt_offline_tasks() was first written because realtime systems were assumed not to do CPU-hotplug operations while real-time applications were running. This assumption has proved of dubious validity given that people are starting to run multiple realtime applications on a single SMP system and that it is common practice to offline then online a CPU before starting its real-time application in order to clear extraneous processing off of that CPU. So we now need CPU hotplug operations to avoid undue latencies. This commit therefore avoids migrating these tasks, instead letting them be dequeued one by one from the original leaf rcu_node structure by rcu_read_unlock_special(). This means that the clearing of bits from the upper-level rcu_node structures must be deferred until the last such task has been dequeued, because otherwise subsequent grace periods won't wait on them. This commit has the beneficial side effect of simplifying the CPU-hotplug code for TREE_PREEMPT_RCU, especially in CONFIG_RCU_BOOST builds. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Make rcu_read_unlock_special() propagate ->qsmaskinit bit clearingPaul E. McKenney
This commit causes rcu_read_unlock_special() to propagate ->qsmaskinit bit clearing up the rcu_node tree once a given rcu_node structure's blkd_tasks list becomes empty. This is the final commit in preparation for the rework of RCU priority boosting: It enables preempted tasks to remain queued on their rcu_node structure even after all of that rcu_node structure's CPUs have gone offline. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Abstract rcu_cleanup_dead_rnp() from rcu_cleanup_dead_cpu()Paul E. McKenney
This commit abstracts rcu_cleanup_dead_rnp() from rcu_cleanup_dead_cpu() in preparation for the rework of RCU priority boosting. This new function will be invoked from rcu_read_unlock_special() in the reworked scheme, which is why rcu_cleanup_dead_rnp() assumes that the leaf rcu_node structure's ->qsmaskinit field has already been updated. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06rcu: Fix rcu_barrier() race that could result in too-short waitPaul E. McKenney
The rcu_barrier() no-callbacks check for no-CBs CPUs has race conditions. It checks a given CPU's lists of callbacks, and if all three no-CBs lists are empty, ignores that CPU. However, these three lists could potentially be empty even when callbacks are present if the check executed just as the callbacks were being moved from one list to another. It turns out that recent versions of rcutorture can spot this race. This commit plugs this hole by consolidating the per-list counts of no-CBs callbacks into a single count, which is incremented before the corresponding callback is posted and after it is invoked. Then rcu_barrier() checks this single count to reliably determine whether the corresponding CPU has no-CBs callbacks. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-06tiny_rcu: Directly force QS when call_rcu_[bh|sched]() on idle_taskLai Jiangshan
For RCU in UP, context-switch = QS = GP, thus we can force a context-switch when any call_rcu_[bh|sched]() is happened on idle_task. After doing so, rcu_idle/irq_enter/exit() are useless, so we can simply make these functions empty. More important, this change does not change the functionality logically. Note: raise_softirq(RCU_SOFTIRQ)/rcu_sched_qs() in rcu_idle_enter() and outmost rcu_irq_exit() will have to wake up the ksoftirqd (due to in_interrupt() == 0). Before this patch After this patch: call_rcu_sched() in idle; call_rcu_sched() in idle set resched do other stuffs; do other stuffs outmost rcu_irq_exit() outmost rcu_irq_exit() (empty function) (or rcu_idle_enter()) (or rcu_idle_enter(), also empty function) start to resched. (see above) rcu_sched_qs() rcu_sched_qs() QS,and GP and advance cb QS,and GP and advance cb wake up the ksoftirqd wake up the ksoftirqd set resched resched to ksoftirqd (or other) resched to ksoftirqd (or other) These two code patches are almost the same. Size changed after patched: size kernel/rcu/tiny-old.o kernel/rcu/tiny-patched.o text data bss dec hex filename 3449 206 8 3663 e4f kernel/rcu/tiny-old.o 2406 144 8 2558 9fe kernel/rcu/tiny-patched.o Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-12-30rcu: Fix invoke_rcu_callbacks() commentPaul E. McKenney
Despite what the comment says, it is only softirqs that are disabled, not interrupts. This commit therefore fixes the comment. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-12-30rcu: Make rcu_nmi_enter() handle nestingPaul E. McKenney
The x86 architecture has multiple types of NMI-like interrupts: real NMIs, machine checks, and, for some values of NMI-like, debugging and breakpoint interrupts. These interrupts can nest inside each other. Andy Lutomirski is adding RCU support to these interrupts, so rcu_nmi_enter() and rcu_nmi_exit() must now correctly handle nesting. This commit therefore introduces nesting, using a clever NMI-coordination algorithm suggested by Andy. The trick is to atomically increment ->dynticks (if needed) before manipulating ->dynticks_nmi_nesting on entry (and, accordingly, after on exit). In addition, ->dynticks_nmi_nesting is incremented by one if ->dynticks was incremented and by two otherwise. This means that when rcu_nmi_exit() sees ->dynticks_nmi_nesting equal to one, it knows that ->dynticks must be atomically incremented. This NMI-coordination algorithms has been validated by the following Promela model: ------------------------------------------------------------------------ /* * Promela model for Andy Lutomirski's suggested change to rcu_nmi_enter() * that allows nesting. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, you can access it online at * http://www.gnu.org/licenses/gpl-2.0.html. * * Copyright IBM Corporation, 2014 * * Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> */ byte dynticks_nmi_nesting = 0; byte dynticks = 0; /* * Promela verision of rcu_nmi_enter(). */ inline rcu_nmi_enter() { byte incby; byte tmp; incby = BUSY_INCBY; assert(dynticks_nmi_nesting >= 0); if :: (dynticks & 1) == 0 -> atomic { dynticks = dynticks + 1; } assert((dynticks & 1) == 1); incby = 1; :: else -> skip; fi; tmp = dynticks_nmi_nesting; tmp = tmp + incby; dynticks_nmi_nesting = tmp; assert(dynticks_nmi_nesting >= 1); } /* * Promela verision of rcu_nmi_exit(). */ inline rcu_nmi_exit() { byte tmp; assert(dynticks_nmi_nesting > 0); assert((dynticks & 1) != 0); if :: dynticks_nmi_nesting != 1 -> tmp = dynticks_nmi_nesting; tmp = tmp - BUSY_INCBY; dynticks_nmi_nesting = tmp; :: else -> dynticks_nmi_nesting = 0; atomic { dynticks = dynticks + 1; } assert((dynticks & 1) == 0); fi; } /* * Base-level NMI runs non-atomically. Crudely emulates process-level * dynticks-idle entry/exit. */ proctype base_NMI() { byte busy; busy = 0; do :: /* Emulate base-level dynticks and not. */ if :: 1 -> atomic { dynticks = dynticks + 1; } busy = 1; :: 1 -> skip; fi; /* Verify that we only sometimes have base-level dynticks. */ if :: busy == 0 -> skip; :: busy == 1 -> skip; fi; /* Model RCU's NMI entry and exit actions. */ rcu_nmi_enter(); assert((dynticks & 1) == 1); rcu_nmi_exit(); /* Emulated re-entering base-level dynticks and not. */ if :: !busy -> skip; :: busy -> atomic { dynticks = dynticks + 1; } busy = 0; fi; /* We had better now be in dyntick-idle mode. */ assert((dynticks & 1) == 0); od; } /* * Nested NMI runs atomically to emulate interrupting base_level(). */ proctype nested_NMI() { do :: /* * Use an atomic section to model a nested NMI. This is * guaranteed to interleave into base_NMI() between a pair * of base_NMI() statements, just as a nested NMI would. */ atomic { /* Verify that we only sometimes are in dynticks. */ if :: (dynticks & 1) == 0 -> skip; :: (dynticks & 1) == 1 -> skip; fi; /* Model RCU's NMI entry and exit actions. */ rcu_nmi_enter(); assert((dynticks & 1) == 1); rcu_nmi_exit(); } od; } init { run base_NMI(); run nested_NMI(); } ------------------------------------------------------------------------ The following script can be used to run this model if placed in rcu_nmi.spin: ------------------------------------------------------------------------ if ! spin -a rcu_nmi.spin then echo Spin errors!!! exit 1 fi if ! cc -DSAFETY -o pan pan.c then echo Compilation errors!!! exit 1 fi ./pan -m100000 Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
2014-11-13Merge branches 'torture.2014.11.03a', 'cpu.2014.11.03a', 'doc.2014.11.13a', ↵Paul E. McKenney
'fixes.2014.11.13a', 'signal.2014.10.29a' and 'rt.2014.10.29a' into HEAD cpu.2014.11.03a: Changes for per-CPU variables. doc.2014.11.13a: Documentation updates. fixes.2014.11.13a: Miscellaneous fixes. signal.2014.10.29a: Signal changes. rt.2014.10.29a: Real-time changes. torture.2014.11.03a: torture-test changes.
2014-11-03rcutorture: Add early boot self testsPranith Kumar
Add early boot self tests for RCU under CONFIG_PROVE_RCU. Currently the only test is adding a dummy callback which increments a counter which we then later verify after calling rcu_barrier*(). Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-03rcu: Remove "cpu" argument to rcu_cleanup_after_idle()Paul E. McKenney
The "cpu" argument to rcu_cleanup_after_idle() is always the current CPU, so drop it. This moves the smp_processor_id() from the caller to rcu_cleanup_after_idle(), saving argument-passing overhead. Again, the anticipated cross-CPU uses of these functions has been replaced by NO_HZ_FULL. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-03rcu: Remove "cpu" argument to rcu_prepare_for_idle()Paul E. McKenney
The "cpu" argument to rcu_prepare_for_idle() is always the current CPU, so drop it. This in turn allows two of the uses of "cpu" in this function to be replaced with a this_cpu_ptr() and the third by smp_processor_id(), replacing that of the call to rcu_prepare_for_idle(). Again, the anticipated cross-CPU uses of these functions has been replaced by NO_HZ_FULL. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-03rcu: Remove "cpu" argument to rcu_needs_cpu()Paul E. McKenney
The "cpu" argument to rcu_needs_cpu() is always the current CPU, so drop it. This in turn allows the "cpu" argument to rcu_cpu_has_callbacks() to be removed, which allows the uses of "cpu" in both functions to be replaced with a this_cpu_ptr(). Again, the anticipated cross-CPU uses of these functions has been replaced by NO_HZ_FULL. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-03rcu: Remove "cpu" argument to rcu_note_context_switch()Paul E. McKenney
The "cpu" argument to rcu_note_context_switch() is always the current CPU, so drop it. This in turn allows the "cpu" argument to rcu_preempt_note_context_switch() to be removed, which allows the sole use of "cpu" in both functions to be replaced with a this_cpu_ptr(). Again, the anticipated cross-CPU uses of these functions has been replaced by NO_HZ_FULL. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-03rcu: Remove "cpu" argument to rcu_preempt_check_callbacks()Paul E. McKenney
Because rcu_preempt_check_callbacks()'s argument is guaranteed to always be the current CPU, drop the argument and replace per_cpu() with __this_cpu_read(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-03rcu: Remove "cpu" argument to rcu_pending()Paul E. McKenney
Because rcu_pending()'s argument is guaranteed to always be the current CPU, drop the argument and replace per_cpu_ptr() with this_cpu_ptr(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-03rcu: Remove "cpu" argument to rcu_check_callbacks()Paul E. McKenney
The "cpu" argument was kept around on the off-chance that RCU might offload scheduler-clock interrupts. However, this offload approach has been replaced by NO_HZ_FULL, which offloads -all- RCU processing from qualifying CPUs. It is therefore time to remove the "cpu" argument to rcu_check_callbacks(), which this commit does. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>