WebSVN – HelenOS-doc – Blame – /design/trunk/src/ch_scheduling.xml

Rev	Author	Line No.	Line
9	bondari	1	<?xml version="1.0" encoding="UTF-8"?>
57	jermar	2	<chapter id="scheduling">
		3	<?dbhtml filename="scheduling.html"?>
9	bondari	4
57	jermar	5	<title>Scheduling</title>
9	bondari	6
57	jermar	7	<para>One of the key aims of the operating system is to create and support
		8	the impression that several activities are executing contemporarily. This is
		9	true for both uniprocessor as well as multiprocessor systems. In the case of
		10	multiprocessor systems, the activities are trully happening in parallel. The
		11	scheduler helps to materialize this impression by planning threads on as
		12	many processors as possible and, where this means reaches its limits, by
		13	quickly switching among threads executing on a single processor.</para>
		14
		15	<section>
		16	<title>Contexts</title>
		17
		18	<para>The term context refers to the set of processor resources that
		19	define the current state of the computation or the environment and the
		20	kernel understands it in several more or less narrow sences:</para>
		21
		22	<itemizedlist>
		23	<listitem>
		24	<para>synchronous register context,</para>
		25	</listitem>
		26
		27	<listitem>
		28	<para>asynchronous register context,</para>
		29	</listitem>
		30
		31	<listitem>
		32	<para>FPU context and</para>
		33	</listitem>
		34
		35	<listitem>
		36	<para>memory management context.</para>
		37	</listitem>
		38	</itemizedlist>
		39
76	palkovsky	40	<para>The most narrow sense refers to the the synchronous register
57	jermar	41	context. It includes all the preserved registers as defined by the
		42	architecture. To highlight some, the program counter and stack pointer
76	palkovsky	43	take part in the synchronous register context. These registers must be
		44	preserved across a procedure call and during synchronous context
		45	switches.</para>
57	jermar	46
		47	<para>The next type of the context understood by the kernel is the
		48	asynchronous register context. On an interrupt, the interrupted execution
		49	flow's state must be guaranteed to be eventually completely restored.
		50	Therefore the interrupt context includes, among other things, the scratch
		51	registers as defined by the architecture. As a special optimization and if
		52	certain conditions are met, it need not include the architecture's
		53	preserved registers. The condition mentioned in the previous sentence is
		54	that the low-level assembly language interrupt routines don't modify the
		55	preserved registers. The handlers usually call a higher-level C routine.
		56	The preserved registers are then saved on the stack by the compiler
		57	generated code of the higher-level function. In HelenOS, several
		58	architectures can be compiled with this optimization.</para>
		59
		60	<para>Although the kernel does not do any floating point
		61	arithmetics<footnote>
		62	<para>Some architectures (e.g. ia64) inevitably use a fixed set of
64	jermar	63	floating point registers to carry out their normal operations.</para>
57	jermar	64	</footnote>, it must protect FPU context of userspace threads against
		65	destruction by other threads. Moreover, only a fraction of userspace
		66	programs use the floating point unit. HelenOS contains a generic framework
64	jermar	67	for switching FPU context only when the switch is forced (i.e. a thread
		68	uses a floating point instruction and its FPU context is not loaded in the
		69	processor).</para>
57	jermar	70
		71	<para>The last member of the context family is the memory management
		72	context. It includes memory management registers that identify address
		73	spaces on hardware level (i.e. ASIDs and page tables pointers).</para>
		74
9	bondari	75	<section>
131	jermar	76	<title>Synchronous Context Switches</title>
9	bondari	77
57	jermar	78	<para>The scheduler, but also other pieces of the kernel, make heavy use
		79	of synchronous context switches, because it is a natural vehicle not
		80	only for changes in control flow, but also for switching between two
		81	kernel stacks. Two functions figure in a synchronous context switch
133	jermar	82	implementation: <code>context_save()</code> and
		83	<code>context_restore()</code>. Note that these two functions break the
57	jermar	84	natural perception of the linear C code execution flow starting at
		85	function's entry point and ending on one of the function's exit
		86	points.</para>
		87
133	jermar	88	<para>When the <code>context_save()</code> function is called, the
57	jermar	89	synchronous context is saved in a memory structure passed to it. After
133	jermar	90	executing <code>context_save()</code>, the caller is returned 1 as a
57	jermar	91	return value. The execution of instructions continues as normally until
133	jermar	92	<code>context_restore()</code> is called. For the caller, it seems like
57	jermar	93	the call never returns<footnote>
		94	<para>Which might be a source of problems with variable liveliness
133	jermar	95	after <code>context_restore()</code>.</para>
57	jermar	96	</footnote>. Nevertheless, a synchronous register context, which is
133	jermar	97	saved in a memory structure passed to <code>context_restore()</code>, is
57	jermar	98	restored, thus transfering the control flow to the place of occurrence
133	jermar	99	of the corresponding call to <code>context_save()</code>. From the
57	jermar	100	perspective of the caller of the corresponding
133	jermar	101	<code>context_save()</code>, it looks like a return from
		102	<code>context_save()</code>. However, this time a return value of 0 is
57	jermar	103	returned.</para>
9	bondari	104	</section>
57	jermar	105	</section>
9	bondari	106
57	jermar	107	<section>
58	jermar	108	<title>Threads</title>
9	bondari	109
58	jermar	110	<para>A thread is the basic executable entity with some code and stack.
		111	While the code, implemented by a C language function, can be shared by
		112	several threads, the stack is always private to each instance of the
		113	thread. Each thread belongs to exactly one task through which it shares
		114	address space with its sibling threads. Threads that execute purely in the
		115	kernel don't have any userspace memory allocated. However, when a thread
		116	has ambitions to run in userspace, it must be allocated a userspace stack.
		117	The distinction between the purely kernel threads and threads running also
		118	in userspace is made by refering to the former group as to kernel threads
		119	and to the latter group as to userspace threads. Both kernel and userspace
		120	threads are visible to the scheduler and can become a subject of kernel
		121	preemption and thread migration during times when preemption is
		122	possible.</para>
		123
64	jermar	124	<formalpara>
131	jermar	125	<title>Thread States</title>
62	jermar	126
75	jermar	127	<para>In each moment, a thread exists in one of six possible thread
64	jermar	128	states. When the thread is created and first readied into the
		129	scheduler's run queues or when a thread is migrated to a new processor,
75	jermar	130	it is put into the <constant>Entering</constant> state. After some time,
		131	the scheduler picks up the thread and starts executing it. A thread
		132	being currently executed on a processor is in the
		133	<constant>Running</constant> state. From there, the thread has three
		134	possibilities. It either runs until it is preemtped, in which case the
		135	state changes to <constant>Ready</constant>, goes to the
		136	<constant>Sleeping</constant> state by going to sleep or enters the
		137	<constant>Exiting</constant> state when it reaches termination. When the
		138	thread exits, its kernel structure usually stays in memory, until the
133	jermar	139	thread is detached by another thread using <code>thread_detach()</code>
75	jermar	140	function. Terminated but undetached threads are in the
		141	<constant>Undead</constant> state. When the thread is detached or
		142	detaches itself during its life, it is destroyed in the
		143	<constant>Exiting</constant> state and the <constant>Undead</constant>
101	bondari	144	state is not reached.<figure float="1">
64	jermar	145	<title>Transitions among thread states.</title>
61	jermar	146
64	jermar	147	<mediaobject id="thread_states" xreflabel="">
87	bondari	148	<imageobject role="pdf">
131	jermar	149	<imagedata fileref="images/thread_states.pdf" format="PDF" />
77	bondari	150	</imageobject>
		151
64	jermar	152	<imageobject role="html">
		153	<imagedata fileref="images/thread_states.png" format="PNG" />
		154	</imageobject>
62	jermar	155
64	jermar	156	<imageobject role="fop">
131	jermar	157	<imagedata fileref="images/thread_states.svg" format="SVG" />
64	jermar	158	</imageobject>
		159	</mediaobject>
		160	</figure></para>
		161	</formalpara>
58	jermar	162
64	jermar	163	<formalpara>
131	jermar	164	<title>Pseudo Threads</title>
58	jermar	165
64	jermar	166	<para>HelenOS userspace layer knows even smaller units of execution.
		167	Each userspace thread can make use of an arbitrary number of pseudo
		168	threads. These pseudo threads have their own synchronous register
		169	context, userspace code and stack. They live their own life within the
		170	userspace thread and the scheduler does not have any idea about them
		171	because they are completely implemented by the userspace library. This
		172	implies several things:<itemizedlist>
		173	<listitem>
		174	<para>pseudothreads schedule themselves cooperatively within the
		175	time slice given to their userspace thread,</para>
		176	</listitem>
58	jermar	177
64	jermar	178	<listitem>
		179	<para>pseudothreads share FPU context of their containing thread
		180	and</para>
		181	</listitem>
		182
		183	<listitem>
		184	<para>all pseudothreads of one userspace thread block when one of
		185	them goes to sleep.</para>
		186	</listitem>
		187	</itemizedlist></para>
		188	</formalpara>
59	jermar	189	</section>
58	jermar	190
59	jermar	191	<section>
		192	<title>Scheduler</title>
		193
		194	<section>
131	jermar	195	<title>Run Queues</title>
59	jermar	196
		197	<para>There is an array of several run queues on each processor. The
		198	current version of HelenOS uses 16 run queues implemented by 16 doubly
		199	linked lists. Each of the run queues is associated with thread priority.
		200	The lower the run queue index in the array is, the higher is the
		201	priority of threads linked in that run queue and the shorter is the time
		202	in which those threads will execute. When kernel code wants to access
		203	the run queue, it must first acquire its lock.</para>
		204	</section>
		205
		206	<section>
131	jermar	207	<title>Scheduler Operation</title>
59	jermar	208
		209	<para>The scheduler is invoked either explicitly when a thread calls the
133	jermar	210	<code>scheduler()</code> function (e.g. goes to sleep or merely wants to
59	jermar	211	relinquish the processor for a while) or implicitly on a periodic basis
		212	when the generic clock interrupt preempts the current thread. After its
		213	invocation, the scheduler saves the synchronous register context of the
		214	current thread and switches to its private stack. Afterwards, a new
		215	thread is selected according to the scheduling policy. If there is no
		216	suitable thread, the processor is idle and no thread executes on it.
		217	Note that the act of switching to the private scheduler stack is
		218	essential. If the processor kept running using the stack of the
		219	preempted thread it could damage it because the old thread can be
		220	migrated to another processor and scheduled there. In the worst case
62	jermar	221	scenario, two execution flows would be using the same stack.</para>
59	jermar	222
		223	<para>The scheduling policy is implemented in function
		224	<code>find_best_thread</code>. This function walks the processor run
		225	queues from lower towards higher indices and looks for a thread. If the
		226	visited run queue is empty, it simply searches the next run queue. If it
		227	is known in advance that there are no ready threads waiting for
133	jermar	228	execution, <code>find_best_thread()</code> interruptibly halts the
59	jermar	229	processor or busy waits until some threads arrive. This process repeats
133	jermar	230	until <code>find_best_thread()</code> succeeds.</para>
59	jermar	231
		232	<para>After the best thread is chosen, the scheduler switches to the
		233	thread's task and memory management context. Finally, the saved
		234	synchronous register context is restored and the thread runs. Each
		235	scheduled thread is given a time slice depending on its priority (i.e.
		236	run queue). The higher priority, the shorter timeslice. To summarize,
		237	this policy schedules threads with high priorities more frequently but
		238	gives them smaller time slices. On the other hand, lower priority
		239	threads are scheduled less frequently, but run for longer periods of
		240	time.</para>
		241
		242	<para>When a thread uses its entire time slice, it is preempted and put
		243	back into the run queue that immediately follows the previous run queue
		244	from which the thread ran. Threads that are woken up from a sleep are
		245	put into the biggest priority run queue. Low priority threads are
		246	therefore those that don't go to sleep so often and just occupy the
		247	processor.</para>
		248
		249	<para>In order to avoid complete starvation of the low priority threads,
		250	from time to time, the scheduler will provide them with a bonus of one
		251	point priority increase. In other words, the scheduler will now and then
		252	move the entire run queues one level up.</para>
		253	</section>
		254
		255	<section>
131	jermar	256	<title>Processor Load Balancing</title>
59	jermar	257
		258	<para>Normally, for the sake of cache locality, threads are scheduled on
		259	one of the processors and don't leave it. Nevertheless, a situation in
		260	which one processor is heavily overloaded while others sit idle can
76	palkovsky	261	occur. HelenOS deploys special kernel threads to help mitigate this
59	jermar	262	problem. Each processor is associated with one load balancing thread
76	palkovsky	263	called <code>kcpulb</code> that wakes up regularly to see whether its
		264	processor is underbalanced or not. If it is, the thread attempts to
59	jermar	265	migrate threads from other overloaded processors to its own processor's
		266	run queues. When the job is done or there is no need for load balancing,
		267	the thread goes to sleep.</para>
		268
		269	<para>The balancing threads operate very gently and try to migrate low
		270	priority threads first; one <code>kcpulb</code> never takes from one
		271	processor twice in a row. The load balancing threads as well as threads
		272	that were just stolen cannot be migrated. The <code>kcpulb</code>
		273	threads are wired to their processors and cannot be migrated whatsoever.
		274	The ordinary threads are protected only until they are
		275	rescheduled.</para>
		276	</section>
57	jermar	277	</section>
		278	</chapter>

Subversion Repositories HelenOS-doc

(root)/design/trunk/src/ch_scheduling.xml – Rev 133