Merge tag 'trace-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace (965a9d75) · Commits · EulixOS / Software / Kernel

Documentation/trace/index.rst

+1 −0

Original line number	Diff line number	Diff line
		@@ -32,3 +32,4 @@ Linux Tracing Technologies
		sys-t
		coresight/index
		user_events
		rv/index

Documentation/trace/kprobetrace.rst

+4 −4

Original line number	Diff line number	Diff line
		@@ -28,10 +28,10 @@ Synopsis of kprobe_events
		-------------------------
		::

		p[:[GRP/]EVENT] [MOD:]SYM[+offs]\|MEMADDR [FETCHARGS] : Set a probe
		r[MAXACTIVE][:[GRP/]EVENT] [MOD:]SYM[+0] [FETCHARGS] : Set a return probe
		p:[GRP/]EVENT] [MOD:]SYM[+0]%return [FETCHARGS] : Set a return probe
		-:[GRP/]EVENT : Clear a probe
		p[:[GRP/][EVENT]] [MOD:]SYM[+offs]\|MEMADDR [FETCHARGS] : Set a probe
		r[MAXACTIVE][:[GRP/][EVENT]] [MOD:]SYM[+0] [FETCHARGS] : Set a return probe
		p[:[GRP/][EVENT]] [MOD:]SYM[+0]%return [FETCHARGS] : Set a return probe
		-:[GRP/][EVENT] : Clear a probe

		GRP : Group name. If omitted, use "kprobes" for it.
		EVENT : Event name. If omitted, the event name is generated

Documentation/trace/rv/da_monitor_instrumentation.rst

0 → 100644

+171 −0

Original line number	Diff line number	Diff line
		Deterministic Automata Instrumentation
		======================================

		The RV monitor file created by dot2k, with the name "$MODEL_NAME.c"
		includes a section dedicated to instrumentation.

		In the example of the wip.dot monitor created on [1], it will look like::

		/*
		* This is the instrumentation part of the monitor.
		*
		* This is the section where manual work is required. Here the kernel events
		* are translated into model's event.
		*
		*/
		static void handle_preempt_disable(void data, / XXX: fill header */)
		{
		da_handle_event_wip(preempt_disable_wip);
		}

		static void handle_preempt_enable(void data, / XXX: fill header */)
		{
		da_handle_event_wip(preempt_enable_wip);
		}

		static void handle_sched_waking(void data, / XXX: fill header */)
		{
		da_handle_event_wip(sched_waking_wip);
		}

		static int enable_wip(void)
		{
		int retval;

		retval = da_monitor_init_wip();
		if (retval)
		return retval;

		rv_attach_trace_probe("wip", /* XXX: tracepoint */, handle_preempt_disable);
		rv_attach_trace_probe("wip", /* XXX: tracepoint */, handle_preempt_enable);
		rv_attach_trace_probe("wip", /* XXX: tracepoint */, handle_sched_waking);

		return 0;
		}

		The comment at the top of the section explains the general idea: the
		instrumentation section translates kernel events into the *model's
		event*.

		Tracing callback functions
		--------------------------

		The first three functions are the starting point of the callback *handler
		functions* for each of the three events from the wip model. The developer
		does not necessarily need to use them: they are just starting points.

		Using the example of::

		void handle_preempt_disable(void data, / XXX: fill header */)
		{
		da_handle_event_wip(preempt_disable_wip);
		}

		The preempt_disable event from the model connects directly to the
		preemptirq:preempt_disable. The preemptirq:preempt_disable event
		has the following signature, from include/trace/events/preemptirq.h::

		TP_PROTO(unsigned long ip, unsigned long parent_ip)

		Hence, the handle_preempt_disable() function will look like::

		void handle_preempt_disable(void *data, unsigned long ip, unsigned long parent_ip)

		In this case, the kernel event translates one to one with the automata
		event, and indeed, no other change is required for this function.

		The next handler function, handle_preempt_enable() has the same argument
		list from the handle_preempt_disable(). The difference is that the
		preempt_enable event will be used to synchronize the system to the model.

		Initially, the model is placed in the initial state. However, the system
		might or might not be in the initial state. The monitor cannot start
		processing events until it knows that the system has reached the initial state.
		Otherwise, the monitor and the system could be out-of-sync.

		Looking at the automata definition, it is possible to see that the system
		and the model are expected to return to the initial state after the
		preempt_enable execution. Hence, it can be used to synchronize the
		system and the model at the initialization of the monitoring section.

		The start is informed via a special handle function, the
		"da_handle_start_event_$(MONITOR_NAME)(event)", in this case::

		da_handle_start_event_wip(preempt_enable_wip);

		So, the callback function will look like::

		void handle_preempt_enable(void *data, unsigned long ip, unsigned long parent_ip)
		{
		da_handle_start_event_wip(preempt_enable_wip);
		}

		Finally, the "handle_sched_waking()" will look like::

		void handle_sched_waking(void data, struct task_struct task)
		{
		da_handle_event_wip(sched_waking_wip);
		}

		And the explanation is left for the reader as an exercise.

		enable and disable functions
		----------------------------

		dot2k automatically creates two special functions::

		enable_$(MONITOR_NAME)()
		disable_$(MONITOR_NAME)()

		These functions are called when the monitor is enabled and disabled,
		respectively.

		They should be used to attach and detach the instrumentation to the running
		system. The developer must add to the relative function all that is needed to
		attach and detach its monitor to the system.

		For the wip case, these functions were named::

		enable_wip()
		disable_wip()

		But no change was required because: by default, these functions attach and
		detach the tracepoints_to_attach, which was enough for this case.

		Instrumentation helpers
		-----------------------

		To complete the instrumentation, the handler functions need to be attached to a
		kernel event, at the monitoring enable phase.

		The RV interface also facilitates this step. For example, the macro "rv_attach_trace_probe()"
		is used to connect the wip model events to the relative kernel event. dot2k automatically
		adds "rv_attach_trace_probe()" function call for each model event in the enable phase, as
		a suggestion.

		For example, from the wip sample model::

		static int enable_wip(void)
		{
		int retval;

		retval = da_monitor_init_wip();
		if (retval)
		return retval;

		rv_attach_trace_probe("wip", /* XXX: tracepoint */, handle_preempt_enable);
		rv_attach_trace_probe("wip", /* XXX: tracepoint */, handle_sched_waking);
		rv_attach_trace_probe("wip", /* XXX: tracepoint */, handle_preempt_disable);

		return 0;
		}

		The probes then need to be detached at the disable phase.

		[1] The wip model is presented in::

		Documentation/trace/rv/deterministic_automata.rst

		The wip monitor is presented in::

		Documentation/trace/rv/da_monitor_synthesis.rst

Documentation/trace/rv/da_monitor_synthesis.rst

0 → 100644

+147 −0

Original line number	Diff line number	Diff line
		Deterministic Automata Monitor Synthesis
		========================================

		The starting point for the application of runtime verification (RV) technics
		is the specification or modeling of the desired (or undesired) behavior
		of the system under scrutiny.

		The formal representation needs to be then synthesized into a monitor
		that can then be used in the analysis of the trace of the system. The
		monitor connects to the system via an instrumentation that converts
		the events from the system to the events of the specification.


		In Linux terms, the runtime verification monitors are encapsulated inside
		the RV monitor abstraction. The RV monitor includes a set of instances
		of the monitor (per-cpu monitor, per-task monitor, and so on), the helper
		functions that glue the monitor to the system reference model, and the
		trace output as a reaction to event parsing and exceptions, as depicted
		below::

		Linux +----- RV Monitor ----------------------------------+ Formal
		Realm \| \| Realm
		+-------------------+ +----------------+ +-----------------+
		\| Linux kernel \| \| Monitor \| \| Reference \|
		\| Tracing \| -> \| Instance(s) \| <- \| Model \|
		\| (instrumentation) \| \| (verification) \| \| (specification) \|
		+-------------------+ +----------------+ +-----------------+
		\| \| \|
		\| V \|
		\| +----------+ \|
		\| \| Reaction \| \|
		\| +--+--+--+-+ \|
		\| \| \| \| \|
		\| \| \| +-> trace output ? \|
		+------------------------\|--\|----------------------+
		\| +----> panic ?
		+-------> <user-specified>

		DA monitor synthesis
		--------------------

		The synthesis of automata-based models into the Linux RV monitor abstraction
		is automated by the dot2k tool and the rv/da_monitor.h header file that
		contains a set of macros that automatically generate the monitor's code.

		dot2k
		-----

		The dot2k utility leverages dot2c by converting an automaton model in
		the DOT format into the C representation [1] and creating the skeleton of
		a kernel monitor in C.

		For example, it is possible to transform the wip.dot model present in
		[1] into a per-cpu monitor with the following command::

		$ dot2k -d wip.dot -t per_cpu

		This will create a directory named wip/ with the following files:

		- wip.h: the wip model in C
		- wip.c: the RV monitor

		The wip.c file contains the monitor declaration and the starting point for
		the system instrumentation.

		Monitor macros
		--------------

		The rv/da_monitor.h enables automatic code generation for the *Monitor
		Instance(s)* using C macros.

		The benefits of the usage of macro for monitor synthesis are 3-fold as it:

		- Reduces the code duplication;
		- Facilitates the bug fix/improvement;
		- Avoids the case of developers changing the core of the monitor code
		to manipulate the model in a (let's say) non-standard way.

		This initial implementation presents three different types of monitor instances:

		- ``#define DECLARE_DA_MON_GLOBAL(name, type)``
		- ``#define DECLARE_DA_MON_PER_CPU(name, type)``
		- ``#define DECLARE_DA_MON_PER_TASK(name, type)``

		The first declares the functions for a global deterministic automata monitor,
		the second for monitors with per-cpu instances, and the third with per-task
		instances.

		In all cases, the 'name' argument is a string that identifies the monitor, and
		the 'type' argument is the data type used by dot2k on the representation of
		the model in C.

		For example, the wip model with two states and three events can be
		stored in an 'unsigned char' type. Considering that the preemption control
		is a per-cpu behavior, the monitor declaration in the 'wip.c' file is::

		DECLARE_DA_MON_PER_CPU(wip, unsigned char);

		The monitor is executed by sending events to be processed via the functions
		presented below::

		da_handle_event_$(MONITOR_NAME)($(event from event enum));
		da_handle_start_event_$(MONITOR_NAME)($(event from event enum));
		da_handle_start_run_event_$(MONITOR_NAME)($(event from event enum));

		The function ``da_handle_event_$(MONITOR_NAME)()`` is the regular case where
		the event will be processed if the monitor is processing events.

		When a monitor is enabled, it is placed in the initial state of the automata.
		However, the monitor does not know if the system is in the initial state.

		The ``da_handle_start_event_$(MONITOR_NAME)()`` function is used to notify the
		monitor that the system is returning to the initial state, so the monitor can
		start monitoring the next event.

		The ``da_handle_start_run_event_$(MONITOR_NAME)()`` function is used to notify
		the monitor that the system is known to be in the initial state, so the
		monitor can start monitoring and monitor the current event.

		Using the wip model as example, the events "preempt_disable" and
		"sched_waking" should be sent to monitor, respectively, via [2]::

		da_handle_event_wip(preempt_disable_wip);
		da_handle_event_wip(sched_waking_wip);

		While the event "preempt_enabled" will use::

		da_handle_start_event_wip(preempt_enable_wip);

		To notify the monitor that the system will be returning to the initial state,
		so the system and the monitor should be in sync.

		Final remarks
		-------------

		With the monitor synthesis in place using the rv/da_monitor.h and
		dot2k, the developer's work should be limited to the instrumentation
		of the system, increasing the confidence in the overall approach.

		[1] For details about deterministic automata format and the translation
		from one representation to another, see::

		Documentation/trace/rv/deterministic_automata.rst

		[2] dot2k appends the monitor's name suffix to the events enums to
		avoid conflicting variables when exporting the global vmlinux.h
		use by BPF programs.

Documentation/trace/rv/deterministic_automata.rst

0 → 100644

+184 −0

Original line number	Diff line number	Diff line
		Deterministic Automata
		======================

		Formally, a deterministic automaton, denoted by G, is defined as a quintuple:

		G = { X, E, f, x\ :subscript:`0`, X\ :subscript:`m` }

		where:

		- X is the set of states;
		- E is the finite set of events;
		- x\ :subscript:`0` is the initial state;
		- X\ :subscript:`m` (subset of X) is the set of marked (or final) states.
		- f : X x E -> X $ is the transition function. It defines the state
		transition in the occurrence of an event from E in the state X. In the
		special case of deterministic automata, the occurrence of the event in E
		in a state in X has a deterministic next state from X.

		For example, a given automaton named 'wip' (wakeup in preemptive) can
		be defined as:

		- X = { ``preemptive``, ``non_preemptive``}
		- E = { ``preempt_enable``, ``preempt_disable``, ``sched_waking``}
		- x\ :subscript:`0` = ``preemptive``
		- X\ :subscript:`m` = {``preemptive``}
		- f =
		- f\ (``preemptive``, ``preempt_disable``) = ``non_preemptive``
		- f\ (``non_preemptive``, ``sched_waking``) = ``non_preemptive``
		- f\ (``non_preemptive``, ``preempt_enable``) = ``preemptive``

		One of the benefits of this formal definition is that it can be presented
		in multiple formats. For example, using a graphical representation, using
		vertices (nodes) and edges, which is very intuitive for operating system
		practitioners, without any loss.

		The previous 'wip' automaton can also be represented as::

		preempt_enable
		+---------------------------------+
		v \|
		#============# preempt_disable +------------------+
		--> H preemptive H -----------------> \| non_preemptive \|
		#============# +------------------+
		^ \|
		\| sched_waking \|
		+--------------+

		Deterministic Automaton in C
		----------------------------

		In the paper "Efficient formal verification for the Linux kernel",
		the authors present a simple way to represent an automaton in C that can
		be used as regular code in the Linux kernel.

		For example, the 'wip' automata can be presented as (augmented with comments)::

		/* enum representation of X (set of states) to be used as index */
		enum states {
		preemptive = 0,
		non_preemptive,
		state_max
		};

		#define INVALID_STATE state_max

		/* enum representation of E (set of events) to be used as index */
		enum events {
		preempt_disable = 0,
		preempt_enable,
		sched_waking,
		event_max
		};

		struct automaton {
		char *state_names[state_max]; // X: the set of states
		char *event_names[event_max]; // E: the finite set of events
		unsigned char function[state_max][event_max]; // f: transition function
		unsigned char initial_state; // x_0: the initial state
		bool final_states[state_max]; // X_m: the set of marked states
		};

		struct automaton aut = {
		.state_names = {
		"preemptive",
		"non_preemptive"
		},
		.event_names = {
		"preempt_disable",
		"preempt_enable",
		"sched_waking"
		},
		.function = {
		{ non_preemptive, INVALID_STATE, INVALID_STATE },
		{ INVALID_STATE, preemptive, non_preemptive },
		},
		.initial_state = preemptive,
		.final_states = { 1, 0 },
		};

		The transition function is represented as a matrix of states (lines) and
		events (columns), and so the function f : X x E -> X can be solved
		in O(1). For example::

		next_state = automaton_wip.function[curr_state][event];

		Graphviz .dot format
		--------------------

		The Graphviz open-source tool can produce the graphical representation
		of an automaton using the (textual) DOT language as the source code.
		The DOT format is widely used and can be converted to many other formats.

		For example, this is the 'wip' model in DOT::

		digraph state_automaton {
		{node [shape = circle] "non_preemptive"};
		{node [shape = plaintext, style=invis, label=""] "__init_preemptive"};
		{node [shape = doublecircle] "preemptive"};
		{node [shape = circle] "preemptive"};
		"__init_preemptive" -> "preemptive";
		"non_preemptive" [label = "non_preemptive"];
		"non_preemptive" -> "non_preemptive" [ label = "sched_waking" ];
		"non_preemptive" -> "preemptive" [ label = "preempt_enable" ];
		"preemptive" [label = "preemptive"];
		"preemptive" -> "non_preemptive" [ label = "preempt_disable" ];
		{ rank = min ;
		"__init_preemptive";
		"preemptive";
		}
		}

		This DOT format can be transformed into a bitmap or vectorial image
		using the dot utility, or into an ASCII art using graph-easy. For
		instance::

		$ dot -Tsvg -o wip.svg wip.dot
		$ graph-easy wip.dot > wip.txt

		dot2c
		-----

		dot2c is a utility that can parse a .dot file containing an automaton as
		in the example above and automatically convert it to the C representation
		presented in [3].

		For example, having the previous 'wip' model into a file named 'wip.dot',
		the following command will transform the .dot file into the C
		representation (previously shown) in the 'wip.h' file::

		$ dot2c wip.dot > wip.h

		The 'wip.h' content is the code sample in section 'Deterministic Automaton
		in C'.

		Remarks
		-------

		The automata formalism allows modeling discrete event systems (DES) in
		multiple formats, suitable for different applications/users.

		For example, the formal description using set theory is better suitable
		for automata operations, while the graphical format for human interpretation;
		and computer languages for machine execution.

		References
		----------

		Many textbooks cover automata formalism. For a brief introduction see::

		O'Regan, Gerard. Concise guide to software engineering. Springer,
		Cham, 2017.

		For a detailed description, including operations, and application on Discrete
		Event Systems (DES), see::

		Cassandras, Christos G., and Stephane Lafortune, eds. Introduction to discrete
		event systems. Boston, MA: Springer US, 2008.

		For the C representation in kernel, see::

		De Oliveira, Daniel Bristot; Cucinotta, Tommaso; De Oliveira, Romulo
		Silva. Efficient formal verification for the Linux kernel. In:
		International Conference on Software Engineering and Formal Methods.
		Springer, Cham, 2019. p. 315-332.