Tutorial part 7: execution paths
A diagnostic can optionally have a diagnostic_execution_path describing a path of execution through code.
For example, let’s pretend we’re writing a static analysis tool for finding bugs in CPython extension code.
Let’s say we’re analyzing this code:
PyObject *
make_a_list_of_random_ints_badly(PyObject *self,
                                 PyObject *args)
{
  PyObject *list, *item;
  long count, i;

  if (!PyArg_ParseTuple(args, "i", &count)) {
    return NULL;
  }

  list = PyList_New(0);

  for (i = 0; i < count; i++) {
    item = PyLong_FromLong(random());
    PyList_Append(list, item);
  }

  return list;
}
This code attempts to take a Python integer parameter and then build a list of that length, containing random integers. However, there are numerous bugs in this code: a type mismatch, mistakes in reference-counting, and an almost total lack of error-handling.
For example, PyList_Append requires a non-NULL first parameter (list), but PyList_New can fail and return NULL. The code never checks for this, so a failed PyList_New would lead to a segfault in the call to PyList_Append.
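The unchecked-allocation pattern can be reproduced without CPython. The following is a minimal stand-alone sketch, not CPython code: toy_list, toy_list_new, toy_list_append and make_list_checked are hypothetical names invented for illustration, and the simulate_failure flag is an assumption used to force the failure path. It shows the check that the analyzed function omits.

```c
#include <stdlib.h>

/* Toy stand-in for a Python list; purely illustrative.  */
typedef struct {
  long *items;
  size_t len;
  size_t cap;
} toy_list;

/* Like PyList_New, this can fail and return NULL.
   SIMULATE_FAILURE forces the failure path for demonstration.  */
static toy_list *
toy_list_new (int simulate_failure)
{
  if (simulate_failure)
    return NULL;
  return calloc (1, sizeof (toy_list));
}

static void
toy_list_free (toy_list *list)
{
  if (list)
    {
      free (list->items);
      free (list);
    }
}

/* Like PyList_Append, this requires a non-NULL LIST.
   Returns 0 on success, -1 if growing the list fails.  */
static int
toy_list_append (toy_list *list, long value)
{
  if (list->len == list->cap)
    {
      size_t new_cap = list->cap ? list->cap * 2 : 4;
      long *new_items = realloc (list->items, new_cap * sizeof (long));
      if (!new_items)
        return -1;
      list->items = new_items;
      list->cap = new_cap;
    }
  list->items[list->len++] = value;
  return 0;
}

/* Checked counterpart of the buggy pattern: the result of the
   allocation is tested before it is passed on.  */
static toy_list *
make_list_checked (long count, int simulate_failure)
{
  toy_list *list = toy_list_new (simulate_failure);
  if (!list)
    return NULL; /* the check missing from the analyzed code */

  for (long i = 0; i < count; i++)
    if (toy_list_append (list, i) < 0)
      {
        toy_list_free (list);
        return NULL;
      }
  return list;
}
```

With simulate_failure set, make_list_checked returns NULL instead of dereferencing a NULL list; the path of events built in the rest of this section describes exactly the route that this check closes off.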
We can add a diagnostic_execution_path to the diagnostic via diagnostic_add_execution_path(), and then add events to it using diagnostic_execution_path_add_event().
For example, with:
diagnostic_event_id alloc_event_id
  = diagnostic_execution_path_add_event (path,
                                         loc_call_to_PyList_New,
                                         logical_loc, 0,
                                         "when %qs fails, returning NULL",
                                         "PyList_New");
we create an event that will be worded as:
(1) when 'PyList_New' fails, returning NULL
Note that diagnostic_execution_path_add_event() returns a diagnostic_event_id. We can use this to refer to this event in another event using the %@ format code in its message, which takes the address of a diagnostic_event_id:
diagnostic_execution_path_add_event (path,
                                     loc_call_to_PyList_Append,
                                     logical_loc, 0,
                                     "when calling %qs, passing NULL from %@ as argument %i",
                                     "PyList_Append", &alloc_event_id, 1);
where the latter event will be worded as:
(2) when calling 'PyList_Append', passing NULL from (1) as argument 1
where the %@ reference to the other event has been printed as (1).
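To make the numbering concrete, here is a toy model of how an event reference could be rendered. It is only a sketch and not how libgdiagnostics is implemented: toy_event_id, toy_path and the toy_* functions are hypothetical, and storing a zero-based index that is printed one-based is an assumption made for illustration.

```c
#include <stdio.h>

/* Toy stand-in for diagnostic_event_id (assumption: a zero-based
   index of the event within its path).  */
typedef struct {
  int zero_based_idx;
} toy_event_id;

/* Toy stand-in for an execution path: it only counts events.  */
typedef struct {
  int num_events;
} toy_path;

/* Reserve the next event in PATH and return its id.  */
static toy_event_id
toy_path_next_id (toy_path *path)
{
  toy_event_id id = { path->num_events++ };
  return id;
}

/* Write the "(N)" form of ID into BUF, using 1-based numbering,
   which is the role played by the %@ format code in the text above.  */
static int
toy_format_event_ref (char *buf, size_t size, const toy_event_id *id)
{
  return snprintf (buf, size, "(%d)", id->zero_based_idx + 1);
}

/* Render the second event from the example, which refers back to
   the allocation event ALLOC.  */
static int
toy_format_append_event (char *buf, size_t size,
                         const toy_event_id *self,
                         const toy_event_id *alloc)
{
  char self_ref[16];
  char alloc_ref[16];
  toy_format_event_ref (self_ref, sizeof self_ref, self);
  toy_format_event_ref (alloc_ref, sizeof alloc_ref, alloc);
  return snprintf (buf, size,
                   "%s when calling '%s', passing NULL from %s as argument %d",
                   self_ref, "PyList_Append", alloc_ref, 1);
}
```

The point of passing the id by address rather than pasting a literal "(1)" into the message is that the reference stays correct if events are later inserted before it, and a sink can turn it into something richer than plain text.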
In SARIF output the text “(1)” will have an embedded link referring within the sarif log to the threadFlowLocation object for the other event, via JSON pointer (see SARIF 2.1.0 §3.10.3 “URIs that use the sarif scheme”).
Let’s add an event between these describing control flow, creating three events in all:
diagnostic_execution_path *path = diagnostic_add_execution_path (d);

diagnostic_event_id alloc_event_id
  = diagnostic_execution_path_add_event (path,
                                         loc_call_to_PyList_New,
                                         logical_loc, 0,
                                         "when %qs fails, returning NULL",
                                         "PyList_New");
diagnostic_execution_path_add_event (path,
                                     loc_for_cond,
                                     logical_loc, 0,
                                     "when %qs", "i < count");
diagnostic_execution_path_add_event (path,
                                     loc_call_to_PyList_Append,
                                     logical_loc, 0,
                                     "when calling %qs, passing NULL from %@ as argument %i",
                                     "PyList_Append", &alloc_event_id, 1);
Assuming we also created a diagnostic_logical_location with:
const char *funcname = "make_a_list_of_random_ints_badly";
const diagnostic_logical_location *logical_loc
  = diagnostic_manager_new_logical_location (diag_mgr,
                                             DIAGNOSTIC_LOGICAL_LOCATION_KIND_FUNCTION,
                                             NULL, /* parent */
                                             funcname,
                                             funcname,
                                             funcname);
and finish the diagnostic with diagnostic_finish() like this:
diagnostic_finish (d,
                   "passing NULL as argument %i to %qs"
                   " which requires a non-NULL parameter",
                   1, "PyList_Append");
then we should get output to text sinks similar to the following:
In function 'make_a_list_of_random_ints_badly':
test-warning-with-path.c:30:5: warning: passing NULL as argument 1 to 'PyList_Append' which requires a non-NULL parameter
   30 |     PyList_Append(list, item);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~
  'make_a_list_of_random_ints_badly': events 1-3
   26 |   list = PyList_New(0);
      |          ^~~~~~~~~~~~~
      |          |
      |          (1) when 'PyList_New' fails, returning NULL
   27 |
   28 |   for (i = 0; i < count; i++) {
      |               ~~~~~~~~~
      |               |
      |               (2) when 'i < count'
   29 |     item = PyLong_FromLong(random());
   30 |     PyList_Append(list, item);
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~
      |     |
      |     (3) when calling 'PyList_Append', passing NULL from (1) as argument 1
and for SARIF sinks the path will be added as a codeFlow object (see SARIF 2.1.0 §3.36 codeFlow object).
Here’s the above example in full:
/* begin create phys locs */
const diagnostic_physical_location *loc_call_to_PyList_New
  = make_range (diag_mgr, main_file, line_num_call_to_PyList_New, 10, 22);
const diagnostic_physical_location *loc_for_cond
  = make_range (diag_mgr, main_file, line_num_for_loop, 15, 23);
const diagnostic_physical_location *loc_call_to_PyList_Append
  = make_range (diag_mgr, main_file, line_num_call_to_PyList_Append, 5, 29);
/* end create phys locs */

/* begin create logical locs */
const char *funcname = "make_a_list_of_random_ints_badly";
const diagnostic_logical_location *logical_loc
  = diagnostic_manager_new_logical_location (diag_mgr,
                                             DIAGNOSTIC_LOGICAL_LOCATION_KIND_FUNCTION,
                                             NULL, /* parent */
                                             funcname,
                                             funcname,
                                             funcname);
/* end create logical locs */

diagnostic *d = diagnostic_begin (diag_mgr,
                                  DIAGNOSTIC_LEVEL_WARNING);
diagnostic_set_location (d, loc_call_to_PyList_Append);
diagnostic_set_logical_location (d, logical_loc);

/* begin path creation */
diagnostic_execution_path *path = diagnostic_add_execution_path (d);

diagnostic_event_id alloc_event_id
  = diagnostic_execution_path_add_event (path,
                                         loc_call_to_PyList_New,
                                         logical_loc, 0,
                                         "when %qs fails, returning NULL",
                                         "PyList_New");
diagnostic_execution_path_add_event (path,
                                     loc_for_cond,
                                     logical_loc, 0,
                                     "when %qs", "i < count");
diagnostic_execution_path_add_event (path,
                                     loc_call_to_PyList_Append,
                                     logical_loc, 0,
                                     "when calling %qs, passing NULL from %@ as argument %i",
                                     "PyList_Append", &alloc_event_id, 1);
/* end path creation */

diagnostic_finish (d,
                   "passing NULL as argument %i to %qs"
                   " which requires a non-NULL parameter",
                   1, "PyList_Append");
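The last two arguments of each make_range call above (10 and 22, 15 and 23, 5 and 29) look like 1-based, inclusive start and end columns, judging by where the underlines fall in the sample output. The definition of make_range is not shown here, so that reading is an inference rather than a documented contract. Under that assumption, and assuming the analyzed file uses two-space indentation, the numbers can be cross-checked with a small helper; find_column_range is a hypothetical name introduced only for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: locate the first occurrence of NEEDLE in LINE
   and report its 1-based, inclusive column range.
   Returns 1 on success, 0 if NEEDLE is empty or not found.  */
static int
find_column_range (const char *line, const char *needle,
                   int *start_col, int *end_col)
{
  const char *hit;

  if (needle[0] == '\0')
    return 0;

  hit = strstr (line, needle);
  if (!hit)
    return 0;

  *start_col = (int) (hit - line) + 1;
  *end_col = *start_col + (int) strlen (needle) - 1;
  return 1;
}
```

For instance, searching the line "  list = PyList_New(0);" for "PyList_New(0)" gives columns 10 to 22, matching the first make_range call. A real tool would normally take such ranges from its parser or lexer rather than from a substring search, which is only adequate for a quick check like this one.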
Moving on
That’s the end of the tutorial. For more information on libgdiagnostics, see the topic guide.