I wanted to start a discussion on CTest script mode, specifically return codes and how errors propagate silently up from the executed script.
This discussion stems from the transition of numerous projects, including CMake itself, to CI systems that mark a job as passed or failed based on the exit code returned by the executable.
My working assumption is that people expect CTest script mode to return an error state only if the script provided to it contains invalid CMake code.
Let’s consider a trimmed-down version of CMake’s gitlab-ci test stage:
.cmake_test_unix: &cmake_test_unix
  stage: test
  script:
    - "ctest --output-on-failure -V -S .gitlab/ci/ctest_test.cmake"
And the associated ctest_test.cmake:
cmake_minimum_required(VERSION 3.8)
...
ctest_start(APPEND)
...
include("${CMAKE_CURRENT_LIST_DIR}/ctest_exclusions.cmake")
ctest_test(
  RETURN_VALUE test_result
  EXCLUDE "${test_exclusions}")
ctest_submit(PARTS Test)
if (test_result)
  message(FATAL_ERROR "Failed to test")
endif ()
Everything looks reasonable on an initial read of ctest_test.cmake. Looking at the FATAL_ERROR block, we reason that the script explicitly returns an error code via message() and otherwise returns success.
In actuality, if any line of the CMake code itself causes a fatal error, the script will return an error. This makes sense: if the required include file is not found, the script has hit an unrecoverable error and should fail.
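As a small illustration of the fatal-error case, the sketch below makes the missing-include failure explicit instead of implicit. It relies only on the documented OPTIONAL and RESULT_VARIABLE arguments of include(); the variable names exclusions_file and exclusions_found are made up for this example.

```cmake
# File name taken from the script above; the variable names are illustrative.
set(exclusions_file "${CMAKE_CURRENT_LIST_DIR}/ctest_exclusions.cmake")

# OPTIONAL suppresses the built-in error for a missing file.
# RESULT_VARIABLE receives the full path of the included file,
# or NOTFOUND if the file could not be included.
include("${exclusions_file}" OPTIONAL RESULT_VARIABLE exclusions_found)

if(NOT exclusions_found)
  # Fail loudly and on purpose, so the non-zero exit code is no surprise.
  message(FATAL_ERROR "Missing required file: ${exclusions_file}")
endif()
```

Either way the script stops and the run fails; the only difference is that here the failure is spelled out in the script itself.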
The problem is with error conditions that don’t terminate script execution. Commands such as ctest_configure or ctest_test are good examples. For example, somebody could write the following configuration script:
ctest_start()
...
include("${CMAKE_CURRENT_LIST_DIR}/ctest_exclusions.cmake")
ctest_configure(OPTIONS "-DTRY_WITH_OPTION=ON"
  RETURN_VALUE config_result)
# A non-zero return value means the first configure failed, so retry.
if(config_result)
  ctest_configure(OPTIONS "-DTRY_WITH_OPTION=OFF")
endif()
When executed, this script returns a failure code any time the first ctest_configure fails, even if the second ctest_configure succeeds. For the ctest_test command, this kind of surprise occurs if any test fails for an unexpected reason, such as a missing file (NOT_RUN failures). **UPDATE: ctest_test doesn’t return an error code when a test fails normally**
The correct script is actually:
ctest_start()
...
include("${CMAKE_CURRENT_LIST_DIR}/ctest_exclusions.cmake")
ctest_configure(OPTIONS "-DTRY_WITH_OPTION=ON"
  RETURN_VALUE config_result
  CAPTURE_CMAKE_ERROR has_error)
# has_error is -1 when the first configure hit an error, so retry.
if(has_error)
  ctest_configure(OPTIONS "-DTRY_WITH_OPTION=OFF")
endif()
Note: I didn’t drop the usage of RETURN_VALUE, because CAPTURE_CMAKE_ERROR is set to -1 on failure rather than to the actual error code that was returned.
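To see the difference between the two options, a script can log both values. This is only a fragment: it assumes CTEST_SOURCE_DIRECTORY and CTEST_BINARY_DIRECTORY are set and ctest_start() has already run, and the variable names config_result and config_error are illustrative.

```cmake
# Fragment: assumes ctest_start() has already been called earlier in the script.
ctest_configure(
  OPTIONS "-DTRY_WITH_OPTION=ON"
  RETURN_VALUE config_result          # return value of the native configuration tool
  CAPTURE_CMAKE_ERROR config_error)   # -1 if the command hit an error

# Log both so the real exit code is not lost behind the -1 flag.
message(STATUS "configure return value: ${config_result}")
message(STATUS "captured error flag:    ${config_error}")
```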
Coming back to the original ctest_test.cmake script, we now have to decide what we consider to be a failed job. If we consider any failed test to merit a failed job, the original code is correct.
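If the opposite policy is wanted, where ordinary test failures are reported but do not fail the job, one possible shape is sketched below. The paths are placeholders, the submit step is left out for brevity, and the sketch assumes that CAPTURE_CMAKE_ERROR behaves as described in this post (set to -1 when the command itself hits an error, and suppressing the automatic non-zero exit).

```cmake
cmake_minimum_required(VERSION 3.8)

# Hypothetical paths; a real script would point at its own checkout and at
# a build tree that is already configured and built.
set(CTEST_SOURCE_DIRECTORY "/path/to/source")
set(CTEST_BINARY_DIRECTORY "/path/to/build")

ctest_start(Experimental)

ctest_test(
  RETURN_VALUE test_result           # non-zero if anything went wrong
  CAPTURE_CMAKE_ERROR test_error)    # -1 if the command itself hit an error

if(test_error EQUAL -1)
  # Treat problems running the tests as a failed job.
  message(FATAL_ERROR "ctest_test reported an error while running the tests")
elseif(test_result)
  # Ordinary test failures are logged but do not fail the job.
  message(STATUS "Some tests failed (result: ${test_result}); not failing the job")
endif()
```

The point of the sketch is that the job's pass/fail policy is written down in the script rather than left to implicit error propagation.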
So the problems/issues I would like to discuss are:
- The lack of documentation on how ctest -S handles error codes. Nowhere in the primary CTest manual do we document how error codes propagate.
- The lack of documentation in each of the ctest_ commands on roughly when the command will cause an error code to occur but still allow execution to continue. Nowhere except in the small print of CAPTURE_CMAKE_ERROR do we even hint that these commands propagate errors.
- The inability to clear the error state of previous commands, which means that if your script includes other CMake/CTest code you have zero control over error propagation.
- The poorly named CAPTURE_CMAKE_ERROR option. It isn’t capturing CMake internal errors, nor is it actually storing the error code that was generated. A better name would be STOP_ERROR_CODE_PROPAGATION.
- The foot gun that is RETURN_VALUE without CAPTURE_CMAKE_ERROR. As the above examples show, ‘correct’ code will almost always need either just the capture or both values to properly handle return codes.
Doc Updates
Are we okay with updating the doc section of each ctest_ command to explicitly document how these commands affect the calling script?
For example, the ctest_test command documentation would be updated to read:
`
Run tests in the project build tree and store the test results in Test.xml for submission with the ctest_submit() command.
If all tests are executed, the calling script will continue to execute whether or not they succeed. To check whether any tests failed, look at the variable given to RETURN_VALUE. If you need the calling ctest -S executable to have a non-zero return code, you will need to raise an error with something like message(FATAL_ERROR "...").
If any test fails for an unexpected reason (NOT_RUN), the calling script will continue, but the executable (ctest -S) will have a non-zero return code. If this automatic error propagation is not wanted, use the CAPTURE_CMAKE_ERROR option.
`
Return value option
Should we deprecate RETURN_VALUE and CAPTURE_CMAKE_ERROR and instead offer a single option that does both?
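Until something like that exists, a script-side wrapper can approximate a single combined result. The function below is purely hypothetical (checked_ctest_configure is not part of CMake) and is a minimal sketch that forwards its remaining arguments to ctest_configure and folds both values into one boolean.

```cmake
# Hypothetical helper, not a CMake built-in: runs ctest_configure with both
# options and reports a single success flag in <out_var>.
function(checked_ctest_configure out_var)
  ctest_configure(${ARGN}
    RETURN_VALUE _rv
    CAPTURE_CMAKE_ERROR _err)

  # Treat either a captured error (-1) or a non-zero return value as failure.
  if(_err EQUAL -1 OR NOT _rv EQUAL 0)
    set(${out_var} FALSE PARENT_SCOPE)
  else()
    set(${out_var} TRUE PARENT_SCOPE)
  endif()
endfunction()

# Usage, mirroring the retry example from earlier in this post:
#   checked_ctest_configure(configured OPTIONS "-DTRY_WITH_OPTION=ON")
#   if(NOT configured)
#     checked_ctest_configure(configured OPTIONS "-DTRY_WITH_OPTION=OFF")
#   endif()
```

Callers must not pass RETURN_VALUE or CAPTURE_CMAKE_ERROR themselves, since the wrapper already supplies them. The wrapper also discards the actual return code, which is the same trade-off a single built-in option would have to settle.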