CTest Script Mode ( -S ) and return codes

I wanted to start a discussion on CTest script mode, specifically return codes and how errors propagate silently up from the executed script.

This discussion stems from the transition of numerous projects, including CMake itself, over to CI systems that mark the success/failure of a job based on the error code returned from the executable.

My initial reasoning is that people expect CTest script mode to return an error state only if the script provided to it contains invalid CMake code.

Let's consider a trimmed down version of CMake’s gitlab-ci test stage:

.cmake_test_unix: &cmake_test_unix
    stage: test
    script:
        - "ctest --output-on-failure -V -S .gitlab/ci/ctest_test.cmake"

And the associated ctest_test.cmake

cmake_minimum_required(VERSION 3.8)

...
ctest_start(APPEND)
...
include("${CMAKE_CURRENT_LIST_DIR}/ctest_exclusions.cmake")
ctest_test(
  RETURN_VALUE test_result
  EXCLUDE "${test_exclusions}")
ctest_submit(PARTS Test)

if (test_result)
  message(FATAL_ERROR "Failed to test")
endif ()

Everything looks reasonable on an initial read of ctest_test.cmake. Looking at the FATAL_ERROR block, we reason that the script explicitly returns an error code (with a message) when any test fails and otherwise returns success.

In actuality, if any line of the CMake code itself causes a fatal error, the script will return an error. This makes sense: if the required include file is not found, the script had an unrecoverable error and should fail.

The problem is with error conditions that don’t terminate script execution; ctest_configure and ctest_test are good examples. Somebody could write the following configuration script:

ctest_start()
...
include("${CMAKE_CURRENT_LIST_DIR}/ctest_exclusions.cmake")
ctest_configure(OPTIONS "-DTRY_WITH_OPTION=ON"
                RETURN_VALUE config_result)
if(config_result)
  ctest_configure(OPTIONS "-DTRY_WITH_OPTION=OFF")
endif()

This script, when executed, will return a failure code any time the first ctest_configure fails, even if the second ctest_configure succeeded. For the ctest_test command this kind of surprise occurs if any test fails for an unexpected reason, such as a missing input file (NOT_RUN failures). **UPDATE: ctest_test doesn’t return an error code when a test fails normally.**

The correct script is actually:

ctest_start()
...
include("${CMAKE_CURRENT_LIST_DIR}/ctest_exclusions.cmake")
ctest_configure(OPTIONS "-DTRY_WITH_OPTION=ON"
                RETURN_VALUE config_result
                CAPTURE_CMAKE_ERROR has_error)
if(has_error)
  ctest_configure(OPTIONS "-DTRY_WITH_OPTION=OFF")
endif()

Note:
I didn’t drop the usage of RETURN_VALUE because CAPTURE_CMAKE_ERROR stores -1 on failure rather than the actual error code that was returned.
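
To make the distinction concrete, here is a minimal sketch (variable names are purely illustrative) of what each option reports on a ctest_configure call, assuming CAPTURE_CMAKE_ERROR behaves as its documentation describes:

ctest_configure(OPTIONS "-DTRY_WITH_OPTION=ON"
                RETURN_VALUE config_result        # exit code of the configure step, 0 on success
                CAPTURE_CMAKE_ERROR config_error) # -1 if the command itself hit an error, and keeps
                                                  # that error from reaching the ctest -S exit code
if(config_error EQUAL -1)
  message(STATUS "ctest_configure itself reported an error")
elseif(NOT config_result EQUAL 0)
  message(STATUS "configure ran but exited with code ${config_result}")
endif()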

Coming back to the original ctest_test.cmake script, we now have to figure out what we consider to be a failed job. If we consider any failed test to merit a failed job, the original code is correct.
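
If that is the policy we want, a sketch of one way to keep it while making the exit code fully explicit (again assuming CAPTURE_CMAKE_ERROR behaves as documented) would be:

ctest_test(
  RETURN_VALUE test_result            # non-zero if any test failed
  CAPTURE_CMAKE_ERROR test_error      # keeps NOT_RUN style surprises from setting the exit code behind our back
  EXCLUDE "${test_exclusions}")
ctest_submit(PARTS Test)

# Decide explicitly what a failed job means: here, any failed test or any unexpected error.
if(NOT test_result EQUAL 0 OR test_error EQUAL -1)
  message(FATAL_ERROR "Failed to test")
endif()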

So the problems/issues I would like to discuss are:

  • The lack of documentation on how CTest -S handles error codes. Nowhere in the primary CTest manual do we document how error codes propagate.

  • The lack of documentation in each of the ctest_ commands stating roughly when it will cause an error code but still allow execution to continue. Nowhere except in the small print of CAPTURE_CMAKE_ERROR do we even hint that these commands propagate errors.

  • The inability to clear the error state left by previous commands, which means that if your script includes other CMake/CTest code you have zero control over error propagation (see the sketch after this list).

  • The poorly named CAPTURE_CMAKE_ERROR option. It isn’t capturing CMake internal errors, nor is it actually storing the error code that was generated. A better name would be STOP_ERROR_CODE_PROPAGATION.

  • The foot-gun that is RETURN_VALUE without CAPTURE_CMAKE_ERROR. As the above examples show, ‘correct’ code will almost always need either CAPTURE_CMAKE_ERROR alone or both values to properly handle return codes.
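
To illustrate the third point about error state, imagine a shared snippet that several CI scripts include (the file names here are made up for the example):

# shared/configure_and_build.cmake -- common snippet reused by several CI scripts
ctest_configure(RETURN_VALUE config_result)   # no CAPTURE_CMAKE_ERROR here
ctest_build(RETURN_VALUE build_result)

# ci/ctest_test.cmake -- the including script
include("${CMAKE_CURRENT_LIST_DIR}/shared/configure_and_build.cmake")
# If ctest_configure above reported an error, the ctest -S exit code is already
# set, and there is no command this script can call to clear that state, even
# if it decides the job should still pass.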

Doc Updates

Are we okay with updating the documentation section of each ctest_ command to explicitly document how these commands affect the calling script?

For example the ctest_test command would be updated to be:

`
Run tests in the project build tree and store the test results in Test.xml for submission with the ctest_submit() command.

If all tests are executed, the calling script will continue to execute regardless of whether they succeed or fail. If you want to check whether any tests failed you will need to look at the contents of RETURN_VALUE. If you need the calling ctest -S executable to have a non-zero return code you will need to raise an error with something like message(FATAL_ERROR "...").

If any test fails for an unexpected reason, such as a missing input file (NOT_RUN), the calling script will continue, but the executable (ctest -S) will have a non-zero return code. If this automatic error propagation is not wanted you will need to use the CAPTURE_CMAKE_ERROR option.
`

Return value option

Should we deprecate RETURN_VALUE and CAPTURE_CMAKE_ERROR and instead offer a single option that does both?
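
Purely for the sake of discussion, a hypothetical sketch of what calling code might look like with such a combined option; neither the RESULT option name nor its value convention exists today:

# Hypothetical only: a single option reporting both the tool result and command errors.
ctest_test(RESULT test_status
           EXCLUDE "${test_exclusions}")
if(test_status STREQUAL "CMAKE_ERROR")
  message(FATAL_ERROR "ctest_test itself failed")
elseif(NOT test_status EQUAL 0)
  message(FATAL_ERROR "Failed to test")
endif()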


ctest_test() does not cause ctest -S to return a non-zero code if a test fails. Are you saying we should change it to do so (with a policy)?

You are correct.

The wording should be "If any test fails for an unexpected reason, such as a missing input file". I will edit the original post.

@kyle.edwards I have updated the original post.

The corrections for ctest_test that @kyle.edwards pointed out bring up the following inconsistent behavior:

cmake_minimum_required(VERSION 3.8)

...
ctest_start(APPEND)
...
include("${CMAKE_CURRENT_LIST_DIR}/ctest_exclusions.cmake")
ctest_test(
  RETURN_VALUE test_result
  EXCLUDE "${test_exclusions}")
ctest_submit(PARTS Test)

If we presume no parse errors, I think we should expect that this script will always return 0. As it stands, the only exception I am aware of is:

  • tests marked as NOT_RUN cause a non-zero return value

It might be wise to check a few different CMake/CTest versions for these observations. I think there were changes recently around the ctest return code (related to when there are no tests at all). Probably no difference in behaviour, but worth confirming in case some of this is a regression.