CTest: MaxRecursionDepth failures

I saw https://gitlab.kitware.com/cmake/cmake/-/merge_requests/8302 and decided to try the MaxRecursionDepth CTest CMake self-test on an x86_64 8GB RAM Windows 11 laptop using:

  • Downloaded binary of CMake 3.20.6: depth 232
  • CMake master branch built with MinGW / MSYS2 GCC 12.2.0, Ninja, Debug: depth 232
  • CMake master branch built with MSVC 19.35, Debug: depth 722

With each of these ctest.exe builds, I observe that the MaxRecursionDepth self-test fails with “Stack overflow” on these tests: find_package-{default,default-script,invalid-var,invalid-var-script}. I haven’t paid attention to this test before, so maybe it’s just a known issue.

What is the output of

$ grep CMake_DEFAULT_RECURSION_LIMIT Source/cmConfigure.h

in the build tree?

#define CMake_DEFAULT_RECURSION_LIMIT 400
cmake -Bbuildv
-- Building for: Ninja
-- The C compiler identification is MSVC 19.35.32215.0
-- The CXX compiler identification is MSVC 19.35.32215.0
...
bin\ctest.exe -R RunCMake.MaxRecursionDepth -V

“Stack overflow” at depth 722 on find_package-{default,default-script,invalid-var,invalid-var-script}


cmake -Bbuildvr -DCMAKE_BUILD_TYPE=Release
-- Building for: Ninja
-- The C compiler identification is MSVC 19.35.32215.0
-- The CXX compiler identification is MSVC 19.35.32215.0
...
bin\ctest.exe -R RunCMake.MaxRecursionDepth -V

all pass

The recursion depth limit is determined by a table here, when CMake is built, based on the platform and compiler in use. I suspect that the size of the C++ stack frames in our implementation has gone up since that table was written.

Please try git bisect between when that table was added, and now, to determine if/when there was a jump in our stack usage.

OK these were first set about 4 years ago

Also, in my ad hoc checks I only see the stack overflow when CMake itself is built and tested in Debug mode.

With MSVC 19.35 and a CMake 3.18.4 Debug build, I see the same MaxRecursionDepth sub-tests each fail at depth 845. I wonder if this is a peculiarity of my computer, or also an interaction with the compiler version?

It also seems the CI isn’t catching this, even across many compiler versions.

– Does the CI do Debug MSVC builds?

That’s because the table I linked lowers the limit to 100 for CI and similar builds. The purpose is to make sure our own modules don’t recurse too much. Unfortunately it also means that the MaxRecursionDepth test is not actually covering the externally-visible limits.

OK. I see from these simple tests that, with the same compiler, there is a measurable but not dramatic increase in stack usage for this Debug test from 3.18.4 to master.

I wonder if it is useful to add CI test cases that would for example catch the change in stack usage that’s occurred over time, if that is significant, or if a new test should be formulated to monitor stack usage.

Perhaps we can change the recursion limit lookup to check an environment variable before using the compiled-in default limit. Then we can compile in the default limit even in CI builds, but set the environment variable during testing. Then in the MaxRecursionDepth test we can clear that environment variable to test the real limit.
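To make the idea concrete, here is a rough sketch of what such a lookup could look like. This is not CMake's actual code: the helper name GetRecursionLimit and the choice to pass the environment variable name as a parameter are assumptions for illustration only. The fallback of 400 is just the value reported earlier in this thread.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>

// Compiled-in fallback. 400 is the value reported earlier in this thread;
// a real build would get this from the generated configure header.
#ifndef CMake_DEFAULT_RECURSION_LIMIT
#  define CMake_DEFAULT_RECURSION_LIMIT 400
#endif

// Hypothetical helper (not part of CMake): return the limit from the
// environment variable `envName` if it holds a positive decimal integer,
// otherwise return the compiled-in default.
unsigned long GetRecursionLimit(const char* envName)
{
  const char* value = std::getenv(envName);
  // Require a leading digit so that signs and whitespace are rejected.
  if (value && std::isdigit(static_cast<unsigned char>(value[0]))) {
    char* end = nullptr;
    errno = 0;
    unsigned long parsed = std::strtoul(value, &end, 10);
    // Accept only a fully consumed, in-range, non-zero number.
    if (errno == 0 && *end == '\0' && parsed > 0) {
      return parsed;
    }
  }
  return CMake_DEFAULT_RECURSION_LIMIT;
}
```

With something like this, the test harness could export the variable (set to 100, say) for the whole test suite, while the MaxRecursionDepth test unsets it before running so that the compiled-in default is what actually gets exercised.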

See CMake MR 8307 for work on this.

Another result: on CentOS 8 with GCC 8.5.0, find_package-default-script segfaults at depth 1000.