Parsing data efficiently (JSON or maybe something else?)

epistax · July 21, 2025, 12:36pm

We have a hybrid setup where we are introducing CMake piece by piece on top of an existing complete build environment. The first goal for us is introducing a tight TDD cycle. We’re making some headway using JSON to share some common data between the build systems, such as warning flags.

We have a list of about 2000 interfaces that are mocked as part of the unit testing build. We’re using CMake’s JSON support to read the JSON list into a CMake list, iterate through it, and create the custom commands and targets that generate the mocks. Using CMake’s profiler on one system showed the loop taking about 70% of the overall configuration time, with 30% spent just parsing the JSON list.

The overall time is not a lot so we don’t need to change it, but I wanted to see if we were doing things in a reasonable fashion for CMake. We have our dependencies set up so if there is a change to a mocked interface, it gets re-mocked. If there is a change to the JSON list, configure re-runs.

Thanks!

vito.gamberini · July 21, 2025, 6:59pm

Parsing JSON inside CMakeLang is about the worst-case scenario for any pipeline. JSON is ubiquitous, so it’s a necessary feature, but any opportunity to avoid it should be leveraged.

If it becomes a problem, the answer is to not use CMake for the parsing. Write your own script in a “real” programming language that turns the JSON into add_custom_command()/add_custom_target() calls, record them in a file, and include() that file from your CMake project.
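
A minimal sketch of that approach in Python, under some assumptions that are not from the thread: the JSON file is a flat array of header paths relative to the source directory, the mock generator is a hypothetical command-line tool called `mockgen` that takes `<input> -o <output>`, and the `mocks/` output directory and the `generate_mocks` target name are placeholders.

```python
import json
from pathlib import Path


def render_cmake(interfaces, tool="mockgen"):
    """Return CMake source with one mock-generation command per interface.

    `tool` is a hypothetical mock generator; substitute your own command line.
    """
    lines = ["# Generated file; do not edit by hand."]
    outputs = []
    for header in interfaces:
        stem = Path(header).stem
        src = f"${{CMAKE_CURRENT_SOURCE_DIR}}/{header}"
        out = f"${{CMAKE_CURRENT_BINARY_DIR}}/mocks/mock_{stem}.h"
        outputs.append(out)
        lines += [
            "add_custom_command(",
            f'  OUTPUT "{out}"',
            f'  COMMAND {tool} "{src}" -o "{out}"',
            # DEPENDS makes the mock regenerate when the interface changes.
            f'  DEPENDS "{src}"',
            "  VERBATIM)",
        ]
    # One aggregate target (assumed name) that depends on every generated mock.
    lines.append("add_custom_target(generate_mocks DEPENDS")
    lines += [f'  "{out}"' for out in outputs]
    lines.append(")")
    return "\n".join(lines) + "\n"


def write_cmake(json_path, cmake_path):
    """Read the JSON interface list and write the file that CMake will include()."""
    interfaces = json.loads(Path(json_path).read_text(encoding="utf-8"))
    Path(cmake_path).write_text(render_cmake(interfaces), encoding="utf-8")
```

Calling `write_cmake("interfaces.json", "mocks.cmake")` (both file names are placeholders) produces a file the project can pull in with include(). How the script gets run, for example from the existing build system or via execute_process() at configure time, is a design choice this sketch leaves open.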

epistax · July 21, 2025, 7:07pm

Thanks! This makes sense.