Why not to add `dict()` to CMake?

alex · June 8, 2021, 3:01am

I believe that implementing richer data structures as opaque types that must be manipulated through dedicated commands is perfectly workable. In particular, I don’t see why the full set of JSON types isn’t workable, dicts and lists being the top two. It seems to me that JSON is already able to represent every value in the CMake language (currently only strings), so we already have an unambiguous serialization for everything…

If the author directly asks for a special data structure, they should be responsible for moving it between its various representations. This is more what I mean, step by step:

dict(NEW my_dict)                   # my_dict is a hash table in memory
dict(INSERT my_dict "key" "value")  # insert (key, value) into the dict
message(STATUS "${my_dict}")        # only now do we cast to a JSON string
my_func(${my_dict})                 # pass the dict as a single argument
my_func("${my_dict}")               # pass JSON string version
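For comparison, here is a rough sketch of how close you can get today with the existing `string(JSON ...)` command (CMake 3.19+). The difference from the proposal is that the "dict" here is always a JSON string, never an in-memory hash table, so every access re-parses it. `my_func` is a hypothetical function, as in the proposal above.

```cmake
cmake_minimum_required(VERSION 3.19)

# The "dict" is just a JSON object stored in an ordinary string variable.
set(my_dict "{}")

# SET expects the value to be valid JSON, so a string value needs its own quotes.
string(JSON my_dict SET "${my_dict}" "key" "\"value\"")

# Look the value back up; GET returns the string without the JSON quotes.
string(JSON value GET "${my_dict}" "key")
message(STATUS "value for 'key': ${value}")

# Hypothetical consumer: receives the whole JSON document as one argument.
function(my_func json)
  string(JSON count LENGTH "${json}")
  message(STATUS "my_func got an object with ${count} member(s)")
endfunction()

my_func("${my_dict}")
```

This runs in script mode (`cmake -P`), but it only approximates the proposal: there is no distinction between the structure and its serialization.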

The exact same approach could be used to support real, nested lists with arbitrary content.

list(NEW my_list "a" "b;c" "d")
list(LENGTH my_list len)         # len is 3
message(STATUS "${my_list}")     # -- ["a", "b;c", "d"]
my_func(${my_list})              # receives three arguments
my_func("${my_list}")            # receives one argument, JSON repr
list(TO_CMAKE_LIST my_list var)  # dev warning: loss of fidelity
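To see why "arbitrary content" matters, here is a small sketch contrasting today's semicolon-separated lists with a JSON array read through the existing `string(JSON ...)` command (CMake 3.19+). The variable names are only for illustration.

```cmake
cmake_minimum_required(VERSION 3.19)

# Today's lists are strings split on ";", so "b;c" silently becomes two elements.
set(flat_list "a" "b;c" "d")
list(LENGTH flat_list flat_len)
message(STATUS "flat list length: ${flat_len}")   # 4, not 3

# A JSON array keeps "b;c" as a single element.
set(json_list [=[ ["a", "b;c", "d"] ]=])
string(JSON json_len LENGTH "${json_list}")
message(STATUS "JSON array length: ${json_len}")  # 3

# Elements are addressed by zero-based index.
string(JSON second GET "${json_list}" 1)
message(STATUS "second element: ${second}")       # b;c
```

The proposed `list(NEW ...)` would give the second behavior without the round trip through a JSON string on every access.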

You could also get these compound structures out of JSON:

json(PARSE [=[ ["a", "b;c", "d" ] ]=] VAR var TYPE ty)
# var is a structured list now, "${ty}" is "list" (to match the function name
# and make cmake_language(CALL ${ty}) practical)
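The type-driven dispatch idea can already be sketched with existing commands: `string(JSON ... TYPE ...)` reports the JSON type (as `ARRAY`, `OBJECT`, and so on, rather than the `list` name proposed above) and `cmake_language(CALL ...)` invokes a command by computed name. The `handle_*` functions are hypothetical helpers invented for this example.

```cmake
cmake_minimum_required(VERSION 3.19)

# Hypothetical handlers, one per JSON type we care about.
function(handle_ARRAY json)
  string(JSON n LENGTH "${json}")
  message(STATUS "array with ${n} element(s)")
endfunction()

function(handle_OBJECT json)
  string(JSON n LENGTH "${json}")
  message(STATUS "object with ${n} member(s)")
endfunction()

set(doc [=[ ["a", "b;c", "d" ] ]=])

# TYPE yields one of NULL, NUMBER, STRING, BOOLEAN, ARRAY, OBJECT.
string(JSON ty TYPE "${doc}")

# Dispatch on the type name; quoting keeps the document a single argument.
cmake_language(CALL "handle_${ty}" "${doc}")
```

Having the type name match an actual command name, as proposed, would remove the need for the `handle_` wrappers.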

This really doesn’t seem impractical. It’s more about “should we” than about “can we”, and I think the argument for lists is strong, while for dicts it is somewhat weaker.

Sure, but I don’t see how that’s relevant. I could observe that CMake has a script mode that is (ab)used pretty widely. A domain-specific programming language is still a programming language.

That’s not a very good language design ethos. Why don’t we all program in Brainf**k? None of these fancy named variables are unavoidably necessary. We can all just agree that the CMake reserved variables have documented, reserved positions on the tape.

I’m being a little facetious, but you’re assuming some universal set of values re: pain-versus-productivity for a build DSL, which there is not.

The design goals do not make CMake any less Turing complete or any less a programming language. Disliking the fact that CMake is a programming language does not make it any less true. You can argue for the merits (or not) of adding any particular feature to CMake, but you can’t build your argument on a falsehood.

It’s also worth noting that you can view CMake as a related pair of programming languages:

1. The scripting language that runs during the configure step.
2. The inputs to the generator step. Those inputs are themselves a sort of sub-Turing-complete DSL for describing builds, and their semantics are declarative. Generator expressions are able to carry out substantial computations on strings, too.

So for any feature, you’d have to consider which language (or both) to add it to.
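As a small illustration of the split, the sketch below builds a list in the configure-time scripting language and then transforms it with generator expressions, which are only evaluated at generate time. The project name and output file name are made up for the example.

```cmake
cmake_minimum_required(VERSION 3.19)
project(genex_demo LANGUAGES NONE)

# Language 1: configure-time script. This runs immediately.
set(words alpha beta gamma)
list(LENGTH words n)
message(STATUS "configure step sees ${n} words")

# Language 2: generator expressions. The $<...> text is stored as-is now
# and evaluated later, during the generate step.
# Expected file content: ALPHA-BETA-GAMMA
file(GENERATE
  OUTPUT "${CMAKE_BINARY_DIR}/words.txt"
  CONTENT "$<UPPER_CASE:$<JOIN:${words},->>\n")
```

Note that `${words}` is expanded by the first language before the second ever sees it, which is exactly why a new data structure would need a defined behavior in both.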