Feature request: Universal enforcement of CMake script formatting

Hi,

I think official enforcement of “correct” CMake code formatting would make reading and writing CMake code significantly faster and easier. I’d like to see a tool from Kitware that ships with CMake and can be run on a CMake file to make it comply with Kitware’s official style guidelines for CMake code format. I’d even go so far as to have CMake complain about formatting at configuration-time.

Motivation:

  • I wish I had auto-formatting tools when writing CMake code,
  • I really wish other developers in my company had auto-formatting tools when writing CMake code,
  • I think arguing about CMake formatting guidelines is a form of bikeshedding - I’d rather not think about it and blindly follow the official guidelines,
  • If you google “CMake coding style”, the first three results are from KDE, ROS, and JetBrains. I bet they wish they could stop publishing guidelines and just say “use the official tool”,
  • In this talk, Chandler Carruth (Google DevOps, C++ standards committee, all-round smart guy) says “[clang-format, a C++ auto-formatting tool, is] the single largest improvement in your productivity as a developer”. I think the same might apply to CMake.

Caveats:

  • If Kitware chooses a different standard from the one an existing codebase already follows, that codebase’s maintainers might be upset. If there’s an auto-formatting tool to do the conversion then maybe they won’t be very upset,
  • I’m aware of tools like this, which are a great start, but I’d like to see greater adoption and more universality,
  • I’m not a CMake developer, nor have I ever written any lexing/parsing code, so I have no idea how much effort this entails.

Thanks,
Tom

CMake’s…loose syntax makes this very hard. Knowing whether a function argument is a keyword or not, how many arguments it takes after that, etc. is really hard. Not to mention what kind of effects quoting or variable expansions have at various places, etc. “Standard” commands could probably have some standard syntax formatting since the argument sets are “known”, but user-defined commands can have all kinds of logic applied to them.

FWIW, my rules are basically along the lines of:

  • 2 space indent
  • spaces after “control flow” commands (if, else, function, while, return, continue, end*, etc.), otherwise no space
  • quote all variable expansions except:
    • when if() cares
    • when it is a list of arguments
  • for argument blocks, vertical alignment is good (whether hanging indents or not)
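
To make the rules above concrete, here is a small sketch of how I read them, written as a CMakeLists.txt fragment. The add_plugin helper and its arguments are made up for illustration, and the comment on if() is my interpretation of “when if cares”, not an official guideline.

```cmake
# Hypothetical helper; the name and its arguments are invented for this example.
function (add_plugin name)
  # Unquoted on purpose: ARGN is a list of arguments.
  add_library("${name}" MODULE ${ARGN})

  # if() looks up a bare variable name itself, so no "${...}" here.
  if (BUILD_TESTING)
    target_compile_definitions("${name}"
      PRIVATE
        PLUGIN_TESTING=1)
  endif ()

  # Argument block with vertically aligned values.
  set_target_properties("${name}"
    PROPERTIES
      PREFIX      ""
      OUTPUT_NAME "plugin_${name}")
endfunction ()
```

Control-flow commands (function, if, endif, endfunction) get a space before the parenthesis; ordinary commands do not.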

Not to discourage anyone else from trying, but figuring out how to apply these rules to arbitrary code is not something I think is a good use of my time at least.

One thing on my wishlist for CMake is a documentation generation tool for CMake scripts and modules – perhaps, down the line, these two things could be integrated? It stands to reason that if the developer is telling the docs tool, “for this function, these are the options, these are the single value args, these are the multi value args”, then a lexer/linter/formatter tool could use that information to provide hints or reformatting at each call site of said function.

It may not even require much extra effort from the developer, since ~90% of the time I see cmake_parse_arguments used it’s in pretty much exactly this form:

set (options "")
set (oneValueArgs "")
set (multiValueArgs "")

cmake_parse_arguments (PREFIX "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})
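
As a filled-in instance of that form, here is a minimal sketch that should run with cmake -P. The function my_add_test and all of its keywords are hypothetical; the point is only that the three set() calls already tell a tool which words are keywords and how many values each one takes.

```cmake
# Hypothetical function; every name here is made up for illustration.
function (my_add_test)
  set (options VERBOSE)
  set (oneValueArgs NAME TIMEOUT)
  set (multiValueArgs SOURCES LABELS)

  cmake_parse_arguments (ARG "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})

  message (STATUS "name=${ARG_NAME} timeout=${ARG_TIMEOUT} verbose=${ARG_VERBOSE}")
  message (STATUS "sources=${ARG_SOURCES} labels=${ARG_LABELS}")
endfunction ()

# A formatter that knew the three keyword lists could lay the call out
# with one keyword per line and its values aligned:
my_add_test (VERBOSE
    NAME    parser_test
    TIMEOUT 30
    SOURCES a.cpp b.cpp
    LABELS  unit fast)
```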

I could even see supporting in-code documenting of function arguments with something like this:

set (options 
        OPTION_1 # doc string for option 1
        OPTION_2 # doc string for option 2
    )

set (oneValueArgs
        ARG_1 # doc string for arg 1
        ARG_2 # doc string for arg 2
    )

Just for the record, formatters usually don’t need to follow include paths to format the code. Since CMake files are rarely standalone, a formatter that wanted to use such API docstrings would also need to look through everything that is included before the file in question is executed. And since include() usually relies on runtime manipulation of CMAKE_MODULE_PATH, a full CMake interpreter would need to run just to be a “formatter” (though a lint tool which does deeper static analysis would be able to do this).
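
A small sketch of the problem, with made-up names (USE_VENDORED_HELPERS, HELPERS_ROOT, MyHelpers, and my_helper are all hypothetical). Which file include() ends up loading, and so which keywords my_helper() accepts, depends on variables that are only known once CMake actually runs:

```cmake
# The search path is decided at configure time from (hypothetical) variables.
if (USE_VENDORED_HELPERS)
  list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/third_party/cmake")
else ()
  list(APPEND CMAKE_MODULE_PATH "${HELPERS_ROOT}/cmake")
endif ()

# A purely textual tool cannot tell which MyHelpers.cmake this resolves to.
# OPTIONAL keeps the sketch runnable when no such module exists.
include(MyHelpers OPTIONAL)

# Formatting this call "by keyword" needs the definition found above.
if (COMMAND my_helper)
  my_helper(TARGET foo SOURCES foo.cpp)
endif ()
```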

Another case that came to mind is APIs that wrap another API. They “skim off” a set of arguments and then pass the rest off to some other API. Getting such information is difficult. There’s also the “multiple arguments that take multiple values” case, where each instance of the keyword argument needs to be handled individually, which cmake_parse_arguments just doesn’t handle today (IME, this is far rarer than the “wrapper” API).
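
A minimal sketch of the “wrapper” case, as a CMakeLists.txt fragment (my_add_executable and its INSTALL option are hypothetical). The wrapper consumes one keyword of its own and forwards everything else through the UNPARSED_ARGUMENTS variable that cmake_parse_arguments provides:

```cmake
# Hypothetical wrapper around add_executable(); names invented for this example.
function (my_add_executable name)
  # Skim off the wrapper's own option; no one-value or multi-value keywords.
  cmake_parse_arguments(ARG "INSTALL" "" "" ${ARGN})

  # Everything the wrapper did not recognize is passed on untouched.
  add_executable("${name}" ${ARG_UNPARSED_ARGUMENTS})

  if (ARG_INSTALL)
    install(TARGETS "${name}" RUNTIME DESTINATION bin)
  endif ()
endfunction ()

# INSTALL belongs to the wrapper; WIN32 and the sources belong to add_executable().
my_add_executable(tool INSTALL WIN32 main.cpp util.cpp)
```

To lay out that last call sensibly, a tool would need to know the keywords of both the wrapper and the command it forwards to.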
