CMake Pre-Build command

developer · April 23, 2020, 7:27pm

I am trying to have a formatting tool (AStyle) run before building a given target.

There is the add_custom_command that can be associated with a target and specified as PRE_BUILD. Unfortunately, this doesn’t work with Makefiles. Is there a way to do this such that I can run a command only when the target has to be rebuilt?

I’ve tried using add_custom_command without a target and add_custom_target, but it seems both of these don’t quite do the same thing. Ideally, I wouldn’t have to have my command run every time make is run. Instead I would prefer the command to run only when the specific target has to be rebuilt.

ben.boeckel · April 27, 2020, 1:54pm

AFAIK, there’s no good way of doing this today. That said, targets which change your source tree is usually not a good idea. I suggest keeping it as an explicit target (rather than implicit) and use CI or git hooks to enforce style rather than the build system.

starseeker · April 27, 2020, 3:06pm

We’re not set up to do it before building, but we’ve been able to have a style-related build fail before linking, with something similar to the following. (We may have introduced a custom mod to do asyle without actually modifying the file - see https://sourceforge.net/p/brlcad/code/HEAD/tree/brlcad/trunk/misc/tools/astyle/ for the version we’re using)

function(VALIDATE_STYLE targetname srcslist)

if (STYLE_VALIDATE AND ASTYLE_EXECUTABLE AND NOT "${srcslist}" STREQUAL "")

    set(fullpath_srcslist)
    foreach(srcfile ${srcslist})
      if (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/${srcfile}")
        set(fullpath_srcslist ${fullpath_srcslist} "${CMAKE_CURRENT_SOURCE_DIR}/${srcfile}")
      endif (EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/${srcfile}")
    endforeach(srcfile ${srcslist})

    # If we have a list that isn't empty, use it
    if (fullpath_srcslist)
      add_custom_command(
    TARGET ${targetname} PRE_LINK
        COMMAND "${ASTYLE_EXECUTABLE}" --report --options=${CMAKE_SOURCE_DIR}/misc/astyle.opt ${fullpath_srcslist}
        COMMENT "Checking formatting of ${targetname} srcs"
    )
      if(TARGET astyle)
        add_dependencies(${targetname} astyle)
      endif(TARGET astyle)
    endif(fullpath_srcslist)

  endif(STYLE_VALIDATE AND ASTYLE_EXECUTABLE AND NOT "${srcslist}" STREQUAL "")

endfunction(VALIDATE_STYLE)

developer · April 27, 2020, 4:01pm

@starseeker, the only annoying thing about doing it pre_link, is that source files are formatted with astyle, they would have been updated after build. Which means the next time you build, those files will be seen as modified, requiring another rebuild.

starseeker · April 27, 2020, 8:05pm

That’s why we have a mod to astyle to report if it would have formatted the file but to not otherwise change anything - in that mode it doesn’t change any of the source files, so it doesn’t trigger a rebuild.

developer · April 27, 2020, 8:39pm

Perhaps, I’m not understanding. Don’t you want astyle to format your source files though? So do you just manually run astyle to format the code then after a link failure?

starseeker · April 27, 2020, 11:45pm

Right. It’s intended (once we eventually turn it on) as a check to encourage coders to set up their environments to edit in the standard style - encourages consistency in the checked-in codebase (or will once we finally enable it by default…) That way if there’s a style problem it is caught early - a pre-commit build check will automatically fail on style issues - but it gets fixed something other than the build system (usually manual human astyle or edit) since for our project the build step should never alter the source tree.

starseeker · April 27, 2020, 11:56pm

If I were wanting to have it actually do the formatting and then build it, one possibility might be:

Set up one target to copy the source files into the build directory and astyle format them.
Build the target from the copied sources rather than the original.
Have a post-build target that copied the astyle altered files back over the original source files and then ran a second command to adjust the timestamps on the copied files to match the originals (or pre-date the generated binary.) The last step might have to be platform specific - I don’t know of a way to do that with CMake offhand…

ben.boeckel · April 28, 2020, 4:41am

You don’t want to do this either. IDEs and editors will have developers editing these copied files directly when they “jump to error” or the like.

It’s best to just have a separate astyle target which is run manually rather than trying to integrate it into the normal build workflow. It’s just not something that works well with the C/C++ style compilation model without the build tool knowing what is going on. Here, make and ninja have no concept for specifying self-modifying rules and the editor-uses-the-build-tree behavior is dangerous as all your sources that happen there are now generated sources and their edits are not tracked by the build system properly since they are outputs. Copying them back means you now have a dependency cycle in ninja too (mtime fixup or not).

jwm · June 1, 2020, 7:51pm

Can you explain why “format before build” is a bad idea? I’m not a CMake guru (yet), but in the past I’ve learned that the best way to enforce project workflow is to integrate it into the workflow. When is a better time to format the code than before building (because it could modify the source)?

You don’t want to wait for commit for a couple of reasons, not the least being that the newly formatted code would never have been built. Another is that git doesn’t encourage that type of modification before commit either, and there’s no way to enforce it server-side, so it relies upon developers configuring their repos properly (not reliable).

To me, formatting before build, like unit tests after build, should never be optional. It would be very nice if CMake supported these kinds of workflow actions.

ben.boeckel · June 1, 2020, 8:48pm

It’s not a CMake-specific thing. I wouldn’t be happy with this in any build step. Editor? OK. Separate build target? OK. But not during the normal build. It upends your build graph because your source files are now outputs of some computation. Sure, that’s doable, but now they need to go to some other path (anything else is a circular dependency). Then your compiler is warning about copied source files and your editor which helpfully uses the compiler to drop you at warnings and errors is having you edit the reformatted file off in the hinterlands of your build tree instead of the one in your source tree.

On your code review/CI workflow part of the contribution pipeline. I’ve contributed to a number of projects without actually building code (typos, trivial fixes, etc.) and having the CI pipeline track my mistakes is much faster for me than faffing about getting dev environments up and running for trivial patches.

Eh, commits are fungible. (I don’t buy Mercurial or Fossil’s excuses; rebase is too useful.)

Huh? It may not encourage, but it certainly doesn’t discourage.

Configure it as part of your review/CI workflow. If you’re committing directly to master…don’t do that. Put server-side safeguards on what goes into master. That I can trust. I certainly wouldn’t trust developers setting up their local repos reliably. How many CMake contributors do you think actually run Utilities/SetupForDevelopment.sh? Probably the regular contributors and no one else.

CMake (and other Kitware projects) reject code that is not formatted properly. But, we also have tools that will fix the formatting for contributors and push the fixed branch back to their fork (upon request). If the robot allows it, it’s all good and one less thing I have to worry about as a reviewer. As a reviewer, I can focus on the content of the contribution rather than things like “is that a valid UTF-8 codepoint?”, “is this file too damn big?”, or “is the whitespace sane”.

CMake is a build system, not a workflow engine. And the problem with any of this is that since development is local, I can turn off any barriers you put up. make foo/fast skips any build steps you put up (assuming I even make a build tree), git commit -n ignores your git hooks, I have a different/wrong version of the formatter, the formatter mocked by a symlink to /usr/bin/true, etc. It just needs to be done server-side in some capacity because that’s the only place you can actually trust in your workflow.

jwm · June 1, 2020, 8:55pm

I see your points (for the most part) but they do reflect a particular way of doing business that isn’t necessarily appropriate for small, integrated teams.

One point that I’d like clarification on is

CMake is a build system

From what I understand, CMake is a build-file generator. The actual build system is external. And this causes problems when you want build-time behavior (time-stamps, commit hashes, version strings, etc), since CMake “stops” once the build files are generated. I’ve seen numerous Stack Overflow posts about this, and no idomatic solutions. But I’m still looking, and I could be wrong.

ben.boeckel · June 1, 2020, 9:44pm

We do it even for smaller teams (< 5 consistently active developers). Granted, once you have it setup for larger projects, adding small ones is trivial (I think it comes to maybe 12 lines in the configuration with the YAML anchors we have).

Ah, this is a terminology thing. I call make and ninja “build tools”. That is, they take a description and actually build it (“build executor” may also suffice). But CMake targets and such have semantics that CMake guarantees for all of its generators. This means that not every build tool CMake can generate has every feature available in CMake’s model (e.g., wildcard rules in make, custom commands with deps = in ninja, etc.). Some of those features do have special cases in CMake itself (BYPRODUCTS and job pools in ninja, all the VS_PROP stuff, XCODE_ATTRIBUTE, Eclipse has some “nature” stuff exposed, and more), but it’s still far short of the general cases supported by any single build tool.

jwm · June 1, 2020, 9:59pm

I guess that’s my point. You have an existing workflow that you clone for each project. Other teams stand up their workflows from scratch. They may do something analogous to you, or they may be more naive. I’m hoping there isn’t anything philosophical in how CMake works that depends on a particular (undocumented) workflow.

ben.boeckel · June 1, 2020, 11:05pm

Philosophical? No. But what you want is best done as a separate, explicit make fix-formatting target. CMake just doesn’t support custom commands editing sources like clang-format -i does and nothing in the ecosystem (AFAIK) well-supports building a “shadow” source tree (that said, I’m also not sure how I’d design one either). Maybe a dedicated build system could do it, but it isn’t something I can forsee being translatable into a build.ninja file.

jwm · June 2, 2020, 12:11am

You’ve mentioned a “shadow source tree” a couple of times. Perhaps you are making assumptions. I run my formatting in-place, so the build just sees a normal, modified target. Yes, editors need to know about reloading externally modified files, but most editors handle that pretty well.

FWIW, my homegrown “ecosystem” has a script that is run as a custom CMake target that runs the formatter only on the modified files, to avoid introducing unrelated formatting changes. Or, using the git-clang-format model, another option is to only format the modified lines. But, obviously my script isn’t cross-platform, and so this is perhaps where you object to the lack of support in CMake?

Perhaps it is shortcomings in our toolchain, or my ignorance of them or any of a number of other limitations that don’t bear detailing, but I can’t agree with your assertion that what I want is “best” done as a separate target. On that, we will have to agree to disagree.

hex · June 2, 2020, 12:19am

The mention of shadow source tree results of the given build order. You do not seem to have that defined, yet. You want your sources being formatted before being build. Even though they are the same source files your build system must understand the concept of before and after, eventually. Therefore, shadow source tree.

ben.boeckel · June 2, 2020, 11:44am

Then it cannot be in the dependency tree for any targets that come from those source. It needs to be a separate target dependency tree. If not, you now either have:

a target which depends on its own output: logical error and not satisfiable (the build tool would need some guarantees of idempotency here and any format would need to be run continuously until it does not make a modification)
unspecified inputs/outputs: the build system is not tracking things properly and you have ghosty behaviors depending on the build tool.

I’m talking in terms of build tools that exist today. I know of no build executor that is happy with such a strategy (supported by CMake or not).