Request: "verbatim" or "recurse" to copy_directory

TLDR: Please add an option to copy_directory to copy the directory AND its contents, i.e. copy_directory -r d1 d2 dest will result in dest/d1/... and dest/d2/....

copy_directory often surprises me by not behaving quite like any other directory copy operation I work with. The name suggests it will copy the directory structure: the directory and its contents.

Current behavior is what I would call “copy_directory_contents”.

On Mac and mobile, folders are often treated as objects (.app, .bundle, etc). Copying the contents makes no sense, and is further confused when you try to copy multiple dirs:

x: x/1, x/2
y: y/2, y/3
${CMAKE_COMMAND} -E copy_directory ${CMAKE_CURRENT_LIST_DIR}/x ${CMAKE_CURRENT_LIST_DIR}/y z
expected:
  build/z/x/1, z/x/2, z/y/2, z/y/3
actual:
  build/z/1, z/2, z/3
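
For anyone who wants to reproduce this, a script-mode sketch along these lines should do it. The scratch directory name "repro" is just something made up for the example, and the script assumes a CMake recent enough to accept several source directories in one copy_directory call.

```cmake
# repro.cmake: run with `cmake -P repro.cmake`
# "repro" is a scratch directory name chosen for this example.
set(_base "${CMAKE_CURRENT_LIST_DIR}/repro")
file(REMOVE_RECURSE "${_base}")

# Lay out x/1, x/2, y/2, y/3 as empty files.
foreach(_f x/1 x/2 y/2 y/3)
  file(WRITE "${_base}/${_f}" "")
endforeach()

# Copy both directories into z with a single invocation.
execute_process(
  COMMAND "${CMAKE_COMMAND}" -E copy_directory
          "${_base}/x" "${_base}/y" "${_base}/z"
)

# List what ended up under z, relative to the scratch directory.
file(GLOB_RECURSE _copied RELATIVE "${_base}" "${_base}/z/*")
message(STATUS "copied: ${_copied}")
```

The printed list should correspond to the "actual" layout above: the files sit directly under z, with no x or y level in between.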

My current use case is the following:

add_library (soandso INTERFACE)
set_target_properties (soandso PROPERTIES RUNTIME_STUFF something.bundle something.framework)

...

function (post_build target)
  get_target_property (_runtimes ${target} RUNTIME_STUFF)
  if (_runtimes)
    add_custom_command (
      TARGET ${target} POST_BUILD
      COMMAND ${CMAKE_COMMAND} -E copy_directory ${_runtimes} $<TARGET_FILE_DIR:${target}>
      COMMAND_EXPAND_LISTS
    )
  endif ()
endfunction ()

This just creates a directory scramble, with the contents of all those folders merged into one directory. Instead, I have to unpack the lists myself and mess around separating out the folder name:

...
        foreach (_runtime ${_runtimes})
            get_filename_component (_folder ${_runtime} NAME)
            add_custom_command (
                TARGET ${target} POST_BUILD
                COMMAND ${CMAKE_COMMAND} -E copy_directory
                    ${_runtime}
                    $<TARGET_FILE_DIR:${target}>/${_folder}
            )
        endforeach ()
...

I think a better long-term solution is to use the rsync patterns (in a new -E command) where a trailing / means “copy contents” and without is “copy the entity itself”. This allows per-source selection of what is intended as well.


I wrestled with that idea (it was my original expectation, tbh), but isn’t it perhaps even more ambiguous? I’ve watched a lot of sound devops folks waste large numbers of cycles debugging whether a string ended up with or without a trailing slash before it finally reached an rsync invocation. There is slash-pruning so that paths don’t have trailing slashes, which conflicts with code that assumes trailing slashes to support easy concatenation (looking at you, Visual Studio path macros: $(YouBetterHaveASlashOnIt)Subdir). And there are the consequences of trimming too hard: “c:/” becomes “c:”, which means something different, and “/” ends up as “”, which can mean arguments get shuffled.

I suspect it might cause more headaches than it solves :face_holding_back_tears:

I don’t like it either, but what else is there to do that doesn’t require either strange flags or per-command selection of how source paths are treated? It’s a long-standing pattern and not too hard to grok with some docs.

I’ll also note that other tools also have different behavior on trailing slashes:

mkdir t
ln -s t f
rm f/ # fails with "f" is a directory

I like the rsync pattern. The downside is that it is very cumbersome to use, as you depict it. The upside is that it became a kind of standard.

The best practice in general in CMake as I see it:

  • never have trailing slashes in variable values
  • clean external inputs from trailing slashes
  • use CMake functions like get_filename_component( DIRECTORY | ABSOLUTE ) to get directory paths, rather than writing your own code
  • explicitly add a slash where it matters to the semantics, e.g.:
if ( EXISTS "${dir}" )

if ( EXISTS "${dir}/SubDir" )

If you follow these rules, then the rsync pattern is fine.
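
As a rough illustration of the "clean external inputs" rule, here is a minimal sketch that also avoids the over-trimming trap mentioned earlier in the thread (“c:/” becoming “c:”). The function name strip_trailing_slash is hypothetical, not something CMake provides, and the sketch only handles forward slashes.

```cmake
# Hypothetical helper (not part of CMake): strip trailing slashes from a path,
# but leave filesystem roots such as "/" or "c:/" untouched.
function(strip_trailing_slash out_var path)
  if(path MATCHES "^([A-Za-z]:)?/+$")
    # Trimming a root would change its meaning, so return it unchanged.
    set(${out_var} "${path}" PARENT_SCOPE)
  else()
    string(REGEX REPLACE "/+$" "" _clean "${path}")
    set(${out_var} "${_clean}" PARENT_SCOPE)
  endif()
endfunction()

# Example inputs are made up for illustration.
strip_trailing_slash(_a "assets/Resources/")
strip_trailing_slash(_b "c:/")
message(STATUS "${_a} | ${_b}")   # prints: -- assets/Resources | c:/
```

Saved to a file and run with cmake -P, this leaves the root alone while normalizing the ordinary path, so later code can always append "/SubDir" explicitly.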

@ocroquette it’s been around since the CP/M days, one of the DOS copy commands used it, and it slipped into rsync. It is [was] the first thing I’d try when I didn’t get the copy I expected, but I’m from that line of engineering. It doesn’t seem to be something newer or less-unix oriented devs would be familiar with.

It comes back to a string-manipulation trick too. Off the top of your head, do you remember what the difference is between http://uri and http://uri/ ?

What happens if in some combination of conditions instead of a string it’s a list (${_runtimes}/x64/)?
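
To make the list case concrete, here is a small script-mode sketch (run with cmake -P). The bundle names are placeholders borrowed from the example at the top of the thread.

```cmake
# list_suffix.cmake: run with `cmake -P list_suffix.cmake`
# Placeholder directory names, for illustration only.
set(_runtimes something.bundle something.framework)

# Quoted: the list is one string, so the suffix lands only at the very end.
message(STATUS "quoted: ${_runtimes}/x64/")

# Unquoted: the argument is split on ';' after expansion, so only the last
# element carries the "/x64/" suffix.
foreach(_arg ${_runtimes}/x64/)
  message(STATUS "arg: ${_arg}")
endforeach()
```

This prints "something.bundle;something.framework/x64/" for the quoted form, then "something.bundle" and "something.framework/x64/" as separate arguments. With slash-sensitive copy semantics, the two directories would silently be treated differently.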

And while I personally love command line-golf, when I initially tried adding a “/” on the argument, to see if that fixed it, it felt super awkward to be doing string-shenanigans in amongst a well enunciated list of settings and properties that pays off its verboseness because of its clarity:

add_custom_command (
     TARGET     ${target}
     POST_BUILD
     COMMENT "Install fmod studio files for ${target}"
     COMMAND ${CMAKE_COMMAND} -E copy_directory ${_runtimes}  $<TARGET_FILE_DIR:${target}>
     MAIN_DEPENDENCY ${target}Refresh
     WORKING_DIRECTORY ${_studio_dir}
     COMMAND_EXPAND_LISTS
)

I guess a part of the itch I’m trying to describe is that it introduces an ambiguity, because what you consider to be a static argument becomes a behavioral one, too.

I agree that a “value argument” that modifies the behavior is not a good thing.

Maybe this request would also be the opportunity to introduce the possibility to create mirrors instead of copying over, e.g. the equivalent of the --delete option of rsync. I needed that once in a while, and the only solution was to delete the destination recursively first, which was not nice.

Something like:

cmake -E

mirror_directory [--delete] <src> <dst>

    Mirrors the content of the <src> directory into the <dst> directory,
    similarly to the rsync tool. If <dst> directory does not exist, it will be
    created. Items in <src> and <dst> will be compared in terms of type
    (file, directory, symlink), modification date (for files), size (for files),
    reference (for links). If any differ, the item in <dst> will be replaced.
    mirror_directory will mirror symbolic links without modifying them
    and delete any item in <dst> that does not exist in <src>, but only
    if the option --delete is given, for safety reasons.

Then the semantic would be unambiguous, independently of trailing slashes. Just thinking out loud…
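
For comparison, the delete-first workaround I mentioned looks roughly like the sketch below. The script name and the SRC/DST variables are my own choices for the example, and it is a wholesale re-copy, not the item-by-item comparison described above.

```cmake
# crude_mirror.cmake (hypothetical script name)
# Usage: cmake -DSRC=<abs dir> -DDST=<abs dir> -P crude_mirror.cmake
if(NOT DEFINED SRC OR NOT DEFINED DST)
  message(FATAL_ERROR "usage: cmake -DSRC=<dir> -DDST=<dir> -P crude_mirror.cmake")
endif()
# IS_DIRECTORY is only well-defined for full paths, so pass absolute ones.
if(NOT IS_DIRECTORY "${SRC}")
  message(FATAL_ERROR "SRC is not a directory: ${SRC}")
endif()

# Wipe the destination so nothing stale survives, then copy everything again.
file(REMOVE_RECURSE "${DST}")
execute_process(
  COMMAND "${CMAKE_COMMAND}" -E copy_directory "${SRC}" "${DST}"
  RESULT_VARIABLE _rc
)
if(NOT _rc EQUAL 0)
  message(FATAL_ERROR "copy_directory failed with code ${_rc}")
endif()
```

It works, but every run rewrites the whole tree and briefly leaves the destination empty, which is why I would prefer a real mirror mode.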

-E mirror_directory sounds far too complicated for us to get right in any meaningful way when it is already well-solved by other cross-platform tools. I really think that projects should prefer rsync or rclone where possible for such things. Clearing the install prefix is the job of the end-user because trees can overlap in ways that projects cannot expect (headers in a subdirectory of a related project, single-prefix install trees, etc.).

@ben.boeckel Presumably these commands exist for reasons like:

[image]

Rsync is likely a go-to for experienced users here, because it has this quirk. So I think that distinction is valuable to capture in cmake, but doing it by string manipulation (adding a slash)?

Old-hands who are familiar with rsync will probably home in on it easily, but it’s got to be bewildering to newer engineers learning cmake. Is it the source or destination that needs the slash? What’s the difference between src,dst, src/,dst, src,dst/ and src/,dst/? What is the expectation of ${copy} "${this_is_actually_a_list}/" ${dst}/? That’s a lot to expect users to read, see and debug easily, especially when they are in a cmake context where things are spelled out and verbose.

And thinking about learning it earlier, it dawned on me: As a new user, I’d intuit that “src dst” means “copy src and its contents” while “src/ dst” means “copy everything below src over into dst” - it would actually be more logical.

For tiny projects, it seems innocent:

${dircopy}_if_different ${src}/Resources ${dst}/Resources
${dircopy}_if_different ${src}/Resources/ ${dst}/Resources

but this is a workhorse piece that operates down in the deep bowels of things like distribution signing and packaging and bundling; it’s often the stuff that never runs on developer machines but handles the latter stages of release shipping on build machines, where you really don’t want to be down in muck trying to find / find a missing slash…

macro (...)
  get_target_property (srcfolders ${${tgt_var_name}} ${${tgt_var_property}})
  ... ${dircopy} ... ${srcfolders} ${dstfolder}
endmacro ()

... ${dircopy}_if_different $<TARGET_PROPERTY:${APPNAME},RESOURCE_FOLDER> $<TARGET_PROPERTY:${libname},RESOURCE_DESTINATION>

# Sometimes, this copies the entire directory tree into ${destination}...
... ${dircopy}_if_different $<TARGET_PROPERTY:${APPNAME},RESOURCE_FOLDER>/ $<TARGET_PROPERTY:${libname},RESOURCE_DESTINATION>

I’m not sure what the exact correct CMakeism would be, but perhaps copy_directory could require one of two boolean parameters and have passing neither be deprecated pending removal:

# create copies of each folder listed under dst
copy_directory --whole ${folder[s]} ${dst}   # --verbatim?
# copy the contents of each of the folders listed all into dst
copy_directory --contents ${folder[s]} ${dst}
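
Until something like that exists, the two modes can be approximated in project code. This is only a sketch: add_copy_directory_step and its WHOLE/CONTENTS modes are names I made up, it assumes the sources are plain paths (no generator expressions, no trailing slashes), and it still calls today’s copy_directory underneath.

```cmake
# Hypothetical helper, not an existing CMake command.
#   WHOLE    -> each source lands at <dst>/<source name>/...
#   CONTENTS -> the contents of every source are merged into <dst>
function(add_copy_directory_step target mode dst)
  foreach(_src IN LISTS ARGN)
    if(mode STREQUAL "WHOLE")
      # NAME is the last path component; empty if _src ends in a slash.
      get_filename_component(_name "${_src}" NAME)
      set(_to "${dst}/${_name}")
    elseif(mode STREQUAL "CONTENTS")
      set(_to "${dst}")
    else()
      message(FATAL_ERROR "mode must be WHOLE or CONTENTS, got '${mode}'")
    endif()
    add_custom_command(
      TARGET ${target} POST_BUILD
      COMMAND "${CMAKE_COMMAND}" -E copy_directory "${_src}" "${_to}"
      VERBATIM
    )
  endforeach()
endfunction()
```

A call such as add_copy_directory_step(${target} WHOLE $<TARGET_FILE_DIR:${target}> ${_runtimes}) then states the intent at the call site instead of encoding it in a slash.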

For reference, the install(DIRECTORY) command already documents trailing-slash distinctions.

“copy the directory AND its contents”

install(DIRECTORY)'s implementation generates calls to file(INSTALL) in installation scripts, and the implementation of that command already supports recursive copying and such, optimized with file metadata comparisons. Such functionality could be exposed by a cmake -E command-line tool.
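
In the meantime, a project can reach the same machinery from a build step by running a small script with cmake -P. The following is a minimal sketch, not a proposed implementation: the script name and the SRCS/DST variables are invented for the example, and it relies on file(COPY) treating a source without a trailing slash as "copy the directory itself", which is my understanding of how it mirrors install(DIRECTORY).

```cmake
# copy_whole.cmake (hypothetical script name)
# Usage: cmake "-DSRCS=a.bundle;b.framework" -DDST=out -P copy_whole.cmake
if(NOT DEFINED SRCS OR NOT DEFINED DST)
  message(FATAL_ERROR
    "usage: cmake -DSRCS=<dir>[;<dir>...] -DDST=<dir> -P copy_whole.cmake")
endif()

foreach(_src IN LISTS SRCS)
  # No trailing slash: the directory itself should land under DST
  # (DST/<name>/...). A trailing slash would request only its contents.
  file(COPY "${_src}" DESTINATION "${DST}")
endforeach()
```

Note that the -D definitions have to come before -P on the command line, and relative paths are resolved against the directory the script is run from.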