Bug in `if(<varname> STREQUAL <string>)`

Why is in this code the fatal error not triggered. This is clearly a bug?

function(overwrite_if_empty varName byString)
  if("${${varName}}" STREQUAL "")
    message("SET")
    set(${varName} ${byString} PARENT_SCOPE)
  endif()
endfunction()

set(A "")
overwrite_if_empty(A "")

message("'${A}'")

if(A STREQUAL "")
  message(FATAL_ERROR "WUPS")
endif()

message("'${A}'")

If you remove overwrite_if_empty(A "") the error gets triggeret.

Can somebody explain whats going on here? Thanks a lot.

Update: Does the set(${varName} ${byString} PARENT_SCOPE) unset the variable because ${byString} is empty.

It does unset it. However, its string value should still be empty. I think it may fail because the left side is not defined anymore and is therefore just the string A (which is not STREQUAL "").

Indeed set(A "${B}") and set(A ${B}) are not equivalent when B is empty. I don’t know whether it’s due to how CMake does argument passing or if it’s a quirk of the set function, but I documented that behavior in a repository of mine on GitHub to remember about it:

Below that test, there are a few more that show “interesting” behavior when using “lists”.

Thanks, yes this is just a really nuissence, modern cmake should really leave these old habits. They make CMake extremely bug prone!
A function which in some cases does unsetting is just plain wrong and is the opposite of what the function says to do.

@ben.boeckel Could we not have a new policy which can be set, such that this set function behaves sane? Is there any resistance?

The same would be for if. Cant there be an new policy that an if argument distinguishes between 2 distinct things and not having any fallback: a string in quotes, or a variable name == the value of the variable, and not a fallback to the variable name string if the variable with that name is not defined?

A variable name which is not defined is an error.

Indeed. Alas, it is old behavior and a policy would be…quite noisy for everyone. I think a policy would be good to have, but this is a lot of work for users of CMake that is being requested.

Again, this would be very noisy and would require guarding a lot of existing code with DEFINED "x" AND (…)

Thats why this policy would be opt-in first for maybe some year :). At least it would make it possible to traverse later to more sane language constructs also to make people less frustrated and ro show that there is progress to make it cleaner :slight_smile: I would love to see this happening. Also does Cmake intend to have a python interface anytime?

I think mostly such a policy is not so noisy as when it happens for new code its in 90% an error except for globals defined on the command line which where not defaulted with option or otherwise set, so I would argue its only good to either be explicit with if(DEFINED … or then default the variable…

No. The stability guarantees of Python are wholly unsuited for CMake’s stability guarantees.

That’s not how policies work. They warn that new behavior is coming until the code is fixed or the policy is set to NEW for “accept the new behavior”. Setting it to OLD is never a suitable “solution” to policy warnings.

@ben.boeckel Thanks for the reply! Interessting: Why you mean is that the case for python? I think its good that CMake tries to invest time in such stability things!

Sorry, I misunderstood: Yeah then it would be quite noisy, but what can we do, if we want to have long term fixes in thes weird behavior, there is no workaround, or is there? People will just use OLD and continue and maybe need to fix these sooner or later this policy is removed…

I know it comes with a pain, but as I have written tons of CMake, I am just extremely relieved if we would not have to think always about these weird edge cases.

Sure, I agree. If the policy is written, we can test it out to see how much it affects. I just suspect that it is a deeper rabbit hole to make projects adhere to it that I don’t have time for myself and I can only give that same warning to others. Note that it is not just a set(x) → set(x "") blind replacement because “fixing” means also looking to make sure that the call was not intended to be unset(x) which depends on any DEFINED checks elsewhere and behind other calls (e.g., option, find_*, etc.). There are also interactions across scopes that would need to be investigated in such places.

If CMake had chosen Python back in 1996, we’d be stuck with Python 2.x today to get the stability guarantees we ended up with (and, IMO, have been critical to CMake’s success because “no one” likes tinkering with their build systems) because old syntax is no longer valid syntax in modern versions. I also don’t trust Python upstream to not keep cranking on new features that would keep being requested even though CMake pinned itself to Python 3.9 (or whatever) for such stability. I trust them more to not make a Python3 mess, but I still find breaking things every release (3.9’s forced os.add_dll_directory, 3.10’s initializer rewrites, etc.).

AFAIK, only Perl and Tcl have had such stability since then until today and I don’t know that many would consider them strictly better than CMake’s current syntax (though a “real” list representation would certainly be appreciated).

2 Likes

Ah, nice to know.
Jeah, totally agree its maybe a rabbit whole. I agree it would be really good if CMake would habe two distinct types, lists and strings. Getting rid of the semicolon :sunglasses:

You should always quote expanded strings.

Due to how Cmake treats variable expansion in set() expansion of the ${byString} produced an empty string, which basically unsets(A). See docs and example below:

Zero arguments will cause normal variables to be unset set() documentation

See also: if() Variable Expansion

function(overwrite_if_empty varName byString)
  if("${${varName}}" STREQUAL "")
    message("\${${varName}} => >${${varName}}<")
    set(${varName} ${byString} PARENT_SCOPE) 
  endif()
endfunction()

set(A "")
# The following two are equal because of unquoted expansion inside function
#overwrite_if_empty(A "")
set(A)

message("'${A}'")

if(A STREQUAL "")
  message(FATAL_ERROR "EMPTY STRING")
endif()

if(NOT DEFINED A)
  message(FATAL_ERROR "FAILS IF NOT DEFINED")
endif()

message("'${A}'")