Is unescaping of \; in foreach() intentional or not?

I came across the following code extract today and was somewhat surprised by the way \; is handled:

cmake_minimum_required(VERSION 3.16)

# Example of data to be processed
set(output [[
firstLine;aaa;bbb
secondLine;ccc;ddd]])

# The "output" variable may contain embedded semicolons, so escape them
string(REPLACE ";" "\;" output "${output}")

# Convert the string to a list, one line per list element
string(REPLACE "\n" ";" output "${output}")

message("output: ${output}")

# Iterate over each line of the output
foreach(line IN LISTS output)
    message("line: ${line}")
endforeach()

Sample output from the above:

output: firstLine\;aaa\;bbb;secondLine\;ccc\;ddd
line: firstLine;aaa;bbb
line: secondLine;ccc;ddd

Is this the expected behavior? The wording of the Escape Sequences section in cmake-language(7) is ambiguous, and I find it unhelpful for understanding how \; is handled in almost any situation. At the very least, it needs examples covering the different cases. That aside, the example above could be demonstrating either a bug in the foreach() command or the intended behavior, and I can’t tell which.

@brad.king Do you know whether we expect the \; to be transformed to plain ; when the foreach() command is used like in the above example?

The behavior is as expected. The lists documentation says:

The sequence \; does not divide a value but is replaced by ; in the resulting element.

The \; in the quoted argument of your string() call is stored literally as \; because the escape sequence documentation says:

A \; outside of any Variable References encodes itself

Historically, this evolved in the very early days when someone said “I need to escape a ; for this specific use case” and implemented the \; behavior without considering the semantics in general. For compatibility reasons, it cannot be changed now. A policy for this would have a huge runtime cost.
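
A minimal sketch of the two documented rules quoted above, assuming it is run as a script with cmake -P; the variable names value, count and item are made up for illustration and do not come from the original code:

```cmake
cmake_minimum_required(VERSION 3.16)

# Inside a quoted argument, \; encodes itself, so the backslash is stored.
set(value "a\;b;c")
message("value: ${value}")        # value: a\;b;c

# List expansion divides only on the unescaped semicolon.
list(LENGTH value count)
message("count: ${count}")        # count: 2

# Each resulting element has \; replaced by a plain ;
foreach(item IN LISTS value)
    message("item: ${item}")      # item: a;b   then   item: c
endforeach()
```

The stored string and the expanded elements differ by exactly that one replacement, which is why printing the whole variable shows the backslash while printing an element does not.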

My question was more about how foreach() seems to be unescaping it though. That was the part that seemed more unexpected. That particular use case doesn’t seem to fall under any of the documented places where I would expect unescaping to occur.

foreach is expanding the list into elements exactly as any other list expansion does, and then iterating over those elements. What is wrong?

Hmmm… maybe I’m focusing on the wrong line. Would I be right in thinking that foreach() assigns the value to line with the escaped \; still intact, and that it is the evaluation of ${line} in the message() call that removes the escaping to give a plain ;?

My example is a simplified version of the code that gave rise to this query. The original code also evaluates a variable in the same way, but in a command like set(blah ${line} PARENT_SCOPE). I may have missed that this evaluation is what does the unescaping, not foreach().

foreach is setting line to firstLine;aaa;bbb and then secondLine;ccc;ddd. Inside the loop, the code message("line: ${line}") is evaluating ${line} inside a quoted argument, so there is no list expansion.
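
One way to check this is to inspect the loop variable directly. The sketch below is only an illustration, again assuming cmake -P; fieldCount, escapedLine, collected and lineCount are hypothetical names, and the re-escaping step simply reuses the string(REPLACE) call from the original example:

```cmake
cmake_minimum_required(VERSION 3.16)

set(output "firstLine\;aaa\;bbb;secondLine\;ccc\;ddd")
set(collected "")

foreach(line IN LISTS output)
    # The loop variable already holds plain semicolons,
    # so it is itself a three-element list.
    list(LENGTH line fieldCount)
    message("fields: ${fieldCount}")    # fields: 3

    # Quoted argument: one argument, no list expansion.
    message("quoted: ${line}")          # e.g. quoted: firstLine;aaa;bbb

    # Unquoted argument: split into three arguments,
    # which message() concatenates.
    message("unquoted: " ${line})       # e.g. unquoted: firstLineaaabbb

    # To keep the line as a single element of another list,
    # escape its semicolons again before appending.
    string(REPLACE ";" "\;" escapedLine "${line}")
    list(APPEND collected "${escapedLine}")
endforeach()

list(LENGTH collected lineCount)
message("lines: ${lineCount}")          # lines: 2
```

If the value needs to survive as one element when it is stored in another list, the escaping has to be reapplied as in the last step of the loop, because the elements produced by foreach() no longer carry it.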