Are square brackets universally supported by file(GLOB)?

Hi,

I have a situation where an application adds timestamp information to output file names and I’m trying to strip that out so I can automate regression testing.

File names are of the format EXzzhhmm.nnn where zz is a two letter code (‘CR’, ‘MB’, ‘PR’), hhmm are zero padded 24-hour hour and minute, and nnn is a zero padded serial number 000 to 999.

I can match these files with file(GLOB outfiles LIST_DIRECTORIES FALSE EX[CMP][RB][0-2][0-9][0-9][0-9].[0-9][0-9][0-9]). This works fine on Linux.

Unfortunately it craps out on Windows. Because of course it does.

I can’t find a clear definition of what CMake considers to be a legal glob expression in the official documentation. Professional CMake shows bar[0-9].txt as a glob pattern example, and like I said, the above glob pattern works on Linux with no issue.

My question is: should I expect CMake glob patterns to work universally or (as I am suspecting) are legal glob patterns host-dependent? It’s not clear from any of the documentation if I should expect host-dependent glob patterns aside from the issue of case sensitivity which isn’t a problem here.

I can work around this with a simple EX*.* glob and filtering with foreach() and string(REGEX) but I’d really like to avoid writing a specialty parser if host-independent globbing is available.
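For reference, the workaround described above might look roughly like the sketch below. It uses `if(... MATCHES ...)` rather than `string(REGEX)`, and `output_dir` is a hypothetical variable standing in for wherever the application writes its files; neither comes from the original setup. It also assumes that file(GLOB) returns paths with their on-disk case, which I have not verified on every platform.

```cmake
# Sketch only: output_dir is a hypothetical variable naming the directory
# that holds the application's output files.
file(GLOB candidates LIST_DIRECTORIES FALSE "${output_dir}/EX*.*")

set(outfiles "")
foreach(path IN LISTS candidates)
  # Match on the file name alone, not the full path.
  get_filename_component(name "${path}" NAME)
  # EX + two-letter code + hhmm + "." + three-digit serial number.
  # if(MATCHES) uses CMake's own regex engine and is case-sensitive.
  if(name MATCHES "^EX(CR|MB|PR)[0-2][0-9][0-9][0-9]\\.[0-9][0-9][0-9]$")
    list(APPEND outfiles "${path}")
  endif()
endforeach()

message(STATUS "Matched output files: ${outfiles}")
```

One side effect is that the regex can require exactly CR, MB, or PR, whereas the bracket glob would also accept combinations such as CB or PB.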

Thanks,

– Bob

CMake’s code behind file(GLOB) is implemented in CMake itself, so it’s not like there’s some system-implemented glob(3) function that isn’t implemented on Windows changing behavior here.

Cc: @brad.king I don’t see a test suite for KWSys’ Glob.cxx code. Are we just leaning on file(GLOB) testing? I don’t see much on complex glob expressions in there (at least that is explicitly tested).

I played with this a bit on Windows. I was able to find my file using EX[cmp][rb][0-2][0-9][0-9][0-9].[0-9][0-9][0-9], but not with uppercase characters inside the brackets.

Looking at the glob code, I wonder if the code at https://gitlab.kitware.com/cmake/cmake/-/blob/master/Source/kwsys/Glob.cxx#L143 potentially needs to do something like what happens around https://gitlab.kitware.com/cmake/cmake/-/blob/master/Source/kwsys/Glob.cxx#L160 for letters inside the brackets.

Thanks! It never occurred to me to test with the letters in the patterns set to lower case. It wouldn’t matter on Windows (case insensitivity) but I’d expect it to break on Linux. Weird. Also good to know that it’s not a pass-through to the underlying OS.

I was just looking at the implementation of the globbing too (in kwsys/Glob.cxx, as mentioned in previous posts). According to the code comment, it converts all file names to lowercase on case-insensitive file systems. In practice, though, this is done on Windows and Apple platforms regardless of the actual file system. I didn’t see this mentioned in the docs for file(GLOB). Based on the observations in earlier posts in this thread, it would seem that only the file names found are transformed, not the regex itself (maybe it isn’t safe to do that). At the very least, this probably needs a doc update, but maybe there’s a potential improvement floating around here too.
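If that reading is right, one possible stopgap is to list both cases inside each bracket expression, so the pattern matches the lowercased names on Windows and Apple platforms as well as the original names on Linux. This is a sketch based only on the behaviour reported in this thread, not on documented semantics, and `output_dir` is a hypothetical variable for the directory being searched.

```cmake
# Sketch only: output_dir is a hypothetical variable.
# Both cases are listed in each letter bracket because, per the
# observations in this thread, file names are lowercased before matching
# on Windows and Apple platforms while bracket contents are left as written.
file(GLOB outfiles LIST_DIRECTORIES FALSE
  "${output_dir}/EX[CcMmPp][RrBb][0-2][0-9][0-9][0-9].[0-9][0-9][0-9]")

message(STATUS "Matched output files: ${outfiles}")
```

The trade-off is that on Linux this pattern would also accept file names with lowercase letters in those two positions, which the original pattern would reject.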

Unfortunately, any change is going to require a policy, as changing GLOB semantics could change source file listings that happen to work today.

That said, maybe this is a chance to start getting CMake off of the “OS determines case sensitivity” crutch we’ve gotten it onto? The three major platforms now offer per-directory case sensitivity flags (not to mention the vfat, exfat, etc. filesystems all of them support) for their primary filesystems (ext4 in Linux). I have no idea how that interacts with Apple Unicode normalization though…