Documentation organization/granularity

Lately I’ve been thinking about the HTML documentation and how it’s organized.

Comparing a page like cmake-generator-expressions(7) to the property or variable documentation pages, or comparing those to the output of man cmake-variables or man cmake-properties from a (Linux) command line, two things occur to me:

  1. The property and variable documentation is broken up at the wrong level
  2. It’s disappointing that only documentation broken up at that level is fully organized and indexed

What I mean by the first is that it’s far more comfortable to scroll through a page like the generator expression documentation or the cmake-variables(7) man page, than it is to click around to individual, isolated pages each documenting a single property or variable. Back when CMake was first created the page-per-variable/property organization no doubt made sense, but these days a page that only documents a single variable or property feels like it’s presenting too little information, and with too little context. (In part because 90% of those pages are barely a paragraph of text, and don’t really need to be any longer.)

(I’m tempted to include policies as well, but honestly there’s far less value to seeing a given policy’s documentation in the context of other policies, since they’re identified only by number and each policy is a standalone item. There may be a couple of exceptions, but policies are generally not interrelated. The primary value in consolidating properties or variables is when an item is only one of a set of closely-related items, in which case it can be useful to view them all together.)

What I mean by the second point is that, because the generator expressions are all documented on a single page instead of broken up, there’s no cmake --help-generator-expression-list, and no cmake --help-generator-expression name to view the documentation for a specific expression. It’s the (IMHO uncomfortably-)fine granularity of the property and variable documentation that allows individual docs to be queried. (Interestingly, though, there are already differences there: the existing documentation has the property docs stored in subdirectories by type (prop_tgt, prop_cache, etc.), whereas the variables are all lumped into a single directory, variable.)

The question

Well, right up front: Does anyone disagree that the current page-per-item organization feels anemic? Do you prefer pages like CMAKE_BUILD_TOOL to a page like cmake-generator-expressions(7)? This would hardly be the first time in my life that something has felt obvious to me, only to discover that I’m actually in a small minority of opinion. If the current organization is preferable to most users, I’ll shut up and deal with it.

The proposal

Assuming I’m not alone in this, I’d like to propose a change. It will probably be tricky with Sphinx, but I hope not impossible. The exact design can be fleshed out collaboratively in GitLab, if this is undertaken, but my personal goals/requirements would broadly be:

  1. Consolidate documentation on individual properties/variables into longer lists of multiple items. Probably not all on one page, but perhaps broken up along the lines of the current cmake-variables(7) and cmake-properties(7) sections: “Properties of Global Scope”, “Properties on Targets”, “Variables that Provide Information”, “Variables that Control the Build”, etc. each become one page.

  2. Ensure that a URL can still be given for each individual property/variable, though it may target an anchor on a larger page. So, instead of:

    https://cmake.org/cmake/help/latest/variable/CMAKE_COMMAND.html

    it might be:

    https://cmake.org/cmake/help/latest/var_info.html#CMAKE_COMMAND

    but there would still be a way to link to “the CMAKE_COMMAND documentation” specifically, in the online help.

  3. The version-switcher in the online documentation would still function properly, and be able to navigate between separated and consolidated documentation for the same property/variable. I think this should be achievable with some modifications to version_switch.js. (Though “some modifications” may be underselling it, as it would probably need at least a list of recognized property/variable categories. In the absence of the next item, it might need a list of every property and variable name so it could map them directly to/from the correct section pages.)

  4. (Possibly) Old bookmarks/links remain valid, perhaps via a method similar to what MediaWiki calls a “soft redirect”: a page that, when loaded, points to a different page where the information can be found. (Said page can optionally auto-refresh to the target URL, if JavaScript is enabled.)

    That would allow a request for .../variable/CMAKE_INSTALL_PREFIX.html to be referred over to .../var_behavior.html#CMAKE_INSTALL_PREFIX, instead of producing a 404 page. There are doubtless plenty of bookmarks and links to https://cmake.org/cmake/help/latest/ paths out in the wild, so those URLs should ideally remain valid.

  5. cmake --help functionality isn’t broken. Perhaps documentation files per property/variable are still written out to disk for use by the cmDocumentation class. Or, perhaps the documentation lists are converted to a parseable format like JSON that it can load to retrieve individual item docs.

  6. (Ideally) cmake --help-foo-list and cmake --help-foo name functionality is extended to the generator expressions, and to other smaller-than-a-page granularity items if any good candidates are identified.

  7. The man pages shouldn’t change. This should be a no-brainer, since the manpage generator already combines all of the property/variable docs into a single file; it’d just be a matter of assembling it out of the longer consolidated pages, rather than the individual pages.
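
To make items 2 and 4 a little more concrete, here is a minimal sketch in Python of how an old per-variable URL might be mapped to an anchor on a consolidated page, and how a soft-redirect stub could be generated for it. Everything here is an assumption for illustration: the SECTION_PAGE table, the function names, and the var_info / var_behavior page names (taken from the example URLs above) are hypothetical, and a real table would have to be generated from the documentation sources rather than written by hand.

```python
import html
import json
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table: variable name -> consolidated section page.
# Page names are borrowed from the example URLs in the proposal.
SECTION_PAGE = {
    "CMAKE_COMMAND": "var_info",
    "CMAKE_INSTALL_PREFIX": "var_behavior",
}


def consolidated_url(old_url):
    """Map .../variable/NAME.html to .../PAGE.html#NAME.

    Returns None when the URL is not a per-variable page or the
    variable is not in the table.
    """
    parts = urlsplit(old_url)
    head, sep, leaf = parts.path.rpartition("/variable/")
    if not sep or not leaf.endswith(".html"):
        return None
    name = leaf[: -len(".html")]
    page = SECTION_PAGE.get(name)
    if page is None:
        return None
    new_path = f"{head}/{page}.html"
    return urlunsplit((parts.scheme, parts.netloc, new_path, "", name))


def soft_redirect_page(old_url):
    """Build a stub page that points at the consolidated location.

    The link is always visible; the script only forwards the reader
    automatically when JavaScript is enabled.
    """
    target = consolidated_url(old_url)
    if target is None:
        return None
    link = html.escape(target, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>Moved</title>\n"
        f"<script>window.location.replace({json.dumps(target)});</script>\n"
        "</head><body>\n"
        f'<p>This documentation has moved to <a href="{link}">{link}</a>.</p>\n'
        "</body></html>\n"
    )
```

Writing the output of soft_redirect_page() to the old .../variable/NAME.html location would keep existing bookmarks from hitting a 404, and the same name-to-page table is roughly what the version switcher would need in order to move between separated and consolidated layouts.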

There’s probably more, but I figure that’s a start, and enough for people to react to and/or poke holes in. (Equally welcome!)

As far as cmake --help-* is concerned, those options are legacy and incomplete. It probably doesn’t make sense to add them to the discussion.

Yes, I do. From that page I get the exact information I need without being distracted by other things, plus the added bonus of easy navigation via hyperlink. Using permalinks in the form of https://cmake.org/cmake/help/latest/var_info.html#CMAKE_COMMAND would still be acceptable, though.

I like proposals 1 and 2. For 3, I wonder if this complexity is needed. More recent CMake documentation pages all include a “New in version ...” tag, so navigation between versions becomes less important.

As maintainer, I do not like the idea of making larger, less-granular pages. Over time more detail can be added to the individual entity pages without making long pages even longer. In fact I’ve thought about splitting up cmake-generator-expressions(7) the way the other pages are. Only recently did generator expressions become first-class Sphinx domain objects that are indexed and cross-referenced.

I also prefer the current granular approach, for similar reasons to Brad and hex. For new users in particular, being able to follow a link to a page dedicated to the very thing they’re interested in, without the noise of other variables, etc., is much less confronting. I think the need to find a specific variable or property is more common than someone browsing around just “seeing what’s there” and discovering other properties and variables.

For comparison, we do have pages that collect together variables for their own context. The CPack generators are documented like that (see the RPM generator for an example). These kinda work because all of the variables are very closely related. Even then, I sometimes find myself lost in those lists and would probably find it easier if they had an index of variables instead and each variable had its own page, even though they’d all be very small.

One other thought I had is that when you have a large page with many variables (or whatever other entity type we want to consider), it tends to push you toward keeping the docs for each item very short. When each thing has its own page, you don’t feel that constraint and are more willing to add detail. Over time, I think this leads to better quality of content. I acknowledge that many of the granular pages are currently very brief. Some of them could certainly use some love, but I think the current granular approach is likely to be better for users overall.
