I vaguely recall some discussion about caching such parses internally so that we could avoid reparsing the entire JSON content on each call. I don’t think that idea was rejected, just that it wasn’t something to hold back the initial implementation. I haven’t seen or heard any activity around returning to it, but it might help for cases like this.
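For context, every `string(JSON)` call today receives the raw JSON text and parses it from scratch, so repeated queries against the same document repeat that work. A sketch using the existing API (CMake ≥ 3.19):

```cmake
set(my_json [[{"name": "demo", "deps": ["a", "b"]}]])

# Each of these calls reparses the full contents of my_json:
string(JSON name GET "${my_json}" name)        # parse #1
string(JSON n_deps LENGTH "${my_json}" deps)   # parse #2
string(JSON dep0 GET "${my_json}" deps 0)      # parse #3

message(STATUS "${name} has ${n_deps} deps, first is ${dep0}")
```

With a large document and many queries, the repeated parsing is where the cost accumulates.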
How about a call to parse it once, and then separate calls to access the parsed data?
string(JSON PARSE [ERROR_VARIABLE err] "json-string")
which would parse the JSON string and store the parsed data, e.g. in a json::value with the same scoping as variables,
and then calls to access the parsed JSON could refer to it, e.g.
string(JSON out_var GET PARSED_JSON …)
This would be somewhat similar to accessing the matched groups of a regex match.
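For comparison, the regex precedent already works this way: one match call does the work, and the groups are then read back without re-matching:

```cmake
# Existing behavior: a single MATCH call populates CMAKE_MATCH_<n>,
# and later references read the stored groups without re-running the match.
string(REGEX MATCH "([a-z]+)-([0-9]+)" _ "pkg-42")
message(STATUS "name:  ${CMAKE_MATCH_1}")  # pkg
message(STATUS "build: ${CMAKE_MATCH_2}")  # 42
```

The proposal above would give parsed JSON the same parse-once, read-many shape.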
The “normal” JSON access functions could also store their result in this json::value, so that subsequent calls could access it.
Or the PARSE function could create a named JSON object, which could be referred to in following calls.
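A speculative sketch of that named-object variant; the `PARSE` subcommand, the `_PARSED` subcommand names, and the `my_doc` handle are all invented here, not existing CMake:

```cmake
# Hypothetical syntax -- none of this exists today.
# Parse once and bind the result to a named JSON object:
string(JSON PARSE my_doc ERROR_VARIABLE err "${json_string}")

# Later calls name the parsed object instead of passing the raw
# string, so no reparse is needed:
string(JSON name GET_PARSED my_doc name)
string(JSON n_deps LENGTH_PARSED my_doc deps)
```

The open design questions would be the lifetime and scoping of such named objects, and how they interact with variable scopes in functions and subdirectories.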
That’d require the same kind of design thinking as the other solution. Note that the “official” value is always the string representation, so manipulating the parsed JSON directly may not be equivalent to the previous round-trip string(JSON) behavior (e.g., key ordering in objects) and would therefore require a policy.