What's the correct way to traverse and modify a JSON string in C?

Specifically, I have a string, body_buf. When printed out

printf("length: %d\n%.*s\n", body_len, body_len, body_buf);

It looks like this:

length: 113
{"field1":"something","whatever":10,"description":"body","id":"random","__oh__":{"session":"12345678jhgfdrtyui"}}

Another more complicated body_buf may look like this:

{"status":1,"query":{},"proc":{"memory":{"total":17177939968,"cmax":18363625472,"amax":20000000000},"cpu":{"cores":[0.788,0.132,0.319,2.951,10.111,3.309,1.43,0.8,2.705,4.203,2.32,2,0.019,0.172,0.247,3.888,0.282,0.423,5.254,0.258,0.009,0.369,3.277,0.048,0.283,7.574,3.086,1.592,0.191,0.166,4.348,0.391,0.085,0.25,7.12,4.927,3.671,1.147,3.216,4.628,0.131,0.995,0.744,4.252,4.022,3.505,3.758,3.491],"total":108.886,"limit":800},"disk":{"used":20170,"limit":50000,"io_limit":500}}}

I want to simplify body_buf (which also doubles as removing sensitive information) according to the following rules, only modifying the values, not any of the keys:

  1. A string becomes its length.
  2. An array of strings becomes [array_len, max_len, min_len].
  3. An array of numbers becomes [array_len, max, min].

I'm not familiar with working with JSON strings in C. What's the best way to do this?

I can treat body_buf as a string and traverse it, modifying whatever comes after a ":", because those are bound to be the values I might modify, depending on the type. For arrays, I need to keep track of anything sandwiched between "[" and "]". This could work but doesn't seem very straightforward.
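
One reason the scanning approach is less straightforward than it looks: a ":" can also occur inside a string value, so a scanner has to skip over string literals, including escaped quotes. The following is only a minimal sketch using the standard library; the helper names json_string_end and count_key_separators are hypothetical and do not come from json-c or cJSON.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: given a pointer to the opening '"' of a JSON string
 * literal, return a pointer to its closing '"', or NULL if the literal is
 * unterminated. A backslash escapes the next character, so \" does not end
 * the string.
 */
static const char *json_string_end(const char *p)
{
    if (p == NULL || *p != '"')
        return NULL;
    for (p++; *p != '\0'; p++) {
        if (*p == '\\') {
            if (p[1] == '\0')
                return NULL;   /* dangling backslash at end of input */
            p++;               /* skip the escaped character */
        } else if (*p == '"') {
            return p;
        }
    }
    return NULL;
}

/*
 * Count the ':' characters that sit outside string literals, i.e. the ones
 * that really separate a key from its value.
 */
static size_t count_key_separators(const char *json)
{
    size_t n = 0;

    assert(json != NULL);
    for (const char *p = json; *p != '\0'; p++) {
        if (*p == '"') {
            p = json_string_end(p);
            if (p == NULL)
                break;         /* malformed input: stop scanning */
        } else if (*p == ':') {
            n++;
        }
    }
    return n;
}
```

For {"a":"x:y"} this counts one separator rather than two. Nesting, numbers, and arrays would each need similar care, which is why a parser library is usually the safer route.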

Alternatively, I could parse body_buf into a JSON object and then traverse the nested structure. But then I also have to modify it. I have yet to find a C example (which would be helpful), using json-c or otherwise, that traverses and modifies a JSON object (or creates a new one via some kind of deep copy?).

Details (rules above, 1-3) aside, this should be a relatively common operation -- to traverse and modify. So for those more attuned to the intricacies and good/standard practices of json-c or JSON manipulation in general in C, I'm looking for some pointers.

Again, I have json-c:

#include "cJSON.h"
#include "cJSON_Utils.h"
#include <libjson/json.h>
#include <libjson/json_tokener.h>

Relevant information I've looked at so far include the following:

https://gist.github.com/alan-mushi/19546a0e2c6bd4e059fd

How to get json values after json_tokener_parse()?

Parsing deeply nested JSON key using json-c

ajfbiw.s
  • Sorry, I don't quite understand what you mean. I guess you want to simplify "cores":{...}? Do you want to change it to something like "cores":["array_len":N, "max_len":3, "min_len":1]? – yanzhang.guo Jan 27 '21 at 00:51
  • Simplify everything, all the values. So string {"Name":"Tom", "Age":18, "Address": "California", "arr": [1,2,3,4,5]} becomes {"Name": 3,"Age":18,"Address":10,"arr":[5, 5, 1]}, according to the rules. @yanzhang.guo thanks for the query. – ajfbiw.s Jan 27 '21 at 00:58
  • @yanzhang.guo Does that make sense? – ajfbiw.s Jan 27 '21 at 01:10
  • Yes, can you parse the old string and rearrange it into {"Name": 3,"Age":18,"Address":10,"arr":[5, 5, 1]}? Is this what you want? – yanzhang.guo Jan 27 '21 at 01:22
  • Yes that's what I want: parse the old string and 'rearrange' it into the new one, and for complicated nested examples I gave as well. – ajfbiw.s Jan 27 '21 at 01:26
  • So, your problem is that you don't know how to parse old strings? Or something? – yanzhang.guo Jan 27 '21 at 01:30
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/227863/discussion-between-yanzhang-guo-and-ajfbiw-s). – yanzhang.guo Jan 27 '21 at 01:35
  • You have tagged [tag:json-c], and your intention seems to be to use that library for the task. Using a format-specific API is the best approach for this sort of thing, so you're on the right track. But if your question boils down to "How does one use JSON-C?" then it is too broad for SO. Consult the library's reference material, and try to work out the needed code. If that still leaves you with some specific, narrow questions then we will still be here. – John Bollinger Jan 27 '21 at 20:27
  • @JohnBollinger Thanks. I get where your point about narrowness is coming from. I edited the question. Does it help a bit in clarifying my intentions? – ajfbiw.s Jan 27 '21 at 21:36
  • *Clarifying* your intentions is not the problem. *Focusing* the question so that it affords good answers that are neither "here's your code" nor "here's a complete tutorial on JSON-C" is what is needed. JSON-C has [fairly complete documentation](https://json-c.github.io/json-c/), with links to tutorials. Its API is a typical one: it parses JSON data to an object representation with tree-like structure, which representation affords traversing the tree, modifying it, and converting the result back to a string. Do some research. – John Bollinger Jan 27 '21 at 23:06
  • Isn't the same more or less true of all questions where one discusses ways they have considered approaching a specific problem and seeks advice from those more experienced in the technology (and its conventions, important because there are multiple ways to do things) on how best to proceed? To make the question interpretable as both narrow and broad, depending on the answerer's preference, the last section ("details aside...") distills it into a rather basic (applicable to a broader audience) and easily answerable question for those familiar with the topic. @JohnBollinger – ajfbiw.s Jan 28 '21 at 23:49
  • I think a good answer requires neither the complete tutorial of json-c (note that even whether to use json-c in this case was an unknown) nor 'here is your code' (I make multiple attempts to make the question not about the Rules 1-3, which I presented only to give context). I do admit technically all questions could be answered by 'check the man pages/docs'. Perhaps what makes my question 'inappropriate' is the relative obscurity of json-c or the lessened interest in C (vs Python, etc.) in general. If I asked the same about Python, there'd be a few competing answers already @JohnBollinger – ajfbiw.s Jan 28 '21 at 23:54
  • @ajfbiw.s, if indeed your question would afford an answer of length and scope suitable for SO, then you can take my comments and the lack of any answer so far as signs that *we don't recognize that*. The C tag does not have the volume of questions that the Python tag does, but it is still among SO's most active. Good C questions do not linger unanswered as this question has, and poor ones are quickly closed and / or downvoted. – John Bollinger Jan 29 '21 at 00:04
  • Possibly the involvement of JSON-C (if any -- it's not clear whether that's a requirement) is a contributor, but I, for one, am familiar with that library, and I answer a lot of C questions here, but I do not see how to answer this one without writing a book. – John Bollinger Jan 29 '21 at 00:06

1 Answer

I'm not sure how "simplifying" the JSON will be useful, but using JSON in C can be scary the first time.

I like the cJSON library: it is light, portable, and stable. It has good test coverage, and the license is MIT.

I think this code, which uses cJSON, does what you asked:

#include <cjson/cJSON.h>
#include <stdbool.h>
#include <string.h>
#include <stdio.h>
#include <limits.h>
#include <float.h>

const char json1[] = "{\"field1\":\"something\",\"whatever\":10,\"description\":\"body\",\"id\":\"random\",\"__oh__\":{\"session\":\"12345678jhgfdrtyui\"}}";
const char json2[] = "{\"status\":1,\"query\":{},\"proc\":{\"memory\":{\"total\":17177939968,\"cmax\":18363625472,\"amax\":20000000000},\"cpu\":{\"cores\":[0.788,0.132,0.319,2.951,10.111,3.309,1.43,0.8,2.705,4.203,2.32,2,0.019,0.172,0.247,3.888,0.282,0.423,5.254,0.258,0.009,0.369,3.277,0.048,0.283,7.574,3.086,1.592,0.191,0.166,4.348,0.391,0.085,0.25,7.12,4.927,3.671,1.147,3.216,4.628,0.131,0.995,0.744,4.252,4.022,3.505,3.758,3.491],\"total\":108.886,\"limit\":800},\"disk\":{\"used\":20170,\"limit\":50000,\"io_limit\":500}}}";
const char json3[] = "{\"Name\":\"Tom\",\"Age\":18,\"Address\":\"California\",\"arr\":[1,2,3,4,5]}";

static void simplifyArray(cJSON *input, cJSON *output)
{  
    cJSON *item;
    size_t noElems = 0;
    
    if (cJSON_IsString(cJSON_GetArrayItem(input, 0))) {
        size_t max, min;
        max = 0;
        min = (size_t)-1; /* largest size_t value; UINT_MAX may be narrower than size_t */
        cJSON_ArrayForEach(item, input) {
            noElems++;
            size_t len = strlen(cJSON_GetStringValue(item));
            if (len > max) max = len;
            if (len < min) min = len;
        }
        cJSON *newArray = cJSON_AddArrayToObject(output, input->string);
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(noElems));
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(max));
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(min));

    } else if (cJSON_IsNumber(cJSON_GetArrayItem(input, 0))) {
        double max, min;
        max = -DBL_MAX;
        min = DBL_MAX;
        cJSON_ArrayForEach(item, input) {
            noElems++;
            double value = item->valuedouble;
            if (value > max) max = value;
            if (value < min) min = value;
        }
        cJSON *newArray = cJSON_AddArrayToObject(output, input->string);
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(noElems));
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(max));
        cJSON_AddItemToArray(newArray, cJSON_CreateNumber(min));
    }
}

static void simplify(cJSON *input, cJSON *output)
{
    cJSON *elem;
    for (elem = input; elem != NULL; elem = elem->next) {
        if (cJSON_IsString(elem)) {
            cJSON_AddNumberToObject(output, elem->string, strlen(cJSON_GetStringValue(elem)));
        } else if (cJSON_IsArray(elem)) {
            simplifyArray(elem, output);
        } else if (cJSON_IsObject(elem)) {
            cJSON *newOutput = cJSON_AddObjectToObject(output, elem->string);
            simplify(elem->child, newOutput);
        } else {
            cJSON *dup = cJSON_Duplicate(elem, true);
            cJSON_AddItemToObject(output, elem->string, dup);
        }
    }
}

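static void simplifyAndPrint(const char *json)
{
    cJSON *input = cJSON_Parse(json);
    if (input == NULL) {
        fprintf(stderr, "invalid JSON\n");
        return;
    }
    cJSON *output = cJSON_CreateObject();
    simplify(input->child, output);
    /* cJSON_PrintUnformatted allocates the string; the caller must free it */
    char *text = cJSON_PrintUnformatted(output);
    if (text != NULL) {
        printf("%s\n", text);
        cJSON_free(text);
    }
    cJSON_Delete(input);
    cJSON_Delete(output);
}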

int main()
{
    simplifyAndPrint(json1);
    simplifyAndPrint(json2);
    simplifyAndPrint(json3);
    return 0;
}

The output:

{"field1":9,"whatever":10,"description":4,"id":6,"__oh__":{"session":18}}
{"status":1,"query":{},"proc":{"memory":{"total":17177939968,"cmax":18363625472,"amax":20000000000},"cpu":{"cores":[48,10.111,0.009],"total":108.886,"limit":800},"disk":{"used":20170,"limit":50000,"io_limit":500}}}
{"Name":3,"Age":18,"Address":10,"arr":[5,5,1]}

In the example above I preferred not to alter the input JSON. If you don't care about that, you can use the function cJSON_ReplaceItemInObject to substitute the node in place.

P.S.: I am assuming each array contains only strings or only numbers, never a mix, because there is no rule for handling other array configurations.
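
One case the rules leave open is the empty array. As far as I can tell, simplifyArray above adds nothing to the output for it, because neither the string branch nor the number branch matches when there is no first element. The sketch below shows the [array_len, max, min] reduction on plain C arrays, without cJSON, with one possible choice for the empty case. The struct array_stats type and both function names are hypothetical, and reporting 0 for max and min of an empty array is an assumption, not something the question specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for rules 2 and 3: [array_len, max, min]. */
struct array_stats {
    size_t len;
    double max;
    double min;
};

/*
 * Rule 3: reduce an array of numbers. An empty array has no meaningful
 * max or min, so this sketch reports 0 for both (an assumption).
 */
static struct array_stats number_stats(const double *v, size_t n)
{
    struct array_stats s = { n, 0.0, 0.0 };

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* The first element seeds both max and min, so no sentinel is needed. */
        if (i == 0 || v[i] > s.max) s.max = v[i];
        if (i == 0 || v[i] < s.min) s.min = v[i];
    }
    return s;
}

/* Rule 2: the same reduction over the lengths of the strings. */
static struct array_stats string_stats(const char *const *v, size_t n)
{
    struct array_stats s = { n, 0.0, 0.0 };

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        double len = (double)strlen(v[i]);
        if (i == 0 || len > s.max) s.max = len;
        if (i == 0 || len < s.min) s.min = len;
    }
    return s;
}
```

Seeding max and min from the first element avoids the DBL_MAX and UINT_MAX sentinels, which would otherwise leak into the output if an empty array were emitted with the original loops.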

P.S.2: This code uses the version of the library packaged in Ubuntu 20.04. If you download the library from GitHub, that version will have more features.

  • This is perfect -- a clean and understandable example. With the lack of good existing examples on the subject anywhere (none that really helped me), this will be beneficial to others looking for an intro (crash course, really, because this captures pretty much the gist of everything one needs, the rhythm of it) to working with JSON in C (quite intimidating when not sure just what to use when and what's the proper/conventional way of doing things, especially if they are more used to a higher-level language). Yes, in my specific example, arrays are homogeneous and of either strings or numbers. – ajfbiw.s Jan 29 '21 at 17:14
  • The background here is I'm collecting data for API discovery, and it is only the schema and some basic information that I need. This also roots out (at least reduces) potentially sensitive information in the data. – ajfbiw.s Jan 29 '21 at 17:18
  • https://stackoverflow.com/questions/65097945/query-parameter-preserve-as-json Hi Matheus, does this sound like something you'd know the answer of as well? Cool if not. – ajfbiw.s Mar 04 '21 at 01:15