How do you trim white spaces in both the keys and values in a JavaScript Object recursively?

I ran into an issue while trying to "clean" a user-supplied JSON string before passing it to the rest of my code for further processing.

Say we have a user-supplied JSON string whose property keys and values are all of type "string", but the keys and values are not as clean as desired. For example: { " key_with_leading_n_trailing_spaces ": " my_value_with_leading_spaces" }.

Data like this (dirty data, you might call it) easily causes problems for JavaScript code that consumes it: when the code looks a value up in the object, the key does not match, and the value itself does not match what is expected either. I searched Google and found a few tips, but no single cure that handles everything.

Given this JSON with lots of whitespace in its keys and values:

var badJson = {
  "  some-key   ": "    let it go    ",
  "  mypuppy     ": "    donrio   ",
  "   age  ": "   12.3",
  "  children      ": [
    { 
      "   color": " yellow",
      "name    ": "    alice"
    },    { 
      "   color": " silver        ",
      "name    ": "    bruce"
    },    { 
      "   color": " brown       ",
      "     name    ": "    francis"
    },    { 
      "   color": " red",
      "      name    ": "    york"
    },

  ],
  "     house": [
    {
      "   name": "    mylovelyhouse     ",
      " address      " : { "number" : 2343, "road    "  : "   boardway", "city      " : "   Lexiton   "}
    }
  ]

};

So this is what I came up with (with the help of lodash.js):

//I made this function to "recursively" hunt down keys that may 
//contain leading and trailing white spaces
function trimKeys(targetObj) {

  _.forEach(targetObj, function(value, key) {

      if(_.isString(key)){
        var newKey = key.trim();
        if (newKey !== key) {
            targetObj[newKey] = value;
            delete targetObj[key];
        }

        if(_.isArray(targetObj[newKey]) || _.isObject(targetObj[newKey])){
            trimKeys(targetObj[newKey]);
        }
      }else{

        if(_.isArray(targetObj[key]) || _.isObject(targetObj[key])){
            trimKeys(targetObj[key]);
        }
      }
   });

}

//I stringify this just to show it in its bad state
var badJson = JSON.stringify(badJson);

console.log(badJson);

//now it is partially fixed: values of string type are trimmed
badJson = JSON.parse(badJson,function(key,value){
    if(typeof value === 'string'){
        return value.trim();
    }
    return value;
});

trimKeys(badJson);

console.log(JSON.stringify(badJson));

Note: I did this in two steps because I could not find a better one-shot solution that handles everything. If there is an issue in my code, or a better approach, please share it.

Thanks!

vichsu
  • technically it is not JSON. – epascarello Nov 03 '15 at 22:55
  • Removed json tag as you are talking about a javascript object literal, not JSON. – Mike Brant Nov 03 '15 at 22:59
  • Thanks, epascarello. I may not be using the term accurately, but this is a trivial JavaScript object. If you don't mind, please let me know why it does not qualify as a JSON object. – vichsu Nov 03 '15 at 23:00
  • Now I see the difference; I should say it's a JavaScript object literal. Thanks, Mike! I tried RobG's suggestion but I got "obj.reduce is not a function". Are you referring to object.reduce in the Node.js npm package? – vichsu Nov 03 '15 at 23:11
  • @vichsu—ooops, *reduce* is a method of arrays, I meant to iterate over `Object.keys(obj).reduce(...)`, however the function also needs to be recursive. Not enough time for an answer at the moment. – RobG Nov 03 '15 at 23:31
  • @RobG, it's OK. Yes, it needs to be recursive, but I think epascarello's answer is quite a concise solution for my case. It will take some time to write the one you proposed. Thank you! – vichsu Nov 03 '15 at 23:40
  • @vichsu—yes, epascarello's answer is good, but only if the data can be suitably represented as JSON in the first place (e.g. functions and dates will not survive well). – RobG Nov 03 '15 at 23:43
  • @RobG - Agreed. If functions, dates, or other data types are involved, then it becomes questionable whether it still works. – vichsu Nov 04 '15 at 00:50

8 Answers

You can just stringify it, do a string replace, and reparse it:

JSON.parse(JSON.stringify(badJson).replace(/"\s+|\s+"/g,'"'))
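
As a quick check, here is a minimal sketch of how the one-liner behaves. The helper name trimViaRegex and the sample data are made up for illustration. It trims keys and values on simple input, but a double quote inside a value is treated like a string boundary, so whitespace next to it is stripped as well:

```javascript
// Hypothetical helper name; it only wraps the one-liner shown above.
function trimViaRegex(obj) {
  return JSON.parse(JSON.stringify(obj).replace(/"\s+|\s+"/g, '"'));
}

// Simple data: keys and values are both trimmed.
const simple = trimViaRegex({ "  name  ": "  alice  ", "  tags ": [" a ", " b "] });
console.log(JSON.stringify(simple)); // {"name":"alice","tags":["a","b"]}

// Caveat: the escaped quote inside the value also matches the regex,
// so the space after "let" is removed along with the outer whitespace.
const quoted = trimViaRegex({ " key ": 'and "let" it' });
console.log(quoted.key); // and "let"it
```

So this is probably fine for data you know contains no embedded double quotes, and risky otherwise.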
epascarello
  • This looks pretty good! Thanks, epascarello! This looks to be very concise. – vichsu Nov 03 '15 at 23:37
  • If there were quotes within the object, eg: {" key ": ' and "let" it '}, Stringify will escape them, but the regex will strip whitespace around those as well, producing 'and "let"it'. As JS doesn't have negative lookbehinds, you can solve this with a function: `JSON.parse(JSON.stringify(badJson).replace(/(\\)?"\s*|\s+"/g, ($0, $1) => $1 ? $0 : '"'))` – SamGoody Mar 10 '17 at 08:37
  • You could also use a replacer function with `JSON.stringify` that checks if the value is a string and trims it if it is. See [this CodePen](https://codepen.io/ajmueller/pen/NyXNME). I'm sure this could be optimized for performance, but it's pretty easy to read, somewhat concise, and should perform just fine for reasonably sized objects. – Alex Mueller Feb 17 '18 at 00:16
  • @AlexMueller has the correct solution. Please don't manipulate stringified JSON with regex -- it's stuff like this that gives javascript programmers a bad name. Even with the improved regex w/ negative look-behinds, consider: `'" '` That string has 5 trailing spaces, and doesn't get correctly processed. – kayjtea Nov 30 '20 at 15:32

You can clean up the property names and values using Object.keys to get an array of the keys, then Array.prototype.reduce to iterate over the keys and build a new object with trimmed keys and values. The function needs to be recursive so that it also trims nested Objects and Arrays.

Note that it only deals with plain Arrays and Objects. If you want to deal with other types of object, the initial value passed to reduce needs to be chosen more carefully, based on the type of object (e.g. a suitably clever version of new obj.constructor()).

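function trimObj(obj) {
  // Return null and primitives unchanged (typeof null is 'object', so test it explicitly)
  if (obj === null || !Array.isArray(obj) && typeof obj != 'object') return obj;
  return Object.keys(obj).reduce(function(acc, key) {
    // Trim the key; trim string values and recurse into everything else
    acc[key.trim()] = typeof obj[key] == 'string'? obj[key].trim() : trimObj(obj[key]);
    return acc;
  }, Array.isArray(obj)? []:{});
}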
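
To illustrate the point about other types of object, here is a rough sketch of a variant that only rebuilds plain objects and arrays and passes everything else (null, Dates, functions and so on) through untouched. The names isPlainObject and trimDeep are invented for this example, and treating "plain" as "prototype is Object.prototype or null" is an assumption, not the only possible definition:

```javascript
// Hypothetical helper: true only for objects whose prototype is Object.prototype or null.
function isPlainObject(value) {
  if (value === null || typeof value !== 'object') return false;
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

// Returns a trimmed copy; the input itself is not mutated.
function trimDeep(value) {
  if (typeof value === 'string') return value.trim();
  if (Array.isArray(value)) return value.map(function (item) { return trimDeep(item); });
  if (isPlainObject(value)) {
    return Object.keys(value).reduce(function (acc, key) {
      acc[key.trim()] = trimDeep(value[key]);
      return acc;
    }, {});
  }
  // null, numbers, booleans, Dates, functions, etc. pass through unchanged.
  return value;
}

const when = new Date(0);
const cleaned = trimDeep({ "  name ": "  alice ", " seen ": when, " pets ": [null, { " kind": " cat " }] });
console.log(cleaned.name);          // alice
console.log(cleaned.seen === when); // true
console.log(cleaned.pets[1].kind);  // cat
```

The Date here is returned by reference rather than copied, which may or may not be what you want.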
RobG
  • Thanks, @RobG. This works very well! I just wonder why we have to write code to deal with such issue. Why can't native JS have methods like this by default? – vichsu Nov 04 '15 at 00:01
  • There is a [*TC39 mailing list*](https://mail.mozilla.org/pipermail/es-discuss/). Go for it. ;-) – RobG Nov 04 '15 at 00:33
  • changing first line to `if (obj === null || !Array.isArray(obj) && typeof obj != 'object') return obj;` helped me not get an error for null objects – Daniel May 31 '18 at 17:47

The best solution I have used is this one. Check the documentation on the replacer function of JSON.stringify.

function trimObject(obj){
  var trimmed = JSON.stringify(obj, (key, value) => {
    if (typeof value === 'string') {
      return value.trim();
    }
    return value;
  });
  return JSON.parse(trimmed);
}

var obj = {"data": {"address": {"city": "\n \r     New York", "country": "      USA     \n\n\r"}}};
console.log(trimObject(obj));
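
One thing to keep in mind: a replacer sees the values, so the property keys stay as they were. If the keys need trimming too, one possible approach is to do the work in the reviver of JSON.parse instead. The following is only a sketch (parseTrimmed and the sample object are made-up names), and like any JSON round trip it assumes the data survives JSON.stringify in the first place:

```javascript
// Hypothetical helper: trims keys and string values while parsing JSON text.
function parseTrimmed(jsonText) {
  return JSON.parse(jsonText, function (key, value) {
    if (typeof value === 'string') return value.trim();
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      // The reviver visits children first, so nested values are already clean here;
      // only this object's own keys still need trimming.
      const renamed = {};
      Object.keys(value).forEach(function (k) {
        renamed[k.trim()] = value[k];
      });
      return renamed;
    }
    return value;
  });
}

const obj = { "  data ": { " city ": "\n \r  New York", " zip": 10001 } };

// Replacer only: values are trimmed, keys are not.
console.log(JSON.stringify(obj, (k, v) => (typeof v === 'string' ? v.trim() : v)));
// {"  data ":{" city ":"New York"," zip":10001}}

// Reviver: keys and values are both trimmed.
console.log(JSON.stringify(parseTrimmed(JSON.stringify(obj))));
// {"data":{"city":"New York","zip":10001}}
```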
Hari Das

epascarello's answer above plus some unit tests (just for me to be sure):

function trimAllFieldsInObjectAndChildren(o: any) {
  return JSON.parse(JSON.stringify(o).replace(/"\s+|\s+"/g, '"'));
}

import * as _ from 'lodash';
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren(' bob '), 'bob'));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren('2 '), '2'));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren(['2 ', ' bob ']), ['2', 'bob']));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'b ': ' bob '}), {'b': 'bob'}));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'b ': ' bob ', 'c': 5, d: true }), {'b': 'bob', 'c': 5, d: true}));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'b ': ' bob ', 'c': {' d': 'alica c c '}}), {'b': 'bob', 'c': {'d': 'alica c c'}}));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'a ': ' bob ', 'b': {'c ': {'d': 'e '}}}), {'a': 'bob', 'b': {'c': {'d': 'e'}}}));
assert.true(_.isEqual(trimAllFieldsInObjectAndChildren({'a ': ' bob ', 'b': [{'c ': {'d': 'e '}}, {' f ': ' g ' }]}), {'a': 'bob', 'b': [{'c': {'d': 'e'}}, {'f': 'g' }]}));
Richard

I think a generic map function handles this well. It separates the deep object traversal and transformation from the particular action we wish to perform -

const identity = x =>
  x

const map = (f = identity, x = null) =>
  Array.isArray(x)
    ? x.map(v => map(f, v))
: Object(x) === x
    ? Object.fromEntries(Object.entries(x).map(([ k, v ]) => [ map(f, k), map(f, v) ]))
    : f(x)

const dirty = 
` { "  a  ": "  one "
  , " b": [ null,  { "c ": 2, " d ": { "e": "  three" }}, 4 ]
  , "  f": { "  g" : [ "  five", 6] }
  , "h " : [[ [" seven  ", 8 ], null, { " i": " nine " } ]]
  , " keep  space  ": [ " betweeen   words.  only  trim  ends   " ]
  }
`
  
const result =
  map
   ( x => String(x) === x ? x.trim() : x // x.trim() only if x is a String
   , JSON.parse(dirty)
   )
   
console.log(JSON.stringify(result))
// {"a":"one","b":[null,{"c":2,"d":{"e":"three"}},4],"f":{"g":["five",6]},"h":[[["seven",8],null,{"i":"nine"}]],"keep  space":["betweeen   words.  only  trim  ends"]}

map can be reused to easily apply a different transformation -

const result =
  map
   ( x => String(x) === x ? x.trim().toUpperCase() : x
   , JSON.parse(dirty)
   )

console.log(JSON.stringify(result))
// {"A":"ONE","B":[null,{"C":2,"D":{"E":"THREE"}},4],"F":{"G":["FIVE",6]},"H":[[["SEVEN",8],null,{"I":"NINE"}]],"KEEP  SPACE":["BETWEEEN   WORDS.  ONLY  TRIM  ENDS"]}

making map practical

Thanks to Scott's comment, we add some ergonomics to map. In this example, we write trim as a function -

const trim = (dirty = "") =>
  map
   ( k => k.trim().toUpperCase()          // transform keys
   , v => String(v) === v ? v.trim() : v  // transform values
   , JSON.parse(dirty)                    // init
   )

That means map must accept two functional arguments now -

const map = (fk = identity, fv = identity, x = null) =>
  Array.isArray(x)
    ? x.map(v => map(fk, fv, v)) // recur into arrays
: Object(x) === x
    ? Object.fromEntries(
        Object.entries(x).map(([ k, v ]) =>
          [ fk(k)           // call fk on keys
          , map(fk, fv, v)  // recur into objects
          ] 
        )
      )
: fv(x) // call fv on values

Now we can see key transformation working as separated from value transformation. String values get a simple .trim while keys get .trim() and .toUpperCase() -

console.log(JSON.stringify(trim(dirty)))
// {"A":"one","B":[null,{"C":2,"D":{"E":"three"}},4],"F":{"G":["five",6]},"H":[[["seven",8],null,{"I":"nine"}]],"KEEP  SPACE":["betweeen   words.  only  trim  ends"]}

Here is the complete program, which you can run to verify the results yourself -

const identity = x =>
  x

const map = (fk = identity, fv = identity, x = null) =>
  Array.isArray(x)
    ? x.map(v => map(fk, fv, v))
: Object(x) === x
    ? Object.fromEntries(
        Object.entries(x).map(([ k, v ]) =>
          [ fk(k), map(fk, fv, v) ]
        )
      )
: fv(x)

const dirty = 
` { "  a  ": "  one "
  , " b": [ null,  { "c ": 2, " d ": { "e": "  three" }}, 4 ]
  , "  f": { "  g" : [ "  five", 6] }
  , "h " : [[ [" seven  ", 8 ], null, { " i": " nine " } ]]
  , " keep  spaces  ": [ " betweeen   words.  only  trim  ends   " ]
  }
`

const trim = (dirty = "") =>
  map
   ( k => k.trim().toUpperCase()
   , v => String(v) === v ? v.trim() : v
   , JSON.parse(dirty)
   )
   
console.log(JSON.stringify(trim(dirty)))
// {"A":"one","B":[null,{"C":2,"D":{"E":"three"}},4],"F":{"G":["five",6]},"H":[[["seven",8],null,{"I":"nine"}]],"KEEP  SPACES":["betweeen   words.  only  trim  ends"]}
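
If you only want to touch one side, passing identity for the other function seems to be enough. A small sketch of that idea, where mapKeys, mapValues, trimString and the sample input are names made up for this example -

```javascript
const identity = x => x

// Two-function map: fk transforms keys, fv transforms leaf values.
const map = (fk = identity, fv = identity, x = null) =>
  Array.isArray(x)
    ? x.map(v => map(fk, fv, v))
    : Object(x) === x
      ? Object.fromEntries(Object.entries(x).map(([k, v]) => [fk(k), map(fk, fv, v)]))
      : fv(x)

// Hypothetical wrappers: pass identity for the side that should stay unchanged.
const mapKeys = (fk, x) => map(fk, identity, x)
const mapValues = (fv, x) => map(identity, fv, x)

const trimString = v => (typeof v === 'string' ? v.trim() : v)
const input = { " a ": " one ", " b": [{ "c ": " two " }, 3] }

console.log(JSON.stringify(mapKeys(k => k.trim(), input)))
// {"a":" one ","b":[{"c":" two "},3]}
console.log(JSON.stringify(mapValues(trimString, input)))
// {" a ":"one"," b":[{"c ":"two"},3]}
```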
Mulan
  • Or, perhaps, separate `map` and `mapKeys` functions with similar implementations, and then simply piping one to the other. Something like `pipe (mapKeys (toUpper), map (trim))`. – Scott Sauyet May 06 '20 at 00:22
  • Being able to act on keys and values separately would be worthwhile improvement. Maybe something like `map((k, v) => [ k.trim(), String(v) === v ? v.trim() : v ], dirtyObj)`? It's a slightly more verbose call than the original but it's far more useful being able to distinguish keys from values. – Mulan May 06 '20 at 00:35
  • Hmm, it's trickier than that. Your `mapKeys->map` is two passes over the input but it's the only solution I can see working right now. – Mulan May 06 '20 at 00:43
  • I added an update with your comment in mind. Never wrote `map` like this but I think there might be something to it... – Mulan May 06 '20 at 01:23
  • Unfortunately, it's not a real bifunctor, or there might be more literature on this. The key must map back to key types (`String`/`Symbol`) so there's no true parametricity. But it still seems closely related and genuinely useful. What I don't like about it is that the positional nature of the arguments means while we could easily implement `mapKeys` and `mapValues` using this, it's not so easy to use it directly to do one of those; it's possible, but ugly. (Update: I guess it's not too bad, if you simply pass `identity` for the missing function.) – Scott Sauyet May 06 '20 at 01:38

Similar to epascarello's answer, this is what I did (in Java):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

........

public String trimWhiteSpaceAroundBoundary(String inputJson) {
    String result;
    final String regex = "\"\\s+|\\s+\"";
    final Pattern pattern = Pattern.compile(regex);
    final Matcher matcher = pattern.matcher(inputJson.trim());
    // replacing the pattern twice to cover the edge case of extra white space around ','
    result = pattern.matcher(matcher.replaceAll("\"")).replaceAll("\"");
    return result;
}

Test cases

assertEquals("\"2\"", trimWhiteSpaceAroundBoundary("\" 2 \""));
assertEquals("2", trimWhiteSpaceAroundBoundary(" 2 "));
assertEquals("{   }", trimWhiteSpaceAroundBoundary("   {   }   "));
assertEquals("\"bob\"", trimWhiteSpaceAroundBoundary("\" bob \""));
assertEquals("[\"2\",\"bob\"]", trimWhiteSpaceAroundBoundary("[\"  2  \",  \"  bob  \"]"));
assertEquals("{\"b\":\"bob\",\"c c\": 5,\"d\": true }",
              trimWhiteSpaceAroundBoundary("{\"b \": \" bob \", \"c c\": 5, \"d\": true }"));
some random guy

I tried the JSON.stringify solution above, but it does not work with a string like '"this is \'my\' test"'. You can get around this by using stringify's replacer function and trimming the values as they go in.

JSON.parse(JSON.stringify(obj, (key, value) => (typeof value === 'string' ? value.trim() : value)))

JaredM

@RobG Thank you for the solution. Adding one more condition, so that only object values are passed to the recursive call, avoids creating extra nested objects:

function trimObj(obj) {
  // Return null and primitives unchanged (the checks must be joined with ||, not &&)
  if (obj === null || typeof obj !== 'object') return obj;
  return Object.keys(obj).reduce(function(acc, key) {
    // Trim strings, recurse only into objects and arrays, copy everything else as-is
    acc[key.trim()] = typeof obj[key] === 'string' ?
      obj[key].trim() : typeof obj[key] === 'object' ? trimObj(obj[key]) : obj[key];
    return acc;
  }, Array.isArray(obj)? []:{});
}