I have a string containing a JavaScript object as follows:

const myString = `[{"url":"https:\/\/audio.ngfiles.com\/1171000\/1171300_small-talk.mp3?f1668090863","is_published":true,"portal_id":2,"file_id":0,"project_id":1973416,"item_id":1171300,"description":"Audio File","width":null,"height":null,"filesize":5776266,"params":{"filename":"https:\/\/audio.ngfiles.com\/1171000\/1171300_small-talk.mp3?f1668090863","name":"small%20talk","length":"145","loop":0,"artist":"arbelamram","icon":"https:\/\/aicon.ngfiles.com\/1171\/1171300.png?f1668090865","images":{"listen":{"playing":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.png?f1668090905","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.png"},"completed":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.completed.png?f1668090905","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.completed.png"}},"condensed":{"playing":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.png?f1668090906","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.png"},"completed":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.completed.png?f1668090906","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.completed.png"}}},"duration":145},"portal_item_requirements":[5],"html":"\n\n<div id=\"audio-listen-player\" class=\"audio-listen-player\">\n\t<div id=\"audio-listen-wrapper\" class=\"audio-listen-wrapper\">\n\n\t\t<div id=\"waveform\" class=\"audio-listen-container\"><\/div>\n\n\t\t<div class=\"outer-frame\"><\/div>\n\n\t\t<p id=\"cant-play-mp3\" style=\"display:none\">Your Browser does not support html5\/mp3 audio playback.!!!<\/p>\n\n\t\t<p id=\"loading-audio\">\n\t\t\t<em class=\"fa fa-spin fa-spinner\"><\/em> LOADING...\n\t\t<\/p>\n\t<\/div>\n\n\t<div class=\"audio-listen-controls\">\n\t\t<div 
class=\"play-controls\">\n\t\t\t<button class=\"audio-listen-btn\" id=\"audio-listen-play\" disabled>\n\t\t\t\t<i class=\"fa fa-play\"><\/i>\n\t\t\t<\/button>\n\n\t\t\t<button class=\"audio-listen-btn\" id=\"audio-listen-pause\" disabled>\n\t\t\t\t<i class=\"fa fa-pause\"><\/i>\n\t\t\t<\/button>\n\n\t\t<\/div>\n\t\t<div class=\"playback-info\">\n\t\t\t<span id=\"audio-listen-progress\">00.00<\/span>\n\t\t\t\/\n\t\t\t<span id=\"audio-listen-duration\">00.00<\/span>\n\t\t<\/div>\n\t\t<div class=\"sound-controls\">\n\t\t\t<button class=\"audio-listen-btn\" id=\"audio-listen-repeat\">\n\t\t\t\t<i class=\"fa fa-retweet\"><\/i>\n\t\t\t<\/button>\n\n\t\t\t\t\t\t\t<button class=\"audio-listen-btn\" id=\"audio-listen-volumeToggle\">\n\t\t\t\t\t<i class=\"fa fa-volume-off\"><\/i>\n\t\t\t\t<\/button>\n\n\t\t\t\t<div class=\"off\" id=\"audio-listen-volume\"><\/div>\n\t\t\t\n\t\t<\/div>\n\t<\/div>\n<\/div>\n\n",
callback:function(){(function($) { var player = NgAudioPlayer.fromListenPage({ 'generic_id': 1171300, 'type_id': 3, 'url': "https:\/\/audio.ngfiles.com\/1171000\/1171300_small-talk.mp3?f1668090863", 'version': 1668090863, 'duration': 145, 'loop': false, 'images': {"listen":{"playing":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.png?f1668090905","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.png"},"completed":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.completed.png?f1668090905","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.completed.png"}},"condensed":{"playing":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.png?f1668090906","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.png"},"completed":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.completed.png?f1668090906","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.completed.png"}}}, 'playlist': 'listen' }, 128);   })(jQuery); }}]`

As you can see, it is JavaScript rather than JSON, because it contains a function. I need a way to parse the rest of the object as JSON without using eval.

It can either return null where there were functions or completely remove the key and value.

I have tried the following regex, but I'm not good with regular expressions, and every attempt mangles the string so that it can neither be parsed nor run:

/function\([^()]*\){[^}]*}/gi

I then replace each match with null.

LuisAFK
  • If you use `JSON.stringify()`, the object will be converted to JSON. Since JSON cannot contain functions, they will be dropped. – Gabe Nov 12 '22 at 17:15
  • @Gabe I'm sorry, I worded my question badly. What I have is a string, which I extract with a regex while scraping some HTML. It's a string, not an object – LuisAFK Nov 12 '22 at 17:31
  • From where are you getting this string? It's obviously not JSON. Maybe fix it at the sending end if you can...? – Andy Nov 12 '22 at 18:28
  • @Andy I can't fix it because I'm getting it from someone's website. Also, you're right, it's not valid JSON, it's JavaScript. And that's the problem, I need to parse it as JSON because it's basically JSON but with functions – LuisAFK Nov 12 '22 at 18:31
  • I don't think you can because the string itself doesn't parse due to a mix of single/double quotes, and you can't work on a string that doesn't parse. Can you refine the scraping tool to only get what it is you need? – Andy Nov 12 '22 at 21:21
  • @Andy don't you know any Regex that could fix it? – LuisAFK Nov 13 '22 at 08:52
  • I think for this to work with regex you would need the recursive regex operator `(?R)`, which is not supported in JavaScript. Maybe this answer can help you write a function that cuts the function block out of the string: https://stackoverflow.com/a/14952529/17438890 – Gabe Nov 13 '22 at 09:34
  • Your non-JSON already has a problem at this snippet: `"html":"\n\n
    – trincot Nov 13 '22 at 09:38
  • You still haven't told us what information you want to extract from that string. Maybe we can find a totally different solution if we know that. – Poul Bak Nov 13 '22 at 17:36
  • @PoulBak everything except for the functions. I did say it in my question, I wanted to remove the function and parse everything else – LuisAFK Nov 13 '22 at 19:14

2 Answers


Note to OP and others that want to play with this: You can't just use:

`...` 

when you hardcode the string here on SO. You must use:

 String.raw`...`

Otherwise the template literal processes the escape sequences (for example, \n becomes a real newline), which breaks JSON.parse. This is of course not a problem if you receive the text as a string at runtime.
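The difference is easy to check on a tiny input. The following is only a sketch: the one-property object and the `tryParse` helper are made up for illustration and are not part of the scraped data.

```javascript
// Hypothetical helper: returns the parsed value, or null if JSON.parse throws.
function tryParse(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return null;
  }
}

// Plain template literal: \n is turned into a real newline character,
// and a raw newline inside a JSON string is not allowed.
const cooked = `{"html":"a\nb"}`;

// String.raw keeps the backslash and the n as two characters,
// so JSON.parse sees a normal JSON escape sequence.
const raw = String.raw`{"html":"a\nb"}`;

const cookedResult = tryParse(cooked); // null, parsing failed
const rawResult = tryParse(raw);       // object whose html contains a newline

console.log(cookedResult, rawResult);
```
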

Having fixed that, you can use the following regex to match the 'garbage':

/,\s*callback[\s\S]+(?=\}\])/g

Replace with an empty string and you will have valid JSON.

Explanation:

,\s*callback[\s\S]+(?=\}\]) matches a comma, optional whitespace and the literal callback, followed by one or more arbitrary characters (greedy), up to the point where the lookahead finds }]

This works because [\s\S]+ is greedy. It first consumes the rest of the text, then backtracks until the lookahead matches, so the match ends just before the last }] in the string.
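The greedy behaviour can also be checked on a much shorter input. This is a sketch with a made-up stand-in for the scraped string; only the regex itself comes from this answer.

```javascript
// Hypothetical, shortened stand-in for the scraped string.
const sample = String.raw`[{"id":1,"html":"<p>hi<\/p>",
callback:function(){ var o = {"a":{"b":1}}; }}]`;

// [\s\S]+ first runs to the end of the text, then backtracks to the
// last position that is followed by }] and stops there.
const cleaned = sample.replace(/,\s*callback[\s\S]+(?=\}\])/g, '');

// Everything from the comma before callback up to the function's closing
// brace is gone; the final }] is kept because a lookahead consumes nothing.
const parsed = JSON.parse(cleaned);

console.log(cleaned);
console.log(parsed);
```

The nested braces inside the function body do not matter here, because the match is anchored on the last }] and not on the first closing brace. It does rely on the function being the last property of the last object in the array.
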

Proof (to @PeterThoeny):

const myString = String.raw`[{"url":"https:\/\/audio.ngfiles.com\/1171000\/1171300_small-talk.mp3?f1668090863","is_published":true,"portal_id":2,"file_id":0,"project_id":1973416,"item_id":1171300,"description":"Audio File","width":null,"height":null,"filesize":5776266,"params":{"filename":"https:\/\/audio.ngfiles.com\/1171000\/1171300_small-talk.mp3?f1668090863","name":"small%20talk","length":"145","loop":0,"artist":"arbelamram","icon":"https:\/\/aicon.ngfiles.com\/1171\/1171300.png?f1668090865","images":{"listen":{"playing":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.png?f1668090905","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.png"},"completed":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.completed.png?f1668090905","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.completed.png"}},"condensed":{"playing":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.png?f1668090906","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.png"},"completed":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.completed.png?f1668090906","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.completed.png"}}},"duration":145},"portal_item_requirements":[5],"html":"\n\n<div id=\"audio-listen-player\" class=\"audio-listen-player\">\n\t<div id=\"audio-listen-wrapper\" class=\"audio-listen-wrapper\">\n\n\t\t<div id=\"waveform\" class=\"audio-listen-container\"><\/div>\n\n\t\t<div class=\"outer-frame\"><\/div>\n\n\t\t<p id=\"cant-play-mp3\" style=\"display:none\">Your Browser does not support html5\/mp3 audio playback.!!!<\/p>\n\n\t\t<p id=\"loading-audio\">\n\t\t\t<em class=\"fa fa-spin fa-spinner\"><\/em> LOADING...\n\t\t<\/p>\n\t<\/div>\n\n\t<div class=\"audio-listen-controls\">\n\t\t<div 
class=\"play-controls\">\n\t\t\t<button class=\"audio-listen-btn\" id=\"audio-listen-play\" disabled>\n\t\t\t\t<i class=\"fa fa-play\"><\/i>\n\t\t\t<\/button>\n\n\t\t\t<button class=\"audio-listen-btn\" id=\"audio-listen-pause\" disabled>\n\t\t\t\t<i class=\"fa fa-pause\"><\/i>\n\t\t\t<\/button>\n\n\t\t<\/div>\n\t\t<div class=\"playback-info\">\n\t\t\t<span id=\"audio-listen-progress\">00.00<\/span>\n\t\t\t\/\n\t\t\t<span id=\"audio-listen-duration\">00.00<\/span>\n\t\t<\/div>\n\t\t<div class=\"sound-controls\">\n\t\t\t<button class=\"audio-listen-btn\" id=\"audio-listen-repeat\">\n\t\t\t\t<i class=\"fa fa-retweet\"><\/i>\n\t\t\t<\/button>\n\n\t\t\t\t\t\t\t<button class=\"audio-listen-btn\" id=\"audio-listen-volumeToggle\">\n\t\t\t\t\t<i class=\"fa fa-volume-off\"><\/i>\n\t\t\t\t<\/button>\n\n\t\t\t\t<div class=\"off\" id=\"audio-listen-volume\"><\/div>\n\t\t\t\n\t\t<\/div>\n\t<\/div>\n<\/div>\n\n",
callback:function(){(function($) { var player = NgAudioPlayer.fromListenPage({ 'generic_id': 1171300, 'type_id': 3, 'url': "https:\/\/audio.ngfiles.com\/1171000\/1171300_small-talk.mp3?f1668090863", 'version': 1668090863, 'duration': 145, 'loop': false, 'images': {"listen":{"playing":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.png?f1668090905","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.png"},"completed":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.completed.png?f1668090905","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.listen.completed.png"}},"condensed":{"playing":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.png?f1668090906","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.png"},"completed":{"url":"https:\/\/img.ngfiles.com\/audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.completed.png?f1668090906","rel_path":"audio_peaks\/3\/1171000\/1171300.1668090863-1505287.condensed.completed.png"}}}, 'playlist': 'listen' }, 128);   })(jQuery); }}]`;

    let regex = /,\s*callback[\s\S]+(?=\}\])/g;

    let json = myString.replaceAll(regex, '');

    let myObject = JSON.parse(json);

    console.log(myObject);
Poul Bak
  • I don't think this is correct, you can't look for `}]`. The callback function looks like this: `callback: function () { ... }`, so you'd need to find the matching closing curly bracket because inside the function you have other curly brackets. – Peter Thoeny Nov 14 '22 at 20:28
  • @PeterThoeny: `[\s\S]+` is greedy! It will start by matching everything, then move back until the lookahead is true. Just try it. I agree that changing the input might invalidate the regex. – Poul Bak Nov 15 '22 at 10:15
  • This assumes that there is only one callback function, and it needs to be at the end. – Peter Thoeny Nov 16 '22 at 21:39

You need a proper language parser to replace a function with null.

The challenge with regular expressions is that the input contains nested parentheses and braces, and you need to find the matching closing brace in order to locate the end of the function.

You can approximate that with regex in three steps:

  1. annotate { and } with nesting level, such as {~0~ ... {~1~ ... }~1~ ... }~0~
  2. replace the function with null by looking non-greedily for the closing bracket with same nesting level, such as function() {~1~ ... {~2~ ... }~2~... }~1~
  3. clean up nesting annotation of remaining brackets

Here is working code with silly input to demonstrate the three steps:

const input = `[{
  name: "jimmy",
  avatar: { size16: "jimmy16.png" },
  callback: function() {
    (function ($) {
      let o = { foo: "bar", sub: { marine: 1 } };
    })(jQuery);
  },
  zzz: "z1"
}]`;
console.log('input: ' + input);
let level = 0;
let inputWithLevel = input.replace(/[\{\}]/g, m => {
  if(m === '{') {
    return '{~' + (level++) + '~';
  } else {
    return '}~' + (--level) + '~';
  }
});
console.log('inputWithLevel: ' + inputWithLevel);
let result = inputWithLevel
  .replace(/\b(\w+: *)function\(\) *\{~(\d+)~[\s\S]*?\}~(\2)~/g, '$1null')
  .replace(/([\{\}])~\d+~/g, '$1');
console.log('result: ' + result);

Output:

input: [{
  name: "jimmy",
  avatar: { size16: "jimmy16.png" },
  callback: function() {
    (function ($) {
      let o = { foo: "bar", sub: { marine: 1 } };
    })(jQuery);
  },
  zzz: "z1"
}]
inputWithLevel: [{~0~
  name: "jimmy",
  avatar: {~1~ size16: "jimmy16.png" }~1~,
  callback: function() {~1~
    (function ($) {~2~
      let o = {~3~ foo: "bar", sub: {~4~ marine: 1 }~4~ }~3~;
    }~2~)(jQuery);
  }~1~,
  zzz: "z1"
}~0~]
result: [{
  name: "jimmy",
  avatar: { size16: "jimmy16.png" },
  callback: null,
  zzz: "z1"
}]

A word of caution: This regex approach does not cover these corner cases:

  • the input has unbalanced braces (i.e. it does not compile)
  • a string value contains unbalanced braces
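
The second corner case can be reduced by scanning character by character and skipping quoted strings while counting braces. The following is only a sketch, not a parser: `findClosingBrace` and `removeFunctionProperties` are hypothetical helper names, and the code assumes that each function property is preceded by a comma, that the function bodies contain no comments or regex literals with quotes or braces, and that no string value contains text that looks like `, key: function () {`. It removes the key together with the function, because an unquoted key such as callback would not be valid JSON even with null as its value.

```javascript
// Hypothetical helper: index just past the brace closing the one at openIndex,
// ignoring braces that appear inside quoted strings.
function findClosingBrace(text, openIndex) {
  let depth = 0;
  let quote = null; // quote character of the string being scanned, if any
  for (let i = openIndex; i < text.length; i++) {
    const ch = text[i];
    if (quote !== null) {
      if (ch === '\\') i++; // skip the escaped character
      else if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'" || ch === '`') {
      quote = ch;
    } else if (ch === '{') {
      depth++;
    } else if (ch === '}' && --depth === 0) {
      return i + 1;
    }
  }
  return -1; // braces never balanced
}

// Hypothetical helper: drops every `, key: function (...) { ... }` property.
function removeFunctionProperties(text) {
  const start = /,\s*["']?[\w$]+["']?\s*:\s*function\s*\([^)]*\)\s*\{/g;
  let out = '';
  let pos = 0;
  let match;
  while ((match = start.exec(text)) !== null) {
    const open = match.index + match[0].length - 1; // the opening brace
    const end = findClosingBrace(text, open);
    if (end === -1) break; // unbalanced: keep the rest of the text as it is
    out += text.slice(pos, match.index);
    pos = end;
    start.lastIndex = end;
  }
  return out + text.slice(pos);
}

// Made-up input with a brace inside a string value and inside the function.
const demoInput =
  '[{"name":"ji}mmy",callback:function(){ var o = {"s":"}"}; },"zzz":"z1"}]';
const demoResult = JSON.parse(removeFunctionProperties(demoInput));
console.log(demoResult);
```

Unlike the annotation approach, this does not care where the function sits in the object or how many there are, as long as the assumptions above hold.
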
Peter Thoeny