
I'll preface this by saying I came up with this regex using RegExr, just typing in random things until I got what I needed.

The goal I had in mind was to generate an array from a JSON string using:

my @valuesArray=$msgPayload=~/((?=["]).+?(?=["])")/g;

This works for most cases; however, it just selects everything from one quote to the next quote. This becomes an issue when there is an unquoted value like true, false, or null.

Example String

..."email":"","lastLogon":null,"pwdChanged":"0","status":"A","phoneNumber":"","extension":"","locale":"en_US","boChecked":false,"boUserId":"","UserId":"TWEH","sChecked":false,...

So I am working around this by doing multiple replaces before actually splitting the string up into an array.

$msgPayload =~ s/:true/:"true"/g;
$msgPayload =~ s/:false/:"false"/g;
$msgPayload =~ s/:null/:"null"/g;
my @valuesArray=$msgPayload=~/((?=["]).+?(?=["])")/g;

Now I was wondering if there is a way to combine all of this into a single regular expression that also selects the true, false, and null values without first having to do the replaces. If it is possible, I feel like it would be more efficient to do it that way, but if I am wrong, I would love to know more.

Thanks!

Edit: We don't have any modules installed, and I doubt I could get any installed for this project since it is so small. I am just trying to make do with what I have.

The data is stored in a varchar column in a table. One full example would be received like so.

[ {  "signonId" : "",  "userId" : "USERNAME",  "groupID" : "Master",  "userName" : "",  "phoneNumber" : "",  "extension" : "4444",  "emailAddress" : "",  "password" : "",  "locale" : ""}, {  "signonId" : "",  "userId" : "USERNAME",  "groupID" : "Master",  "userName" : "",  "phoneNumber" : "",  "extension" : "5555",  "emailAddress" : "",  "password" : "",  "locale" : ""} ]

Which, once the escaped quotes are replaced with " and the string is formatted, becomes:

[
   {
      "signonId":"",
      "userId":"USERNAME",
      "groupID":"Master",
      "userName":"",
      "phoneNumber":"",
      "extension":"4444",
      "emailAddress":"",
      "password":"",
      "locale":""
   },
   {
      "signonId":"",
      "userId":"USERNAME",
      "groupID":"Master",
      "userName":"",
      "phoneNumber":"",
      "extension":"5555",
      "emailAddress":"",
      "password":"",
      "locale":""
   }
]
ysth
Muamer Bektić
  • In general it is better to parse JSON using a dedicated parser module, see for example [JSON::XS](https://metacpan.org/pod/JSON::XS). Then you can also easily extract your array values from the resulting hash. – Håkon Hægland Mar 12 '19 at 13:59
  • If you give us some correct JSON, we can show you how to turn it into a data structure. Regular expressions cannot exhaustively parse JSON in the way you're trying to. But it's probably as simple as `my $data = JSON -> decode ( $string );` and go from there. – Sobrique Mar 12 '19 at 14:04
  • @Sobrique We don't have any modules installed, and I doubt I could get any installed for this project since it is so small. I am just trying to make do with what I have. – Muamer Bektić Mar 12 '19 at 14:28
  • perl-JSON ships by default with Centos/Redhat. Otherwise what you're trying to do is re-implement some prior art, with less testing. – Sobrique Mar 12 '19 at 15:37
  • In 2019, there is [no excuse for](https://stackoverflow.com/questions/3735836/how-can-i-install-perl-modules-without-root-privileges) [not being able to](https://stackoverflow.com/questions/251705/how-can-i-use-a-new-perl-module-without-install-permissions) [install modules](https://stackoverflow.com/questions/2980297/how-can-i-use-cpan-as-a-non-root-user). – mob Mar 12 '19 at 17:32
  • @mob Thanks for the helpful information. This is my first time working with Perl, so there is a lot I do not know. – Muamer Bektić Mar 12 '19 at 17:46

1 Answer

First, your example data looks like it wants to be a hash, not an array.

Second, parsing JSON with regexes is (as you have already discovered) very fragile and error-prone. It's much better to use a proper JSON parser, such as the JSON module:

#!/usr/bin/env perl

use strict;
use warnings;

use JSON 'from_json';
use Data::Dumper;

my $json = q({
  "email": "",
  "lastLogon":null,
  "pwdChanged":"0"
} );

my $decoded = from_json( $json );

print Dumper $decoded;

Output:

$VAR1 = {
          'email' => '',
          'pwdChanged' => '0',
          'lastLogon' => undef
        };
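
If installing a CPAN module really is not possible, the same approach may still work with JSON::PP, which has shipped with Perl itself since 5.14. The following is a minimal sketch under that assumption; the payload is an illustrative mix of fields from the question's two examples (not real data), and the variable names are made up for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# JSON::PP is a core module since Perl 5.14; decode_json is exported by default.
use JSON::PP;

# Illustrative payload: an array of objects, mixing quoted strings
# with the bare literals true, false and null.
my $json = q([
  { "userId": "USERNAME", "extension": "4444", "boChecked": false, "lastLogon": null },
  { "userId": "USERNAME", "extension": "5555", "boChecked": true,  "lastLogon": null }
]);

# The result is a reference to an array of hash references.
my $records = decode_json($json);

# Pull a single field out of every record.
my @extensions = map { $_->{extension} } @$records;

# true/false decode to boolean objects, null decodes to undef.
for my $rec (@$records) {
    my $checked = $rec->{boChecked} ? 'yes' : 'no';
    my $logon   = defined $rec->{lastLogon} ? $rec->{lastLogon} : '(null)';
    print "$rec->{userId} ext=$rec->{extension} checked=$checked lastLogon=$logon\n";
}
```

Because the decoder already distinguishes strings, booleans and null, none of the `s/:true/:"true"/g` style rewrites are needed before the values are used.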
Dave Sherohman
  • The issue is we do not have any modules installed, and I doubt I could get someone to install them. It's just a weird set of circumstances: I get the string from a table where a JSON string is stored as a varchar. I've added more proper examples to my question above. – Muamer Bektić Mar 12 '19 at 14:22
  • @MuamerBektić If you are on Perl 5.14 or newer, [JSON::PP](https://perldoc.pl/JSON::PP) is core (though slow). As an alternative, [JSON::Tiny](https://metacpan.org/pod/JSON::Tiny) can be [fatpacked](https://metacpan.org/pod/App::FatPacker) into any script. – Grinnz Mar 12 '19 at 17:11
  • @Grinnz I see [no JSON listed in perldoc](https://perldoc.perl.org/index-modules-J.html). Is that only about docs? (I seem to recall that it was added to core (v5.14?) ... and later removed ... false memory?) – zdim Mar 12 '19 at 17:24
  • @MuamerBektić I absolutely second the idea to somehow use a module. The `JSON::Tiny` linked by Grinnz can even merely be copied -- it's a single file with no dependencies (from what I can see) – zdim Mar 12 '19 at 17:24
  • @zdim I'm probably going to try to incorporate it. The only issue is that the strings aren't always JSON values; sometimes they show up as XML, and I didn't want to have to use two modules, so I was just planning on using arrays created by regex. My biggest concern in the end is efficiency, so if it does improve speed I will do it. – Muamer Bektić Mar 12 '19 at 17:30
  • @MuamerBektić Hang on ... do you mean to say that you don't actually have proper JSON? If _any_ of it isn't correct JSON then modules will choke on it. What do you mean by "_strings aren't always json values ... but show up as xml_" ? Can you update the question to this fact, and give example(s) of string(s) you have which aren't valid JSON? – zdim Mar 12 '19 at 17:40
  • @zdim So basically, I am pulling these values from a table where they are being dumped. The dumped values are one of three things: XML, a JSON string, or empty. I have an if statement set up to differentiate between these three situations and proceed accordingly, so there's no way the JSON process would happen for an XML string. Sorry if I worded it poorly the first time. – Muamer Bektić Mar 12 '19 at 17:45
  • @zdim perldoc.perl.org is unmaintained and has many [longstanding issues](https://github.com/jonallen/perldoc.perl.org/issues) (like the one where it has never listed HTTP::Tiny and JSON::PP that have been core since 5.14), this is why I created perldoc.pl in the first place. – Grinnz Mar 12 '19 at 17:46
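
For completeness, the single-pattern variant the question asks about can be sketched with an alternation: match either a whole double-quoted string or one of the bare literals. This is only a rough sketch, not a JSON parser. It assumes flat data like the example string, returns keys as well as values (as the original pattern does), and ignores unquoted numbers and nesting, so it stays as fragile as the answer above warns.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Fragment taken from the example string in the question.
my $msgPayload = '"email":"","lastLogon":null,"pwdChanged":"0","boChecked":false,"UserId":"TWEH"';

# One pass, no prior substitutions:
#   "(?:[^"\\]|\\.)*"          a complete quoted string, backslash escapes allowed
#   \b(?:true|false|null)\b    or one of the bare literals
# Quoted strings are consumed whole, so a "true" inside a string is not split out.
my @valuesArray = $msgPayload =~ /("(?:[^"\\]|\\.)*"|\b(?:true|false|null)\b)/g;

print join(' | ', @valuesArray), "\n";
```

With this input the array holds ten tokens: the quoted keys and values keep their surrounding quotes, while `null` and `false` come through unquoted. Whether this is faster than the three substitutions is something to measure rather than assume.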