1

I am trying to capture each user's username and channel ID from an API response string using regex.

Unfortunately, I cannot use a JSON parser on the data, so I (a beginner) am stuck with regex.

My pattern finds the username key and captures its value, then finds the channel ID key and captures its value. Because it is non-greedy, it takes the shortest possible match and produces one pair of captures per user when multiple people are connected.

But a problem arises if several users of the server are online and some of them are not connected to a channel. The regex then finds the first username and skips over everything in between until it reaches the channel ID of the next user. It gives me a valid channel ID, but paired with the wrong user.

At one point I excluded the character "{", because it separates different users, and that worked. Unfortunately, "{" can sometimes also occur inside a user's parameters, so some users were no longer captured.

Next I tried to forbid the string "id" (including its quotes) from the text allowed between the two capture groups instead.

But I can't get it to work. Do you have any suggestions?

This example captures User 1 and User 3 correctly but pairs the username User 2 with the channel ID of Bot 1. I don't know much about regex flavors, but the test site said PCRE (PHP), and so far that has worked for my program. I shortened the avatar links and the beginning of the string with "...".

Regular Expression:

username": "((?!Bot 1).*?)".*?channel_id": "([0-9]*?)"

String snippet:

"members": [{"id": "0", "username": "User 1", "discriminator": "0000", "avatar": null, "status": "online", "deaf": false, "mute": false, "self_deaf": false, "self_mute": false, "suppress": false, "channel_id": "0123456789", "avatar_url": "https://..."}, {"id": "1", "username": "User 2", "discriminator": "0000", "avatar": null, "status": "online", "game": {"name": "pls help"}, "avatar_url": "https://..."}, {"id": "2", "username": "Bot 1", "discriminator": "0000", "avatar": null, "status": "online", "game": {"name": "music | ;;help"}, "deaf": false, "mute": false, "self_deaf": false, "self_mute": false, "suppress": false, "channel_id": "1234567890", "avatar_url": "https://..."}, {"id": "3", "username": "User 3", "discriminator": "0000", "avatar": null, "status": "online", "deaf": false, "mute": false, "self_deaf": false, "self_mute": false, "suppress": false, "channel_id": "2345678901", "avatar_url": "https://..."}], "presence_count": 4}
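For reference, the mis-pairing can be reproduced with a minimal Python sketch. Two assumptions: Python's `re` module stands in for the PCRE engine used by the program, and `SAMPLE` is a trimmed version of the snippet above with most fields dropped for brevity.

```python
import re

# Trimmed stand-in for the API string above (assumption: most fields removed).
SAMPLE = (
    '"members": ['
    '{"id": "0", "username": "User 1", "status": "online", "channel_id": "0123456789"}, '
    '{"id": "1", "username": "User 2", "status": "online", "game": {"name": "pls help"}}, '
    '{"id": "2", "username": "Bot 1", "game": {"name": "music | ;;help"}, "channel_id": "1234567890"}, '
    '{"id": "3", "username": "User 3", "status": "online", "channel_id": "2345678901"}]'
)

# The pattern from the question, unchanged.
ORIGINAL = r'username": "((?!Bot 1).*?)".*?channel_id": "([0-9]*?)"'

# findall returns one (username, channel_id) tuple per match.
pairs = re.findall(ORIGINAL, SAMPLE)
for username, channel_id in pairs:
    print(username, channel_id)
```

User 2 has no channel_id, so the lazy `.*?` keeps expanding across the object boundary until it reaches the channel_id that belongs to Bot 1.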

Bohemian
Blancsoir
  • 1
    why regex? why not treat it like an object? – depperm Apr 07 '22 at 16:25
  • 1
    if you don't want to use a parser, can you guarantee the order of the properties in the objects? Can you guarantee that all properties are always present in the objects of the array? – cyberbrain Apr 07 '22 at 16:26
  • Because the program I want to use this in does not support objects or parsing the string into an object. No, as you can see, some users have the "game": {} property and some don't. That's why I used .*? for the space between the two groups. – Blancsoir Apr 07 '22 at 16:27
  • 1
    are you using perl? [solution](https://stackoverflow.com/q/50596570/3462319)? or what language are you using – depperm Apr 07 '22 at 16:27
  • I looked that up. I am using this in Rainmeter which states that it uses Perl Compatible Regular Expressions on Google. – Blancsoir Apr 07 '22 at 16:31
  • is this for a plugin? it looks like the [docs](https://docs.rainmeter.net/developers/) mention C++ and C#. – depperm Apr 07 '22 at 16:35
  • It is for a skin. I used this as basis but that one only catches all online users on the server [link](https://www.reddit.com/r/Rainmeter/comments/4no8hx/oc_discord_user_list/) – Blancsoir Apr 07 '22 at 16:46

2 Answers

0

As others suggested, plan A should be to parse the object. For plan B, your regex might look like this:

"username": "([^"]+)"

It gets a bit trickier if you allow escapes, for example if a username is "User says \"hi\" always". In that case you could use the pattern described in "Unroll Loop, when to use":

[image: pattern from the linked answer]

Here the normal case would be [^"\\] (neither a double quote nor the escape character), and the special case would be \\" (an escaped double quote).
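As a rough illustration of those two cases, here is a minimal Python sketch of an unrolled-loop username pattern, in the form normal* (special normal*)*. The sample object with id "4" is hypothetical and not taken from the question's data, and Python's `re` is used in place of PCRE.

```python
import re

# normal  = [^"\\]   anything except a double quote or a backslash
# special = \\"      a backslash followed by a double quote
# unrolled form: normal* (?: special normal* )*
USERNAME = re.compile(r'"username": "([^"\\]*(?:\\"[^"\\]*)*)"')

# Hypothetical member object whose username contains escaped quotes.
sample = r'{"id": "4", "username": "User says \"hi\" always", "status": "online"}'

match = USERNAME.search(sample)
# The capture still contains the backslashes exactly as they appear in the input.
print(match.group(1))
```

A common generalization replaces the special case with `\\.` so that any escaped character is accepted, including an escaped backslash.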

To add on the channel_id, and assuming each object starts with {"id": ... you could then do:

[image: regex combining username and channel_id]

Hope it helps. Needless to say, it's pretty overkill! I'd simplify it a bit, or rather get rid of the regex entirely if you can. Good luck!

David542
  • Okay, if I use your code, how could I exclude users like "Bot 1" from the captured usernames? Otherwise this works well! – Blancsoir Apr 07 '22 at 17:24
  • I can't say that I understand your solution :D. The other answer was a little bit faster and I used that successfully, as it was relatively easy to implement. Nonetheless, thank you very much for your solution. – Blancsoir Apr 08 '22 at 19:32
0

Don't allow {"id" between username and channel:

username": "((?!Bot 1)[^"]*)"(?:(?!\{"id").)*channel_id": "(\d+)"

See live demo.

Username and channel ID are captured in groups 1 and 2.

Some other minor adjustments included.
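To try the pattern outside the live demo, here is a minimal Python sketch. Two assumptions: Python's `re` module stands in for PCRE (the lookahead, `\d`, and non-capturing group used here behave the same way in both), and `SAMPLE` is a trimmed version of the string from the question.

```python
import re

# Trimmed stand-in for the API string from the question (assumption: most fields removed).
SAMPLE = (
    '"members": ['
    '{"id": "0", "username": "User 1", "status": "online", "channel_id": "0123456789"}, '
    '{"id": "1", "username": "User 2", "status": "online", "game": {"name": "pls help"}}, '
    '{"id": "2", "username": "Bot 1", "game": {"name": "music | ;;help"}, "channel_id": "1234567890"}, '
    '{"id": "3", "username": "User 3", "status": "online", "channel_id": "2345678901"}]'
)

# (?:(?!\{"id").)* consumes one character at a time, but only while the
# text at that position does not start the next object with {"id".
PATTERN = re.compile(
    r'username": "((?!Bot 1)[^"]*)"(?:(?!\{"id").)*channel_id": "(\d+)"'
)

# Group 1 is the username, group 2 is the channel ID.
pairs = PATTERN.findall(SAMPLE)
for username, channel_id in pairs:
    print(username, channel_id)
```

User 2 produces no match because no channel_id appears before the next {"id", and Bot 1 is rejected by the (?!Bot 1) lookahead. The nested {"name" inside "game" does not stop the scan, since only {"id" is forbidden.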

Bohemian
  • Hi this nearly works too, but it only captures the last symbol from User 1, in this case 1. How do I get the whole Username "User 1" captured? – Blancsoir Apr 07 '22 at 17:31
  • Okay I got it by shifting the * into the brackets. Now it says: `username": "((?!Bot 1)[^"]*)"(?:(?!\{"id").)*channel_id": "(\d+)"` I can't yet give votes but thank you very much! – Blancsoir Apr 07 '22 at 17:41
  • @Blancsoir oops, yeah. Answer updated. – Bohemian Apr 07 '22 at 17:47