0

I get the following string from a logging service:

[("Browser": "Chrome73 (v 73.0)"), ("UserAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"), ("Languages": ["nb-NO", "nb;q=0.9", "no;q=0.8", "nn;q=0.7", "en-US;q=0.6", "en;q=0.5"]), ("UserClaim-1-http://schemas.microsoft.com/ws/2008/06/identity/claims/role": "Admin"), ("SessionId": "hhaztuwfpyuobfslljuy4z4e"), ("Cookie-__RequestVerificationToken": "9MJm_A4agsgbe4c_JtAePFnfMLBEgnkc0XhROfDFVd6291SUGtLPAqprsGHBcJw9JDRde6UR_1jHY_Hr4oKi4OZzuUDXqAA6IfeEtr9sxVI1"), ("Cookie-.ASPXAUTH": "AA23B2B1A5C428BFB60E32EA5A78A7D5016D7586F88548C012A1C2C2EB2A34D40A959B43680BCCE9923F1890017F59A3A82E6C1121AF50CF226D638FBCBC40F2D8E2FE4C945B44CC7572717D56C71FCC0B7B285A0EB5379370ADC6BE970E6438"), ("Cookie-ASP.NET_SessionId": "hhaztuwfpyuobfslljuy4z4e"), ("Info-FamilyId": 21267), ("Info-LoggedInUserID": 1), ("Info-MainConsultantUserId": 3)]

And I would like to turn it into a dictionary. It should be a simple task, I thought, but I have been trying to parse the string in various ways without success. Can anyone point me in the right direction?

I have been trying to use:

var x = JsonConvert.DeserializeObject(the_string_above);

I get the exception: Unexpected character encountered while parsing value: (. Path '', line 1, position 1.

The format is not valid JSON, as pointed out in the comments, so the question then is how I can parse the text...

Brhaka
Liknes
  • Don't parse JSON yourself, use a JSON parser, e.g. Newtonsoft. – MakePeaceGreatAgain Jul 05 '19 at 08:52
  • 3
    What do you mean by an array of dictionaries? This looks like an array of key-value tuples, so one dictionary? – V0ldek Jul 05 '19 at 08:52
  • You should deserialize this with a suitable deserializer instead of trying to do it yourself; rely on tested and approved functionality. https://stackoverflow.com/questions/7895105/deserialize-json-with-c-sharp – Denis Schaf Jul 05 '19 at 08:53
  • 1
    *I should be a simple task I thought, but I have been trying to parse the string in various ways without success* Show us what you got so far. – Patrick Hofman Jul 05 '19 at 08:54
  • Looking at your example are you just trying to turn this into a single dictionary? – sr28 Jul 05 '19 at 08:55
  • Turning the string into a single dictionary would be correct. I have edited the text... – Liknes Jul 05 '19 at 08:57
  • That is not valid JSON. – Patrick Hofman Jul 05 '19 at 09:07
  • http://json.parser.online.fr/ Paste your string & try to fix I think just some format error !? – TimChang Jul 05 '19 at 09:08
  • The format is not valid JSON so I guess the question really is how to parse this string correctly... – Liknes Jul 05 '19 at 09:15
  • Your JSON should look like this: { "Browser": "Chrome73 (v 73.0)", "UserAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36", ... } – Arthur S. Jul 05 '19 at 09:22
  • I know the format should be different, but I have to work with what I get from the logging service... – Liknes Jul 05 '19 at 09:24
  • 1
    @Liknes I have tried a regex solution for this. Please see working demo and let me know if it suffices: https://dotnetfiddle.net/u1YbBK – Rahul Sharma Jul 05 '19 at 09:54

4 Answers

2

My attempt is a regex-based solution, although a JSON-based solution would be better and more efficient. I have prepared a sample regex-based solution for your string.

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
         string str = "[(\"Browser\": \"Chrome73 (v 73.0)\"), (\"UserAgent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36\"), (\"Languages\": [\"nb-NO\", \"nb;q=0.9\", \"no;q=0.8\", \"nn;q=0.7\", \"en-US;q=0.6\", \"en;q=0.5\"]), (\"UserClaim-1-http://schemas.microsoft.com/ws/2008/06/identity/claims/role\": \"Admin\"), (\"SessionId\": \"hhaztuwfpyuobfslljuy4z4e\"), (\"Cookie-__RequestVerificationToken\": \"9MJm_A4agsgbe4c_JtAePFnfMLBEgnkc0XhROfDFVd6291SUGtLPAqprsGHBcJw9JDRde6UR_1jHY_Hr4oKi4OZzuUDXqAA6IfeEtr9sxVI1\"), (\"Cookie-.ASPXAUTH\": \"AA23B2B1A5C428BFB60E32EA5A78A7D5016D7586F88548C012A1C2C2EB2A34D40A959B43680BCCE9923F1890017F59A3A82E6C1121AF50CF226D638FBCBC40F2D8E2FE4C945B44CC7572717D56C71FCC0B7B285A0EB5379370ADC6BE970E6438\"), (\"Cookie-ASP.NET_SessionId\": \"hhaztuwfpyuobfslljuy4z4e\"), (\"Info-FamilyId\": 21267), (\"Info-LoggedInUserID\": 1), (\"Info-MainConsultantUserId\": 3)]";
         showMatch(str, @"(?<=\()(.*?)(?=\)[,\]])");
    }

    private static void showMatch(string text, string expr)
    {
        MatchCollection mc = Regex.Matches(text, expr);

        // Print every matched key-value pair on its own line.
        foreach (Match m in mc)
        {
            Console.WriteLine(m);
        }
    }
}

This will output:

"Browser": "Chrome73 (v 73.0)"
"UserAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
"Languages": ["nb-NO", "nb;q=0.9", "no;q=0.8", "nn;q=0.7", "en-US;q=0.6", "en;q=0.5"]
"UserClaim-1-http://schemas.microsoft.com/ws/2008/06/identity/claims/role": "Admin"
"SessionId": "hhaztuwfpyuobfslljuy4z4e"
"Cookie-__RequestVerificationToken": "9MJm_A4agsgbe4c_JtAePFnfMLBEgnkc0XhROfDFVd6291SUGtLPAqprsGHBcJw9JDRde6UR_1jHY_Hr4oKi4OZzuUDXqAA6IfeEtr9sxVI1"
"Cookie-.ASPXAUTH": "AA23B2B1A5C428BFB60E32EA5A78A7D5016D7586F88548C012A1C2C2EB2A34D40A959B43680BCCE9923F1890017F59A3A82E6C1121AF50CF226D638FBCBC40F2D8E2FE4C945B44CC7572717D56C71FCC0B7B285A0EB5379370ADC6BE970E6438"
"Cookie-ASP.NET_SessionId": "hhaztuwfpyuobfslljuy4z4e"
"Info-FamilyId": 21267
"Info-LoggedInUserID": 1
"Info-MainConsultantUserId": 3

Working Demo: https://dotnetfiddle.net/u1YbBK

Regex used: (?<=\()(.*?)(?=\)[,\]])

Explanation:

  1. Positive lookbehind (?<=\(): \( matches the character ( literally, so a match can only start directly after an opening parenthesis.
  2. 1st capturing group (.*?): .*? matches any character (except line terminators); the lazy quantifier *? matches between zero and unlimited times, as few times as possible, expanding as needed.
  3. Positive lookahead (?=\)[,\]]): \) matches the character ) literally, and the character class [,\]] matches either , or ] literally, so a match can only end directly before a ) that is followed by , or ].
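To see the lookbehind and lookahead at work on something smaller, here is a minimal sketch that runs the same pattern over a shortened input. The sample string, the class name RegexDemo and the helper ExtractPairs are made up for illustration and are not part of the original demo.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDemo
{
    // Same pattern as above: the text after "(" up to the first ")"
    // that is directly followed by "," or "]".
    private const string Pattern = @"(?<=\()(.*?)(?=\)[,\]])";

    // Hypothetical helper: returns each matched key-value pair as raw text.
    public static string[] ExtractPairs(string text)
    {
        MatchCollection matches = Regex.Matches(text, Pattern);
        var result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            result[i] = matches[i].Value;
        }
        return result;
    }

    public static void Main()
    {
        // Shortened, made-up sample in the same format as the log string.
        string sample = "[(\"Browser\": \"Chrome73 (v 73.0)\"), (\"Info-FamilyId\": 21267)]";
        foreach (string pair in ExtractPairs(sample))
        {
            Console.WriteLine(pair);
        }
    }
}
```

The parentheses in "Chrome73 (v 73.0)" survive because that ) is followed by a quote, not by , or ]. The flip side is that a quoted value which itself contains ), or )] would end the match early, so this only works as long as the logged values never contain those sequences.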
Rahul Sharma
  • 1
    Thanks! The output of this solution does it for me, and as far as I can see it works perfectly for all the logged data in the database! I know this solution is not answering my question 100%, but helped me to solved my problem! – Liknes Jul 05 '19 at 10:24
1
[{"Browser": "Chrome73 {v 73.0}"}, {"UserAgent": "Mozilla/5.0 {Windows NT 10.0; Win64; x64} AppleWebKit/537.36 {KHTML, like Gecko} Chrome/73.0.3683.86 Safari/537.36"}, {"Languages": ["nb-NO", "nb;q=0.9", "no;q=0.8", "nn;q=0.7", "en-US;q=0.6", "en;q=0.5"]}, {"UserClaim-1-http://schemas.microsoft.com/ws/2008/06/identity/claims/role": "Admin"}, {"SessionId": "hhaztuwfpyuobfslljuy4z4e"}, {"Cookie-__RequestVerificationToken": "9MJm_A4agsgbe4c_JtAePFnfMLBEgnkc0XhROfDFVd6291SUGtLPAqprsGHBcJw9JDRde6UR_1jHY_Hr4oKi4OZzuUDXqAA6IfeEtr9sxVI1"}, {"Cookie-.ASPXAUTH": "AA23B2B1A5C428BFB60E32EA5A78A7D5016D7586F88548C012A1C2C2EB2A34D40A959B43680BCCE9923F1890017F59A3A82E6C1121AF50CF226D638FBCBC40F2D8E2FE4C945B44CC7572717D56C71FCC0B7B285A0EB5379370ADC6BE970E6438"}, {"Cookie-ASP.NET_SessionId": "hhaztuwfpyuobfslljuy4z4e"}, {"Info-FamilyId": 21267}, {"Info-LoggedInUserID": 1}, {"Info-MainConsultantUserId": 3}]

It works fine. Just replace '(' with '{' and ')' with '}'.

You can read how JSON works at https://en.wikipedia.org/wiki/JSON, in case you are missing something, and use http://json.parser.online.fr/ to check your JSON string.
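A rough sketch of what this replacement looks like in code, and how the resulting array of single-property objects could be merged into one dictionary. It assumes System.Text.Json is available (it ships with .NET Core 3.0 and later); the class name ReplaceDemo, the method Parse and the shortened sample string are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ReplaceDemo
{
    public static Dictionary<string, string> Parse(string log)
    {
        // Blunt replacement: this also rewrites parentheses inside string values.
        string json = log.Replace('(', '{').Replace(')', '}');

        // After the replacement the text is a JSON array of single-property objects.
        var items = JsonSerializer.Deserialize<List<Dictionary<string, JsonElement>>>(json);

        var merged = new Dictionary<string, string>();
        foreach (var item in items)
        {
            foreach (var entry in item)
            {
                // Strings are unquoted; numbers and arrays keep their raw JSON text.
                merged[entry.Key] = entry.Value.ValueKind == JsonValueKind.String
                    ? entry.Value.GetString()
                    : entry.Value.GetRawText();
            }
        }
        return merged;
    }

    public static void Main()
    {
        // Shortened, made-up sample in the same format as the log string.
        string sample = "[(\"Browser\": \"Chrome73 (v 73.0)\"), (\"Info-FamilyId\": 21267)]";
        foreach (var entry in Parse(sample))
        {
            Console.WriteLine(entry.Key + " = " + entry.Value);
        }
    }
}
```

Because the replacement does not look at quotes, the Browser value comes back as Chrome73 {v 73.0}, so this is only suitable if that change to the data is acceptable.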

TimChang
  • Thanks! Your simple solution actually works with all the data I have in the database, but it kind of messes up the data a bit. i.e Chrome73 (v 73.0) turns into Chrome73 {v 73.0} – Liknes Jul 05 '19 at 10:10
1

An example conversion with the Regex by Rahul Sharma:

using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;

namespace Solution {

    public class Parser {

        public static Dictionary<string, string> parseLoggingInformation(string info) {

            Dictionary<string, string> ret = new Dictionary<string, string>();
            MatchCollection mc = Regex.Matches(info, @"(?<=\()(.*?)(?=\)[,\]])");
            foreach (Match m in mc) {
                string val = m.Value;
                // Split at the first "\": " only; the key is always a quoted string.
                int sep = val.IndexOf("\": ", StringComparison.Ordinal);
                if (sep < 1) {
                    continue; // Not a key-value pair, skip it.
                }
                string left = val.Substring(1, sep - 1); // key without its quotes
                string right = val.Substring(sep + 3);   // raw value text
                // Strip the surrounding quotes from string values only;
                // numbers and arrays are stored as their raw text.
                if (right.Length >= 2 && right.StartsWith("\"") && right.EndsWith("\"")) {
                    right = right.Substring(1, right.Length - 2);
                }
                ret.Add(left, right);
            }
            return ret;
        }

        public static void Main(String[] args) {
            Dictionary<string, string> loggingData = parseLoggingInformation("[(\"Browser\": \"Chrome73 (v 73.0)\"), (\"UserAgent\": \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36\"), (\"Languages\": [\"nb-NO\", \"nb;q=0.9\", \"no;q=0.8\", \"nn;q=0.7\", \"en-US;q=0.6\", \"en;q=0.5\"]), (\"UserClaim-1-http://schemas.microsoft.com/ws/2008/06/identity/claims/role\": \"Admin\"), (\"SessionId\": \"hhaztuwfpyuobfslljuy4z4e\"), (\"Cookie-__RequestVerificationToken\": \"9MJm_A4agsgbe4c_JtAePFnfMLBEgnkc0XhROfDFVd6291SUGtLPAqprsGHBcJw9JDRde6UR_1jHY_Hr4oKi4OZzuUDXqAA6IfeEtr9sxVI1\"), (\"Cookie-.ASPXAUTH\": \"AA23B2B1A5C428BFB60E32EA5A78A7D5016D7586F88548C012A1C2C2EB2A34D40A959B43680BCCE9923F1890017F59A3A82E6C1121AF50CF226D638FBCBC40F2D8E2FE4C945B44CC7572717D56C71FCC0B7B285A0EB5379370ADC6BE970E6438\"), (\"Cookie-ASP.NET_SessionId\": \"hhaztuwfpyuobfslljuy4z4e\"), (\"Info-FamilyId\": 21267), (\"Info-LoggedInUserID\": 1), (\"Info-MainConsultantUserId\": 3)]");
        }
    }
}

It stores the gathered key-value pairs in a Dictionary<string, string>.

0

Okay, so this thing looks almost like a JSON dictionary, only that

  1. It's an array [] instead of a dictionary {}.
  2. Key-value tuples are bracketed inside ().

So the laziest idea is to change the outer brackets into {}, which is trivial, and then get rid of the unnecessary () brackets. You'll end up with valid JSON that can be parsed directly with Newtonsoft.Json or another parser of your choice.

To do that, we walk the string once, keep every ( and ) that appears inside a quoted string, and drop all the others.

// Requires: using System.Text;
public static string LogToJson(string inputString)
{
    var builder = new StringBuilder("{");
    var insideQuotes = false; // true while we are between two " characters

    for (var index = 1 /* Skipping opening [ */ ; index < inputString.Length; ++index)
    {
        var @char = inputString[index];
        switch (@char)
        {
            // Drop the tuple brackets, but only outside of quoted strings.
            case '(' when !insideQuotes:
            case ')' when !insideQuotes:
                break;
            case '"':
                // Note: backslash-escaped quotes inside strings are not handled.
                insideQuotes = !insideQuotes;
                builder.Append(@char);
                break;
            default:
                builder.Append(@char);
                break;
        }
    }

    builder.Length--; // Remove the final ]
    builder.Append("}");
    return builder.ToString();
}

Please note that this completely omits error handling and assumes the inputString is always correct. Thus it guarantees valid JSON if and only if the input is valid JSON once the outer [] is swapped for {} and all () brackets outside quoted strings are removed.

For your example string the output is:

{"Browser": "Chrome73 (v 73.0)", "UserAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36", "Languages": ["nb-NO", "nb;q=0.9", "no;q=0.8", "nn;q=0.7", "en-US;q=0.6", "en;q=0.5"], "UserClaim-1-http://schemas.microsoft.com/ws/2008/06/identity/claims/role": "Admin", "SessionId": "hhaztuwfpyuobfslljuy4z4e", "Cookie-__RequestVerificationToken": "9MJm_A4agsgbe4c_JtAePFnfMLBEgnkc0XhROfDFVd6291SUGtLPAqprsGHBcJw9JDRde6UR_1jHY_Hr4oKi4OZzuUDXqAA6IfeEtr9sxVI1", "Cookie-.ASPXAUTH": "AA23B2B1A5C428BFB60E32EA5A78A7D5016D7586F88548C012A1C2C2EB2A34D40A959B43680BCCE9923F1890017F59A3A82E6C1121AF50CF226D638FBCBC40F2D8E2FE4C945B44CC7572717D56C71FCC0B7B285A0EB5379370ADC6BE970E6438", "Cookie-ASP.NET_SessionId": "hhaztuwfpyuobfslljuy4z4e", "Info-FamilyId": 21267, "Info-LoggedInUserID": 1, "Info-MainConsultantUserId": 3}
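For completeness, a minimal sketch of the last step, turning that JSON into a dictionary. It uses System.Text.Json as an assumption (available in .NET Core 3.0 and later); with Newtonsoft.Json the equivalent call would be JsonConvert.DeserializeObject<Dictionary<string, object>>(json). The class name JsonToDictionaryDemo, the method ToDictionary and the shortened JSON literal are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class JsonToDictionaryDemo
{
    // Values stay as JsonElement because they can be strings, numbers or arrays.
    public static Dictionary<string, JsonElement> ToDictionary(string json)
    {
        return JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(json);
    }

    public static void Main()
    {
        // Shortened stand-in for the JSON shown above.
        string json = "{\"Browser\": \"Chrome73 (v 73.0)\", "
                    + "\"Languages\": [\"nb-NO\", \"en;q=0.5\"], "
                    + "\"Info-FamilyId\": 21267}";

        Dictionary<string, JsonElement> data = ToDictionary(json);

        Console.WriteLine(data["Browser"].GetString());        // Chrome73 (v 73.0)
        Console.WriteLine(data["Languages"].GetArrayLength()); // 2
        Console.WriteLine(data["Info-FamilyId"].GetInt32());   // 21267
    }
}
```

Keeping the values as JsonElement avoids flattening everything to strings, so the Languages array and the numeric ids can still be read with their proper types.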
V0ldek
  • This isn't converting the string to a dictionary, which I think is what the OP is asking for. – sr28 Jul 05 '19 at 10:04
  • As I said, we're converting the string to JSON so that it can be parsed using any standard JSON parser. That part is rather trivial. – V0ldek Jul 05 '19 at 10:59