
I am trying to parse the response from the Google Speech-to-Text API. The JSON response is:

{"result":[]}
{"result":[
          {"alternative":[
                         {"transcript":"hello Google how are you     feeling","confidence":0.96274596},
                         {"transcript":"hello Google how are you today","confidence":0.97388196},
                         {"transcript":"hello Google how are you picking","confidence":0.97388196},
                         {"transcript":"hello Google how are you kidding","confidence":0.97388196}
                         ]
         ,"final":true}]
,"result_index":0
}

Now I am trying to parse it with JObject. The problem is that the "result" property appears twice, once in each root object, so how do I parse the second "result" object? Here is the code I am trying:

              StreamReader SR_Response = new StreamReader(HWR_Response.GetResponseStream());
              Console.WriteLine(SR_Response.ReadToEnd()+SR_Response.ToString());
              String json_response = SR_Response.ReadToEnd() + SR_Response.ToString();
              JObject joo = JObject.Parse(json_response);
              JArray ja = (JArray)joo["result"];

                        foreach (JObject o in ja)
                        {
                            JArray ja2 = (JArray)o["alternative"];
                            foreach (JObject h in ja2)
                            {
                                Console.WriteLine(h["transcript"]);
                            }
                        }

The next solution I tried, using JsonConvert.DeserializeObject, is:

                string responseFromServer = (SR_Response.ReadToEnd());
                String[] jsons = responseFromServer.Split('\n');
                String text = "";
                foreach (String j in jsons)
                {
                    dynamic jsonObject = JsonConvert.DeserializeObject(j);
                    if (jsonObject == null || jsonObject.result.Count <= 0)
                    {
                        continue;
                    }
                    Console.WriteLine((string)jsonObject["result"]["alternative"][0]["transcript"]);
                    text = jsonObject.result[0].alternative[0].transcript;
                }
                Console.WriteLine("MESSAGE : "+text); 
DIGIT
  • Is this response for one API call, or are there two responses? – BWA Feb 03 '17 at 11:05
  • What's this: `String json_response = SR_Response.ReadToEnd() + SR_Response.ToString();`? Also, your second snippet seems to loop over the response split on newlines. Show us an exact example of the JSON you are deserializing, and the actual full code along with the output you are receiving. – ColinM Feb 03 '17 at 11:07
  • You read the stream once with your call to `Console.WriteLine()`: `Console.WriteLine(SR_Response.ReadToEnd()+SR_Response.ToString());`. Then you immediately try to read it again: `String json_response = SR_Response.ReadToEnd() + SR_Response.ToString();`. This cannot work, you need to read the stream only once. – dbc Feb 03 '17 at 13:38
  • Assuming you fix your stream reading, this looks to be a duplicate of [Parsing large json file in .NET](http://stackoverflow.com/a/32237819/3744182) or [Line delimited json serializing and de-serializing](http://stackoverflow.com/q/29729063/3744182). The trick is to set [`JsonTextReader.SupportMultipleContent = true`](http://www.newtonsoft.com/json/help/html/ReadMultipleContentWithJsonReader.htm) – dbc Feb 03 '17 at 13:47
  • @ColinM `SR_Response.ReadToEnd()` is just there to read the JSON response as a string, and here is my complete code: [link](http://stackoverflow.com/questions/40350447/google-speech-to-text-api-using-c-sharp) – DIGIT Feb 03 '17 at 17:37
  • I have to extract the text inside transcript, **hello Google how are you feeling**. There is currently no output, but `Console.WriteLine(SR_Response.ReadToEnd()+SR_Response.ToString());` prints the JSON response shown at the top. – DIGIT Feb 03 '17 at 17:52
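
The stream-reading problem raised in the comments above can be reproduced without a network call. The following is a minimal sketch, assuming a `StringReader` as a stand-in for the response `StreamReader` (both derive from `TextReader`); `ReadOnceDemo` and `ReadTwice` are illustrative names that do not appear in the question.

```csharp
using System;
using System.IO;

public static class ReadOnceDemo
{
    // Calls ReadToEnd() twice on the same reader, as the question's code does,
    // and returns both results so they can be compared.
    public static Tuple<string, string> ReadTwice(TextReader reader)
    {
        string first = reader.ReadToEnd();   // consumes all remaining text
        string second = reader.ReadToEnd();  // nothing is left, so this is an empty string
        return Tuple.Create(first, second);
    }

    public static void Main()
    {
        // Assumption: StringReader stands in for
        // new StreamReader(HWR_Response.GetResponseStream()).
        using (var reader = new StringReader("{\"result\":[]}"))
        {
            var reads = ReadTwice(reader);
            Console.WriteLine("first read is empty:  " + (reads.Item1.Length == 0));
            Console.WriteLine("second read is empty: " + (reads.Item2.Length == 0));
        }
    }
}
```

Reading the response into a variable once and reusing that variable avoids the problem. Appending `SR_Response.ToString()` does not help either, since it only contributes the reader's type name rather than any response content.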

1 Answer


What you have is a series of JSON root objects concatenated together into a single stream. As explained in Read Multiple Fragments With JsonReader, such a stream can be deserialized by setting JsonReader.SupportMultipleContent = true. Thus, to deserialize your stream, you should first introduce the following static helper methods:

using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public static class JsonExtensions
{
    public static IEnumerable<T> DeserializeObjects<T>(Stream stream, JsonSerializerSettings settings = null)
    {
        var reader = new StreamReader(stream); // Caller should dispose
        return DeserializeObjects<T>(reader, settings);
    }

    public static IEnumerable<T> DeserializeObjects<T>(TextReader textReader, JsonSerializerSettings settings = null)
    {
        var ser = JsonSerializer.CreateDefault(settings);
        var reader = new JsonTextReader(textReader); // Caller should dispose

        reader.SupportMultipleContent = true;

        while (reader.Read())
        {
            if (reader.TokenType == JsonToken.None || reader.TokenType == JsonToken.Undefined || reader.TokenType == JsonToken.Comment)
                continue;
            yield return ser.Deserialize<T>(reader);
        }
    }
}

Next, using a code-generation utility such as http://json2csharp.com/, generate C# classes for a single JSON root object, like so:

public class Alternative
{
    public string transcript { get; set; }
    public double confidence { get; set; }
}

public class Result
{
    public List<Alternative> alternative { get; set; }
    public bool final { get; set; }
}

public class RootObject
{
    public List<Result> result { get; set; }
    public int result_index { get; set; }
}

And deserialize as follows:

List<RootObject> results;
using (var stream = HWR_Response.GetResponseStream())
{
    results = JsonExtensions.DeserializeObjects<RootObject>(stream).ToList();
}

Having done this, you can use standard C# techniques such as LINQ to enumerate the transcript values:

var transcripts = results
    .SelectMany(r => r.result)
    .SelectMany(r => r.alternative)
    .Select(a => a.transcript)
    .ToList();
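
To get a single string per result rather than every alternative, one option is to take only the first entry of each `alternative` list. The following is a sketch under the assumption that the first alternative is the one wanted (the response shown in the question does not state an ordering guarantee). `TranscriptPicker` and `FirstTranscripts` are hypothetical names, and the data is built by hand to mirror the question's JSON so the example runs without any JSON library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same data model as above, repeated so the sketch compiles on its own.
public class Alternative { public string transcript { get; set; } public double confidence { get; set; } }
public class Result { public List<Alternative> alternative { get; set; } public bool final { get; set; } }
public class RootObject { public List<Result> result { get; set; } public int result_index { get; set; } }

public static class TranscriptPicker
{
    // Hypothetical helper: returns the first alternative's transcript for each result,
    // skipping root objects whose "result" list is missing or empty.
    public static List<string> FirstTranscripts(IEnumerable<RootObject> roots)
    {
        return roots
            .Where(r => r != null && r.result != null)
            .SelectMany(r => r.result)
            .Where(res => res.alternative != null && res.alternative.Count > 0)
            .Select(res => res.alternative[0].transcript)
            .ToList();
    }

    public static void Main()
    {
        // Hand-built data mirroring the shape of the response in the question.
        var roots = new List<RootObject>
        {
            new RootObject { result = new List<Result>() },
            new RootObject
            {
                result = new List<Result>
                {
                    new Result
                    {
                        final = true,
                        alternative = new List<Alternative>
                        {
                            new Alternative { transcript = "hello Google how are you feeling", confidence = 0.96274596 },
                            new Alternative { transcript = "hello Google how are you today", confidence = 0.97388196 }
                        }
                    }
                }
            }
        };

        foreach (var text in FirstTranscripts(roots))
            Console.WriteLine("MESSAGE : " + text);
    }
}
```

In real use, the `results` list produced by `DeserializeObjects<RootObject>` would be passed to the helper in place of the hand-built list.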

If you don't want to define a fixed data model for your JSON collection, you can deserialize directly to a list of JObject like so:

List<JObject> objs;
using (var stream = HWR_Response.GetResponseStream())
{
    objs = JsonExtensions.DeserializeObjects<JObject>(stream).ToList();
}

Then you can use SelectTokens() to select the values of all the "transcript" properties nested inside each object:

var transcripts = objs
    // The following uses the JSONPath recursive descent operator ".." to pick out all properties named "transcript".
    .SelectMany(o => o.SelectTokens("..transcript")) 
    .Select(t => t.ToString())
    .ToList();
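
For trying this out without calling the API, the same approach can be run against a string. The following is a minimal sketch, assuming the Json.NET (`Newtonsoft.Json`) package is referenced and using a shortened copy of the response from the question. `ConcatenatedJsonDemo` and `ReadTranscripts` are hypothetical names, and `JObject.Load` is used here in place of the serializer call shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class ConcatenatedJsonDemo
{
    // Reads every root object from a string of concatenated JSON documents
    // and collects the values of all "transcript" properties.
    public static List<string> ReadTranscripts(string text)
    {
        var transcripts = new List<string>();
        using (var stringReader = new StringReader(text))
        using (var jsonReader = new JsonTextReader(stringReader) { SupportMultipleContent = true })
        {
            while (jsonReader.Read())
            {
                // Only start loading when positioned at the beginning of a root object.
                if (jsonReader.TokenType != JsonToken.StartObject)
                    continue;
                JObject root = JObject.Load(jsonReader);
                transcripts.AddRange(root.SelectTokens("..transcript").Select(t => t.ToString()));
            }
        }
        return transcripts;
    }

    public static void Main()
    {
        // Two root objects separated by a newline, shortened from the question's response.
        string response =
            "{\"result\":[]}\n" +
            "{\"result\":[{\"alternative\":[" +
            "{\"transcript\":\"hello Google how are you feeling\",\"confidence\":0.96274596}" +
            "],\"final\":true}],\"result_index\":0}";

        foreach (var transcript in ReadTranscripts(response))
            Console.WriteLine(transcript);
    }
}
```

The first root object has an empty "result" array, so it contributes no transcripts and needs no special handling.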

Updated sample fiddle showing both options.

dbc
  • Please provide code that fetches the text value of **transcript** using the deserialization approach you described, so that I get only the string inside transcript, such as _hello Google how are you feeling_ – DIGIT Feb 03 '17 at 18:28