
I am querying a file in an S3 bucket with S3 Select, following https://aws.amazon.com/blogs/developer/amazon-s3-select-support-in-the-aws-sdk-for-net/

I am trying to work with this code:

using (var eventStream = await GetSelectObjectContentEventStream())
{
    var recordResults = eventStream
        .Where(ev => ev is RecordsEvent)
        .Cast<RecordsEvent>()
        .Select(records =>
        {
            using (var reader = new StreamReader(records.Payload, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }).ToArray();
    var results = string.Join(Environment.NewLine, recordResults);
    Console.WriteLine(results);
}

However, a single reader.ReadToEnd() call can actually return multiple records, like this: {"VatNumber":"1234","Category":"ALF"},{"VatNumber":"1234","Category":"CDL"},

As you can see, that is not a valid JSON list, and it is not properly closed either (it ends with a trailing comma).

Now I am doing:

var list = new List<T>();

using (var eventStream = (await _s3Client.SelectObjectContentAsync(request, cancellationToken)).Payload)
{
    foreach (var ev in eventStream.Where(ev => ev is RecordsEvent))
    {
        if (ev is RecordsEvent records)
        {
            using (var reader = new StreamReader(records.Payload, Encoding.UTF8))
            {
                var content = (await reader.ReadToEndAsync()).TrimEnd(JsonRecordDelimiter.ToCharArray());
                _logger.LogInformation($"Content received is: {content}");

                if (content.Contains(JsonRecordDelimiter))
                    list.AddRange(JsonConvert.DeserializeObject<List<T>>($"[{content}]"));
                else
                    list.Add(JsonConvert.DeserializeObject<T>(content));
            }
        }
    }
}
_logger.LogInformation("results from this thingy: " + JsonConvert.SerializeObject(list));

return list;

Notice the ugly parts, such as .TrimEnd(JsonRecordDelimiter.ToCharArray()) on the result of reader.ReadToEndAsync(), and the $"[{content}]" wrapping.

A single element can also come in, likewise ending with the , delimiter. I do have the option to set no delimiter at all, but then I cannot wrap my head around how to parse the result.

What would be a better alternative?

The file that I am querying looks like:

1234;ALF
1234;CDL
12;A
34;
;
;
de;CD
Roelant M
  • Can you share the file, which you are querying? – Pavel Anikhouski May 13 '20 at 13:51
  • Added a piece of the file, but that is not the issue; the querying itself seems to work perfectly. – Roelant M May 13 '20 at 13:56
  • `what would be the better alternative?` Figure out what the actual, exact syntax is. There should be some documentation *somewhere* for it. Once you have that, then write a proper parser for it. – Stephen Cleary May 13 '20 at 14:03
  • To parse a sequence of comma-separated JSON values see [this answer](https://stackoverflow.com/a/50014780/3744182) to [Additional text encountered after finished reading JSON content:](https://stackoverflow.com/q/16765877/3744182). To simply ignore a trailing comma see [Discarding garbage characters after json object with Json.Net](https://stackoverflow.com/q/37172263/3744182). Do either of those do what you need? The first answer also shows how to stream directly without reading the entire response into a string. – dbc May 13 '20 at 14:24
  • @dbc That works perfectly! I don't need the `discarding garbage` part; the first piece already takes care of that (according to my unit tests ;) ). Thanks for the help! If you put it in as an answer, I can accept it. – Roelant M May 14 '20 at 06:20
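
For readers who want to see the idea from dbc's first link in place, here is a minimal sketch. It assumes Json.NET (Newtonsoft.Json) 11.0.1 or later, where, as far as I know, `JsonTextReader.SupportMultipleContent` also accepts comma-delimited values. The class and method names (`JsonSequence`, `ReadRecords`) are hypothetical and chosen only for illustration; this is not the exact code from the linked answer.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper: the names are illustrative, not part of any library.
public static class JsonSequence
{
    // Lazily yields one T per top-level JSON object found in the text,
    // e.g. {"a":1},{"a":2}, without wrapping the content in [ ] or
    // trimming the trailing delimiter first.
    public static IEnumerable<T> ReadRecords<T>(TextReader textReader)
    {
        var serializer = JsonSerializer.CreateDefault();
        using (var jsonReader = new JsonTextReader(textReader)
        {
            SupportMultipleContent = true, // allow several root-level values
            CloseInput = false             // the caller owns the TextReader
        })
        {
            while (jsonReader.Read())
            {
                // Each root-level object is deserialized on its own.
                if (jsonReader.TokenType == JsonToken.StartObject)
                    yield return serializer.Deserialize<T>(jsonReader);
            }
        }
    }
}
```

In the loop from the question this would presumably replace the `ReadToEndAsync` / `TrimEnd` / `$"[{content}]"` steps with something like `list.AddRange(JsonSequence.ReadRecords<T>(reader))`, where `reader` is the existing `StreamReader` over `records.Payload`. It also streams from the reader rather than building the whole payload as one string first.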
