I am querying a file in an S3 bucket with S3 Select (see https://aws.amazon.com/blogs/developer/amazon-s3-select-support-in-the-aws-sdk-for-net/),
and I am trying to work with the following code:
using (var eventStream = await GetSelectObjectContentEventStream())
{
    var recordResults = eventStream
        .Where(ev => ev is RecordsEvent)
        .Cast<RecordsEvent>()
        .Select(records =>
        {
            using (var reader = new StreamReader(records.Payload, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }).ToArray();
    var results = string.Join(Environment.NewLine, recordResults);
    Console.WriteLine(results);
}
However, the string returned by reader.ReadToEnd()
can actually contain multiple records, and then it looks like this:
{"VatNumber":"1234","Category":"ALF"},{"VatNumber":"1234","Category":"CDL"},
As you can see, this is not a valid JSON list (no enclosing brackets), and it does not end cleanly either (trailing comma).
This is what I am doing now:
var list = new List<T>();
using (var eventStream = (await _s3Client.SelectObjectContentAsync(request, cancellationToken)).Payload)
    // The lambda parameter is named "e" so that it does not clash with the loop variable "ev".
    foreach (var ev in eventStream.Where(e => e is RecordsEvent))
    {
        if (ev is RecordsEvent records)
        {
            using (var reader = new StreamReader(records.Payload, Encoding.UTF8))
            {
                // Strip the trailing record delimiter so the content can be parsed as JSON.
                var content = (await reader.ReadToEndAsync()).TrimEnd(JsonRecordDelimiter.ToCharArray());
                _logger.LogInformation($"Content received is: {content}");
                if (content.Contains(JsonRecordDelimiter))
                    // Several records: wrap them in brackets to form a JSON array.
                    list.AddRange(JsonConvert.DeserializeObject<List<T>>($"[{content}]"));
                else
                    // Exactly one record.
                    list.Add(JsonConvert.DeserializeObject<T>(content));
            }
        }
    }
_logger.LogInformation("results from this thingy: " + JsonConvert.SerializeObject(list));
return list;
Notice the ugly parts, such as:
(await reader.ReadToEndAsync()).TrimEnd(JsonRecordDelimiter.ToCharArray())
$"[{content}]"
A single element can also come in, and it too ends with the trailing comma.
I do have the option to set no delimiter at all, but then I cannot work out how to parse the result.
What would be a better alternative?
The file that I am querying looks like this:
1234;ALF
1234;CDL
12;A
34;
;
;
de;CD
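
For reference, here is a minimal sketch of one direction that might avoid both the TrimEnd call and the bracket wrapping. It is not verified against S3. It assumes the JSON output record delimiter is left at its default newline instead of a comma, and that the text handed to the parser contains only whole records; if a record can be split across RecordsEvent payloads, the payloads would need to be concatenated first. The names S3SelectJson and ReadRecords are hypothetical helpers. The sketch relies on Json.NET's JsonTextReader.SupportMultipleContent, which lets a single reader consume several whitespace-separated top-level JSON values.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper, not part of the AWS SDK.
public static class S3SelectJson
{
    // Reads newline-separated JSON objects from a text reader, one record at a time.
    public static IEnumerable<T> ReadRecords<T>(TextReader text)
    {
        var serializer = new JsonSerializer();
        using (var json = new JsonTextReader(text) { SupportMultipleContent = true })
        {
            // Each Read() positions the reader at the start of the next top-level value;
            // it returns false once only whitespace (or nothing) remains.
            while (json.Read())
            {
                yield return serializer.Deserialize<T>(json);
            }
        }
    }
}
```

Inside the loop over the record events this could be called as S3SelectJson.ReadRecords<T>(new StreamReader(records.Payload, Encoding.UTF8)), adding the results to the list. Under the assumptions above, the single-record and multi-record cases then go through the same code path.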