
I am trying to connect to an HTTP API. The API responds with NDJSON, that is, newline-separated JSON strings. I need to consume these lines one by one, before the whole response has been downloaded (in fact, even before the server knows what it will output on later lines). In Python, I can achieve this with:

import requests, json

lines = requests.get("some url", stream=True).iter_lines()
for line in lines:
    if line:  # skip empty keep-alive lines
        record = json.loads(line)  # parse the line as JSON
        # ... do whatever with record

and it works like a charm.
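To make the requirement concrete without a network, here is a minimal offline sketch of the same consumption pattern. `fake_server` and `parse_ndjson` are hypothetical names, not part of `requests`; the generator only stands in for a response body that produces lines lazily. The point is that each record is handled before the next line even exists.

```python
import json
from typing import Iterable, Iterator, List


def fake_server(events: List[str]) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming response body."""
    for n in range(3):
        events.append(f"sent {n}")          # the line is produced only now
        yield json.dumps({"n": n}).encode()
        yield b""                           # blank keep-alive line


def parse_ndjson(lines: Iterable[bytes]) -> Iterator[dict]:
    """Lazily parse NDJSON lines, skipping empty ones."""
    for line in lines:
        if line:
            yield json.loads(line)


events: List[str] = []
for record in parse_ndjson(fake_server(events)):
    events.append(f"got {record['n']}")

print(events)
```

The "sent" and "got" entries alternate, which is the behaviour I am after: nothing is buffered until the end of the response.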

I want the same effect in Nim, but the program blocks. For example, I tried to load just the first line of the response:

import httpclient, json, streams

var stream = newHttpClient().get("some url").bodyStream
var firstLine = ""
discard stream.readLine(firstLine)
echo firstLine

but with no luck: the program never echoes anything. I also tried the streams.lines iterator, but that didn't help either.

Is there some idiom similar to the Python snippet that would allow me to easily work with the HTTP response stream line by line?

  • this forum thread seems related to your issue: https://forum.nim-lang.org/t/6103 – pietroppeter Nov 10 '20 at 10:30
  • @pietroppeter Thank you. I feel like this is something that should be added to the httpclient module, if it's not possible to do these things already.... – Michal Maršálek Nov 10 '20 at 11:23
  • before bumping into that thread my best guess would have been to try with AsyncHttpClient, whose AsyncResponse has bodyStream which is a FutureStream[string] (you would need also to use a AsyncStream). I guess one could build an iterator lines out of that, but I am not sure if it is doable anyway. If the forum post helped finding a solution for your case consider adding your own answer to this question (encouraged by SO: https://stackoverflow.com/help/self-answer) – pietroppeter Nov 10 '20 at 11:50
  • @pietroppeter Unfortunately, the forum post didn't help me. Using that approach, I was only able to get a line once the complete response was ready, rather than right away. I tried to look into your suggestion, but I don't see how the AsyncHttpClient can help me here. Can you elaborate? – Michal Maršálek Nov 10 '20 at 13:38
  • well the idea would have been that maybe the bodyStream field of AsyncResponse being a FutureStream could contain data before the full response is ready (you would need api from https://nim-lang.org/docs/asyncstreams.html to access content), but I am not very competent on async stuff so it is just a wild guess and might not be useful. Also it does not help that I would not know how to test stuff. – pietroppeter Nov 10 '20 at 13:50

1 Answer


The solution is to use the net module, as in the forum thread linked by @pietroppeter. That initially didn't work for me because I didn't construct the HTTP request correctly. The resulting code:

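import net, json, strformat  # strformat provides the & string interpolation

const HOST = "host"
const TOKEN = "token"

# Compile with -d:ssl, since newContext and wrapSocket need SSL support.
iterator getNdjsonStream(path: string): JsonNode =
    let s = newSocket()
    wrapSocket(newContext(), s)
    s.connect(HOST, Port(443))
    let req = &"GET {path} HTTP/1.1\r\nHost: {HOST}\r\nAuthorization: {TOKEN}\r\n\r\n"
    s.send(req)
    while true:
        let line = s.recvLine
        if line == "":  # recvLine returns "" once the connection is closed
            break
        # Skip the status line, headers, blank lines and chunk sizes:
        # only lines that start a JSON object are parsed.
        if line[0] == '{':
            yield line.parseJson
    s.close()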
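The filtering in the iterator above (keep only lines that start with '{') can be tried offline. The sketch below is in Python, like the snippet in the question, and runs on a made-up raw response; `json_lines` and the sample lines are hypothetical. It assumes that every JSON object sits on a single line and that chunk boundaries coincide with line ends, which happened to hold for my API but is not guaranteed in general.

```python
import json
from typing import Iterable, Iterator


def json_lines(raw_lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed objects from lines that start a JSON object; skip the rest."""
    for line in raw_lines:
        if line.startswith("{"):
            yield json.loads(line)


# Hypothetical raw response, split into lines the way a line reader sees it.
sample = [
    "HTTP/1.1 200 OK",                     # status line
    "Content-Type: application/x-ndjson",  # headers
    "Transfer-Encoding: chunked",
    "",                                    # end of headers
    "a",                                   # chunk size (hex)
    '{"id": 1}',
    "",                                    # end of chunk
    "a",
    '{"id": 2}',
    "",
    "0",                                   # terminating chunk
    "",
]

records = list(json_lines(sample))
print(records)
```

Everything that is not a JSON object (status line, headers, chunk sizes, blank lines) is simply dropped, which is why no real HTTP parsing is needed for this use case.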

I think this can't be achieved using the httpclient module. The async versions might look like they can do it, but it seems to me that you can only work with the received data once the Future is completed, that is, after all data is downloaded. The fact that such a simple thing cannot be done simply, together with the lack of examples I could find, led to a couple of days of frustration and the need to open a Stack Overflow account after 10 years of programming.