For streaming OpenAI responses in LangChain, I use this: https://python.langchain.com/en/latest/modules/models/chat/examples/streaming.html

It streams the response in my terminal, but I want to send the streamed response back to the user through a Flask API.

Can anyone please advise?

I even tried a custom sync handler, with no luck:

from langchain.callbacks.base import BaseCallbackHandler

class MyCustomSyncHandler(BaseCallbackHandler):
    def __iter__(self):
        for elem in self._datastructure:
            if elem.visible:
                yield elem.value

    def on_llm_new_token(self, token: str, **kwargs):
        yield f'data: {{"stream": {token}}}\n\n'
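As far as I understand, the yield inside on_llm_new_token never reaches Flask, since the callback is invoked by the LLM client rather than iterated by the response. A common workaround, sketched below under that assumption (run_llm is a hypothetical callable that runs the chain with the handler attached), is to push tokens onto a queue that a plain generator drains:

import queue
import threading

from langchain.callbacks.base import BaseCallbackHandler

class QueueCallbackHandler(BaseCallbackHandler):
    """Push each new token onto a queue instead of yielding it."""

    def __init__(self, token_queue):
        self.token_queue = token_queue

    def on_llm_new_token(self, token: str, **kwargs):
        self.token_queue.put(token)

    def on_llm_end(self, response, **kwargs):
        # sentinel so the consumer knows the stream has finished
        self.token_queue.put(None)

def sse_stream(run_llm):
    # the LLM runs in a background thread while Flask iterates
    # this generator and forwards tokens as SSE records
    token_queue = queue.Queue()
    handler = QueueCallbackHandler(token_queue)
    threading.Thread(target=run_llm, args=(handler,), daemon=True).start()
    while True:
        token = token_queue.get()
        if token is None:
            break
        yield f'data: {token}\n\n'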
   
  • Related: [Streaming ChatGPT's results with Flask and LangChain](https://stackoverflow.com/questions/75837803/streaming-chatgpts-results-with-flask-and-langchain) – ggorlen May 25 '23 at 00:03

1 Answer

I am not using LangChain, just the built-in requests module. However, for a production project, consider using Flask-SSE.

import json

import requests

def open_ai_stream(prompt):
    # I am reading this from a config value;
    # the value is https://api.openai.com/v1/chat/completions
    url = config["open_ai_url"]
    session = requests.Session()
    payload = json.dumps({
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "user",
                "content": prompt
            }
        ],
        "temperature": 0,
        "stream": True
    })
    headers = {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer ' + config['open_ai_key']
    }
    # stream=True makes requests yield lines as they arrive instead
    # of buffering the whole body before iter_lines() returns anything
    with session.post(url, headers=headers, data=payload, stream=True) as resp:
        for line in resp.iter_lines():
            if line:
                yield f"data: {line.decode('utf-8')}\n\n"
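Note that each line from iter_lines() is already an SSE record from OpenAI (data: {json} lines, ending with data: [DONE]), so the generator above nests one data: prefix inside another. If you only want to forward the token text, a small parser along these lines helps; this is a sketch that assumes that stream format and reuses the json import from above:

def extract_token(raw_line):
    """Pull the delta text out of one raw stream line, or return None."""
    line = raw_line.decode('utf-8')
    if not line.startswith('data: '):
        return None
    payload = line[len('data: '):]
    if payload.strip() == '[DONE]':
        return None
    chunk = json.loads(payload)
    # the delta carries only the newly generated piece of text
    return chunk['choices'][0].get('delta', {}).get('content')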

And here's the Flask route:

from flask import Response, request, stream_with_context

@app.route("/llm-resp", methods=["POST"])
def stream_response():
    prompt = request.get_json(force=True)["prompt"]
    return Response(stream_with_context(open_ai_stream(prompt)),
                    mimetype="text/event-stream")

Please see the test results (if it were a normal, non-streaming response, it would look quite different in Postman): a snapshot of Postman
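If Postman is not handy, the same check works from Python itself; this is a minimal client sketch assuming the app listens on localhost:5000, where stream=True makes the chunks print as they arrive:

import requests

with requests.post(
    "http://localhost:5000/llm-resp",
    json={"prompt": "Tell me a joke"},
    stream=True,
) as resp:
    for line in resp.iter_lines():
        if line:
            # each SSE chunk shows up as its own line
            print(line.decode("utf-8"))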

EDIT: Here's a slightly better way, but it adds a dependency on the openai library:

import os

import openai

def chat_gpt_helper(prompt):
    """
    Stream the response from OpenAI's gpt-3.5-turbo model using the
    chat completions API.
    """
    try:
        resp = ''
        openai.api_key = os.getenv('OPEN_API_KEY')
        for chunk in openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{
                "role": "user",
                "content": prompt
            }],
            stream=True,
        ):
            content = chunk["choices"][0].get("delta", {}).get("content")
            if content is not None:
                print(content, end='')
                resp += content
                # yields the accumulated text so far; yield `content`
                # instead if the client only wants the newest token
                yield f'data: {resp}\n\n'

    except Exception as e:
        print(e)
        # a plain `return` value is swallowed by a generator, so yield
        # the error to make it visible to the client
        yield f'data: {str(e)}\n\n'
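As a side note, this snippet targets the pre-1.0 openai package, where openai.ChatCompletion still exists. On openai>=1.0 the same idea looks roughly like the sketch below (untested here, shown only for readers on the newer package):

import os

from openai import OpenAI

client = OpenAI(api_key=os.getenv('OPEN_API_KEY'))

def chat_gpt_helper_v1(prompt):
    stream = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content is not None:
            yield f'data: {content}\n\n'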

Everything else stays the same. Check the full code here:

EDIT 2:

In case someone is not able to see the response in chunks in Postman, make sure your Postman is up to date, since at the time of writing this seems to be a new feature. Relevant press release

Anshuman Kumar