import openai

class OpenaiClassifier:
    def __init__(self, api_keys):
        openai.api_key = api_keys['Openai']

    def get_ratings(self, review):
        prompt = f"Rate the following review as an integer from 1 to 5, where 1 is the worst and 5 is the best: \"{review}\""
        
        response = openai.Completion.create(
            engine="text-davinci-003",
            prompt=prompt,
            n=1,
            max_tokens=5,
            temperature=0.5,
            top_p=1
        )

        try:
            rating = int(response.choices[0].text.strip())
            return rating
        except ValueError:
            return None

I wonder what the main difference is between the /v1/completions and /v1/chat/completions endpoints, and how I can do text classification using these models: gpt-4, gpt-4-0314, gpt-4-32k, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-0301.

David Zayn

7 Answers


The /completions endpoint provides a completion for a single prompt and takes a single string as input, whereas /chat/completions provides responses for a given dialog and requires the input in a specific format corresponding to the message history.

If you want to use the chat models, you need to use the /chat/completions API, and your request has to be adjusted accordingly.

prompt = f"Rate the following review as an integer from 1 to 5, where 1 is the worst and 5 is the best: \"{review}\""

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "user", "content": prompt}
  ]
)
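To adapt the asker's get_ratings method to the chat endpoint, a minimal sketch (this assumes the pre-1.0 openai SDK as used in the question; the model choice and parameters are just examples, and note that the reply text lives under message["content"] rather than .text):

```python
def parse_rating(text):
    """Extract an integer rating from the model's reply, or None."""
    try:
        return int(text.strip())
    except ValueError:
        return None

def get_ratings(review):
    # Imported here so parse_rating stays usable without the package;
    # assumes the pre-1.0 openai SDK, as in the question.
    import openai

    prompt = (
        "Rate the following review as an integer from 1 to 5, "
        f'where 1 is the worst and 5 is the best: "{review}"'
    )
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=5,
        temperature=0,  # deterministic output suits classification
    )
    # Chat responses expose the text as message["content"], not .text
    return parse_rating(response.choices[0].message["content"])
```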

Oleg's answer is good and correct, but the more complete answer is:

The /v1/completions endpoint is for the older models such as Davinci. It is a very powerful model that takes an instruction and produces output.

The /v1/chat/completions API is for the newer chat models (as Oleg mentioned). gpt-3.5-turbo is great because it can do everything Davinci can but is cheaper (about 1/10 the cost); the downside is that, for it to perform the same as Davinci, it might require bigger and more complex input.

The chat model performs best when you give examples.

For Davinci (or other models based on the /v1/completions API) the input would look like an instruction: "Creates two to three sentence short horror stories from the topic 'wind'."

For the chat models the input would look like a chat:

Two-Sentence Horror Story: He always stops crying when I pour the milk on his cereal. I just have to remember not to let him see his face on the carton.
    
Topic: Wind
Two-Sentence Horror Story:

The output would be completion of the chat. For example: The wind howled through the night, shaking the windows of the house with a sinister force. As I stepped outside, I could feel it calling out to me, beckoning me to follow its chilling path.

This is a real example from OpenAI documentation (I have added some context about the instruction API).
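For comparison, the same few-shot example could be expressed in the chat format by sending the example story as a prior user/assistant exchange (the first topic is not shown above, so "Breakfast" here is only a placeholder, and the role assignment is a judgment call):

```python
# A hypothetical chat-format version of the few-shot prompt above.
# "Topic: Breakfast" is a placeholder for the omitted first topic.
messages = [
    {"role": "system",
     "content": "Create two-sentence horror stories from the given topic."},
    {"role": "user", "content": "Topic: Breakfast"},
    {"role": "assistant",
     "content": ("He always stops crying when I pour the milk on his cereal. "
                 "I just have to remember not to let him see his face on the carton.")},
    {"role": "user", "content": "Topic: Wind"},
]
```

Sending the examples as earlier turns lets the model imitate them when it completes the final user message.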

So the points to consider are:

  • Pricing (chat models are cheaper - GPT-4 aside, as it is still in beta)
  • Input differences (the chat models' input is more complex)
  • Future support - to my understanding, newer models will focus on chat
  • Fine-tuning - currently only GPT-3 models (instruction models) support fine-tuning
yotam hadas

In my experience, there are a few differences:

  1. Backend models are different (refer to the docs):
/v1/chat/completions: gpt-4, gpt-4-0314, gpt-4-32k, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-0301
/v1/completions:    text-davinci-003, text-davinci-002, text-curie-001, text-babbage-001, text-ada-001
  2. Role selection: in /chat/completions, you get to select different roles {user, assistant, system}. The system role is helpful when you want the model to stay consistently in a certain context; for example, the system message might be: "You are a customer service agent."

  3. Appending the conversation: in /chat/completions, you can input longer context in a conversational way; for example, you can go

{"role": "user", "content": "I am Peter"}
{"role": "assistant", "content": "Hello, Peter!"}
{"role": "user", "content": "What is my name?"}

You can keep appending to the conversation and let the model memorise your earlier turns, which is very useful and important when you want to build a chatbot with memory. While this can also be done with the /completions endpoint, it is not as explicit.
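The appending pattern can be sketched as a small wrapper class; here send_fn stands in for whatever function actually calls the API (e.g. one wrapping openai.ChatCompletion.create), so the memory bookkeeping is visible on its own:

```python
class ChatSession:
    """Minimal conversation wrapper. send_fn is a stand-in for the
    real API call: it takes the message list and returns a reply string."""

    def __init__(self, system_prompt, send_fn):
        self.messages = [{"role": "system", "content": system_prompt}]
        self.send_fn = send_fn

    def ask(self, text):
        self.messages.append({"role": "user", "content": text})
        reply = self.send_fn(self.messages)
        # Append the assistant's reply so later turns can refer back to it
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Each ask() grows the history by two messages, which is exactly what lets the model answer "What is my name?" correctly on a later turn.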

Overall, I think in the long run, the /chat/completions endpoint is the way to go.

MJeremy

The other answers here are helpful, and the way I see it is that chat_completion is just a higher-level API: it concatenates the message history with the latest "user" message, formulates the whole thing into a single prompt, then does a completion on that with a stopping criterion in case the completion goes beyond the "assistant"'s message and starts talking as the "user". It is a controlled completion of sorts. See for example llama-cpp-python's implementation of the two:

create_chat_completion() formulates a prompt then calls self() which calls create_completion().

Additional thoughts: it is unfortunate that OpenAI seems to be recommending chat_completion over completion (and not offering all models with the latter), perhaps because most API use cases are of the "chat" kind, but I see a lot more potential in the raw completion API, since one can concoct their own creative structure as JSON or something else.
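That layering can be sketched as a prompt-building step on top of a raw completion call; the "User:"/"Assistant:" labels and the stop sequence below are an illustrative convention, not llama-cpp-python's exact format:

```python
def build_chat_prompt(messages, stop_role="User"):
    """Flatten a message list into one completion prompt, plus a stop
    sequence so generation halts if the model starts speaking as the user.
    The label convention here is hypothetical, for illustration only."""
    labels = {"user": "User", "assistant": "Assistant", "system": "System"}
    lines = [f"{labels[m['role']]}: {m['content']}" for m in messages]
    lines.append("Assistant:")  # cue the model to answer as the assistant
    prompt = "\n".join(lines)
    stop = [f"\n{stop_role}:"]  # pass as the completion call's stop parameter
    return prompt, stop
```

A chat-completion wrapper would feed the returned prompt and stop list into the raw completion endpoint, which is essentially what the llama-cpp-python code linked above does.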

omasoud

The completions API is legacy.

In full,

Our latest models, gpt-4 and gpt-3.5-turbo, are accessed through the chat completions API endpoint. Currently, only the older legacy models are available via the completions API endpoint.

https://platform.openai.com/docs/guides/gpt

Snowcrash

They've updated their docs to answer this question.

TL;DR:

The chat completions format can be made similar to the completions format by constructing a request using a single user message

Likewise, the completions API can be used to simulate a chat between a user and an assistant by formatting the input accordingly.

The difference between these APIs derives mainly from the underlying GPT models that are available in each

Which model should you use?

We recommend experimenting in the playground to investigate which models provide the best price performance trade-off for your usage

Reza S

/v1/completions is for a single prompt. You can use this endpoint if your app provides these services:

  • translation
  • text or code generation
  • revising a message
  • writing an email

On the other hand, you use /v1/chat/completions if your app provides these services:

  • ChatGPT-style chatbots
  • virtual assistants for customer service
  • interactive surveys and forms
Yilmaz