Exact Method
A precise way is to count tokens with tiktoken, OpenAI's Python tokenizer library. Adapted from the OpenAI Cookbook:
import tiktoken
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
num_tokens = len(encoding.encode("Look at all them pretty tokens"))
print(num_tokens)
More generally, you can use
encoding = tiktoken.get_encoding("cl100k_base")
where:
- cl100k_base is used by gpt-4, gpt-3.5-turbo, and text-embedding-ada-002;
- p50k_base is used by the Codex models, text-davinci-002, and text-davinci-003;
- r50k_base is used by gpt2 and GPT-3 models like davinci.
r50k_base and p50k_base often (but not always) give the same results.
Approximation Method
Usually you just want your program not to crash by exceeding the token limit, so you need a character-count cutoff that keeps you under it. Testing with tiktoken reveals that token count is usually roughly linear in character count, particularly with newer models, and that 1/e seems to be a robust conservative constant of proportionality. So we can write a trivial equation for conservatively relating tokens to characters:
'#tokens <? #characters * (1/e) + safety_margin'
where <? means "is very likely less than", and 1/e = 0.36787944117144232159552377016146.
An adequate choice for safety_margin seems to be 2, though with r50k_base it needed to be 8 beyond 2000 characters. The safety margin matters in two cases. First, at very low character counts, where a value of 2 is both enough and needed for all models. Second, when the tokenizer handles the text poorly, producing a noisy relationship between character count and token count whose constant of proportionality sits close to 1/e and may wander above it.
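The bound above can be wrapped in a tiny dependency-free check. The function names here are my own; the constants (1/e and safety_margin = 2) come straight from the approximation:

```python
import math

def tokens_upper_bound(num_chars: int, safety_margin: int = 2) -> float:
    """Conservative estimate: #tokens <? #characters * (1/e) + safety_margin."""
    return num_chars * (1 / math.e) + safety_margin

def fits_in_limit(text: str, token_limit: int, safety_margin: int = 2) -> bool:
    """True when the character count is small enough that the
    token limit is very likely not exceeded."""
    return tokens_upper_bound(len(text), safety_margin) <= token_limit
```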
Main Approximation Result
Now reverse this to get a maximum number of characters to fit within a token limit:
'max_characters = (#tokens_limit - safety_margin) * e'
where e = 2.7182818284590... Now you have an instant, language- and platform-independent, dependency-free way to avoid exceeding the token limit.
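A minimal sketch of this in code (function names are illustrative; the formula is the one above):

```python
import math

def max_characters(token_limit: int, safety_margin: int = 2) -> int:
    """Largest character count expected to stay within token_limit:
    max_characters = (#tokens_limit - safety_margin) * e."""
    return int((token_limit - safety_margin) * math.e)

def truncate_to_token_limit(text: str, token_limit: int) -> str:
    """Cut text so that it very likely fits within token_limit tokens."""
    return text[:max_characters(token_limit)]
```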
Show Your Work
With a paragraph of English
For model cl100k_base with English text, #tokens = #chars * 0.2016568976249748 - 5.277472848558375
For model p50k_base with English text, #tokens = #chars * 0.20820463015644564 - 4.697668008159241
For model r50k_base with English text, #tokens = #chars * 0.20820463015644564 - 4.697668008159241

With a paragraph of Lorem ipsum
For model cl100k_base with Lorem ipsum, #tokens = #chars * 0.325712437966849 - 5.186204883743613
For model p50k_base with Lorem ipsum, #tokens = #chars * 0.3622005352481815 + 2.4256199405020595
For model r50k_base with Lorem ipsum, #tokens = #chars * 0.3622005352481815 + 2.4256199405020595

With a paragraph of Python code:
For model cl100k_base with sampletext2, #tokens = #chars * 0.2658446137873485 - 0.9057612056294033
For model p50k_base with sampletext2, #tokens = #chars * 0.3240730228908291 - 5.740016444496973
For model r50k_base with sampletext2, #tokens = #chars * 0.3754121847018643 - 19.96012603693265
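The per-model lines above are ordinary least-squares fits of #tokens against #chars. If you want to redo the fit on your own text, here is a dependency-free sketch; in practice the token counts would come from tiktoken as in the Exact Method, while the sample data below is fabricated to lie exactly on a line:

```python
def fit_token_line(samples):
    """Least-squares fit of #tokens = a * #chars + b.

    samples: (num_chars, num_tokens) pairs; in practice num_tokens
    would be len(encoding.encode(text)) from tiktoken.
    """
    n = len(samples)
    sx = sum(c for c, _ in samples)
    sy = sum(t for _, t in samples)
    sxx = sum(c * c for c, _ in samples)
    sxy = sum(c * t for c, t in samples)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Fabricated sample lying exactly on #tokens = 0.4 * #chars + 1
print(fit_token_line([(0, 1), (10, 5), (20, 9)]))  # ~ (0.4, 1.0), up to float error
```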
