count tokens for new OpenAI model versions (#6195)

Calling `ChatOpenAI.get_num_tokens_from_messages` with the newly announced models `gpt-3.5-turbo-0613` and `gpt-4-0613` raises the following error:

```
NotImplementedError: get_num_tokens_from_messages() is not presently implemented for model gpt-3.5-turbo-0613.See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens.
```

This adds token-counting support for those models by counting tokens the same way as for the previous versions of `gpt-3.5-turbo` and `gpt-4`.

#### reviewers

  - @hwchase17
  - @agola11
Committed by Kyle Roth on 2023-06-15 (commit c7db9febb0, parent 7ad13cdbdb).
```diff
@@ -466,12 +466,12 @@ class ChatOpenAI(BaseChatModel):
         if sys.version_info[1] <= 7:
             return super().get_num_tokens_from_messages(messages)
         model, encoding = self._get_encoding_model()
-        if model == "gpt-3.5-turbo-0301":
+        if model.startswith("gpt-3.5-turbo"):
             # every message follows <im_start>{role/name}\n{content}<im_end>\n
             tokens_per_message = 4
             # if there's a name, the role is omitted
             tokens_per_name = -1
-        elif model == "gpt-4-0314":
+        elif model.startswith("gpt-4"):
             tokens_per_message = 3
             tokens_per_name = 1
         else:
```