forked from Archives/langchain
Add allowed and disallowed special arguments to BaseOpenAI (#3012)
## Background

This PR fixes the following error, which is raised when the text sent through the chain contains special tokens:

```
Encountered text corresponding to disallowed special token '<|endofprompt|>'. If you want this text to be encoded as a special token, pass it to `allowed_special`, e.g. `allowed_special={'<|endofprompt|>', ...}`. If you want this text to be encoded as normal text, disable the check for this token by passing `disallowed_special=(enc.special_tokens_set - {'<|endofprompt|>'})`. To disable this check for all special tokens, pass `disallowed_special=()`.
```

In the snippet below, the error is raised at the `chain` line:

```
chain = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(openai_api_key=OPENAI_API_KEY),
    retriever=vectorstore.as_retriever(),
    qa_prompt=prompt,
    condense_question_prompt=condense_prompt,
)
answer = chain({"question": f"{question}"})
```

However, the `ChatOpenAI` class does not currently accept `allowed_special` and `disallowed_special`, so they cannot be passed to `encode()` in the `get_num_tokens` method to avoid the error.

## Change

- Add `allowed_special` and `disallowed_special` attributes to the `BaseOpenAI` class.
- Pass `allowed_special` and `disallowed_special` as arguments to tiktoken's `encode()`.

---------

Co-authored-by: samcarmen <carmen.samkahman@gmail.com>
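
The forwarding described under "Change" can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code from this commit: `ToyEncoding` and `ToyLLM` are hypothetical stand-ins for a tiktoken encoding and for `BaseOpenAI`, and the list of special tokens is an illustrative subset. The defaults (`allowed_special=set()`, `disallowed_special="all"`) are assumed to mirror those of tiktoken's `encode()`.

```python
from dataclasses import dataclass, field
from typing import AbstractSet, Collection, List, Union

# Illustrative subset of special tokens (assumption, not tiktoken's full set).
SPECIAL_TOKENS = frozenset({"<|endofprompt|>", "<|endoftext|>"})


class ToyEncoding:
    """Hypothetical stand-in for a tiktoken encoding.

    It only mimics the special-token check; tokenization is a plain
    whitespace split, not BPE.
    """

    special_tokens_set = SPECIAL_TOKENS

    def encode(
        self,
        text: str,
        *,
        allowed_special: Union[str, AbstractSet[str]] = frozenset(),
        disallowed_special: Union[str, Collection[str]] = "all",
    ) -> List[str]:
        if allowed_special == "all":
            allowed_special = self.special_tokens_set
        if disallowed_special == "all":
            disallowed_special = self.special_tokens_set - set(allowed_special)
        # Reject text containing any token that is still disallowed.
        for token in disallowed_special:
            if token in text:
                raise ValueError(
                    "Encountered text corresponding to disallowed "
                    f"special token {token!r}."
                )
        return text.split()


@dataclass
class ToyLLM:
    """Hypothetical stand-in for BaseOpenAI holding the two new attributes."""

    allowed_special: Union[str, AbstractSet[str]] = field(default_factory=set)
    disallowed_special: Union[str, Collection[str]] = "all"

    def get_num_tokens(self, text: str) -> int:
        enc = ToyEncoding()
        # Forward the attributes to encode() instead of relying on its defaults.
        tokens = enc.encode(
            text,
            allowed_special=self.allowed_special,
            disallowed_special=self.disallowed_special,
        )
        return len(tokens)


text = "hello <|endofprompt|> world"
# Treat the special token as allowed, or disable the check entirely.
print(ToyLLM(allowed_special={"<|endofprompt|>"}).get_num_tokens(text))
print(ToyLLM(disallowed_special=()).get_num_tokens(text))
```

With the default attributes, `ToyLLM().get_num_tokens(text)` raises `ValueError`, which corresponds to the failure described in the background section. Note that a real tiktoken encoding also encodes allowed special tokens as single token ids, which this sketch does not model.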
parent
9d23cfc7dd
commit
d54c88aa21