Referring to #687, I implemented functionality to reduce `k` when the selected documents exceed the token limit.
Edit: I should have run `make lint` locally. Also, this only applies to `StuffDocumentChain`.
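For context, a minimal sketch of the idea, with illustrative names (`get_num_tokens` and `max_tokens_limit` are assumptions, not necessarily the PR's exact identifiers):
```python
# Hypothetical sketch: drop trailing documents until the combined token
# count fits within the limit, effectively reducing k.
def reduce_below_token_limit(docs, llm, max_tokens_limit):
    tokens = [llm.get_num_tokens(doc.page_content) for doc in docs]
    num_docs = len(docs)
    token_count = sum(tokens)
    while token_count > max_tokens_limit and num_docs > 0:
        num_docs -= 1
        token_count -= tokens[num_docs]
    return docs[:num_docs]  # the reduced set of documents
```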
* add implementations of `BaseCallbackHandler` to support tracing: `SharedTracer`, which is thread-safe, and `Tracer`, which is not and is meant to be used locally (a rough sketch of the pattern follows below)
* Tracers persist runs to a locally running `langchain-server`
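A rough sketch of the two flavors (pattern only, not the actual implementation, which also carries the run-persistence logic):
```python
import threading

class Tracer:
    """Records runs; not thread-safe, meant to be used locally."""
    def persist_run(self, run):
        ...  # POST the run to a locally running langchain-server

class SharedTracer(Tracer):
    """Singleton tracer guarded by a lock so threads can share it safely."""
    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
            return cls._instance
```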
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
If `distance_func` and `collection_name` are in `kwargs`, they are forwarded to the `QdrantClient`, which results in an error being raised.
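A minimal sketch of the fix, with illustrative defaults: pop the wrapper-level options out of `kwargs` so only client-level options reach `QdrantClient`.
```python
import qdrant_client

def make_client(**kwargs):
    # Remove options QdrantClient does not accept before forwarding the rest.
    collection_name = kwargs.pop("collection_name", "documents")  # illustrative default
    distance_func = kwargs.pop("distance_func", "Cosine")         # illustrative default
    client = qdrant_client.QdrantClient(**kwargs)
    return client, collection_name, distance_func
```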
Co-authored-by: Francisco Ingham <>
`SentenceTransformer` returns a NumPy array, not the `List[List[float]]` or `List[float]` specified by the `Embeddings` interface. This PR makes it consistent with the interface.
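A minimal sketch of the conversion, using an illustrative model name:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["first text", "second text"])  # returns a NumPy array
as_lists = [vector.tolist() for vector in vectors]     # List[List[float]]
```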
I'm providing a hotfix for the Qdrant integration. Calculating a single embedding to obtain the vector size was a great idea. However, that change introduced a bug: only that single embedding was being put into the database. It's fixed now; all the embeddings are pushed to Qdrant.
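A hypothetical sketch of the corrected flow (qdrant-client signatures vary across versions; `embedding_model` and `texts` are assumed given):
```python
import qdrant_client

client = qdrant_client.QdrantClient(host="localhost", port=6333)
embeddings = embedding_model.embed_documents(texts)  # embed *every* text
vector_size = len(embeddings[0])  # used only to size the collection
# ...create the collection with vector_size, then upload all vectors:
client.upload_collection(collection_name="documents", vectors=embeddings)
```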
Now that OpenAI has deprecated all embedding models except `text-embedding-ada-002`, we should stop specifying a legacy embedding model in the example. This will also avoid confusion for people (like me) trying to specify `model="text-embedding-ada-002"` and having it erroneously expanded to `text-search-text-embedding-ada-002-query-001`.
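A minimal sketch of the non-legacy usage, assuming a version of the integration that exposes a single `model` parameter (older versions split it into query/document model names):
```python
from langchain.embeddings import OpenAIEmbeddings

# Pass the full model name directly; no legacy suffix expansion applies.
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
```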
Since the tokenizer and model are constructed manually, `model_kwargs` needs to be passed to their constructors. Additionally, the pipeline has a specific named parameter for passing these, which can provide forward compatibility if they are used for something other than tokenizer or model construction.
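A minimal sketch, assuming the transformers library and an illustrative model id:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "gpt2"
model_kwargs = {"revision": "main"}  # illustrative shared kwargs

# The same kwargs go to both constructors...
tokenizer = AutoTokenizer.from_pretrained(model_id, **model_kwargs)
model = AutoModelForCausalLM.from_pretrained(model_id, **model_kwargs)
# ...and to the pipeline's dedicated parameter, for anything beyond construction.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs=model_kwargs,
)
```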
On the [Getting Started
page](https://langchain.readthedocs.io/en/latest/modules/prompts/getting_started.html)
for prompt templates, I believe the very last example
```python
print(dynamic_prompt.format(adjective=long_string))
```
should actually be
```python
print(dynamic_prompt.format(input=long_string))
```
The existing example produces `KeyError: 'input'`, as expected given that the template's input variable is `input`.
***
On the [Create a custom prompt
template](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/custom_prompt_template.html#id1)
page, I believe the line
```python
Function Name: {kwargs["function_name"]}
```
should actually be
```python
Function Name: {kwargs["function_name"].__name__}
```
The existing example produces the prompt:
```
Given the function name and source code, generate an English language explanation of the function.
Function Name: <function get_source_code at 0x7f907bc0e0e0>
Source Code:
def get_source_code(function_name):
    # Get the source code of the function
    return inspect.getsource(function_name)
Explanation:
```
***
On the [Example
Selectors](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/example_selectors.html)
page, the first example does not define `example_prompt`, which is also subtly different from the example prompts used previously. For the user's convenience, I suggest including
```python
example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)
```
in the code to be copy-pasted.
- This uses the faiss built-ins `write_index` and `read_index` to save and load faiss indexes locally
- Also fixes #674
- The save/load functions also use the faiss library, so I refactored the dependency import into a helper function
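A minimal sketch of the built-ins in isolation, assuming the faiss-cpu package:
```python
import faiss
import numpy as np

index = faiss.IndexFlatL2(4)                        # 4-dimensional vectors
index.add(np.random.rand(10, 4).astype("float32"))  # populate the index

faiss.write_index(index, "example.index")           # save to a local file
restored = faiss.read_index("example.index")        # load it back later
```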
Adding quotation marks around `{text}` avoids generating empty or completely random responses from OpenAI davinci-003. Empty or completely unrelated intermediate responses in summarization mess up the final result or make it very inaccurate.
The error from OpenAI would be: "The model predicted a completion that
begins with a stop sequence, resulting in no output. Consider adjusting
your prompt or stop sequences."
This fix corrects the prompting for the summarization chain. It works via the API too; the images are for demonstration purposes.
This approach can be applied to other similar prompts too.
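For reference, a minimal sketch of the changed template (the exact wording in the chain may differ):
```python
# Quotation marks around {text} give the model a clear document boundary,
# discouraging empty completions that hit a stop sequence immediately.
prompt_template = """Write a concise summary of the following:

"{text}"

CONCISE SUMMARY:"""
```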
Examples:
1) Without quotation marks
![Screenshot from 2023-01-20
07-18-19](https://user-images.githubusercontent.com/22897470/213624365-9dfc18f9-5f3f-45d2-abe1-56de67397e22.png)
2) With quotation marks
![Screenshot from 2023-01-20
07-18-35](https://user-images.githubusercontent.com/22897470/213624478-c958e742-a4a7-46fe-a163-eca6326d9dae.png)
Allow optionally specifying a list of ids for Pinecone rather than having them randomly generated.
This also permits editing the embedding/metadata of existing Pinecone entries by id.
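A hypothetical usage sketch (`texts` and `embeddings` are assumed given; parameter names follow the integration):
```python
from langchain.vectorstores import Pinecone

# Pass explicit ids instead of letting them be generated randomly.
ids = ["doc-0", "doc-1", "doc-2"]
docsearch = Pinecone.from_texts(texts, embeddings, ids=ids, index_name="example-index")
# Upserting again with the same ids overwrites those entries' vector/metadata.
```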