- Description: `MlflowCallbackHandler` fails with `KeyError: "['name']
not in index"`. See https://github.com/hwchase17/langchain/issues/5770
for more details. The root cause is that LangChain no longer passes "name"
as part of the `serialized` argument to the `on_llm_start()` callback
method. The commit where this change was made is probably this:
18af149e91.
My bug fix derives the "name" from the "id" field (see the sketch below).
- Issue: https://github.com/hwchase17/langchain/issues/5770
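A hedged sketch of the fallback, assuming `serialized["id"]` is a list of
module-path components ending in the class name (my reading of the new
format):

```python
# Hedged sketch; assumes serialized["id"] is a list of module-path
# components ending in the class name.
def resolve_name(serialized: dict) -> str:
    if "name" in serialized:
        return serialized["name"]
    # e.g. ["langchain", "llms", "openai", "OpenAI"] -> "OpenAI"
    return serialized.get("id", ["unknown"])[-1]
```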
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description
Adding a callback handler for Context. Context is a product analytics
platform for AI chat experiences to help you understand how users are
interacting with your product.
I've added the callback library + an example notebook showing its use.
### Dependencies
Requires the user to install the `context-python` library. The library
is lazily loaded when the callback is instantiated (sketched below).
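A minimal sketch of the lazy-loading pattern; the import name and client
are hypothetical placeholders, not the integration's actual API:

```python
# Minimal sketch of lazy loading; the module name is a hypothetical
# placeholder for whatever `context-python` actually exposes.
class ContextCallbackHandler:
    def __init__(self, token: str) -> None:
        try:
            # Imported only at instantiation time, so the dependency
            # stays optional for users who don't need this handler.
            import context  # hypothetical import name
        except ImportError as exc:
            raise ImportError(
                "Could not import the `context-python` package. "
                "Install it with `pip install context-python`."
            ) from exc
        self._client = context
```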
### Announcing the feature
We spoke with Harrison a few weeks ago about also doing a blog post
announcing our integration, so will coordinate this with him. Our
Twitter handle for the company is @getcontextai, and the founders are
@_agamble and @HenrySG.
Thanks in advance!
**Title:** Add verbose parameter for llamacpp
**Description:**
This pull request adds a `verbose` parameter to the llamacpp module.
When set to True, it enables detailed logging to stderr during the
execution of the Llama model, which can aid in debugging and in
understanding the internal processes of the module.
The parameter defaults to True but can be toggled off if less output is
desired. It has been added to the `validate_environment` method of the
`LlamaCpp` class, which initializes the `llama_cpp.Llama` API:
```python
class LlamaCpp(LLM):
    ...
    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        ...
        model_param_names = [
            ...
            "verbose",  # New verbose parameter added
        ]
        ...
        values["client"] = Llama(model_path, **model_params)
        ...
```
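For example, a hypothetical usage toggling the new flag off (the model
path is a placeholder):

```python
from langchain.llms import LlamaCpp

# Hypothetical usage; the model path is a placeholder.
llm = LlamaCpp(model_path="/path/to/model.bin", verbose=False)
```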
---------
Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
At the moment, the Pinecone vector store does not support filters and
namespaces when using the similarity_score_threshold search type.
In this PR, I've implemented that. It passes along all the kwargs except
"score_threshold", as that is not a supported argument of the
"similarity_search_with_score" method (see the sketch below).
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Changes
- [X] Fill the `llm_output` param when there is an output parsing error
in a Pydantic schema so that we can get the original text that failed to
parse when handling the exception
## Background
With this change, we could do something like this:
```python
output_parser = PydanticOutputParser(pydantic_object=pydantic_obj)
chain = ConversationChain(..., output_parser=output_parser)
try:
    response: PydanticSchema = chain.predict(input=input)
except OutputParserException as exc:
    logger.error(
        'OutputParserException while parsing chatbot response: %s',
        exc.llm_output,
    )
```
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
hi @rlancemartin ,
We had a new deployment, and the extension creation command was
updated from `CREATE EXTENSION pg_embedding` to `CREATE EXTENSION
embedding`.
https://github.com/neondatabase/neon/pull/4646
The extension is not public yet, so no users will be affected by this.
It will be public next week.
Please let me know if you have any questions.
Thank you in advance 🙏
Continuing the Tolkien-inspired series of langchain tools, I bring to
you:
**The Fellowship of the Vectors**, AKA EmbeddingsClusteringFilter.
This document filter uses embeddings to group vectors together into
clusters, then lets you pick an arbitrary number of document vectors
based on their proximity to the cluster centers. That gives you a
representative sample of each cluster.
The original idea is from [Greg Kamradt](https://github.com/gkamradt)
from this video (Level4):
https://www.youtube.com/watch?v=qaPMdcCqtWk&t=365s
I added a few tricks to make it a bit more versatile, so you can
parametrize what to do with duplicate documents in case of cluster
overlap: replace the duplicates with the next closest document, or
remove them. This allows you to use it as a special kind of redundancy
filter too. Additionally, you can choose two different orderings:
grouped by cluster, or respecting the original retriever scores.
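For context, an illustrative sketch of the core clustering idea using
scikit-learn's KMeans; the function name and shapes are my own, not the
filter's actual code:

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative sketch: keep the document closest to each cluster center.
def representative_indices(embeddings: np.ndarray, num_clusters: int) -> list:
    kmeans = KMeans(n_clusters=num_clusters, random_state=42).fit(embeddings)
    picked = []
    for center in kmeans.cluster_centers_:
        distances = np.linalg.norm(embeddings - center, axis=1)
        picked.append(int(distances.argmin()))
    return picked
```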
In my use case, I was using the docs grouped by cluster to run refine
chains per cluster to generate summaries over a large corpus of
documents.
Let me know if you want to change anything!
@rlancemartin, @eyurtsev, @hwchase17,
---------
Co-authored-by: rlm <pexpresss31@gmail.com>
Change details:
- Description: When calling db.persist(), a check prevents it from
proceeding because the constructor only sets the `_persist_directory`
member from its parameters. But the ChromaDB client settings also have
this parameter, and if the client_settings parameter is used without
passing persist_directory (which is optional), the `persist` method
raises a `ValueError` because `_persist_directory` is not set. This
change fixes it by setting the `_persist_directory` member from
client_settings if it is set, and otherwise from the constructor
parameter.
- Issue: I didn't find a GitHub issue for this, but I discovered it
after calling the persist method
- Dependencies: None
- Tag maintainer: vectorstore related change - @rlancemartin, @eyurtsev
- Twitter handle: Don't have one :(
*Additional discussion*: We may need to discuss the way I implemented
the fallback using `or`.
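A minimal sketch of that `or` fallback; the settings object is
simplified to anything with a `persist_directory` attribute:

```python
from typing import Optional

# Minimal sketch of the fallback; not Chroma's actual code.
def resolve_persist_directory(
    client_settings: Optional[object],
    persist_directory: Optional[str],
) -> Optional[str]:
    settings_dir = getattr(client_settings, "persist_directory", None)
    # Prefer the client settings value, fall back to the constructor arg.
    return settings_dir or persist_directory
```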
---------
Co-authored-by: rlm <pexpresss31@gmail.com>
This PR improves the example notebook for the Marqo vectorstore
implementation by adding a new RetrievalQAWithSourcesChain example. The
`embedding` parameter in `from_documents` has its type updated to
`Union[Embeddings, None]` and given a default value of None, because it
is ignored by Marqo.
This PR also upgrades the Marqo version to 0.11.0 to remove the device
parameter after a breaking change to the API.
Related to #7068 @tomhamer @hwchase17
---------
Co-authored-by: Tom Hamer <tom@marqo.ai>
Description: A pack of small fixes and refactorings that don't affect
functionality, just making the code prettier and fixing some
misspellings (hand-filtered improvements proposed by SeniorAi.online, a
prototype code-improvement tool based on GPT-4). The agents and
callbacks folders were covered.
Dependencies: Nothing changed
Twitter: https://twitter.com/nayjest
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR improves the Clarifai LangChain integration with better docs, errors, and args; adds support in LangChain for Clarifai's embedding models; and adds to the docs an overview of the various ways you can integrate with Clarifai.
---------
Co-authored-by: Matthew Zeiler <zeiler@clarifai.com>
Description: Added number_of_head_rows as a parameter to the pandas
agent. number_of_head_rows lets the user select the number of rows to
pass with the prompt when include_df_in_prompt is True. This gives the
ability to control the token length and can be helpful when dealing
with large dataframes.
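A hypothetical usage sketch; the dataframe and model choice are
placeholders:

```python
import pandas as pd
from langchain.agents import create_pandas_dataframe_agent
from langchain.llms import OpenAI

# Hypothetical usage; the dataframe and model are placeholders.
df = pd.DataFrame({"a": range(1000), "b": range(1000)})
agent = create_pandas_dataframe_agent(
    OpenAI(temperature=0),
    df,
    include_df_in_prompt=True,
    number_of_head_rows=3,  # only the first 3 rows go into the prompt
)
```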
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: Solves the anthropic packaging version issue by clearing
up the mix-up between `packaging.version` and
`importlib.metadata.version`.
- Issue: it fixes the issue #7283
- Maintainer: @hwchase17
The following change has been explained in the comment -
https://github.com/hwchase17/langchain/issues/7283#issuecomment-1624328978
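For illustration, the two `version` names that were being confused (the
comparison threshold below is a made-up example, not the value used in
the fix):

```python
from importlib.metadata import version as installed_version  # returns a str
from packaging.version import parse  # parses a str into a comparable Version

# Keep the two `version` names apart by aliasing the metadata one.
anthropic_version = parse(installed_version("anthropic"))
print(anthropic_version >= parse("0.3"))  # example threshold only
```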
- Description: pydantic's `ModelField.type_` only exposes the native
data type but not complex type hints like `List`. Thus, generating a
Tool with `from_function` through the function signature produces
incorrect argument schemas (e.g., `str` instead of `List[str]`; see the
illustration below)
- Issue: N/A
- Dependencies: N/A
- Tag maintainer: @hinthornw
- Twitter handle: `mapped`
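A quick illustration of the pydantic v1 behavior described above
(whether the fix reads `outer_type_` is my assumption):

```python
from typing import List

from pydantic import BaseModel  # pydantic v1

class Args(BaseModel):
    tags: List[str]

field = Args.__fields__["tags"]
print(field.type_)        # <class 'str'> -- the inner type only
print(field.outer_type_)  # typing.List[str] -- the full annotation
```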
All the unit tests (with an additional one in this PR) pass, though I
didn't try the integration tests...
- Description: Sometimes there are CSV attachments with the media type
"application/vnd.ms-excel". These files fail to load via the xlrd
library, which throws a corrupted-file error. I fixed it by processing
such files separately using pandas; genuine Excel files are still
processed just like before.
- Dependencies: pandas, os, io
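A hedged sketch of the idea, not the loader's actual code:

```python
import io

import pandas as pd

# Sketch only: attachments labelled "application/vnd.ms-excel" that are
# really CSV get read with pandas instead of failing in xlrd.
def load_tabular_attachment(data: bytes) -> pd.DataFrame:
    try:
        return pd.read_excel(io.BytesIO(data))  # genuine Excel files
    except Exception:
        # Mislabelled attachments are typically plain CSV.
        return pd.read_csv(io.BytesIO(data))
```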
---------
Co-authored-by: Chathura <chathurar@yaalalabs.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
In some cases, the OpenAI response is missing the `finish_reason`
attribute. It seems to happen when using Ada or Babbage with
`stream=true`, but I can't always reproduce it. This change just
handles the missing key gracefully.
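A minimal sketch of the graceful handling; the chunk below mimics a
streaming response that lacks "finish_reason":

```python
# Sketch only: .get() returns None instead of raising a KeyError.
chunk = {"choices": [{"text": "Hello", "index": 0}]}
finish_reason = chunk["choices"][0].get("finish_reason")
print(finish_reason)  # None
```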
The newly introduced function calling feature doesn't work properly
with the PromptLayerChatOpenAI model: in the `_generate` method, the
`functions` argument is not passed through to the `ChatOpenAI` base
class, which results in an empty `ai_message.additional_kwargs`.
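A hedged sketch of the shape of the fix (signatures simplified, and the
real method also handles PromptLayer logging):

```python
from langchain.chat_models import ChatOpenAI

# Sketch only: forward **kwargs (including `functions`) to the parent.
class PatchedPromptLayerChatOpenAI(ChatOpenAI):
    def _generate(self, messages, stop=None, run_manager=None, **kwargs):
        return super()._generate(
            messages, stop=stop, run_manager=run_manager, **kwargs
        )
```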
Fixes #6365
updated `tutorials.mdx`:
- added a link to the new `Deeplearning AI` course on LangChain
- added links to other tutorial videos
- fixed formatting
@baskaryan, @hwchase17
#### Description
refactor the BedrockEmbeddings class to clean up the code as below (a
sketch of the resulting shape follows the list):
1. inline the content type and accept values
2. rewrite input_body as a dictionary literal
3. no need to declare the embeddings variable, so remove it
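A hedged sketch of the resulting call shape; the client and model id are
placeholders, not BedrockEmbeddings' exact code:

```python
import json

# Sketch of the refactored shape described above.
def embed_text(client, model_id: str, text: str) -> list:
    response = client.invoke_model(
        body=json.dumps({"inputText": text}),  # input_body as a dict literal
        modelId=model_id,
        accept="application/json",        # accept inlined
        contentType="application/json",   # content type inlined
    )
    # Return the embedding directly instead of binding it to a variable.
    return json.loads(response["body"].read())["embedding"]
```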
- Description: Adding to the Chroma integration the option to run a
similarity search by vector with relevance scores (see the usage sketch
below). Fixing two minor typos.
- Issue: The "lambda_mult" typo is related to #4861
- Maintainer: @rlancemartin, @eyurtsev
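A hypothetical usage sketch; `chroma_store` and `query_vector` are
placeholders, and the method name follows the description above rather
than confirmed API:

```python
# Hypothetical usage; the method name is my assumption from the PR text.
docs_and_scores = chroma_store.similarity_search_by_vector_with_relevance_scores(
    embedding=query_vector, k=4
)
for doc, score in docs_and_scores:
    print(score, doc.page_content[:80])
```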