Commit Graph

2990 Commits (fb6e63dc3642df58b4be741832c831cfdf90516d)
 

Author SHA1 Message Date
ljeagle fb6e63dc36
Upgrade the AwaDB from 0.3.5 to 0.3.6 (#7363) 1 year ago
William FH c5edbea34a
Load Run Evaluator (#7101)
Current problems:
1. Evaluating LLMs or Chat models isn't smooth. Even specifying
'generations' as the output inserts a redundant list into the eval
template
2. Configuring input / prediction / reference keys in the
`get_qa_evaluator` function is confusing. Unless you are using a chain
with the default keys, you have to specify all the variables and need to
reason about whether the key corresponds to the traced run's inputs,
outputs or the examples inputs or outputs.


Proposal:
- Configure the run evaluator according to a model. Use the model type
and input/output keys to assert compatibility where possible. Only need
to specify a reference_key for certain evaluators (which is less
confusing than specifying input keys)


When does this work:
- If you have your langchain model available (assumed always for
run_on_dataset flow)
- If you are evaluating an LLM, Chat model, or chain
- If the LLM or chat models are traced by langchain (wouldn't work if
you add an incompatible schema via the REST API)

When would this fail:
- Currently if you directly create an example from an LLM run, the
outputs are generations with all the extra metadata present. A simple
`example_key` and dumping all to the template could make the evaluations
unreliable
- Doesn't help if you're not using the low level API
- If you want to instantiate the evaluator without instantiating your
chain or LLM (maybe common for monitoring, for instance) -> could also
load from run or run type though

What's ugly:
- Personally think it's better to load evaluators one by one since
passing a config down is pretty confusing.
- Lots of testing needs to be added
- Inconsistent in that it makes a separate run and example input mapper
instead of the original `RunEvaluatorInputMapper`, which maps a run and
example to a single input.

Example usage running the for an LLM, Chat Model, and Agent.

```
# Test running for the string evaluators
evaluator_names = ["qa", "criteria"]

model = ChatOpenAI()
configured_evaluators = load_run_evaluators_for_model(evaluator_names, model=model, reference_key="answer")
run_on_dataset(ds_name, model, run_evaluators=configured_evaluators)
```


<details>
  <summary>Full code with dataset upload</summary>
```
## Create dataset
from langchain.evaluation.run_evaluators.loading import load_run_evaluators_for_model
from langchain.evaluation import load_dataset
import pandas as pd

lcds = load_dataset("llm-math")
df = pd.DataFrame(lcds)

from uuid import uuid4
from langsmith import Client
client = Client()
ds_name = "llm-math - " + str(uuid4())[0:8]
ds = client.upload_dataframe(df, name=ds_name, input_keys=["question"], output_keys=["answer"])



## Define the models we'll test over
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent, AgentType

from langchain.tools import tool

llm = OpenAI(temperature=0)
chat_model = ChatOpenAI(temperature=0)

@tool
    def sum(a: float, b: float) -> float:
        """Add two numbers"""
        return a + b
    
def construct_agent():
    return initialize_agent(
        llm=chat_model,
        tools=[sum],
        agent=AgentType.OPENAI_MULTI_FUNCTIONS,
    )

agent = construct_agent()

# Test running for the string evaluators
evaluator_names = ["qa", "criteria"]

models = [llm, chat_model, agent]
run_evaluators = []
for model in models:
    run_evaluators.append(load_run_evaluators_for_model(evaluator_names, model=model, reference_key="answer"))
    

# Run on LLM, Chat Model, and Agent
from langchain.client.runner_utils import run_on_dataset

to_test = [llm, chat_model, construct_agent]

for model, configured_evaluators in zip(to_test, run_evaluators):
    run_on_dataset(ds_name, model, run_evaluators=configured_evaluators, verbose=True)
```
</details>

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
1 year ago
Bagatur 1ac347b4e3
update databerry-chaindesk redirect (#7378) 1 year ago
Joshua Carroll 705d2f5b92
Update the API Reference link in Streamlit integration docs (#7377)
This page:


https://python.langchain.com/docs/modules/callbacks/integrations/streamlit

Has a bad API Reference link currently. This PR fixes it to the correct
link.

Also updates the embedded app link to
https://langchain-mrkl.streamlit.app/ (better name) which is hosted in
langchain-ai/streamlit-agent repo
1 year ago
Georges Petrov ec033ae277
Rename Databerry to Chaindesk (#7022)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Philip Meier da5b0723d2
update MosaicML inputs and outputs (#7348)
As of today (July 7, 2023), the [MosaicML
API](https://docs.mosaicml.com/en/latest/inference.html#text-completion-requests)
uses `"inputs"` for the prompt

This PR adds support for this new format.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bearnardd 184ede4e48
Fix buggy output from GraphQAChain (#7372)
fixes https://github.com/hwchase17/langchain/issues/7289
A simple fix of the buggy output of `graph_qa`. If we have several
entities with triplets then the last entry of `triplets` for a given
entity merges with the first entry of the `triplets` of the next entity.
1 year ago
Harrison Chase 7cdf97ba9b
Harrison/add to imports (#7370)
pgvector cleanup
1 year ago
Bagatur 4d427b2397
Base language model docstrings (#7104) 1 year ago
ॐ shivam mamgain 2179d4eef8
Fix for KeyError in MlflowCallbackHandler (#7051)
- Description: `MlflowCallbackHandler` fails with `KeyError: "['name']
not in index"`. See https://github.com/hwchase17/langchain/issues/5770
for more details. Root cause is that LangChain does not pass "name" as a
part of `serialized` argument to `on_llm_start()` callback method. The
commit where this change was made is probably this:
18af149e91.
My bug fix derives "name" from "id" field.
  - Issue: https://github.com/hwchase17/langchain/issues/5770
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Alex Gamble df746ad821
Add a callback handler for Context (https://getcontext.ai) (#7151)
### Description

Adding a callback handler for Context. Context is a product analytics
platform for AI chat experiences to help you understand how users are
interacting with your product.

I've added the callback library + an example notebook showing its use.

### Dependencies

Requires the user to install the `context-python` library. The library
is lazily-loaded when the callback is instantiated.

### Announcing the feature

We spoke with Harrison a few weeks ago about also doing a blog post
announcing our integration, so will coordinate this with him. Our
Twitter handle for the company is @getcontextai, and the founders are
@_agamble and @HenrySG.

Thanks in advance!
1 year ago
Austin c9a0f24646
Add verbose parameter for llamacpp (#7253)
**Title:** Add verbose parameter for llamacpp

**Description:**
This pull request adds a 'verbose' parameter to the llamacpp module. The
'verbose' parameter, when set to True, will enable the output of
detailed logs during the execution of the Llama model. This added
parameter can aid in debugging and understanding the internal processes
of the module.

The verbose parameter is a boolean that prints verbose output to stderr
when set to True. By default, the verbose parameter is set to True but
can be toggled off if less output is desired. This new parameter has
been added to the `validate_environment` method of the `LlamaCpp` class
which initializes the `llama_cpp.Llama` API:

```python
class LlamaCpp(LLM):
    ...
    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        ...
        model_param_names = [
            ...
            "verbose",  # New verbose parameter added
        ]
        ...
        values["client"] = Llama(model_path, **model_params)
        ...
```
---------

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
1 year ago
Kenny 34a2755a54
Allow passing api key into OpenAIWhisperParser (#7281)
This just allows the user to pass in an api_key directly into
OpenAIWhisperParser. Very simple addition.
1 year ago
mrkhalil6 4e7d0c115b
Add support for filters and namespaces in similarity search in Pinecone similarity_score_threshold (#7301)
At the moment, pinecone vectorStore does not support filters and
namespaces when using similarity_score_threshold search type.
In this PR, I've implemented that. It passes all the kwargs except
"score_threshold" as that is not a supported argument for method
"similarity_search_with_score".
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Manuel Saelices 01dca1e438
Add context to an output parsing error on Pydantic schema to improve exception handling (#7344)
## Changes

- [X] Fill the `llm_output` param when there is an output parsing error
in a Pydantic schema so that we can get the original text that failed to
parse when handling the exception

## Background

With this change, we could do something like this:

```
output_parser = PydanticOutputParser(pydantic_object=pydantic_obj)
chain = ConversationChain(..., output_parser=output_parser)
try:
    response: PydanticSchema = chain.predict(input=input)
except OutputParserException as exc:
    logger.error(
        'OutputParserException while parsing chatbot response: %s', exc.llm_output,
    )
```
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Raouf Chebri 1ac6deda89
update extension name (#7359)
hi @rlancemartin ,

We had a new deployment and the `pg_extension` creation command was
updated from `CREATE EXTENSION pg_embedding` to `CREATE EXTENSION
embedding`.

https://github.com/neondatabase/neon/pull/4646

The extension not made public yet. No users will be affected by this.
Will be public next week.

Please let me know if you have any questions.

Thank you in advance 🙏
1 year ago
William FH 4e180dc54e
Unset Cache in Tests (#7362)
This is impacting other unit tests that use callbacks since the cache is
still set (just empty)
1 year ago
German Martin 3ce4e46c8c
The Fellowship of the Vectors: New Embeddings Filter using clustering. (#7015)
Continuing with Tolkien inspired series of langchain tools. I bring to
you:
**The Fellowship of the Vectors**, AKA EmbeddingsClusteringFilter.
This document filter uses embeddings to group vectors together into
clusters, then allows you to pick an arbitrary number of documents
vector based on proximity to the cluster centers. That's a
representative sample of the cluster.

The original idea is from [Greg Kamradt](https://github.com/gkamradt)
from this video (Level4):
https://www.youtube.com/watch?v=qaPMdcCqtWk&t=365s

I added few tricks to make it a bit more versatile, so you can
parametrize what to do with duplicate documents in case of cluster
overlap: replace the duplicates with the next closest document or remove
it. This allow you to use it as an special kind of redundant filter too.
Additionally you can choose 2 diff orders: grouped by cluster or
respecting the original retriever scores.
In my use case I was using the docs grouped by cluster to run refine
chains per cluster to generate summarization over a large corpus of
documents.
Let me know if you want to change anything!

@rlancemartin, @eyurtsev, @hwchase17,

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
1 year ago
Leonid Ganeline b489466488
docs: `dependents` update 4 (#7360)
Updated links and counters of the `dependents` page.
1 year ago
William FH 38ca5c84cb
Explicitly list requires_reference in function (#7357) 1 year ago
Harrison Chase 49b2b0e3c0
change embedding to None (#7355) 1 year ago
imaprogrammer a2830e3056
Update chroma.py: Persist directory from client_settings if provided there (#7087)
Change details:
- Description: When calling db.persist(), a check prevents from it
proceeding as the constructor only sets member `_persist_directory` from
parameters. But the ChromaDB client settings also has this parameter,
and if the client_settings parameter is used without passing the
persist_directory (which is optional), the `persist` method raises
`ValueError` for not setting `_persist_directory`. This change fixes it
by setting the member `_persist_directory` variable from client_settings
if it is set, else uses the constructor parameter.
- Issue: I didn't find any github issue of this, but I discovered it
after calling the persist method
  - Dependencies: None
- Tag maintainer: vectorstore related change - @rlancemartin, @eyurtsev
  - Twitter handle: Don't have one :(

*Additional discussion*: We may need to discuss the way I implemented
the fallback using `or`.

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
1 year ago
Bagatur cb4e88e4fb
bump 227 (#7354) 1 year ago
Bagatur d1c7237034
openai fn update nb (#7352) 1 year ago
Bagatur 0ed2da7020
bump 226 (#7335) 1 year ago
Bagatur 1c8cff32f1
Generic OpenAI fn chain (#7270)
Add loading functions for openai function chains and add docs page
1 year ago
Bagatur fd7145970f
Output parser redirect (#7330)
Related to ##7311
1 year ago
OwenElliott 3074306ae1
Marqo Vector Store Examples & Type Hints (#7326)
This PR improves the example notebook for the Marqo vectorstore
implementation by adding a new RetrievalQAWithSourcesChain example. The
`embedding` parameter in `from_documents` has its type updated to
`Union[Embeddings, None]` and a default parameter of None because this
is ignored in Marqo.

This PR also upgrades the Marqo version to 0.11.0 to remove the device
parameter after a breaking change to the API.

Related to #7068 @tomhamer @hwchase17

---------

Co-authored-by: Tom Hamer <tom@marqo.ai>
1 year ago
Nayjest 5809c3d29d
Pack of small fixes and refactorings that don't affect functionality (#6990)
Description: Pack of small fixes and refactorings that don't affect
functionality, just making code prettier & fixing some misspelling
(hand-filtered improvements proposed by SeniorAi.online, prototype of
code improving tool based on gpt4), agents and callbacks folders was
covered.

Dependencies: Nothing changed

Twitter: https://twitter.com/nayjest

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur 87f75cb322
Add base Chain docstrings (#7114) 1 year ago
Leonid Ganeline 284d40b7af
docstrings top level update (#7173)
Updated docstrings so, that [API
Reference](https://api.python.langchain.com/en/latest/api_reference.html)
page has text in the second column (class/function/... description.
1 year ago
Stav Sapir 8d961b9e33
add preset ability to textgen llm (#7196)
add an ability for textgen llm to work with preset provided by text gen
webui API.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Bagatur a9c5b4bcea
Bagatur/clarifai update (#7324)
This PR improves upon the Clarifai LangChain integration with improved docs, errors, args and the addition of embedding model support in LancChain for Clarifai's embedding models and an overview of the various ways you can integrate with Clarifai added to the docs.

---------

Co-authored-by: Matthew Zeiler <zeiler@clarifai.com>
1 year ago
Oleg Zabluda 9954eff8fd
Rename prompt_template => _DEFAULT_GRAPH_QA_TEMPLATE and PROMPT => GRAPH_QA_PROMPT to make consistent with the rest of the files (#7250)
Rename prompt_template => _DEFAULT_GRAPH_QA_TEMPLATE to make consistent
with the rest of the file.
1 year ago
Nikhil Kumar Gupta 6095a0a310
Added number_of_head_rows to pandas agent parameters (#7271)
Description: Added number_of_head_rows as a parameter to pandas agent.
number_of_head_rows allows the user to select the number of rows to pass
with the prompt when include_df_in_prompt is True. This gives the
ability to control the token length and can be helpful in dealing with
large dataframe.
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
John Landahl e047541b5f
Corrected a typo in elasticsearch.ipynb (#7318)
Simple typo fix
1 year ago
Subsegment 152dc59060
docs : add cnosdb to Ecosystem Integrations (#7316)
- Implement a `from_cnosdb` method for the `SQLDatabase` class
  - Write CnosDB documentation and add it to Ecosystem Integrations
1 year ago
Bagatur 927c8eb91a
Refac package version check (#7312) 1 year ago
Sparsh Jain bac56618b4
Solving anthropic packaging version issue (#7306)
- Description: Solving, anthropic packaging version issue by clearing
the mixup from package.version that is being confused with version from
- importlib.metadata.version.

  - Issue: it fixes the issue #7283 
  - Maintainer: @hwchase17 

The following change has been explained in the comment -
https://github.com/hwchase17/langchain/issues/7283#issuecomment-1624328978
1 year ago
Jason B. Koh d642609a23
Fix: Recognize `List` at `from_function` (#7178)
- Description: pydantic's `ModelField.type_` only exposes the native
data type but not complex type hints like `List`. Thus, generating a
Tool with `from_function` through function signature produces incorrect
argument schemas (e.g., `str` instead of `List[str]`)
  - Issue: N/A
  - Dependencies: N/A
  - Tag maintainer: @hinthornw
  - Twitter handle: `mapped`

All the unittest (with an additional one in this PR) passed, though I
didn't try integration tests...
1 year ago
Chathura Rathnayake ec10787bc7
Fixed the confluence loader ".csv" files loading issue (#7195)
- Description: Sometimes there are csv attachments with the media type
"application/vnd.ms-excel". These files failed to be loaded via the xlrd
library. It throws a corrupted file error. I fixed it by separately
processing excel files using pandas. Excel files will be processed just
like before.

- Dependencies: pandas, os, io

---------

Co-authored-by: Chathura <chathurar@yaalalabs.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1 year ago
Andre Elizondo b21c2f8704
Update docs for whylabs (langkit) callback handler (#7293)
- Description: Update docs for whylabs callback handler
  - Issue: none
  - Dependencies: none
  - Tag maintainer: @agola11 
  - Twitter handle: @useautomation @whylabs

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Jamie Broomall <jamie@whylabs.ai>
1 year ago
William FH e736d60516
Load Evaluator (#6942)
Create a `load_evaluators()` function so you don't have to import all
the individual evaluator classes
1 year ago
David Duong 12d14f8947
Fix secrets serialisation for ChatAnthropic (#7300) 1 year ago
William FH cb9ff6efb8
Add function call params to invocation params (#7240) 1 year ago
William FH 1f4a51cb9c
Add Agent Trajectory Interface (#7122) 1 year ago
Bagatur a6b39afe0e
rm side nav (#7297) 1 year ago
Bruno Bornsztein 1a4ca3eff9
handle missing finish_reason (#7296)
In some cases, the OpenAI response is missing the `finish_reason`
attribute. It seems to happen when using Ada or Babbage and
`stream=true`, but I can't always reproduce it. This change just
gracefully handles the missing key.
1 year ago
Leonid Ganeline 6ff9e9b34a
updated `huggingface_hub` examples (#7292)
Added examples for models:
- Google `Flan`
- TII `Falcon`
- Salesforce `XGen`
1 year ago
Avinash Raj 09acbb8410
Modified PromptLayerChatOpenAI class to support function call (#6366)
Introduction of newest function calling feature doesn't work properly
with PromptLayerChatOpenAI model since on the `_generate` method,
functions argument are not even getting passed to the `ChatOpenAI` base
class which results in empty `ai_message.additional_kwargs`

Fixes  #6365
1 year ago