Commit Graph

1507 Commits (1d861dc37a63a41ae2e0983f2ee418efde968ce3)

Author SHA1 Message Date
Matt Wells 1d861dc37a
MRKL output parser no longer breaks well formed queries (#5432)
# Handles the edge scenario in which the action input is a well formed
SQL query which ends with a quoted column

There may be a cleaner option here (or indeed other edge scenarios) but
this seems to robustly determine if the action input is likely to be a
well formed SQL query in which we don't want to arbitrarily trim off `"`
characters

Fixes #5423

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Agents / Tools / Toolkits
  - @vowelparrot
1 year ago
Yoann Poupart c1807d8408
`encoding_kwargs` for InstructEmbeddings (#5450)
# What does this PR do?

Bring support of `encode_kwargs` for ` HuggingFaceInstructEmbeddings`,
change the docstring example and add a test to illustrate with
`normalize_embeddings`.

Fixes #3605
(Similar to #3914)

Use case:
```python
from langchain.embeddings import HuggingFaceInstructEmbeddings

model_name = "hkunlp/instructor-large"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
hf = HuggingFaceInstructEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)
```
1 year ago
Patrick Keane e09afb4b44
Removes duplicated call from langchain/client/langchain.py (#5449)
This removes duplicate code presumably introduced by a cut-and-paste
error, spotted while reviewing the code in
```langchain/client/langchain.py```. The original code had back to back
occurrences of the following code block:

```
        response = self._get(
            path,
            params=params,
        )
        raise_for_status_with_text(response)
```
1 year ago
Jan Brinkmann 0d3a9d481f
Fixed docstring in faiss.py for load_local (#5440)
# Fix for docstring in faiss.py vectorstore (load_local)

The doctring should reflect that load_local loads something FROM the
disk.
1 year ago
Davis Chase 2649b638dd
fix (#5457) 1 year ago
ByronHsu 9d658aaa5a
Add more code splitters (go, rst, js, java, cpp, scala, ruby, php, swift, rust) (#5171)
As the title says, I added more code splitters.
The implementation is trivial, so i don't add separate tests for each
splitter.
Let me know if any concerns.

Fixes # (issue)
https://github.com/hwchase17/langchain/issues/5170

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
@eyurtsev @hwchase17

---------

Signed-off-by: byhsu <byhsu@linkedin.com>
Co-authored-by: byhsu <byhsu@linkedin.com>
1 year ago
Paul-Emile Brotons a61b7f7e7c
adding MongoDBAtlasVectorSearch (#5338)
# Add MongoDBAtlasVectorSearch for the python library

Fixes #5337
---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Harrison Chase c4b502a470
Harrison/condense q llm (#5438) 1 year ago
Zander Chase 26ff18575c
Set old LCTracer to default to port 8000 (#5381)
Issue from:
https://discord.com/channels/1038097195422978059/1069478035918688346/1112445980466483222
1 year ago
Harrison Chase 760632b292
Harrison/spark reader (#5405)
Co-authored-by: Rithwik Ediga Lakhamsani <rithwik.ediga@databricks.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
UmerHA 8259f9b7fa
DocumentLoader for GitHub (#5408)
# Creates GitHubLoader (#5257)

GitHubLoader is a DocumentLoader that loads issues and PRs from GitHub.

Fixes #5257

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
German Martin 0b3e0dd1d2
New Trello document loader (#4767)
# Added New Trello loader class and documentation

Simple Loader on top of py-trello wrapper. 
With a board name you can pull cards and to do some field parameter
tweaks on load operation.
I included documentation and examples.
Included unit test cases using patch and a fixture for py-trello client
class.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Harrison Chase 72f99ff953
Harrison/text splitter (#5417)
adds support for keeping separators around when using recursive text
splitter
1 year ago
小铭 cf5803e44c
Add ToolException that a tool can throw. (#5050)
# Add ToolException that a tool can throw
This is an optional exception that tool throws when execution error
occurs.
When this exception is thrown, the agent will not stop working,but will
handle the exception according to the handle_tool_error variable of the
tool,and the processing result will be returned to the agent as
observation,and printed in pink on the console.It can be used like this:
```python 
from langchain.schema import ToolException
from langchain import LLMMathChain, SerpAPIWrapper, OpenAI
from langchain.agents import AgentType, initialize_agent
from langchain.chat_models import ChatOpenAI
from langchain.tools import BaseTool, StructuredTool, Tool, tool
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0)
llm_math_chain = LLMMathChain(llm=llm, verbose=True)

class Error_tool:
    def run(self, s: str):
        raise ToolException('The current search tool is not available.')
    
def handle_tool_error(error) -> str:
    return "The following errors occurred during tool execution:"+str(error)

search_tool1 = Error_tool()
search_tool2 = SerpAPIWrapper()
tools = [
    Tool.from_function(
        func=search_tool1.run,
        name="Search_tool1",
        description="useful for when you need to answer questions about current events.You should give priority to using it.",
        handle_tool_error=handle_tool_error,
    ),
    Tool.from_function(
        func=search_tool2.run,
        name="Search_tool2",
        description="useful for when you need to answer questions about current events",
        return_direct=True,
    )
]
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True,
                         handle_tool_errors=handle_tool_error)
agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?")
```

![image](https://github.com/hwchase17/langchain/assets/32786500/51930410-b26e-4f85-a1e1-e6a6fb450ada)

## Who can review?
- @vowelparrot

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Harrison Chase 2da8c48be1
Harrison/datetime parser (#4693)
Co-authored-by: Jacob Valdez <jacobfv@msn.com>
Co-authored-by: Jacob Valdez <jacob.valdez@limboid.ai>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
1 year ago
Eduard van Valkenburg ccb6238de1
Implemented appending arbitrary messages (#5293)
# Implemented appending arbitrary messages to the base chat message
history, the in-memory and cosmos ones.

<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->

As discussed this is the alternative way instead of #4480, with a
add_message method added that takes a BaseMessage as input, so that the
user can control what is in the base message like kwargs.

<!-- Remove if not applicable -->

Fixes # (issue)

## Before submitting

<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

@hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase d6fb25c439
Harrison/prediction guard update (#5404)
Co-authored-by: Daniel Whitenack <whitenack.daniel@gmail.com>
1 year ago
Harrison Chase 416c8b1da3
Harrison/deep infra (#5403)
Co-authored-by: Yessen Kanapin <yessenzhar@gmail.com>
Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>
1 year ago
Justin Flick c09f8e4ddc
Add pagination for Vertex AI embeddings (#5325)
Fixes #5316

---------

Co-authored-by: Justin Flick <jflick@homesite.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase 3e16468423
Harrison/llamacpp (#5402)
Co-authored-by: Gavin S <gavinswanson@gmail.com>
1 year ago
Chandan Routray 642ae83d86
Removed deprecated llm attribute for load_chain (#5343)
# Removed deprecated llm attribute for load_chain

Currently `load_chain` for some chain types expect `llm` attribute to be
present but `llm` is deprecated attribute for those chains and might not
be persisted during their `chain.save`.

Fixes #5224
[(issue)](https://github.com/hwchase17/langchain/issues/5224)

## Who can review?
@hwchase17
@dev2049

---------

Co-authored-by: imeckr <chandanroutray2012@gmail.com>
1 year ago
Martin Holecek 44b48d9518
Fix update_document function, add test and documentation. (#5359)
# Fix for `update_document` Function in Chroma

## Summary
This pull request addresses an issue with the `update_document` function
in the Chroma class, as described in
[#5031](https://github.com/hwchase17/langchain/issues/5031#issuecomment-1562577947).
The issue was identified as an `AttributeError` raised when calling
`update_document` due to a missing corresponding method in the
`Collection` object. This fix refactors the `update_document` method in
`Chroma` to correctly interact with the `Collection` object.

## Changes
1. Fixed the `update_document` method in the `Chroma` class to correctly
call methods on the `Collection` object.
2. Added the corresponding test `test_chroma_update_document` in
`tests/integration_tests/vectorstores/test_chroma.py` to reflect the
updated method call.
3. Added an example and explanation of how to use the `update_document`
function in the Jupyter notebook tutorial for Chroma.

## Test Plan
All existing tests pass after this change. In addition, the
`test_chroma_update_document` test case now correctly checks the
functionality of `update_document`, ensuring that the function works as
expected and updates the content of documents correctly.

## Reviewers
@dev2049

This fix will ensure that users are able to use the `update_document`
function as expected, without encountering the previous
`AttributeError`. This will enhance the usability and reliability of the
Chroma class for all users.

Thank you for considering this pull request. I look forward to your
feedback and suggestions.
1 year ago
Louis Amaudruz e455ba4ed5
Add async support to routing chains (#5373)
# Add async support for (LLM) routing chains

<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->

<!-- Remove if not applicable -->

Add asynchronous LLM calls support for the routing chains. More
specifically:
- Add async `aroute` function (i.e. async version of `route`) to the
`RouterChain` which calls the routing LLM asynchronously
- Implement the async `_acall` for the `LLMRouterChain`
- Implement the async `_acall` function for `MultiRouteChain` which
first calls asynchronously the routing chain with its new `aroute`
function, and then calls asynchronously the relevant destination chain.

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

## Who can review?

- @agola11

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead
  Async
  - @agola11
        
 -->
1 year ago
Gael Grosch 8b7721ebbb
fix: Blob.from_data mimetype is lost (#5395)
# Fix lost mimetype when using Blob.from_data method

The mimetype is lost due to a typo in the class attribue name

Fixes # - (no issue opened but I can open one if needed)

## Changes

* Fixed typo in name
* Added unit-tests to validate the output Blob


## Review
@eyurtsev
1 year ago
Zander Chase 14099f1b93
Use Default Factory (#5380)
We shouldn't be calling a constructor for a default value - should use
default_factory instead. This is especially ad in this case since it
requires an optional dependency and an API key to be set.
 
Resolves #5361
1 year ago
Harrison Chase 6df90ad9fd
handle json parsing errors (#5371)
adds tests cases, consolidates a lot of PRs
1 year ago
玄猫 99a1e3f3a3
Fix: Handle empty documents in ContextualCompressionRetriever (Issue #5304) (#5306)
# Fix: Handle empty documents in ContextualCompressionRetriever (Issue
#5304)

Fixes #5304 

Prevent cohere.error.CohereAPIError caused by an empty list of documents
by adding a condition to check if the input documents list is empty in
the compress_documents method. If the list is empty, return an empty
list immediately, avoiding the error and unnecessary processing.

@dev2049

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
os1ma 1366d070fc
Add path validation to DirectoryLoader (#5327)
# Add path validation to DirectoryLoader

This PR introduces a minor adjustment to the DirectoryLoader by adding
validation for the path argument. Previously, if the provided path
didn't exist or wasn't a directory, DirectoryLoader would return an
empty document list due to the behavior of the `glob` method. This could
potentially cause confusion for users, as they might expect a
file-loading error instead.

So, I've added two validations to the load method of the
DirectoryLoader:

- Raise a FileNotFoundError if the provided path does not exist
- Raise a ValueError if the provided path is not a directory

Due to the relatively small scope of these changes, a new issue was not
created.

## Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

@eyurtsev
1 year ago
Harrison Chase b6927970f1
revert bad json (#5370) 1 year ago
Matt Wells 9a5c9df809
Fixes iter error in FAISS add_embeddings call (#5367)
# Remove re-use of iter within add_embeddings causing error

As reported in https://github.com/hwchase17/langchain/issues/5336 there
is an issue currently involving the atempted re-use of an iterator
within the FAISS vectorstore adapter

Fixes # https://github.com/hwchase17/langchain/issues/5336

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

  VectorStores / Retrievers / Memory
  - @dev2049
1 year ago
Janos Tolgyesi 5f4552391f
Add SKLearnVectorStore (#5305)
# Add SKLearnVectorStore

This PR adds SKLearnVectorStore, a simply vector store based on
NearestNeighbors implementations in the scikit-learn package. This
provides a simple drop-in vector store implementation with minimal
dependencies (scikit-learn is typically installed in a data scientist /
ml engineer environment). The vector store can be persisted and loaded
from json, bson and parquet format.

SKLearnVectorStore has soft (dynamic) dependency on the scikit-learn,
numpy and pandas packages. Persisting to bson requires the bson package,
persisting to parquet requires the pyarrow package.

## Before submitting

Integration tests are provided under
`tests/integration_tests/vectorstores/test_sklearn.py`

Sample usage notebook is provided under
`docs/modules/indexes/vectorstores/examples/sklear.ipynb`

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Aymen Furter e2742953a6
feat: support for shopping search in SerpApi (#5259)
# Support for shopping search in SerpApi

## Who can review?
@vowelparrot
1 year ago
Eduard van Valkenburg 1daa7068b2
added cosmos kwargs option (#5292)
# Added the ability to pass kwargs to cosmos client constructor

The cosmos client has a ton of options that can be set, so allowing
those to be passed to the constructor from the chat memory constructor
with this PR.
1 year ago
mbchang f079cdf479
fix: remove empty lines that cause InvalidRequestError (#5320)
# remove empty lines in GenerativeAgentMemory that cause
InvalidRequestError in OpenAIEmbeddings

<!--
Thank you for contributing to LangChain! Your PR will appear in our
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->

<!-- Remove if not applicable -->

Let's say the text given to `GenerativeAgent._parse_list` is
```
text = """
Insight 1: <insight 1>

Insight 2: <insight 2>
"""
```
This creates an `openai.error.InvalidRequestError: [''] is not valid
under any of the given schemas - 'input'` because
`GenerativeAgent.add_memory()` tries to add an empty string to the
vectorstore.

This PR fixes the issue by removing the empty line between `Insight 1`
and `Insight 2`

## Before submitting

<!-- If you're adding a new integration, please include:

1. a test for the integration - favor unit tests that does not rely on
network access.
2. an example notebook showing its use


See contribution guidelines for more information on how to write tests,
lint
etc:


https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

  @hwchase17 - project lead

  Tracing / Callbacks
  - @agola11

  Async
  - @agola11

  DataLoaders
  - @eyurtsev

  Models
  - @hwchase17
  - @agola11

  Agents / Tools / Toolkits
  - @vowelparrot

  VectorStores / Retrievers / Memory
  - @dev2049
        
 -->
@hwchase17
@vowelparrot
@dev2049
1 year ago
Deepak S V c6e5d90eff
Fixing blank thoughts in verbose for "_Exception" Action (#5331)
Fixed the issue of blank Thoughts being printed in verbose when
`handle_parsing_errors=True`, as below:

Before Fix:
```
Observation: There are 38175 accounts available in the dataframe.
Thought:
Observation: Invalid or incomplete response
Thought:
Observation: Invalid or incomplete response
Thought:
```

After Fix:
```
Observation: There are 38175 accounts available in the dataframe.
Thought:AI: {
    "action": "Final Answer",
    "action_input": "There are 38175 accounts available in the dataframe."
}
Observation: Invalid Action or Action Input format
Thought:AI: {
    "action": "Final Answer",
    "action_input": "The number of available accounts is 38175."
}
Observation: Invalid Action or Action Input format
```

@vowelparrot currently I have set the colour of thought to green (same
as the colour when `handle_parsing_errors=False`). If you want to change
the colour of this "_Exception" case to red or something else (when
`handle_parsing_errors=True`), feel free to change it in line 789.
1 year ago
Harrison Chase 179ddbe88b
add enum output parser (#5165) 1 year ago
Shukri 58e95cd11e
Better docs for weaviate hybrid search (#5290)
# Better docs for weaviate hybrid search

<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->

<!-- Remove if not applicable -->

Fixes: NA

## Before submitting

<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

        @hwchase17 - project lead

        Tracing / Callbacks
        - @agola11

        Async
        - @agola11

        DataLoaders
        - @eyurtsev

        Models
        - @hwchase17
        - @agola11

        Agents / Tools / Toolkits
        - @vowelparrot
        
        VectorStores / Retrievers / Memory
        - @dev2049
        
 -->
@dev2049
1 year ago
Leonid Kuligin aa3c7b3271
Fixed passing creds to VertexAI LLM (#5297)
# Fixed passing creds to VertexAI LLM

Fixes  #5279 

It looks like we should drop a type annotation for Credentials.

Co-authored-by: Leonid Kuligin <kuligin@google.com>
1 year ago
Peng Qu d481d887bc
Add an example to make the prompt more robust (#5291)
# Add example to LLMMath to help with power operator

Add example to LLMMath that helps the model to interpret `^` as the power operator rather than the python xor operator.
1 year ago
Xiangrui Meng aec642febb
LLM wrapper for Databricks (#5142)
This PR adds LLM wrapper for Databricks. It supports two endpoint types:
* serving endpoint
* cluster driver proxy app

An integration notebook is included to show how it works.


Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com>
Co-authored-by: Gengliang Wang <gengliang@apache.org>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Ted Martinez 1cb6498fdb
Tedma4/twilio tool (#5136)
# Add twilio sms tool

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Michael Landis 7047a2c1af
feat: add Momento as a standard cache and chat message history provider (#5221)
# Add Momento as a standard cache and chat message history provider

This PR adds Momento as a standard caching provider. Implements the
interface, adds integration tests, and documentation. We also add
Momento as a chat history message provider along with integration tests,
and documentation.

[Momento](https://www.gomomento.com/) is a fully serverless cache.
Similar to S3 or DynamoDB, it requires zero configuration,
infrastructure management, and is instantly available. Users sign up for
free and get 50GB of data in/out for free every month.

## Before submitting

 We have added documentation, notebooks, and integration tests
demonstrating usage.

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Hassan Ouda 56ad56c812
Support bigquery dialect - SQL (#5261)
# Your PR Title (What it does)

Adding an if statement to deal with bigquery sql dialect. When I use
bigquery dialect before, it failed while using SET search_path TO. So
added a condition to set dataset as the schema parameter which is
equivalent to SET search_path TO . I have tested and it works.


## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
@dev2049
1 year ago
Abdelsalam ElTamawy 2ef5579eae
Added pipline args to `HuggingFacePipeline.from_model_id` (#5268)
The current `HuggingFacePipeline.from_model_id` does not allow passing
of pipeline arguments to the transformer pipeline.
This PR enables adding important pipeline parameters like setting
`max_new_tokens` for example.
Previous to this PR it would be necessary to manually create the
pipeline through huggingface transformers then handing it to langchain.

For example instead of this
```py
model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline(
    "text-generation", model=model, tokenizer=tokenizer, max_new_tokens=10
)
hf = HuggingFacePipeline(pipeline=pipe)
```
You can write this
```py
hf = HuggingFacePipeline.from_model_id(
    model_id="gpt2", task="text-generation", pipeline_kwargs={"max_new_tokens": 10}
)
```


Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Davis Chase f01dfe858d
OpenAI lint (#5273)
Causing lint issues if you have openai installed, annoying for local dev
1 year ago
Nicholas Liu 7652d2abb0
Add Multi-CSV/DF support in CSV and DataFrame Toolkits (#5009)
Add Multi-CSV/DF support in CSV and DataFrame Toolkits
* CSV and DataFrame toolkits now accept list of CSVs/DFs
* Add default prompts for many dataframes in `pandas_dataframe` toolkit

Fixes #1958
Potentially fixes #4423

## Testing
* Add single and multi-dataframe integration tests for
`pandas_dataframe` toolkit with permutations of `include_df_in_prompt`
* Add single and multi-CSV integration tests for csv toolkit
---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Alex Rothberg 3223a97dc6
Add visible_only and strict_mode options to ClickTool (#4088)
Partially addresses: https://github.com/hwchase17/langchain/issues/4066
1 year ago
Ravindra Marella b3988621c5
Add C Transformers for GGML Models (#5218)
# Add C Transformers for GGML Models
I created Python bindings for the GGML models:
https://github.com/marella/ctransformers

Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See
[Supported
Models](https://github.com/marella/ctransformers#supported-models).


It provides a unified interface for all models:

```python
from langchain.llms import CTransformers

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))
```

It can be used with models hosted on the Hugging Face Hub:

```py
llm = CTransformers(model='marella/gpt-2-ggml')
```

It supports streaming:

```py
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()])
```

Please see [README](https://github.com/marella/ctransformers#readme) for
more details.
---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Alon Diament d3cd21ccf8
Fixed regression in JoplinLoader's get note url (#5265)
Fixes a regression in JoplinLoader that was introduced during the code
review (bad `page` wildcard in _get_note_url).

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

@dev2049
@leo-gan
1 year ago
Davis Chase 3be9ba14f3
OpenSearch top k parameter fix (#5216)
For most queries it's the `size` parameter that determines final number
of documents to return. Since our abstractions refer to this as `k`, set
this to be `k` everywhere instead of expecting a separate param. Would
be great to have someone more familiar with OpenSearch validate that
this is reasonable (e.g. that having `size` and what OpenSearch calls
`k` be the same won't lead to any strange behavior). cc @naveentatikonda

Closes #5212
1 year ago