Commit Graph

7004 Commits

Author SHA1 Message Date
Bagatur
1e29b676d5
core[patch]: simple fallback streaming (#16055) 2024-01-19 16:31:54 -08:00
Eugene Yurtsev
4ef0ed4ddc
astream_events: Add version parameter while method is in beta (#16290)
Add a version parameter while the method is in beta phase.

The goal is to minimize breaking changes for users while we iterate on the schema.

Once the API is stable we can assign a default version requirement.
2024-01-19 13:20:02 -05:00
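A minimal sketch of the idea in commit 4ef0ed4ddc, assuming a beta API that forces callers to name the schema version they expect. The function `stream_events` and its event payloads are hypothetical stand-ins for illustration; this is not LangChain's implementation.

```python
from typing import Any, Dict, Iterator, Literal


def stream_events(payload: str, *, version: Literal["v1"]) -> Iterator[Dict[str, Any]]:
    """Hypothetical beta API: `version` is keyword-only and has no default."""
    if version != "v1":
        # Unknown schema versions fail loudly instead of silently changing shape.
        raise NotImplementedError(f"Unsupported schema version: {version!r}")
    yield {"event": "on_chain_start", "data": {"input": payload}}
    yield {"event": "on_chain_end", "data": {"output": payload.upper()}}


events = list(stream_events("hello", version="v1"))
```

Because the parameter has no default during the beta, every call site records the schema it was written against, so a later schema change can be introduced under a new version string without breaking existing callers.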
Bagatur
91230ef5d1
openai[patch]: Release 0.0.3 (#16289) 2024-01-19 10:15:08 -08:00
Hamza Kyamanywa
39b3c6d94c
langchain[patch]: Add konlpy based text splitting for Korean (#16003)
- **Description:** Adds a text splitter based on
[Konlpy](https://konlpy.org/en/latest/#start), a Python package
for natural language processing (NLP) of the Korean language (similar
to spaCy or NLTK, but for Korean).
- **Dependencies:** Konlpy must be installed before this
splitter is used.
- **Twitter handle:** @untilhamza
2024-01-19 09:44:56 -08:00
Hongyu Lin
9b0a531aa2
doc: Fix small typo in quickstart (#16164)
- **Description:** fix small typo in quickstart

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-01-19 09:44:22 -08:00
Sagar B Manjunath
63e2acc964
docs: Fix minor issues in NVIDIA RAG canonical template (#16189)
- **Description:** Fixes a few issues in the NVIDIA canonical RAG template's
README, and adds a notebook for the template
- **Dependencies:** Adds the pypdf dependency which is needed for
ingestion, and updates the lock file

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-01-19 09:44:08 -08:00
Lance Martin
881d1c3ec5
Update MultiON toolkit docs (#16286) 2024-01-19 09:37:20 -08:00
Bagatur
e3828bee43
core[patch]: Release 0.1.13 (#16287) 2024-01-19 09:28:31 -08:00
Bagatur
2454fefc53
docs: agent prompt docs (#16105) 2024-01-19 09:19:22 -08:00
Bagatur
84bf5787a7
core[patch], openai[patch]: Chat openai stream logprobs (#16218) 2024-01-19 09:16:09 -08:00
Bagatur
6f7a414955
docs: fix links (#16284) 2024-01-19 08:51:12 -08:00
Eugene Yurtsev
cc2e30fa13
CI: update the description used for privileged issue template (#16277)
Update description
2024-01-19 10:13:33 -05:00
Eugene Yurtsev
3b649f4331
CI: Add privileged version for issue creation (#16276)
Add privileged version for issue creation.

This adds a version of issue creation which is unstructured by design to
make it easier for maintainers to create issues.

Maintainers are expected to write / describe issues clearly.
2024-01-19 09:53:51 -05:00
Eugene Yurtsev
c0d453d8ac
CI: Disable blank issues, add links to QA discussions & show and tell (#16275)
Update the issue template
2024-01-19 09:34:23 -05:00
Carey
021b0484a8
community[patch]: add skipped test for inner product normalization (#14989)
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-01-18 23:03:15 -08:00
Lance Martin
f63906a9c2
Test and update MultiON agent toolkit docs (#16235) 2024-01-18 20:24:35 -08:00
Christophe Bornet
3ccbe11363
community[minor]: Add Cassandra document loader (#16215)
- **Description:** document loader for Apache Cassandra
  - **Twitter handle:** cbornet_
2024-01-18 18:49:02 -08:00
Tomaz Bratanic
fc84083ce5
docs: Add neo4j semantic blog post link to templates (#16225) 2024-01-18 18:45:22 -08:00
mikeFore4
9d32af72ce
community[patch]: huggingface hub character removal bug fix (#16233)
- **Description:** Some text-generation models on huggingface repeat the
prompt in their generated response, but not all do! The tests use "gpt2"
which DOES repeat the prompt and as such, the HuggingFaceHub class is
hardcoded to remove the first few characters of the response (to match
the len(prompt)). However, if you are using a model (such as the very
popular "meta-llama/Llama-2-7b-chat-hf") that DOES NOT repeat the prompt
in its generated text, then the beginning of the generated text will be
cut off. This code change fixes that bug by first checking whether the
prompt is repeated in the generated response and removing it
conditionally.
  - **Issue:** #16232 
  - **Dependencies:** N/A
  - **Twitter handle:** N/A
2024-01-18 18:44:10 -08:00
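The conditional removal described in commit 9d32af72ce can be sketched in plain Python. The helper name `strip_echoed_prompt` is hypothetical and this is only an illustration of the check, not the code in the HuggingFaceHub class.

```python
def strip_echoed_prompt(prompt: str, generated: str) -> str:
    """Drop the prompt from the start of `generated` only if it is actually echoed."""
    if generated.startswith(prompt):
        return generated[len(prompt):]
    # The model did not repeat the prompt, so nothing should be cut off.
    return generated


echoing = strip_echoed_prompt("Hi", "Hi there")
non_echoing = strip_echoed_prompt("Hi", "Hello, how can I help?")
```

Unconditionally slicing off `len(prompt)` characters would have truncated the second response; the prefix check leaves it intact.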
Andreas Motl
3613d8a2ad
community[patch]: Use SQLAlchemy's bulk_save_objects method to improve insert performance (#16244)
- **Description:** Improve [pgvector vector store
adapter](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py)
to save embeddings in batches, to improve its performance.
  - **Issue:** NA
  - **Dependencies:** NA
  - **References:** https://github.com/crate-workbench/langchain/pull/1


Hi again from the CrateDB team,

following up on GH-16243, this is another minor patch to the pgvector
vector store adapter. Inserting embeddings in batches, using
[SQLAlchemy's
`bulk_save_objects`](https://docs.sqlalchemy.org/en/20/orm/session_api.html#sqlalchemy.orm.Session.bulk_save_objects)
method, can deliver substantial performance gains.

With kind regards,
Andreas.

NB: I have just noticed that this method is a legacy feature of SQLAlchemy
2.0, so it will need to be reworked in a future iteration. However, it is
not deprecated yet, and I have not yet been able to come up with a different
implementation.
2024-01-18 18:35:39 -08:00
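To illustrate why batching inserts helps (commit 3613d8a2ad), here is a rough sketch that uses the standard library's `sqlite3` and `executemany` as a stand-in. It does not use SQLAlchemy or `bulk_save_objects`; the table layout, the helper `insert_in_batches`, and the batch size are assumptions made for the example.

```python
import sqlite3
from typing import Sequence, Tuple

Row = Tuple[str, str]  # (document text, serialized embedding)


def insert_in_batches(conn: sqlite3.Connection, rows: Sequence[Row], batch_size: int = 500) -> int:
    """Insert rows in chunks, issuing one executemany call per chunk."""
    inserted = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany("INSERT INTO embeddings (document, vector) VALUES (?, ?)", batch)
        inserted += len(batch)
    conn.commit()  # single commit after all batches
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (document TEXT, vector TEXT)")
rows = [(f"doc {i}", "[0.1, 0.2]") for i in range(1200)]
count = insert_in_batches(conn, rows)
```

The point is the shape of the loop: many rows per database call and one commit at the end, rather than one call and one commit per row.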
Ashley Xu
0f99646ca6
docs: add the enrollment form for BigQueryVectorSearch (#16240)
This PR adds the enrollment form for BigQueryVectorSearch.
2024-01-18 18:34:06 -08:00
Eugene Yurtsev
177af65dc4
core[minor]: RFC Add astream_events to Runnables (#16172)
This PR adds `astream_events` method to Runnables to make it easier to
stream data from arbitrary chains.

* Streaming only works properly in async right now
* Use `astream()` when mixing in imperative code, as might
be done in tool implementations
* `astream_log` has been modified with minimal additive changes, so no
breaking changes are expected
* Underlying callback code / tracing code should be refactored at some
point to handle things more consistently (OK for now)

- ~~[ ] verify event for on_retry~~ does not work until we implement
streaming for retry
- ~~[ ] Any renaming? Should we rename "event" to "hook"?~~
- [ ] Any other feedback from community?
- [x] throw NotImplementedError for `RunnableEach` for now

## Example

See this [Example
Notebook](dbbc7fa0d6/docs/docs/modules/agents/how_to/streaming_events.ipynb)
for an example with streaming in the context of an Agent

## Event Hooks Reference

Here is a reference table that shows some events that might be emitted
by the various Runnable objects.
Definitions for some of the Runnable are included after the table.


| event | name | chunk | input | output |
|----------------------|------------------|---------------------------------|-----------------------------------------------|-------------------------------------------------|
| on_chat_model_start | [model name] | | {"messages": [[SystemMessage, HumanMessage]]} | |
| on_chat_model_stream | [model name] | AIMessageChunk(content="hello") | | |
| on_chat_model_end | [model name] | | {"messages": [[SystemMessage, HumanMessage]]} | {"generations": [...], "llm_output": None, ...} |
| on_llm_start | [model name] | | {'input': 'hello'} | |
| on_llm_stream | [model name] | 'Hello' | | |
| on_llm_end | [model name] | | 'Hello human!' |
| on_chain_start | format_docs | | | |
| on_chain_stream | format_docs | "hello world!, goodbye world!" | | |
| on_chain_end | format_docs | | [Document(...)] | "hello world!, goodbye world!" |
| on_tool_start | some_tool | | {"x": 1, "y": "2"} | |
| on_tool_stream | some_tool | {"x": 1, "y": "2"} | | |
| on_tool_end | some_tool | | | {"x": 1, "y": "2"} |
| on_retriever_start | [retriever name] | | {"query": "hello"} | |
| on_retriever_chunk | [retriever name] | {documents: [...]} | | |
| on_retriever_end | [retriever name] | | {"query": "hello"} | {documents: [...]} |
| on_prompt_start | [template_name] | | {"question": "hello"} | |
| on_prompt_end | [template_name] | | {"question": "hello"} | ChatPromptValue(messages: [SystemMessage, ...]) |


Here are declarations associated with the events shown above:

`format_docs`:

```python
def format_docs(docs: List[Document]) -> str:
    '''Format the docs.'''
    return ", ".join([doc.page_content for doc in docs])

format_docs = RunnableLambda(format_docs)
```

`some_tool`:

```python
@tool
def some_tool(x: int, y: str) -> dict:
    '''Some_tool.'''
    return {"x": x, "y": y}
```

`prompt`:

```python
template = ChatPromptTemplate.from_messages(
    [("system", "You are Cat Agent 007"), ("human", "{question}")]
).with_config({"run_name": "my_template", "tags": ["my_template"]})
```
2024-01-18 21:27:01 -05:00
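As a rough illustration of consuming an event stream like the one described in commit 177af65dc4, the sketch below filters streaming events from an async iterator. `fake_event_stream` is a hypothetical stand-in written with the standard library only, and the nesting of `chunk` and `output` under a `data` key is an assumption of this example, not something the commit message specifies.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List


async def fake_event_stream() -> AsyncIterator[Dict[str, Any]]:
    """Hypothetical stand-in that yields events shaped like the format_docs rows."""
    yield {"event": "on_chain_start", "name": "format_docs", "data": {}}
    yield {"event": "on_chain_stream", "name": "format_docs",
           "data": {"chunk": "hello world!, goodbye world!"}}
    yield {"event": "on_chain_end", "name": "format_docs",
           "data": {"output": "hello world!, goodbye world!"}}


async def collect_chunks(stream: AsyncIterator[Dict[str, Any]]) -> List[Any]:
    """Keep only the chunk payloads of events whose name ends in `_stream`."""
    chunks = []
    async for event in stream:
        if event["event"].endswith("_stream"):
            chunks.append(event["data"]["chunk"])
    return chunks


chunks = asyncio.run(collect_chunks(fake_event_stream()))
```

With a real Runnable, the same `async for` loop would presumably iterate over the result of `astream_events` instead of the fake generator.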
SN
f175bf7d7b
Use env for revision id if not passed in as param; use git describe as backup (#16227)
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2024-01-18 16:15:26 -08:00
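The lookup order named in the title of commit f175bf7d7b (explicit parameter, then environment, then `git describe`) might look roughly like the sketch below. The function name `resolve_revision_id` and the environment variable `EXAMPLE_REVISION_ID` are hypothetical; the commit does not state which variable or `git describe` flags are used.

```python
import os
import subprocess
from typing import Optional


def resolve_revision_id(revision_id: Optional[str] = None,
                        env_var: str = "EXAMPLE_REVISION_ID") -> Optional[str]:
    """Explicit argument first, then the environment, then `git describe`."""
    if revision_id:
        return revision_id
    from_env = os.environ.get(env_var)
    if from_env:
        return from_env
    try:
        result = subprocess.run(
            ["git", "describe", "--tags", "--always", "--dirty"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git is missing or we are not inside a repository
    return result.stdout.strip() or None
```

Returning `None` when every source fails leaves the decision about a default to the caller.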
Erick Friis
e5878c467a
infra: scheduled testing env (#16239) 2024-01-18 14:28:01 -08:00
Erick Friis
2f348c695a
infra: add nvidia api secret to integration testing (#15972) 2024-01-18 14:20:02 -08:00
Erick Friis
50959abf0c
infra: google cse id integration test (#16238) 2024-01-18 14:12:00 -08:00
Erick Friis
b9495da92d
langchain[patch]: fix stuff documents chain api docs render (#16159) 2024-01-18 14:07:44 -08:00
Erick Friis
eec3347939
docs: together cookbook import (#16236) 2024-01-18 14:07:19 -08:00
Erick Friis
92bc80483a
infra: google search api key (#16237) 2024-01-18 14:06:38 -08:00
Erick Friis
0e76d84137
google-vertexai[patch]: more integration test fixes (#16234) 2024-01-18 13:59:23 -08:00
Erick Friis
aa35b43bcd
docs, google-vertex[patch]: function docs (#16231) 2024-01-18 13:15:09 -08:00
Erick Friis
f2b2d59e82
docs: transport and client options docs (#16226)
2024-01-18 12:23:04 -08:00
Harrison Chase
f60f59d69f
google-vertexai[patch]: Harrison/vertex function calling (#16223)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-01-18 12:17:40 -08:00
Rajesh Thallam
6bc6d64a12
langchain_google_vertexai[patch]: Add support for SystemMessage for Gemini chat model (#15933)
- **Description:** In Google Vertex AI, Gemini chat models currently
do not support SystemMessage. This PR adds support for it
only if the user provides the additional convert_system_message_to_human flag
during model initialization (in this case, the SystemMessage is
prepended to the first HumanMessage). **NOTE:** The implementation is
similar to #14824


- **Twitter handle:** rajesh_thallam

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-01-18 10:22:07 -08:00
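The behaviour described in commit 6bc6d64a12, prepending the system message to the first human message, can be sketched with plain `(role, content)` tuples. The helper `merge_system_into_first_human` and the newline separator are assumptions for illustration; this is not the code in the Vertex AI integration.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def merge_system_into_first_human(messages: List[Message]) -> List[Message]:
    """Prepend system text to the first human message and drop the system entries."""
    pending = "\n".join(content for role, content in messages if role == "system")
    if not pending:
        return list(messages)
    merged: List[Message] = []
    for role, content in messages:
        if role == "system":
            continue
        if role == "human" and pending:
            content = f"{pending}\n{content}"
            pending = ""  # only the first human message receives the system text
        merged.append((role, content))
    # If there is no human message at all, the system text is dropped here;
    # a real implementation might prefer to raise instead.
    return merged


converted = merge_system_into_first_human(
    [("system", "Be brief."), ("human", "Hi"), ("ai", "Hello"), ("human", "Bye")]
)
```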
Erick Friis
65b231d40b
mistralai[patch]: async integration tests (#16214) 2024-01-18 09:45:44 -08:00
jzaldi
ed118950fe
docs: Updated integration docs structure for llm/google_vertex_ai_palm (#16091)
- **Description**: Updated doc for llm/google_vertex_ai_palm with new
functions: `invoke`, `stream`... Changed structure of the document to
match the required one.
- **Issue**: #15664 
- **Dependencies**: None
- **Twitter handle**: None

---------

Co-authored-by: Jorge Zaldívar <jzaldivar@google.com>
2024-01-18 09:45:27 -08:00
Bagatur
aa2e642ce3
docs: tool use nits (#16211) 2024-01-18 09:17:53 -08:00
Eugene Zapolsky
6b9e3ed9e9
google-vertexai[minor]: added safety_settings property to gemini wrapper (#15344)
**Description:** The Gemini model has quite annoying default
safety_settings, and the current VertexAI class does not provide a property
to override them.
This PR therefore aims to:
- add a safety_settings property to VertexAI
- fix incorrect LLM output parsing when the LLM responds with an
appropriate 'blocked' response
- fix incorrect LLM output parsing when the Gemini API blocks the
prompt itself as inappropriate
- add safety_settings related tests

I'm not familiar enough with the langchain code base and guidelines, so any
comments and/or suggestions are very welcome.
 
**Issue:** it will likely fix #14841

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-01-18 08:54:30 -08:00
Eugene Yurtsev
ecd4f0a7ec
core[patch]: testing add chat model for unit-tests (#16209)
This PR adds a fake chat model for testing purposes.

Used in this PR: https://github.com/langchain-ai/langchain/pull/16172
2024-01-18 11:30:53 -05:00
Bagatur
27ad65cc68
docs: add tool use diagrams (#16207) 2024-01-18 07:59:54 -08:00
SN
7d444724d7
Add revision identifier to run_on_dataset (#16167)
Allow specifying revision identifier for better project versioning
2024-01-17 20:27:43 -08:00
Eugene Yurtsev
5d8c147332
docs: Document and test PydanticOutputFunctionsParser (#15759)
This PR adds documentation and testing to
`PydanticOutputFunctionsParser(OutputFunctionsParser)`.
2024-01-17 18:21:18 -08:00
Christophe Bornet
3502a407d9
infra: Use dotenv in langchain-community's integration tests (#16137)
* Removed some env vars not used in langchain package IT
* Added Astra DB env vars in langchain package, used for cache tests
* Added conftest.py to load env vars in langchain_community IT
* Added .env.example in langchain_community IT
2024-01-17 18:18:26 -08:00
Nuno Campos
ca014d5b04
Update readme (#16160)
2024-01-17 13:56:07 -08:00
Tomaz Bratanic
1e80113ac9
community[patch]: Add neo4j timeout and value sanitization option (#16138)
The timeout function comes in handy when you want to kill long-running
queries.
The value sanitization removes all lists that are larger than 128
elements. The idea here is to remove embedding properties from results.
2024-01-17 13:22:19 -08:00
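The value sanitization described in commit 1e80113ac9 could be sketched as a recursive filter. The function name `sanitize` and the choice to recurse into nested dicts and lists are assumptions of this example; only the 128-element threshold comes from the commit message.

```python
from typing import Any

LIST_LIMIT = 128  # threshold quoted in the commit message


def sanitize(value: Any) -> Any:
    """Recursively drop lists longer than LIST_LIMIT; a dropped value becomes None."""
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            result = sanitize(item)
            if result is not None:  # note: this also drops keys whose value was None
                cleaned[key] = result
        return cleaned
    if isinstance(value, list):
        if len(value) > LIST_LIMIT:
            return None
        return [r for r in (sanitize(item) for item in value) if r is not None]
    return value


record = {"name": "node", "embedding": [0.0] * 1536, "tags": ["a", "b"]}
clean = sanitize(record)
```

Here the 1536-element `embedding` property is removed while the short `tags` list survives, which matches the stated goal of keeping embedding vectors out of query results.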
Bagatur
27ed2673da
docs: model io order (#16163) 2024-01-17 13:13:31 -08:00
Krishna Shedbalkar
f238217cea
community[patch]: Basic Logging and Human input to ShellTool (#15932)
- **Description:** The Shell tool is very versatile, but when it is integrated
into applications as OpenAI functions, developers have no visibility into
which command is being executed by the ShellTool. All one can see is:

![image](https://github.com/langchain-ai/langchain/assets/60742358/540e274a-debc-4564-9027-046b91424df3)

Summarising my feature request:
1. There's no visibility into what command was executed.
2. There's no mechanism to prevent a command from being executed by
ShellTool, such as a y/n human input that the user must accept before
the command runs.
  - **Issue:** #15931
  - **Dependencies:** None
  - **Twitter handle:** @krishnashed
2024-01-17 12:57:51 -08:00
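The two requests in commit f238217cea, logging the command and asking for y/n confirmation, can be sketched with the standard library. `run_with_confirmation`, its prompt text, and the injectable `ask` callable are hypothetical and do not reflect ShellTool's actual parameters.

```python
import shlex
import subprocess
from typing import Callable, Optional


def run_with_confirmation(command: str, ask: Callable[[str], str] = input) -> Optional[str]:
    """Log the command, ask y/n, and run it only on an explicit 'y'."""
    print(f"Executing command: {command}")
    answer = ask("Proceed with execution? (y/n): ").strip().lower()
    if answer != "y":
        return None  # anything other than 'y' is treated as a refusal
    result = subprocess.run(shlex.split(command), capture_output=True, text=True)
    return result.stdout
```

Passing `ask` as a parameter keeps the function testable: a test can supply `lambda _: "n"` instead of blocking on real keyboard input.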
Bagatur
2af813c7eb
docs: bump sphinx>=5 (#16162) 2024-01-17 12:57:34 -08:00
Bagatur
679a3ae933
openai[patch]: clarify azure error (#16157) 2024-01-17 12:43:14 -08:00
Bagatur
7ad9eba8f4
core[patch]: Release 0.1.12 (#16161) 2024-01-17 12:39:45 -08:00