Commit Graph

8685 Commits (56fe4ab382631a12a08929c499008bd221d9ee49)
 

Author SHA1 Message Date
DrKroll c4da8d0813
langchain[patch]: load ReadFileTool (#14301)
---------

Co-authored-by: Dr. Simon Kroll <krolls@fida.de>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
anshaneel 0884e5de7f
community[minor]: Add Alpha Vantage API Tool (#14332)
### Description
This implementation adds functionality from the AlphaVantage API,
renowned for its comprehensive financial data. The class encapsulates
various methods, each dedicated to fetching specific types of financial
information from the API.

### Implemented Functions

- **`search_symbols`**: 
- Searches the AlphaVantage API for financial symbols using the provided
keywords.

- **`_get_market_news_sentiment`**: 
- Retrieves market news sentiment for a specified stock symbol from the
AlphaVantage API.

- **`_get_time_series_daily`**: 
- Fetches daily time series data for a specific symbol from the
AlphaVantage API.

- **`_get_quote_endpoint`**: 
- Obtains the latest price and volume information for a given symbol
from the AlphaVantage API.

- **`_get_time_series_weekly`**: 
- Gathers weekly time series data for a particular symbol from the
AlphaVantage API.

- **`_get_top_gainers_losers`**: 
- Provides details on top gainers, losers, and most actively traded
tickers in the US market from the AlphaVantage API.

  ### Issue: 
  - #11994 
  
### Dependencies: 
  - 'requests' library for HTTP requests. (import requests)
  - 'pytest' library for testing. (import pytest)

---------

Co-authored-by: Adam Badar <94140103+adam-badar@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Alex Sherstinsky a9bc212bf2
community[minor]: fix failing Predibase integration (#19776)
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Langchain-Predibase integration was failing, because
it was not current with the Predibase SDK; in addition, Predibase
integration tests were instantiating the Langchain Community `Predibase`
class with one required argument (`model`) missing. This change updates
the Predibase SDK usage and fixes the integration tests.
    - **Twitter handle:** `@alexsherstinsky`


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
ethynic e9caa22d47
community[patch]: Update minimax.py (#14384)
MiniMaxChat class _generate method shoud return a ChatResult object not
str

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Ahmed Moubtahij f5d4ce840f
langchain[patch]: Simplify ensemble retriever (#14427)
- **Description:** code simplification to improve readability and remove
unnecessary memory allocations.
  - **Tag maintainer**: @baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Snehil Kumar b36f4147b0
docs: Google Drive Loader always set the env var (#14791)
- **Description:** Code written by following, the official documentation
of [Google Drive
Loader](https://python.langchain.com/docs/integrations/document_loaders/google_drive),
gives errors. I have opened an issue regarding this. See #14725. This is
a pull request for modifying the documentation to use an approach that
makes the code work. Basically, the change is that we need to always set
the GOOGLE_APPLICATION_CREDENTIALS env var to an emtpy string, rather
than only in case of RefreshError. Also, rewrote 2 paragraphs to make
the instructions more clear.
- **Issue:** See this related [issue #
14725](https://github.com/langchain-ai/langchain/issues/14725)
  - **Dependencies:** NA
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** NA

Co-authored-by: Snehil <snehil@example.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
M.Abdulrahman Alnaseer ba54f1577f
community[minor]: add support for llmsherpa (#19741)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: added support for llmsherpa library"

- [x] **Add tests and docs**: 
1. Integration test:
'docs/docs/integrations/document_loaders/test_llmsherpa.py'.
2. an example notebook:
`docs/docs/integrations/document_loaders/llmsherpa.ipynb`.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Naveenkhasyap a99bd098ac
docs: fix for #16702 and #16703 (#16705)
- **Description:** Quickstart Documentation updates for missing
dependency installation steps.
- **Issue:** the issue # it prompts users to install required
dependency.
  - **Dependencies:** no,
  - **Twitter handle:** @naveenkashyap_

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Brace Sproul 6d93a03bef
docs[patch]: Fix or remove broken mdx links (#19777)
this pr also drops the community added action for checking broken links
in mdx. It does not work well for our use case, throwing errors for
local paths, plus the rest of the errors our in house solution had.
4 months ago
Bagatur 2f5606a318
mistralai[patch]: correct integration_test (#19774) 4 months ago
Pierre Véron ace7b66261
mistralai[patch]: add missing _combine_llm_outputs implementation in ChatMistralAI (#18603)
# Description
Implementing `_combine_llm_outputs` to `ChatMistralAI` to override the
default implementation in `BaseChatModel` returning `{}`. The
implementation is inspired by the one in `ChatOpenAI` from package
`langchain-openai`.
# Issue
None
# Dependencies
None
# Twitter handle
None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
lvliang-intel 0175906437
templates: add RAG template for Intel Xeon Scalable Processors (#18424)
**Description:**
This template utilizes Chroma and TGI (Text Generation Inference) to
execute RAG on the Intel Xeon Scalable Processors. It serves as a
demonstration for users, illustrating the deployment of the RAG service
on the Intel Xeon Scalable Processors and showcasing the resulting
performance enhancements.

**Issue:**
None

**Dependencies:**
The template contains the poetry project requirements to run this
template.
CPU TGI batching is WIP.

**Twitter handle:**
None

---------

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Nuno Campos d4673a3507
openai[patch]: Update openai chat model to new base class interface (#19729) 4 months ago
harry-cohere 23fcc14650
cohere[patch]: support kwargs in with_structured_output (#19736)
**Description:** We'd like to support passing additional kwargs in
`with_structured_output`. I believe this is the accepted approach to
enable additional arguments on API calls.
4 months ago
Brace Sproul ce0a588ae6
docs[minor]: Add chat model tabs to docs pages (#19589) 4 months ago
BeatrixCohere bd02b83acd
cohere[patch]: Allow overriding of the base URL in Cohere Client (#19766)
This PR adds the ability for a user to override the base API url for the
Cohere client for embeddings and chat llm.
4 months ago
Nisarg Trivedi 1252ccce6f
text-splitters[minor]: Added Haskell support in langchain.text_splitter module (#16191)
- **Description:** Haskell language support added in text_splitter
module
  - **Dependencies:** No
  - **Twitter handle:** @nisargtr

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Hrvoje Milković b7344e3347
community[minor]: Infobip tool integration (#16805)
**Description:** Adding Tool that wraps Infobip API for sending sms or
emails and email validation.
**Dependencies:** None,
**Twitter handle:** @hmilkovic

Implementation:
```
libs/community/langchain_community/utilities/infobip.py
```

Integration tests:
```
libs/community/tests/integration_tests/utilities/test_infobip.py
```

Example notebook:
```
docs/docs/integrations/tools/infobip.ipynb
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Luka Krapic 727a2ea9f1
community[patch]: history size support for DynamoDBChatMessageHistory (#16794)
**Description:** PR adds support for limiting number of messages
preserved in a session history for DynamoDBChatMessageHistory

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Dt22 6dbf1a2de0
community[patch]: fix redis input type for index_schema field (#16874)
### Subject: Fix Type Misdeclaration for index_schema in redis/base.py

I noticed a type misdeclaration for the index_schema column in the
redis/base.py file.

When following the instructions outlined in [Redis Custom Metadata
Indexing](https://python.langchain.com/docs/integrations/vectorstores/redis)
to create our own index_schema, it leads to a Pylance type error. <br/>
**The error message indicates that Dict[str, list[Dict[str, str]]] is
incompatible with the type Optional[Union[Dict[str, str], str,
os.PathLike]].**

```
index_schema = {
    "tag": [{"name": "credit_score"}],
    "text": [{"name": "user"}, {"name": "job"}],
    "numeric": [{"name": "age"}],
}

rds, keys = Redis.from_texts_return_keys(
    texts,
    embeddings,
    metadatas=metadata,
    redis_url="redis://localhost:6379",
    index_name="users_modified",
    index_schema=index_schema,  
)
```
Therefore, I have created this pull request to rectify the type
declaration problem.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
morgana 074ad5095f
community[patch]: mmr search for Rockset vectorstore integration (#16908)
- **Description:** Adding support for mmr search in the Rockset
vectorstore integration.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** `@_morgan_adams_`

---------

Co-authored-by: Rockset API Bot <admin@rockset.io>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
shahrin014 f51e6a35ba
community[patch]: OllamaEmbeddings - Pass headers to post request (#16880)
## Feature
- Set additional headers in constructor
- Headers will be sent in post request

This feature is useful if deploying Ollama on a cloud service such as
hugging face, which requires authentication tokens to be passed in the
request header.

## Tests
- Test if header is passed
- Test if header is not passed

Similar to https://github.com/langchain-ai/langchain/pull/15881

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Lance Martin e0f137dbe0
docs: Agentic and Self-RAG w/ LangGraph (#16910)
To do:
[ ] Add streaming
[ ] Move to LangGraph
4 months ago
Jan Chorowski b8b42ccbc5
community[minor]: Pathway vectorstore(#14859)
- **Description:** Integration with pathway.com data processing pipeline
acting as an always updated vectorstore
  - **Issue:** not applicable
- **Dependencies:** optional dependency on
[`pathway`](https://pypi.org/project/pathway/)
  - **Twitter handle:** pathway_com

The PR provides and integration with `pathway` to provide an easy to use
always updated vector store:

```python
import pathway as pw
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import PathwayVectorClient, PathwayVectorServer

data_sources = []
data_sources.append(
    pw.io.gdrive.read(object_id="17H4YpBOAKQzEJ93xmC2z170l0bP2npMy", service_user_credentials_file="credentials.json", with_metadata=True))

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
embeddings_model = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])
vector_server = PathwayVectorServer(
    *data_sources,
    embedder=embeddings_model,
    splitter=text_splitter,
)
vector_server.run_server(host="127.0.0.1", port="8765", threaded=True, with_cache=False)
client = PathwayVectorClient(
    host="127.0.0.1",
    port="8765",
)
query = "What is Pathway?"
docs = client.similarity_search(query)
```

The `PathwayVectorServer` builds a data processing pipeline which
continusly scans documents in a given source connector (google drive,
s3, ...) and builds a vector store. The `PathwayVectorClient` implements
LangChain's `VectorStore` interface and connects to the server to
retrieve documents.

---------

Co-authored-by: Mateusz Lewandowski <lewymati@users.noreply.github.com>
Co-authored-by: mlewandowski <mlewandowski@MacBook-Pro-mlewandowski.local>
Co-authored-by: Berke <berkecanrizai1@gmail.com>
Co-authored-by: Adrian Kosowski <adrian@pathway.com>
Co-authored-by: mlewandowski <mlewandowski@macbook-pro-mlewandowski.home>
Co-authored-by: berkecanrizai <63911408+berkecanrizai@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: mlewandowski <mlewandowski@MBPmlewandowski.ht.home>
Co-authored-by: Szymon Dudycz <szymond@pathway.com>
Co-authored-by: Szymon Dudycz <szymon.dudycz@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
ccurme 0dbd5f5012
add script to check imports (#19611) 4 months ago
Arturs Konfino 2319212d54
community[patch]: avoid executing `toolkit.get_context()` when not necessary (#19762)
If `prompt` is passed into `create_sql_agent()`, then
`toolkit.get_context()` shouldn't be executed against the database
unless relevant prompt variables (`table_info` or `table_names`) are
present .
4 months ago
高璟琦 ec7a59c96c
community[minor]: Add solar embedding (#19761)
Solar is a large language model developed by
[Upstage](https://upstage.ai/). It's a powerful and purpose-trained LLM.
You can visit the embedding service provided by Solar within this pr.

You may get **SOLAR_API_KEY** from
https://console.upstage.ai/services/embedding
You can refer to more details about accepted llm integration at
https://python.langchain.com/docs/integrations/llms/solar.
4 months ago
Tomaz Bratanic dec00d3050
community[patch]: Add the ability to pass maps to neo4j retrieval query (#19758)
Makes it easier to flatten complex values to text, so you don't have to
use a lot of Cypher to do it.
4 months ago
Robby f7e8a382cc
community[minor]: add hugging face text-to-speech inference API (#18880)
Description: I implemented a tool to use Hugging Face text-to-speech
inference API.

Issue: n/a

Dependencies: n/a

Twitter handle: No Twitter, but do have
[LinkedIn](https://www.linkedin.com/in/robby-horvath/) lol.

---------

Co-authored-by: Robby <h0rv@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
DasDingoCodes 73eb3f8fd9
community[minor]: Implement DirectoryLoader lazy_load function (#19537)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: Implement DirectoryLoader lazy_load
function"

- [x] **Description**: The `lazy_load` function of the `DirectoryLoader`
yields each document separately. If the given `loader_cls` of the
`DirectoryLoader` also implemented `lazy_load`, it will be used to yield
subdocuments of the file.

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access:
`libs/community/tests/unit_tests/document_loaders/test_directory_loader.py`
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory:
`docs/docs/integrations/document_loaders/directory.ipynb`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
Christophe Bornet 6b2b511f68
core[minor]: Add aformat_messages to FewShotChatMessagePromptTemplate and ChatPromptTemplate (#19648)
Needed since the example selector may use a vector store.
4 months ago
Leonid Ganeline 5f814820f6
docs: providers pinecone fix (#19737)
Current providers page use link to the old package.
- Fixed installation instructions
- Added a reference to the Pinecone retriever
4 months ago
Bob Lin 53a74ad12b
docs: use markdown cell instead of code block (#19740)
I found that the code of async and async batch was divided into two
blocks:

<img width="823" alt="Screenshot 2024-03-29 at 7 45 59 AM"
src="https://github.com/langchain-ai/langchain/assets/10000925/0fa59d29-a692-4309-afb8-2260f03242ec">


so I changed it to unified.
4 months ago
Ekaterina Aidova 4ce36af335
docs: fix link in openvino integration doc (#19749)
- **Description:** fix incorrect link in docs
 - **Dependencies:** None
4 months ago
Jialei f7c903e24a
community[minor]: add support for Moonshot llm and chat model (#17100) 4 months ago
Gustavo Isturiz 824dccf5e2
docs: fixed xml URL on sitemap docs exmaple, issue #17236 (#17304) 4 months ago
Ethan Yang 7164015135
community[minor]: Add Openvino embedding support (#19632)
This PR is used to support both HF and BGE embeddings with openvino

---------

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
4 months ago
Guangdong Liu cd55d587c2
langchain[patch]: Upgrade openai's sdk and solve some interface adaptation problems. (#19548)
- **Issue:** close #19534
4 months ago
Kirushikesh DB 12861273e1
experimental[patch]: Removed 'SQLResults:' from the LLMResponse in SQLDatabaseChain (#17104)
**Description:** 
When using the SQLDatabaseChain with Llama2-70b LLM and, SQLite
database. I was getting `Warning: You can only execute one statement at
a time.`.

```
from langchain.sql_database import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain

sql_database_path = '/dccstor/mmdataretrieval/mm_dataset/swimming_record/rag_data/swimmingdataset.db'
sql_db = get_database(sql_database_path)
db_chain = SQLDatabaseChain.from_llm(mistral, sql_db, verbose=True, callbacks = [callback_obj])
db_chain.invoke({
    "query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
})
```
Error:
```
Warning                                   Traceback (most recent call last)
Cell In[31], line 3
      1 import langchain
      2 langchain.debug=False
----> 3 db_chain.invoke({
      4     "query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
      5 })

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:162, in Chain.invoke(self, input, config, **kwargs)
    160 except BaseException as e:
    161     run_manager.on_chain_error(e)
--> 162     raise e
    163 run_manager.on_chain_end(outputs)
    164 final_outputs: Dict[str, Any] = self.prep_outputs(
    165     inputs, outputs, return_only_outputs
    166 )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:156, in Chain.invoke(self, input, config, **kwargs)
    149 run_manager = callback_manager.on_chain_start(
    150     dumpd(self),
    151     inputs,
    152     name=run_name,
    153 )
    154 try:
    155     outputs = (
--> 156         self._call(inputs, run_manager=run_manager)
    157         if new_arg_supported
    158         else self._call(inputs)
    159     )
    160 except BaseException as e:
    161     run_manager.on_chain_error(e)

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:198, in SQLDatabaseChain._call(self, inputs, run_manager)
    194 except Exception as exc:
    195     # Append intermediate steps to exception, to aid in logging and later
    196     # improvement of few shot prompt seeds
    197     exc.intermediate_steps = intermediate_steps  # type: ignore
--> 198     raise exc

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:143, in SQLDatabaseChain._call(self, inputs, run_manager)
    139     intermediate_steps.append(
    140         sql_cmd
    141     )  # output: sql generation (no checker)
    142     intermediate_steps.append({"sql_cmd": sql_cmd})  # input: sql exec
--> 143     result = self.database.run(sql_cmd)
    144     intermediate_steps.append(str(result))  # output: sql exec
    145 else:

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:436, in SQLDatabase.run(self, command, fetch, include_columns)
    425 def run(
    426     self,
    427     command: str,
    428     fetch: Literal["all", "one"] = "all",
    429     include_columns: bool = False,
    430 ) -> str:
    431     """Execute a SQL command and return a string representing the results.
    432 
    433     If the statement returns rows, a string of the results is returned.
    434     If the statement returns no rows, an empty string is returned.
    435     """
--> 436     result = self._execute(command, fetch)
    438     res = [
    439         {
    440             column: truncate_word(value, length=self._max_string_length)
   (...)
    443         for r in result
    444     ]
    446     if not include_columns:

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:413, in SQLDatabase._execute(self, command, fetch)
    410     elif self.dialect == "postgresql":  # postgresql
    411         connection.exec_driver_sql("SET search_path TO %s", (self._schema,))
--> 413 cursor = connection.execute(text(command))
    414 if cursor.returns_rows:
    415     if fetch == "all":

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1416, in Connection.execute(self, statement, parameters, execution_options)
   1414     raise exc.ObjectNotExecutableError(statement) from err
   1415 else:
-> 1416     return meth(
   1417         self,
   1418         distilled_parameters,
   1419         execution_options or NO_OPTIONS,
   1420     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/sql/elements.py:516, in ClauseElement._execute_on_connection(self, connection, distilled_params, execution_options)
    514     if TYPE_CHECKING:
    515         assert isinstance(self, Executable)
--> 516     return connection._execute_clauseelement(
    517         self, distilled_params, execution_options
    518     )
    519 else:
    520     raise exc.ObjectNotExecutableError(self)

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1639, in Connection._execute_clauseelement(self, elem, distilled_parameters, execution_options)
   1627 compiled_cache: Optional[CompiledCacheType] = execution_options.get(
   1628     "compiled_cache", self.engine._compiled_cache
   1629 )
   1631 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
   1632     dialect=dialect,
   1633     compiled_cache=compiled_cache,
   (...)
   1637     linting=self.dialect.compiler_linting | compiler.WARN_LINTING,
   1638 )
-> 1639 ret = self._execute_context(
   1640     dialect,
   1641     dialect.execution_ctx_cls._init_compiled,
   1642     compiled_sql,
   1643     distilled_parameters,
   1644     execution_options,
   1645     compiled_sql,
   1646     distilled_parameters,
   1647     elem,
   1648     extracted_params,
   1649     cache_hit=cache_hit,
   1650 )
   1651 if has_events:
   1652     self.dispatch.after_execute(
   1653         self,
   1654         elem,
   (...)
   1658         ret,
   1659     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1848, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw)
   1843     return self._exec_insertmany_context(
   1844         dialect,
   1845         context,
   1846     )
   1847 else:
-> 1848     return self._exec_single_context(
   1849         dialect, context, statement, parameters
   1850     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1988, in Connection._exec_single_context(self, dialect, context, statement, parameters)
   1985     result = context._setup_result_proxy()
   1987 except BaseException as e:
-> 1988     self._handle_dbapi_exception(
   1989         e, str_statement, effective_parameters, cursor, context
   1990     )
   1992 return result

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:2346, in Connection._handle_dbapi_exception(self, e, statement, parameters, cursor, context, is_sub_exec)
   2344     else:
   2345         assert exc_info[1] is not None
-> 2346         raise exc_info[1].with_traceback(exc_info[2])
   2347 finally:
   2348     del self._reentrant_error

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1969, in Connection._exec_single_context(self, dialect, context, statement, parameters)
   1967                 break
   1968     if not evt_handled:
-> 1969         self.dialect.do_execute(
   1970             cursor, str_statement, effective_parameters, context
   1971         )
   1973 if self._has_events or self.engine._has_events:
   1974     self.dispatch.after_cursor_execute(
   1975         self,
   1976         cursor,
   (...)
   1980         context.executemany,
   1981     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/default.py:922, in DefaultDialect.do_execute(self, cursor, statement, parameters, context)
    921 def do_execute(self, cursor, statement, parameters, context=None):
--> 922     cursor.execute(statement, parameters)

Warning: You can only execute one statement at a time.
```
**Issue:** 
The Error occurs because when generating the SQLQuery, the llm_input
includes the stop character of "\nSQLResult:", so for this user query
the LLM generated response is **SELECT Time FROM men_butterfly_100m
WHERE Swimmer = 'Lance Larson';\nSQLResult:** it is required to remove
the SQLResult suffix on the llm response before executing it on the
database.

```
llm_inputs = {
            "input": input_text,
            "top_k": str(self.top_k),
            "dialect": self.database.dialect,
            "table_info": table_info,
            "stop": ["\nSQLResult:"],
        }

sql_cmd = self.llm_chain.predict(
                callbacks=_run_manager.get_child(),
                **llm_inputs,
            ).strip()

if SQL_RESULT in sql_cmd:
    sql_cmd = sql_cmd.split(SQL_RESULT)[0].strip()
result = self.database.run(sql_cmd)
```


<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
T Cramer 540ebf35a9
community[patch]: Add explicit error message to Bedrock error output. (#17328)
- **Description:** Propagate Bedrock errors into Langchain explicitly.
Use-case: unset region error is hidden behind 'Could not load
credentials...' message
- **Issue:**
[17654](https://github.com/langchain-ai/langchain/issues/17654)
  - **Dependencies:** None

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Marcus Virginia 69bb96c80f
community[patch]: surrealdb handle for empty metadata and allow collection names with complex characters (#17374)
- **Description:** Handle for empty metadata and allow collection names
with complex characters
  - **Issue:** #17057
  - **Dependencies:** `surrealdb`

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
ale-delfino 0df76bee37
core[patch]:: XML parser to cover the case when the xml only contains the root level tag (#17456)
Description: Fix xml parser to handle strings that only contain the root
tag
Issue: N/A
Dependencies: None
Twitter handle: N/A

A valid xml text can contain only the root level tag. Example: <body>
  Some text here
</body>
The example above is a valid xml string. If parsed with the current
implementation the result is {"body": []}. This fix checks if the root
level text contains any non-whitespace character and if that's the case
it returns {root.tag: root.text}. The result is that the above text is
correctly parsed as {"body": "Some text here"}

@ale-delfino

Thank you for contributing to LangChain!

Checklist:

- [x] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
- [x] PR message: **Delete this entire template message** and replace it
with the following bulleted list
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @efriis, @eyurtsev, @hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
kYLe 124ab79c23
community[minor]: Add Anyscale embedding support (#17605)
**Description:** Add embedding model support for Anyscale Endpoint
**Dependencies:** openai

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Lance Martin 12843f292f
community[patch]: llama cpp embeddings reset default n_batch (#17594)
When testing Nomic embeddings --
```
from langchain_community.embeddings import LlamaCppEmbeddings
embd_model_path = "/Users/rlm/Desktop/Code/llama.cpp/models/nomic-embd/nomic-embed-text-v1.Q4_K_S.gguf"
embd_lc = LlamaCppEmbeddings(model_path=embd_model_path)
embedding_lc = embd_lc.embed_query(query)
```

We were seeing this error for strings > a certain size -- 
```
File ~/miniforge3/envs/llama2/lib/python3.9/site-packages/llama_cpp/llama.py:827, in Llama.embed(self, input, normalize, truncate, return_count)
    824     s_sizes = []
    826 # add to batch
--> 827 self._batch.add_sequence(tokens, len(s_sizes), False)
    828 t_batch += n_tokens
    829 s_sizes.append(n_tokens)

File ~/miniforge3/envs/llama2/lib/python3.9/site-packages/llama_cpp/_internals.py:542, in _LlamaBatch.add_sequence(self, batch, seq_id, logits_all)
    540 self.batch.token[j] = batch[i]
    541 self.batch.pos[j] = i
--> 542 self.batch.seq_id[j][0] = seq_id
    543 self.batch.n_seq_id[j] = 1
    544 self.batch.logits[j] = logits_all

ValueError: NULL pointer access
```

The default `n_batch` of llama-cpp-python's Llama is `512` but we were
explicitly setting it to `8`.
 
These need to be set to equal for embedding models. 
* The embedding.cpp example has an assertion to make sure these are
always equal.
* Apparently this is not being done properly in llama-cpp-python.

With `n_batch` set to 8, if more than 8 tokens are passed the batch runs
out of space and it crashes.

This also explains why the CPU compute buffer size was small:

raw client with default `n_batch=512`
```
llama_new_context_with_model:        CPU input buffer size   =     3.51 MiB
llama_new_context_with_model:        CPU compute buffer size =    21.00 MiB
```
langchain with `n_batch=8`
```
llama_new_context_with_model:        CPU input buffer size   =     0.04 MiB
llama_new_context_with_model:        CPU compute buffer size =     0.33 MiB
```

We can work around this by passing `n_batch=512`, but this will not be
obvious to some users:
```
    embedding = LlamaCppEmbeddings(model_path=embd_model_path,
                                   n_batch=512)
```

From discussion w/ @cebtenzzre. Related:

https://github.com/abetlen/llama-cpp-python/issues/1189

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Zijian Han 8e976545f3
community[patch]: support OpenAI whisper base url (#17695)
**Description:** The base URL for OpenAI is retrieved from the
environment variable "OPENAI_BASE_URL", whereas for langchain it is
obtained from "OPENAI_API_BASE". By adding `base_url =
os.environ.get("OPENAI_API_BASE")`, the OpenAI proxy can execute
correctly.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Paulo Nascimento 44a3484503
community[patch]: add NotebookLoader unit test (#17721)
Thank you for contributing to LangChain!

- **Description:** added unit tests for NotebookLoader. Linked PR:
https://github.com/langchain-ai/langchain/pull/17614
- **Issue:**
[#17614](https://github.com/langchain-ai/langchain/pull/17614)
    - **Twitter handle:** @paulodoestech
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: lachiewalker <lachiewalker1@hotmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Paulo Nascimento 4c3a67122f
community[patch]: add Integration for OpenAI image gen with v1 sdk (#17771)
**Description:** Created a Langchain Tool for OpenAI DALLE Image
Generation.
**Issue:**
[#15901](https://github.com/langchain-ai/langchain/issues/15901)
**Dependencies:** n/a
**Twitter handle:** @paulodoestech

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Kaixin Yang a8104ea8e9
openai[patch]: add checking codes for calling AI model get error (#17909)
**Description:**: adding checking codes for calling AI model get error
in chat_models/base.py and llms/base.py
**Issue**: Sometimes the AI Model calling will get error, we should
raise it.
Otherwise, the next code 'choices.extend(response["choices"])' will
throw a "TypeError: 'NoneType' object is not iterable" error to mask the
true error.
       Because 'response["choices"]' is None.
**Dependencies**: None

---------

Co-authored-by: yangkx <yangkx@asiainfo-int.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Vincent Chen 833d61adb3
docs: update Together README.md (#18004)
## PR message
**Description:** This PR adds a README file for the Together API in the
`libs/partners` folder of this repository. The README includes:
 - A brief description of the package
 - Installation instructions and class introductions
 - Simple usage examples

**Issue:** #17545 

This PR only contains document changes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Jiaming 3d3cc71287
community[patch]: fix bugs for bilibili Loader (#18036)
- **Description:** 
1. Fix the BiliBiliLoader that can receive cookie parameters, it
requires 3 other parameters to run. The change is backward compatible.
  2. Add test;      
  3. Add example in docs

- **Issue:** [#14213]

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago