Commit Graph

25 Commits (4367e89c9a8adbe004347da2867cf5fdef8a2a92)

Author SHA1 Message Date
Yuwen Hu ba0dca46d7
community[minor]: Add IPEX-LLM BGE embedding support on both Intel CPU and GPU (#22226)
**Description:** [IPEX-LLM](https://github.com/intel-analytics/ipex-llm)
is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local
PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low
latency. This PR adds ipex-llm integrations to langchain for BGE
embedding support on both Intel CPU and GPU.
**Dependencies:** `ipex-llm`, `sentence-transformers`
**Contribution maintainer**: @Oscilloscope98 
**tests and docs**: 
- langchain/docs/docs/integrations/text_embedding/ipex_llm.ipynb
- langchain/docs/docs/integrations/text_embedding/ipex_llm_gpu.ipynb
-
langchain/libs/community/tests/integration_tests/embeddings/test_ipex_llm.py

---------

Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>
4 months ago
acho98 45ed5f3f51
community[minor]: Add Clova Embeddings for LangChain Community (#21890)
- [ ] **PR title**: "Add Naver ClovaX embedding to LangChain community"
- HyperClovaX is a large language model developed by
[Naver](https://clova-x.naver.com/welcome).
It's a powerful and purpose-trained LLM.

- You can visit the embedding service provided by
[ClovaX](https://www.ncloud.com/product/aiService/clovaStudio)

- You may get CLOVA_EMB_API_KEY, CLOVA_EMB_APIGW_API_KEY,
CLOVA_EMB_APP_ID From
https://www.ncloud.com/product/aiService/clovaStudio

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Jorge Piedrahita Ortiz e65652c3e8
community: add SambaNova embeddings integration (#21227)
- **Description:**  SambaNova hosted embeddings integration
5 months ago
Rohan Aggarwal 8021d2a2ab
community[minor]: Oraclevs integration (#21123)
Thank you for contributing to LangChain!

- Oracle AI Vector Search 
Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.


- Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.
This Pull Requests Adds the following functionalities
Oracle AI Vector Search : Vector Store
Oracle AI Vector Search : Document Loader
Oracle AI Vector Search : Document Splitter
Oracle AI Vector Search : Summary
Oracle AI Vector Search : Oracle Embeddings


- We have added unit tests and have our own local unit test suite which
verifies all the code is correct. We have made sure to add guides for
each of the components and one end to end guide that shows how the
entire thing runs.


- We have made sure that make format and make lint run clean.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: skmishraoracle <shailendra.mishra@oracle.com>
Co-authored-by: hroyofc <harichandan.roy@oracle.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
pjb157 479be3cc91
community[minor]: Unify Titan Takeoff Integrations and Adding Embedding Support (#18775)
**Community: Unify Titan Takeoff Integrations and Adding Embedding
Support**

 **Description:** 
Titan Takeoff no longer reflects this either of the integrations in the
community folder. The two integrations (TitanTakeoffPro and
TitanTakeoff) where causing confusion with clients, so have moved code
into one place and created an alias for backwards compatibility. Added
Takeoff Client python package to do the bulk of the work with the
requests, this is because this package is actively updated with new
versions of Takeoff. So this integration will be far more robust and
will not degrade as badly over time.

**Issue:**
Fixes bugs in the old Titan integrations and unified the code with added
unit test converge to avoid future problems.

**Dependencies:**
Added optional dependency takeoff-client, all imports still work without
dependency including the Titan Takeoff classes but just will fail on
initialisation if not pip installed takeoff-client

**Twitter**
@MeryemArik9

Thanks all :)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Leonid Ganeline 4cb5f4c353
community[patch]: import flattening fix (#20110)
This PR should make it easier for linters to do type checking and for IDEs to jump to definition of code.

See #20050 as a template for this PR.
- As a byproduct: Added 3 missed `test_imports`.
- Added missed `SolarChat` in to __init___.py Added it into test_import
ut.
- Added `# type: ignore` to fix linting. It is not clear, why linting
errors appear after ^ changes.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
5 months ago
高璟琦 ec7a59c96c
community[minor]: Add solar embedding (#19761)
Solar is a large language model developed by
[Upstage](https://upstage.ai/). It's a powerful and purpose-trained LLM.
You can visit the embedding service provided by Solar within this pr.

You may get **SOLAR_API_KEY** from
https://console.upstage.ai/services/embedding
You can refer to more details about accepted llm integration at
https://python.langchain.com/docs/integrations/llms/solar.
6 months ago
Ethan Yang 7164015135
community[minor]: Add Openvino embedding support (#19632)
This PR is used to support both HF and BGE embeddings with openvino

---------

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
6 months ago
kYLe 124ab79c23
community[minor]: Add Anyscale embedding support (#17605)
**Description:** Add embedding model support for Anyscale Endpoint
**Dependencies:** openai

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
yuwenzho 3a7d2cf443
community[minor]: Add ITREX optimized Embeddings (#18474)
Introduction
[Intel® Extension for
Transformers](https://github.com/intel/intel-extension-for-transformers)
is an innovative toolkit designed to accelerate GenAI/LLM everywhere
with the optimal performance of Transformer-based models on various
Intel platforms

Description

adding ITREX runtime embeddings using intel-extension-for-transformers.
added mdx documentation and example notebooks
added embedding import testing.

---------

Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Anindyadeep b2a11ce686
community[minor]: Prem AI langchain integration (#19113)
### Prem SDK integration in LangChain

This PR adds the integration with [PremAI's](https://www.premai.io/)
prem-sdk with langchain. User can now access to deployed models
(llms/embeddings) and use it with langchain's ecosystem. This PR adds
the following:

### This PR adds the following:

- [x]  Add chat support
- [X]  Adding embedding support
- [X]  writing integration tests
    - [X]  writing tests for chat 
    - [X]  writing tests for embedding
- [X]  writing unit tests
    - [X]  writing tests for chat 
    - [X]  writing tests for embedding
- [X]  Adding documentation
    - [X]  writing documentation for chat
    - [X]  writing documentation for embedding
- [X] run `make test`
- [X] run `make lint`, `make lint_diff` 
- [X]  Final checks (spell check, lint, format and overall testing)

---------

Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
6 months ago
Dmitry Tyumentsev 08b769d539
community[patch]: YandexGPT Use recent yandexcloud sdk version (#19341)
Fixed inability to work with [yandexcloud
SDK](https://pypi.org/project/yandexcloud/) version higher 0.265.0
6 months ago
Mikelarg dac2e0165a
community[minor]: Added GigaChat Embeddings support + updated previous GigaChat integration (#19516)
- **Description:** Added integration with
[GigaChat](https://developers.sber.ru/portal/products/gigachat)
embeddings. Also added support for extra fields in GigaChat LLM and
fixed docs.
6 months ago
Kate Silverstein b7c71e2e07
community[minor]: llamafile embeddings support (#17976)
* **Description:** adds `LlamafileEmbeddings` class implementation for
generating embeddings using
[llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models.
Includes related unit tests and notebook showing example usage.
* **Issue:** N/A
* **Dependencies:** N/A
7 months ago
Dan Stambler 69344a0661
community: Add Laser Embedding Integration (#18111)
- **Description:** Added Integration with Meta AI's LASER
Language-Agnostic SEntence Representations embedding library, which
supports multilingual embedding for any of the languages listed here:
https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200,
including several low resource languages
- **Dependencies:** laser_encoders
7 months ago
Michael Feil 242981b8f0
community[minor]: infinity embedding local option (#17671)
**drop-in-replacement for sentence-transformers
inference.**

https://github.com/langchain-ai/langchain/discussions/17670

tldr from the discussion above -> around a 4x-22x speedup over using
SentenceTransformers / huggingface embeddings. For more info:
https://github.com/michaelfeil/infinity (pure-python dependency)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Guangdong Liu 3ba1cb8650
community[minor]: Add SparkLLM Text Embedding Model and SparkLLM introduction (#17573) 7 months ago
Moshe Berchansky 20a56fe0a2
community[minor]: Add QuantizedEmbedders (#17391)
**Description:** 
* adding Quantized embedders using optimum-intel and
intel-extension-for-pytorch.
* added mdx documentation and example notebooks 
* added embedding import testing.

**Dependencies:** 
optimum = {extras = ["neural-compressor"], version = "^1.14.0", optional
= true}
intel_extension_for_pytorch = {version = "^2.2.0", optional = true}

Dependencies have been added to pyproject.toml for the community lib.  

**Twitter handle:** @peter_izsak

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
nvpranak 91bcc9c5c9
community[minor]: Nemo embeddings(#16206)
This PR is adding support for NVIDIA NeMo embeddings issue #16095.

---------

Co-authored-by: Praveen Nakshatrala <pnakshatrala@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
baichuan-assistant 70ff54eace
community[minor]: Add Baichuan Text Embedding Model and Baichuan Inc introduction (#16568)
- **Description:** Adding Baichuan Text Embedding Model and Baichuan Inc
introduction.

Baichuan Text Embedding ranks #1 in C-MTEB leaderboard:
https://huggingface.co/spaces/mteb/leaderboard

Co-authored-by: BaiChuanHelper <wintergyc@WinterGYCs-MacBook-Pro.local>
8 months ago
Rave Harpaz c4e9c9ca29
community[minor]: Add OCI Generative AI integration (#16548)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
- **Description:** Adding Oracle Cloud Infrastructure Generative AI
integration. Oracle Cloud Infrastructure (OCI) Generative AI is a fully
managed service that provides a set of state-of-the-art, customizable
large language models (LLMs) that cover a wide range of use cases, and
which is available through a single API. Using the OCI Generative AI
service you can access ready-to-use pretrained models, or create and
host your own fine-tuned custom models based on your own data on
dedicated AI clusters.
https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm
  - **Issue:** None,
  - **Dependencies:** OCI Python SDK,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
Passed

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

we provide unit tests. However, we cannot provide integration tests due
to Oracle policies that prohibit public sharing of api keys.
 
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Eli Lucherini 6b2a57161a
community[patch]: allow additional kwargs in MlflowEmbeddings for compatibility with Cohere API (#15242)
- **Description:** add support for kwargs in`MlflowEmbeddings`
`embed_document()` and `embed_query()` so that all the arguments
required by Cohere API (and others?) can be passed down to the server.
  - **Issue:** #15234 
- **Dependencies:** MLflow with MLflow Deployments (`pip install
mlflow[genai]`)

**Tests**
Now this code [adapted from the
docs](https://python.langchain.com/docs/integrations/providers/mlflow#embeddings-example)
for the Cohere API works locally.

```python
"""
Setup
-----
export COHERE_API_KEY=...
mlflow deployments start-server --config-path examples/deployments/cohere/config.yaml

Run
---
python /path/to/this/file.py
"""
embeddings = MlflowCohereEmbeddings(target_uri="http://127.0.0.1:5000", endpoint="embeddings")
print(embeddings.embed_query("hello")[:3])
print(embeddings.embed_documents(["hello", "world"])[0][:3])
```

Output
```
[0.060455322, 0.028793335, -0.025848389]
[0.031707764, 0.021057129, -0.009361267]
```
8 months ago
chyroc 32e96a471c
Refactor: use SecretStr for llm_rails embeddings (#15090) 9 months ago
Hin 2cf1e73d12
Feat add volcano embedding (#14693)
Description: Volcano Ark is an enterprise-grade large-model service
platform for developers, providing a full range of functions and
services such as model training, inference, evaluation, fine-tuning. You
can visit its homepage at https://www.volcengine.com/docs/82379/1099455
for details. This change could help developers use the platform for
embedding.
Issue: None
Dependencies: volcengine
Tag maintainer: @baskaryan
Twitter handle: @hinnnnnnnnnnnns

---------

Co-authored-by: lujingxuansc <lujingxuansc@bytedance.com>
9 months ago
Bagatur ed58eeb9c5
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463)
Moved the following modules to new package langchain-community in a backwards compatible fashion:

```
mv langchain/langchain/adapters community/langchain_community
mv langchain/langchain/callbacks community/langchain_community/callbacks
mv langchain/langchain/chat_loaders community/langchain_community
mv langchain/langchain/chat_models community/langchain_community
mv langchain/langchain/document_loaders community/langchain_community
mv langchain/langchain/docstore community/langchain_community
mv langchain/langchain/document_transformers community/langchain_community
mv langchain/langchain/embeddings community/langchain_community
mv langchain/langchain/graphs community/langchain_community
mv langchain/langchain/llms community/langchain_community
mv langchain/langchain/memory/chat_message_histories community/langchain_community
mv langchain/langchain/retrievers community/langchain_community
mv langchain/langchain/storage community/langchain_community
mv langchain/langchain/tools community/langchain_community
mv langchain/langchain/utilities community/langchain_community
mv langchain/langchain/vectorstores community/langchain_community
mv langchain/langchain/agents/agent_toolkits community/langchain_community
mv langchain/langchain/cache.py community/langchain_community
mv langchain/langchain/adapters community/langchain_community
mv langchain/langchain/callbacks community/langchain_community/callbacks
mv langchain/langchain/chat_loaders community/langchain_community
mv langchain/langchain/chat_models community/langchain_community
mv langchain/langchain/document_loaders community/langchain_community
mv langchain/langchain/docstore community/langchain_community
mv langchain/langchain/document_transformers community/langchain_community
mv langchain/langchain/embeddings community/langchain_community
mv langchain/langchain/graphs community/langchain_community
mv langchain/langchain/llms community/langchain_community
mv langchain/langchain/memory/chat_message_histories community/langchain_community
mv langchain/langchain/retrievers community/langchain_community
mv langchain/langchain/storage community/langchain_community
mv langchain/langchain/tools community/langchain_community
mv langchain/langchain/utilities community/langchain_community
mv langchain/langchain/vectorstores community/langchain_community
mv langchain/langchain/agents/agent_toolkits community/langchain_community
mv langchain/langchain/cache.py community/langchain_community
```

Moved the following to core
```
mv langchain/langchain/utils/json_schema.py core/langchain_core/utils
mv langchain/langchain/utils/html.py core/langchain_core/utils
mv langchain/langchain/utils/strings.py core/langchain_core/utils
cat langchain/langchain/utils/env.py >> core/langchain_core/utils/env.py
rm langchain/langchain/utils/env.py
```

See .scripts/community_split/script_integrations.sh for all changes
10 months ago