docs `integrations/embeddings` consistency (#10302)

Updated `integrations/embeddings`: fixed titles; added links and descriptions.
Updated `integrations/providers`.
bagatur/konko
Leonid Ganeline 1 year ago committed by GitHub
parent 1b3ea1eeb4
commit fdba711d28
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -2216,6 +2216,10 @@
"source": "/docs/modules/data_connection/text_embedding/integrations/tensorflowhub",
"destination": "/docs/integrations/text_embedding/tensorflowhub"
},
{
"source": "/docs/integrations/text_embedding/Awa",
"destination": "/docs/integrations/text_embedding/awadb"
},
{
"source": "/en/latest/modules/indexes/vectorstores/examples/analyticdb.html",
"destination": "/docs/integrations/vectorstores/analyticdb"

@ -9,13 +9,20 @@ pip install awadb
```
## VectorStore
## Vector Store
There exists a wrapper around AwaDB vector databases, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
```python
from langchain.vectorstores import AwaDB
```
For a more detailed walkthrough of the AwaDB wrapper, see [here](/docs/integrations/vectorstores/awadb.html).
See a [usage example](/docs/integrations/vectorstores/awadb).
## Text Embedding Model
```python
from langchain.embeddings import AwaEmbeddings
```
See a [usage example](/docs/integrations/text_embedding/awadb).
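
Reviewer note: the provider pages in this commit only show import lines. The notebooks further down call `embed_query`, and LangChain embedding classes also expose `embed_documents`. The sketch below shows the shape of that interface only. `ToyEmbeddings` is a hypothetical stand-in written for illustration (it is not part of LangChain or AwaDB), and its hash-based vectors carry no semantic meaning.

```python
import hashlib
from typing import List


class ToyEmbeddings:
    """Hypothetical stand-in that mimics the embed_documents / embed_query shape."""

    def __init__(self, size: int = 8) -> None:
        # Number of floats per vector; real models use hundreds of dimensions.
        self.size = size

    def _embed(self, text: str) -> List[float]:
        # Deterministic pseudo-vector derived from a SHA-256 digest (32 bytes).
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255.0 for byte in digest[: self.size]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # One vector per input document.
        return [self._embed(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        # A single vector for a single query string.
        return self._embed(text)


embeddings = ToyEmbeddings(size=8)
doc_vectors = embeddings.embed_documents(["first document", "second document"])
query_vector = embeddings.embed_query("first document")
print(len(doc_vectors), len(query_vector))
```

Assuming a real wrapper such as `AwaEmbeddings` follows the same two-method shape, swapping it in for the stand-in should only change the constructor line.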

@ -1,20 +1,24 @@
# ModelScope
>[ModelScope](https://www.modelscope.cn/home) is a big repository of the models and datasets.
This page covers how to use the modelscope ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific modelscope wrappers.
## Installation and Setup
* Install the Python SDK with `pip install modelscope`
Install the `modelscope` package.
```bash
pip install modelscope
```
## Wrappers
### Embeddings
## Text Embedding Models
There exists a modelscope Embeddings wrapper, which you can access with
```python
from langchain.embeddings import ModelScopeEmbeddings
```
For a more detailed walkthrough of this, see [this notebook](/docs/integrations/text_embedding/modelscope_hub.html)
For a more detailed walkthrough of this, see [this notebook](/docs/integrations/text_embedding/modelscope_hub)

@ -1,17 +1,31 @@
# NLPCloud
This page covers how to use the NLPCloud ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific NLPCloud wrappers.
>[NLP Cloud](https://docs.nlpcloud.com/#introduction) is an artificial intelligence platform that allows you to use the most advanced AI engines, and even train your own engines with your own data.
## Installation and Setup
- Install the Python SDK with `pip install nlpcloud`
- Install the `nlpcloud` package.
```bash
pip install nlpcloud
```
- Get an NLPCloud api key and set it as an environment variable (`NLPCLOUD_API_KEY`)
## Wrappers
### LLM
## LLM
See a [usage example](/docs/integrations/llms/nlpcloud).
There exists an NLPCloud LLM wrapper, which you can access with
```python
from langchain.llms import NLPCloud
```
## Text Embedding Models
See a [usage example](/docs/integrations/text_embedding/nlp_cloud)
```python
from langchain.embeddings import NLPCloudEmbeddings
```

@ -18,3 +18,11 @@ See a [usage example](/docs/modules/data_connection/document_transformers/text_s
```python
from langchain.text_splitter import SpacyTextSplitter
```
## Text Embedding Models
See a [usage example](/docs/integrations/text_embedding/spacy_embedding)
```python
from langchain.embeddings.spacy_embeddings import SpacyEmbeddings
```

@ -5,9 +5,11 @@
"id": "b14a24db",
"metadata": {},
"source": [
"# AwaEmbedding\n",
"# AwaDB\n",
"\n",
"This notebook explains how to use AwaEmbedding, which is included in [awadb](https://github.com/awa-ai/awadb), to embedding texts in langchain."
">[AwaDB](https://github.com/awa-ai/awadb) is an AI Native database for the search and storage of embedding vectors used by LLM Applications.\n",
"\n",
"This notebook explains how to use `AwaEmbeddings` in LangChain."
]
},
{
@ -101,7 +103,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -5,7 +5,9 @@
"id": "75e378f5-55d7-44b6-8e2e-6d7b8b171ec4",
"metadata": {},
"source": [
"# Bedrock Embeddings"
"# Bedrock\n",
"\n",
">[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that makes FMs from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case.\n"
]
},
{
@ -91,7 +93,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -5,26 +5,29 @@
"id": "719619d3",
"metadata": {},
"source": [
"# BGE Hugging Face Embeddings\n",
"# BGE on Hugging Face\n",
"\n",
"This notebook shows how to use BGE Embeddings through Hugging Face"
">[BGE models on Hugging Face](https://huggingface.co/BAAI/bge-large-en) rank among [the best open-source embedding models](https://huggingface.co/spaces/mteb/leaderboard).\n",
">The BGE models were created by the [Beijing Academy of Artificial Intelligence (BAAI)](https://www.baai.ac.cn/english.html). `BAAI` is a private non-profit organization engaged in AI research and development.\n",
"\n",
"This notebook shows how to use `BGE Embeddings` through `Hugging Face`"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": null,
"id": "f7a54279",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# !pip install sentence_transformers"
"#!pip install sentence_transformers"
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": null,
"id": "9e1d5b6b",
"metadata": {},
"outputs": [],
@ -43,12 +46,24 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 5,
"id": "e59d1a89",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"384"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"embedding = hf.embed_query(\"hi this is harrison\")"
"embedding = hf.embed_query(\"hi this is harrison\")\n",
"len(embedding)"
]
},
{
@ -76,7 +91,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
"version": "3.10.12"
}
},
"nbformat": 4,

@ -1,13 +1,14 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Google Cloud Platform Vertex AI PaLM \n",
"# Google Vertex AI PaLM \n",
"\n",
"Note: This is separate from the Google PaLM integration, it exposes [Vertex AI PaLM API](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview) on Google Cloud. \n",
">[Vertex AI PaLM API](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview) is a service on Google Cloud exposing the embedding models. \n",
"\n",
"Note: This integration is separate from the Google PaLM integration.\n",
"\n",
"By default, Google Cloud [does not use](https://cloud.google.com/vertex-ai/docs/generative-ai/data-governance#foundation_model_development) Customer Data to train its foundation models as part of Google Cloud's AI/ML Privacy Commitment. More details about how Google processes data can also be found in [Google's Customer Data Processing Addendum (CDPA)](https://cloud.google.com/terms/data-processing-addendum).\n",
"\n",
@ -96,7 +97,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

@ -1,12 +1,13 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# ModelScope\n",
"\n",
">[ModelScope](https://www.modelscope.cn/home) is a big repository of models and datasets.\n",
"\n",
"Let's load the ModelScope Embedding class."
]
},
@ -67,16 +68,23 @@
],
"metadata": {
"kernelspec": {
"display_name": "chatgpt",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"version": "3.9.15"
},
"orig_nbformat": 4
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
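
Reviewer note: the same notebook metadata changes recur across the hunks in this commit: `display_name` becomes `Python 3 (ipykernel)`, `version` becomes `3.10.12`, `orig_nbformat` is dropped, and `nbformat_minor` goes from 2 to 4. They were most likely produced by re-saving each notebook in Jupyter. The sketch below is a hypothetical helper, not something the commit contains, showing how the same normalization could be applied programmatically with only the standard library. The function name and constants are assumptions taken from the values visible in this diff.

```python
import json

# Target values copied from the hunks in this diff.
TARGET_KERNELSPEC = {
    "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3",
}
TARGET_VERSION = "3.10.12"


def normalize_metadata(notebook: dict) -> dict:
    """Return a copy of a notebook dict with the metadata this commit converges on."""
    normalized = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    metadata = normalized.setdefault("metadata", {})
    metadata["kernelspec"] = dict(TARGET_KERNELSPEC)
    language_info = metadata.setdefault("language_info", {})
    language_info.setdefault("name", "python")
    language_info["version"] = TARGET_VERSION
    metadata.pop("orig_nbformat", None)  # removed in the hunks above
    normalized["nbformat_minor"] = 4
    return normalized


# Metadata shaped like the "before" side of the ModelScope notebook hunk.
before = {
    "cells": [],
    "metadata": {
        "kernelspec": {"display_name": "chatgpt", "language": "python", "name": "python3"},
        "language_info": {"name": "python", "version": "3.9.15"},
        "orig_nbformat": 4,
    },
    "nbformat": 4,
    "nbformat_minor": 2,
}
after = normalize_metadata(before)
print(json.dumps(after["metadata"], indent=1))
```

The helper returns a copy rather than mutating its input, so the original notebook dict can still be compared against the result.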

@ -1,15 +1,14 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# MosaicML embeddings\n",
"# MosaicML\n",
"\n",
"[MosaicML](https://docs.mosaicml.com/en/latest/inference.html) offers a managed inference service. You can either use a variety of open source models, or deploy your own.\n",
">[MosaicML](https://docs.mosaicml.com/en/latest/inference.html) offers a managed inference service. You can either use a variety of open source models, or deploy your own.\n",
"\n",
"This example goes over how to use LangChain to interact with MosaicML Inference for text embedding."
"This example goes over how to use LangChain to interact with `MosaicML` Inference for text embedding."
]
},
{
@ -94,6 +93,11 @@
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
@ -103,9 +107,10 @@
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3"
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

@ -7,7 +7,7 @@
"source": [
"# NLP Cloud\n",
"\n",
"NLP Cloud is an artificial intelligence platform that allows you to use the most advanced AI engines, and even train your own engines with your own data. \n",
">[NLP Cloud](https://docs.nlpcloud.com/#introduction) is an artificial intelligence platform that allows you to use the most advanced AI engines, and even train your own engines with your own data. \n",
"\n",
"The [embeddings](https://docs.nlpcloud.com/#embeddings) endpoint offers the following model:\n",
"\n",
@ -80,7 +80,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.11.2 64-bit",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@ -94,7 +94,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

@ -5,11 +5,13 @@
"id": "1f83f273",
"metadata": {},
"source": [
"# SageMaker Endpoint Embeddings\n",
"# SageMaker\n",
"\n",
"Let's load the SageMaker Endpoints Embeddings class. The class can be used if you host, e.g. your own Hugging Face model on SageMaker.\n",
"Let's load the `SageMaker Endpoints Embeddings` class. The class can be used if you host, e.g. your own Hugging Face model on SageMaker.\n",
"\n",
"For instructions on how to do this, please see [here](https://www.philschmid.de/custom-inference-huggingface-sagemaker). **Note**: In order to handle batched requests, you will need to adjust the return line in the `predict_fn()` function within the custom `inference.py` script:\n",
"For instructions on how to do this, please see [here](https://www.philschmid.de/custom-inference-huggingface-sagemaker). \n",
"\n",
"**Note**: In order to handle batched requests, you will need to adjust the return line in the `predict_fn()` function within the custom `inference.py` script:\n",
"\n",
"Change from\n",
"\n",
@ -143,7 +145,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

@ -5,8 +5,8 @@
"id": "eec4efda",
"metadata": {},
"source": [
"# Self Hosted Embeddings\n",
"Let's load the SelfHostedEmbeddings, SelfHostedHuggingFaceEmbeddings, and SelfHostedHuggingFaceInstructEmbeddings classes."
"# Self Hosted\n",
"Let's load the `SelfHostedEmbeddings`, `SelfHostedHuggingFaceEmbeddings`, and `SelfHostedHuggingFaceInstructEmbeddings` classes."
]
},
{
@ -149,9 +149,7 @@
"cell_type": "code",
"execution_count": null,
"id": "fc1bfd0f",
"metadata": {
"scrolled": false
},
"metadata": {},
"outputs": [],
"source": [
"query_result = embeddings.embed_query(text)"
@ -182,7 +180,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

@ -1,16 +1,15 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "ed47bb62",
"metadata": {},
"source": [
"# Sentence Transformers Embeddings\n",
"# Sentence Transformers\n",
"\n",
"[SentenceTransformers](https://www.sbert.net/) embeddings are called using the `HuggingFaceEmbeddings` integration. We have also added an alias for `SentenceTransformerEmbeddings` for users who are more familiar with directly using that package.\n",
">[SentenceTransformers](https://www.sbert.net/) embeddings are called using the `HuggingFaceEmbeddings` integration. We have also added an alias for `SentenceTransformerEmbeddings` for users who are more familiar with directly using that package.\n",
"\n",
"SentenceTransformers is a python package that can generate text and image embeddings, originating from [Sentence-BERT](https://arxiv.org/abs/1908.10084)"
"`SentenceTransformers` is a python package that can generate text and image embeddings, originating from [Sentence-BERT](https://arxiv.org/abs/1908.10084)"
]
},
{
@ -109,7 +108,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

@ -1,21 +1,31 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Spacy Embedding\n",
"# SpaCy\n",
"\n",
"### Loading the Spacy embedding class to generate and query embeddings"
">[spaCy](https://spacy.io/) is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython.\n",
" \n",
"\n",
"## Installation and Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#!pip install spacy"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Import the necessary classes"
"Import the necessary classes"
]
},
{
@ -28,11 +38,12 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Initialize SpacyEmbeddings. This will load the Spacy model into memory."
"## Example\n",
"\n",
"Initialize SpacyEmbeddings. This will load the Spacy model into memory."
]
},
{
@ -45,11 +56,10 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Define some example texts. These could be any documents that you want to analyze - for example, news articles, social media posts, or product reviews."
"Define some example texts. These could be any documents that you want to analyze - for example, news articles, social media posts, or product reviews."
]
},
{
@ -67,11 +77,10 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Generate and print embeddings for the texts. The SpacyEmbeddings class generates an embedding for each document, which is a numerical representation of the document's content. These embeddings can be used for various natural language processing tasks, such as document similarity comparison or text classification."
"Generate and print embeddings for the texts. The SpacyEmbeddings class generates an embedding for each document, which is a numerical representation of the document's content. These embeddings can be used for various natural language processing tasks, such as document similarity comparison or text classification."
]
},
{
@ -86,11 +95,10 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Generate and print an embedding for a single piece of text. You can also generate an embedding for a single piece of text, such as a search query. This can be useful for tasks like information retrieval, where you want to find documents that are similar to a given query."
"Generate and print an embedding for a single piece of text. You can also generate an embedding for a single piece of text, such as a search query. This can be useful for tasks like information retrieval, where you want to find documents that are similar to a given query."
]
},
{
@ -106,11 +114,24 @@
}
],
"metadata": {
"language_info": {
"name": "python"
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"orig_nbformat": 4
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
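
Reviewer note: the spaCy notebook text above mentions document similarity comparison and retrieving documents similar to a query, but the hunks shown here contain no code for that step. A minimal sketch of one common approach, cosine similarity, follows. The vectors are made-up toy values; in practice they would presumably come from `embed_documents` and `embed_query`. The helper names are assumptions for illustration.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: List[float], documents: List[List[float]]) -> List[int]:
    """Return document indices ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query, vector), index) for index, vector in enumerate(documents)]
    return [index for _, index in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Toy three-dimensional vectors standing in for real embeddings.
document_vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vector = [1.0, 0.1, 0.0]

ranking = rank_by_similarity(query_vector, document_vectors)
print(ranking)  # prints [0, 2, 1]
```

Cosine similarity ignores vector length and compares direction only, which is why it is a frequent default for comparing embeddings. Other distance measures may suit a given model better.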
