{ "cells": [ { "cell_type": "markdown", "id": "eb1c0ea9", "metadata": {}, "source": [ "# Aleph Alpha\n", "\n", "Aleph Alpha's semantic embeddings can be used in two ways. For texts with dissimilar structure (e.g. a document and a query), use asymmetric embeddings. For texts with comparable structure, use symmetric embeddings." ] }, { "cell_type": "markdown", "id": "9ecc84f9", "metadata": {}, "source": [ "## Asymmetric" ] }, { "cell_type": "code", "execution_count": 1, "id": "8a920a89", "metadata": {}, "outputs": [], "source": [ "from langchain.embeddings import AlephAlphaAsymmetricSemanticEmbedding" ] }, { "cell_type": "code", "execution_count": 2, "id": "f2d04da3", "metadata": {}, "outputs": [], "source": [ "document = \"This is the content of the document\"\n", "query = \"What is the content of the document?\"" ] }, { "cell_type": "code", "execution_count": 3, "id": "e6ecde96", "metadata": {}, "outputs": [], "source": [ "# normalize=True requests normalized vectors; compress_to_size=128 requests the compressed 128-dimensional embedding\n", "embeddings = AlephAlphaAsymmetricSemanticEmbedding(normalize=True, compress_to_size=128)" ] }, { "cell_type": "code", "execution_count": 4, "id": "90e68411", "metadata": {}, "outputs": [], "source": [ "# embed_documents takes a list of texts and returns one vector per text\n", "doc_result = embeddings.embed_documents([document])" ] }, { "cell_type": "code", "execution_count": 5, "id": "55903233", "metadata": {}, "outputs": [], "source": [ "# embed_query takes a single string and returns a single vector\n", "query_result = embeddings.embed_query(query)" ] }, { "cell_type": "markdown", "id": "b8c00aab", "metadata": {}, "source": [ "## Symmetric" ] }, { "cell_type": "code", "execution_count": 6, "id": "eabb763a", "metadata": {}, "outputs": [], "source": [ "from langchain.embeddings import AlephAlphaSymmetricSemanticEmbedding" ] }, { "cell_type": "code", "execution_count": 7, "id": "0ad799f7", "metadata": {}, "outputs": [], "source": [ "text = \"This is a test text\"" ] }, { "cell_type": "code", "execution_count": 8, "id": "af86dc10", "metadata": {}, "outputs": [], "source": [ "embeddings = AlephAlphaSymmetricSemanticEmbedding(normalize=True, compress_to_size=128)" ] }, { "cell_type": "code", "execution_count": 9, "id": "d292536f", "metadata": {}, "outputs": [], "source": [ "doc_result = embeddings.embed_documents([text])" ] }, { "cell_type": "code", "execution_count": 10, "id": "c704a7cf", "metadata": {}, "outputs": [], "source": [ "query_result = embeddings.embed_query(text)" ] }, { "cell_type": "code", "execution_count": null, "id": "5d999f8f", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "vscode": { "interpreter": { "hash": "7377c2ccc78bc62c2683122d48c8cd1fb85a53850a1b1fc29736ed39852c9885" } } }, "nbformat": 4, "nbformat_minor": 5 }