Compare commits

...

65 Commits

Author SHA1 Message Date
blob42 84d7ad397d langchain-docker readme 1 year ago
blob42 de551d62a8 linting in docker and parallel make jobs
- linting can be run in docker in parallel with `make -j4 docker.lint`
1 year ago
blob42 d8fd0e790c enable test + lint on docker 1 year ago
blob42 97c2b31cc5 added all extra dependencies to dev image + customized builds
- downgraded to python 3.10 to accommodate installing all dependencies
- by default installs all dev + extra dependencies
- option to install only dev dependencies by customizing .env file
1 year ago
blob42 f1dc03d0cc docker development image and helper makefile
separate makefile and build env:

- separate makefile for docker
- only show docker commands when docker detected in system
- only rebuild container on change
- use an unprivileged user

builder image and base dev image:

- fully isolated environment inside container.
- all venv installed inside container shell and available as commands.
    - ex: `docker run IMG jupyter notebook` to launch notebook.
- pure python based container without poetry.
- custom motd to add a message displayed to users when they connect to
container.
- print environment versions (git, package, python) on login
- display help message when starting container
1 year ago
Harrison Chase f76e9eaab1 bump version (#1342) 1 year ago
Harrison Chase db2e9c2b0d partial variables (#1308) 1 year ago
Tim Asp d22651d82a Add new iFixit document loader (#1333)
iFixit is a wikipedia-like site that has a huge amount of open content
on how to fix things, questions/answers for common troubleshooting and
"things" related content that is more technical in nature. All content
is licensed under CC-BY-SA-NC 3.0

Adding docs from iFixit as context for user questions like "I dropped my
phone in water, what do I do?" or "My macbook pro is making a whining
noise, what's wrong with it?" can yield significantly better responses
than context-free responses from LLMs.
1 year ago
Matt Robinson c46478d70e feat: document loader for image files (#1330)
### Summary

Adds a document loader for image files such as `.jpg` and `.png` files.

### Testing

Run the following using the example document from the [`unstructured`
repo](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs).

```python
from langchain.document_loaders.image import UnstructuredImageLoader

loader = UnstructuredImageLoader("layout-parser-paper-fast.jpg")
loader.load()
```
1 year ago
Eugene Yurtsev e3fcc72879 Documentation: Minor typo fixes (#1327)
Fixing a few minor typos in the documentation (and likely introducing other
ones in the process).
1 year ago
blob42 2fdb1d842b refactoring into submodules 1 year ago
blob42 c30ef7dbc4 drop network capabilities by default, example on using networking 1 year ago
blob42 8a7871ece3 add exec_attached: attach to running container and exec cmd 1 year ago
blob42 201ecdc9ee fix run and exec_run default commands, actually use gVisor
- run and exec_run need a separate default command. Run usually executes
  a script while exec_run simulates an interactive session. The image
  templates and run funcs have been upgraded to handle both
  types of commands.

- test: make docker tests run when docker is installed and docker lib
  available.
  - test that runsc runtime is used by default when gVisor is installed.
    (manually removing gVisor skips the test)
1 year ago
blob42 149fe0055e exec_run fixes to keep stdin open 1 year ago
blob42 096b82f2a1 update notebook for utility 1 year ago
blob42 87b5a84cfb update tests and docstrings 1 year ago
blob42 ed97aa65af exec_run: add timeout and delay params
- use `delay` to wait for the sent payload to finish
- use `timeout` to control how long to wait for output
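A hypothetical call shape for these parameters (the wrapper object and the rest of the signature are assumptions; only the parameter semantics come from the commit message):
```python
# hypothetical usage of this branch's docker wrapper; names are assumed
output = docker_wrapper.exec_run(
    "echo hello",
    delay=0.5,   # wait this long for the sent payload to finish
    timeout=10,  # how long to wait for output
)
```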
1 year ago
blob42 c9e6baf60d image templates, enhanced wrapper building with custom parameters
- quickly run or exec_run commands with sane defaults
- wip image templates with parameters for common docker images
- shell escaping logic
- capture stdout+stderr for exec commands
- added minimal testing
1 year ago
blob42 7cde1cbfc3 docker: attach to container's stdin
- wip image helper for optimized params with common images
- gVisor runtime checker
- make tests skipped if docker installed
1 year ago
blob42 17213209e0 stream stdin and stdout to container through docker API's socket 1 year ago
blob42 895f862662 docker wrapper tool for untrusted execution 1 year ago
Harrison Chase f61858163d
bump version to 0.0.95 (#1324) 1 year ago
Harrison Chase 0824d65a5c
Harrison/indexing pipeline (#1317) 1 year ago
Akshay a0bf856c70
Update agent_vectorstore.ipynb (#1318)
nitpicking, but I just thought I'd fix this typo I found when going
through the How-to 😄 (unless it was intentional). Also, it's amazing that
you added ReAct to LangChain!
1 year ago
Harrison Chase 166cda2cc6
Harrison/deeplake (#1316)
Co-authored-by: Davit Buniatyan <d@activeloop.ai>
1 year ago
Harrison Chase aaad6cc954
Harrison/atlas db (#1315)
Co-authored-by: Brandon Duderstadt <brandonduderstadt@gmail.com>
1 year ago
Marc Puig 3989c793fd
Making it possible to use "certainty" as a parameter for the weaviate similarity_search (#1218)
Checks whether the weaviate similarity_search kwargs contain "certainty" and
uses it accordingly. The minimum level of certainty must be a float, and
it is computed from the normalized distance.
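A minimal sketch of the kwarg handling described above (the surrounding method body is an assumption, not taken from the PR):
```python
# hypothetical sketch; only the "certainty" kwarg itself comes from the PR
def similarity_search(self, query: str, k: int = 4, **kwargs):
    content = {"concepts": [query]}
    if "certainty" in kwargs:
        # minimum certainty: a float computed from the normalized distance
        content["certainty"] = kwargs["certainty"]
    query_obj = self._client.query.get(self._index_name, [self._text_key])
    result = query_obj.with_near_text(content).with_limit(k).do()
    return result
```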
1 year ago
Alexander Hoyle 42b892c21b
Avoid IntegrityError for SQLiteCache updates (#1286)
While using a `SQLiteCache`, if there are duplicate `(prompt, llm, idx)`
tuples passed to
[`update_cache()`](c5dd491a21/langchain/llms/base.py (L39)),
then an `IntegrityError` is thrown. This can happen when there are
duplicated prompts within the same batch.

This PR changes the SQLAlchemy `session.add()` to a `session.merge()` in
`cache.py`, [following the solution from this SO
thread](https://stackoverflow.com/questions/10322514/dealing-with-duplicate-primary-keys-on-insert-in-sqlalchemy-declarative-style).
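A minimal sketch of the merge-based update, assuming the `FullLLMCache` model and the `update()` signature from `langchain.cache`:
```python
from sqlalchemy.orm import Session

# sketch of the fixed update(); surrounding details are assumed
def update(self, prompt: str, llm_string: str, return_val) -> None:
    for i, generation in enumerate(return_val):
        item = FullLLMCache(
            prompt=prompt, llm=llm_string, idx=i, response=generation.text
        )
        with Session(self.engine) as session, session.begin():
            # merge() looks up the primary key (prompt, llm, idx) and updates
            # the existing row if present; add() would INSERT blindly and
            # raise IntegrityError on a duplicate key
            session.merge(item)
```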
I believe this fixes #983, but not entirely sure since that also
involves async

Here's a minimal example of the error:
```python
from pathlib import Path

import langchain
from langchain.cache import SQLiteCache

llm = langchain.OpenAI(model_name="text-ada-001", openai_api_key=Path("/.openai_api_key").read_text().strip())
langchain.llm_cache = SQLiteCache("test_cache.db")
llm.generate(['a'] * 5)
```
```
>   IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: full_llm_cache.prompt, full_llm_cache.llm, full_llm_cache.idx
    [SQL: INSERT INTO full_llm_cache (prompt, llm, idx, response) VALUES (?, ?, ?, ?)]
    [parameters: ('a', "[('_type', 'openai'), ('best_of', 1), ('frequency_penalty', 0), ('logit_bias', {}), ('max_tokens', 256), ('model_name', 'text-ada-001'), ('n', 1), ('presence_penalty', 0), ('request_timeout', None), ('stop', None), ('temperature', 0.7), ('top_p', 1)]", 0, '\n\nA is for air.\n\nA is for atmosphere.')]
    (Background on this error at: https://sqlalche.me/e/14/gkpj)
```

After the change, we now have the following
```python
class Output:
    def __init__(self, text):
        self.text = text

# make dummy data
cache = SQLiteCache("test_cache_2.db")
cache.update(prompt="prompt_0", llm_string="llm_0", return_val=[Output("text_0")])
cache.engine.execute("SELECT * FROM full_llm_cache").fetchall()

# output
>   [('prompt_0', 'llm_0', 0, 'text_0')]
```

```python
#  update data, before change this would have thrown an `IntegrityError`
cache.update(prompt="prompt_0", llm_string="llm_0", return_val=[Output("text_0_new")])
cache.engine.execute("SELECT * FROM full_llm_cache").fetchall()

# output
>   [('prompt_0', 'llm_0', 0, 'text_0_new')]
```
1 year ago
Harrison Chase 81abcae91a
Harrison/banana fix (#1311)
Co-authored-by: Erik Dunteman <44653944+erik-dunteman@users.noreply.github.com>
1 year ago
Casey A. Fitzpatrick 648b3b3909
Fix use case sentence for bash util doc (#1295)
Thanks for all your hard work!

I noticed a small typo in the bash util doc so here's a quick update.
Additionally, my formatter caught some spacing in the `.md` as well.
Happy to revert that if it's an issue.

The main change is just
```
- A common use case this is for letting it interact with your local file system. 

+ A common use case for this is letting the LLM interact with your local file system.
```

## Testing

`make docs_build` succeeds locally and the changes show as expected ✌️ 
<img width="704" alt="image"
src="https://user-images.githubusercontent.com/17773666/221376160-e99e59a6-b318-49d1-a1d7-89f5c17cdab4.png">
1 year ago
Ingo Kleiber fd9975dad7
add CoNLL-U document loader (#1297)
I've added a simple
[CoNLL-U](https://universaldependencies.org/format.html) document
loader. CoNLL-U is a common format for NLP tasks and is used, for
example, in the Universal Dependencies treebank corpora. The loader
reads a single file in standard CoNLL-U format and returns a document.
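A usage sketch (the exact class name and import path are assumed from the PR, and the file name is illustrative):
```python
# hypothetical usage; class name and path assumed
from langchain.document_loaders import CoNLLULoader

loader = CoNLLULoader("example.conllu")
docs = loader.load()
```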
1 year ago
Harrison Chase d29f74114e
copy paste loader (#1302) 1 year ago
Harrison Chase ce441edd9c
improve docs (#1309) 1 year ago
Harrison Chase 6f30d68581
add example of using agent with vectorstores (#1285) 1 year ago
Harrison Chase 002da6edc0
ruff ruff (#1203) 1 year ago
Harrison Chase 0963096491
fix imports (#1288) 1 year ago
Harrison Chase c5dd491a21
bump version to 0094 (#1280) 1 year ago
Matt Robinson 2f15c11b87
feat: document loader for MS Word documents (#1282)
### Summary

Adds a document loader for MS Word Documents. Works with both `.docx`
and `.doc` files as long as the user has installed
`unstructured>=0.4.11`.

### Testing

The following workflow tests the loader for both `.doc` and `.docx` files
using example docs from the `unstructured` repo.

#### `.docx`

```python
from langchain.document_loaders import UnstructuredWordDocumentLoader

filename = "../unstructured/example-docs/fake.docx"
loader = UnstructuredWordDocumentLoader(filename)
loader.load()
```

#### `.doc`

```python
from langchain.document_loaders import UnstructuredWordDocumentLoader

filename = "../unstructured/example-docs/fake.doc"
loader = UnstructuredWordDocumentLoader(filename)
loader.load()
```
1 year ago
Harrison Chase 96db6ed073
cleanup (#1274) 1 year ago
Harrison Chase 7e8f832cd6
Harrison/cohere params (#1278)
Co-authored-by: Stefano Faraggi <40745694+stepp1@users.noreply.github.com>
1 year ago
Harrison Chase a8e88e1874
Harrison/logprobs (#1279)
Co-authored-by: Prateek Shah <97124740+prateekspanning@users.noreply.github.com>
1 year ago
Harrison Chase 42167a1e24
Harrison/fb loader (#1277)
Co-authored-by: Vairo Di Pasquale <vairo.dp@gmail.com>
1 year ago
Harrison Chase bb53d9722d
Harrison/errors (#1276)
Co-authored-by: Kevin Huo <5000881+kwhuo68@users.noreply.github.com>
1 year ago
Klein Tahiraj 8a0751dadd
adding .ipynb loader and documentation Fixes #1248 (#1252)
`NotebookLoader.load()` loads the `.ipynb` notebook file into a
`Document` object (see the usage sketch after the parameter list).

**Parameters**:

* `include_outputs` (bool): whether to include cell outputs in the
resulting document (default is False).
* `max_output_length` (int): the maximum number of characters to include
from each cell output (default is 10).
* `remove_newline` (bool): whether to remove newline characters from the
cell sources and outputs (default is False).
* `traceback` (bool): whether to include the full traceback (default is
False).
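A minimal usage sketch with the parameters documented above (the file name is illustrative):
```python
from langchain.document_loaders import NotebookLoader

# hypothetical input file; parameters are those listed in the PR
loader = NotebookLoader(
    "example.ipynb",
    include_outputs=True,
    max_output_length=20,
    remove_newline=True,
)
docs = loader.load()
```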
1 year ago
Harrison Chase 4b5d427421
Harrison/source docs (#1275)
Co-authored-by: Tushar Dhadiwal <tushardhadiwal@users.noreply.github.com>
1 year ago
Enrico Shippole 9becdeaadf
Add Writer, Banana, Modal, StochasticAI (#1270)
Add LLM wrappers and examples for Banana, Writer, Modal, Stochastic AI

Added rigid json format for Banana and Modal
1 year ago
blob42 5457d48416
searx: add `query_suffix` parameter (#1259)
- allows building tools that dynamically inject an extra searx suffix into
  the query. Example:
  `search.run("python library", query_suffix="site:github.com")`
 resulting query: `python library site:github.com`

Co-authored-by: blob42 <spike@w530>
1 year ago
Harrison Chase 9381005098
fix bug with length function (#1257) 1 year ago
Matt Robinson 10e73a3723
docs: remove nltk download steps (#1253)
### Summary

Updates the docs to remove the `nltk` download steps from
`unstructured`. As of `unstructured` `0.4.14`, this is handled
automatically in the relevant modules within `unstructured`.
1 year ago
Justin Torre 5bc6dc076e
added caching and properties docs (#1255) 1 year ago
Harrison Chase 6d37d089e9
bump version to 0093 (#1251) 1 year ago
Iskren Ivov Chernev 8e3cd3e0dd
Add DeepInfra LLM support (#1232)
DeepInfra is an Inference-as-a-Service provider. Add a simple wrapper
using HTTPS requests.
1 year ago
Dmitri Melikyan b7765a95a0
docs: add Graphsignal ecosystem page (#1228)
Adds a Graphsignal ecosystem page
1 year ago
Satoru Sakamoto d480330fae
fix to specific language transcript (#1231)
Currently the YouTube loader only seems to support English audio.
Changed it to load transcripts in the specified language.
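A hypothetical usage sketch (the `language` parameter is assumed from this PR's description; the URL is a placeholder):
```python
# hypothetical usage; the language parameter comes from this PR
from langchain.document_loaders import YoutubeLoader

loader = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=VIDEO_ID", language="de"
)
docs = loader.load()
```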
1 year ago
Harrison Chase 6085fe18d4
add ifttt tool (#1244) 1 year ago
Jon Luo 8a35811556
Don't instruct LLM to use the LIMIT clause, which is incompatible with SQL Server (#1242)
The current prompt specifically instructs the LLM to use the `LIMIT`
clause. This will cause issues with MS SQL Server, which uses `SELECT
TOP` instead of `LIMIT`. The generated SQL will use `LIMIT`; the
instruction to "always limit... using the LIMIT clause" seems to
override the "create a syntactically correct mssql query to run"
portion. Reported here:
https://github.com/hwchase17/langchain/issues/1103#issuecomment-1441144224

I don't have access to a SQL Server instance to test, but removing that
part of the prompt in OpenAI Playground results in the correct `SELECT
TOP` syntax, whereas keeping it in results in the `LIMIT` clause, even
when instructing it to generate syntactically correct mssql. It's also
still correctly using `LIMIT` in my MariaDB database. I think in this
case we can assume that the model will select the appropriate method
based on the dialect specified.

In general, it would be nice to be able to test a suite of SQL dialects
for things like dialect-specific syntax and other issues we've run into
in the past, but I'm not quite sure how to best approach that yet.
1 year ago
Harrison Chase 71709ad5d5
Update key_concepts.md (#1209) (#1237)
Added a link for easier navigation (it's not immediately clear where to find
more info on SimpleSequentialChain; it's 3 clicks away).

---------

Co-authored-by: Larry Fisherman <l4rryfisherman@protonmail.com>
1 year ago
Dennis Antela Martinez 53c67e04d4
add aleph alpha llm (#1207)
Integrate Aleph Alpha's client into LangChain to provide access to the
Luminous models - more info on the latest benchmarks here:
https://www.aleph-alpha.com/luminous-performance-benchmarks
1 year ago
Klein Tahiraj c6ab1bb3cb
Fixing typo in loading.py (#1235)
Just fixing a typo I found in loading.py
1 year ago
Ikko Eltociear Ashimine 334b553260
Update petals.md (#1225)
Huggingface -> Hugging Face
1 year ago
Jon Luo ac1320aae8
fix sqlite internal tables breaking table_info (#1224)
With the current method used to get the SQL table info, sqlite internal
schema tables are being included and are not being handled correctly by
sqlalchemy because the columns have no types. This is easy to see with
the Chinook database:
```python
db = SQLDatabase.from_uri("sqlite:///Chinook.db")
print(db.table_info)
```
```python
...
sqlalchemy.exc.CompileError: (in table 'sqlite_sequence', column 'name'): Can't generate DDL for NullType(); did you forget to specify a type on this Column?
```

SQLAlchemy 2.0 [ignores these by
default](63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L856-L880)):

63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L2096-L2123)
1 year ago
djacobs7 4e28982d2b
Fix typo in constitutional_ai base.py (#1216)
Found a typo in the documentation code for the constitutional_ai module
1 year ago
Sason cc7d2e5621
Correct typo in "Question Answering" How-To Guide (#1221) 1 year ago
blob42 424e71705d
searx: remove duplicate param (#1219)
Co-authored-by: blob42 <spike@w530>
1 year ago

@ -0,0 +1,144 @@
.vscode/
.idea/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
notebooks/
# IPython
profile_default/
ipython_config.py
# pyenv
.python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
.venvs
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# macOS display setting files
.DS_Store
# docker
docker/
!docker/assets/
.dockerignore
docker.build

2
.gitignore vendored

@ -106,6 +106,7 @@ celerybeat.pid
# Environments
.env
!docker/.env
.venv
.venvs
env/
@ -134,3 +135,4 @@ dmypy.json
# macOS display setting files
.DS_Store
docker.build

@ -151,6 +151,10 @@ poetry run jupyter notebook
When you run `poetry install`, the `langchain` package is installed as editable in the virtualenv, so your new logic can be imported into the notebook.
## Using Docker
Refer to [DOCKER.md](docker/DOCKER.md) for more information.
## Documentation
### Contribute Documentation

@ -1,5 +1,8 @@
.PHONY: all clean format lint test tests test_watch integration_tests help
GIT_HASH ?= $(shell git rev-parse --short HEAD)
LANGCHAIN_VERSION := $(shell grep '^version' pyproject.toml | cut -d '=' -f2 | tr -d '"')
all: help
coverage:
@ -21,19 +24,17 @@ docs_linkcheck:
format:
poetry run black .
poetry run isort .
poetry run ruff --select I --fix .
lint:
poetry run mypy .
poetry run black . --check
poetry run isort . --check
poetry run flake8 .
poetry run ruff .
test:
poetry run pytest tests/unit_tests
tests:
poetry run pytest tests/unit_tests
tests: test
test_watch:
poetry run ptw --now . -- tests/unit_tests
@ -47,8 +48,26 @@ help:
@echo 'docs_build - build the documentation'
@echo 'docs_clean - clean the documentation build artifacts'
@echo 'docs_linkcheck - run linkchecker on the documentation'
ifneq ($(shell command -v docker 2> /dev/null),)
@echo 'docker - build and run the docker dev image'
@echo 'docker.run - run the docker dev image'
@echo 'docker.jupyter - start a jupyter notebook inside container'
@echo 'docker.build - build the docker dev image'
@echo 'docker.force_build - force a rebuild'
@echo 'docker.test - run the unit tests in docker'
@echo 'docker.lint - run the linters in docker'
@echo 'docker.clean - remove the docker dev image'
endif
@echo 'format - run code formatters'
@echo 'lint - run linters'
@echo 'test - run unit tests'
@echo 'test_watch - run unit tests in watch mode'
@echo 'integration_tests - run integration tests'
# include the following makefile if the docker executable is available
ifeq ($(shell command -v docker 2> /dev/null),)
$(info Docker not found, skipping docker-related targets)
else
include docker/Makefile
endif

@ -1,11 +1,15 @@
# 🦜️🔗 LangChain
# 🦜️🔗 LangChain - Docker
⚡ Building applications with LLMs through composability ⚡
WIP: This is a fork of langchain focused on implementing a docker wrapper and
toolchain. The goal is to make it easy to use LLM chains running inside a
container, build custom docker-based tools and let agents run arbitrary
untrusted code inside.
[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml) [![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml) [![linkcheck](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai) [![](https://dcbadge.vercel.app/api/server/6adMQxSpJS?compact=true&style=flat)](https://discord.gg/6adMQxSpJS)
Currently exploring the following:
**Production Support:** As you move your LangChains into production, we'd love to offer more comprehensive support.
Please fill out [this form](https://forms.gle/57d8AmXBYp8PP8tZA) and we'll set up a dedicated support Slack channel.
- Docker wrapper for LLMs and chains
- Creating a toolchain for building docker based LLM tools.
- Building agents that can run arbitrary untrusted code inside a container.
## Quick Install

@ -0,0 +1,13 @@
# python env
PYTHON_VERSION=3.10
# -E flag is required
# comment the following line to only install dev dependencies
POETRY_EXTRA_PACKAGES="-E all"
# at least one group needed
POETRY_DEPENDENCIES="dev,test,lint,typing"
# langchain env. Warning: these variables will be baked into the docker image!
OPENAI_API_KEY=${OPENAI_API_KEY:-}
SERPAPI_API_KEY=${SERPAPI_API_KEY:-}

@ -0,0 +1,53 @@
# Using Docker
To quickly get started, run the command `make docker`.
If docker is installed, the Makefile will export extra targets in the format `docker.*` to build and run the docker image. Type `make` for a list of available tasks.
There is a basic `docker-compose.yml` in the docker directory.
## Building the development image
Using `make docker` will build the dev image if it does not exist, then drop
you inside the container with the langchain environment available in the shell.
### Customizing the image and installed dependencies
The image is built with a default python version and all extras and dev
dependencies. It can be customized by changing the variables in the [.env](/docker/.env)
file.
If you don't need all the `extra` dependencies, a slimmer image can be obtained by
commenting out `POETRY_EXTRA_PACKAGES` in the [.env](docker/.env) file.
### Image caching
The Dockerfile is optimized to cache the poetry install step. A rebuild is triggered when there is a change to the source code.
## Example Usage
All commands from langchain's python environment are available by default in the container.
A few examples:
```bash
# run jupyter notebook
docker run --rm -it IMG jupyter notebook
# run ipython
docker run --rm -it IMG ipython
# start web server
docker run --rm -p 8888:8888 IMG python -m http.server 8888
```
## Testing / Linting
Tests and lints are run using your local source directory, which is mounted on the /src volume.
Run unit tests in the container with `make docker.test`.
Run the linting and formatting checks with `make docker.lint`.
Note: this task can run in parallel using `make -j4 docker.lint`.

@ -0,0 +1,104 @@
# vim: ft=dockerfile
#
# see also: https://github.com/python-poetry/poetry/discussions/1879
# - with https://github.com/bneijt/poetry-lock-docker
# see https://github.com/thehale/docker-python-poetry
# see https://github.com/max-pfeiffer/uvicorn-poetry
# use the slim version of python by default
ARG PYTHON_IMAGE_TAG=slim
ARG PYTHON_VERSION=${PYTHON_VERSION:-3.11.2}
####################
# Base Environment
####################
FROM python:$PYTHON_VERSION-$PYTHON_IMAGE_TAG AS lchain-base
ARG UID=1000
ARG USERNAME=lchain
ENV USERNAME=$USERNAME
RUN groupadd -g ${UID} $USERNAME
RUN useradd -l -m -u ${UID} -g ${UID} $USERNAME
# used for mounting source code
RUN mkdir /src
VOLUME /src
#######################
## Poetry Builder Image
#######################
FROM lchain-base AS lchain-base-builder
ARG POETRY_EXTRA_PACKAGES=$POETRY_EXTRA_PACKAGES
ARG POETRY_DEPENDENCIES=$POETRY_DEPENDENCIES
ENV HOME=/root
ENV POETRY_HOME=/root/.poetry
ENV POETRY_VIRTUALENVS_IN_PROJECT=false
ENV POETRY_NO_INTERACTION=1
ENV CACHE_DIR=$HOME/.cache
ENV POETRY_CACHE_DIR=$CACHE_DIR/pypoetry
ENV PATH="$POETRY_HOME/bin:$PATH"
WORKDIR /root
RUN apt-get update && \
apt-get install -y \
build-essential \
git \
curl
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN mkdir -p $CACHE_DIR
## setup poetry
RUN curl -sSL -o $CACHE_DIR/pypoetry-installer.py https://install.python-poetry.org/
RUN python3 $CACHE_DIR/pypoetry-installer.py
# Copy poetry files
COPY poetry.* pyproject.toml ./
RUN mkdir /pip-prefix
RUN poetry export $POETRY_EXTRA_PACKAGES --with $POETRY_DEPENDENCIES -f requirements.txt --output requirements.txt --without-hashes && \
pip install --no-cache-dir --disable-pip-version-check --prefix /pip-prefix -r requirements.txt
# add custom motd message
COPY docker/assets/etc/motd /tmp/motd
RUN cat /tmp/motd > /etc/motd
RUN printf "\n%s\n%s\n" "$(poetry version)" "$(python --version)" >> /etc/motd
###################
## Runtime Image
###################
FROM lchain-base AS lchain
#jupyter port
EXPOSE 8888
COPY docker/assets/entry.sh /entry
RUN chmod +x /entry
COPY --from=lchain-base-builder /etc/motd /etc/motd
COPY --from=lchain-base-builder /usr/bin/git /usr/bin/git
USER ${USERNAME:-lchain}
ENV HOME /home/$USERNAME
WORKDIR /home/$USERNAME
COPY --chown=lchain:lchain --from=lchain-base-builder /pip-prefix $HOME/.local/
COPY . .
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN pip install --no-deps --disable-pip-version-check --no-cache-dir -e .
ENTRYPOINT ["/entry"]

@ -0,0 +1,84 @@
# Do not call this makefile directly; it is included in the main Makefile
.PHONY: docker docker.jupyter docker.run docker.force_build docker.clean \
docker.test docker.lint docker.lint.mypy docker.lint.black \
docker.lint.isort docker.lint.flake
# read python version from .env file ignoring comments
PYTHON_VERSION := $(shell grep PYTHON_VERSION docker/.env | cut -d '=' -f2)
POETRY_EXTRA_PACKAGES := $(shell grep '^[^#]*POETRY_EXTRA_PACKAGES' docker/.env | cut -d '=' -f2)
POETRY_DEPENDENCIES := $(shell grep 'POETRY_DEPENDENCIES' docker/.env | cut -d '=' -f2)
DOCKER_SRC := $(shell find docker -type f)
DOCKER_IMAGE_NAME = langchain/dev
# SRC is all files matched by the git ls-files command
SRC := $(shell git ls-files -- '*' ':!:docker/*')
# set DOCKER_BUILD_PROGRESS=plain to see detailed build progress
DOCKER_BUILD_PROGRESS ?= auto
# extra message to show when entering the docker container
DOCKER_MOTD := docker/assets/etc/motd
ROOTDIR := $(shell git rev-parse --show-toplevel)
DOCKER_LINT_CMD = docker run --rm -i -u lchain -v $(ROOTDIR):/src $(DOCKER_IMAGE_NAME):$(GIT_HASH)
docker: docker.run

docker.run: docker.build
	@echo "Docker image: $(DOCKER_IMAGE_NAME):$(GIT_HASH)"
	docker run --rm -it -u lchain -v $(ROOTDIR):/src $(DOCKER_IMAGE_NAME):$(GIT_HASH)

docker.jupyter: docker.build
	docker run --rm -it -v $(ROOTDIR):/src $(DOCKER_IMAGE_NAME):$(GIT_HASH) jupyter notebook

docker.build: $(SRC) $(DOCKER_SRC) $(DOCKER_MOTD)
ifdef DOCKER_BUILDKIT
	docker buildx build --build-arg PYTHON_VERSION=$(PYTHON_VERSION) \
		--build-arg POETRY_EXTRA_PACKAGES=$(POETRY_EXTRA_PACKAGES) \
		--build-arg POETRY_DEPENDENCIES=$(POETRY_DEPENDENCIES) \
		--progress=$(DOCKER_BUILD_PROGRESS) \
		$(BUILD_FLAGS) -f docker/Dockerfile -t $(DOCKER_IMAGE_NAME):$(GIT_HASH) .
else
	docker build --build-arg PYTHON_VERSION=$(PYTHON_VERSION) \
		--build-arg POETRY_EXTRA_PACKAGES=$(POETRY_EXTRA_PACKAGES) \
		--build-arg POETRY_DEPENDENCIES=$(POETRY_DEPENDENCIES) \
		$(BUILD_FLAGS) -f docker/Dockerfile -t $(DOCKER_IMAGE_NAME):$(GIT_HASH) .
endif
	docker tag $(DOCKER_IMAGE_NAME):$(GIT_HASH) $(DOCKER_IMAGE_NAME):latest
	@touch $@ # this prevents docker from rebuilding dependencies that have not
	@ # changed. Remove the file `docker/docker.build` to force a rebuild.

docker.force_build: $(DOCKER_SRC)
	@rm -f docker.build
	@$(MAKE) docker.build BUILD_FLAGS=--no-cache

docker.clean:
	docker rmi $(DOCKER_IMAGE_NAME):$(GIT_HASH) $(DOCKER_IMAGE_NAME):latest

docker.test: docker.build
	docker run --rm -it -u lchain -v $(ROOTDIR):/src $(DOCKER_IMAGE_NAME):$(GIT_HASH) \
		pytest /src/tests/unit_tests

# this assumes that the docker image has been built
docker.lint: docker.lint.mypy docker.lint.black docker.lint.isort \
	docker.lint.flake

# these can run in parallel with -j[njobs]
docker.lint.mypy:
	@$(DOCKER_LINT_CMD) mypy /src
	@printf "\t%s\n" "mypy ... "
docker.lint.black:
	@$(DOCKER_LINT_CMD) black /src --check
	@printf "\t%s\n" "black ... "
docker.lint.isort:
	@$(DOCKER_LINT_CMD) isort /src --check
	@printf "\t%s\n" "isort ... "
docker.lint.flake:
	@$(DOCKER_LINT_CMD) flake8 /src
	@printf "\t%s\n" "flake8 ... "

@ -0,0 +1,10 @@
#!/usr/bin/env bash
export PATH=$HOME/.local/bin:$PATH
if [ -z "$1" ]; then
    cat /etc/motd
    exec /bin/bash
fi
exec "$@"

@ -0,0 +1,8 @@
All dependencies have been installed in the current shell. There is no
virtualenv and no need for `poetry` inside the container.
Running the command `make docker.run` at the root directory of the project will
build the container the first time. On subsequent runs it will use the cached
image. A rebuild will happen when changes are made to the source code.
Your local source directory has been mounted to the /src directory.

@ -0,0 +1,17 @@
version: "3.7"

services:
  langchain:
    hostname: langchain
    image: langchain/dev:latest
    build:
      context: ../
      dockerfile: docker/Dockerfile
      args:
        PYTHON_VERSION: ${PYTHON_VERSION}
        POETRY_EXTRA_PACKAGES: ${POETRY_EXTRA_PACKAGES}
        POETRY_DEPENDENCIES: ${POETRY_DEPENDENCIES}
    restart: unless-stopped
    ports:
      - 127.0.0.1:8888:8888

@ -0,0 +1,25 @@
# AtlasDB
This page covers how to use Nomic's Atlas ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Atlas wrappers.
## Installation and Setup
- Install the Python package with `pip install nomic`
- Nomic is also included in LangChain's poetry extras: `poetry install -E all`
## Wrappers
### VectorStore
There exists a wrapper around the Atlas neural database, allowing you to use it as a vectorstore.
This vectorstore also gives you full access to the underlying AtlasProject object, which will allow you to use the full range of Atlas map interactions, such as bulk tagging and automatic topic modeling.
Please see [the Nomic docs](https://docs.nomic.ai/atlas_api.html) for more detailed information.
To import this vectorstore:
```python
from langchain.vectorstores import AtlasDB
```
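A hypothetical instantiation sketch (the `from_texts` parameters shown are assumptions, not confirmed by this page):
```python
from langchain.vectorstores import AtlasDB

# hypothetical usage; parameter names are assumed
db = AtlasDB.from_texts(
    texts=["hello world"],
    name="my_project",            # assumed: Atlas project name
    api_key="YOUR_NOMIC_API_KEY", # assumed: Nomic API key
)
```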
For a more detailed walkthrough of the AtlasDB wrapper, see [this notebook](../modules/indexes/examples/vectorstores.ipynb)

@ -0,0 +1,79 @@
# Banana
This page covers how to use the Banana ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Banana wrappers.
## Installation and Setup
- Install with `pip3 install banana-dev`
- Get a Banana API key and set it as an environment variable (`BANANA_API_KEY`)
## Define your Banana Template
If you want to use an available language model template you can find one [here](https://app.banana.dev/templates/conceptofmind/serverless-template-palmyra-base).
This template uses the Palmyra-Base model by [Writer](https://writer.com/product/api/).
You can check out an example Banana repository [here](https://github.com/conceptofmind/serverless-template-palmyra-base).
## Build the Banana app
Banana apps must include the "output" key in the returned JSON.
There is a rigid response structure.
```python
# Return the results as a dictionary
result = {'output': result}
```
An example inference function would be:
```python
def inference(model_inputs: dict) -> dict:
    global model
    global tokenizer

    # Parse out your arguments
    prompt = model_inputs.get('prompt', None)
    if prompt is None:
        return {'message': "No prompt provided"}

    # Run the model
    input_ids = tokenizer.encode(prompt, return_tensors='pt').cuda()
    output = model.generate(
        input_ids,
        max_length=100,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        num_return_sequences=1,
        temperature=0.9,
        early_stopping=True,
        no_repeat_ngram_size=3,
        num_beams=5,
        length_penalty=1.5,
        repetition_penalty=1.5,
        bad_words_ids=[[tokenizer.encode(' ', add_prefix_space=True)[0]]]
    )
    result = tokenizer.decode(output[0], skip_special_tokens=True)

    # Return the results as a dictionary
    result = {'output': result}
    return result
```
You can find a full example of a Banana app [here](https://github.com/conceptofmind/serverless-template-palmyra-base/blob/main/app.py).
## Wrappers
### LLM
There exists a Banana LLM wrapper, which you can access with
```python
from langchain.llms import Banana
```
You need to provide a model key located in the dashboard:
```python
llm = Banana(model_key="YOUR_MODEL_KEY")
```

@ -0,0 +1,17 @@
# DeepInfra
This page covers how to use the DeepInfra ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific DeepInfra wrappers.
## Installation and Setup
- Get your DeepInfra API key from [here](https://deepinfra.com/).
- Set it as an environment variable (`DEEPINFRA_API_TOKEN`)
## Wrappers
### LLM
There exists a DeepInfra LLM wrapper, which you can access with
```python
from langchain.llms import DeepInfra
```
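A hypothetical usage sketch (the `model_id` parameter and the model name are assumptions):
```python
from langchain.llms import DeepInfra

# hypothetical usage; model_id and its value are assumed
llm = DeepInfra(model_id="google/flan-t5-xl")
print(llm("What is DeepInfra?"))
```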

@ -0,0 +1,25 @@
# Deep Lake
This page covers how to use the Deep Lake ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Deep Lake wrappers. For more information:
1. Here are the [whitepaper](https://www.deeplake.ai/whitepaper) and [academic paper](https://arxiv.org/pdf/2209.10785.pdf) for Deep Lake
2. Here is a set of additional resources available for review: [Deep Lake](https://github.com/activeloopai/deeplake), [Getting Started](https://docs.activeloop.ai/getting-started) and [Tutorials](https://docs.activeloop.ai/hub-tutorials)
## Installation and Setup
- Install the Python package with `pip install deeplake`
## Wrappers
### VectorStore
There exists a wrapper around Deep Lake, a data lake for Deep Learning applications, allowing you to use it as a vectorstore (for now), whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import DeepLake
```
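A hypothetical usage sketch (the `from_texts` call shape mirrors LangChain's other vectorstore wrappers and is an assumption here):
```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import DeepLake

# hypothetical usage; the call shape is assumed from other vectorstores
db = DeepLake.from_texts(["hello deep lake"], OpenAIEmbeddings())
docs = db.similarity_search("hello")
```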
For a more detailed walkthrough of the Deep Lake wrapper, see [this notebook](../modules/indexes/vectorstore_examples/deeplake.ipynb)

@ -0,0 +1,38 @@
# Graphsignal
This page covers how to use Graphsignal to trace and monitor LangChain.
## Installation and Setup
- Install the Python library with `pip install graphsignal`
- Create a free Graphsignal account [here](https://graphsignal.com)
- Get an API key and set it as an environment variable (`GRAPHSIGNAL_API_KEY`)
## Tracing and Monitoring
Graphsignal automatically instruments and starts tracing and monitoring chains. Traces, metrics and errors are then available in your [Graphsignal dashboard](https://app.graphsignal.com/). No prompts or other sensitive data are sent to Graphsignal cloud, only statistics and metadata.
Initialize the tracer by providing a deployment name:
```python
import graphsignal
graphsignal.configure(deployment='my-langchain-app-prod')
```
In order to trace full runs and see a breakdown by chains and tools, you can wrap the calling routine or use a decorator:
```python
with graphsignal.start_trace('my-chain'):
    chain.run("some initial text")
```
Optionally, enable profiling to record function-level statistics for each trace.
```python
with graphsignal.start_trace(
        'my-chain', options=graphsignal.TraceOptions(enable_profiling=True)):
    chain.run("some initial text")
```
See the [Quick Start](https://graphsignal.com/docs/guides/quick-start/) guide for complete setup instructions.

@ -19,3 +19,35 @@ export OPENAI_API_BASE="https://oai.hconeai.com/v1"
Now head over to [helicone.ai](https://helicone.ai/onboarding?step=2) to create your account, and add your OpenAI API key within our dashboard to view your logs.
![Helicone](../_static/HeliconeKeys.png)
## How to enable Helicone caching
```python
from langchain.llms import OpenAI
import openai
openai.api_base = "https://oai.hconeai.com/v1"
llm = OpenAI(temperature=0.9, headers={"Helicone-Cache-Enabled": "true"})
text = "What is a helicone?"
print(llm(text))
```
[Helicone caching docs](https://docs.helicone.ai/advanced-usage/caching)
## How to use Helicone custom properties
```python
from langchain.llms import OpenAI
import openai
openai.api_base = "https://oai.hconeai.com/v1"
llm = OpenAI(temperature=0.9, headers={
    "Helicone-Property-Session": "24",
    "Helicone-Property-Conversation": "support_issue_2",
    "Helicone-Property-App": "mobile",
})
text = "What is a helicone?"
print(llm(text))
```
[Helicone property docs](https://docs.helicone.ai/advanced-usage/custom-properties)

@ -0,0 +1,66 @@
# Modal
This page covers how to use the Modal ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Modal wrappers.
## Installation and Setup
- Install with `pip install modal-client`
- Run `modal token new`
## Define your Modal Functions and Webhooks
You must include a prompt. There is a rigid response structure.
```python
class Item(BaseModel):
    prompt: str

@stub.webhook(method="POST")
def my_webhook(item: Item):
    return {"prompt": my_function.call(item.prompt)}
```
An example with GPT2:
```python
from pydantic import BaseModel

import modal

stub = modal.Stub("example-get-started")
volume = modal.SharedVolume().persist("gpt2_model_vol")
CACHE_PATH = "/root/model_cache"

@stub.function(
    gpu="any",
    image=modal.Image.debian_slim().pip_install(
        "tokenizers", "transformers", "torch", "accelerate"
    ),
    shared_volumes={CACHE_PATH: volume},
    retries=3,
)
def run_gpt2(text: str):
    from transformers import GPT2Tokenizer, GPT2LMHeadModel
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    model = GPT2LMHeadModel.from_pretrained('gpt2')
    encoded_input = tokenizer(text, return_tensors='pt').input_ids
    output = model.generate(encoded_input, max_length=50, do_sample=True)
    return tokenizer.decode(output[0], skip_special_tokens=True)

class Item(BaseModel):
    prompt: str

@stub.webhook(method="POST")
def get_text(item: Item):
    return {"prompt": run_gpt2.call(item.prompt)}
```
## Wrappers
### LLM
There exists a Modal LLM wrapper, which you can access with
```python
from langchain.llms import Modal
```

@ -5,7 +5,7 @@ It is broken into two parts: installation and setup, and then references to spec
## Installation and Setup
- Install with `pip install petals`
- Get an Huggingface api key and set it as an environment variable (`HUGGINGFACE_API_KEY`)
- Get a Hugging Face api key and set it as an environment variable (`HUGGINGFACE_API_KEY`)
## Wrappers
@ -14,4 +14,4 @@ It is broken into two parts: installation and setup, and then references to spec
There exists an Petals LLM wrapper, which you can access with
```python
from langchain.llms import Petals
```
```

@ -0,0 +1,17 @@
# StochasticAI
This page covers how to use the StochasticAI ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific StochasticAI wrappers.
## Installation and Setup
- Install with `pip install stochasticx`
- Get a StochasticAI API key and set it as an environment variable (`STOCHASTICAI_API_KEY`)
## Wrappers
### LLM
There exists a StochasticAI LLM wrapper, which you can access with
```python
from langchain.llms import StochasticAI
```

@ -17,10 +17,6 @@ This page is broken into two parts: installation and setup, and then references
- `poppler-utils`
- `tesseract-ocr`
- `libreoffice`
- Run the following to install NLTK dependencies. `unstructured` will handle this automatically
soon.
- `python -c "import nltk; nltk.download('punkt')"`
- `python -c "import nltk; nltk.download('averaged_perceptron_tagger')"`
- If you are parsing PDFs, run the following to install the `detectron2` model, which
`unstructured` uses for layout detection:
- `pip install "detectron2@git+https://github.com/facebookresearch/detectron2.git@v0.6#egg=detectron2"`

@ -0,0 +1,16 @@
# Writer
This page covers how to use the Writer ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Writer wrappers.
## Installation and Setup
- Get a Writer API key and set it as an environment variable (`WRITER_API_KEY`)
## Wrappers
### LLM
There exists a Writer LLM wrapper, which you can access with
```python
from langchain.llms import Writer
```

@ -2,7 +2,7 @@ Agents
==========================
Some applications will require not just a predetermined chain of calls to LLMs/other tools,
but potentially an unknown chain that depends on the user input.
but potentially an unknown chain that depends on the user's input.
In these types of chains, there is a “agent” which has access to a suite of tools.
Depending on the user input, the agent can then decide which, if any, of these tools to call.
@ -12,7 +12,7 @@ The following sections of documentation are provided:
- `Key Concepts <./agents/key_concepts.html>`_: A conceptual guide going over the various concepts related to agents.
- `How-To Guides <./agents/how_to_guides.html>`_: A collection of how-to guides. These highlight how to integrate various types of tools, how to work with different types of agent, and how to customize agents.
- `How-To Guides <./agents/how_to_guides.html>`_: A collection of how-to guides. These highlight how to integrate various types of tools, how to work with different types of agents, and how to customize agents.
- `Reference <../reference/modules/agents.html>`_: API reference documentation for all Agent classes.
@ -27,4 +27,4 @@ The following sections of documentation are provided:
./agents/getting_started.ipynb
./agents/key_concepts.md
./agents/how_to_guides.rst
Reference<../reference/modules/agents.rst>
Reference<../reference/modules/agents.rst>

@ -1,7 +1,7 @@
# Agents
Agents use an LLM to determine which actions to take and in what order.
An action can either be using a tool and observing its output, or returning to the user.
An action can either be using a tool and observing its output, or returning a response to the user.
For a list of easily loadable tools, see [here](tools.md).
Here are the agents available in LangChain.

@ -0,0 +1,494 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "68b24990",
"metadata": {},
"source": [
"# Agents and Vectorstores\n",
"\n",
"This notebook covers how to combine agents and vectorstores. The use case for this is that you've ingested your data into a vectorstore and want to interact with it in an agentic manner.\n",
"\n",
"The reccomended method for doing so is to create a VectorDBQAChain and then use that as a tool in the overall agent. Let's take a look at doing this below. You can do this with multiple different vectordbs, and use the agent as a way to route between them. There are two different ways of doing this - you can either let the agent use the vectorstores as normal tools, or you can set `return_direct=True` to really just use the agent as a router."
]
},
{
"cell_type": "markdown",
"id": "9b22020a",
"metadata": {},
"source": [
"## Create the Vectorstore"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "2e87c10a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores import Chroma\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain import OpenAI, VectorDBQA\n",
"llm = OpenAI(temperature=0)"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "f2675861",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running Chroma using direct local API.\n",
"Using DuckDB in-memory for database. Data will be transient.\n"
]
}
],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()\n",
"docsearch = Chroma.from_documents(texts, embeddings, collection_name=\"state-of-union\")"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "bc5403d4",
"metadata": {},
"outputs": [],
"source": [
"state_of_union = VectorDBQA.from_chain_type(llm=llm, chain_type=\"stuff\", vectorstore=docsearch)"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "1431cded",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import WebBaseLoader"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "915d3ff3",
"metadata": {},
"outputs": [],
"source": [
"loader = WebBaseLoader(\"https://beta.ruff.rs/docs/faq/\")"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "96a2edf8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running Chroma using direct local API.\n",
"Using DuckDB in-memory for database. Data will be transient.\n"
]
}
],
"source": [
"docs = loader.load()\n",
"ruff_texts = text_splitter.split_documents(docs)\n",
"ruff_db = Chroma.from_documents(ruff_texts, embeddings, collection_name=\"ruff\")\n",
"ruff = VectorDBQA.from_chain_type(llm=llm, chain_type=\"stuff\", vectorstore=ruff_db)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "71ecef90",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "c0a6c031",
"metadata": {},
"source": [
"## Create the Agent"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "eb142786",
"metadata": {},
"outputs": [],
"source": [
"# Import things that are needed generically\n",
"from langchain.agents import initialize_agent, Tool\n",
"from langchain.tools import BaseTool\n",
"from langchain.llms import OpenAI\n",
"from langchain import LLMMathChain, SerpAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "850bc4e9",
"metadata": {},
"outputs": [],
"source": [
"tools = [\n",
" Tool(\n",
" name = \"State of Union QA System\",\n",
" func=state_of_union.run,\n",
" description=\"useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.\"\n",
" ),\n",
" Tool(\n",
" name = \"Ruff QA System\",\n",
" func=ruff.run,\n",
" description=\"useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.\"\n",
" ),\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "fc47f230",
"metadata": {},
"outputs": [],
"source": [
"# Construct the agent. We will use the default agent type here.\n",
"# See documentation for a full list of options.\n",
"agent = initialize_agent(tools, llm, agent=\"zero-shot-react-description\", verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "10ca2db8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out what Biden said about Ketanji Brown Jackson in the State of the Union address.\n",
"Action: State of Union QA System\n",
"Action Input: What did Biden say about Ketanji Brown Jackson in the State of the Union address?\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"\"Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\""
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What did biden say about ketanji brown jackson is the state of the union address?\")"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "4e91b811",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out the advantages of using ruff over flake8\n",
"Action: Ruff QA System\n",
"Action Input: What are the advantages of using ruff over flake8?\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.'"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"Why use ruff over flake8?\")"
]
},
{
"cell_type": "markdown",
"id": "787a9b5e",
"metadata": {},
"source": [
"## Use the Agent solely as a router"
]
},
{
"cell_type": "markdown",
"id": "9161ba91",
"metadata": {},
"source": [
"You can also set `return_direct=True` if you intend to use the agent as a router and just want to directly return the result of the VectorDBQaChain.\n",
"\n",
"Notice that in the above examples the agent did some extra work after querying the VectorDBQAChain. You can avoid that and just return the result directly."
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "f59b377e",
"metadata": {},
"outputs": [],
"source": [
"tools = [\n",
" Tool(\n",
" name = \"State of Union QA System\",\n",
" func=state_of_union.run,\n",
" description=\"useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.\",\n",
" return_direct=True\n",
" ),\n",
" Tool(\n",
" name = \"Ruff QA System\",\n",
" func=ruff.run,\n",
" description=\"useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.\",\n",
" return_direct=True\n",
" ),\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "8615707a",
"metadata": {},
"outputs": [],
"source": [
"agent = initialize_agent(tools, llm, agent=\"zero-shot-react-description\", verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "36e718a9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out what Biden said about Ketanji Brown Jackson in the State of the Union address.\n",
"Action: State of Union QA System\n",
"Action Input: What did Biden say about Ketanji Brown Jackson in the State of the Union address?\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"\" Biden said that Jackson is one of the nation's top legal minds and that she will continue Justice Breyer's legacy of excellence.\""
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What did biden say about ketanji brown jackson in the state of the union address?\")"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "edfd0a1a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out the advantages of using ruff over flake8\n",
"Action: Ruff QA System\n",
"Action Input: What are the advantages of using ruff over flake8?\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"' Ruff can be used as a drop-in replacement for Flake8 when used (1) without or with a small number of plugins, (2) alongside Black, and (3) on Python 3 code. It also re-implements some of the most popular Flake8 plugins and related code quality tools natively, including isort, yesqa, eradicate, and most of the rules implemented in pyupgrade. Ruff also supports automatically fixing its own lint violations, which Flake8 does not.'"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"Why use ruff over flake8?\")"
]
},
{
"cell_type": "markdown",
"id": "49a0cbbe",
"metadata": {},
"source": [
"## Multi-Hop vectorstore reasoning\n",
"\n",
"Because vectorstores are easily usable as tools in agents, it is easy to use answer multi-hop questions that depend on vectorstores using the existing agent framework"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "d397a233",
"metadata": {},
"outputs": [],
"source": [
"tools = [\n",
" Tool(\n",
" name = \"State of Union QA System\",\n",
" func=state_of_union.run,\n",
" description=\"useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question, not referencing any obscure pronouns from the conversation before.\"\n",
" ),\n",
" Tool(\n",
" name = \"Ruff QA System\",\n",
" func=ruff.run,\n",
" description=\"useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question, not referencing any obscure pronouns from the conversation before.\"\n",
" ),\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 58,
"id": "06157240",
"metadata": {},
"outputs": [],
"source": [
"# Construct the agent. We will use the default agent type here.\n",
"# See documentation for a full list of options.\n",
"agent = initialize_agent(tools, llm, agent=\"zero-shot-react-description\", verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 59,
"id": "b492b520",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out what tool ruff uses to run over Jupyter Notebooks, and if the president mentioned it in the state of the union.\n",
"Action: Ruff QA System\n",
"Action Input: What tool does ruff use to run over Jupyter Notebooks?\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3m Ruff is integrated into nbQA, a tool for running linters and code formatters over Jupyter Notebooks. After installing ruff and nbqa, you can run Ruff over a notebook like so: > nbqa ruff Untitled.ipynb\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now need to find out if the president mentioned this tool in the state of the union.\n",
"Action: State of Union QA System\n",
"Action Input: Did the president mention nbQA in the state of the union?\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m No, the president did not mention nbQA in the state of the union.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: No, the president did not mention nbQA in the state of the union.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'No, the president did not mention nbQA in the state of the union.'"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What tool does ruff use to run over Jupyter Notebooks? Did the president mention that tool in the state of the union?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b3b857d6",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
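
The agent above treats each vectorstore-backed QA chain as a plain `Tool` whose `func` is the chain's `run` method. As a rough sketch of how the `state_of_union` and `ruff` chains referenced in the tool definitions could be assembled (the file path and helper function are illustrative assumptions, not part of the notebook):

```python
from langchain.chains import VectorDBQA
from langchain.document_loaders import TextLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma

llm = OpenAI(temperature=0)

def make_qa_chain(path: str) -> VectorDBQA:
    # Load, split, embed, and index a text file, then wrap the index in a
    # QA chain whose `run` method an agent Tool can call directly.
    documents = TextLoader(path).load()
    texts = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0).split_documents(documents)
    docsearch = Chroma.from_documents(texts, OpenAIEmbeddings())
    return VectorDBQA.from_chain_type(llm=llm, chain_type="stuff", vectorstore=docsearch)

state_of_union = make_qa_chain("../../state_of_the_union.txt")  # assumed path
```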

@ -7,6 +7,8 @@ The first category of how-to guides here cover specific parts of working with ag
`Custom Tools <./examples/custom_tools.html>`_: How to create custom tools that an agent can use.
`Agents With Vectorstores <./examples/agent_vectorstore.html>`_: How to use vectorstores with agents.
`Intermediate Steps <./examples/intermediate_steps.html>`_: How to access and use intermediate steps to get more visibility into the internals of an agent.
`Custom Agent <./examples/custom_agent.html>`_: How to create a custom agent (specifically, a custom LLM + prompt to drive that agent).

@ -2,8 +2,8 @@ Chains
==========================
Using an LLM in isolation is fine for some simple applications,
but many more complex ones require chaining LLMs - either with eachother or with other experts.
LangChain provides a standard interface for Chains, as well as some common implementations of chains for easy use.
but many more complex ones require chaining LLMs - either with each other or with other experts.
LangChain provides a standard interface for Chains, as well as some common implementations of chains for ease of use.
The following sections of documentation are provided:
@ -26,4 +26,4 @@ The following sections of documentation are provided:
./chains/getting_started.ipynb
./chains/how_to_guides.rst
./chains/key_concepts.rst
Reference<../reference/modules/chains.rst>
Reference<../reference/modules/chains.rst>

@ -9,13 +9,13 @@
"In this tutorial, we will learn about creating simple chains in LangChain. We will learn how to create a chain, add components to it, and run it.\n",
"\n",
"In this tutorial, we will cover:\n",
"- Using the simple LLM chain\n",
"- Using a simple LLM chain\n",
"- Creating sequential chains\n",
"- Creating a custom chain\n",
"\n",
"## Why do we need chains?\n",
"\n",
"Chains allow us to combine multiple components together to create a single, coherent application. For example, we can create a chain that takes user input, format it with a PromptTemplate, and then passes the formatted response to an LLM. We can build more complex chains by combining multiple chains together, or by combining chains with other components.\n"
"Chains allow us to combine multiple components together to create a single, coherent application. For example, we can create a chain that takes user input, formats it with a PromptTemplate, and then passes the formatted response to an LLM. We can build more complex chains by combining multiple chains together, or by combining chains with other components.\n"
]
},
{
@ -88,7 +88,7 @@
"source": [
"## Combine chains with the `SequentialChain`\n",
"\n",
"The next step after calling a language model is make a series of calls to a language model. We can do this using sequential chains, which are chains that execute their links in a predefined order. Specifically, we will use the `SimpleSequentialChain`. This is the simplest form of sequential chains, where each step has a singular input/output, and the output of one step is the input to the next.\n",
"The next step after calling a language model is to make a series of calls to a language model. We can do this using sequential chains, which are chains that execute their links in a predefined order. Specifically, we will use the `SimpleSequentialChain`. This is the simplest type of a sequential chain, where each step has a single input/output, and the output of one step is the input to the next.\n",
"\n",
"In this tutorial, our sequential chain will:\n",
"1. First, create a company name for a product. We will reuse the `LLMChain` we'd previously initialized to create this company name.\n",
@ -156,7 +156,7 @@
"source": [
"## Create a custom chain with the `Chain` class\n",
"\n",
"LangChain provides many chains out of the box, but sometimes you may want to create a custom chains for your specific use case. For this example, we will create a custom chain that concatenates the outputs of 2 `LLMChain`s.\n",
"LangChain provides many chains out of the box, but sometimes you may want to create a custom chain for your specific use case. For this example, we will create a custom chain that concatenates the outputs of 2 `LLMChain`s.\n",
"\n",
"In order to create a custom chain:\n",
"1. Start by subclassing the `Chain` class,\n",

@ -6,6 +6,6 @@ They vary greatly in complexity and are combination of generic, highly configura
## Sequential Chain
This is a specific type of chain where multiple other chains are run in sequence, with the outputs being added as inputs
to the next. A subtype of this type of chain is the `SimpleSequentialChain`, where all subchains have only one input and one output,
to the next. A subtype of this type of chain is the [`SimpleSequentialChain`](./generic/sequential_chains.html#simplesequentialchain), where all subchains have only one input and one output,
and the output of one is therefore used as sole input to the next chain.
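
A short sketch of the single-input/single-output composition described above (the prompts are illustrative):

```python
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

llm = OpenAI(temperature=0.9)

name_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
))
slogan_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["company_name"],
    template="Write a catchphrase for the company {company_name}.",
))

# Each subchain has exactly one input and one output, so the company name
# produced by the first chain becomes the sole input of the second.
overall_chain = SimpleSequentialChain(chains=[name_chain, slogan_chain], verbose=True)
catchphrase = overall_chain.run("colorful socks")
```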

@ -0,0 +1,116 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "9f98a15e",
"metadata": {},
"source": [
"# CoNLL-U\n",
"This is an example of how to load a file in [CoNLL-U](https://universaldependencies.org/format.html) format. The whole file is treated as one document. The example data (`conllu.conllu`) is based on one of the standard UD/CoNLL-U examples."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d9b2e33e",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import CoNLLULoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b5eec48",
"metadata": {},
"outputs": [],
"source": [
"loader = CoNLLULoader(\"example_data/conllu.conllu\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "10f3f725",
"metadata": {},
"outputs": [],
"source": [
"document = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "acbb3579",
"metadata": {},
"outputs": [],
"source": [
"document"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,102 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "d9826810",
"metadata": {},
"source": [
"# Copy Paste\n",
"\n",
"This notebook covers how to load a document object from something you just want to copy and paste. In this case, you don't even need to use a DocumentLoader, but rather can just construct the Document directly."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "fd9e71a2",
"metadata": {},
"outputs": [],
"source": [
"from langchain.docstore.document import Document"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f40d3f30",
"metadata": {},
"outputs": [],
"source": [
"text = \"..... put the text you copy pasted here......\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d409bdba",
"metadata": {},
"outputs": [],
"source": [
"doc = Document(page_content=text)"
]
},
{
"cell_type": "markdown",
"id": "cc0eff72",
"metadata": {},
"source": [
"## Metadata\n",
"If you want to add metadata about the where you got this piece of text, you easily can with the metadata key."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "fe3aa5aa",
"metadata": {},
"outputs": [],
"source": [
"metadata = {\"source\": \"internet\", \"date\": \"Friday\"}"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "827d4e91",
"metadata": {},
"outputs": [],
"source": [
"doc = Document(page_content=text, metadata=metadata)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c986a43d",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,8 @@
# sent_id = 1
# text = They buy and sell books.
1 They they PRON PRP Case=Nom|Number=Plur 2 nsubj 2:nsubj|4:nsubj _
2 buy buy VERB VBP Number=Plur|Person=3|Tense=Pres 0 root 0:root _
3 and and CONJ CC _ 4 cc 4:cc _
4 sell sell VERB VBP Number=Plur|Person=3|Tense=Pres 2 conj 0:root|2:conj _
5 books book NOUN NNS Number=Plur 2 obj 2:obj|4:obj SpaceAfter=No
6 . . PUNCT . _ 2 punct 2:punct _

@ -0,0 +1,64 @@
{
"participants": [{"name": "User 1"}, {"name": "User 2"}],
"messages": [
{"sender_name": "User 2", "timestamp_ms": 1675597571851, "content": "Bye!"},
{
"sender_name": "User 1",
"timestamp_ms": 1675597435669,
"content": "Oh no worries! Bye",
},
{
"sender_name": "User 2",
"timestamp_ms": 1675596277579,
"content": "No Im sorry it was my mistake, the blue one is not for sale",
},
{
"sender_name": "User 1",
"timestamp_ms": 1675595140251,
"content": "I thought you were selling the blue one!",
},
{
"sender_name": "User 1",
"timestamp_ms": 1675595109305,
"content": "Im not interested in this bag. Im interested in the blue one!",
},
{
"sender_name": "User 2",
"timestamp_ms": 1675595068468,
"content": "Here is $129",
},
{
"sender_name": "User 2",
"timestamp_ms": 1675595060730,
"photos": [
{"uri": "url_of_some_picture.jpg", "creation_timestamp": 1675595059}
],
},
{
"sender_name": "User 2",
"timestamp_ms": 1675595045152,
"content": "Online is at least $100",
},
{
"sender_name": "User 1",
"timestamp_ms": 1675594799696,
"content": "How much do you want?",
},
{
"sender_name": "User 2",
"timestamp_ms": 1675577876645,
"content": "Goodmorning! $50 is too low.",
},
{
"sender_name": "User 1",
"timestamp_ms": 1675549022673,
"content": "Hi! Im interested in your bag. Im offering $50. Let me know if you are interested. Thanks!",
},
],
"title": "User 1 and User 2 chat",
"is_still_participant": true,
"thread_path": "inbox/User 1 and User 2 chat",
"magic_words": [],
"image": {"uri": "image_of_the_chat.jpg", "creation_timestamp": 1675549016},
"joinable_mode": {"mode": 1, "link": ""},
}

@ -0,0 +1,83 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Notebook\n",
"\n",
"This notebook covers how to load data from an .ipynb notebook into a format suitable by LangChain."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import NotebookLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"loader = NotebookLoader(\"example_data/notebook.ipynb\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"`NotebookLoader.load()` loads the `.ipynb` notebook file into a `Document` object.\n",
"\n",
"**Parameters**:\n",
"\n",
"* `include_outputs` (bool): whether to include cell outputs in the resulting document (default is False).\n",
"* `max_output_length` (int): the maximum number of characters to include from each cell output (default is 10).\n",
"* `remove_newline` (bool): whether to remove newline characters from the cell sources and outputs (default is False).\n",
"* `traceback` (bool): whether to include full traceback (default is False)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"loader.load(include_outputs=True, max_output_length=20, remove_newline=True)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.1"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "981b6680a42bdb5eb22187741e1607b3aae2cf73db800d1af1f268d1de6a1f70"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

@ -0,0 +1,77 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Facebook Chat\n",
"\n",
"This notebook covers how to load data from the Facebook Chats into a format that can be ingested into LangChain."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import FacebookChatLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"loader = FacebookChatLoader(\"example_data/facebook_chat.json\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='User 2 on 2023-02-05 12:46:11: Bye!\\n\\nUser 1 on 2023-02-05 12:43:55: Oh no worries! Bye\\n\\nUser 2 on 2023-02-05 12:24:37: No Im sorry it was my mistake, the blue one is not for sale\\n\\nUser 1 on 2023-02-05 12:05:40: I thought you were selling the blue one!\\n\\nUser 1 on 2023-02-05 12:05:09: Im not interested in this bag. Im interested in the blue one!\\n\\nUser 2 on 2023-02-05 12:04:28: Here is $129\\n\\nUser 2 on 2023-02-05 12:04:05: Online is at least $100\\n\\nUser 1 on 2023-02-05 11:59:59: How much do you want?\\n\\nUser 2 on 2023-02-05 07:17:56: Goodmorning! $50 is too low.\\n\\nUser 1 on 2023-02-04 23:17:02: Hi! Im interested in your bag. Im offering $50. Let me know if you are interested. Thanks!\\n\\n', lookup_str='', metadata={'source': 'docs/modules/document_loaders/examples/example_data/facebook_chat.json'}, lookup_index=0)]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"loader.load()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.1"
},
"vscode": {
"interpreter": {
"hash": "384707f4965e853a82006e90614c2e1a578ea1f6eb0ee07a1dd78a657d37dd67"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
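
For reference, the `Document` text above can be reproduced from the raw export by hand; this is a rough, simplified equivalent of what `FacebookChatLoader` does (the real loader lives in `langchain.document_loaders`):

```python
import datetime
import json

def flatten_chat(path: str) -> str:
    # Turn a Facebook chat export into "Sender on timestamp: content" lines.
    with open(path) as f:
        chat = json.load(f)
    lines = []
    for m in chat["messages"]:  # messages are newest-first in the export
        if "content" not in m:
            continue  # photo-only messages carry no text, as in the output above
        ts = datetime.datetime.fromtimestamp(m["timestamp_ms"] / 1000)
        lines.append(f"{m['sender_name']} on {ts:%Y-%m-%d %H:%M:%S}: {m['content']}\n\n")
    return "".join(lines)

print(flatten_chat("example_data/facebook_chat.json"))
```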

File diff suppressed because one or more lines are too long

@ -0,0 +1,145 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f70e6118",
"metadata": {},
"source": [
"# Images\n",
"\n",
"This covers how to load images such as JPGs PNGs into a document format that we can use downstream."
]
},
{
"cell_type": "markdown",
"id": "09d64998",
"metadata": {},
"source": [
"## Using Unstructured"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "0cc0cd42",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders.image import UnstructuredImageLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "082d557c",
"metadata": {},
"outputs": [],
"source": [
"loader = UnstructuredImageLoader(\"layout-parser-paper-fast.jpg\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "df11c953",
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "4284d44c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content=\"LayoutParser: A Unified Toolkit for Deep\\nLearning Based Document Image Analysis\\n\\n\\nZxjiang Shen' (F3}, Ruochen Zhang”, Melissa Dell*, Benjamin Charles Germain\\nLeet, Jacob Carlson, and Weining LiF\\n\\n\\nsugehen\\n\\nshangthrows, et\\n\\n“Abstract. Recent advanocs in document image analysis (DIA) have been\\npimarliy driven bythe application of neural networks dell roar\\n{uteomer could be aly deployed in production and extended fo farther\\n[nvetigtion. However, various factory ke lcely organize codebanee\\nsnd sophisticated modal cnigurations compat the ey ree of\\nerin! innovation by wide sence, Though there have been sng\\nHors to improve reuablty and simplify deep lees (DL) mode\\naon, sone of them ae optimized for challenge inthe demain of DIA,\\nThis roprscte a major gap in the extng fol, sw DIA i eal to\\nscademic research acon wie range of dpi in the social ssencee\\n[rary for streamlining the sage of DL in DIA research and appicn\\ntons The core LayoutFaraer brary comes with a sch of simple and\\nIntative interfaee or applying and eutomiing DI. odel fr Inyo de\\npltfom for sharing both protrined modes an fal document dist\\n{ation pipeline We demonutate that LayootPareer shea fr both\\nlightweight and lrgeseledgtieation pipelines in eal-word uae ces\\nThe leary pblely smal at Btspe://layost-pareergsthab So\\n\\n\\n\\nKeywords: Document Image Analysis» Deep Learning Layout Analysis\\nCharacter Renguition - Open Serres dary « Tol\\n\\n\\nIntroduction\\n\\n\\nDeep Learning(DL)-based approaches are the state-of-the-art for a wide range of\\ndoctiment image analysis (DIA) tea including document image clasiffeation [I]\\n\", lookup_str='', metadata={'source': 'layout-parser-paper-fast.jpg'}, lookup_index=0)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data[0]"
]
},
{
"cell_type": "markdown",
"id": "09957371",
"metadata": {},
"source": [
"### Retain Elements\n",
"\n",
"Under the hood, Unstructured creates different \"elements\" for different chunks of text. By default we combine those together, but you can easily keep that separation by specifying `mode=\"elements\"`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "0fab833b",
"metadata": {},
"outputs": [],
"source": [
"loader = UnstructuredImageLoader(\"layout-parser-paper-fast.jpg\", mode=\"elements\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "c3e8ff1b",
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "43c23d2d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='LayoutParser: A Unified Toolkit for Deep\\nLearning Based Document Image Analysis\\n', lookup_str='', metadata={'source': 'layout-parser-paper-fast.jpg', 'filename': 'layout-parser-paper-fast.jpg', 'page_number': 1, 'category': 'Title'}, lookup_index=0)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data[0]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
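
Because `mode="elements"` yields one `Document` per element with its category in the metadata, downstream filtering is plain Python; a small sketch over the `data` list from the notebook above (the category names follow `unstructured` conventions and are an assumption here):

```python
# Keep only the title and narrative-text elements extracted from the image.
titles = [d for d in data if d.metadata.get("category") == "Title"]
narrative = [d for d in data if d.metadata.get("category") == "NarrativeText"]
print(len(titles), "title elements,", len(narrative), "narrative elements")
```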

@ -0,0 +1,98 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Notebook\n",
"\n",
"This notebook covers how to load data from an .ipynb notebook into a format suitable by LangChain."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import NotebookLoader"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"loader = NotebookLoader(\"example_data/notebook.ipynb\", include_outputs=True, max_output_length=20, remove_newline=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`NotebookLoader.load()` loads the `.ipynb` notebook file into a `Document` object.\n",
"\n",
"**Parameters**:\n",
"\n",
"* `include_outputs` (bool): whether to include cell outputs in the resulting document (default is False).\n",
"* `max_output_length` (int): the maximum number of characters to include from each cell output (default is 10).\n",
"* `remove_newline` (bool): whether to remove newline characters from the cell sources and outputs (default is False).\n",
"* `traceback` (bool): whether to include full traceback (default is False)."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='\\'markdown\\' cell: \\'[\\'# Notebook\\', \\'\\', \\'This notebook covers how to load data from an .ipynb notebook into a format suitable by LangChain.\\']\\'\\n\\n \\'code\\' cell: \\'[\\'from langchain.document_loaders import NotebookLoader\\']\\'\\n\\n \\'code\\' cell: \\'[\\'loader = NotebookLoader(\"example_data/notebook.ipynb\")\\']\\'\\n\\n \\'markdown\\' cell: \\'[\\'`NotebookLoader.load()` loads the `.ipynb` notebook file into a `Document` object.\\', \\'\\', \\'**Parameters**:\\', \\'\\', \\'* `include_outputs` (bool): whether to include cell outputs in the resulting document (default is False).\\', \\'* `max_output_length` (int): the maximum number of characters to include from each cell output (default is 10).\\', \\'* `remove_newline` (bool): whether to remove newline characters from the cell sources and outputs (default is False).\\', \\'* `traceback` (bool): whether to include full traceback (default is False).\\']\\'\\n\\n \\'code\\' cell: \\'[\\'loader.load(include_outputs=True, max_output_length=20, remove_newline=True)\\']\\'\\n\\n', lookup_str='', metadata={'source': 'example_data/notebook.ipynb'}, lookup_index=0)]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"loader.load()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
},
"vscode": {
"interpreter": {
"hash": "981b6680a42bdb5eb22187741e1607b3aae2cf73db800d1af1f268d1de6a1f70"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

@ -0,0 +1,137 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "39af9ecd",
"metadata": {},
"source": [
"# Word Documents\n",
"\n",
"This covers how to load Word documents into a document format that we can use downstream."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "721c48aa",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import UnstructuredWordDocumentLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "9d3d0e35",
"metadata": {},
"outputs": [],
"source": [
"loader = UnstructuredWordDocumentLoader(\"fake.docx\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "06073f91",
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c9adc5cb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': 'fake.docx'}, lookup_index=0)]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data"
]
},
{
"cell_type": "markdown",
"id": "525d6b67",
"metadata": {},
"source": [
"## Retain Elements\n",
"\n",
"Under the hood, Unstructured creates different \"elements\" for different chunks of text. By default we combine those together, but you can easily keep that separation by specifying `mode=\"elements\"`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "064f9162",
"metadata": {},
"outputs": [],
"source": [
"loader = UnstructuredWordDocumentLoader(\"fake.docx\", mode=\"elements\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "abefbbdb",
"metadata": {},
"outputs": [],
"source": [
"data = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "a547c534",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': 'fake.docx', 'filename': 'fake.docx', 'category': 'Title'}, lookup_index=0)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data[0]"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -57,6 +57,10 @@ There are a lot of different document loaders that LangChain supports. Below are
`Online PDF <./examples/online_pdf.html>`_: A walkthrough of how to load data from an online PDF.
`CoNLL-U <./examples/CoNLL-U.html>`_: A walkthrough of how to load data from a ConLL-U file.
`iFixit <./examples/ifixit.html>`_: A walkthrough of how to search and load data like guides, technical Q&A's, and device wikis from iFixit.com
.. toctree::
:maxdepth: 1
:glob:

@ -268,12 +268,48 @@
},
{
"cell_type": "markdown",
"id": "7fb44daa",
"metadata": {},
"source": [
"## Chat Vector DB with `search_distance`\n",
"If you are using a vector store that supports filtering by search distance, you can add a threshold value parameter."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"vectordbkwargs = {\"search_distance\": 0.9}"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"qa = ChatVectorDBChain.from_llm(OpenAI(temperature=0), vectorstore, return_source_documents=True)\n",
"chat_history = []\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = qa({\"question\": query, \"chat_history\": chat_history, \"vectordbkwargs\": vectordbkwargs})"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"## Chat Vector DB with `map_reduce`\n",
"We can also use different types of combine document chains with the Chat Vector DB chain."
]
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
@ -486,7 +522,7 @@
"source": [
"chat_history = [(query, result[\"answer\"])]\n",
"query = \"Did he mention who she suceeded\"\n",
"result = qa({\"question\": query, \"chat_history\": chat_history})"
"result = qa({\"question\": query, \"chat_history\": chat_history})\n"
]
}
],

@ -7,7 +7,7 @@
"source": [
"# Question Answering\n",
"\n",
"This notebook walks through how to use LangChain for question answering over a list of documents. It covers four different types of chaings: `stuff`, `map_reduce`, `refine`, `map-rerank`. For a more in depth explanation of what these chain types are, see [here](../combine_docs.md)."
"This notebook walks through how to use LangChain for question answering over a list of documents. It covers four different types of chains: `stuff`, `map_reduce`, `refine`, `map-rerank`. For a more in depth explanation of what these chain types are, see [here](../combine_docs.md)."
]
},
{
@ -30,29 +30,24 @@
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import Chroma\n",
"from langchain.docstore.document import Document\n",
"from langchain.prompts import PromptTemplate"
"from langchain.prompts import PromptTemplate\n",
"from langchain.indexes.vectorstore import VectorstoreIndexCreator"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "291f0117",
"id": "ef9305cc",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
"index_creator = VectorstoreIndexCreator()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "fd9666a9",
"execution_count": 3,
"id": "291f0117",
"metadata": {},
"outputs": [
{
@ -65,12 +60,14 @@
}
],
"source": [
"docsearch = Chroma.from_documents(texts, embeddings)"
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../../state_of_the_union.txt')\n",
"docsearch = index_creator.from_loaders([loader])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "d1eaf6e6",
"metadata": {},
"outputs": [],

@ -2,45 +2,204 @@
"cells": [
{
"cell_type": "markdown",
"id": "07c1e3b9",
"id": "2244801b",
"metadata": {},
"source": [
"# Getting Started\n",
"\n",
"This example showcases question answering over a vector database.\n",
"We have chosen this as the example for getting started because it nicely combines a lot of different elements (Text splitters, embeddings, vectorstores) and then also shows how to use them in a chain."
"This example showcases question answering over documents.\n",
"We have chosen this as the example for getting started because it nicely combines a lot of different elements (Text splitters, embeddings, vectorstores) and then also shows how to use them in a chain.\n",
"\n",
"Question answering over documents consists of three steps:\n",
"\n",
"1. Create an index\n",
"2. Create a question answering chain\n",
"3. Ask questions!\n",
"\n",
"Each of the steps has multiple sub steps and potential configurations. In this notebook we will primarily focus on (1). We will start by showing the one-liner for doing so, but then break down what is actually going on.\n",
"\n",
"First, let's import some common classes we'll use no matter what."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "82525493",
"id": "8d369452",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.vectorstores import Chroma\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain import OpenAI, VectorDBQA"
"from langchain.chains import VectorDBQA\n",
"from langchain.llms import OpenAI"
]
},
{
"cell_type": "markdown",
"id": "0b7adc54",
"id": "07c1e3b9",
"metadata": {},
"source": [
"Here we load in the documents we want to use to create our index."
"Next in the generic setup, let's specify the document loader we want to use."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "611e0c19",
"execution_count": 2,
"id": "33958a86",
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../state_of_the_union.txt')\n",
"loader = TextLoader('../state_of_the_union.txt')"
]
},
{
"cell_type": "markdown",
"id": "489c74bb",
"metadata": {},
"source": [
"## One Line Index Creation\n",
"\n",
"To get started as quickly as possible, we can use the `VectorstoreIndexCreator`."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "403fc231",
"metadata": {},
"outputs": [],
"source": [
"from langchain.indexes import VectorstoreIndexCreator"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "57a8a199",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running Chroma using direct local API.\n",
"Using DuckDB in-memory for database. Data will be transient.\n"
]
}
],
"source": [
"index = VectorstoreIndexCreator().from_loaders([loader])"
]
},
{
"cell_type": "markdown",
"id": "f3493fa4",
"metadata": {},
"source": [
"Now that the index is created, we can use it to ask questions of the data! Note that under the hood this is actually doing a few steps as well, which we will cover later in this guide."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "23d0d234",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"index.query(query)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "ae46b239",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'question': 'What did the president say about Ketanji Brown Jackson',\n",
" 'answer': \" The president said that he nominated Circuit Court of Appeals Judge Ketanji Brown Jackson, one of the nation's top legal minds, to continue Justice Breyer's legacy of excellence, and that she has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\\n\",\n",
" 'sources': '../state_of_the_union.txt'}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"index.query_with_sources(query)"
]
},
{
"cell_type": "markdown",
"id": "ff100212",
"metadata": {},
"source": [
"What is returned from the `VectorstoreIndexCreator` is `VectorStoreIndexWrapper`, which provides these nice `query` and `query_with_sources` functionality. If we just wanted to access the vectorstore directly, we can also do that."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "b04f3c10",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<langchain.vectorstores.chroma.Chroma at 0x113a3a700>"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"index.vectorstore"
]
},
{
"cell_type": "markdown",
"id": "2cb6d2eb",
"metadata": {},
"source": [
"## Walkthrough\n",
"\n",
"Okay, so what's actually going on? How is this index getting created?\n",
"\n",
"A lot of the magic is being hid in this `VectorstoreIndexCreator`. What is this doing?\n",
"\n",
"There are three main steps going on after the documents are loaded:\n",
"\n",
"1. Splitting documents into chunks\n",
"2. Creating embeddings for each document\n",
"3. Storing documents and embeddings in a vectorstore\n",
"\n",
"Let's walk through this in code"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "54270abc",
"metadata": {},
"outputs": [],
"source": [
"documents = loader.load()"
]
},
@ -54,11 +213,12 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 8,
"id": "afecb8cf",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_documents(documents)"
]
@ -73,11 +233,12 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 10,
"id": "9eaaa735",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import OpenAIEmbeddings\n",
"embeddings = OpenAIEmbeddings()"
]
},
@ -91,7 +252,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 11,
"id": "5c7049db",
"metadata": {},
"outputs": [
@ -105,6 +266,7 @@
}
],
"source": [
"from langchain.vectorstores import Chroma\n",
"db = Chroma.from_documents(texts, embeddings)"
]
},
@ -113,12 +275,13 @@
"id": "30c4e5c6",
"metadata": {},
"source": [
"Finally, we create a chain and use it to answer questions!"
"So that's creating the index.\n",
"Then, as before, we create a chain and use it to answer questions!"
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 12,
"id": "3018f865",
"metadata": {},
"outputs": [],
@ -128,17 +291,17 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 13,
"id": "032a47f8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\" The President said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\""
"\" The President said that Ketanji Brown Jackson is one of the nation's top legal minds and a consensus builder, with a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. She is a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers.\""
]
},
"execution_count": 10,
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
@ -148,10 +311,40 @@
"qa.run(query)"
]
},
{
"cell_type": "markdown",
"id": "9464690e",
"metadata": {},
"source": [
"`VectorstoreIndexCreator` is just a wrapper around all this logic. It is configurable in the text splitter it uses, the embeddings it uses, and the vectorstore it uses. For example, you can configure it as below:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "4001bbc6",
"metadata": {},
"outputs": [],
"source": [
"index_creator = VectorstoreIndexCreator(\n",
" vectorstore_cls=Chroma, \n",
" embedding=OpenAIEmbeddings(),\n",
" text_splitter=CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
")"
]
},
{
"cell_type": "markdown",
"id": "78d8d143",
"metadata": {},
"source": [
"Hopefully this highlights what is going on under the hood of `VectorstoreIndexCreator`. While we think it's important to have a simple way to create indexes, we also think it's important to understand what's going on under the hood."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b403637",
"id": "dd7257bf",
"metadata": {},
"outputs": [],
"source": []

@ -36,6 +36,8 @@ In the below guides, we cover different types of vectorstores and how to use the
`Chroma <./vectorstore_examples/chroma.html>`_: A walkthrough of how to use the Chroma vectorstore wrapper.
`DeepLake <./vectorstore_examples/deeplake.html>`_: A walkthrough of how to use the Deep Lake, data lake, wrapper.
`FAISS <./vectorstore_examples/faiss.html>`_: A walkthrough of how to use the FAISS vectorstore wrapper.
`Elastic Search <./vectorstore_examples/elasticsearch.html>`_: A walkthrough of how to use the ElasticSearch wrapper.

@ -0,0 +1,266 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# AtlasDB\n",
"\n",
"This notebook shows you how to use functionality related to the AtlasDB"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"import time\n",
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import SpacyTextSplitter\n",
"from langchain.vectorstores import AtlasDB\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting en-core-web-sm==3.5.0\n",
" Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.5.0/en_core_web_sm-3.5.0-py3-none-any.whl (12.8 MB)\n",
"\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m12.8/12.8 MB\u001B[0m \u001B[31m90.8 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m00:01\u001B[0m00:01\u001B[0m\n",
"\u001B[?25hRequirement already satisfied: spacy<3.6.0,>=3.5.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from en-core-web-sm==3.5.0) (3.5.0)\n",
"Requirement already satisfied: packaging>=20.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (23.0)\n",
"Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (1.1.1)\n",
"Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (3.3.0)\n",
"Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (2.4.5)\n",
"Requirement already satisfied: pathy>=0.10.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (0.10.1)\n",
"Requirement already satisfied: setuptools in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (67.4.0)\n",
"Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (4.64.1)\n",
"Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (1.0.4)\n",
"Requirement already satisfied: smart-open<7.0.0,>=5.2.1 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (6.3.0)\n",
"Requirement already satisfied: thinc<8.2.0,>=8.1.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (8.1.7)\n",
"Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (2.0.7)\n",
"Requirement already satisfied: typer<0.8.0,>=0.3.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (0.7.0)\n",
"Requirement already satisfied: requests<3.0.0,>=2.13.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (2.28.2)\n",
"Requirement already satisfied: jinja2 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (3.1.2)\n",
"Requirement already satisfied: pydantic!=1.8,!=1.8.1,<1.11.0,>=1.7.4 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (1.10.5)\n",
"Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (2.0.8)\n",
"Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (3.0.12)\n",
"Requirement already satisfied: numpy>=1.15.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (1.24.2)\n",
"Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (1.0.9)\n",
"Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (3.0.8)\n",
"Requirement already satisfied: typing-extensions>=4.2.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from pydantic!=1.8,!=1.8.1,<1.11.0,>=1.7.4->spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (4.5.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (3.0.1)\n",
"Requirement already satisfied: idna<4,>=2.5 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (3.4)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (2022.12.7)\n",
"Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (1.26.14)\n",
"Requirement already satisfied: blis<0.8.0,>=0.7.8 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from thinc<8.2.0,>=8.1.0->spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (0.7.9)\n",
"Requirement already satisfied: confection<1.0.0,>=0.0.1 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from thinc<8.2.0,>=8.1.0->spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (0.0.4)\n",
"Requirement already satisfied: click<9.0.0,>=7.1.1 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from typer<0.8.0,>=0.3.0->spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (8.1.3)\n",
"Requirement already satisfied: MarkupSafe>=2.0 in /home/ubuntu/langchain/.venv/lib/python3.9/site-packages (from jinja2->spacy<3.6.0,>=3.5.0->en-core-web-sm==3.5.0) (2.1.2)\n",
"\n",
"\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m A new release of pip is available: \u001B[0m\u001B[31;49m23.0\u001B[0m\u001B[39;49m -> \u001B[0m\u001B[32;49m23.0.1\u001B[0m\n",
"\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m To update, run: \u001B[0m\u001B[32;49mpip install --upgrade pip\u001B[0m\n",
"\u001B[38;5;2m✔ Download and installation successful\u001B[0m\n",
"You can now load the package via spacy.load('en_core_web_sm')\n"
]
}
],
"source": [
"!python -m spacy download en_core_web_sm"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"ATLAS_TEST_API_KEY = '7xDPkYXSYDc1_ErdTPIcoAR9RNd8YDlkS3nVNXcVoIMZ6'"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"loader = TextLoader('../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = SpacyTextSplitter(separator='|')\n",
"texts = []\n",
"for doc in text_splitter.split_documents(documents):\n",
" texts.extend(doc.page_content.split('|'))\n",
" \n",
"texts = [e.strip() for e in texts]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-02-24 16:13:49.696 | INFO | nomic.project:_create_project:884 - Creating project `test_index_1677255228.136989` in organization `Atlas Demo`\n",
"2023-02-24 16:13:51.087 | INFO | nomic.project:wait_for_project_lock:993 - test_index_1677255228.136989: Project lock is released.\n",
"2023-02-24 16:13:51.225 | INFO | nomic.project:wait_for_project_lock:993 - test_index_1677255228.136989: Project lock is released.\n",
"2023-02-24 16:13:51.481 | INFO | nomic.project:add_text:1351 - Uploading text to Atlas.\n",
"1it [00:00, 1.20it/s]\n",
"2023-02-24 16:13:52.318 | INFO | nomic.project:add_text:1422 - Text upload succeeded.\n",
"2023-02-24 16:13:52.628 | INFO | nomic.project:wait_for_project_lock:993 - test_index_1677255228.136989: Project lock is released.\n",
"2023-02-24 16:13:53.380 | INFO | nomic.project:create_index:1192 - Created map `test_index_1677255228.136989_index` in project `test_index_1677255228.136989`: https://atlas.nomic.ai/map/ee2354a3-7f9a-4c6b-af43-b0cda09d7198/db996d77-8981-48a0-897a-ff2c22bbf541\n"
]
}
],
"source": [
"db = AtlasDB.from_texts(texts=texts,\n",
" name='test_index_'+str(time.time()),\n",
" description='test_index',\n",
" api_key=ATLAS_TEST_API_KEY,\n",
" index_kwargs={'build_topic_model': True})"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2023-02-24 16:14:09.106 | INFO | nomic.project:wait_for_project_lock:993 - test_index_1677255228.136989: Project lock is released.\n"
]
}
],
"source": [
"with db.project.wait_for_project_lock():\n",
" time.sleep(1)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" <strong><a href=\"https://atlas.nomic.ai/dashboard/project/ee2354a3-7f9a-4c6b-af43-b0cda09d7198\">test_index_1677255228.136989</strong></a>\n",
" <br>\n",
" A description for your project 508 datums inserted.\n",
" <br>\n",
" 1 index built.\n",
" <br><strong>Projections</strong>\n",
"<ul>\n",
"<li>test_index_1677255228.136989_index. Status Completed. <a target=\"_blank\" href=\"https://atlas.nomic.ai/map/ee2354a3-7f9a-4c6b-af43-b0cda09d7198/db996d77-8981-48a0-897a-ff2c22bbf541\">view online</a></li></ul><hr><script>\n",
" destroy = function() {\n",
" document.getElementById(\"iframedb996d77-8981-48a0-897a-ff2c22bbf541\").remove()\n",
" }\n",
" </script>\n",
"\n",
" <h4>Projection ID: db996d77-8981-48a0-897a-ff2c22bbf541</h4>\n",
" <div class=\"actions\">\n",
" <div id=\"hide\" class=\"action\" onclick=\"destroy()\">Hide embedded project</div>\n",
" <div class=\"action\" id=\"out\">\n",
" <a href=\"https://atlas.nomic.ai/map/ee2354a3-7f9a-4c6b-af43-b0cda09d7198/db996d77-8981-48a0-897a-ff2c22bbf541\" target=\"_blank\">Explore on atlas.nomic.ai</a>\n",
" </div>\n",
" </div>\n",
" \n",
" <iframe class=\"iframe\" id=\"iframedb996d77-8981-48a0-897a-ff2c22bbf541\" allow=\"clipboard-read; clipboard-write\" src=\"https://atlas.nomic.ai/map/ee2354a3-7f9a-4c6b-af43-b0cda09d7198/db996d77-8981-48a0-897a-ff2c22bbf541\">\n",
" </iframe>\n",
"\n",
" <style>\n",
" .iframe {\n",
" /* vh can be **very** large in vscode ipynb. */\n",
" height: min(75vh, 66vw);\n",
" width: 100%;\n",
" }\n",
" </style>\n",
" \n",
" <style>\n",
" .actions {\n",
" display: block;\n",
" }\n",
" .action {\n",
" min-height: 18px;\n",
" margin: 5px;\n",
" transition: all 500ms ease-in-out;\n",
" }\n",
" .action:hover {\n",
" cursor: pointer;\n",
" }\n",
" #hide:hover::after {\n",
" content: \" X\";\n",
" }\n",
" #out:hover::after {\n",
" content: \"\";\n",
" }\n",
" </style>\n",
" "
],
"text/plain": [
"AtlasProject: <{'id': 'ee2354a3-7f9a-4c6b-af43-b0cda09d7198', 'owner': '9c29afbb-a002-4d49-958e-ecf5ae1351ac', 'project_name': 'test_index_1677255228.136989', 'creator': 'auth0|63efc4b5462246f4d9a6ecf2', 'description': 'A description for your project', 'opensearch_index_id': 'f61fb8dd-0abf-4f31-9130-41870e443902', 'is_public': True, 'project_fields': ['atlas_id', 'text'], 'unique_id_field': 'atlas_id', 'modality': 'text', 'total_datums_in_project': 508, 'created_timestamp': '2023-02-24T16:13:50.313363+00:00', 'atlas_indices': [{'id': 'b1b01833-0964-4597-a4bc-a2d60700949d', 'project_id': 'ee2354a3-7f9a-4c6b-af43-b0cda09d7198', 'index_name': 'test_index_1677255228.136989_index', 'indexed_field': 'text', 'created_timestamp': '2023-02-24T16:13:52.957101+00:00', 'updated_timestamp': '2023-02-24T16:14:03.469621+00:00', 'atoms': ['charchunk', 'document'], 'colorable_fields': [], 'embedders': [{'id': '7ec0868a-4eed-4414-a482-25cce9803e1b', 'atlas_index_id': 'b1b01833-0964-4597-a4bc-a2d60700949d', 'ready': True, 'model_name': 'NomicEmbed', 'hyperparameters': {'norm': 'both', 'batch_size': 20, 'polymerize_by': 'charchunk', 'dataset_buffer_size': 1000}}], 'nearest_neighbor_indices': [{'id': '86f8e3ff-e07c-4678-a4d7-144db4b0301d', 'index_name': 'NomicOrganize', 'ready': True, 'hyperparameters': {'dim': 384, 'space': 'l2'}, 'atom_strategies': ['document']}], 'projections': [{'id': 'db996d77-8981-48a0-897a-ff2c22bbf541', 'projection_name': 'NomicProject', 'ready': True, 'hyperparameters': {'spread': 1.0, 'n_epochs': 50, 'n_neighbors': 15}, 'atom_strategies': ['document'], 'created_timestamp': '2023-02-24T16:13:52.979561+00:00', 'updated_timestamp': '2023-02-24T16:14:03.466309+00:00'}]}], 'insert_update_delete_lock': False}>"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db.project"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.4"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
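
The notebook stops after inspecting `db.project`; querying the index is one more call. A sketch, assuming `AtlasDB` implements the standard `similarity_search` interface:

```python
# Retrieve the passages closest to the query from the Atlas index built above.
docs = db.similarity_search("What did the president say about the economy?", k=4)
for doc in docs:
    print(doc.page_content[:100])
```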

@ -0,0 +1,234 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deep Lake\n",
"\n",
"This notebook showcases basic functionality related to Deep Lake. While Deep Lake can store embeddings, it is capable of storing any type of data. It is a fully fledged serverless data lake with version control, query engine and streaming dataloader to deep learning frameworks. \n",
"\n",
"For more information, please see the Deep Lake [documentation](docs.activeloop.ai) or [api reference](docs.deeplake.ai)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import DeepLake\n",
"from langchain.document_loaders import TextLoader"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders import TextLoader\n",
"loader = TextLoader('../../state_of_the_union.txt')\n",
"documents = loader.load()\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Evaluating ingest: 100%|██████████| 41/41 [00:00<00:00\n"
]
}
],
"source": [
"db = DeepLake.from_documents(docs, embeddings)\n",
"\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections. \n",
"\n",
"We cannot let this happen. \n",
"\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"print(docs[0].page_content)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deep Lake datasets on cloud or local\n",
"By default deep lake datasets are stored in memory, in case you want to persist locally or to any object storage you can simply provide path to the dataset. You can retrieve token from [app.activeloop.ai](https://app.activeloop.ai/)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/bin/bash: -c: line 0: syntax error near unexpected token `newline'\n",
"/bin/bash: -c: line 0: `activeloop login -t <token>'\n"
]
}
],
"source": [
"!activeloop login -t <token>"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Evaluating ingest: 100%|██████████| 4/4 [00:00<00:00\n"
]
}
],
"source": [
"# Embed and store the texts\n",
"dataset_path = \"hub://{username}/{dataset_name}\" # could be also ./local/path (much faster locally), s3://bucket/path/to/dataset, gcs://, etc.\n",
"\n",
"embedding = OpenAIEmbeddings()\n",
"vectordb = DeepLake.from_documents(documents=docs, embedding=embedding, dataset_path=dataset_path)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"In state after state, new laws have been passed, not only to suppress the vote, but to subvert entire elections. \n",
"\n",
"We cannot let this happen. \n",
"\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query)\n",
"print(docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset(path='./local/path', tensors=['embedding', 'ids', 'metadata', 'text'])\n",
"\n",
" tensor htype shape dtype compression\n",
" ------- ------- ------- ------- ------- \n",
" embedding generic (4, 1536) None None \n",
" ids text (4, 1) str None \n",
" metadata json (4, 1) str None \n",
" text text (4, 1) str None \n"
]
}
],
"source": [
"vectordb.ds.summary()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"embeddings = vectordb.ds.embedding.numpy()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "7b14174bb6f9d4680b62ac2a6390e1ce94fbfabf172a10844870451d539c58d6"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

@ -17,6 +17,14 @@ The examples here are all "how-to" guides for how to integrate with various LLM
`Goose AI <./integrations/gooseai_example.html>`_: Covers how to utilize the Goose AI wrapper.
`Writer <./integrations/writer.html>`_: Covers how to utilize the Writer wrapper.
`Banana <./integrations/banana.html>`_: Covers how to utilize the Banana wrapper.
`Modal <./integrations/modal.html>`_: Covers how to utilize the Modal wrapper.
`StochasticAI <./integrations/stochasticai.html>`_: Covers how to utilize the Stochastic AI wrapper.
`Cerebrium <./integrations/cerebriumai_example.html>`_: Covers how to utilize the Cerebrium AI wrapper.
`Petals <./integrations/petals_example.html>`_: Covers how to utilize the Petals wrapper.
@ -27,6 +35,8 @@ The examples here are all "how-to" guides for how to integrate with various LLM
`Anthropic <./integrations/anthropic_example.html>`_: Covers how to use Anthropic models with LangChain.
`DeepInfra <./integrations/deepinfra_example.html>`_: Covers how to utilize the DeepInfra wrapper.
`Self-Hosted Models (via Runhouse) <./integrations/self_hosted_examples.html>`_: Covers how to run models on existing or on-demand remote compute with LangChain.

@ -0,0 +1,108 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "9597802c",
"metadata": {},
"source": [
"# Aleph Alpha\n",
"This example goes over how to use LangChain to interact with Aleph Alpha models"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6fb585dd",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import AlephAlpha\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f81a230d",
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Q: {question}\n",
"\n",
"A:\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f0d26e48",
"metadata": {},
"outputs": [],
"source": [
"llm = AlephAlpha(model=\"luminous-extended\", maximum_tokens=20, stop_sequences=[\"Q:\"])"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "6811d621",
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "3058e63f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' Artificial Intelligence (AI) is the simulation of human intelligence processes by machines, especially computer systems.\\n'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question = \"What is AI?\"\n",
"\n",
"llm_chain.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
},
"vscode": {
"interpreter": {
"hash": "2d002ec47225e662695b764370d7966aa11eeb4302edc2f497bbf96d49c8f899"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,85 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Banana\n",
"This example goes over how to use LangChain to interact with Banana models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from langchain.llms import Banana\n",
"from langchain import PromptTemplate, LLMChain\n",
"os.environ[\"BANANA_API_KEY\"] = \"YOUR_API_KEY\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = Banana(model_key=\"YOUR_MODEL_KEY\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.9.12"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

@ -0,0 +1,141 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# DeepInfra LLM Example\n",
"This notebook goes over how to use Langchain with [DeepInfra](https://deepinfra.com)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from langchain.llms import DeepInfra\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set the Environment API Key\n",
"Make sure to get your API key from DeepInfra. You are given a 1 hour free of serverless GPU compute to test different models.\n",
"You can print your token with `deepctl auth token`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"DEEPINFRA_API_TOKEN\"] = \"YOUR_KEY_HERE\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the DeepInfra instance\n",
"Make sure to deploy your model first via `deepctl deploy create -m google/flat-t5-xl` (for example)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = DeepInfra(model_id=\"DEPLOYED MODEL ID\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a Prompt Template\n",
"We will create a prompt template for Question and Answer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initiate the LLMChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run the LLMChain\n",
"Provide a question and run the LLMChain."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"question = \"What NFL team won the Super Bowl in 2015?\"\n",
"\n",
"llm_chain.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.9.12"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

@ -0,0 +1,83 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Modal\n",
"This example goes over how to use LangChain to interact with Modal models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import Modal\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = Modal(endpoint_url=\"YOUR_ENDPOINT_URL\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.9.12"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

@ -88,7 +88,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "Python 3.9.12 ('palm')",
"language": "python",
"name": "python3"
},
@ -102,7 +102,12 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.9.12"
},
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,

@ -0,0 +1,83 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# StochasticAI\n",
"This example goes over how to use LangChain to interact with StochasticAI models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import StochasticAI\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = StochasticAI(api_url=\"YOUR_API_URL\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.9.12"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

@ -0,0 +1,83 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Writer\n",
"This example goes over how to use LangChain to interact with Writer models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import Writer\n",
"from langchain import PromptTemplate, LLMChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: Let's think step by step.\"\"\"\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = Writer()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(prompt=prompt, llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"question = \"What NFL team won the Super Bowl in the year Justin Beiber was born?\"\n",
"\n",
"llm_chain.run(question)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.12 ('palm')",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.9.12"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

@ -0,0 +1,184 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "9355a547",
"metadata": {},
"source": [
"# Partial Prompt Templates\n",
"\n",
"A prompt template is a class with a `.format` method which takes in a key-value map and returns a string (a prompt) to pass to the language model. Like other methods, it can make sense to \"partial\" a prompt template - eg pass in a subset of the required values, as to create a new prompt template which expects only the remaining subset of values.\n",
"\n",
"LangChain supports this in two ways: we allow for partially formatted prompts (1) with string values, (2) with functions that return string values. These two different ways support different use cases. In the documentation below we go over the motivations for both use cases as well as how to do it in LangChain.\n",
"\n",
"## Partial With Strings\n",
"\n",
"One common use case for wanting to partial a prompt template is if you get some of the variables before others. For example, suppose you have a prompt template that requires two variables, `foo` and `baz`. If you get the `foo` value early on in the chain, but the `baz` value later, it can be annoying to wait until you have both variables in the same place to pass them to the prompt template. Instead, you can partial the prompt template with the `foo` value, and then pass the partialed prompt template along and just use that. Below is an example of doing this:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "643af5da",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "4080d8d7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"foobaz\n"
]
}
],
"source": [
"prompt = PromptTemplate(template=\"{foo}{bar}\", input_variables=[\"foo\", \"bar\"])\n",
"partial_prompt = prompt.partial(foo=\"foo\");\n",
"print(partial_prompt.format(bar=\"baz\"))"
]
},
{
"cell_type": "markdown",
"id": "9986766e",
"metadata": {},
"source": [
"You can also just initialize the prompt with the partialed variables."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "e2ce95b3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"foobaz\n"
]
}
],
"source": [
"prompt = PromptTemplate(template=\"{foo}{bar}\", input_variables=[\"bar\"], partial_variables={\"foo\": \"foo\"})\n",
"print(prompt.format(bar=\"baz\"))"
]
},
{
"cell_type": "markdown",
"id": "a9c66f83",
"metadata": {},
"source": [
"## Partial With Functions\n",
"\n",
"The other common use is to partial with a function. The use case for this is when you have a variable you know that you always want to fetch in a common way. A prime example of this is with date or time. Imagine you have a prompt which you always want to have the current date. You can't hard code it in the prompt, and passing it along with the other input variables is a bit annoying. In this case, it's very handy to be able to partial the prompt with a function that always returns the current date."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "d0712d8a",
"metadata": {},
"outputs": [],
"source": [
"from datetime import datetime\n",
"\n",
"def _get_datetime():\n",
" now = datetime.now()\n",
" return now.strftime(\"%m/%d/%Y, %H:%M:%S\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4cbcb666",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tell me a funny joke about the day 02/27/2023, 22:15:16\n"
]
}
],
"source": [
"prompt = PromptTemplate(\n",
" template=\"Tell me a {adjective} joke about the day {date}\", \n",
" input_variables=[\"adjective\", \"date\"]\n",
");\n",
"partial_prompt = prompt.partial(date=_get_datetime)\n",
"print(partial_prompt.format(adjective=\"funny\"))"
]
},
{
"cell_type": "markdown",
"id": "ffed6811",
"metadata": {},
"source": [
"You can also just initialize the prompt with the partialed variables, which often makes more sense in this workflow."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "96285b25",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tell me a funny joke about the day 02/27/2023, 22:15:16\n"
]
}
],
"source": [
"prompt = PromptTemplate(\n",
" template=\"Tell me a {adjective} joke about the day {date}\", \n",
" input_variables=[\"adjective\"],\n",
" partial_variables={\"date\": _get_datetime}\n",
");\n",
"print(prompt.format(adjective=\"funny\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4bff16f7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -17,6 +17,8 @@ The user guide here shows more advanced workflows and how to use the library in
`Few Shot Prompt Examples <./examples/few_shot_examples.html>`_: Examples of Few Shot Prompt Templates.
`Partial Prompt Template <./examples/partial.html>`_: How to partial Prompt Templates.
.. toctree::

@ -1,85 +1,85 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "8f210ec3",
"metadata": {},
"source": [
"# Bash\n",
"It can often be useful to have an LLM generate bash commands, and then run them. A common use case for this is letting the LLM interact with your local file system. We provide an easy util to execute bash commands."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f7b3767b",
"metadata": {},
"outputs": [],
"source": [
"from langchain.utilities import BashProcess"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "cf1c92f0",
"metadata": {},
"outputs": [],
"source": [
"bash = BashProcess()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "2fa952fc",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"bash.ipynb\n",
"google_search.ipynb\n",
"python.ipynb\n",
"requests.ipynb\n",
"serpapi.ipynb\n",
"\n"
]
}
],
"source": [
"print(bash.run(\"ls\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "851fee9f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -0,0 +1,180 @@
{
"cells": [
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "O4HPx3boF0"
},
"source": [],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
"metadata": {
"jukit_cell_id": "hqQkbPEwTJ"
},
"source": [
"# Using the DockerWrapper utility"
]
},
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "vCepuypaFH"
},
"source": [
"from langchain.utilities.docker import DockerWrapper"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "BtYVqy2YtO"
},
"source": [
"d = DockerWrapper(image='shell')"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "ELWWm03ptQ"
},
"source": [
"query = \"\"\"\n",
"for i in $(seq 1 10)\n",
"do\n",
" echo $i\n",
"done\n",
"\"\"\"\n",
"print(d.exec_run(query))"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n"
}
],
"execution_count": 1
},
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "lGMqLz5sDo"
},
"source": [
"p = DockerWrapper(image='python')\n",
"\n",
"py_payload = \"\"\"\n",
"def hello_world():\n",
" return 'hello world'\n",
"\n",
"hello_world()\n",
"\"\"\""
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "X04Wd6zbrk"
},
"source": [
"print(p.exec_run(py_payload))"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "'hello world'\n"
}
],
"execution_count": 2
},
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "lKOfuDoJGk"
},
"source": [],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
"metadata": {
"jukit_cell_id": "eSzXtDrpqU"
},
"source": [
"## Passing custom parameters\n",
"\n",
"By default containers are run with a safe set of parameters. You can pass any parameters\n",
"that are accepted by the docker python sdk to the run and exec commands.\n",
"\n",
"### Using networking"
]
},
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "eWFGCxD9pv"
},
"source": [
"# by default containers don't have access to the network\n",
"print(d.run('ping -c 1 google.com'))"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "STDERR: Command '/bin/sh -c 'ping -c 1 google.com'' in image 'alpine:latest' returned non-zero exit status 1: b\"ping: bad address 'google.com'\\n\"\n"
}
],
"execution_count": 3
},
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "Z0YkpuXVyL"
},
"source": [
"# using the network parameter\n",
"print(d.run('ping -c 1 google.com', network='bridge'))"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "PING google.com (142.250.200.110): 56 data bytes\n64 bytes from 142.250.200.110: seq=0 ttl=42 time=13.695 ms\n\n--- google.com ping statistics ---\n1 packets transmitted, 1 packets received, 0% packet loss\nround-trip min/avg/max = 13.695/13.695/13.695 ms\n"
}
],
"execution_count": 4
},
{
"cell_type": "code",
"metadata": {
"jukit_cell_id": "3rMWzzuLHq"
},
"source": [],
"outputs": [],
"execution_count": null
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "python",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

@ -0,0 +1,121 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "16763ed3",
"metadata": {},
"source": [
"# IFTTT WebHooks\n",
"\n",
"This notebook shows how to use IFTTT Webhooks.\n",
"\n",
"From https://github.com/SidU/teams-langchain-js/wiki/Connecting-IFTTT-Services.\n",
"\n",
"# Creating a webhook\n",
"- Go to https://ifttt.com/create\n",
"\n",
"# Configuring the \"If This\"\n",
"- Click on the \"If This\" button in the IFTTT interface.\n",
"- Search for \"Webhooks\" in the search bar.\n",
"- Choose the first option for \"Receive a web request with a JSON payload.\"\n",
"- Choose an Event Name that is specific to the service you plan to connect to.\n",
"This will make it easier for you to manage the webhook URL.\n",
"For example, if you're connecting to Spotify, you could use \"Spotify\" as your\n",
"Event Name.\n",
"- Click the \"Create Trigger\" button to save your settings and create your webhook.\n",
"\n",
"# Configuring the \"Then That\"\n",
"- Tap on the \"Then That\" button in the IFTTT interface.\n",
"- Search for the service you want to connect, such as Spotify.\n",
"- Choose an action from the service, such as \"Add track to a playlist\".\n",
"- Configure the action by specifying the necessary details, such as the playlist name,\n",
"e.g., \"Songs from AI\".\n",
"- Reference the JSON Payload received by the Webhook in your action. For the Spotify\n",
"scenario, choose \"{{JsonPayload}}\" as your search query.\n",
"- Tap the \"Create Action\" button to save your action settings.\n",
"- Once you have finished configuring your action, click the \"Finish\" button to\n",
"complete the setup.\n",
"- Congratulations! You have successfully connected the Webhook to the desired\n",
"service, and you're ready to start receiving data and triggering actions 🎉\n",
"\n",
"# Finishing up\n",
"- To get your webhook URL go to https://ifttt.com/maker_webhooks/settings\n",
"- Copy the IFTTT key value from there. The URL is of the form\n",
"https://maker.ifttt.com/use/YOUR_IFTTT_KEY. Grab the YOUR_IFTTT_KEY value.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "10a46e7e",
"metadata": {},
"outputs": [],
"source": [
"from langchain.tools.ifttt import IFTTTWebhook"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "12003d72",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"key = os.environ[\"IFTTTKey\"]\n",
"url = f\"https://maker.ifttt.com/trigger/spotify/json/with/key/{key}\"\n",
"tool = IFTTTWebhook(name=\"Spotify\", description=\"Add a song to spotify playlist\", url=url)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "6e68f846",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Congratulations! You've fired the spotify JSON event\""
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tool.run(\"taylor swift\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7e599c9",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -1,29 +1,35 @@
# Key Concepts
## Python REPL
Sometimes, for complex calculations, rather than have an LLM generate the answer directly,
it can be better to have the LLM generate code to calculate the answer, and then run that code to get the answer.
In order to easily do that, we provide a simple Python REPL to execute commands in.
This interface will only return things that are printed -
therefore, if you want to use it to calculate an answer, make sure to have it print out the answer.
## Bash
It can often be useful to have an LLM generate bash commands, and then run them.
A common use case for this is letting the LLM interact with your local file system.
We provide an easy component to execute bash commands.
## Requests Wrapper
The web contains a lot of information that LLMs do not have access to.
In order to easily let LLMs interact with that information,
we provide a wrapper around the Python Requests module that takes in a URL and fetches data from that URL.
## Google Search
This uses the official Google Search API to look up information on the web.
## SerpAPI
This uses SerpAPI, a third party search API engine, to interact with Google Search.
## Searx Search
This uses the Searx (SearxNG fork) meta search engine API to look up information
on the web. It supports 139 search engines and is easy to self-host,
which makes it a good choice for privacy-conscious users.
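As a quick, hedged sketch of one of these components, here is how the Searx wrapper can be used (the host URL below is an assumption; point it at your own SearxNG instance):

```python
# A minimal sketch, assuming a self-hosted SearxNG instance at this URL and
# that SearxSearchWrapper is importable from langchain.utilities in your version.
from langchain.utilities import SearxSearchWrapper

search = SearxSearchWrapper(searx_host="http://localhost:8888")
print(search.run("What is a large language model?"))
```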

@ -50,6 +50,8 @@ The following use cases require specific installs and api keys:
- _OpenSearch_:
- Install requirements with `pip install opensearch-py`
- If you want to set up OpenSearch locally, see [here](https://opensearch.org/docs/latest/)
- _DeepLake_:
- Install requirements with `pip install deeplake` (see the sketch below)
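For example, once `deeplake` is installed, Deep Lake can be used like any other vector store. A minimal sketch (assumes `OPENAI_API_KEY` is set in the environment; the sample texts are placeholders):

```python
# Build an in-memory Deep Lake vector store from a couple of placeholder texts.
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import DeepLake

db = DeepLake.from_texts(["foo", "bar"], OpenAIEmbeddings())
docs = db.similarity_search("foo")
```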
If you are using the `NLTKTextSplitter` or the `SpacyTextSplitter`, you will also need to install the appropriate models. For example, if you want to use the `SpacyTextSplitter`, you will need to install the `en_core_web_sm` model with `python -m spacy download en_core_web_sm`. Similarly, if you want to use the `NLTKTextSplitter`, you will need to install the `punkt` model with `python -m nltk.downloader punkt`.

@ -1,5 +1,41 @@
# Question Answering
Question answering in this context refers to question answering over your document data.
For question answering over other types of data, like [SQL databases](../modules/chains/examples/sqlite.html) or [APIs](../modules/chains/examples/api.html), please see [here](../modules/chains/utility_how_to.html).
For question answering over many documents, you almost always want to create an index over the data.
This can be used to smartly access the most relevant documents for a given question, allowing you to avoid having to pass all the documents to the LLM (saving you time and money).
See [this notebook](../modules/indexes/getting_started.ipynb) for a more detailed introduction to this, but for a super quick start the steps involved are:
**Load Your Documents**
```python
from langchain.document_loaders import TextLoader
loader = TextLoader('../state_of_the_union.txt')
```
See [here](../modules/document_loaders/how_to_guides.rst) for more information on how to get started with document loading.
**Create Your Index**
```python
from langchain.indexes import VectorstoreIndexCreator
index = VectorstoreIndexCreator().from_loaders([loader])
```
The best and most popular index by far at the moment is the VectorStore index.
**Query Your Index**
```python
query = "What did the president say about Ketanji Brown Jackson"
index.query(query)
```
Alternatively, use `query_with_sources` to also get back the sources involved:
```python
query = "What did the president say about Ketanji Brown Jackson"
index.query_with_sources(query)
```
Again, these high-level interfaces hide a lot of what is going on under the hood, so please see [this notebook](../modules/indexes/getting_started.ipynb) for a lower-level walkthrough.
## Document Question Answering
Question answering involves fetching multiple documents, and then asking a question of them.
The LLM response will contain the answer to your question, based on the content of the documents.
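One way to accomplish this is with a question answering chain. A rough sketch (assuming `docs` holds the fetched documents; the `stuff` chain type simply stuffs all documents into a single prompt):

```python
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import OpenAI

chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")
chain.run(input_documents=docs, question="What did the president say about Ketanji Brown Jackson")
```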
@ -15,7 +51,7 @@ The following resources exist:
- [Question Answering Notebook](/modules/indexes/chain_examples/question_answering.ipynb): A notebook walking through how to accomplish this task.
- [VectorDB Question Answering Notebook](/modules/indexes/chain_examples/vector_db_qa.ipynb): A notebook walking through how to do question answering over a vector database. This can often be useful for when you have a LOT of documents, and you don't want to pass them all to the LLM, but rather first want to do some semantic search over embeddings.
### Adding in sources
## Adding in sources
There is also a variant of this, where in addition to responding with the answer the language model will also cite its sources (e.g., which of the passed-in documents it used).
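A rough sketch of the with-sources variant (assuming `docs` and `query` from the steps above, and that each document carries a `source` field in its metadata):

```python
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.llms import OpenAI

chain = load_qa_with_sources_chain(OpenAI(temperature=0), chain_type="stuff")
chain({"input_documents": docs, "question": query}, return_only_outputs=True)
```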
@ -31,7 +67,7 @@ The following resources exist:
- [QA With Sources Notebook](/modules/indexes/chain_examples/qa_with_sources.ipynb): A notebook walking through how to accomplish this task.
- [VectorDB QA With Sources Notebook](/modules/indexes/chain_examples/vector_db_qa_with_sources.ipynb): A notebook walking through how to do question answering with sources over a vector database. This can often be useful for when you have a LOT of documents, and you don't want to pass them all to the LLM, but rather first want to do some semantic search over embeddings.
### Additional Related Resources
## Additional Related Resources
Additional related resources include:
- [Utilities for working with Documents](/modules/utils/how_to_guides.rst): Guides on how to use several of the utilities which will prove helpful for this task, including Text Splitters (for splitting up long documents) and Embeddings & Vectorstores (useful for the above Vector DB example).

@ -24,13 +24,17 @@ from langchain.chains import (
from langchain.docstore import InMemoryDocstore, Wikipedia
from langchain.llms import (
Anthropic,
Banana,
CerebriumAI,
Cohere,
ForefrontAI,
GooseAI,
HuggingFaceHub,
Modal,
OpenAI,
Petals,
StochasticAI,
Writer,
)
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.prompts import (
@ -67,12 +71,16 @@ __all__ = [
"GoogleSerperAPIWrapper",
"WolframAlphaAPIWrapper",
"Anthropic",
"Banana",
"CerebriumAI",
"Cohere",
"ForefrontAI",
"GooseAI",
"Modal",
"OpenAI",
"Petals",
"StochasticAI",
"Writer",
"BasePromptTemplate",
"Prompt",
"FewShotPromptTemplate",

@ -179,7 +179,7 @@ _EXTRA_OPTIONAL_TOOLS = {
"bing-search": (_get_bing_search, ["bing_subscription_key", "bing_search_url"]),
"google-serper": (_get_google_serper, ["serper_api_key"]),
"serpapi": (_get_serpapi, ["serpapi_api_key", "aiosession"]),
"searx-search": (_get_searx_search, ["searx_host", "searx_host"]),
"searx-search": (_get_searx_search, ["searx_host"]),
}

@ -87,7 +87,7 @@ class SQLAlchemyCache(BaseCache):
prompt=prompt, llm=llm_string, response=generation.text, idx=i
)
with Session(self.engine) as session, session.begin():
session.add(item)
session.merge(item)
class SQLiteCache(SQLAlchemyCache):

@ -101,7 +101,7 @@ class ChatVectorDBChain(Chain, BaseModel):
else:
return {self.output_key: answer}
async def _acall(self, inputs: Dict[str, Any]) -> Dict[str, str]:
async def _acall(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
question = inputs["question"]
chat_history_str = _get_chat_history(inputs["chat_history"])
vectordbkwargs = inputs.get("vectordbkwargs", {})
@ -119,4 +119,7 @@ class ChatVectorDBChain(Chain, BaseModel):
new_inputs["question"] = new_question
new_inputs["chat_history"] = chat_history_str
answer, _ = await self.combine_docs_chain.acombine_docs(docs, **new_inputs)
return {self.output_key: answer}
if self.return_source_documents:
return {self.output_key: answer, "source_documents": docs}
else:
return {self.output_key: answer}

@ -16,7 +16,7 @@ class ConstitutionalChain(Chain):
.. code-block:: python
from langchain.llms import OpenAI
from langchian.chains import LLMChain, ConstitutionalChain
from langchain.chains import LLMChain, ConstitutionalChain
qa_prompt = PromptTemplate(
template="Q: {question} A:",

@ -2,7 +2,7 @@
from langchain.prompts.base import CommaSeparatedListOutputParser
from langchain.prompts.prompt import PromptTemplate
_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer. Unless the user specifies in his question a specific number of examples he wishes to obtain, always limit your query to at most {top_k} results using the LIMIT clause. You can order the results by a relevant column to return the most interesting examples in the database.
_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer. Unless the user specifies in his question a specific number of examples he wishes to obtain, always limit your query to at most {top_k} results. You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the few relevant columns given the question.

@ -3,10 +3,12 @@
from langchain.document_loaders.airbyte_json import AirbyteJSONLoader
from langchain.document_loaders.azlyrics import AZLyricsLoader
from langchain.document_loaders.college_confidential import CollegeConfidentialLoader
from langchain.document_loaders.conllu import CoNLLULoader
from langchain.document_loaders.directory import DirectoryLoader
from langchain.document_loaders.docx import UnstructuredDocxLoader
from langchain.document_loaders.email import UnstructuredEmailLoader
from langchain.document_loaders.evernote import EverNoteLoader
from langchain.document_loaders.facebook_chat import FacebookChatLoader
from langchain.document_loaders.gcs_directory import GCSDirectoryLoader
from langchain.document_loaders.gcs_file import GCSFileLoader
from langchain.document_loaders.gitbook import GitbookLoader
@ -14,7 +16,10 @@ from langchain.document_loaders.googledrive import GoogleDriveLoader
from langchain.document_loaders.gutenberg import GutenbergLoader
from langchain.document_loaders.hn import HNLoader
from langchain.document_loaders.html import UnstructuredHTMLLoader
from langchain.document_loaders.ifixit import IFixitLoader
from langchain.document_loaders.image import UnstructuredImageLoader
from langchain.document_loaders.imsdb import IMSDbLoader
from langchain.document_loaders.notebook import NotebookLoader
from langchain.document_loaders.notion import NotionDirectoryLoader
from langchain.document_loaders.obsidian import ObsidianLoader
from langchain.document_loaders.online_pdf import OnlinePDFLoader
@ -34,6 +39,7 @@ from langchain.document_loaders.unstructured import (
)
from langchain.document_loaders.url import UnstructuredURLLoader
from langchain.document_loaders.web_base import WebBaseLoader
from langchain.document_loaders.word_document import UnstructuredWordDocumentLoader
from langchain.document_loaders.youtube import YoutubeLoader
__all__ = [
@ -46,7 +52,9 @@ __all__ = [
"GoogleDriveLoader",
"UnstructuredHTMLLoader",
"UnstructuredPowerPointLoader",
"UnstructuredWordDocumentLoader",
"UnstructuredPDFLoader",
"UnstructuredImageLoader",
"ObsidianLoader",
"UnstructuredDocxLoader",
"UnstructuredEmailLoader",
@ -63,6 +71,7 @@ __all__ = [
"IMSDbLoader",
"AZLyricsLoader",
"CollegeConfidentialLoader",
"IFixitLoader",
"GutenbergLoader",
"PagedPDFSplitter",
"EverNoteLoader",
@ -71,4 +80,7 @@ __all__ = [
"PDFMinerLoader",
"TelegramChatLoader",
"SRTLoader",
"FacebookChatLoader",
"NotebookLoader",
"CoNLLULoader",
]

@ -0,0 +1,33 @@
"""Load CoNLL-U files."""
import csv
from typing import List
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
class CoNLLULoader(BaseLoader):
"""Load CoNLL-U files."""
def __init__(self, file_path: str):
"""Initialize with file path."""
self.file_path = file_path
def load(self) -> List[Document]:
"""Load from file path."""
with open(self.file_path, encoding="utf8") as f:
tsv = list(csv.reader(f, delimiter="\t"))
# If len(line) > 1, the line is not a comment
lines = [line for line in tsv if len(line) > 1]
text = ""
for i, line in enumerate(lines):
# Do not add a space after a punctuation mark or at the end of the sentence
if line[9] == "SpaceAfter=No" or i == len(lines) - 1:
text += line[1]
else:
text += line[1] + " "
metadata = {"source": self.file_path}
return [Document(page_content=text, metadata=metadata)]

@ -0,0 +1,57 @@
"""Loader that loads Facebook chat json dump."""
import datetime
import json
from pathlib import Path
from typing import List
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
def concatenate_rows(row: dict) -> str:
"""Combine message information in a readable format ready to be used."""
sender = row["sender_name"]
text = row["content"]
date = datetime.datetime.fromtimestamp(row["timestamp_ms"] / 1000).strftime(
"%Y-%m-%d %H:%M:%S"
)
return f"{sender} on {date}: {text}\n\n"
class FacebookChatLoader(BaseLoader):
"""Loader that loads Facebook messages json directory dump."""
def __init__(self, path: str):
"""Initialize with path."""
self.file_path = path
def load(self) -> List[Document]:
"""Load documents."""
try:
import pandas as pd
except ImportError:
raise ValueError(
"pandas is needed for Facebook chat loader, "
"please install with `pip install pandas`"
)
p = Path(self.file_path)
with open(p, encoding="utf8") as f:
d = json.load(f)
normalized_messages = pd.json_normalize(d["messages"])
df_normalized_messages = pd.DataFrame(normalized_messages)
# Only keep plain text messages
# (no services, nor links, hashtags, code, bold ...)
df_filtered = df_normalized_messages[
(df_normalized_messages.content.apply(lambda x: type(x) == str))
]
df_filtered = df_filtered[["timestamp_ms", "content", "sender_name"]]
text = df_filtered.apply(concatenate_rows, axis=1).str.cat(sep="")
metadata = {"source": str(p)}
return [Document(page_content=text, metadata=metadata)]

@ -0,0 +1,202 @@
"""Loader that loads iFixit data."""
from typing import List, Optional
import requests
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
from langchain.document_loaders.web_base import WebBaseLoader
IFIXIT_BASE_URL = "https://www.ifixit.com/api/2.0"
class IFixitLoader(BaseLoader):
"""Load iFixit repair guides, device wikis and answers.
iFixit is the largest open repair community on the web. The site contains nearly
100k repair manuals, 200k Questions & Answers on 42k devices, and all the data is
licensed under CC-BY.
This loader will allow you to download the text of a repair guide, text of Q&As
and wikis from devices on iFixit using their open APIs and web scraping.
"""
def __init__(self, web_path: str):
"""Initialize with web path."""
if not web_path.startswith("https://www.ifixit.com"):
raise ValueError("web path must start with 'https://www.ifixit.com'")
path = web_path.replace("https://www.ifixit.com", "")
allowed_paths = ["/Device", "/Guide", "/Answers", "/Teardown"]
""" TODO: Add /Wiki """
if not any(path.startswith(allowed_path) for allowed_path in allowed_paths):
raise ValueError(
"web path must start with /Device, /Guide, /Teardown or /Answers"
)
pieces = [x for x in path.split("/") if x]
"""Teardowns are just guides by a different name"""
self.page_type = pieces[0] if pieces[0] != "Teardown" else "Guide"
if self.page_type == "Guide" or self.page_type == "Answers":
self.id = pieces[2]
else:
self.id = pieces[1]
self.web_path = web_path
def load(self) -> List[Document]:
if self.page_type == "Device":
return self.load_device()
elif self.page_type == "Guide" or self.page_type == "Teardown":
return self.load_guide()
elif self.page_type == "Answers":
return self.load_questions_and_answers()
else:
raise ValueError("Unknown page type: " + self.page_type)
@staticmethod
def load_suggestions(query: str = "", doc_type: str = "all") -> List[Document]:
res = requests.get(
IFIXIT_BASE_URL + "/suggest/" + query + "?doctypes=" + doc_type
)
if res.status_code != 200:
raise ValueError(
'Could not load suggestions for "' + query + '"\n' + str(res.json())
)
data = res.json()
results = data["results"]
output = []
for result in results:
try:
loader = IFixitLoader(result["url"])
if loader.page_type == "Device":
output += loader.load_device(include_guides=False)
else:
output += loader.load()
except ValueError:
continue
return output
def load_questions_and_answers(
self, url_override: Optional[str] = None
) -> List[Document]:
loader = WebBaseLoader(self.web_path if url_override is None else url_override)
soup = loader.scrape()
output = []
title = soup.find("h1", "post-title").text
output.append("# " + title)
output.append(soup.select_one(".post-content .post-text").text.strip())
output.append("\n## " + soup.find("div", "post-answers-header").text.strip())
for answer in soup.select(".js-answers-list .post.post-answer"):
if answer.has_attr("itemprop") and "acceptedAnswer" in answer["itemprop"]:
output.append("\n### Accepted Answer")
elif "post-helpful" in answer["class"]:
output.append("\n### Most Helpful Answer")
else:
output.append("\n### Other Answer")
output += [
a.text.strip() for a in answer.select(".post-content .post-text")
]
output.append("\n")
text = "\n".join(output).strip()
metadata = {"source": self.web_path, "title": title}
return [Document(page_content=text, metadata=metadata)]
def load_device(
self, url_override: Optional[str] = None, include_guides: bool = True
) -> List[Document]:
documents = []
if url_override is None:
url = IFIXIT_BASE_URL + "/wikis/CATEGORY/" + self.id
else:
url = url_override
res = requests.get(url)
data = res.json()
text = "\n".join(
[
data[key]
for key in ["title", "description", "contents_raw"]
if key in data
]
).strip()
metadata = {"source": self.web_path, "title": data["title"]}
documents.append(Document(page_content=text, metadata=metadata))
if include_guides:
"""Load and return documents for each guide linked to from the device"""
guide_urls = [guide["url"] for guide in data["guides"]]
for guide_url in guide_urls:
documents.append(IFixitLoader(guide_url).load()[0])
return documents
def load_guide(self, url_override: Optional[str] = None) -> List[Document]:
if url_override is None:
url = IFIXIT_BASE_URL + "/guides/" + self.id
else:
url = url_override
res = requests.get(url)
if res.status_code != 200:
raise ValueError(
"Could not load guide: " + self.web_path + "\n" + res.json()
)
data = res.json()
doc_parts = ["# " + data["title"], data["introduction_raw"]]
doc_parts.append("\n\n###Tools Required:")
if len(data["tools"]) == 0:
doc_parts.append("\n - None")
else:
for tool in data["tools"]:
doc_parts.append("\n - " + tool["text"])
doc_parts.append("\n\n###Parts Required:")
if len(data["parts"]) == 0:
doc_parts.append("\n - None")
else:
for part in data["parts"]:
doc_parts.append("\n - " + part["text"])
for row in data["steps"]:
doc_parts.append(
"\n\n## "
+ (
row["title"]
if row["title"] != ""
else "Step {}".format(row["orderby"])
)
)
for line in row["lines"]:
doc_parts.append(line["text_raw"])
doc_parts.append(data["conclusion_raw"])
text = "\n".join(doc_parts)
metadata = {"source": self.web_path, "title": data["title"]}
return [Document(page_content=text, metadata=metadata)]

@ -0,0 +1,13 @@
"""Loader that loads image files."""
from typing import List
from langchain.document_loaders.unstructured import UnstructuredFileLoader
class UnstructuredImageLoader(UnstructuredFileLoader):
"""Loader that uses unstructured to load image files, such as PNGs and JPGs."""
def _get_elements(self) -> List:
from unstructured.partition.image import partition_image
return partition_image(filename=self.file_path)

@ -0,0 +1,109 @@
"""Loader that loads .ipynb notebook files."""
import json
from pathlib import Path
from typing import Any, List
from langchain.docstore.document import Document
from langchain.document_loaders.base import BaseLoader
def concatenate_cells(
cell: dict, include_outputs: bool, max_output_length: int, traceback: bool
) -> str:
"""Combine cells information in a readable format ready to be used."""
cell_type = cell["cell_type"]
source = cell["source"]
output = cell["outputs"]
if include_outputs and cell_type == "code" and output:
if "ename" in output[0].keys():
error_name = output[0]["ename"]
error_value = output[0]["evalue"]
if traceback:
traceback = output[0]["traceback"]
return (
f"'{cell_type}' cell: '{source}'\n, gives error '{error_name}',"
f" with description '{error_value}'\n"
f"and traceback '{traceback}'\n\n"
)
else:
return (
f"'{cell_type}' cell: '{source}'\n, gives error '{error_name}',"
f"with description '{error_value}'\n\n"
)
elif output[0]["output_type"] == "stream":
output = output[0]["text"]
min_output = min(max_output_length, len(output))
return (
f"'{cell_type}' cell: '{source}'\n with "
f"output: '{output[:min_output]}'\n\n"
)
else:
return f"'{cell_type}' cell: '{source}'\n\n"
return ""
def remove_newlines(x: Any) -> Any:
"""Remove recursively newlines, no matter the data structure they are stored in."""
import pandas as pd
if isinstance(x, str):
return x.replace("\n", "")
elif isinstance(x, list):
return [remove_newlines(elem) for elem in x]
elif isinstance(x, pd.DataFrame):
return x.applymap(remove_newlines)
else:
return x
class NotebookLoader(BaseLoader):
"""Loader that loads .ipynb notebook files."""
def __init__(
self,
path: str,
include_outputs: bool = False,
max_output_length: int = 10,
remove_newline: bool = False,
traceback: bool = False,
):
"""Initialize with path."""
self.file_path = path
self.include_outputs = include_outputs
self.max_output_length = max_output_length
self.remove_newline = remove_newline
self.traceback = traceback
def load(
self,
) -> List[Document]:
"""Load documents."""
try:
import pandas as pd
except ImportError:
raise ValueError(
"pandas is needed for Notebook Loader, "
"please install with `pip install pandas`"
)
p = Path(self.file_path)
with open(p, encoding="utf8") as f:
d = json.load(f)
data = pd.json_normalize(d["cells"])
filtered_data = data[["cell_type", "source", "outputs"]]
if self.remove_newline:
filtered_data = filtered_data.applymap(remove_newlines)
text = filtered_data.apply(
lambda x: concatenate_cells(
x, self.include_outputs, self.max_output_length, self.traceback
),
axis=1,
).str.cat(sep=" ")
metadata = {"source": str(p)}
return [Document(page_content=text, metadata=metadata)]

@ -0,0 +1,43 @@
"""Loader that loads word documents."""
import os
from typing import List
from langchain.document_loaders.unstructured import UnstructuredFileLoader
class UnstructuredWordDocumentLoader(UnstructuredFileLoader):
"""Loader that uses unstructured to load word documents."""
def _get_elements(self) -> List:
from unstructured.__version__ import __version__ as __unstructured_version__
from unstructured.file_utils.filetype import FileType, detect_filetype
unstructured_version = tuple(
[int(x) for x in __unstructured_version__.split(".")]
)
# NOTE(MthwRobinson) - magic will raise an import error if the libmagic
# system dependency isn't installed. If it's not installed, we'll just
# check the file extension
try:
import magic # noqa: F401
is_doc = detect_filetype(self.file_path) == FileType.DOC
except ImportError:
_, extension = os.path.splitext(self.file_path)
is_doc = extension == ".doc"
if is_doc and unstructured_version < (0, 4, 11):
raise ValueError(
f"You are on unstructured version {__unstructured_version__}. "
"Partitioning .doc files is only supported in unstructured>=0.4.11. "
"Please upgrade the unstructured package and try again."
)
if is_doc:
from unstructured.partition.doc import partition_doc
return partition_doc(filename=self.file_path)
else:
from unstructured.partition.docx import partition_docx
return partition_docx(filename=self.file_path)

@ -10,10 +10,13 @@ from langchain.document_loaders.base import BaseLoader
class YoutubeLoader(BaseLoader):
"""Loader that loads Youtube transcripts."""
def __init__(self, video_id: str, add_video_info: bool = False):
def __init__(
self, video_id: str, add_video_info: bool = False, language: str = "en"
):
"""Initialize with YouTube video ID."""
self.video_id = video_id
self.add_video_info = add_video_info
self.language = language
@classmethod
def from_youtube_url(cls, youtube_url: str, **kwargs: Any) -> YoutubeLoader:
@ -39,7 +42,9 @@ class YoutubeLoader(BaseLoader):
video_info = self._get_video_info()
metadata.update(video_info)
transcript_pieces = YouTubeTranscriptApi.get_transcript(self.video_id)
transcript_pieces = YouTubeTranscriptApi.get_transcript(
self.video_id, languages=(self.language,)
)
transcript = " ".join([t["text"].strip(" ") for t in transcript_pieces])
return [Document(page_content=transcript, metadata=metadata)]

@ -25,7 +25,7 @@ class CohereEmbeddings(BaseModel, Embeddings):
model: str = "large"
"""Model name to use."""
truncate: str = "NONE"
truncate: Optional[str] = None
"""Truncate embeddings that are too long from start or end ("NONE"|"START"|"END")"""
cohere_api_key: Optional[str] = None

@ -1,4 +1,5 @@
"""All index utils."""
from langchain.indexes.graph import GraphIndexCreator
from langchain.indexes.vectorstore import VectorstoreIndexCreator
__all__ = ["GraphIndexCreator"]
__all__ = ["GraphIndexCreator", "VectorstoreIndexCreator"]

@ -0,0 +1,69 @@
from typing import Any, List, Optional, Type
from pydantic import BaseModel, Extra, Field
from langchain.chains.qa_with_sources.vector_db import VectorDBQAWithSourcesChain
from langchain.chains.vector_db_qa.base import VectorDBQA
from langchain.document_loaders.base import BaseLoader
from langchain.embeddings.base import Embeddings
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms.base import BaseLLM
from langchain.llms.openai import OpenAI
from langchain.text_splitter import RecursiveCharacterTextSplitter, TextSplitter
from langchain.vectorstores.base import VectorStore
from langchain.vectorstores.chroma import Chroma
def _get_default_text_splitter() -> TextSplitter:
return RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
class VectorStoreIndexWrapper(BaseModel):
"""Wrapper around a vectorstore for easy access."""
vectorstore: VectorStore
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
def query(self, question: str, llm: Optional[BaseLLM] = None, **kwargs: Any) -> str:
"""Query the vectorstore."""
llm = llm or OpenAI(temperature=0)
chain = VectorDBQA.from_chain_type(llm, vectorstore=self.vectorstore, **kwargs)
return chain.run(question)
def query_with_sources(
self, question: str, llm: Optional[BaseLLM] = None, **kwargs: Any
) -> dict:
"""Query the vectorstore and get back sources."""
llm = llm or OpenAI(temperature=0)
chain = VectorDBQAWithSourcesChain.from_chain_type(
llm, vectorstore=self.vectorstore, **kwargs
)
return chain({chain.question_key: question})
class VectorstoreIndexCreator(BaseModel):
"""Logic for creating indexes."""
vectorstore_cls: Type[VectorStore] = Chroma
embedding: Embeddings = Field(default_factory=OpenAIEmbeddings)
text_splitter: TextSplitter = Field(default_factory=_get_default_text_splitter)
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
arbitrary_types_allowed = True
def from_loaders(self, loaders: List[BaseLoader]) -> VectorStoreIndexWrapper:
"""Create a vectorstore index from loaders."""
docs = []
for loader in loaders:
docs.extend(loader.load())
sub_docs = self.text_splitter.split_documents(docs)
vectorstore = self.vectorstore_cls.from_documents(sub_docs, self.embedding)
return VectorStoreIndexWrapper(vectorstore=vectorstore)
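The two classes above compose as follows; a minimal sketch, assuming a local text file, an OpenAI API key in the environment, and Chroma installed:

```python
from langchain.document_loaders import TextLoader
from langchain.indexes import VectorstoreIndexCreator

# "state_of_the_union.txt" is a placeholder document.
loader = TextLoader("state_of_the_union.txt")
index = VectorstoreIndexCreator().from_loaders([loader])
print(index.query("What is this document about?"))
```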

@@ -2,28 +2,38 @@
from typing import Dict, Type
from langchain.llms.ai21 import AI21
from langchain.llms.aleph_alpha import AlephAlpha
from langchain.llms.anthropic import Anthropic
from langchain.llms.bananadev import Banana
from langchain.llms.base import BaseLLM
from langchain.llms.cerebriumai import CerebriumAI
from langchain.llms.cohere import Cohere
from langchain.llms.deepinfra import DeepInfra
from langchain.llms.forefrontai import ForefrontAI
from langchain.llms.gooseai import GooseAI
from langchain.llms.huggingface_endpoint import HuggingFaceEndpoint
from langchain.llms.huggingface_hub import HuggingFaceHub
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.llms.modal import Modal
from langchain.llms.nlpcloud import NLPCloud
from langchain.llms.openai import AzureOpenAI, OpenAI
from langchain.llms.petals import Petals
from langchain.llms.promptlayer_openai import PromptLayerOpenAI
from langchain.llms.self_hosted import SelfHostedPipeline
from langchain.llms.self_hosted_hugging_face import SelfHostedHuggingFaceLLM
from langchain.llms.stochasticai import StochasticAI
from langchain.llms.writer import Writer
__all__ = [
"Anthropic",
"AlephAlpha",
"Banana",
"CerebriumAI",
"Cohere",
"DeepInfra",
"ForefrontAI",
"GooseAI",
"Modal",
"NLPCloud",
"OpenAI",
"Petals",
@@ -35,17 +45,23 @@ __all__ = [
"SelfHostedPipeline",
"SelfHostedHuggingFaceLLM",
"PromptLayerOpenAI",
"StochasticAI",
"Writer",
]
type_to_cls_dict: Dict[str, Type[BaseLLM]] = {
"ai21": AI21,
"aleph_alpha": AlephAlpha,
"anthropic": Anthropic,
"bananadev": Banana,
"cerebriumai": CerebriumAI,
"cohere": Cohere,
"deepinfra": DeepInfra,
"forefrontai": ForefrontAI,
"gooseai": GooseAI,
"huggingface_hub": HuggingFaceHub,
"huggingface_endpoint": HuggingFaceEndpoint,
"modal": Modal,
"nlpcloud": NLPCloud,
"openai": OpenAI,
"petals": Petals,
@@ -53,4 +69,6 @@ type_to_cls_dict: Dict[str, Type[BaseLLM]] = {
"azure": AzureOpenAI,
"self_hosted": SelfHostedPipeline,
"self_hosted_hugging_face": SelfHostedHuggingFaceLLM,
"stochasticai": StochasticAI,
"writer": Writer,
}
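The registry above lets an LLM class be looked up by its `_llm_type` string; a short sketch, assuming `OPENAI_API_KEY` is set:

```python
from langchain.llms import type_to_cls_dict

# Resolve a provider by name and instantiate it.
llm_cls = type_to_cls_dict["openai"]
llm = llm_cls(temperature=0)
```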

@@ -0,0 +1,236 @@
"""Wrapper around Aleph Alpha APIs."""
from typing import Any, Dict, List, Optional, Sequence
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env
class AlephAlpha(LLM, BaseModel):
"""Wrapper around Aleph Alpha large language models.
To use, you should have the ``aleph_alpha_client`` python package installed, and the
environment variable ``ALEPH_ALPHA_API_KEY`` set with your API key, or pass
it as a named parameter to the constructor.
Parameters are explained more in depth here:
https://github.com/Aleph-Alpha/aleph-alpha-client/blob/c14b7dd2b4325c7da0d6a119f6e76385800e097b/aleph_alpha_client/completion.py#L10
Example:
.. code-block:: python
from langchain.llms import AlephAlpha
aleph_alpha = AlephAlpha(aleph_alpha_api_key="my-api-key")
"""
client: Any #: :meta private:
model: Optional[str] = "luminous-base"
"""Model name to use."""
maximum_tokens: int = 64
"""The maximum number of tokens to be generated."""
temperature: float = 0.0
"""A non-negative float that tunes the degree of randomness in generation."""
top_k: int = 0
"""Number of most likely tokens to consider at each step."""
top_p: float = 0.0
"""Total probability mass of tokens to consider at each step."""
presence_penalty: float = 0.0
"""Penalizes repeated tokens."""
frequency_penalty: float = 0.0
"""Penalizes repeated tokens according to frequency."""
repetition_penalties_include_prompt: Optional[bool] = False
"""Flag deciding whether presence penalty or frequency penalty are
updated from the prompt."""
use_multiplicative_presence_penalty: Optional[bool] = False
"""Flag deciding whether presence penalty is applied
multiplicatively (True) or additively (False)."""
penalty_bias: Optional[str] = None
"""Penalty bias for the completion."""
penalty_exceptions: Optional[List[str]] = None
"""List of strings that may be generated without penalty,
regardless of other penalty settings"""
penalty_exceptions_include_stop_sequences: Optional[bool] = None
"""Should stop_sequences be included in penalty_exceptions."""
best_of: Optional[int] = None
"""returns the one with the "best of" results
(highest log probability per token)
"""
n: int = 1
"""How many completions to generate for each prompt."""
logit_bias: Optional[Dict[int, float]] = None
"""The logit bias allows to influence the likelihood of generating tokens."""
log_probs: Optional[int] = None
"""Number of top log probabilities to be returned for each generated token."""
tokens: Optional[bool] = False
"""return tokens of completion."""
disable_optimizations: Optional[bool] = False
minimum_tokens: Optional[int] = 0
"""Generate at least this number of tokens."""
echo: bool = False
"""Echo the prompt in the completion."""
use_multiplicative_frequency_penalty: bool = False
sequence_penalty: float = 0.0
sequence_penalty_min_length: int = 2
use_multiplicative_sequence_penalty: bool = False
completion_bias_inclusion: Optional[Sequence[str]] = None
completion_bias_inclusion_first_token_only: bool = False
completion_bias_exclusion: Optional[Sequence[str]] = None
completion_bias_exclusion_first_token_only: bool = False
"""Only consider the first token for the completion_bias_exclusion."""
contextual_control_threshold: Optional[float] = None
"""If set to None, attention control parameters only apply to those tokens that have
explicitly been set in the request.
If set to a non-None value, control parameters are also applied to similar tokens.
"""
control_log_additive: Optional[bool] = True
"""True: apply control by adding the log(control_factor) to attention scores.
False: (attention_scores - attention_scores.min(-1)) * control_factor
"""
repetition_penalties_include_completion: bool = True
"""Flag deciding whether presence penalty or frequency penalty
are updated from the completion."""
raw_completion: bool = False
"""Force the raw completion of the model to be returned."""
aleph_alpha_api_key: Optional[str] = None
"""API key for Aleph Alpha API."""
stop_sequences: Optional[List[str]] = None
"""Stop sequences to use."""
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
aleph_alpha_api_key = get_from_dict_or_env(
values, "aleph_alpha_api_key", "ALEPH_ALPHA_API_KEY"
)
try:
import aleph_alpha_client
values["client"] = aleph_alpha_client.Client(token=aleph_alpha_api_key)
except ImportError:
raise ValueError(
"Could not import aleph_alpha_client python package. "
"Please it install it with `pip install aleph_alpha_client`."
)
return values
@property
def _default_params(self) -> Dict[str, Any]:
"""Get the default parameters for calling the Aleph Alpha API."""
return {
"maximum_tokens": self.maximum_tokens,
"temperature": self.temperature,
"top_k": self.top_k,
"top_p": self.top_p,
"presence_penalty": self.presence_penalty,
"frequency_penalty": self.frequency_penalty,
"n": self.n,
"repetition_penalties_include_prompt": self.repetition_penalties_include_prompt, # noqa: E501
"use_multiplicative_presence_penalty": self.use_multiplicative_presence_penalty, # noqa: E501
"penalty_bias": self.penalty_bias,
"penalty_exceptions": self.penalty_exceptions,
"penalty_exceptions_include_stop_sequences": self.penalty_exceptions_include_stop_sequences, # noqa: E501
"best_of": self.best_of,
"logit_bias": self.logit_bias,
"log_probs": self.log_probs,
"tokens": self.tokens,
"disable_optimizations": self.disable_optimizations,
"minimum_tokens": self.minimum_tokens,
"echo": self.echo,
"use_multiplicative_frequency_penalty": self.use_multiplicative_frequency_penalty, # noqa: E501
"sequence_penalty": self.sequence_penalty,
"sequence_penalty_min_length": self.sequence_penalty_min_length,
"use_multiplicative_sequence_penalty": self.use_multiplicative_sequence_penalty, # noqa: E501
"completion_bias_inclusion": self.completion_bias_inclusion,
"completion_bias_inclusion_first_token_only": self.completion_bias_inclusion_first_token_only, # noqa: E501
"completion_bias_exclusion": self.completion_bias_exclusion,
"completion_bias_exclusion_first_token_only": self.completion_bias_exclusion_first_token_only, # noqa: E501
"contextual_control_threshold": self.contextual_control_threshold,
"control_log_additive": self.control_log_additive,
"repetition_penalties_include_completion": self.repetition_penalties_include_completion, # noqa: E501
"raw_completion": self.raw_completion,
}
@property
def _identifying_params(self) -> Dict[str, Any]:
"""Get the identifying parameters."""
return {**{"model": self.model}, **self._default_params}
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "alpeh_alpha"
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
"""Call out to Aleph Alpha's completion endpoint.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
Returns:
The string generated by the model.
Example:
.. code-block:: python
response = aleph_alpha("Tell me a joke.")
"""
from aleph_alpha_client import CompletionRequest, Prompt
params = self._default_params
if self.stop_sequences is not None and stop is not None:
raise ValueError(
"stop sequences found in both the input and default params."
)
elif self.stop_sequences is not None:
params["stop_sequences"] = self.stop_sequences
else:
params["stop_sequences"] = stop
request = CompletionRequest(prompt=Prompt.from_text(prompt), **params)
response = self.client.complete(model=self.model, request=request)
text = response.completions[0].completion
# If stop tokens are provided, Aleph Alpha's endpoint returns them.
# In order to make this consistent with other endpoints, we strip them.
if stop is not None or self.stop_sequences is not None:
text = enforce_stop_tokens(text, params["stop_sequences"])
return text
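A usage sketch for the wrapper above, assuming `aleph_alpha_client` is installed and `ALEPH_ALPHA_API_KEY` is set:

```python
from langchain.llms import AlephAlpha

llm = AlephAlpha(model="luminous-base", maximum_tokens=64, stop_sequences=["Q:"])
print(llm("Q: What is the capital of France?\nA:"))
```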

@@ -18,7 +18,7 @@ class Anthropic(LLM, BaseModel):
Example:
.. code-block:: python
import anthropic
from langchain import Anthropic
from langchain.llms import Anthropic
model = Anthropic(model="<model_name>", anthropic_api_key="my-api-key")
# Simplest invocation, automatically wrapped with HUMAN_PROMPT

@@ -0,0 +1,117 @@
"""Wrapper around Banana API."""
import logging
from typing import Any, Dict, List, Mapping, Optional
from pydantic import BaseModel, Extra, Field, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env
logger = logging.getLogger(__name__)
class Banana(LLM, BaseModel):
"""Wrapper around Banana large language models.
To use, you should have the ``banana-dev`` python package installed,
and the environment variable ``BANANA_API_KEY`` set with your API key.
Any parameters that are valid to be passed to the call can be passed
in, even if not explicitly saved on this class.
Example:
.. code-block:: python
from langchain.llms import Banana
banana = Banana(model_key="")
"""
model_key: str = ""
"""model endpoint to use"""
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
"""Holds any model parameters valid for `create` call not
explicitly specified."""
banana_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic config."""
extra = Extra.forbid
@root_validator(pre=True)
def build_extra(cls, values: Dict[str, Any]) -> Dict[str, Any]:
"""Build extra kwargs from additional params that were passed in."""
all_required_field_names = {field.alias for field in cls.__fields__.values()}
extra = values.get("model_kwargs", {})
for field_name in list(values):
if field_name not in all_required_field_names:
if field_name in extra:
raise ValueError(f"Found {field_name} supplied twice.")
logger.warning(
f"""{field_name} was transfered to model_kwargs.
Please confirm that {field_name} is what you intended."""
)
extra[field_name] = values.pop(field_name)
values["model_kwargs"] = extra
return values
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
banana_api_key = get_from_dict_or_env(
values, "banana_api_key", "BANANA_API_KEY"
)
values["banana_api_key"] = banana_api_key
return values
@property
def _identifying_params(self) -> Mapping[str, Any]:
"""Get the identifying parameters."""
return {
**{"model_key": self.model_key},
**{"model_kwargs": self.model_kwargs},
}
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "banana"
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
"""Call to Banana endpoint."""
try:
import banana_dev as banana
except ImportError:
raise ValueError(
"Could not import banana-dev python package. "
"Please install it with `pip install banana-dev`."
)
params = self.model_kwargs or {}
api_key = self.banana_api_key
model_key = self.model_key
model_inputs = {
# A JSON payload specific to your model.
"prompt": prompt,
**params,
}
response = banana.run(api_key, model_key, model_inputs)
try:
text = response["modelOutputs"][0]["output"]
except (KeyError, TypeError):
returned = response["modelOutputs"][0]
raise ValueError(
"Response should be of schema: {'output': 'text'}."
f"\nResponse was: {returned}"
"\nTo fix this:"
"\n- fork the source repo of the Banana model"
"\n- modify app.py to return the above schema"
"\n- deploy that as a custom repo"
)
if stop is not None:
# I believe this is required since the stop tokens
# are not enforced by the model parameters
text = enforce_stop_tokens(text, stop)
return text
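A sketch of calling a deployed Banana model; the model key is a placeholder, and any extra keyword arguments are swept into `model_kwargs` by `build_extra` above:

```python
from langchain.llms import Banana

# "your-model-key" is a placeholder; temperature is not a declared field,
# so build_extra moves it into model_kwargs.
llm = Banana(model_key="your-model-key", temperature=0.9)
print(llm("Tell me a joke."))
```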

@@ -22,7 +22,7 @@ class CerebriumAI(LLM, BaseModel):
Example:
.. code-block:: python
from langchain import CerebriumAI
from langchain.llms import CerebriumAI
cerebrium = CerebriumAI(endpoint_url="")
"""

@@ -21,7 +21,7 @@ class Cohere(LLM, BaseModel):
Example:
.. code-block:: python
from langchain import Cohere
from langchain.llms import Cohere
cohere = Cohere(model="gptd-instruct-tft", cohere_api_key="my-api-key")
"""
@@ -47,6 +47,10 @@ class Cohere(LLM, BaseModel):
presence_penalty: int = 0
"""Penalizes repeated tokens."""
truncate: Optional[str] = None
"""Specify how the client handles inputs longer than the maximum token
length: Truncate from START, END or NONE"""
cohere_api_key: Optional[str] = None
stop: Optional[List[str]] = None
@@ -83,6 +87,7 @@ class Cohere(LLM, BaseModel):
"p": self.p,
"frequency_penalty": self.frequency_penalty,
"presence_penalty": self.presence_penalty,
"truncate": self.truncate,
}
@property

@@ -0,0 +1,97 @@
"""Wrapper around DeepInfra APIs."""
from typing import Any, Dict, List, Mapping, Optional
import requests
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env
DEFAULT_MODEL_ID = "google/flan-t5-xl"
class DeepInfra(LLM, BaseModel):
"""Wrapper around DeepInfra deployed models.
To use, you should have the ``requests`` python package installed, and the
environment variable ``DEEPINFRA_API_TOKEN`` set with your API token, or pass
it as a named parameter to the constructor.
Only supports `text-generation` and `text2text-generation` for now.
Example:
.. code-block:: python
from langchain.llms import DeepInfra
di = DeepInfra(model_id="google/flan-t5-xl",
deepinfra_api_token="my-api-key")
"""
model_id: str = DEFAULT_MODEL_ID
model_kwargs: Optional[dict] = None
deepinfra_api_token: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
deepinfra_api_token = get_from_dict_or_env(
values, "deepinfra_api_token", "DEEPINFRA_API_TOKEN"
)
values["deepinfra_api_token"] = deepinfra_api_token
return values
@property
def _identifying_params(self) -> Mapping[str, Any]:
"""Get the identifying parameters."""
return {
**{"model_id": self.model_id},
**{"model_kwargs": self.model_kwargs},
}
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "deepinfra"
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
"""Call out to DeepInfra's inference API endpoint.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
Returns:
The string generated by the model.
Example:
.. code-block:: python
response = di("Tell me a joke.")
"""
_model_kwargs = self.model_kwargs or {}
res = requests.post(
f"https://api.deepinfra.com/v1/inference/{self.model_id}",
headers={
"Authorization": f"bearer {self.deepinfra_api_token}",
"Content-Type": "application/json",
},
json={"input": prompt, **_model_kwargs},
)
if res.status_code != 200:
raise ValueError("Error raised by inference API")
text = res.json()[0]["generated_text"]
if stop is not None:
# I believe this is required since the stop tokens
# are not enforced by the model parameters
text = enforce_stop_tokens(text, stop)
return text
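A usage sketch, assuming `DEEPINFRA_API_TOKEN` is set in the environment:

```python
from langchain.llms import DeepInfra

# model_id defaults to google/flan-t5-xl; shown here for clarity.
llm = DeepInfra(model_id="google/flan-t5-xl")
print(llm("What is 2 + 2?"))
```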

@@ -18,7 +18,7 @@ class ForefrontAI(LLM, BaseModel):
Example:
.. code-block:: python
from langchain import ForefrontAI
from langchain.llms import ForefrontAI
forefrontai = ForefrontAI(endpoint_url="")
"""

@@ -21,7 +21,7 @@ class GooseAI(LLM, BaseModel):
Example:
.. code-block:: python
from langchain import GooseAI
from langchain.llms import GooseAI
gooseai = GooseAI(model_name="gpt-neo-20b")
"""

@@ -23,7 +23,7 @@ class HuggingFaceHub(LLM, BaseModel):
Example:
.. code-block:: python
from langchain import HuggingFaceHub
from langchain.llms import HuggingFaceHub
hf = HuggingFaceHub(repo_id="gpt2", huggingfacehub_api_token="my-api-key")
"""

@@ -25,14 +25,14 @@ class HuggingFacePipeline(LLM, BaseModel):
Example using from_model_id:
.. code-block:: python
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.llms import HuggingFacePipeline
hf = HuggingFacePipeline.from_model_id(
model_id="gpt2", task="text-generation"
)
Example passing pipeline in directly:
.. code-block:: python
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.llms import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
model_id = "gpt2"

@@ -0,0 +1,92 @@
"""Wrapper around Modal API."""
import logging
from typing import Any, Dict, List, Mapping, Optional
import requests
from pydantic import BaseModel, Extra, Field, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
logger = logging.getLogger(__name__)
class Modal(LLM, BaseModel):
"""Wrapper around Modal large language models.
To use, you should have the ``modal-client`` python package installed.
Any parameters that are valid to be passed to the call can be passed
in, even if not explicitly saved on this class.
Example:
.. code-block:: python
from langchain.llms import Modal
modal = Modal(endpoint_url="")
"""
endpoint_url: str = ""
"""model endpoint to use"""
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
"""Holds any model parameters valid for `create` call not
explicitly specified."""
class Config:
"""Configuration for this pydantic config."""
extra = Extra.forbid
@root_validator(pre=True)
def build_extra(cls, values: Dict[str, Any]) -> Dict[str, Any]:
"""Build extra kwargs from additional params that were passed in."""
all_required_field_names = {field.alias for field in cls.__fields__.values()}
extra = values.get("model_kwargs", {})
for field_name in list(values):
if field_name not in all_required_field_names:
if field_name in extra:
raise ValueError(f"Found {field_name} supplied twice.")
logger.warning(
f"""{field_name} was transfered to model_kwargs.
Please confirm that {field_name} is what you intended."""
)
extra[field_name] = values.pop(field_name)
values["model_kwargs"] = extra
return values
@property
def _identifying_params(self) -> Mapping[str, Any]:
"""Get the identifying parameters."""
return {
**{"endpoint_url": self.endpoint_url},
**{"model_kwargs": self.model_kwargs},
}
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "modal"
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
"""Call to Modal endpoint."""
params = self.model_kwargs or {}
response = requests.post(
url=self.endpoint_url,
headers={
"Content-Type": "application/json",
},
json={"prompt": prompt, **params},
)
try:
response_json = response.json()
text = response_json["prompt"]
except KeyError:
raise ValueError("LangChain requires 'prompt' key in response.")
if stop is not None:
# I believe this is required since the stop tokens
# are not enforced by the model parameters
text = enforce_stop_tokens(text, stop)
return text
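A usage sketch; the endpoint URL is a placeholder for a deployed Modal web endpoint that accepts `{"prompt": ...}` and returns the generated text under a `"prompt"` key:

```python
from langchain.llms import Modal

# Placeholder URL for your own Modal deployment.
llm = Modal(endpoint_url="https://your-workspace--your-app.modal.run")
print(llm("Tell me a joke."))
```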

@@ -16,7 +16,7 @@ class NLPCloud(LLM, BaseModel):
Example:
.. code-block:: python
from langchain import NLPCloud
from langchain.llms import NLPCloud
nlpcloud = NLPCloud(model="gpt-neox-20b")
"""

@@ -75,7 +75,7 @@ class BaseOpenAI(BaseLLM, BaseModel):
Example:
.. code-block:: python
from langchain import OpenAI
from langchain.llms import OpenAI
openai = OpenAI(model_name="text-davinci-003")
"""
@@ -251,7 +251,9 @@ class BaseOpenAI(BaseLLM, BaseModel):
prompt=_prompts, **params
):
self.callback_manager.on_llm_new_token(
stream_resp["choices"][0]["text"], verbose=self.verbose
stream_resp["choices"][0]["text"],
verbose=self.verbose,
logprobs=stream_resp["choices"][0]["logprobs"],
)
_update_response(response, stream_resp)
choices.extend(response["choices"])
@@ -285,11 +287,15 @@ class BaseOpenAI(BaseLLM, BaseModel):
):
if self.callback_manager.is_async:
await self.callback_manager.on_llm_new_token(
stream_resp["choices"][0]["text"], verbose=self.verbose
stream_resp["choices"][0]["text"],
verbose=self.verbose,
logprobs=stream_resp["choices"][0]["logprobs"],
)
else:
self.callback_manager.on_llm_new_token(
stream_resp["choices"][0]["text"], verbose=self.verbose
stream_resp["choices"][0]["text"],
verbose=self.verbose,
logprobs=stream_resp["choices"][0]["logprobs"],
)
_update_response(response, stream_resp)
choices.extend(response["choices"])
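With `logprobs` now forwarded to `on_llm_new_token`, a streaming handler can read it from its keyword arguments. A sketch: `LogprobsHandler` is a hypothetical subclass of the stdout streaming handler, and requesting logprobs via `model_kwargs` is an assumption about how to enable them on the API side:

```python
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import OpenAI

class LogprobsHandler(StreamingStdOutCallbackHandler):  # hypothetical handler
    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # logprobs arrives alongside each streamed token (None if not requested).
        print(token, kwargs.get("logprobs"))

llm = OpenAI(
    streaming=True,
    verbose=True,
    callback_manager=CallbackManager([LogprobsHandler()]),
    model_kwargs={"logprobs": 1},  # assumption: passed through to the API
)
llm("Tell me a joke.")
```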

@@ -22,7 +22,7 @@ class Petals(LLM, BaseModel):
Example:
.. code-block:: python
from langchain import petals
from langchain.llms import Petals
petals = Petals()
"""

@@ -23,7 +23,7 @@ class PromptLayerOpenAI(OpenAI, BaseModel):
Example:
.. code-block:: python
from langchain import OpenAI
from langchain.llms import OpenAI
openai = OpenAI(model_name="text-davinci-003")
"""

@@ -0,0 +1,130 @@
"""Wrapper around StochasticAI APIs."""
import logging
import time
from typing import Any, Dict, List, Mapping, Optional
import requests
from pydantic import BaseModel, Extra, Field, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env
logger = logging.getLogger(__name__)
class StochasticAI(LLM, BaseModel):
"""Wrapper around StochasticAI large language models.
To use, you should have the environment variable ``STOCHASTICAI_API_KEY``
set with your API key.
Example:
.. code-block:: python
from langchain.llms import StochasticAI
stochasticai = StochasticAI(api_url="")
"""
api_url: str = ""
"""Model name to use."""
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
"""Holds any model parameters valid for `create` call not
explicitly specified."""
stochasticai_api_key: Optional[str] = None
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
@root_validator(pre=True)
def build_extra(cls, values: Dict[str, Any]) -> Dict[str, Any]:
"""Build extra kwargs from additional params that were passed in."""
all_required_field_names = {field.alias for field in cls.__fields__.values()}
extra = values.get("model_kwargs", {})
for field_name in list(values):
if field_name not in all_required_field_names:
if field_name in extra:
raise ValueError(f"Found {field_name} supplied twice.")
logger.warning(
f"""{field_name} was transfered to model_kwargs.
Please confirm that {field_name} is what you intended."""
)
extra[field_name] = values.pop(field_name)
values["model_kwargs"] = extra
return values
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key exists in environment."""
stochasticai_api_key = get_from_dict_or_env(
values, "stochasticai_api_key", "STOCHASTICAI_API_KEY"
)
values["stochasticai_api_key"] = stochasticai_api_key
return values
@property
def _identifying_params(self) -> Mapping[str, Any]:
"""Get the identifying parameters."""
return {
**{"endpoint_url": self.api_url},
**{"model_kwargs": self.model_kwargs},
}
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "stochasticai"
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
"""Call out to StochasticAI's complete endpoint.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
Returns:
The string generated by the model.
Example:
.. code-block:: python
response = stochasticai("Tell me a joke.")
"""
params = self.model_kwargs or {}
response_post = requests.post(
url=self.api_url,
json={"prompt": prompt, "params": params},
headers={
"apiKey": f"{self.stochasticai_api_key}",
"Accept": "application/json",
"Content-Type": "application/json",
},
)
response_post.raise_for_status()
response_post_json = response_post.json()
completed = False
while not completed:
response_get = requests.get(
url=response_post_json["data"]["responseUrl"],
headers={
"apiKey": f"{self.stochasticai_api_key}",
"Accept": "application/json",
"Content-Type": "application/json",
},
)
response_get.raise_for_status()
response_get_json = response_get.json()["data"]
text = response_get_json.get("completion")
completed = text is not None
if not completed:
    time.sleep(0.5)
text = text[0]
if stop is not None:
# I believe this is required since the stop tokens
# are not enforced by the model parameters
text = enforce_stop_tokens(text, stop)
return text
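A usage sketch; the `api_url` is a placeholder for a deployed model's submit URL, and `STOCHASTICAI_API_KEY` is read from the environment:

```python
from langchain.llms import StochasticAI

# Placeholder inference URL for your deployed model.
llm = StochasticAI(api_url="https://example.stochastic.ai/your-model/submit")
print(llm("Tell me a joke."))
```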

@@ -0,0 +1,155 @@
"""Wrapper around Writer APIs."""
from typing import Any, Dict, List, Mapping, Optional
import requests
from pydantic import BaseModel, Extra, root_validator
from langchain.llms.base import LLM
from langchain.llms.utils import enforce_stop_tokens
from langchain.utils import get_from_dict_or_env
class Writer(LLM, BaseModel):
"""Wrapper around Writer large language models.
To use, you should have the environment variable ``WRITER_API_KEY``
set with your API key.
Example:
.. code-block:: python
from langchain.llms import Writer
writer = Writer(model_id="palmyra-base")
"""
model_id: str = "palmyra-base"
"""Model name to use."""
tokens_to_generate: int = 24
"""Max number of tokens to generate."""
logprobs: bool = False
"""Whether to return log probabilities."""
temperature: float = 1.0
"""What sampling temperature to use."""
length: int = 256
"""The maximum number of tokens to generate in the completion."""
top_p: float = 1.0
"""Total probability mass of tokens to consider at each step."""
top_k: int = 1
"""The number of highest probability vocabulary tokens to
keep for top-k-filtering."""
repetition_penalty: float = 1.0
"""Penalizes repeated tokens according to frequency."""
random_seed: int = 0
"""The model generates random results.
Changing the random seed alone will produce a different response
with similar characteristics. It is possible to reproduce results
by fixing the random seed (assuming all other hyperparameters
are also fixed)"""
beam_search_diversity_rate: float = 1.0
"""Only applies to beam search, i.e. when the beam width is >1.
A higher value encourages beam search to return a more diverse
set of candidates"""
beam_width: Optional[int] = None
"""The number of concurrent candidates to keep track of during
beam search"""
length_penalty: float = 1.0
"""Only applies to beam search, i.e. when the beam width is >1.
Larger values penalize long candidates more heavily, thus preferring
shorter candidates"""
writer_api_key: Optional[str] = None
stop: Optional[List[str]] = None
"""Sequences when completion generation will stop"""
base_url: Optional[str] = None
"""Base url to use, if None decides based on model name."""
class Config:
"""Configuration for this pydantic object."""
extra = Extra.forbid
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key exists in environment."""
writer_api_key = get_from_dict_or_env(
values, "writer_api_key", "WRITER_API_KEY"
)
values["writer_api_key"] = writer_api_key
return values
@property
def _default_params(self) -> Mapping[str, Any]:
"""Get the default parameters for calling Writer API."""
return {
"tokens_to_generate": self.tokens_to_generate,
"stop": self.stop,
"logprobs": self.logprobs,
"temperature": self.temperature,
"top_p": self.top_p,
"top_k": self.top_k,
"repetition_penalty": self.repetition_penalty,
"random_seed": self.random_seed,
"beam_search_diversity_rate": self.beam_search_diversity_rate,
"beam_width": self.beam_width,
"length_pentaly": self.length_pentaly,
}
@property
def _identifying_params(self) -> Mapping[str, Any]:
"""Get the identifying parameters."""
return {**{"model_id": self.model_id}, **self._default_params}
@property
def _llm_type(self) -> str:
"""Return type of llm."""
return "writer"
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
"""Call out to Writer's complete endpoint.
Args:
prompt: The prompt to pass into the model.
stop: Optional list of stop words to use when generating.
Returns:
The string generated by the model.
Example:
.. code-block:: python
response = writer("Tell me a joke.")
"""
if self.base_url is not None:
base_url = self.base_url
else:
base_url = (
f"https://api.llm.writer.com/v1/models/{self.model_id}/completions"
)
response = requests.post(
url=base_url,
headers={
"Authorization": f"Bearer {self.writer_api_key}",
"Content-Type": "application/json",
"Accept": "application/json",
},
json={"prompt": prompt, **self._default_params},
)
text = response.text
if stop is not None:
# I believe this is required since the stop tokens
# are not enforced by the model parameters
text = enforce_stop_tokens(text, stop)
return text
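A usage sketch, assuming `WRITER_API_KEY` is set in the environment:

```python
from langchain.llms import Writer

# palmyra-base is the default model_id; shown here for clarity.
llm = Writer(model_id="palmyra-base", tokens_to_generate=24)
print(llm("Tell me a joke."))
```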

@@ -1,12 +1,14 @@
"""BasePrompt schema definition."""
from __future__ import annotations
import json
import re
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional, Union
from typing import Any, Callable, Dict, List, Mapping, Optional, Union
import yaml
from pydantic import BaseModel, Extra, root_validator
from pydantic import BaseModel, Extra, Field, root_validator
from langchain.formatting import formatter
@@ -117,6 +119,9 @@ class BasePromptTemplate(BaseModel, ABC):
"""A list of the names of the variables the prompt template expects."""
output_parser: Optional[BaseOutputParser] = None
"""How to parse the output of calling an LLM on this formatted prompt."""
partial_variables: Mapping[str, Union[str, Callable[[], str]]] = Field(
default_factory=dict
)
class Config:
"""Configuration for this pydantic object."""
@@ -132,8 +137,38 @@ class BasePromptTemplate(BaseModel, ABC):
"Cannot have an input variable named 'stop', as it is used internally,"
" please rename."
)
if "stop" in values["partial_variables"]:
raise ValueError(
"Cannot have an partial variable named 'stop', as it is used "
"internally, please rename."
)
overall = set(values["input_variables"]).intersection(
values["partial_variables"]
)
if overall:
raise ValueError(
f"Found overlapping input and partial variables: {overall}"
)
return values
def partial(self, **kwargs: Union[str, Callable[[], str]]) -> BasePromptTemplate:
"""Return a partial of the prompt template."""
prompt_dict = self.__dict__.copy()
prompt_dict["input_variables"] = list(
set(self.input_variables).difference(kwargs)
)
prompt_dict["partial_variables"] = {**self.partial_variables, **kwargs}
return type(self)(**prompt_dict)
def _merge_partial_and_user_variables(self, **kwargs: Any) -> Dict[str, Any]:
# Get partial params:
partial_kwargs = {
k: v if isinstance(v, str) else v()
for k, v in self.partial_variables.items()
}
return {**partial_kwargs, **kwargs}
@abstractmethod
def format(self, **kwargs: Any) -> str:
"""Format the prompt with the inputs.
@@ -173,6 +208,8 @@ class BasePromptTemplate(BaseModel, ABC):
prompt.save(file_path="path/prompt.yaml")
"""
if self.partial_variables:
raise ValueError("Cannot save prompt with partial variables.")
# Convert file to Path object.
if isinstance(file_path, str):
save_path = Path(file_path)
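The new `partial` method in action; a minimal sketch with a plain `PromptTemplate`:

```python
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["adjective", "content"],
    template="Tell me a {adjective} joke about {content}.",
)
# Bind one variable now; supply the rest at format time.
partial_prompt = prompt.partial(adjective="funny")
print(partial_prompt.format(content="chickens"))
# -> Tell me a funny joke about chickens.
```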

@@ -8,6 +8,10 @@ from langchain.prompts.example_selector.base import BaseExampleSelector
from langchain.prompts.prompt import PromptTemplate
def _get_length_based(text: str) -> int:
return len(re.split("\n| ", text))
class LengthBasedExampleSelector(BaseExampleSelector, BaseModel):
"""Select examples based on length."""
@@ -17,7 +21,7 @@ class LengthBasedExampleSelector(BaseExampleSelector, BaseModel):
example_prompt: PromptTemplate
"""Prompt template used to format the examples."""
get_text_length: Callable[[str], int] = lambda x: len(re.split("\n| ", x))
get_text_length: Callable[[str], int] = _get_length_based
"""Function to measure prompt length. Defaults to word count."""
max_length: int = 2048
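The extracted `_get_length_based` default can be swapped for any callable; a sketch that measures character length instead of word count:

```python
from langchain.prompts.example_selector import LengthBasedExampleSelector
from langchain.prompts.prompt import PromptTemplate

example_prompt = PromptTemplate(input_variables=["word"], template="{word}")
selector = LengthBasedExampleSelector(
    examples=[{"word": "happy"}, {"word": "melancholy"}],
    example_prompt=example_prompt,
    max_length=25,
    get_text_length=len,  # character count instead of the word-count default
)
```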
