
# Beam

Beam makes it easy to run code on GPUs, deploy scalable web APIs, schedule cron jobs, and run massively parallel workloads — without managing any infrastructure.

## Installation and Setup

- Create an account
- Install the Beam CLI: `curl https://raw.githubusercontent.com/slai-labs/get-beam/main/get-beam.sh -sSfL | sh`
- Register your API keys with `beam configure`
- Set the `BEAM_CLIENT_ID` and `BEAM_CLIENT_SECRET` environment variables (see the sketch after this list)
- Install the Beam SDK: `pip install beam-sdk`
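If you prefer to set the credentials from Python rather than the shell, a minimal sketch (the key values are placeholders, not real credentials):

```python
import os

# Placeholder credentials: substitute the values produced by `beam configure`
os.environ["BEAM_CLIENT_ID"] = "your-client-id"
os.environ["BEAM_CLIENT_SECRET"] = "your-client-secret"
```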

## LLM

```python
from langchain.llms.beam import Beam
```

### Example of the Beam app

This is the environment you'll be developing against once you start the app. It also defines the maximum response length from the model.

llm = Beam(model_name="gpt2",
           name="langchain-gpt2-test",
           cpu=8,
           memory="32Gi",
           gpu="A10G",
           python_version="python3.8",
           python_packages=[
               "diffusers[torch]>=0.10",
               "transformers",
               "torch",
               "pillow",
               "accelerate",
               "safetensors",
               "xformers",],
           max_length="50",
           verbose=False)

### Deploy the Beam app

Once defined, you can deploy your Beam app by calling your model's `_deploy()` method.

```python
llm._deploy()
```

### Call the Beam app

Once a Beam model is deployed, it can be invoked with your model's `_call()` method. This returns the GPT-2 text response to your prompt.

```python
response = llm._call("Running machine learning on a remote GPU")
```

A complete example script that deploys the model and calls it:

```python
from langchain.llms.beam import Beam

llm = Beam(
    model_name="gpt2",
    name="langchain-gpt2-test",
    cpu=8,
    memory="32Gi",
    gpu="A10G",
    python_version="python3.8",
    python_packages=[
        "diffusers[torch]>=0.10",
        "transformers",
        "torch",
        "pillow",
        "accelerate",
        "safetensors",
        "xformers",
    ],
    max_length="50",
    verbose=False,
)

# Deploy the app, then send a prompt to the deployed model
llm._deploy()
response = llm._call("Running machine learning on a remote GPU")

print(response)
```
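
Because `Beam` implements LangChain's standard LLM interface, a deployed app can also be composed with other LangChain primitives. A minimal sketch, assuming the `llm` object from the script above has already been deployed (the prompt text is illustrative):

```python
from langchain import LLMChain, PromptTemplate

# Illustrative prompt template; any single-variable template works here
template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])

# Chain the prompt and the deployed Beam model together
llm_chain = LLMChain(prompt=prompt, llm=llm)
print(llm_chain.run("What could I do with a remote GPU?"))
```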