mirror of https://github.com/arc53/DocsGPT
sage-guide
parent
a4483cf255
commit
1570b9c9c5
@ -0,0 +1,120 @@
|
|||||||
|
# How to Deploy LLMs on SageMaker for DocsGPT
|
||||||
|
|
||||||
|
This guide uses some of the methods from [Phil Schmid's guides](https://www.philschmid.de/), so if you want to dive deeper into the topic, check out his guides.
|
||||||
|
|
||||||
|
### 1. Create a new Python notebook on SageMaker and prep dependencies and permissions
|
||||||
|
|
||||||
|
Install dependencies
|
||||||
|
|
||||||
|
```python
|
||||||
|
!pip install "sagemaker>=2.175.0" --upgrade --quiet
|
||||||
|
```
|
||||||
|
|
||||||
|
Check permissions
|
||||||
|
|
||||||
|
```python
|
||||||
|
import sagemaker
|
||||||
|
import boto3
|
||||||
|
sess = sagemaker.Session()
|
||||||
|
# sagemaker session bucket -> used for uploading data, models and logs
|
||||||
|
# sagemaker will automatically create this bucket if it does not exist
|
||||||
|
sagemaker_session_bucket=None
|
||||||
|
if sagemaker_session_bucket is None and sess is not None:
|
||||||
|
# set to default bucket if a bucket name is not given
|
||||||
|
sagemaker_session_bucket = sess.default_bucket()
|
||||||
|
|
||||||
|
try:
|
||||||
|
role = sagemaker.get_execution_role()
|
||||||
|
except ValueError:
|
||||||
|
iam = boto3.client('iam')
|
||||||
|
role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']
|
||||||
|
|
||||||
|
sess = sagemaker.Session(default_bucket=sagemaker_session_bucket)
|
||||||
|
|
||||||
|
print(f"sagemaker role arn: {role}")
|
||||||
|
print(f"sagemaker session region: {sess.boto_region_name}")
|
||||||
|
print(f"sagemaker session bucket: {sess.default_bucket()}")
|
||||||
|
```
|
||||||
|
|
||||||
|
Get the Hugging Face LLM image URI for the container
|
||||||
|
|
||||||
|
```python
|
||||||
|
from sagemaker.huggingface import get_huggingface_llm_image_uri
|
||||||
|
|
||||||
|
# retrieve the llm image uri
|
||||||
|
llm_image = get_huggingface_llm_image_uri(
|
||||||
|
"huggingface",
|
||||||
|
version="1.1.0",
|
||||||
|
)
|
||||||
|
|
||||||
|
# print ecr image uri
|
||||||
|
print(f"llm image uri: {llm_image}")
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. Prepare the Model
|
||||||
|
|
||||||
|
Running this code will create a model with some default parameters. You can change these parameters to suit your needs.
|
||||||
|
There are two ways you can choose which model to use.
|
||||||
|
|
||||||
|
You can either use the `model_id` from the huggingface.co/models page, or you can use the `model_data` from a previous training job.
|
||||||
|
|
||||||
|
```python
|
||||||
|
import json
|
||||||
|
from sagemaker.huggingface import HuggingFaceModel
|
||||||
|
|
||||||
|
# sagemaker config
|
||||||
|
instance_type = "ml.g5.xlarge"
|
||||||
|
number_of_gpu = 1
|
||||||
|
health_check_timeout = 600
|
||||||
|
|
||||||
|
# Define Model and Endpoint configuration parameter
|
||||||
|
config = {
|
||||||
|
'HF_MODEL_ID': "/opt/ml/model", # model_id from hf.co/models
|
||||||
|
#'HF_MODEL_ID': "Arc53/DocsGPT-7B",
|
||||||
|
'SM_NUM_GPUS': json.dumps(number_of_gpu), # Number of GPU used per replica
|
||||||
|
'MAX_INPUT_LENGTH': json.dumps(7000), # Max length of input text
|
||||||
|
'MAX_TOTAL_TOKENS': json.dumps(8000), # Max length of the generation (including input text)
|
||||||
|
'MAX_BATCH_TOTAL_TOKENS': json.dumps(8192), # Limits the number of tokens that can be processed in parallel during the generation
|
||||||
|
'MAX_BATCH_PREFILL_TOKENS': json.dumps(7000),
|
||||||
|
}
|
||||||
|
|
||||||
|
# create HuggingFaceModel with the image uri
|
||||||
|
llm_model = HuggingFaceModel(
|
||||||
|
model_data="s3://docsgpt/models/hf-tensors/docsgpt-7b-O-hq-64-alpha-2023-11-22-15-04-04-455/model.tar.gz",
|
||||||
|
role=role,
|
||||||
|
image_uri=llm_image,
|
||||||
|
env=config
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3. Deploy the Model
|
||||||
|
|
||||||
|
Running this line will create a Model in the SageMaker console. Next, it will create an endpoint configuration, and finally it will create an endpoint.
|
||||||
|
|
||||||
|
```python
|
||||||
|
llm = llm_model.deploy(
|
||||||
|
initial_instance_count=1,
|
||||||
|
endpoint_name="docsgpt-7b",
|
||||||
|
instance_type=instance_type,
|
||||||
|
container_startup_health_check_timeout=health_check_timeout, # 10 minutes to be able to load the model
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
### 4. Connect it to the application
|
||||||
|
|
||||||
|
Change your `.env` file and set the following variables:
|
||||||
|
|
||||||
|
```python
|
||||||
|
SAGEMAKER_ENDPOINT: str = None # SageMaker endpoint name (docsgpt-7b)
|
||||||
|
SAGEMAKER_REGION: str = None # SageMaker region name
|
||||||
|
SAGEMAKER_ACCESS_KEY: str = None # SageMaker access key
|
||||||
|
SAGEMAKER_SECRET_KEY: str = None # SageMaker secret key
|
||||||
|
```
|
||||||
|
|
||||||
|
> **_NOTE:_** If you are using the same AWS account for the application and SageMaker, you can leave the access and secret keys empty.
|
||||||
|
|
||||||
|
Also make sure you switch to the appropriate embeddings if you want everything to run locally, for example:
|
||||||
|
|
||||||
|
```python
|
||||||
|
EMBEDDINGS_NAME=huggingface_sentence-transformers/all-mpnet-base-v2
|
||||||
|
```
|
Loading…
Reference in New Issue