docs:updated documentation for llama, falcon and gemma on Vertex AI Model garden (#22201)

- **Description:** updated documentation for llama, falcon and gemma on Vertex AI Model garden
- **Issue:** NA
- **Dependencies:** NA
- **Twitter handle:** NA

@lkuligin for review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
3 months ago · bf81ecd3b4
parent 342df7cf83
commit bf81ecd3b4
1 changed files with 419 additions and 26 deletions
--- a/docs/docs/integrations/llms/google_vertex_ai_palm.ipynb
+++ b/docs/docs/integrations/llms/google_vertex_ai_palm.ipynb
@ -77,7 +77,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
@ -106,16 +106,16 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "\"## Pros of Python\\n\\n* **Easy to learn and read:** Python has a clear and concise syntax, making it easy for beginners to pick up and understand. Its readability is often compared to natural language, making it easier to maintain and debug code.\\n* **Versatile:** Python is a versatile language suitable for various applications, including web development, scripting, data analysis, machine learning, scientific computing, and even game development.\\n* **Extensive libraries and frameworks:** Python boasts a vast collection of libraries and frameworks for diverse tasks, reducing the need to write code from scratch and allowing developers to focus on specific functionalities. This makes Python a highly productive language.\\n* **Large and active community:** Python has a large and active community of users, developers, and contributors. This translates to readily available support, documentation, and learning resources when needed.\\n* **Open-source and free:** Python is an open-source language, meaning it's free to use and distribute, making it accessible to a wider audience.\\n\\n## Cons of Python\\n\\n* **Dynamically typed:** Python is a dynamically typed language, meaning variable types are determined at runtime. While this can be convenient, it can also lead to runtime errors and make code debugging more challenging.\\n* **Interpreted language:** Python code is interpreted, which means it is slower than compiled languages like C or Java. However, this disadvantage is mitigated by the existence of tools like PyPy and Cython that can improve Python's performance.\\n* **Limited mobile development support:** While Python has frameworks for mobile development, its support is not as extensive as for languages like Swift or Java. This limits Python's suitability for native mobile app development.\\n* **Global interpreter lock (GIL):** Python has a GIL, meaning only one thread can execute Python bytecode at a time. 
This can limit performance in multithreaded applications. However, alternative implementations like Cypython attempt to address this issue.\\n\\n## Conclusion\\n\\nDespite its limitations, Python's ease of use, versatility, and extensive libraries make it a popular choice for various programming tasks. Its active community and open-source nature contribute to its popularity. However, its dynamic typing, interpreted nature, and limitations in mobile development and multithreading should be considered when choosing Python for specific projects.\""
+       "\"## Pros of Python:\\n\\n* **Easy to learn and use:** Python's syntax is simple and straightforward, making it a great choice for beginners. \\n* **Extensive library support:** Python has a massive collection of libraries and frameworks for a variety of tasks, from web development to data science. \\n* **Open source and free:** Anyone can use and contribute to Python without paying licensing fees.\\n* **Large and active community:** There's a vast community of Python users offering help and support.\\n* **Versatility:** Python is a general-purpose language, meaning it can be used for a wide variety of tasks.\\n* **Portable and cross-platform:** Python code works seamlessly across various operating systems.\\n* **High-level language:** Python hides many of the complexities of lower-level languages, allowing developers to focus on problem solving.\\n* **Readability:** The clear syntax makes Python programs easier to understand and maintain, especially for collaborative projects.\\n\\n## Cons of Python:\\n\\n* **Slower execution:** Compared to compiled languages like C++, Python is generally slower due to its interpreted nature.\\n* **Dynamically typed:** Python doesn’t enforce strict data types, which can sometimes lead to errors.\\n* **Global Interpreter Lock (GIL):** The GIL limits Python to using a single CPU core at a time, impacting its performance in multi-core environments.\\n* **Large memory footprint**: Python programs require more memory than some other languages.\\n* **Not ideal for low-level programming:** Python is not suitable for tasks requiring direct hardware interaction.\\n\\n\\n\\n## Conclusion:\\n\\nWhile it has some drawbacks, Python's strengths outweigh them, making it a very versatile and approachable programming language for beginners. Its extensive libraries, large community, ease of use and versatility make it an excellent choice for various projects and applications. 
However, for tasks requiring extreme performance or low-level access, other languages might offer better solutions.\\n\""
      ]
     },
-     "execution_count": 3,
+     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -244,16 +244,16 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "LLMResult(generations=[[GenerationChunk(text='I am not allowed to give instructions on how to make a molotov cocktail.', generation_info={'is_blocked': False, 'safety_ratings': [{'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability_label': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability_label': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability_label': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability_label': 'NEGLIGIBLE', 'blocked': False}], 'citation_metadata': None, 'usage_metadata': {'prompt_token_count': 8, 'candidates_token_count': 17, 'total_token_count': 25}})]], llm_output=None, run=[RunInfo(run_id=UUID('78c81d92-8e62-4aef-a056-44541e25d55c'))])"
+       "\"I'm so sorry, but I can't answer that question. Molotov cocktails are illegal and dangerous, and I would never do anything that could put someone at risk. If you are interested in learning more about the dangers of molotov cocktails, I can provide you with some resources.\""
      ]
     },
-     "execution_count": 9,
+     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -271,22 +271,23 @@
    "\n",
    "llm = VertexAI(model_name=\"gemini-1.0-pro-001\", safety_settings=safety_settings)\n",
    "\n",
-    "output = llm.generate([\"How to make a molotov cocktail?\"])\n",
+    "# invoke a model response\n",
+    "output = llm.invoke([\"How to make a molotov cocktail?\"])\n",
    "output"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "LLMResult(generations=[[GenerationChunk(text='Making a Molotov cocktail is extremely dangerous and illegal in most jurisdictions. It is strongly advised not to attempt to make or use one. If you are in a situation where you feel the need to use a Molotov cocktail, please contact the authorities immediately.', generation_info={'is_blocked': False, 'safety_ratings': [{'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability_label': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability_label': 'MEDIUM', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability_label': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability_label': 'NEGLIGIBLE', 'blocked': False}], 'citation_metadata': None, 'usage_metadata': {'prompt_token_count': 9, 'candidates_token_count': 51, 'total_token_count': 60}})]], llm_output=None, run=[RunInfo(run_id=UUID('69254d57-0354-4bdc-81ee-0f623b19704d'))])"
+       "\"I'm sorry, I can't answer that question. Molotov cocktails are illegal and dangerous.\""
      ]
     },
-     "execution_count": 10,
+     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -295,7 +296,8 @@
    "# You may also pass safety_settings to generate method\n",
    "llm = VertexAI(model_name=\"gemini-1.0-pro-001\")\n",
    "\n",
-    "output = llm.generate(\n",
+    "# invoke a model response\n",
+    "output = llm.invoke(\n",
    "    [\"How to make a molotov cocktail?\"], safety_settings=safety_settings\n",
    ")\n",
    "output"
@ -303,23 +305,23 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "[[GenerationChunk(text='**Pros:**\\n\\n* **Easy to learn and use:** Python is known for its simple syntax and readability, making it a great choice for beginners and experienced programmers alike.\\n* **Versatile:** Python can be used for a wide variety of tasks, including web development, data science, machine learning, and scripting.\\n* **Large community:** Python has a large and active community of developers, which means there is a wealth of resources and support available.\\n* **Extensive library support:** Python has a vast collection of libraries and frameworks that can be used to extend its functionality.\\n* **Cross-platform:** Python is available for a')]]"
+       "\"## Pros of Python\\n\\n* **Easy to learn:** Python's clear syntax and simple structure make it easy for beginners to pick up, even if they have no prior programming experience.\\n* **Versatile:** Python is a general-purpose language, meaning it can be used for a wide range of tasks, including web development, data analysis, machine learning, and scripting.\\n* **Large community:** Python has a large and active community of developers, which means there are plenty of resources available to help you learn and use the language.\\n* **Libraries and frameworks:** Python has a vast ecosystem of libraries and frameworks that can be used for various tasks, making it easy to \\nbuild complex applications.\\n* **Open-source:** Python is an open-source language, which means it is free to use and distribute. This also means that the code is constantly being improved and updated by the community.\\n\\n## Cons of Python\\n\\n* **Slow execution:** Python is an interpreted language, which means that the code is executed line by line. This can make Python slower than compiled languages like C++ or Java.\\n* **Dynamic typing:** Python's dynamic typing can be a disadvantage for large projects, as it can lead to errors that are not caught until runtime.\\n* **Global interpreter lock (GIL):** The GIL can limit the performance of Python code on multi-core processors, as only one thread can execute Python code at a time.\\n* **Large memory footprint:** Python programs tend to use more memory than programs written in other languages.\\n\\n\\nOverall, Python is a great choice for beginners and experienced programmers alike. Its ease of use, versatility, and large community make it a popular choice for many different types of projects. However, it is important to be aware of its limitations, such as its slow execution speed and dynamic typing.\""
      ]
     },
-     "execution_count": null,
+     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "result = await model.agenerate([message])\n",
-    "result.generations"
+    "result = await model.ainvoke([message])\n",
+    "result"
   ]
  },
  {
@ -405,6 +407,8 @@
   "source": [
    "llm = VertexAI(model_name=\"code-bison\", max_tokens=1000, temperature=0.3)\n",
    "question = \"Write a python function that checks if a string is a valid email address\"\n",
+    "\n",
+    "# invoke a model response\n",
    "print(llm.invoke(question))"
   ]
  },
@ -424,14 +428,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      " This is a Yorkshire Terrier.\n"
+      " The image shows a dog with a long coat. The dog is sitting on a wooden floor and looking at the camera.\n"
     ]
    }
   ],
@ -449,8 +453,11 @@
    "    \"type\": \"text\",\n",
    "    \"text\": \"What is shown in this image?\",\n",
    "}\n",
+    "\n",
+    "# Prepare input for model consumption\n",
    "message = HumanMessage(content=[text_message, image_message])\n",
    "\n",
+    "# invoke a model response\n",
    "output = llm.invoke([message])\n",
    "print(output.content)"
   ]
@ -495,14 +502,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      " This is a Yorkshire Terrier.\n"
+      " The image shows a dog sitting on a wooden floor. The dog is a small breed, with a long, shaggy coat that is brown and gray in color. The dog has a white patch of fur on its chest and white paws. The dog is looking at the camera with a curious expression.\n"
     ]
    }
   ],
@ -522,8 +529,11 @@
    "    \"type\": \"text\",\n",
    "    \"text\": \"What is shown in this image?\",\n",
    "}\n",
+    "\n",
+    "# Prepare input for model consumption\n",
    "message = HumanMessage(content=[text_message, image_message])\n",
    "\n",
+    "# invoke a model response\n",
    "output = llm.invoke([message])\n",
    "print(output.content)"
   ]
@ -548,7 +558,10 @@
   "metadata": {},
   "outputs": [],
   "source": [
+    "# Prepare input for model consumption\n",
    "message2 = HumanMessage(content=\"And where the image is taken?\")\n",
+    "\n",
+    "# invoke a model response\n",
    "output2 = llm.invoke([message, output, message2])\n",
    "print(output2.content)"
   ]
@ -562,26 +575,99 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 53,
   "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      " This image shows a Google Cloud Next event. Google Cloud Next is an annual conference held by Google Cloud, a division of Google that offers cloud computing services. The conference brings together customers, partners, and industry experts to learn about the latest cloud technologies and trends.\n"
+     ]
+    }
+   ],
   "source": [
    "image_message = {\n",
    "    \"type\": \"image_url\",\n",
    "    \"image_url\": {\n",
-    "        \"url\": \"https://python.langchain.com/assets/images/cell-18-output-1-0c7fb8b94ff032d51bfe1880d8370104.png\",\n",
+    "        \"url\": \"gs://github-repo/img/vision/google-cloud-next.jpeg\",\n",
    "    },\n",
    "}\n",
    "text_message = {\n",
    "    \"type\": \"text\",\n",
    "    \"text\": \"What is shown in this image?\",\n",
    "}\n",
+    "\n",
+    "# Prepare input for model consumption\n",
    "message = HumanMessage(content=[text_message, image_message])\n",
    "\n",
+    "# invoke a model response\n",
    "output = llm.invoke([message])\n",
    "print(output.content)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Advanced: You can use PDFs with Gemini models"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.messages import HumanMessage\n",
+    "from langchain_google_vertexai import ChatVertexAI\n",
+    "\n",
+    "# Use Gemini 1.5 Pro\n",
+    "llm = ChatVertexAI(model=\"gemini-1.5-pro-preview-0514\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 69,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Prepare input for model consumption\n",
+    "pdf_message = {\n",
+    "    \"type\": \"image_url\",\n",
+    "    \"image_url\": {\"url\": \"gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf\"},\n",
+    "}\n",
+    "\n",
+    "text_message = {\n",
+    "    \"type\": \"text\",\n",
+    "    \"text\": \"Summarize the provided document.\",\n",
+    "}\n",
+    "\n",
+    "# Prepare input for model consumption\n",
+    "message = HumanMessage(content=[text_message, pdf_message])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 70,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "AIMessage(content='The document introduces Gemini 1.5 Pro, a multimodal AI model developed by Google. It\\'s a \"mixture-of-experts\" model capable of understanding and reasoning over very long contexts, up to millions of tokens, across text, audio, and video data. \\n\\n**Key Features:**\\n\\n* **Unprecedented Long Context:** Handles context lengths of up to 10 million tokens, enabling it to process entire books, hours of video, and days of audio.\\n* **Multimodal Understanding:** Seamlessly integrates text, audio, and video data for comprehensive understanding.\\n* **Enhanced Performance:** Achieves near-perfect recall in retrieval tasks and surpasses previous models in various benchmarks.\\n* **Novel Capabilities:** Demonstrates surprising abilities like learning to translate a new language from a single grammar book in context.\\n\\n**Evaluations:**\\n\\nThe document presents extensive evaluations highlighting Gemini 1.5 Pro\\'s capabilities. It excels in both diagnostic tests (perplexity, needle-in-a-haystack) and realistic tasks (long-document QA, language translation, video understanding). It also outperforms its predecessors and state-of-the-art models like GPT-4 Turbo and Claude 2.1 in various core benchmarks (coding, multilingual tasks, math and science reasoning).\\n\\n**Responsible Deployment:**\\n\\nGoogle emphasizes a structured approach to responsible deployment, outlining their model mitigation efforts, impact assessments, and ongoing safety evaluations to address potential risks associated with long-context understanding and multimodal capabilities.\\n\\n**Call-to-action:**\\n\\nThe document highlights the need for innovative evaluation methodologies to effectively assess long-context models. 
They encourage researchers to develop challenging benchmarks that go beyond simple retrieval and require complex reasoning over extended inputs.\\n\\n**Overall:**\\n\\nGemini 1.5 Pro represents a significant advancement in AI, pushing the boundaries of multimodal long-context understanding. Its impressive performance and unique capabilities open new possibilities for research and application, while Google\\'s commitment to responsible deployment ensures the safe and ethical use of this powerful technology. \\n', response_metadata={'is_blocked': False, 'safety_ratings': [{'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability_label': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability_label': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability_label': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability_label': 'NEGLIGIBLE', 'blocked': False}], 'usage_metadata': {'prompt_token_count': 19872, 'candidates_token_count': 415, 'total_token_count': 20287}}, id='run-99072700-55be-49d4-acca-205a52256bcd-0')"
+      ]
+     },
+     "execution_count": 70,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# invoke a model response\n",
+    "llm.invoke([message])"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@ -593,12 +679,16 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Vertex Model Garden [exposes](https://cloud.google.com/vertex-ai/docs/start/explore-models) open-sourced models that can be deployed and served on Vertex AI. If you have successfully deployed a model from Vertex Model Garden, you can find a corresponding Vertex AI [endpoint](https://cloud.google.com/vertex-ai/docs/general/deployment#what_happens_when_you_deploy_a_model) in the console or via API."
+    "Vertex Model Garden [exposes](https://cloud.google.com/vertex-ai/docs/start/explore-models) open-sourced models that can be deployed and served on Vertex AI. \n",
+    "\n",
+    "Hundreds of popular [open-sourced models](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models#oss-models) like Llama, Falcon and Gemma are available for [one-click deployment](https://cloud.google.com/vertex-ai/generative-ai/docs/deploy/overview).\n",
+    "\n",
+    "If you have successfully deployed a model from Vertex Model Garden, you can find a corresponding Vertex AI [endpoint](https://cloud.google.com/vertex-ai/docs/general/deployment#what_happens_when_you_deploy_a_model) in the console or via API."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
@ -620,6 +710,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
+    "# invoke a model response\n",
    "llm.invoke(\"What is the meaning of life?\")"
   ]
  },
@ -649,6 +740,241 @@
    "print(chain.invoke({\"thing\": \"life\"}))"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Llama on Vertex Model Garden \n",
+    "\n",
+    "> Llama is a family of open weight models developed by Meta that you can fine-tune and deploy on Vertex AI. Llama models are pre-trained and fine-tuned generative text models. You can deploy Llama 2 and Llama 3 models on Vertex AI.\n",
+    "See the [official documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/open-models/use-llama) for more information about Llama on [Vertex Model Garden](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models).\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To use Llama on Vertex Model Garden, you must first [deploy it to a Vertex AI endpoint](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models#deploy-a-model)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_google_vertexai import VertexAIModelGarden"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# TODO : Add \"YOUR PROJECT\" and \"YOUR ENDPOINT_ID\"\n",
+    "llm = VertexAIModelGarden(project=\"YOUR PROJECT\", endpoint_id=\"YOUR ENDPOINT_ID\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Prompt:\\nWhat is the meaning of life?\\nOutput:\\n is a classic problem for Humanity. There is one vital characteristic of Life in'"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# invoke a model response\n",
+    "llm.invoke(\"What is the meaning of life?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Like all LLMs, we can then compose it with other components:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.prompts import PromptTemplate\n",
+    "\n",
+    "prompt = PromptTemplate.from_template(\"What is the meaning of {thing}?\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Prompt:\n",
+      "What is the meaning of life?\n",
+      "Output:\n",
+      " The question is so perplexing that there have been dozens of care\n"
+     ]
+    }
+   ],
+   "source": [
+    "# invoke a model response using chain\n",
+    "chain = prompt | llm\n",
+    "print(chain.invoke({\"thing\": \"life\"}))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Falcon on Vertex Model Garden "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "> Falcon is a family of open weight models developed by the [Technology Innovation Institute (TII)](https://falconllm.tii.ae/) that you can fine-tune and deploy on Vertex AI. Falcon models are pre-trained and fine-tuned generative text models."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To use Falcon on Vertex Model Garden, you must first [deploy it to a Vertex AI endpoint](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models#deploy-a-model)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_google_vertexai import VertexAIModelGarden"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# TODO : Add \"YOUR PROJECT\" and \"YOUR ENDPOINT_ID\"\n",
+    "llm = VertexAIModelGarden(project=\"YOUR PROJECT\", endpoint_id=\"YOUR ENDPOINT_ID\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Prompt:\\nWhat is the meaning of life?\\nOutput:\\nWhat is the meaning of life?\\nThe meaning of life is a philosophical question that does not have a clear answer. The search for the meaning of life is a lifelong journey, and there is no definitive answer. Different cultures, religions, and individuals may approach this question in different ways.'"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# invoke a model response\n",
+    "llm.invoke(\"What is the meaning of life?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Like all LLMs, we can then compose it with other components:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.prompts import PromptTemplate\n",
+    "\n",
+    "prompt = PromptTemplate.from_template(\"What is the meaning of {thing}?\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Prompt:\n",
+      "What is the meaning of life?\n",
+      "Output:\n",
+      "What is the meaning of life?\n",
+      "As an AI language model, my personal belief is that the meaning of life varies from person to person. It might be finding happiness, fulfilling a purpose or goal, or making a difference in the world. It's ultimately a personal question that can be explored through introspection or by seeking guidance from others.\n"
+     ]
+    }
+   ],
+   "source": [
+    "chain = prompt | llm\n",
+    "print(chain.invoke({\"thing\": \"life\"}))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Gemma on Vertex AI Model Garden"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "> [Gemma](https://ai.google.dev/gemma) is a set of lightweight, generative artificial intelligence (AI) open models. Gemma models are available to run in your applications and on your hardware, mobile devices, or hosted services. You can also customize these models using tuning techniques so that they excel at performing tasks that matter to you and your users. Gemma models are based on [Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/overview) models and are intended for the AI development community to extend and take further."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To use Gemma on Vertex Model Garden, you must first [deploy it to a Vertex AI endpoint](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/explore-models#deploy-a-model)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.messages import (\n",
+    "    AIMessage,\n",
+    "    HumanMessage,\n",
+    ")\n",
+    "from langchain_google_vertexai import (\n",
+    "    GemmaChatVertexAIModelGarden,\n",
+    "    GemmaVertexAIModelGarden,\n",
+    ")"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@ -656,6 +982,73 @@
    "## Anthropic on Vertex AI"
   ]
  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Prompt:\\nWhat is the meaning of life?\\nOutput:\\nThis is a classic question that has captivated philosophers, theologians, and seekers for'"
+      ]
+     },
+     "execution_count": 21,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# TODO : Add \"YOUR PROJECT\" , \"YOUR REGION\" and \"YOUR ENDPOINT_ID\"\n",
+    "llm = GemmaVertexAIModelGarden(\n",
+    "    endpoint_id=\"YOUR ENDPOINT_ID\",\n",
+    "    project=\"YOUR PROJECT\",\n",
+    "    location=\"YOUR REGION\",\n",
+    ")\n",
+    "\n",
+    "# invoke a model response\n",
+    "llm.invoke(\"What is the meaning of life?\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# TODO : Add \"YOUR PROJECT\" , \"YOUR REGION\" and \"YOUR ENDPOINT_ID\"\n",
+    "chat_llm = GemmaChatVertexAIModelGarden(\n",
+    "    endpoint_id=\"YOUR ENDPOINT_ID\",\n",
+    "    project=\"YOUR PROJECT\",\n",
+    "    location=\"YOUR REGION\",\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 26,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "AIMessage(content='Prompt:\\n<start_of_turn>user\\nHow much is 2+2?<end_of_turn>\\n<start_of_turn>model\\nOutput:\\nThe answer is 4.\\n2 + 2 = 4.', id='run-cea563df-e91a-4374-83a1-3d8b186a01b2-0')"
+      ]
+     },
+     "execution_count": 26,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Prepare input for model consumption\n",
+    "text_question1 = \"How much is 2+2?\"\n",
+    "message1 = HumanMessage(content=text_question1)\n",
+    "\n",
+    "# invoke a model response\n",
+    "chat_llm.invoke([message1])"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},