{ "cells": [ { "cell_type": "markdown", "id": "f2605a68-4ec8-40c5-aefc-e5ae7b23b884", "metadata": {}, "source": [ "# Building hotel room search with self-querying retrieval\n", "\n", "In this example we'll walk through how to build and iterate on a hotel room search service that uses an LLM to generate structured filter queries, which are then passed to a vector store.\n", "\n", "For an introduction to self-querying retrieval [check out the docs](https://python.langchain.com/docs/modules/data_connection/retrievers/self_query)." ] }, { "cell_type": "markdown", "id": "d621de99-d993-4f4b-b94a-d02b2c7ad4e0", "metadata": {}, "source": [ "## Imports and data prep\n", "\n", "In this example we use `ChatOpenAI` for the model and `ElasticsearchStore` for the vector store, but these can be swapped out for any LLM/ChatModel and [any VectorStore that supports self-querying](https://python.langchain.com/docs/integrations/retrievers/self_query/).\n", "\n", "Download the data from https://www.kaggle.com/datasets/keshavramaiah/hotel-recommendation. The cells below read the CSV files from `~/Downloads/archive/`, so adjust the paths if you extract the archive elsewhere." ] }, { "cell_type": "code", "execution_count": null, "id": "8ecd1fbb-bdba-420b-bcc7-5ea8a232ab11", "metadata": {}, "outputs": [], "source": [ "!pip install langchain lark openai elasticsearch pandas" ] }, { "cell_type": "code", "execution_count": 1, "id": "14d48ff6-2552-4b95-95a9-42dd444471d9", "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "id": "b852ec6e-7bf6-405e-ae7f-f457eb6e17f1", "metadata": {}, "outputs": [], "source": [ "details = pd.read_csv(\"~/Downloads/archive/Hotel_details.csv\").drop_duplicates(subset=\"hotelid\").set_index(\"hotelid\")\n", "attributes = pd.read_csv(\"~/Downloads/archive/Hotel_Room_attributes.csv\", index_col=\"id\")\n", "price = pd.read_csv(\"~/Downloads/archive/hotels_RoomPrice.csv\", index_col=\"id\")" ] }, { "cell_type": "code", "execution_count": 3, "id": "35a32177-2ca5-4d10-b8dc-f34c25795630", "metadata": {}, "outputs": [ { "data": 
{ "text/html": [ "
\n", "|  | roomtype | onsiterate | roomamenities | maxoccupancy | roomdescription | hotelname | city | country | starrating | mealsincluded |\n", "
|---|---|---|---|---|---|---|---|---|---|---|\n", "
| 0 | Vacation Home | 636.09 | Air conditioning: ;Closet: ;Fireplace: ;Free W... | 4 | Shower, Kitchenette, 2 bedrooms, 1 double bed ... | Pantlleni | Beddgelert | United Kingdom | 3 | False |\n", "
| 1 | Vacation Home | 591.74 | Air conditioning: ;Closet: ;Dishwasher: ;Firep... | 4 | Shower, Kitchenette, 2 bedrooms, 1 double bed ... | Willow Cottage | Beverley | United Kingdom | 3 | False |\n", "
| 2 | Guest room, Queen or Twin/Single Bed(s) | 0.00 | NaN | 2 | NaN | AC Hotel Manchester Salford Quays | Manchester | United Kingdom | 4 | False |\n", "
| 3 | Bargemaster King Accessible Room | 379.08 | Air conditioning: ;Free Wi-Fi in all rooms!: ;... | 2 | Shower | Lincoln Plaza London, Curio Collection by Hilton | London | United Kingdom | 4 | True |\n", "
| 4 | Twin Room | 156.17 | Additional toilet: ;Air conditioning: ;Blackou... | 2 | Room size: 15 m²/161 ft², Non-smoking, Shower,... | Ibis London Canning Town | London | United Kingdom | 3 | True |\n", "