"This notebook contains an end-to-end workflow to set up an Enterprise Knowledge Retrieval solution from scratch.\n",
"\n",
"### Problem Statement\n",
"\n",
"LLMs have great conversational ability but their knowledge is general and often out of date. Relevant knowledge often exists, but is kept in disparate datestores that are hard to surface with current search solutions.\n",
"\n",
"\n",
"### Objective\n",
"\n",
"We want to deliver an outstanding user experience where the user is presented with the right knowledge when they need it in a clear and conversational way. To accomplish this we need an LLM-powered solution that knows our organizational context and data, that can retrieve the right knowledge when the user needs it. \n"
"We'll build a knowledge retrieval solution that will embed a corpus of knowledge (in our case a database of Wikipedia manuals) and use it to answer user questions.\n",
"\n",
"### Learning Path\n",
"\n",
"#### Walkthrough\n",
"\n",
"You can follow on to this solution walkthrough through either the video recorded here, or the text walkthrough below. We'll build out the solution in the following stages:\n",
"- **Setup:** Initiate variables and connect to a vector database.\n",
"- **Storage:** Configure the database, prepare our data and store embeddings and metadata for retrieval.\n",
"- **Search:** Extract relevant documents back out with a basic search function and use an LLM to summarise results into a concise reply.\n",
"- **Answer:** Add a more sophisticated agent which will process the user's query and maintain a memory for follow-up questions.\n",
"- **Evaluate:** Take a sample evaluated question/answer pairs using our service and plot them to scope out remedial action."
]
},
{
"cell_type": "markdown",
"id": "ae9b1412",
"metadata": {},
"source": [
"## Walkthrough"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4e85be52",
"metadata": {},
"outputs": [],
"source": [
"%load_ext autoreload\n",
"%autoreload 2"
]
},
{
"cell_type": "markdown",
"id": "ab1a0a6a",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Import libraries and set up a connection to a Redis vector database for our knowledge base.\n",
"\n",
"You can substitute Redis for any other vectorstore or database - there are a [selection](https://python.langchain.com/en/latest/modules/indexes/vectorstores.html) that are supported by Langchain natively, while other connectors will need to be developed yourself."
" <td>Photons (from Greek φως, meaning light), in many atomic models in physics, are particles which transmit light. In other words, light is carried over space by photons. Photon is an elementary particle that is its own antiparticle. In quantum mechanics each photon has a characteristic quantum of energy that depends on frequency: A photon associated with light at a higher frequency will have more energy (and be associated with light at a shorter wavelength).\\n\\nPhotons have a rest mass of 0 (zero). However, Einstein's theory of relativity says that they do have a certain amount of momentum. Before the photon got its name, Einstein revived the proposal that light is separate pieces of energy (particles). These particles came to be known as photons. \\n\\nA photon is usually given the symbol γ (gamma),\\n\\nProperties \\n\\nPhotons are fundamental particles. Although they can be created and destroyed, their lifetime is infinite.\\n\\nIn a vacuum, all photons move at the speed of light, c, which is equal to 299,792,458 meters (approximately 300,000 kilometers) per second.\\n\\nA photon has a given frequency, which determines its color. Radio technology makes great use of frequency. Beyond the visible range, frequency is less discussed, for example it is little used in distinguishing between X-Ray photons and infrared. Frequency is equivalent to the quantum energy of the photon, as related by the Planck constant equation,\\n\\n,\\n\\nwhere is the photon's energy, is the Plank constant, and is the frequency of the light associated with the photon. This frequency, , is typically measured in cycles per second, or equivalently, in Hz. The quantum energy of different photons is often used in cameras, and other machines that use visible and higher than visible radiation. This because these photons are energetic enough to ionize atoms. \\n\\nAnother property of a photon is its wavelength. The frequency , wavelength , and speed of light are related by the equation,\\n\\n,\\n\\nwhere (lambda) is the wavelength, or length of the wave (typically measured in meters.)\\n\\nAnother important property of a photon is its polarity. If you saw a giant photon coming straight at you, it could appear as a swath whipping vertically, horizontally, or somewhere in between. Polarized sunglasses stop photons swinging up and down from passing. This is how they reduce glare as light bouncing off of surfaces tend to fly that way. Liquid crystal displays also use polarity to control which light passes through. Some animals can see light polarization. \\n\\nFinally, a photon has a property called spin. Spin is related to light's circular polarization.\\n\\nPhoton interactions with matter\\nLight is often created or absorbed when an electron gains or loses energy. This energy can be in the form of heat, kinetic energy, or other form. For example, an incandescent light bulb uses heat. The increase of energy can push an electron up one level in a shell called a \"valence\". This makes it unstable, and like everything, it wants to be in the lowest energy state. (If being in the lowest energy state is confusing, pick up a pencil and drop it. Once on the ground, the pencil will be in a lower energy state). When the electron drops back down to a lower energy state, it needs to release the energy that hit it, and it must obey the conservation of energy (energy can neither be created nor destroyed). Electrons release this energy as photons, and at higher intensities, this photon can be seen as visible light.\\n\\nPhotons and the electromagnetic force\\nIn particle physics, photons are responsible for electromagnetic force. Electromagnetism is an idea that combines electricity with magnetism. One common way that we experience electromagnetism in our daily lives is light, which is caused by electromagnetism. Electromagnetism is also responsible for charge, which is the reason that you can not push your hand through a table. Since photons are the force-carrying particle of electromagnetism, they are also gauge bosons. Some matter–cal
" <td>Thomas Dolby (born Thomas Morgan Robertson; 14 October 1958) is a British musican and computer designer. He is probably most famous for his 1982 hit, \"She Blinded me with Science\".\\n\\nHe married actress Kathleen Beller in 1988. The couple have three children together.\\n\\nDiscography\\n\\nSingles\\n\\nA Track did not chart in North America until 1983, after the success of \"She Blinded Me With Science\".\\n\\nAlbums\\n\\nStudio albums\\n\\nEPs\\n\\nReferences\\n\\nEnglish musicians\\nLiving people\\n1958 births\\nNew wave musicians\\nWarner Bros. Records artists</td>\n",
" <td>Embroidery is the art of decorating fabric or other materials with designs stitched in strands of thread or yarn using a needle. Embroidery may also incorporate other materials such as metal strips, pearls, beads, quills, and sequins. Sewing machines can be used to create machine embroidery.\\n\\nQualifications \\nCity and Guilds qualification in Embroidery allows embroiderers to become recognized for their skill. This qualification also gives them the credibility to teach. For example, the notable textiles artist, Kathleen Laurel Sage, began her teaching career by getting the City and Guilds Embroidery 1 and 2 qualifications. She has now gone on to write a book on the subject.\\n\\nReferences\\n\\nOther websites\\n The Crimson Thread of Kinship at the National Museum of Australia\\n\\nNeedlework</td>\n",
" <td>Consecutive numbers are numbers that follow each other in order. They have a difference of 1 between every two numbers. In a set of consecutive numbers, the mean and the median are equal. \\n\\nIf n is a number, then the next numbers will be n+1 and n+2. \\n\\nExamples \\n\\nConsecutive numbers that follow each other in order:\\n\\n 1, 2, 3, 4, 5\\n -3, −2, −1, 0, 1, 2, 3, 4\\n 6, 7, 8, 9, 10, 11, 12, 13\\n\\nConsecutive even numbers \\nConsecutive even numbers are even numbers that follow each other. They have a difference of 2 between every two numbers.\\n\\nIf n is an even integer, then n, n+2, n+4 and n+6 will be consecutive even numbers.\\n\\nFor example - 2,4,6,8,10,12,14,18 etc.\\n\\nConsecutive odd numbers\\nConsecutive odd numbers are odd numbers that follow each other. Like consecutive odd numbers, they have a difference of 2 between every two numbers.\\n\\nIf n is an odd integer, then n, n+2, n+4 and n+6 will be consecutive odd numbers.\\n\\nExamples\\n\\n3, 5, 7, 9, 11, 13, etc.\\n\\n−23, −21, −19, −17, −15, -13, -11\\n\\nIntegers</td>\n",
" <td>The German Empire (\"Deutsches Reich\" or \"Deutsches Kaiserreich\" in the German language) is the name for a group of German countries from January 18, 1871 to November 9, 1918. This is from the Unification of Germany when Wilhelm I of Prussia was made German Kaiser to when the third Emperor Wilhelm II was removed from power at the end of the First World War. In the 1920s, German nationalists started to call it the \"Second Reich\".\\n\\nThe name of Germany was \"Deutsches Reich\" until 1945. \"Reich\" can mean many things, empire, kingdom, state, \"richness\" or \"wealth\". Most members of the Empire were previously members of the North German Confederation. \\n\\nAt different times, there were three groups of smaller countries, each group was later called a \"Reich\" by some Germans. The first was the Holy Roman Empire. The second was the German Empire. The third was the Third Reich.\\n\\nThe words \"Second Reich\" were used for the German Empire by Arthur Moeller van den Bruck, a nationalist writer in the 1920s. He was trying to make a link with the earlier Holy Roman Empire which had once been very strong. Germany had lost First World War and was suffering big problems. van den Bruck wanted to start a \"Third Reich\" to unite the country. These words were later used by the Nazis to make themselves appear stronger.\\n\\nStates in the Empire\\n\\nRelated pages\\n Germany\\n Holy Roman Empire\\n Nazi Germany, or \"Drittes Reich\"\\n\\n1870s establishments in Germany\\n \\nStates and territories disestablished in the 20th century\\nStates and territories established in the 19th century\\n1871 establishments in Europe\\n1918 disestablishments in Germany</td>\n",
"0 Photons (from Greek φως, meaning light), in many atomic models in physics, are particles which transmit light. In other words, light is carried over space by photons. Photon is an elementary particle that is its own antiparticle. In quantum mechanics each photon has a characteristic quantum of energy that depends on frequency: A photon associated with light at a higher frequency will have more energy (and be associated with light at a shorter wavelength).\\n\\nPhotons have a rest mass of 0 (zero). However, Einstein's theory of relativity says that they do have a certain amount of momentum. Before the photon got its name, Einstein revived the proposal that light is separate pieces of energy (particles). These particles came to be known as photons. \\n\\nA photon is usually given the symbol γ (gamma),\\n\\nProperties \\n\\nPhotons are fundamental particles. Although they can be created and destroyed, their lifetime is infinite.\\n\\nIn a vacuum, all photons move at the speed of light, c, which is equal to 299,792,458 meters (approximately 300,000 kilometers) per second.\\n\\nA photon has a given frequency, which determines its color. Radio technology makes great use of frequency. Beyond the visible range, frequency is less discussed, for example it is little used in distinguishing between X-Ray photons and infrared. Frequency is equivalent to the quantum energy of the photon, as related by the Planck constant equation,\\n\\n,\\n\\nwhere is the photon's energy, is the Plank constant, and is the frequency of the light associated with the photon. This frequency, , is typically measured in cycles per second, or equivalently, in Hz. The quantum energy of different photons is often used in cameras, and other machines that use visible and higher than visible radiation. This because these photons are energetic enough to ionize atoms. \\n\\nAnother property of a photon is its wavelength. The frequency , wavelength , and speed of light are related by the equation,\\n\\n,\\n\\nwhere (lambda) is the wavelength, or length of the wave (typically measured in meters.)\\n\\nAnother important property of a photon is its polarity. If you saw a giant photon coming straight at you, it could appear as a swath whipping vertically, horizontally, or somewhere in between. Polarized sunglasses stop photons swinging up and down from passing. This is how they reduce glare as light bouncing off of surfaces tend to fly that way. Liquid crystal displays also use polarity to control which light passes through. Some animals can see light polarization. \\n\\nFinally, a photon has a property called spin. Spin is related to light's circular polarization.\\n\\nPhoton interactions with matter\\nLight is often created or absorbed when an electron gains or loses energy. This energy can be in the form of heat, kinetic energy, or other form. For example, an incandescent light bulb uses heat. The increase of energy can push an electron up one level in a shell called a \"valence\". This makes it unstable, and like everything, it wants to be in the lowest energy state. (If being in the lowest energy state is confusing, pick up a pencil and drop it. Once on the ground, the pencil will be in a lower energy state). When the electron drops back down to a lower energy state, it needs to release the energy that hit it, and it must obey the conservation of energy (energy can neither be created nor destroyed). Electrons release this energy as photons, and at higher intensities, this photon can be seen as visible light.\\n\\nPhotons and the electromagnetic force\\nIn particle physics, photons are responsible for electromagnetic force. Electromagnetism is an idea that combines electricity with magnetism. One common way that we experience electromagnetism in our daily lives is light, which is caused by electromagnetism. Electromagnetism is also responsible for charge, which is the reason that you can not push your hand through a table. Since photons are the force-carrying particle of electromagnetism, they are also gauge bosons. Some matter–called dar
"1 Thomas Dolby (born Thomas Morgan Robertson; 14 October 1958) is a British musican and computer designer. He is probably most famous for his 1982 hit, \"She Blinded me with Science\".\\n\\nHe married actress Kathleen Beller in 1988. The couple have three children together.\\n\\nDiscography\\n\\nSingles\\n\\nA Track did not chart in North America until 1983, after the success of \"She Blinded Me With Science\".\\n\\nAlbums\\n\\nStudio albums\\n\\nEPs\\n\\nReferences\\n\\nEnglish musicians\\nLiving people\\n1958 births\\nNew wave musicians\\nWarner Bros. Records artists
"2 Embroidery is the art of decorating fabric or other materials with designs stitched in strands of thread or yarn using a needle. Embroidery may also incorporate other materials such as metal strips, pearls, beads, quills, and sequins. Sewing machines can be used to create machine embroidery.\\n\\nQualifications \\nCity and Guilds qualification in Embroidery allows embroiderers to become recognized for their skill. This qualification also gives them the credibility to teach. For example, the notable textiles artist, Kathleen Laurel Sage, began her teaching career by getting the City and Guilds Embroidery 1 and 2 qualifications. She has now gone on to write a book on the subject.\\n\\nReferences\\n\\nOther websites\\n The Crimson Thread of Kinship at the National Museum of Australia\\n\\nNeedlework
"3 Consecutive numbers are numbers that follow each other in order. They have a difference of 1 between every two numbers. In a set of consecutive numbers, the mean and the median are equal. \\n\\nIf n is a number, then the next numbers will be n+1 and n+2. \\n\\nExamples \\n\\nConsecutive numbers that follow each other in order:\\n\\n 1, 2, 3, 4, 5\\n -3, −2, −1, 0, 1, 2, 3, 4\\n 6, 7, 8, 9, 10, 11, 12, 13\\n\\nConsecutive even numbers \\nConsecutive even numbers are even numbers that follow each other. They have a difference of 2 between every two numbers.\\n\\nIf n is an even integer, then n, n+2, n+4 and n+6 will be consecutive even numbers.\\n\\nFor example - 2,4,6,8,10,12,14,18 etc.\\n\\nConsecutive odd numbers\\nConsecutive odd numbers are odd numbers that follow each other. Like consecutive odd numbers, they have a difference of 2 between every two numbers.\\n\\nIf n is an odd integer, then n, n+2, n+4 and n+6 will be consecutive odd numbers.\\n\\nExamples\\n\\n3, 5, 7, 9, 11, 13, etc.\\n\\n−23, −21, −19, −17, −15, -13, -11\\n\\nIntegers
"4 The German Empire (\"Deutsches Reich\" or \"Deutsches Kaiserreich\" in the German language) is the name for a group of German countries from January 18, 1871 to November 9, 1918. This is from the Unification of Germany when Wilhelm I of Prussia was made German Kaiser to when the third Emperor Wilhelm II was removed from power at the end of the First World War. In the 1920s, German nationalists started to call it the \"Second Reich\".\\n\\nThe name of Germany was \"Deutsches Reich\" until 1945. \"Reich\" can mean many things, empire, kingdom, state, \"richness\" or \"wealth\". Most members of the Empire were previously members of the North German Confederation. \\n\\nAt different times, there were three groups of smaller countries, each group was later called a \"Reich\" by some Germans. The first was the Holy Roman Empire. The second was the German Empire. The third was the Third Reich.\\n\\nThe words \"Second Reich\" were used for the German Empire by Arthur Moeller van den Bruck, a nationalist writer in the 1920s. He was trying to make a link with the earlier Holy Roman Empire which had once been very strong. Germany had lost First World War and was suffering big problems. van den Bruck wanted to start a \"Third Reich\" to unite the country. These words were later used by the Nazis to make themselves appear stronger.\\n\\nStates in the Empire\\n\\nRelated pages\\n Germany\\n Holy Roman Empire\\n Nazi Germany, or \"Drittes Reich\"\\n\\n1870s establishments in Germany\\n \\nStates and territories disestablished in the 20th century\\nStates and territories established in the 19th century\\n1871 establishments in Europe\\n1918 disestablishments in Germany
"We'll initialise our vector database first. Which database you choose and how you store data in it is a key decision point, and we've collated a few principles to aid your decision here:\n",
"\n",
"#### How much data to store\n",
"How much metadata do you want to include in the index. Metadata can be used to filter your queries or to bring back more information upon retrieval for your application to use, but larger indices will be slower so there is a trade-off.\n",
"\n",
"There are two common design patterns here:\n",
"- **All-in-one:** Store your metadata with the vector embeddings so you perform semantic search and retrieval on the same database. This is easier to setup and run, but can run into scaling issues when your index grows.\n",
"- **Vectors only:** Store just the embeddings and any IDs/references needed to locate the metadata that goes with the vector in a different database or location. In this pattern the vector database is only used to locate the most relevant IDs, then those are looked up from a different database. This can be more scalable if your vector database is going to be extremely large, or if you have large volumes of metadata with each vector.\n",
"\n",
"#### Which vector database to use\n",
"\n",
"The vector database market is wide and varied, so we won't recommend one over the other. For a few options you can review [this cookbook](./vector_databases/Using_vector_databases_for_embeddings_search.ipynb) and the sub-folders, which have examples supplied by many of the vector database providers in the market. \n",
"\n",
"We're going to use Redis as our database for both document contents and the vector embeddings. You will need the full Redis Stack to enable use of Redisearch, which is the module that allows semantic search - more detail is in the [docs for Redis Stack](https://redis.io/docs/stack/get-started/install/docker/).\n",
"\n",
"To set this up locally, you will need to:\n",
"- Install an appropriate version of [Docker](https://docs.docker.com/desktop/) for your OS\n",
"- Ensure Docker is running i.e. by running ```docker run hello-world```\n",
"- Run the following command: ```docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest```.\n",
"\n",
"The code used here draws heavily on [this repo](https://github.com/RedisAI/vecsim-demo).\n",
"\n",
"After setting up the Docker instance of Redis Stack, you can follow the below instructions to initiate a Redis connection and create a Hierarchical Navigable Small World (HNSW) index for semantic search."
"The next step is to prepare your data. There are a few decisions to keep in mind here:\n",
"\n",
"#### Chunking your data\n",
"\n",
"In this context, \"chunking\" means cutting up the text into reasonable sizes so that the content will fit into the context length of the language model you choose. If your data is small enough or your LLM has a large enough context limit then you can proceed with no chunking, but in many cases you'll need to chunk your data. I'll share two main design patterns here:\n",
"- **Token-based:** Chunking your data based on some common token threshold i.e. 300, 500, 1000 depending on your use case. This approach works best with a grid-search evaluation to decide the optimal chunking logic over a set of evaluation questions. Variables to consider are whether chunks have overlaps, and whether you extend or truncate a section to keep full sentences and paragraphs together.\n",
"- **Deterministic:** Deterministic chunking uses some common delimiter, like a page break, paragraph end, section header etc. to chunk. This can work well if you have data of reasonable uniform structure, or if you can use GPT to help annotate the data first so you can guarantee common delimiters. However, it can be difficult to handle your chunks when you stuff them into the prompt given you need to cater for many different lengths of content, so consider that in your application design.\n",
"\n",
"#### Which vectors should you store\n",
"\n",
"It is critical to think through the user experience you're building towards because this will inform both the number and content of your vectors. Here are two example use cases that show how these can pan out:\n",
"- **Tool Manual Knowledge Base:** We have a database of manuals that our customers want to search over. For this use case, we want a vector to allow the user to identify the right manual, before searching a different set of vectors to interrogate the content of the manual to avoid any cross-pollination of similar content between different manuals. \n",
" - **Title Vector:** Could include title, author name, brand and abstract.\n",
" - **Content Vector:** Includes content only.\n",
"- **Investor Reports:** We have a database of investor reports that contain financial information about public companies. I want relevant snippets pulled out and summarised so I can decide how to invest. In this instance we want one set of content vectors, so that the retrieval can pull multiple entries on a company or industry, and summarise them to form a composite analysis.\n",
" - **Content Vector:** Includes content only, or content supplemented by other features that improve search quality such as author, industry etc.\n",
" \n",
"For this walkthrough we'll go with 1000 token-based chunking of text content with no overlap, and embed them with the article title included as a prefix."
" 'content': 'Title: Photon;\\nPhotons (from Greek φως, meaning light), in many atomic models in physics, are particles which transmit light. In other words, light is carried over space by photons. Photon is an elementary particle that is its own antiparticle. In quantum mechanics each photon has a characteristic quantum of energy that depends on frequency: A photon associated with light at a higher frequency will have more energy (and be associated with light at a shorter wavelength).\\n\\nPhotons have a rest mass of 0 (zero). However, Einstein\\'s theory of relativity says that they do have a certain amount of momentum. Before the photon got its name, Einstein revived the proposal that light is separate pieces of energy (particles). These particles came to be known as photons. \\n\\nA photon is usually given the symbol γ (gamma),\\n\\nProperties \\n\\nPhotons are fundamental particles. Although they can be created and destroyed, their lifetime is infinite.\\n\\nIn a vacuum, all photons move at the speed of light, c, which is equal to 299,792,458 meters (approximately 300,000 kilometers) per second.\\n\\nA photon has a given frequency, which determines its color. Radio technology makes great use of frequency. Beyond the visible range, frequency is less discussed, for example it is little used in distinguishing between X-Ray photons and infrared. Frequency is equivalent to the quantum energy of the photon, as related by the Planck constant equation,\\n\\n,\\n\\nwhere is the photon\\'s energy, is the Plank constant, and is the frequency of the light associated with the photon. This frequency, , is typically measured in cycles per second, or equivalently, in Hz. The quantum energy of different photons is often used in cameras, and other machines that use visible and higher than visible radiation. This because these photons are energetic enough to ionize atoms. \\n\\nAnother property of a photon is its wavelength. The frequency , wavelength , and speed of light are related by the equation,\\n\\n,\\n\\nwhere (lambda) is the wavelength, or length of the wave (typically measured in meters.)\\n\\nAnother important property of a photon is its polarity. If you saw a giant photon coming straight at you, it could appear as a swath whipping vertically, horizontally, or somewhere in between. Polarized sunglasses stop photons swinging up and down from passing. This is how they reduce glare as light bouncing off of surfaces tend to fly that way. Liquid crystal displays also use polarity to control which light passes through. Some animals can see light polarization. \\n\\nFinally, a photon has a property called spin. Spin is related to light\\'s circular polarization.\\n\\nPhoton interactions with matter\\nLight is often created or absorbed when an electron gains or loses energy. This energy can be in the form of heat, kinetic energy, or other form. For example, an incandescent light bulb uses heat. The increase of energy can push an electron up one level in a shell called a \"valence\". This makes it unstable, and like everything, it wants to be in the lowest energy state. (If being in the lowest energy state is confusing, pick up a pencil and drop it. Once on the ground, the pencil will be in a lower energy state). When the electron drops back down to a lower energy state, it needs to release the energy that hit it, and it must obey the conservation of energy (energy can neither be created nor destroyed). Electrons release this energy as photons, and at higher intensities, this photon can be seen as visible light.\\n\\nPhotons and the electromagnetic force\\nIn particle physics, photons are responsible for electromagnetic force. Electromagnetism is an idea that combines electricity with magnetism. One common way that we experience electromagnetism in our daily lives is light, which is caused by electromagnetism. Electromagnetism is also responsible for charge, which is the reason that you can not push your hand through a table. Since photons are the force-carrying particle of electromagnetism, they are also gaug
" 'content': 'Title: Photon;\\nPhotons (from Greek φως, meaning light), in many atomic models in physics, are particles which transmit light. In other words, light is carried over space by photons. Photon is an elementary particle that is its own antiparticle. In quantum mechanics each photon has a characteristic quantum of energy that depends on frequency: A photon associated with light at a higher frequency will have more energy (and be associated with light at a shorter wavelength).\\n\\nPhotons have a rest mass of 0 (zero). However, Einstein\\'s theory of relativity says that they do have a certain amount of momentum. Before the photon got its name, Einstein revived the proposal that light is separate pieces of energy (particles). These particles came to be known as photons. \\n\\nA photon is usually given the symbol γ (gamma),\\n\\nProperties \\n\\nPhotons are fundamental particles. Although they can be created and destroyed, their lifetime is infinite.\\n\\nIn a vacuum, all photons move at the speed of light, c, which is equal to 299,792,458 meters (approximately 300,000 kilometers) per second.\\n\\nA photon has a given frequency, which determines its color. Radio technology makes great use of frequency. Beyond the visible range, frequency is less discussed, for example it is little used in distinguishing between X-Ray photons and infrared. Frequency is equivalent to the quantum energy of the photon, as related by the Planck constant equation,\\n\\n,\\n\\nwhere is the photon\\'s energy, is the Plank constant, and is the frequency of the light associated with the photon. This frequency, , is typically measured in cycles per second, or equivalently, in Hz. The quantum energy of different photons is often used in cameras, and other machines that use visible and higher than visible radiation. This because these photons are energetic enough to ionize atoms. \\n\\nAnother property of a photon is its wavelength. The frequency , wavelength , and speed of light are related by the equation,\\n\\n,\\n\\nwhere (lambda) is the wavelength, or length of the wave (typically measured in meters.)\\n\\nAnother important property of a photon is its polarity. If you saw a giant photon coming straight at you, it could appear as a swath whipping vertically, horizontally, or somewhere in between. Polarized sunglasses stop photons swinging up and down from passing. This is how they reduce glare as light bouncing off of surfaces tend to fly that way. Liquid crystal displays also use polarity to control which light passes through. Some animals can see light polarization. \\n\\nFinally, a photon has a property called spin. Spin is related to light\\'s circular polarization.\\n\\nPhoton interactions with matter\\nLight is often created or absorbed when an electron gains or loses energy. This energy can be in the form of heat, kinetic energy, or other form. For example, an incandescent light bulb uses heat. The increase of energy can push an electron up one level in a shell called a \"valence\". This makes it unstable, and like everything, it wants to be in the lowest energy state. (If being in the lowest energy state is confusing, pick up a pencil and drop it. Once on the ground, the pencil will be in a lower energy state). When the electron drops back down to a lower energy state, it needs to release the energy that hit it, and it must obey the conservation of energy (energy can neither be created nor destroyed). Electrons release this energy as photons, and at higher intensities, this photon can be seen as visible light.\\n\\nPhotons and the electromagnetic force\\nIn particle physics, photons are responsible for electromagnetic force. Electromagnetism is an idea that combines electricity with magnetism. One common way that we experience electromagnetism in our daily lives is light, which is caused by electromagnetism. Electromagnetism is also responsible for charge, which is the reason that you can not push your hand through a table. Since photons are the force-carrying particle of electromagnetism, they are also gaug
"We can now use our knowledge base to bring back search results. This is one of the areas of highest friction in enterprise knowledge retrieval use cases, with the most common being that the system is not retrieving what you intuitively think are the most relevant documents. There are a few ways of tackling this - I'll share a few options here, as well as some resources to take your research further:\n",
"\n",
"#### Vector search, keyword search or a hybrid\n",
"\n",
"Despite the strong capabilities out of the box that vector search gives, search is still not a solved problem, and there are well proven [Lucene-based](https://en.wikipedia.org/wiki/Apache_Lucene) search solutions such Elasticsearch and Solr that use methods that work well for certain use cases, as well as the sparse vector methods of traditional NLP such as [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf). If your retrieval is poor, the answer may be one of these in particular, or a combination:\n",
"- **Vector search:** Converts your text into vector embeddings which can be searched using KNN, SVM or some other model to return the most relevant results. This is the approach we take in this workbook, using a RediSearch vector DB which employs a KNN search under the hood.\n",
"- **Keyword search:** This method uses any keyword-based search approach to return a score - it could use Elasticsearch/Solr out-of-the-box, or a TF-IDF approach like BM25.\n",
"- **Hybrid search:** This last approach is a mix of the two, where you produce both a vector search and keyword search result, before using an ```alpha``` between 0 and 1 to weight the outputs. There is a great example of this explained by the Weaviate team [here](https://weaviate.io/blog/hybrid-search-explained).\n",
"\n",
"#### Hypothetical Document Embeddings (HyDE)\n",
"\n",
"This is a novel approach from [this paper](https://arxiv.org/abs/2212.10496), which states that a hypothetical answer to a question is more semantically similar to the real answer than the question is. In practice this means that your search would use GPT to generate a hypothetical answer, then embed that and use it for search. I've seen success with this both as a pure search, and as a retry step if the initial retrieval fails to retrieve relevant content. A simple example implementation is here:\n",
"```\n",
"def answer_question_hyde(question,prompt):\n",
" \n",
" hyde_prompt = '''You are OracleGPT, an helpful expert who answers user questions to the best of their ability.\n",
" Provide a confident answer to their question. If you don't know the answer, make the best guess you can based on the context of the question.\n",
"This next approach leverages the learning you gain from real question/answer pairs that your users will generate during the evaluation approach. It works by:\n",
"- Creating a dataset of positive (and optionally negative) question and answer pairs. Positive examples would be a correct retrieval to a question, while negative would be poor retrievals.\n",
"- Calculating the embeddings for both questions and answers and the cosine similarity between them.\n",
"- Train a model to optimize the embeddings matrix and test retrieval, picking the best one.\n",
"- Perform a matrix multiplication of the base Ada embeddings by this new best matrix, creating a new fine-tuned embedding to do for retrieval.\n",
"\n",
"There is a great walkthrough of both the approach and the code to perform it in [this cookbook](./Customizing_embeddings.ipynb).\n",
"\n",
"#### Reranking\n",
"\n",
"One other well-proven method from traditional search solutions that can be applied to any of the above approaches is reranking, where we over-fetch our search results, and then deterministically rerank based on a modifier or set of modifiers.\n",
"\n",
"An example is investor reports again - it is highly likely that if we have 3 reports on Apple, we'll want to make our investment decisions based on the latest one. In this instance a ```recency``` modifier could be applied to the vector scores to sort them, giving us the latest one on the top even if it is not the most semantically similar to our search question. "
]
},
{
"cell_type": "markdown",
"id": "9b2fdc7a",
"metadata": {},
"source": [
"For this walkthrough we'll stick with a basic semantic search bringing back the top 5 chunks for a user question, and providing a summarised response using GPT."
" <td>Title: Thomas Dolby;\\nThomas Dolby (born Thomas Morgan Robertson; 14 October 1958) is a British musican and computer designer. He is probably most famous for his 1982 hit, \"She Blinded me with Science\".\\n\\nHe married actress Kathleen Beller in 1988. The couple have three children together.\\n\\nDiscography\\n\\nSingles\\n\\nA Track did not chart in North America until 1983, after the success of \"She Blinded Me With Science\".\\n\\nAlbums\\n\\nStudio albums\\n\\nEPs\\n\\nReferences\\n\\nEnglish musicians\\nLiving people\\n1958 births\\nNew wave musicians\\nWarner Bros. Records artists</td>\n",
"0 Title: Thomas Dolby;\\nThomas Dolby (born Thomas Morgan Robertson; 14 October 1958) is a British musican and computer designer. He is probably most famous for his 1982 hit, \"She Blinded me with Science\".\\n\\nHe married actress Kathleen Beller in 1988. The couple have three children together.\\n\\nDiscography\\n\\nSingles\\n\\nA Track did not chart in North America until 1983, after the success of \"She Blinded Me With Science\".\\n\\nAlbums\\n\\nStudio albums\\n\\nEPs\\n\\nReferences\\n\\nEnglish musicians\\nLiving people\\n1958 births\\nNew wave musicians\\nWarner Bros. Records artists \n",
"We've now created a knowledge base that can answer user questions on Wikipedia. However, the user experience could be better, and this is where the Answer layer comes in, where an LLM Agent is used to interact with the user.\n",
"\n",
"There are different level of complexity in building a knowledge retrieval experience leveraging an LLM; there is an experience vs. effort trade-off to consider when selecting the right type of interaction. There are many patterns, but I'll highlight a few of the most common here:\n",
"\n",
"#### Choosing the user experience and architecture\n",
"\n",
"There are different level of complexity in building a knowledge retrieval experience leveraging an LLM; there is an experience vs. effort trade-off to consider when selecting the right type of interaction. There are many patterns, but I'll highlight a few of the most common here:\n",
"- **Q&A:** Your classic search engine use case, where the user inputs a question and your LLM gives them an answer either using its knowledge or, much more commonly, using a knowledge base that you prepare using the steps we've covered already. This simple use case assumes no memory of past queries is required, and no ability to clarify with the human or ask for more information.\n",
"- **Chat:** I think of Chat as being Q&A + memory - this is a slightly more sophisticated interaction where the LLM remembers what was previously asked and can delve deeper on something already covered.\n",
"- **Agent:** The most sophisticated is what LangChain calls an Agent, they leverage large language models to process and produce human-like results through a variety of tools, and will chain queries together dynamically until it has an answer that the LLM feels is appropriate to answer the user's question. However, for every \"turn\" you allow between Agent and user you increase the risks of loss of context, hallucination, or parsing errors, so be clear about the exact requirements your users have before embarking on building the Answer layer.\n",
"\n",
"Q&A use cases are the simplest to implement, while Agents can give the most sophisticated user experience - in this notebook we'll build an Agent with memory and a single Tool to give an appreciation for the flexibilty prompt chaining gives you in getting a more complete answer for your users.\n",
"\n",
"#### Ensuring reliability\n",
"\n",
"The more complexity you add, the more chance your LLM will fail to respond correctly, or a response will come back in the wrong format and break your Answer pipeline. We'll share a few methods our customers have used elsewhere to help \"channel\" the Agent down a more deterministic path, and to deal with issues when they do crop up:\n",
"- **Prompt chaining:** Prompting the model to take a step-by-step approach and think aloud using a scratchpad has been proven to deliver more consistent results. It also means that as a developer you can break up one complex prompt into many simpler, more deterministic prompts, with the output of one prompt becoming the input for the next. This approach is known as Chain-of-Thought (CoT) reasoning - I'd suggest digging deeper as this is a dynamic new area of research, with a few of the key papers referenced here:\n",
" - Chain of thought prompting [paper](https://arxiv.org/abs/2201.11903)\n",
"- **Self-referencing:** You can return references for the LLM's answer through either your application logic, or by prompt engineering it to return references. I would generally suggest doing it in your application logic, although if you have multiple chunks then a hybrid approach where you ask the LLM to return the key of the chunk it used could be advisable. I view this as a UX opportunity, where for many search use cases giving the \"raw\" output of the chunks retrieved as well as the summarised answer can give the user the best of both worlds, but please go with whatever is most appropriate for your users.\n",
"- **Discriminator models:** The best control for unwanted outputs is undoubtably through preventing it from happening with prompt engineering, prompt chaining and retrieval. However, when all these fail then a discriminator model is a useful detective control. This is a classifier trained on past unwanted outputs, that flags the Agent's response to the user as Safe or Not, enabling you to perform some business logic to either retry, pass to a human, or say it doesn't know. \n",
" - There is an example in our [Help Center](https://help.openai.com/en/articles/5528730-fine-tuning-a-classifier-to-improve-truthfulness).\n",
"\n",
"This is a dynamic topic that has still not consolidated to a clear design that works best above all others, so for ease of implementation we will use LangChain, which supplies a framework with implementations for most of the concepts we've discussed above.\n",
"\n",
"We'll create an Agent with access to our knowledge base, give it a prompt template and a custom parser for extracting the answers, set up a prompt chain and then let it answer our Wikipedia questions.\n",
"\n",
"Our work here draws heavily on LangChain's great documentation, in particular [this guide](https://python.langchain.com/en/latest/modules/agents/agents/custom_llm_chat_agent.html)."
"Observation:\u001b[36;1m\u001b[1;3mThomas Dolby is known for being a British musician and computer designer, and his 1982 hit \"She Blinded Me With Science\".\u001b[0m\u001b[32;1m\u001b[1;3mNow that I know who Thomas Dolby is, I can answer the question.\n",
"Final Answer: Thomas Dolby is known for being a British musician and computer designer, and his 1982 hit \"She Blinded Me With Science\".\u001b[0m\n",
"Last comes the not-so-fun bit that will make the difference between nifty prototype and production application - the process of evaluating and tuning your results. \n",
"\n",
"The key takeaway here is to make a framework that saves the results of each evaluation, as well as the parameters. Evaluation can be a difficult task that takes significant resources, so it is best to start prepared to handle multiple iterations. Some useful principles we've seen successful deployments use are:\n",
"- **Assign clear product ownership and metrics:** Ensure you have a team aligned from the start to annotate the outputs and determine whether they're bad or good. This may seem an obvious step, but too often the focus is on the engineering challenge of successfully retrieving content rather than the product challenge of providing retrieval results that are useful.\n",
"- **Log everything:** Store all requests and responses to and from your LLM and retrieval service if you can, it builds a great base for fine-tuning both the embeddings and any fine-tuned models or few-shot LLMs in future.\n",
"- **Use GPT-4 as a labeller:** When running evaluations, it can help to use GPT-4 as a gatekeeper for human annotation. Human annotation is costly and time-consuming, so doing an initial evaluation run with GPT-4 can help set a quality bar that needs to be met to justify human labeling. At this stage I would not suggest using GPT-4 as your only labeler, but it can certainly ease the burden.\n",
" - This approach is outlined further in [this paper](https://arxiv.org/abs/2108.13487).\n",
"\n",
"We'll use these principles to make a quick evaluation framework where we will:\n",
"- Use GPT-4 to make a list of hypothetical questions on our topic\n",
"- Ask our Agent the questions and save question/answer tuples\n",
" - These two above steps simulate the actual users interacting with your application\n",
"- Get GPT-4 to evaluate whether the answers correctly respond to the questions\n",
"- Look at our results to measure how well the Agent answered the questions\n",
"evaluation_question_prompt = '''You are a helpful Wikipedia assistant who will generate a list of 10 creative general knowledge questions in markdown format.\n",
"['1. What is the difference between weather and climate?', '2. Who designed the Eiffel Tower?', '3. What is the capital of Australia?', '4. What is the chemical symbol for gold?', '5. Who invented the telephone?', '6. What is the largest organ in the human body?', '7. Which famous artist painted the Mona Lisa?', '8. What is the highest mountain in Africa?', '9. What famous building was destroyed during the September 11th attacks?', '10. Who wrote the novel \"To Kill a Mockingbird\"?']\n"
" [('1. What is the difference between weather and climate?',\n",
" 'Weather refers to short-term atmospheric conditions in a specific area, while climate refers to long-term patterns and trends of weather in a particular region over a period of time.'),\n",
" ('2. Who designed the Eiffel Tower?',\n",
" 'Gustave Eiffel designed the Eiffel Tower.'),\n",
" ('3. What is the capital of Australia?',\n",
" 'The capital of Australia is Canberra.'),\n",
" ('4. What is the chemical symbol for gold?',\n",
" 'The chemical symbol for gold is Au.'),\n",
" ('5. Who invented the telephone?',\n",
" 'Alexander Graham Bell invented the telephone.')])"
"Depending on how GPT did here you may have actually gotten some good responses, but in all likelihood in the real world you'll end up with incorrect or unable to answer results, and will need to tune your search, LLM or another aspect of the pipeline.\n",
"- **Incorrect answers:** Either prompt engineering to help the model work out how to answer better (maybe even a bigger model like GPT-4), or search optimisation to return more relevant chunks. Chunking/embedding changes may help this as well - larger chunks may give more context, allowing the model to formulate a better answer.\n",
"- **Unable to answer:** This is either a retrieval problem, or the data doesn't exist in our knowledge base. We can prompt engineer to classify questions that are \"out-of-bounds\" and give the user a stock reply, or we can tune our search so the relevant data is returned.\n",
"This is the framework we'll build on to get our knowledge retrieval solution to production - again, log everything and store each run down to a question level so you can track regressions and iterate towards your production solution."
"This concludes our Enterprise Knowledge Retrieval walkthrough. We hope you've found it useful, and that you're now in a position to build enterprise knowledge retrieval solutions, and have a few tricks to start you down the road of putting them into production."