RL Chain with VowpalWabbit (#10242)

- Description: This PR adds a new chain, `rl_chain.PickBest`, for learned prompt variable injection. A detailed description and usage can be found in the example notebook added with this PR. The chain adds a [VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) layer before the LLM call in order to learn or personalize prompt variable selections. Most of the code exists to keep the API simple and to provide the defaults and data wrangling needed to use Vowpal Wabbit, so that the user of the chain doesn't have to worry about them.
- Dependencies:
  - [vowpal-wabbit-next](https://pypi.org/project/vowpal-wabbit-next/)
  - sentence-transformers (already a dep)
  - numpy (already a dep)
- Tagging @ataymano, who contributed to this chain
- Tag maintainer: @baskaryan
- Twitter handle: @olgavrou

Added example notebook and unit tests.
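
To make "learned prompt variable injection" concrete, here is a minimal, self-contained sketch of the idea: a selector picks one candidate value for a prompt variable, the filled prompt would go to the LLM, and a reward on the result updates the selector. This is not the code added in this PR and it does not use VowpalWabbit; it swaps in a plain epsilon-greedy bandit, and every name in it (`EpsilonGreedyPicker`, `build_prompt`, the `meal` and `user` variables) is hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedyPicker:
    """Toy stand-in for the learned selection layer (assumption: NOT VowpalWabbit)."""

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon              # probability of exploring a random candidate
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)      # (context, candidate) -> number of rewards seen
        self.values = defaultdict(float)    # (context, candidate) -> running mean reward

    def pick(self, context, candidates):
        """Explore with probability epsilon, otherwise pick the best-known candidate."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(candidates)
        return max(candidates, key=lambda c: self.values[(context, c)])

    def learn(self, context, choice, reward):
        """Update the running mean reward for the (context, choice) pair."""
        key = (context, choice)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


def build_prompt(template, picker, context, candidates):
    """Inject the selected candidate into the prompt before any LLM call."""
    choice = picker.pick(context, candidates)
    return choice, template.format(meal=choice, user=context)


if __name__ == "__main__":
    picker = EpsilonGreedyPicker(epsilon=0.0)
    meals = ["beef burger", "veggie pizza"]
    # Hypothetical feedback: the vegetarian user liked the pizza, not the burger.
    picker.learn("vegetarian", "veggie pizza", 1.0)
    picker.learn("vegetarian", "beef burger", 0.0)
    choice, prompt = build_prompt(
        "Recommend {meal} to a {user} user.", picker, "vegetarian", meals
    )
    print(prompt)
```

In this sketch the reward is supplied by hand. In a real setup it would have to come from some scoring of the LLM output or from explicit user feedback, and how the chain in this PR obtains it is described in the example notebook rather than here.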
9 months ago · 3b07c0cf3d
parent 3a299b9680
commit 3b07c0cf3d
4 changed files with 0 additions and 0 deletions
--- a/libs/experimental/tests/integration_tests/chains/rl_chain/test_pick_best_chain_call.py
+++ b/libs/experimental/tests/integration_tests/chains/rl_chain/test_pick_best_chain_call.py
--- a/libs/experimental/tests/integration_tests/chains/rl_chain/test_pick_best_text_embedder.py
+++ b/libs/experimental/tests/integration_tests/chains/rl_chain/test_pick_best_text_embedder.py
--- a/libs/experimental/tests/integration_tests/chains/rl_chain/test_rl_chain_base_embedder.py
+++ b/libs/experimental/tests/integration_tests/chains/rl_chain/test_rl_chain_base_embedder.py
--- a/libs/experimental/tests/integration_tests/chains/rl_chain/test_utils.py
+++ b/libs/experimental/tests/integration_tests/chains/rl_chain/test_utils.py