Commit Graph

1 Commits (560e2182d85fa6c55a2930d5fa1be64202a5e56d)

Author SHA1 Message Date
olgavrou 3b07c0cf3d
RL Chain with VowpalWabbit (#10242)
- Description: This PR adds a new chain `rl_chain.PickBest` for learned
prompt variable injection, detailed description and usage can be found
in the example notebook added. It essentially adds a
[VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) layer
before the llm call in order to learn or personalize prompt variable
selections.

Most of the code is to make the API simple and provide lots of defaults
and data wrangling that is needed to use Vowpal Wabbit, so that the user
of the chain doesn't have to worry about it.

- Dependencies:
[vowpal-wabbit-next](https://pypi.org/project/vowpal-wabbit-next/),
     - sentence-transformers (already a dep)
     - numpy (already a dep)
  - tagging @ataymano who contributed to this chain
  - Tag maintainer: @baskaryan
  - Twitter handle: @olgavrou


Added example notebook and unit tests
11 months ago