# langchain/libs/experimental/langchain_experimental/rl_chain/__init__.py

"""
**RL (Reinforcement Learning) Chain** uses `Vowpal Wabbit (VW)` models
to apply reinforcement learning conditioned on a context, with the goal of
modifying the prompt before the LLM call.

[Vowpal Wabbit](https://vowpalwabbit.org/) provides fast, efficient,
and flexible online machine learning techniques for reinforcement learning,
supervised learning, and more.
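
The idea of learning, per context, which value to inject into a prompt can be
sketched in plain Python. This is only an illustrative toy under stated
assumptions: it uses a simple epsilon-greedy rule rather than Vowpal Wabbit,
and `EpsilonGreedyPicker`, `pick`, `learn`, and the meal names are hypothetical
names invented for this sketch, not part of this package's API.

```python
import random
from collections import defaultdict


class EpsilonGreedyPicker:
    # Hypothetical toy learner: keeps a running mean reward per
    # (context, action) pair. Not Vowpal Wabbit and not this package's API.

    def __init__(self, epsilon: float = 0.1, seed: int = 0) -> None:
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.totals = defaultdict(float)  # summed reward per (context, action)
        self.counts = defaultdict(int)  # observations per (context, action)

    def _mean(self, context: str, action: str) -> float:
        n = self.counts[(context, action)]
        return self.totals[(context, action)] / n if n else 0.0

    def pick(self, context: str, actions: list[str]) -> str:
        # Explore with probability epsilon, otherwise exploit the best mean.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(actions)
        return max(actions, key=lambda a: self._mean(context, a))

    def learn(self, context: str, action: str, reward: float) -> None:
        # Record the observed reward for the chosen action in this context.
        self.totals[(context, action)] += reward
        self.counts[(context, action)] += 1


picker = EpsilonGreedyPicker(epsilon=0.0)  # epsilon=0.0 disables exploration
meals = ["beef enchiladas", "veggie burger"]
picker.learn("vegetarian", "veggie burger", 1.0)
picker.learn("vegetarian", "beef enchiladas", 0.0)

# The learned choice is substituted into the prompt before any LLM call.
choice = picker.pick("vegetarian", meals)
prompt = f"Recommend this meal to the user: {choice}"
```

In this sketch the reward is supplied by hand; in practice it would come from
some scoring step applied after the LLM responds.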
"""

import logging

from langchain_experimental.rl_chain.base import (
    AutoSelectionScorer,
    BasedOn,
    Embed,
    Embedder,
    Policy,
    SelectionScorer,
    ToSelectFrom,
    VwPolicy,
)
from langchain_experimental.rl_chain.helpers import embed, stringify_embedding
from langchain_experimental.rl_chain.pick_best_chain import (
    PickBest,
    PickBestEvent,
    PickBestFeatureEmbedder,
    PickBestRandomPolicy,
    PickBestSelected,
)


def configure_logger() -> None:
    """Attach an INFO-level stream handler to this package's logger."""
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)
    # StreamHandler writes to stderr by default.
    ch = logging.StreamHandler()
    formatter = logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    )
    ch.setFormatter(formatter)
    ch.setLevel(logging.INFO)
    logger.addHandler(ch)


# Runs once at import time, so importing this package configures its logger.
configure_logger()

__all__ = [
    "PickBest",
    "PickBestEvent",
    "PickBestSelected",
    "PickBestFeatureEmbedder",
    "PickBestRandomPolicy",
    "Embed",
    "BasedOn",
    "ToSelectFrom",
    "SelectionScorer",
    "AutoSelectionScorer",
    "Embedder",
    "Policy",
    "VwPolicy",
    "embed",
    "stringify_embedding",
]