langchain/libs/experimental/langchain_experimental/llms/rellm_decoder.py

"""Experimental implementation of RELLM wrapped LLM."""
from __future__ import annotations

from typing import TYPE_CHECKING, Any, List, Optional, cast

from langchain.callbacks.manager import CallbackManagerForLLMRun
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
from langchain_community.llms.utils import enforce_stop_tokens

from langchain_experimental.pydantic_v1 import Field, root_validator

if TYPE_CHECKING:
    import rellm
    from regex import Pattern as RegexPattern
else:
    try:
        from regex import Pattern as RegexPattern
    except ImportError:
        pass


def import_rellm() -> rellm:
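    """Lazily import rellm, raising a helpful error if it is not installed."""
    try:
        import rellm
    except ImportError as e:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(
            "Could not import rellm python package. "
            "Please install it with `pip install rellm`."
        ) from e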
    return rellm


class RELLM(HuggingFacePipeline):
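    """RELLM wrapped LLM using HuggingFace Pipeline API.

    RELLM generates tokens one at a time and, at each step, masks tokens
    that do not conform to the provided partial regular expression.
    """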

    regex: RegexPattern = Field(..., description="The structured format to complete.")
    max_new_tokens: int = Field(
        default=200, description="Maximum number of new tokens to generate."
    )

    # TODO: move away from `root_validator` since it is deprecated in pydantic v2
    #       and causes mypy type-checking failures (hence the `type: ignore`)
    @root_validator  # type: ignore[call-overload]
    def check_rellm_installation(cls, values: dict) -> dict:
        """Fail at construction time if `rellm` is not installed."""
        import_rellm()
        return values

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        rellm = import_rellm()
        from transformers import Text2TextGenerationPipeline

        pipeline = cast(Text2TextGenerationPipeline, self.pipeline)

        text = rellm.complete_re(
            prompt,
            self.regex,
            tokenizer=pipeline.tokenizer,
            model=pipeline.model,
            max_new_tokens=self.max_new_tokens,
        )
        if stop is not None:
            # The stop sequences are not passed to `rellm.complete_re`, so the
            # completed text is truncated at the first stop sequence afterwards.
            text = enforce_stop_tokens(text, stop)
        return text
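
The post-hoc stop handling at the end of `_call` can be sketched in isolation. The helper below, `truncate_at_stop`, is a hypothetical name used only for illustration; it is not `enforce_stop_tokens` from `langchain_community`, whose details (for example, whether stop strings are regex-escaped) may differ. It assumes the intended behavior is simply "cut the text at the first occurrence of any stop sequence".

```python
import re
from typing import List


def truncate_at_stop(text: str, stop: List[str]) -> str:
    """Cut `text` at the first occurrence of any stop sequence.

    Illustrative helper only (hypothetical name, not part of langchain).
    """
    # Ignore empty strings: an empty pattern would match at position 0.
    sequences = [s for s in stop if s]
    if not sequences:
        return text
    # Escape each sequence so characters such as "." or "|" match literally.
    pattern = "|".join(re.escape(s) for s in sequences)
    # Keep only the text before the earliest match.
    return re.split(pattern, text, maxsplit=1)[0]


completion = '{"name": "Ada"}\nQ: next question'
print(truncate_at_stop(completion, ["\nQ:"]))  # prints {"name": "Ada"}
```

Because truncation happens after generation, tokens past the stop sequence are still generated and count toward `max_new_tokens`; only the returned string is shortened.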