langchain/libs/experimental/langchain_experimental
Jordy Jackson Antunes da Rocha a50eabbd48
experimental: LLMGraphTransformer add missing conditional adding restrictions to prompts for LLM that do not support function calling (#22793)
- Description: Modified the prompt created by the function
`create_unstructured_prompt` (which is called for LLMs that do not
support function calling) by adding conditional checks that verify
whether restrictions on entity types and `rel_types` should be added to
the prompt. If the user provides a sufficiently large text, the current
prompt **may** fail to produce results with some LLMs. I first saw this
issue when I implemented a custom LLM class that did not support
function calling and used Gemini 1.5 Pro, but I was able to replicate
it using OpenAI models.
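
As a rough illustration of the kind of conditional check described above, the
sketch below adds a restriction line to a prompt only when the caller supplied
allowed types. The names `build_restriction_lines` and `build_prompt`, and the
prompt wording, are hypothetical and are not the actual code of
`create_unstructured_prompt`.

```python
from typing import List, Optional


def build_restriction_lines(
    node_labels: Optional[List[str]] = None,
    rel_types: Optional[List[str]] = None,
) -> List[str]:
    """Return prompt lines that restrict types, only for lists that were given.

    Hypothetical helper for illustration; not part of langchain_experimental.
    """
    lines: List[str] = []
    if node_labels:  # None or an empty list means "no restriction"
        lines.append(
            "The entity types must be one of: " + ", ".join(node_labels) + "."
        )
    if rel_types:
        lines.append(
            "The relation types must be one of: " + ", ".join(rel_types) + "."
        )
    return lines


def build_prompt(
    text: str,
    node_labels: Optional[List[str]] = None,
    rel_types: Optional[List[str]] = None,
) -> str:
    """Assemble an extraction prompt, skipping restrictions that do not apply."""
    parts = ["Extract entities and relations from the text below as JSON."]
    parts.extend(build_restriction_lines(node_labels, rel_types))
    parts.append("Text: " + text)
    return "\n".join(parts)
```

With no allowed types, the prompt contains no restriction sentence at all,
rather than a sentence listing an empty set of types.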

By loading a sufficiently large text
```python
from langchain_openai import ChatOpenAI, OpenAI
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_core.documents import Document

with open("texto-longo.txt", "r", encoding="utf-8") as file:
    full_text = file.read()

# cropped to fit GPT 3.5 context window
partial_text = full_text[:4000]

documents = [Document(page_content=partial_text)]
```

And using the chat class (that has function calling)
```python
chat_openai = ChatOpenAI(model="gpt-3.5-turbo", model_kwargs={"seed": 42})
chat_gpt35_transformer = LLMGraphTransformer(llm=chat_openai)
graph_from_chat_gpt35 = chat_gpt35_transformer.convert_to_graph_documents(documents)
```
It works:
```
>>> print(graph_from_chat_gpt35[0].nodes)
[Node(id="Jesu, Joy of Man's Desiring", type='Music'), Node(id='Godel', type='Person'), Node(id='Johann Sebastian Bach', type='Person'), Node(id='clever way of encoding the complicated expressions as numbers', type='Concept')]
```

But if you try to use the non-chat LLM class (that does not support
function calling)
```python
openai = OpenAI(
    model="gpt-3.5-turbo-instruct",
    max_tokens=1000,
)
gpt35_transformer = LLMGraphTransformer(llm=openai)
graph_from_gpt35 = gpt35_transformer.convert_to_graph_documents(documents)
```

It uses the prompt that has issues and sometimes does not produce any
result
```
>>> print(graph_from_gpt35[0].nodes)
[]
```

After implementing the changes, I was able to use both classes more
consistently:

```python
>>> chat_gpt35_transformer = LLMGraphTransformer(llm=chat_openai)
>>> graph_from_chat_gpt35 = chat_gpt35_transformer.convert_to_graph_documents(documents)
>>> print(graph_from_chat_gpt35[0].nodes)
[Node(id="Jesu, Joy Of Man'S Desiring", type='Music'), Node(id='Johann Sebastian Bach', type='Person'), Node(id='Godel', type='Person')]
>>> gpt35_transformer = LLMGraphTransformer(llm=openai)
>>> graph_from_gpt35 = gpt35_transformer.convert_to_graph_documents(documents)
>>> print(graph_from_gpt35[0].nodes)
[Node(id='I', type='Pronoun'), Node(id="JESU, JOY OF MAN'S DESIRING", type='Song'), Node(id='larger memory', type='Memory'), Node(id='this nice tree structure', type='Structure'), Node(id='how you can do it all with the numbers', type='Process'), Node(id='JOHANN SEBASTIAN BACH', type='Composer'), Node(id='type of structure', type='Characteristic'), Node(id='that', type='Pronoun'), Node(id='we', type='Pronoun'), Node(id='worry', type='Verb')]
```

The results are still somewhat inconsistent because the GPT 3.5 model
may produce incomplete JSON due to the token limit, but that could be
solved (or mitigated) by checking for complete JSON when parsing it.
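
One possible way to do that check is sketched below. It is only an
illustration: it assumes the model returns a JSON array of objects, and the
function name `parse_complete_objects` is hypothetical. It keeps every fully
formed element and drops a tail that was cut off by the token limit.

```python
import json
from typing import Any, List


def parse_complete_objects(raw: str) -> List[Any]:
    """Return the fully formed elements of a possibly truncated JSON array.

    Hypothetical helper; assumes the output is a JSON array of objects.
    """
    decoder = json.JSONDecoder()
    items: List[Any] = []
    pos = raw.find("[")
    if pos == -1:
        return items  # no array found at all
    pos += 1
    while pos < len(raw):
        # raw_decode does not skip leading whitespace or commas, so do it here
        while pos < len(raw) and raw[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(raw) or raw[pos] == "]":
            break
        try:
            value, pos = decoder.raw_decode(raw, pos)
        except json.JSONDecodeError:
            break  # the remaining text is an incomplete element
        items.append(value)
    return items
```

For example, given `'[{"head": "Bach", "tail": "Jesu"}, {"head": "Godel", "ta'`,
the first object is kept and the truncated second one is discarded.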
3 months ago

| Name | Last commit | Updated |
|---|---|---|
| agents | experimental[patch]/docs[patch]: Update links to security docs (#22864) | 3 months ago |
| autonomous_agents | experimental[patch]: return from HuggingGPT task executor task.run() exception (#20219) | 5 months ago |
| chat_models | infra: rm unused # noqa violations (#22049) | 4 months ago |
| comprehend_moderation | langchain: `callbacks` imports fix (#20348) | 5 months ago |
| cpal | Fix: lint errors and update Field alias in models.py and AutoSelectionScorer initialization (#22846) | 3 months ago |
| data_anonymizer | experimental[patch]: update module doc strings (#19539) | 6 months ago |
| fallacy_removal | experimental[patch]: `prompts` import fix (#20534) | 5 months ago |
| generative_agents | patch: deprecate (a)get_relevant_documents (#20477) | 5 months ago |
| graph_transformers | experimental: LLMGraphTransformer add missing conditional adding restrictions to prompts for LLM that do not support function calling (#22793) | 3 months ago |
| llm_bash | infra: rm unused # noqa violations (#22049) | 4 months ago |
| llm_symbolic_math | experimental[patch]: `prompts` import fix (#20534) | 5 months ago |
| llms | core[minor]: BaseChatModel with_structured_output implementation (#22859) | 3 months ago |
| open_clip | experimental[patch]: update module doc strings (#19539) | 6 months ago |
| openai_assistant | | |
| pal_chain | community[major], experimental[patch]: Remove Python REPL from community (#22904) | 3 months ago |
| plan_and_execute | experimental[patch]: `prompts` import fix (#20534) | 5 months ago |
| prompt_injection_identifier | experimental[minor]: upgrade the prompt injection model (#20783) | 5 months ago |
| prompts | experimental[patch]: `prompts` import fix (#20534) | 5 months ago |
| pydantic_v1 | | |
| recommenders | infra: rm unused # noqa violations (#22049) | 4 months ago |
| retrievers | langchain: `callbacks` imports fix (#20348) | 5 months ago |
| rl_chain | infra: rm unused # noqa violations (#22049) | 4 months ago |
| smart_llm | experimental[patch]: `prompts` import fix (#20534) | 5 months ago |
| sql | experimental[patch], docs: refine notebook for MyScale `SelfQueryRetriever` (#22016) | 4 months ago |
| synthetic_data | experimental[patch]: `prompts` import fix (#20534) | 5 months ago |
| tabular_synthetic_data | experimental[patch]: `prompts` import fix (#20534) | 5 months ago |
| tools | langchain: `callbacks` imports fix (#20348) | 5 months ago |
| tot | experimental[patch]: `prompts` import fix (#20534) | 5 months ago |
| utilities | experimental: clean python repl input(experimental:Added code for PythonREPL) (#20930) | 5 months ago |
| video_captioning | langchain: `callbacks` imports fix (#20348) | 5 months ago |
| __init__.py | | |
| py.typed | | |
| text_splitter.py | SemanticChunker : Feature Addition ("Semantic Splitting with gradient") (#22895) | 3 months ago |