Mirror of https://github.com/hwchase17/langchain (synced 2024-11-13 19:10:52 +00:00)
c1d8c33df6
**Description:** This PR fixes an issue where non-ASCII characters in Pydantic field descriptions were escaped to their Unicode representations when using `JsonOutputParser`. The change preserves non-ASCII characters in the output, which is especially important for multilingual support and when working with non-English languages.

**Issue:** Fixes #27256

**Example Code:**

```python
from pydantic import BaseModel, Field
from langchain_core.output_parsers import JsonOutputParser

class Article(BaseModel):
    title: str = Field(description="科学文章的标题")

output_data_structure = Article
parser = JsonOutputParser(pydantic_object=output_data_structure)
print(parser.get_format_instructions())
```

**Previous Output**:

```
... "title": {"description": "\\u79d1\\u5b66\\u6587\\u7ae0\\u7684\\u6807\\u9898", "title": "Title", "type": "string"}} ...
```

**Current Output**:

```
... "title": {"description": "科学文章的标题", "title": "Title", "type": "string"}} ...
```

**Changes made**:

- Modified the `json.dumps()` call in `langchain_core/output_parsers/json.py` to use `ensure_ascii=False`
- Added a unit test to verify Unicode handling

Co-authored-by: Harsimran-19 <harsimran1869@gmail.com>
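The effect of the flag can be reproduced with the standard library alone. The following is a minimal sketch, not the code from `langchain_core/output_parsers/json.py`: the `schema` dict is a hand-written stand-in (an assumption) for the schema the parser derives from the Pydantic model, and the variable names are illustrative.

```python
import json

# Hand-written stand-in for the schema the parser would generate (assumption).
schema = {
    "title": {
        "description": "科学文章的标题",
        "title": "Title",
        "type": "string",
    }
}

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(schema)

# With ensure_ascii=False the characters are written through unchanged.
preserved = json.dumps(schema, ensure_ascii=False)

print(escaped)
print(preserved)

# Both strings decode to the same data; only the textual form differs.
print(json.loads(escaped) == json.loads(preserved))
```

Both forms are valid JSON and parse to identical data, so the flag only changes how the text reads to a person or a model looking at the format instructions.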
- cli
- community
- core
- experimental
- langchain
- partners
- standard-tests
- text-splitters