mirror of
https://github.com/hwchase17/langchain
synced 2024-11-18 09:25:54 +00:00
4b7969efc5
**Description**: New document loader for Visio files (extension .vsdx).

A [Visio file](https://fr.wikipedia.org/wiki/Microsoft_Visio) (extension .vsdx) is a file produced by Microsoft Visio, a diagramming application. It stores the structure, layout, and graphical elements of a diagram. The format is used to create and share visualizations in fields such as business, engineering, and computer science.

A Visio file can contain multiple pages. Some pages may serve as the background for others, and backgrounds can be stacked across multiple layers. This loader extracts the textual content of each page and of the pages associated with it (such as its backgrounds), so that all text visible on a page is returned, much as an OCR pass over the rendered page would.

**Dependencies**: xmltodict package
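The extraction described above can be illustrated with a minimal sketch. This is not the loader's actual code: it uses the standard library's `xml.etree.ElementTree` instead of xmltodict so that it runs without extra packages. The function names `page_texts` and `visible_text` are hypothetical, as is the `background_of` mapping, which the caller must supply (in a real file it would have to be derived from the package's page index). The sketch assumes that a .vsdx file is a zip package whose page contents live in parts named `visio/pages/page<N>.xml`, with shape text held in `Text` elements under the Visio 2012 namespace.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# Assumption: namespace used by the page XML parts of a .vsdx package.
VISIO_NS = "{http://schemas.microsoft.com/office/visio/2012/main}"
# Assumption: page contents are stored as visio/pages/page<N>.xml.
PAGE_PART = re.compile(r"visio/pages/page\d+\.xml")


def page_texts(vsdx_bytes: bytes) -> dict:
    """Map each page part name to the list of non-empty shape texts it holds."""
    texts = {}
    with zipfile.ZipFile(io.BytesIO(vsdx_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not PAGE_PART.fullmatch(name):
                continue  # skips the page index and relationship parts
            root = ET.fromstring(archive.read(name))
            found = []
            for node in root.iter(f"{VISIO_NS}Text"):
                # itertext() joins text split around inline formatting tags.
                text = "".join(node.itertext()).strip()
                if text:
                    found.append(text)
            texts[name] = found
    return texts


def visible_text(page: str, texts: dict, background_of: dict) -> list:
    """Collect a page's text, then the text of each background behind it.

    `background_of` is a hypothetical mapping from a page part name to the
    part name of its background page; pages without a background are absent.
    """
    collected = []
    seen = set()
    current = page
    # Follow the chain of backgrounds, guarding against accidental cycles.
    while current is not None and current not in seen:
        seen.add(current)
        collected.extend(texts.get(current, []))
        current = background_of.get(current)
    return collected
```

Calling `page_texts` on the bytes of a .vsdx file gives the per-page text, and `visible_text` then merges a foreground page with its chain of backgrounds, which mirrors the "all visible text from each page" behavior the description refers to.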