forked from Archives/langchain
1 Commits (4e42c737f835ae202f78f3737d9a92cf8f2f0d8e)
Author | SHA1 | Message | Date |
---|---|---|---|
Eugene Yurtsev | 3c490b5ba3 | Docugami DataLoader (#4727). Adds a document loader for Docugami. Specifically: 1. Adds a data loader that talks to the [Docugami](http://docugami.com) API to download processed documents as semantic XML. 2. Parses the semantic XML into chunks, with additional metadata capturing chunk semantics. 3. Adds a detailed notebook showing how you can use additional metadata returned by Docugami for techniques like the [self-querying retriever](https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/self_query_retriever.html). 4. Adds an integration test and related documentation. Here is an example of a result that is not possible without the capabilities added by Docugami (from the notebook): [image](https://github.com/hwchase17/langchain/assets/749277/bb6c1ce3-13dc-4349-a53b-de16681fdd5b). Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>. Co-authored-by: Taqi Jaffri <tjaffri@gmail.com> | 1 year ago |
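
As a rough illustration of step 2 in the commit message above (parsing semantic XML into chunks that carry metadata about their meaning), here is a minimal sketch using only the Python standard library. It is not the code from the commit: the `Chunk` class, the `chunk_semantic_xml` function, the metadata keys, and the sample XML are all hypothetical and chosen for illustration only.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical container: a piece of text plus descriptive metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_semantic_xml(xml_text: str, source: str) -> list[Chunk]:
    """Emit one chunk per leaf element that has text.

    Assumption: the semantic meaning of a chunk is approximated by its
    element tag and its path from the root. Text that sits directly inside
    an element that also has child elements is ignored in this sketch.
    """
    root = ET.fromstring(xml_text)
    chunks: list[Chunk] = []

    def walk(node: ET.Element, path: list[str]) -> None:
        current = path + [node.tag]
        children = list(node)
        if not children:
            text = (node.text or "").strip()
            if text:
                chunks.append(
                    Chunk(
                        text=text,
                        metadata={
                            "source": source,
                            "tag": node.tag,
                            "xpath": "/" + "/".join(current),
                        },
                    )
                )
            return
        for child in children:
            walk(child, current)

    walk(root, [])
    return chunks


# Made-up sample document, not real Docugami output.
SAMPLE = (
    "<Lease><Parties><Landlord>Acme Holdings</Landlord>"
    "<Tenant>Birch LLC</Tenant></Parties><Term>Five years</Term></Lease>"
)

chunks = chunk_semantic_xml(SAMPLE, source="lease.xml")
for chunk in chunks:
    print(chunk.metadata["xpath"], "->", chunk.text)
```

Metadata of this kind (for example the `tag` key) is what a retriever could filter on, which is the idea behind the self-querying retriever mentioned in the commit message.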