36 Commits (8a98789be12598964d0eaaa02c9ce097f6d6111c)

Author SHA1 Message Date
Anton Larin 6d5b698c39 fix arc53/DocsGPT#199 1 year ago
Anton Larin dd9f1abcea fix arc53/DocsGPT#199 1 year ago
Anton Larin b4bd34fb96 fix arc53/DocsGPT#199 1 year ago
Nazih Kalo da5d62cc1c updating the bulk ingest file metadata to account for parsers that output lists 1 year ago
Anton Larin 962becb9a5
Linting
* validate python formatting on every build with Ruff
* fix lint warnings
1 year ago
Anton Larin 168648e789 Proper PEP8 formatting 1 year ago
Pavel ce8f0ef9e1
Merge pull request #168 from arc53/feature/backend-uploads
Feature/backend uploads
1 year ago
Pavel 4532b6cd8c print minus 1 year ago
Pavel 53424a5c19 Added cli commands 1 year ago
Pavel b6c02c850a token ingest 1 year ago
Pavel bac25112b7 v1 1 year ago
Alex 1d2162705d uploads backend first 1 year ago
Alex ac0224b687 mdx format 1 year ago
Alex 1f02f3b376 Update rst_parser.py 1 year ago
Alex f7d7244588 chunks rst 1 year ago
Alex 1af4ca2340
Merge pull request #129 from arc53/code-ingestion
Code_to_dict
1 year ago
Pavel 2c364d3c00 Code_to_dict
3 languages added, works well with python. Java and Js require additional reviewing
1 year ago
Alex 5c2a537393
Merge pull request #116 from arc53/code-ingestion
Code ingestion
1 year ago
Pavel 0fb28e5213 Calc + structure 1 year ago
Manan 524e0f6f01 fix | Chunk creation error when
title not the first element in HTML
1 year ago
Manan 16eb503e36 Added HTML Support. read, clean-up, filter return 1 year ago
Manan e8baa46eb6 Merge branch 'main' of https://github.com/arc53/DocsGPT into main 1 year ago
Manan d0b472ad38 Implemented html_parser: cleaning & chunk creation 1 year ago
Alex 4d1ff8238d switching between llms 1 year ago
Alex f9fe3f2f48 Merge branch 'main' into custom-llm 1 year ago
EricGao888 aeac186484 Add retry strategy to increase stability 1 year ago
冯不游 b83589a308 feat: add support for directory list
example: `python ingest.py --dir inputs1 --dir another --dir ../inputs`,
the outputs will be in `outputs/input_folder_name/`
1 year ago
冯不游 636783ca8a fix: avoid second error issue 1 year ago
冯不游 458f2a3ff3 fix: restore index back when continue process 1 year ago
Alex 046fbebf56 Enable other llm's 1 year ago
Alex e88ff885fe
Merge pull request #75 from arc53/rst-interpreters 1 year ago
Pavel b1a6ebffba Directives + Interpreted
Some additional filters for rst parsing
1 year ago
Alex 205be538a3 fix dbqa, with new chain type, also fix for doc export 1 year ago
Alex 9228005a7e chunked embedding 1 year ago
vintro 2a203aa547
Create __init__.py
otherwise running `python ingest.py` complains about `parser` not being a package
1 year ago
Alex d642782a5a move folder 1 year ago