Commit Graph

52 Commits (ed081235503d881f67ac3edc8ab9aea953e39aca)

Author SHA1 Message Date
Anton Larin e8099c4db5 script folder related changes:
* optimize content of requirements.txt
* upgrade libs
* fix imports
5 months ago
Exterminator11 a7f5303eaf Cleaned up the code 9 months ago
Exterminator11 36b243e9d2 Formatted all the changed files 9 months ago
Exterminator11 bd70e00f08 Added tests and updated openapi3_parser.py 9 months ago
Exterminator11 ddd938fd64 Parser for OpenAPI3(Swagger) 9 months ago
John Bampton 32ea0213f7 Remove unneeded duplicate words 9 months ago
John Bampton 2c6ab18e41 Fix spelling 9 months ago
Anton Larin ecfbc7b9fd count coverage 11 months ago
Alex 33dce10bc3
Merge pull request #296 from larinam/revert_breaking_renaming_azure_change
Revert "Changed environment variable names OPENAI_API_BASE and OPENAI…
11 months ago
Anton Larin ac5ac3e9f1 Revert "Changed environment variable names OPENAI_API_BASE and OPENAI_API_VERSION to AZURE_OPENAI_API_BASE and AZURE_OPENAI_API_VERSION"
This reverts commit ce8b29e9d0.
11 months ago
Anton Larin bed25b317c Fix min_tokens logic for grouping documents: documents with (length >= min_tokens) should not be grouped into one document for indexing 11 months ago
Alex a64a30c088 fix 12 months ago
Alex dac76a867f fix tokens for header 12 months ago
Rik Schoonbeek ce8b29e9d0 Changed environment variable names OPENAI_API_BASE and OPENAI_API_VERSION to AZURE_OPENAI_API_BASE and AZURE_OPENAI_API_VERSION 12 months ago
Idan 897b4ef2cd Fixed a bug with reading md files 1 year ago
Anton Larin 84168e22d0 add missing variable after testing and minor fixes. 1 year ago
Anton Larin 6d5b698c39 fix arc53/DocsGPT#199 1 year ago
Anton Larin dd9f1abcea fix arc53/DocsGPT#199 1 year ago
Anton Larin b4bd34fb96 fix arc53/DocsGPT#199 1 year ago
Nazih Kalo da5d62cc1c updating the bulk ingest file metadata to account for parsers that output lists 1 year ago
Anton Larin 962becb9a5
Linting
* validate python formatting on every build with Ruff
* fix lint warnings
1 year ago
Anton Larin 168648e789 Proper PEP8 formatting 1 year ago
Pavel ce8f0ef9e1
Merge pull request #168 from arc53/feature/backend-uploads
Feature/backend uploads
1 year ago
Pavel 4532b6cd8c print minus 1 year ago
Pavel 53424a5c19 Added cli commands 1 year ago
Pavel b6c02c850a token ingest 1 year ago
Pavel bac25112b7 v1 1 year ago
Alex 1d2162705d uploads backend first 1 year ago
Alex ac0224b687 mdx format 1 year ago
Alex 1f02f3b376 Update rst_parser.py 1 year ago
Alex f7d7244588 chunks rst 1 year ago
Alex 1af4ca2340
Merge pull request #129 from arc53/code-ingestion
Code_to_dict
1 year ago
Pavel 2c364d3c00 Code_to_dict
3 languages added; works well with Python. Java and JS require additional review
1 year ago
Alex 5c2a537393
Merge pull request #116 from arc53/code-ingestion
Code ingestion
1 year ago
Pavel 0fb28e5213 Calc + structure 1 year ago
Manan 524e0f6f01 fix | Chunk creation error when
title not the first element in HTML
1 year ago
Manan 16eb503e36 Added HTML Support. read, clean-up, filter return 1 year ago
Manan e8baa46eb6 Merge branch 'main' of https://github.com/arc53/DocsGPT into main 1 year ago
Manan d0b472ad38 Implemented html_parser: cleaning & chunk creation 1 year ago
Alex 4d1ff8238d switching between llms 1 year ago
Alex f9fe3f2f48 Merge branch 'main' into custom-llm 1 year ago
EricGao888 aeac186484 Add retry strategy to increase stability 1 year ago
冯不游 b83589a308 feat: add support for directory list
example: `python ingest.py --dir inputs1 --dir another --dir ../inputs`,
the outputs will be in `outputs/input_folder_name/`
1 year ago
冯不游 636783ca8a fix: avoid second error issue 1 year ago
冯不游 458f2a3ff3 fix: restore index back when continue process 1 year ago
Alex 046fbebf56 Enable other llm's 1 year ago
Alex e88ff885fe
Merge pull request #75 from arc53/rst-interpreters 1 year ago
Pavel b1a6ebffba Directives + Interpreted
Some additional filters for rst parsing
1 year ago
Alex 205be538a3 fix dbqa, with new chain type, also fix for doc export 1 year ago
Alex 9228005a7e chunked embedding 1 year ago
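
Note on commit b83589a308 (directory list support): the message says `--dir` can be given several times and that each input folder gets its own folder under `outputs/`. The following is only a minimal sketch of how such a repeatable flag could be handled, assuming Python's standard `argparse`; the real `ingest.py` may use a different CLI library, and the names `build_parser`, `output_dir_for`, and the `inputs` fallback are hypothetical, introduced here for illustration.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --dir flag may be repeated (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="ingest.py")
    parser.add_argument(
        "--dir",
        dest="dirs",
        action="append",  # each occurrence of --dir appends one value
        default=None,     # keep None so a default is not mixed into user values
        help="Input directory; may be passed more than once.",
    )
    return parser


def output_dir_for(input_dir: str, outputs_root: str = "outputs") -> Path:
    """Map an input folder to outputs/<input_folder_name> (assumed layout)."""
    # Path("../inputs").name is "inputs", so relative prefixes are dropped.
    return Path(outputs_root) / Path(input_dir).name


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Fall back to a single assumed default folder when --dir is not given.
    for directory in args.dirs or ["inputs"]:
        print(f"{directory} -> {output_dir_for(directory).as_posix()}")
```

With the invocation quoted in the commit message, `python ingest.py --dir inputs1 --dir another --dir ../inputs`, this sketch would map the three folders to `outputs/inputs1`, `outputs/another`, and `outputs/inputs`.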