John Bampton
|
32ea0213f7
|
Remove unneeded duplicate words
|
2023-10-07 00:11:03 +10:00 |
|
John Bampton
|
2c6ab18e41
|
Fix spelling
|
2023-10-02 01:25:23 +10:00 |
|
Anton Larin
|
ecfbc7b9fd
|
count coverage
|
2023-08-16 16:35:48 +02:00 |
|
Alex
|
33dce10bc3
|
Merge pull request #296 from larinam/revert_breaking_renaming_azure_change
Revert "Changed environment variable names OPENAI_API_BASE and OPENAI…
|
2023-08-08 18:15:45 +01:00 |
|
Anton Larin
|
ac5ac3e9f1
|
Revert "Changed environment variable names OPENAI_API_BASE and OPENAI_API_VERSION to AZURE_OPENAI_API_BASE and AZURE_OPENAI_API_VERSION"
This reverts commit ce8b29e9d0.
|
2023-08-05 14:08:51 +02:00 |
|
Anton Larin
|
bed25b317c
|
Fix min_tokens logic for grouping documents: documents with (length >= min_tokens) should not be grouped into one document for indexing
|
2023-08-05 13:18:52 +02:00 |
|
Alex
|
a64a30c088
|
fix
|
2023-07-24 16:23:49 +01:00 |
|
Alex
|
dac76a867f
|
fix tokens for header
|
2023-07-24 16:14:08 +01:00 |
|
Rik Schoonbeek
|
ce8b29e9d0
|
Changed environment variable names OPENAI_API_BASE and OPENAI_API_VERSION to AZURE_OPENAI_API_BASE and AZURE_OPENAI_API_VERSION
|
2023-07-12 17:37:56 +02:00 |
|
Idan
|
897b4ef2cd
|
Fixed a bug with reading md files
|
2023-06-23 14:57:29 +03:00 |
|
Anton Larin
|
84168e22d0
|
add missing variable after testing and minor fixes.
|
2023-06-17 16:09:22 +02:00 |
|
Anton Larin
|
6d5b698c39
|
fix arc53/DocsGPT#199
|
2023-06-03 11:04:04 +02:00 |
|
Anton Larin
|
dd9f1abcea
|
fix arc53/DocsGPT#199
|
2023-06-03 11:03:44 +02:00 |
|
Anton Larin
|
b4bd34fb96
|
fix arc53/DocsGPT#199
|
2023-06-03 10:58:31 +02:00 |
|
Nazih Kalo
|
da5d62cc1c
|
updating the bulk ingest file metadata to account for parsers that output lists
|
2023-05-19 10:29:18 -07:00 |
|
Anton Larin
|
962becb9a5
|
Linting
* validate python formatting on every build with Ruff
* fix lint warnings
|
2023-05-13 10:36:17 +02:00 |
|
Anton Larin
|
168648e789
|
Proper PEP8 formatting
|
2023-05-12 12:02:25 +02:00 |
|
Pavel
|
ce8f0ef9e1
|
Merge pull request #168 from arc53/feature/backend-uploads
Feature/backend uploads
|
2023-03-14 19:09:37 +04:00 |
|
Pavel
|
4532b6cd8c
|
print minus
|
2023-03-14 17:49:57 +04:00 |
|
Pavel
|
53424a5c19
|
Added cli commands
|
2023-03-14 17:33:19 +04:00 |
|
Pavel
|
b6c02c850a
|
token ingest
|
2023-03-14 13:32:29 +04:00 |
|
Pavel
|
bac25112b7
|
v1
|
2023-03-13 19:14:33 +04:00 |
|
Alex
|
1d2162705d
|
uploads backend first
|
2023-03-13 14:20:03 +00:00 |
|
Alex
|
ac0224b687
|
mdx format
|
2023-03-08 23:16:20 +00:00 |
|
Alex
|
1f02f3b376
|
Update rst_parser.py
|
2023-03-08 11:32:44 +00:00 |
|
Alex
|
f7d7244588
|
chunks rst
|
2023-03-08 00:07:53 +00:00 |
|
Alex
|
1af4ca2340
|
Merge pull request #129 from arc53/code-ingestion
Code_to_dict
|
2023-02-25 13:52:14 +00:00 |
|
Pavel
|
2c364d3c00
|
Code_to_dict
3 languages added, works well with Python. Java and JS require additional reviewing
|
2023-02-25 17:37:33 +04:00 |
|
Alex
|
5c2a537393
|
Merge pull request #116 from arc53/code-ingestion
Code ingestion
|
2023-02-22 18:46:50 +00:00 |
|
Pavel
|
0fb28e5213
|
Calc + structure
|
2023-02-22 21:19:13 +04:00 |
|
Manan
|
524e0f6f01
|
fix | Chunk creation error when
title not the first element in HTML
|
2023-02-22 20:20:54 +05:30 |
|
Manan
|
16eb503e36
|
Added HTML Support. read, clean-up, filter return
|
2023-02-21 23:06:00 +05:30 |
|
Manan
|
e8baa46eb6
|
Merge branch 'main' of https://github.com/arc53/DocsGPT into main
|
2023-02-21 22:11:57 +05:30 |
|
Manan
|
d0b472ad38
|
Implemented html_parser: cleaning & chunk creation
|
2023-02-19 01:53:16 +05:30 |
|
Alex
|
4d1ff8238d
|
switching between llms
|
2023-02-15 18:40:23 +00:00 |
|
Alex
|
f9fe3f2f48
|
Merge branch 'main' into custom-llm
|
2023-02-15 14:42:57 +00:00 |
|
EricGao888
|
aeac186484
|
Add retry strategy to increase stability
|
2023-02-15 17:29:39 +08:00 |
|
冯不游
|
b83589a308
|
feat: add support for directory list
example: `python ingest.py --dir inputs1 --dir another --dir ../inputs`,
the outputs will be in `outputs/input_folder_name/`
|
2023-02-15 02:30:39 +08:00 |
|
冯不游
|
636783ca8a
|
fix: avoid second error issue
|
2023-02-14 22:29:17 +08:00 |
|
冯不游
|
458f2a3ff3
|
fix: restore index back when continue process
|
2023-02-14 22:05:16 +08:00 |
|
Alex
|
046fbebf56
|
Enable other llm's
|
2023-02-14 13:06:28 +00:00 |
|
Alex
|
e88ff885fe
|
Merge pull request #75 from arc53/rst-interpreters
|
2023-02-12 18:32:20 +00:00 |
|
Pavel
|
b1a6ebffba
|
Directives + Interpreted
Some additional filters for rst parsing
|
2023-02-12 22:29:40 +04:00 |
|
Alex
|
205be538a3
|
fix dbqa, with new chain type, also fix for doc export
|
2023-02-12 17:58:54 +00:00 |
|
Alex
|
9228005a7e
|
chunked embedding
|
2023-02-12 16:25:01 +00:00 |
|
vintro
|
2a203aa547
|
Create __init__.py
otherwise running `python ingest.py` complains about `parser` not being a package
|
2023-02-10 19:49:00 -05:00 |
|
Alex
|
d642782a5a
|
move folder
|
2023-02-10 16:10:53 +00:00 |
|