Commit Graph

37 Commits (main)

Author SHA1 Message Date
ManishMadan2882 9000838aab (feat:vectors): calc, add token in db 3 weeks ago
Siddhant Rai 53e86205ad fix: added more headers from default 1 month ago
Siddhant Rai aa670efe3a fix: connection aborted in WebBaseLoader 1 month ago
Alex 8d7a134cb4 lint: ruff 2 months ago
Siddhant Rai e01071426f feat: field to pass number of posts as a parameter 3 months ago
Siddhant Rai eed1bfbe50 feat: fields to handle reddit loader + minor changes 3 months ago
Siddhant Rai 60cfea1126 feat: added reddit loader 3 months ago
Alex 4a701cb993 Merge branch 'main' into feature/remote-loads 4 months ago
Pavel 54d187a0ad Fixing ingestion metadata grouping 4 months ago
Pavel c8d8a8d0b5 Fixing ingestion metadata grouping 4 months ago
Alex 0cb3d12d94 Refactor loader classes to accept inputs directly 4 months ago
Alex 2e14dec12d Merge pull request #849 from arc53/main: Sync 4 months ago
Anton Larin 9e04b7796a application folder related changes: optimize content of requirements.txt; upgrade libs; fix imports 5 months ago
Anton Larin e8099c4db5 script folder related changes: optmize content of requirements.txt; upgrade libs; fix imports 5 months ago
Exterminator11 f3540aac0f Changed import 8 months ago
Exterminator11 889ce984a9 Made changes 8 months ago
Pavel 381a2740ee change input 8 months ago
Pavel 024674eef3 List check 8 months ago
Pavel b7d88b4c0f fix wrong link 8 months ago
Pavel 719ca63ec1 fixes 8 months ago
Pavel 2cfb416fd0 Desc loader 8 months ago
Pavel 50f07f9ef5 limit crawler 8 months ago
Pavel c517bdd2e1 Crawler + sitemap 8 months ago
Pavel 658867cb46 No crawler, no sitemap 8 months ago
Alex 8f2ad38503 tests 8 months ago
John Bampton 32ea0213f7 Remove unneeded duplicate words 8 months ago
John Bampton 2c6ab18e41 Fix spelling 9 months ago
Alex 347cfe253f elastic2 9 months ago
Alex 783e7f6939 working es 9 months ago
Anton Larin 98a97f34f5 fix packaging and imports and introduce tests with pytest. still issues with celery worker. 10 months ago
Anton Larin bed25b317c Fix min_tokens logic for grouping documents: documents with (lengh >= min_tokens) should not be grouped into one document for indexing 11 months ago
Alex a64a30c088 fix 11 months ago
Alex dac76a867f fix tokens for header 11 months ago
Anton Larin 962becb9a5 Linting: validate python formatting on every build with Ruff; fix lint warnings 1 year ago
Anton Larin 168648e789 Proper PEP8 formatting 1 year ago
Alex 8e477c9d16 update worker 1 year ago
Alex 1d2162705d uploads backend first 1 year ago