13 Commits (cc53a48d1e3188f3973d85ea2645a2835be01c16)

Author | SHA1 | Message | Date
Zach Nussbaum | d7395ee37a | Merge: main into gptj | 1 year ago
Zach Nussbaum | b1e361882d | fix: multi-turn data breaks | 1 year ago
Zach Nussbaum | c0a9065032 | fix: tokenization error | 1 year ago
Zach | 573272ad69 | fix: drop uneven batch size | 1 year ago
Zach | 57eb786756 | fix: data for inference | 1 year ago
Zach | e4e88dff33 | fix: data processing | 1 year ago
Zach | 8dd99cc00a | fix: prompt len for larger | 1 year ago
Zach Nussbaum | c68311810a | fix: clean up data, pad at end | 1 year ago
Zach Nussbaum | 668c71dc90 | Update data.py | 2 years ago
Zach Nussbaum | 1a95f68494 | fix: just read from watermark file | 2 years ago
Zach Nussbaum | bb28929305 | fix: eos conditional, watermark | 2 years ago
Zach Nussbaum | eac7734cbf | fix: add eos | 2 years ago
Zach Nussbaum | 723a50bdf1 | feat: train and clean data | 2 years ago