diff --git a/TRAINING_LOG.md b/TRAINING_LOG.md
index 77911812..744038cc 100644
--- a/TRAINING_LOG.md
+++ b/TRAINING_LOG.md
@@ -230,4 +230,8 @@ We additionally train a full model
 | Weight decay | 0 |
 | Warmup Steps | 100 |
 
-Taking inspiration from [the Alpaca Repo](https://github.com/tatsu-lab/stanford_alpaca), we roughly scale the learning rate by `sqrt(k)`, where `k` is the increase in batch size, where Alpaca used a batch size of 128 and learning rate of 2e-5.
\ No newline at end of file
+Taking inspiration from [the Alpaca Repo](https://github.com/tatsu-lab/stanford_alpaca), we roughly scale the learning rate by `sqrt(k)`, where `k` is the factor by which our batch size exceeds Alpaca's; Alpaca used a batch size of 128 and a learning rate of 2e-5.
+
+Comparing our LoRA to the [Alpaca LoRA](https://huggingface.co/tloen/alpaca-lora-7b), our model achieves lower perplexity. Training for 3 epochs performed best, both on perplexity and on qualitative examples.
+
+We also tried training a full model with the parameters above, but found that the model overfit during the second epoch.
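
For reference, a minimal sketch of the `sqrt(k)` learning-rate scaling rule described in the added paragraph. The Alpaca baseline values (batch size 128, learning rate 2e-5) come from the text above; the target batch size of 1024 is purely a hypothetical example.

```python
import math

# Alpaca baseline hyperparameters, as cited in the training log.
ALPACA_BATCH_SIZE = 128
ALPACA_LR = 2e-5

def scaled_lr(batch_size: int) -> float:
    """Scale the Alpaca learning rate by sqrt(k), where k is the batch-size increase factor."""
    k = batch_size / ALPACA_BATCH_SIZE
    return ALPACA_LR * math.sqrt(k)

# Hypothetical example: a batch size of 1024 gives k = 8,
# so the learning rate becomes 2e-5 * sqrt(8) ~= 5.7e-5.
print(scaled_lr(1024))
```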