list indent

1 year ago · d0c329f71c
parent ce0db84a9a
commit d0c329f71c
1 changed file with 4 additions and 4 deletions
--- a/pages/applications/workplace_casestudy.en.mdx
+++ b/pages/applications/workplace_casestudy.en.mdx
@@ -10,9 +10,9 @@ The key findings of their prompt engineering approach are:
 - The impact of the prompt on eliciting the correct reasoning is massive. Simply asking the model to classify a given job results in an F1 score of 65.6, whereas the post-prompt engineering model achieves an F1 score of 91.7.
 - Attempting to force the model to stick to a template lowers performance in all cases (this behaviour disappears in early testing with GPT-4, which are posterior to the paper).
 - Many small modifications have an outsized impact on performance.
- - The tables below show the full modifications tested.
- - Properly giving instructions and repeating the key points appears to be the biggest performance driver.
- - Something as simple as giving the model a (human) name and referring to it as such increased F1 score by 0.6pts.
+  - The tables below show the full modifications tested.
+  - Properly giving instructions and repeating the key points appears to be the biggest performance driver.
+  - Something as simple as giving the model a (human) name and referring to it as such increased F1 score by 0.6pts.

 ### Prompt Modifications Tested

@@ -53,4 +53,4 @@ The key findings of their prompt engineering approach are:
 | +bothinst+mock+reit+right+info+name    | 85.7          | 96.8          | 90.9          | 79%                    |
 | +bothinst+mock+reit+right+info+name+pos| **86.9**      | **97**        | **91.7**      | 81%                    |

-**Impact of the various prompt modifications.**
+Template stickiness refers to how frequently the model answers in the desired format.
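
Review note: the definition added above can be illustrated with a minimal sketch. It assumes that "desired format" can be checked with a regular expression, and that the first three numeric columns of the table are precision, recall, and F1 (the header row is outside this hunk, so that column order is an assumption). The names `template_stickiness`, `f1_score`, and the example pattern are hypothetical and do not come from the page or the paper.

```python
import re


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def template_stickiness(responses: list[str], pattern: str) -> float:
    """Fraction of responses whose full text matches the desired format.

    `pattern` is a hypothetical regex standing in for the template;
    the actual template used in the case study is not shown in this diff.
    """
    if not responses:
        return 0.0
    matcher = re.compile(pattern)
    hits = sum(1 for r in responses if matcher.fullmatch(r.strip()))
    return hits / len(responses)


# Illustrative responses (made up for this sketch, not data from the paper).
responses = [
    "Final Answer: yes",
    "I think this is a graduate job.",
    "Final Answer: no",
    "Final Answer: yes",
]
stickiness = template_stickiness(responses, r"Final Answer: (yes|no)")

# Under the assumed column order, the last table row is self-consistent:
# precision 86.9 and recall 97 give an F1 that rounds to 91.7.
best_f1 = round(f1_score(86.9, 97.0), 1)
```

With the made-up responses, three of four match the pattern, so `stickiness` is 0.75. The F1 check only confirms that the table values agree with the standard formula under the assumed column order.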