Produce report
Create a validation set by manually selecting 100 open and 100 closed questions
Carry out prompt engineering research based on the guidelines: http://share.yellowrobot.xyz/quick/2023-9-11-F1B1EF12-EC43-40FC-B3FD-8AC486BC8196.html
Document the results
^ could Amir do this one as well?
Breakdown:
Manually find 100 open questions and 100 closed questions (answerable with yes/no or with a single word) => CSV dataset
Using the OpenAI API (GPT-4), create a hyperparameter search over different prompt structures based on the guidelines: http://share.yellowrobot.xyz/quick/2023-9-11-F1B1EF12-EC43-40FC-B3FD-8AC486BC8196.html
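A minimal sketch of how the CSV dataset could be loaded and checked for the 100/100 balance. The column names `question` and `label` and the label values `open` / `closed` are assumptions, not something fixed by these notes; the two-row sample only stands in for the real file.

```python
import csv
import io
from collections import Counter

# Hypothetical label values; adjust to whatever the real CSV uses.
LABELS = ("open", "closed")

def load_validation_set(csv_file):
    """Read rows with 'question' and 'label' columns into (question, label) tuples."""
    reader = csv.DictReader(csv_file)
    rows = []
    for row in reader:
        label = row["label"].strip().lower()
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        rows.append((row["question"].strip(), label))
    return rows

def label_counts(rows):
    """Count how many questions carry each label (target: 100 of each)."""
    return Counter(label for _, label in rows)

# Tiny in-memory sample standing in for the real 200-row file.
sample = io.StringIO(
    "question,label\n"
    "Is the sky blue?,closed\n"
    "Why is the sky blue?,open\n"
)
rows = load_validation_set(sample)
print(label_counts(rows))
```

For the real file, pass `open("validation.csv", newline="", encoding="utf-8")` (file name hypothetical) instead of the in-memory sample.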
Python examples:
system_prompt = "Given the information in quotes, execute the task below it"
human_prompt = """ “<question>” \n\n
Classify the text above as “Open Question” or “Closed Question” (a question that can be answered with “Yes” or “No” or with a single word): """
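import openai  # openai.ChatCompletion is the pre-1.0 interface of the openai package

# Request response_count completions for one question at the given temperature.
response = openai.ChatCompletion.create(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": human_prompt},
    ],
    model="gpt-4",
    n=response_count,
    temperature=temperature,
)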
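The model replies in free text, so each completion has to be mapped back to a label before it can be scored. A rough sketch, assuming the reply text sits under `choices[i]["message"]["content"]` as in the call above; `parse_label`, `labels_from_response`, and the plain dict standing in for a real response are hypothetical helpers for illustration.

```python
def parse_label(text):
    """Map a free-text model reply to 'open', 'closed', or None if unclear."""
    lowered = text.lower()
    has_open = "open question" in lowered
    has_closed = "closed question" in lowered
    if has_open == has_closed:  # neither label, or both, appears in the reply
        return None
    return "open" if has_open else "closed"

def labels_from_response(response):
    """Parse every completion (there are n of them) in one response."""
    return [parse_label(choice["message"]["content"])
            for choice in response["choices"]]

# A plain dict mimicking the response layout, so no API call is needed here.
fake_response = {"choices": [
    {"message": {"content": "Closed Question"}},
    {"message": {"content": "This is an Open Question."}},
    {"message": {"content": "I am not sure."}},
]}
print(labels_from_response(fake_response))  # ['closed', 'open', None]
```

Returning `None` for unclear replies keeps them visible, so they can be counted as errors rather than silently assigned to one class.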
Example:
“
Run this on the validation dataset -> calculate accuracy and F1
Also test the hyperparameters:
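The two metrics can be computed without extra dependencies. This sketch assumes labels are the strings `open` / `closed` and treats `closed` as the positive class for F1; these notes do not say which class or averaging to use, so that choice is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def f1_score(y_true, y_pred, positive="closed"):
    """Binary F1 for the chosen positive class: 2TP / (2TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

y_true = ["open", "open", "closed", "closed"]
y_pred = ["open", "closed", "closed", "closed"]
print(accuracy(y_true, y_pred))  # 0.75
print(f1_score(y_true, y_pred))  # 0.8
```

With a balanced 100/100 set, accuracy is already informative; F1 mainly shows whether errors are concentrated in one class.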
temperature=0, n=1
temperature=0.3, n=3 => take the majority vote
temperature=0.5, n=3 => take the majority vote
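The voting step for the n=3 settings could look like the sketch below. How ties and unparseable replies should be handled is not specified here; returning `None` in both cases is an assumption, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter

# (temperature, n) pairs listed above.
SAMPLING_SETTINGS = [(0.0, 1), (0.3, 3), (0.5, 3)]

def majority_vote(labels):
    """Return the most frequent label, or None on a tie or when nothing was parsed."""
    votes = Counter(label for label in labels if label is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two top labels
    return ranked[0][0]

print(majority_vote(["open", "closed", "open"]))  # open
print(majority_vote(["closed"]))                  # closed (the n=1 case)
print(majority_vote(["open", "closed", None]))    # None (tie)
```

With n=1 the vote reduces to the single reply, so the same function covers all three settings.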
Test the effect of system_prompt on the results: at least 10 different system_prompt values
Test at least 50 prompt combinations
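One way to reach the 50 combinations is a plain grid over system prompts and human prompt templates, with each prompt pair then run under the sampling settings. The counts (10 x 5) and the placeholder strings are assumptions for illustration; the real prompt texts would come from the guidelines.

```python
from itertools import product

# Placeholders only; real prompt texts are to be written from the guidelines.
system_prompts = [f"<system prompt {i}>" for i in range(1, 11)]    # 10 variants
human_templates = [f"<human template {i}>" for i in range(1, 6)]   # 5 variants
sampling_settings = [(0.0, 1), (0.3, 3), (0.5, 3)]                 # (temperature, n)

# 10 x 5 = 50 prompt combinations.
prompt_combinations = list(product(system_prompts, human_templates))

# Every prompt combination under every sampling setting: 50 x 3 = 150 runs.
configs = [
    {"system_prompt": s, "human_template": h, "temperature": t, "n": n}
    for (s, h), (t, n) in product(prompt_combinations, sampling_settings)
]

print(len(prompt_combinations), len(configs))  # 50 150
```

Each run covers all 200 validation questions, so the full grid is roughly 150 x 200 API calls; a smaller first pass (for example temperature=0 only) may be worth considering before running everything.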
Produce a report and identify the best prompt configuration
Mine the whole dataset using this configuration (run it only on questions)