2024-07-11 Meeting 51

 

 

LLM Reward, Evaluation, Ranking model

 

Direction:

List of studies that use prompt-based answer evaluation (chain of thought, self-reflection) vs. sampling + reward model
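The sampling + reward model side of the comparison can be sketched as best-of-n selection: sample several candidate answers, score each with a reward model, and keep the best. A minimal toy sketch, where `generate` and `reward` are stand-in stubs rather than real models:

```python
# Toy sketch of the sampling + reward model approach: generate n candidate
# answers, score each with a reward model, return the highest-scoring one.
# `generate` and `reward` are illustrative stubs, not real LLM/RM calls.

def generate(prompt: str, n: int) -> list[str]:
    """Stub generator: pretend the LLM sampled n candidate answers."""
    return [f"{prompt} -> candidate {i}" for i in range(n)]

def reward(prompt: str, answer: str) -> float:
    """Stub reward model: here it simply favors answers containing '3'."""
    return float(answer.count("3"))

def best_of_n(prompt: str, n: int = 4) -> str:
    """Sample n candidates and keep the one with the highest reward."""
    candidates = generate(prompt, n)
    return max(candidates, key=lambda a: reward(prompt, a))

print(best_of_n("2+2?"))
```

The prompt-based alternative (chain of thought, self-reflection) would instead ask the same LLM to critique and refine a single answer, with no separate scoring model.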

 

TODO:

  1. Compile prompt-based refinement-of-responses approaches so we can concretely compare them with our method

  2. Compile datasets and results

 

Try to find another model from the data:

https://huggingface.co/datasets?other=human-feedback

https://huggingface.co/datasets/Anthropic/hh-rlhf

https://huggingface.co/datasets/Dahoas/full-hh-rlhf

https://huggingface.co/datasets/nvidia/HelpSteer2

 

 

LLM to SQL or Prolog

 

TODO:

Find specific studies and datasets that compare performance on these tasks, so we can test our own models (datasets are also needed)

 

LLM to Prolog

https://arxiv.org/pdf/2405.17893

https://huggingface.co/datasets/Thomas-X-Yang/gsm8k-prolog?row=6

LLM to SQL https://github.com/defog-ai/sqlcoder/

What could improve the result? Named-entity recognition:

https://huggingface.co/dslim/bert-base-NER
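One way NER could help LLM-to-SQL is schema linking: extract named entities from the question and surface them in the prompt as hints. A hedged sketch of that pattern; `extract_entities` is a stub (in practice a model such as dslim/bert-base-NER, via transformers' `pipeline("ner", ...)`, would fill that role):

```python
# Hedged sketch: feed NER hits into an LLM-to-SQL prompt as schema-linking
# hints. `extract_entities` is a stub standing in for a real NER model.

def extract_entities(question: str) -> list[tuple[str, str]]:
    """Stub NER: tag capitalized tokens as named entities."""
    entities = []
    for token in question.split():
        word = token.strip("?,.")
        if word and word[0].isupper():
            entities.append((word, "ENTITY"))
    return entities

def build_sql_prompt(question: str, schema: str) -> str:
    """Assemble a text-to-SQL prompt that lists the detected entities."""
    hints = ", ".join(f"{text} ({label})" for text, label in extract_entities(question))
    return (
        f"Schema: {schema}\n"
        f"Detected entities: {hints}\n"
        f"Question: {question}\n"
        "SQL:"
    )

print(build_sql_prompt("how many orders did Alice place in Berlin?",
                       "orders(id, customer, city)"))
```

The idea is that explicitly naming entities makes it easier for the SQL generator to map them to the right columns and literal values.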

 

Notes

Llama Guard is an LLM-based safeguard model that classifies safety risks in LLM prompts and responses. It demonstrates strong performance on existing benchmarks such as the OpenAI Moderation Evaluation dataset and ToxicChat, matching or exceeding current content moderation tools.
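The usage pattern is: the safeguard model reads a safety policy plus the conversation and emits a verdict like "safe" or "unsafe" followed by violated category codes. An illustrative sketch of that pattern only; `moderate` is a keyword-based stub, not the real model (which would be loaded as an ordinary causal LM via transformers):

```python
# Illustrative sketch of the Llama Guard usage pattern: policy + conversation
# in, "safe"/"unsafe" (plus category codes) out. `moderate` is a stub that
# flags a keyword instead of running the actual safeguard model.

POLICY = "O1: Violence. O2: Hate speech."  # toy policy categories

def moderate(prompt: str, response: str) -> str:
    """Stub safeguard model: returns 'safe' or 'unsafe' + category line."""
    flagged = "attack" in (prompt + " " + response).lower()
    return "unsafe\nO1" if flagged else "safe"

def is_safe(prompt: str, response: str) -> bool:
    """Parse the verdict line of the moderation output."""
    return moderate(prompt, response).splitlines()[0] == "safe"

print(is_safe("How do I bake bread?", "Mix flour and water."))
```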

 

 

https://www.semanticscholar.org/paper/EvaluLLM%3A-LLM-assisted-evaluation-of-generative-Desmond-Ashktorab/1869b63cbb1670938fa21670021e405d5dd40a48

https://dl.acm.org/doi/abs/10.1145/3640544.3645216

 

 

https://github.com/gabrielmittag/NISQA

 

https://www.youtube.com/@YannicKilcher

 

 

Anthropic's HH-RLHF

https://github.com/RLHFlow/RLHF-Reward-Modeling

 

 

 

RRHF (Rank Responses to Align Language Models with Human Feedback)
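The core of RRHF is a pairwise ranking loss: for every pair of sampled responses where the reward model prefers one over the other, the policy is penalized if its length-normalized log-probability score orders them the other way. A minimal sketch with toy numbers standing in for real model scores:

```python
# Hedged sketch of the RRHF ranking loss: for each response pair (i, j)
# where the reward prefers j, penalize the policy when its log-prob score
# for i exceeds the one for j. Inputs are toy numbers, not model outputs.

def rrhf_rank_loss(log_prob_scores: list[float], rewards: list[float]) -> float:
    """Sum of hinge penalties over all mis-ordered response pairs."""
    loss = 0.0
    n = len(rewards)
    for i in range(n):
        for j in range(n):
            if rewards[i] < rewards[j]:  # reward model prefers response j
                loss += max(0.0, log_prob_scores[i] - log_prob_scores[j])
    return loss

# Policy already prefers the higher-reward response -> zero loss.
print(rrhf_rank_loss([-2.0, -1.0], [0.1, 0.9]))  # 0.0
# Policy prefers the worse response -> positive loss.
print(rrhf_rank_loss([-1.0, -2.0], [0.1, 0.9]))  # 1.0
```

In the paper this term is combined with a standard cross-entropy loss on the best-ranked response; the sketch shows only the ranking part.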

 

https://arxiv.org/html/2312.07592v1

 

 

RAIN (Rewindable Auto-regressive INference) is another inference method that lets pre-trained LLMs self-evaluate their own generations and use that evaluation to guide generation rewinding for improved AI safety.
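The rewind idea can be sketched as a loop: propose the next step, let the model score it, and if the self-evaluation falls below a threshold, discard (rewind) and retry. A toy sketch where `propose` and `self_evaluate` are stubs standing in for the real LLM calls:

```python
# Toy sketch of rewindable generation: propose a step, self-evaluate it,
# rewind and retry while the score is below a threshold. Both helpers are
# stubs; in RAIN itself the same LLM does the proposing and the scoring.

import itertools

def propose(prefix: list[str], attempt: int) -> str:
    """Stub generator: each retry proposes a different token."""
    return f"token{attempt}"

def self_evaluate(prefix: list[str], candidate: str) -> float:
    """Stub self-evaluation: score grows with the attempt number."""
    return float(candidate[-1]) / 10.0

def rewindable_generate(steps: int, threshold: float = 0.2) -> list[str]:
    out: list[str] = []
    for _ in range(steps):
        for attempt in itertools.count():
            candidate = propose(out, attempt)
            if self_evaluate(out, candidate) >= threshold:
                out.append(candidate)  # accept the step
                break                  # otherwise: rewind and retry
    return out

print(rewindable_generate(2))
```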

 

ToRA (Gou et al., 2023) or RFT (Yuan et al., 2023)

 

 
