OpenAI has developed a new model, CriticGPT, to identify errors in code generated by ChatGPT, with the goal of making the output of large language models (LLMs) more accurate.
To improve model output, reinforcement learning from human feedback (RLHF) is usually used, in which humans evaluate the model’s output and the model is refined based on their judgments. This process is time-consuming and error-prone, especially for large models, and can let a high number of incorrect or unnecessary responses slip through.
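For readers unfamiliar with how those human judgments are collected, the Python sketch below shows the pairwise-comparison record at the core of RLHF. The names are hypothetical; OpenAI has not published its labeling code.

```python
from dataclasses import dataclass

@dataclass
class PreferenceLabel:
    prompt: str
    answer_a: str
    answer_b: str
    preferred: str  # "a" or "b", chosen by a human annotator

def record_preference(prompt: str, answer_a: str, answer_b: str,
                      choice: str) -> PreferenceLabel:
    """Store one human judgment comparing two model answers.

    In RLHF, many such pairwise preferences are collected and used to fit
    a reward model, which then steers reinforcement-learning fine-tuning.
    """
    assert choice in ("a", "b")
    return PreferenceLabel(prompt, answer_a, answer_b, choice)

label = record_preference(
    prompt="Write a function that reverses a string.",
    answer_a="def rev(s): return s[::-1]",
    answer_b="def rev(s): return reversed(s)",  # returns an iterator, not a str
    choice="a",
)
```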
OpenAI hopes to change that by building CriticGPT on GPT-4. “Reviewing ChatGPT’s code with the help of CriticGPT performs 60% better than reviewing without help,” say the creators of the new tool. CriticGPT is also said to detect hallucinations that people would not spot on their own.
The new model is trained on a dataset of code samples containing intentionally inserted bugs, paired with example feedback, which allows CriticGPT to detect not only common bugs but also rare ones.
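As an illustration of what such training data might look like, here is a minimal Python sketch. The names and the toy tampering function are assumptions, since OpenAI built its dataset by having human trainers insert bugs by hand.

```python
from dataclasses import dataclass

@dataclass
class CritiqueExample:
    original_code: str  # code believed to be correct
    tampered_code: str  # the same code with a bug inserted on purpose
    critique: str       # reference feedback pointing out the bug

def insert_off_by_one(code: str) -> str:
    """Toy 'tampering' step: turn a correct loop bound into an off-by-one.

    This string replacement only illustrates the shape of the data; the
    real bugs were written by human contractors, not generated like this.
    """
    return code.replace("range(n)", "range(n - 1)")

correct = (
    "def total(xs):\n"
    "    n = len(xs)\n"
    "    return sum(xs[i] for i in range(n))"
)
example = CritiqueExample(
    original_code=correct,
    tampered_code=insert_off_by_one(correct),
    critique="The loop stops at n - 1, so the last element is never summed.",
)
```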
Performance
To demonstrate CriticGPT’s performance, OpenAI compared the model with humans and found that it outperformed the average human code reviewer: critiques that spotted and explained errors were preferred over human-written critiques in 63 percent of cases. According to OpenAI, this is because the model was less nit-picky about the code and produced fewer false positives than the human reviewers.
OpenAI plans to integrate models like CriticGPT into the RLHF labeling pipeline to assist model trainers. Many of the results OpenAI is currently showing are still largely at the research stage.
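To make that idea concrete, the sketch below shows how a critic pass could pre-annotate code before a human trainer sees it. CriticGPT itself is not publicly available, so this example substitutes GPT-4 via OpenAI’s standard Python client, and the review prompt is an assumption rather than OpenAI’s actual setup.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def critique_code(code: str) -> str:
    """Ask a general-purpose model to act as a code critic (GPT-4 as a
    stand-in for CriticGPT, which is not exposed through the API)."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You are a code reviewer. Point out concrete bugs "
                        "in the code, quoting the relevant lines."},
            {"role": "user", "content": code},
        ],
    )
    return response.choices[0].message.content

# A human trainer would then review this critique alongside the code,
# rather than judging the raw model output unaided.
```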