80% Accuracy on a True/False Test Is Not a Ceiling
ChatGPT jumped from 76.5% to 80% accuracy at judging scientific hypotheses true or false in a single model generation. Engineers know what a trajectory like that means, even when the current score looks rough.
By Crash Davis · 3 min read