Computer Programming Code Examples

The cheat code everyone’s using

Aeshaan Kumar opens his laptop at 11 p.m., stares at a CS135 problem set, and does what most of his classmates do: he asks ChatGPT. Not for the answer, he tells himself, but for a nudge in the right ...

Communications of the ACM

Measuring What Matters in Large Language Model Performance

As large language models (LLMs) gain momentum worldwide, there’s a growing need for reliable ways to measure their performance. Benchmarks that evaluate LLM outputs allow developers to track ...
