Nvidia's Nemotron-Cascade 2 is a 30B-parameter mixture-of-experts (MoE) model that activates only 3B parameters at inference time, yet achieved gold-medal-level performance at the 2025 IMO, IOI, and ICPC World Finals. Nvidia has ...
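The "30B total, 3B active" figure comes from sparse top-k expert routing: a small router scores the experts per token and only the k highest-scoring experts run. The sketch below is a minimal toy illustration of that mechanism in NumPy; the dimensions, the router, and the `moe_forward` function are illustrative assumptions, not Nemotron's actual architecture or configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sparse-MoE layer: each token is routed to the top-K of E experts,
# so only a fraction of the layer's parameters is touched per token.
# All sizes here are made up for illustration.
D, E, K = 16, 8, 2   # hidden dim, total experts, active experts per token

W_gate = rng.normal(size=(D, E))                        # router weights
experts = [rng.normal(size=(D, D)) for _ in range(E)]   # one linear map per expert

def moe_forward(x):
    """Route one token x (shape [D]) through its top-K experts."""
    logits = x @ W_gate
    topk = np.argsort(logits)[-K:]        # indices of the K best-scoring experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()              # softmax over the selected experts only
    # Only K expert matrices are evaluated; the other E-K stay idle,
    # which is why active parameters are a small slice of the total.
    out = sum(w * (x @ experts[i]) for w, i in zip(weights, topk))
    return out, topk

x = rng.normal(size=D)
y, used = moe_forward(x)
print(f"experts used: {sorted(used.tolist())} of {E}")
```

With K=2 of E=8 experts active, roughly a quarter of this toy layer's expert parameters participate per token, mirroring the 3B-of-30B ratio in spirit.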
Abstract: The rapid development of Large Language Models (LLMs), such as ChatGPT and DeepSeek, has revolutionized software development, particularly in the domain of automated code generation. These ...
Abstract: Out-of-distribution (OOD) detection is crucial for machine learning models deployed in open-world environments. However, existing methods often struggle with model over-confidence or rely ...
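The over-confidence problem this abstract refers to is easiest to see against the classic maximum-softmax-probability (MSP) baseline, which flags inputs whose top class probability is low. The snippet below is a minimal sketch of that baseline only, not the method proposed in the (truncated) paper; the example logits are invented for illustration.

```python
import numpy as np

def msp_score(logits):
    """Maximum softmax probability: higher score = more in-distribution."""
    z = logits - logits.max(axis=-1, keepdims=True)   # stabilize the exponentials
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return p.max(axis=-1)

# A peaked logit vector (confident prediction) scores high;
# a nearly flat one (the model is unsure) scores low.
in_dist = msp_score(np.array([[6.0, 0.5, 0.2]]))
ood     = msp_score(np.array([[1.1, 1.0, 0.9]]))
print(in_dist[0] > ood[0])
```

The baseline's weakness, and the motivation the abstract alludes to, is that over-confident models can assign peaked softmax outputs to OOD inputs too, making the score unreliable on its own.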