MultiBanana-Bench comprises 32 tasks designed to evaluate how faithfully image generation models incorporate information from multiple reference images. We report evaluation scores using ...