MaCBench is a pioneering benchmark designed to evaluate the multimodal reasoning capabilities of vision-language models (VLMs) in chemistry and materials science. While VLMs show promise in perception tasks, their ability to integrate scientific knowledge across text and images remains underexplored. MaCBench addresses this gap with a diverse suite of tasks spanning data extraction, experimental understanding, and results interpretation. These tasks mirror real-world scientific workflows, as shown in Figure 1.