NVLM 1.0, the cutting-edge multimodal large language model from NVIDIA, surpasses prominent proprietary and open-access models in vision-language tasks with its open-source NVLM-1.0-D-72B decoder model, along with shared code for community use, outperforming its text-only capabilities post multimodal training.