DeepPHY integrates six diverse and challenging environments to evaluate interactive physical reasoning in agentic VLMs. Our results indicate that even state-of-the-art models have significant ...