Visual Language Model VLM

DeepMind Trains 80 Billion Parameter AI Vision-Language Model Flamingo

Communications of the ACM

The Race to Reliable Visual Understanding

“The biggest innovation over the last year is that inference-time scaling techniques that have been pioneered in natural language models have now come to visual language models,” said Eric Heim, chief ...

IT-Online

Visual prompt injection vulnerability bypasses AI guardrails

DeepKeep has discovered a new class of visual prompt injection vulnerability. Dubbed “InkJect” – a nod to the hidden “ink” within images used to inject malicious instructions – it affects leading ...

InfoQ

Physical Intelligence Unveils Robotics Foundation Model Pi-Zero

Physical Intelligence recently announced π0 (pi-zero), a general-purpose AI foundation model for robots. Pi-zero is based on a pre-trained vision-language model (VLM) and outperforms other baseline ...

Yahoo Finance

HOPPR Expands VLM Portfolio with 2D Mammography Narrative Model for Breast Imaging Workflows

The HOPPR® EB 2D Mammo Narrative Model is a Vision-Language model (VLM) that generates narrative language from 2D mammography images and is trained on more than 200,000 mammogram studies. Designed as ...

dbta

IBM Releases New Granite-Docling Model to Deliver End-to-End Document Understanding

IBM is releasing Granite-Docling-258M, an ultra-compact and cutting-edge open-source vision-language model (VLM) for converting documents to machine-readable formats while fully preserving their ...
