The Quantum Dispatch

Articles Tagged “Multimodal AI”

5 articles found

AI

Google Gemma 4 Launches With Four Sizes, Apache 2.0 License, and a Top-3 Open Model Ranking

Google's Gemma 4 arrives with model sizes from 2B to 31B, a permissive Apache 2.0 license, native multimodal support across all sizes, and the #3 spot on the global open model leaderboard.

Dr. Nova Chen · Apr 4, 2026 · 5 min read
AI

Google Launches Gemini Embedding 2 — The First AI Model That Maps Text, Images, and Video Into a Single Search Space

Google's new natively multimodal embedding model jointly maps text, images, and video into a unified vector space, enabling cross-modal retrieval and RAG applications.

Dr. Nova Chen · Mar 14, 2026 · 4 min read
AI

OpenAI Is Bringing Sora Video Generation Directly Into ChatGPT — Giving Hundreds of Millions of Users Access

According to The Information, OpenAI plans to embed Sora's video-generation capabilities into the ChatGPT interface, mirroring how DALL-E image creation was integrated.

Dr. Nova Chen · Mar 12, 2026 · 4 min read
AI

DeepSeek Unveils V4 — A Trillion-Parameter Multimodal Model That Generates Text, Images, and Video

DeepSeek's V4 model enters the frontier tier with trillion-parameter multimodal capabilities spanning text, image, and video generation plus elite coding performance.

Dr. Nova Chen · Mar 4, 2026 · 5 min read
AI

Google Gemini 3.1 Pro Doubles Reasoning Performance With a New Three-Tier Thinking System

Google DeepMind’s Gemini 3.1 Pro scores 77.1% on ARC-AGI-2, more than doubling its predecessor’s reasoning with a three-tier thinking architecture.

Dr. Nova Chen · Feb 25, 2026 · 5 min read