Articles Tagged “Multimodal Ai”
5 articles found
Google Gemma 4 Launches With Four Sizes, Apache 2.0 License, and a Top-3 Open Model Ranking
Google's Gemma 4 arrives with model sizes from 2B to 31B, a permissive Apache 2.0 license, native multimodal support across all sizes, and the #3 spot on the global open model leaderboard.
Google Launches Gemini Embedding 2 — The First AI Model That Maps Text, Images, and Video Into a Single Search Space
Google's new natively multimodal embedding model jointly maps text, images, and video into a unified vector space, enabling cross-modal retrieval and RAG applications.
OpenAI Is Bringing Sora Video Generation Directly Into ChatGPT — Giving Hundreds of Millions of Users Access
According to The Information, OpenAI plans to embed Sora's video-generation capabilities into the ChatGPT interface, mirroring how DALL-E image creation was integrated.
DeepSeek Unveils V4 — A Trillion-Parameter Multimodal Model That Generates Text, Images, and Video
DeepSeek's V4 model enters the frontier tier with trillion-parameter multimodal capabilities spanning text, image, and video generation plus elite coding performance.
Google Gemini 3.1 Pro Doubles Reasoning Performance With a New Three-Tier Thinking System
Google DeepMind’s Gemini 3.1 Pro scores 77.1% on ARC-AGI-2, more than doubling its predecessor’s reasoning with a three-tier thinking architecture.





