Ovis-1.6: An Open-Source Multimodal Large Language Model (MLLM) Architecture Designed to Structurally Align Visual and Textual Embeddings
Marktechpost
SEPTEMBER 29, 2024
For instance, in the MathVista-Mini benchmark, Ovis scored 1808, significantly higher than its competitors. Similarly, in the RealWorldQA benchmark, Ovis outperformed leading proprietary models such as GPT4V and Qwen-VL-Plus, scoring 2230, compared to GPT4V’s 2038.