AI Research, Large Language Models and ML - Artificial Intelligence Zone

NVIDIA AI Researchers Explore Upcycling Large Language Models into Sparse Mixture-of-Experts

Marktechpost

OCTOBER 13, 2024

Large Language Models Surprise Meta AI Researchers at Compiler Optimization!

This AI Research Introduces Owl: A New Large Language Model for IT Operations

Salesforce AI Research Introduces LaTRO: A Self-Rewarding Framework for Enhancing Reasoning Capabilities in Large Language Models

This AI Research Shares a Comprehensive Overview of Large Language Models (LLMs) on Graphs

Apple AI Research Introduces MM1.5: A New Family of Highly Performant Generalist Multimodal Large Language Models (MLLMs)

This AI Research Discusses Achieving Efficient Large Language Models (LLMs) by Eliminating Matrix Multiplication for Scalable Performance

This AI Research from China Proposes YAYI2-30B: A Multilingual Open-Source Large Language Model with 30 Billion Parameters

This AI Research from Cohere Discusses Model Evaluation Using a Panel of Large Language Models Evaluators (PoLL)

This AI Paper Unveils the Future of MultiModal Large Language Models (MM-LLMs) – Understanding Their Evolution, Capabilities, and Impact on AI Research

Salesforce AI Introduces ReGenesis: A Novel AI Approach to Improving Large Language Model Reasoning Capabilities

This AI Research from Tenyx Explore the Reasoning Abilities of Large Language Models (LLMs) Through Their Geometrical Understanding

Google AI Research Introduces Patchscopes: A Revolutionary AI Framework for Decoding and Enhancing the Interpretability of Large Language Models

JPMorgan AI Research Introduces DocLLM: A Lightweight Extension to Traditional Large Language Models Tailored for Generative Reasoning Over Documents with Rich Layouts

CMU AI Researchers Unveil TOFU: A Groundbreaking Machine Learning Benchmark for Data Unlearning in Large Language Models

This AI Research from Apple Unveils a Breakthrough in Running Large Language Models on Devices with Limited Memory

Microsoft AI Research Introduces Generalized Instruction Tuning (called GLAN): A General and Scalable Artificial Intelligence Method for Instruction Tuning of Large Language Models (LLMs)

This AI Research Explains the Synthetic Personality Traits in Large Language Models (LLMs)

Do Large Language Models Really Need All Those Layers? This AI Research Unmasks Model Efficiency: The Quest for Essential Components in Large Language Models

This AI Research Introduces TinyGPT-V: A Parameter-Efficient MLLMs (Multimodal Large Language Models) Tailored for a Range of Real-World Vision-Language Applications

Salesforce AI Research Introduces CodeTF: A One-Stop Transformer Library For Code Large Language Models (CodeLLM)

Google AI Research Propose a General Approach for Personalized Text Generation Using Large Language Models (LLMs)

Google AI Researchers Introduce DiarizationLM: A Machine Learning Framework to Leverage Large Language Models (LLM) to Post-Process the Outputs from a Speaker Diarization System

Can Large Language Models Really Do Math? This Artificial Intelligence AI Research Introduce MathGLM: A Robust Model To Solve Mathematical Problems Without a Calculator

Can Continual Learning Strategies Outperform Traditional Re-Training in Large Language Models? This AI Research Unveils Efficient Machine Learning Approaches

This AI Research Proposes DISC-MedLLM: A Comprehensive Solution that Leverages Large Language Models (LLMs) to Provide Accurate Medical Response

Block Transformer: Enhancing Inference Efficiency in Large Language Models Through Hierarchical Global-to-Local Modeling

Upstage AI Introduces Dataverse for Addressing Challenges in Data Processing for Large Language Models

This AI Research from DeepMind Aims at Reducing Sycophancy in Large Language Models (LLMs) Using Simple Synthetic Data

Meet LLM360: The First Fully Open-Source and Transparent Large Language Models (LLMs)

This AI Paper Explores How Code Integration Elevates Large Language Models to Intelligent Agents

Microsoft Researchers Introduce PromptBench: A Pytorch-based Python Package for Evaluation of Large Language Models (LLMs)

Researchers from Microsoft and Georgia Tech Introduce VCoder: Versatile Vision Encoders for Multimodal Large Language Models

Why AI Language Models Are Still Vulnerable: Key Insights from Kili Technology’s Report on Large Language Model Vulnerabilities

Researchers from Snowflake and CMU Introduce SuffixDecoding: A Novel Model-Free Approach to Accelerating Large Language Model (LLM) Inference through Speculative Decoding

Balancing Act: The Impact of Format Restrictions on Reasoning in Large Language Models

Researchers from Inception, MBZUAI, and Cerebras Open-Sourced ‘Jais’: The World’s Most Advanced Arabic Large Language Model

Meet MiniChain: A Tiny Python Library for Coding with Large Language Models

Meet Mistral-7B-v0.1: A New Large Language Model on the Block

Decoding AI Cognition: Unveiling the Color Perception of Large Language Models through Cognitive Psychology Methods

This AI Research Introduces CoDi-2: A Groundbreaking Multimodal Large Language Model Transforming the Landscape of Interleaved Instruction Processing and Multimodal Output Generation

Meet mPLUG-Owl2: A Multi-Modal Foundation Model that Transforms Multi-modal Large Language Models (MLLMs) with Modality Collaboration

DeepMind and UCL’s Comprehensive Analysis of Latent Multi-Hop Reasoning in Large Language Models

Meet WavJourney: An AI Framework For Compositional Audio Creation With Large Language Models
