SGLang is an open-source inference engine designed by the SGLang team to address these challenges. It optimizes CPU and GPU resources during inference, achieving significantly higher throughput than many competing solutions.
Researchers from Salesforce AI Research have proposed Programmatic VLM Evaluation (PROVE), a new benchmarking paradigm that evaluates VLM responses to open-ended visual queries.
The post Google AI Research Examines Random Circuit Sampling (RCS) for Evaluating Quantum Computer Performance in the Presence of Noise appeared first on MarkTechPost.
In response, researchers from Salesforce AI Research introduced BLIP-3-Video, an advanced VLM designed specifically to address inefficiencies in video processing. Existing models often struggle to optimize token efficiency and video-processing performance, necessitating more effective solutions to streamline token management.
The post Google AI Researchers Propose ‘MODEL SWARMS’: A Collaborative Search Algorithm to Flexibly Adapt Diverse LLM Experts to Wide-Ranging Purposes appeared first on MarkTechPost.
The post Google AI Research Introduces Process Advantage Verifiers: A Novel Machine Learning Approach to Improving LLM Reasoning Capabilities appeared first on MarkTechPost.
Researchers from the Georgia Institute of Technology and Salesforce AI Research introduce a new framework for evaluating RAG systems based on a metric called “sub-question coverage.”
The post This AI Research from Cohere for AI Compares Merging vs Data Mixing as a Recipe for Building High-Performant Aligned LLMs appeared first on MarkTechPost.
Improves language model training, increasing accuracy by 1.3 percentage points on AI benchmark datasets like ARC Challenge and DROP. Compatible with inference engines like vLLM and SGLang, allowing flexible deployment on various hardware setups.
The post Google AI Researchers Introduced a Set of New Methods for Enhancing Long-Context LLM Performance in Retrieval-Augmented Generation appeared first on MarkTechPost.
The post Meta AI Researchers Introduce Token-Level Detective Reward Model (TLDR) to Provide Fine-Grained Annotations for Large Vision Language Models appeared first on MarkTechPost.
The team has shared that PowerInfer is a GPU-CPU hybrid inference engine that makes use of this understanding: it preloads cold-activated neurons onto the CPU for computation and hot-activated neurons onto the GPU for instant access.
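The hot/cold split described above can be sketched as a simple placement policy. This is a minimal illustration of the idea, not PowerInfer's actual implementation; the threshold and the per-neuron activation frequencies below are hypothetical.

```python
# Sketch of a hot/cold neuron placement policy in the spirit of PowerInfer:
# frequently activated ("hot") neurons are kept GPU-resident for instant
# access, while rarely activated ("cold") neurons are computed on the CPU.
# Activation frequencies here are made-up illustrative profiling data.

def place_neurons(activation_freq, hot_threshold=0.5):
    """Partition neuron ids into GPU-resident (hot) and CPU-resident (cold)."""
    placement = {}
    for neuron_id, freq in activation_freq.items():
        placement[neuron_id] = "gpu" if freq >= hot_threshold else "cpu"
    return placement

# Hypothetical per-neuron activation frequencies gathered from profiling.
freqs = {0: 0.92, 1: 0.07, 2: 0.61, 3: 0.02}
placement = place_neurons(freqs)
print(placement)  # {0: 'gpu', 1: 'cpu', 2: 'gpu', 3: 'cpu'}
```

In the real system the partition is derived from measured activation sparsity and GPU memory budget, but the core design choice is the same: spend scarce GPU memory only on the neurons that fire often.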
Researchers from Meta Fundamental AI Research (FAIR) have introduced the Open Materials 2024 (OMat24) dataset, which contains over 110 million DFT calculations, making it one of the largest publicly available datasets in this domain.
Researchers in AI are working to enable these models to perform not just language understanding but also complex reasoning tasks such as problem-solving in mathematics, logic, and general knowledge. In response to these limitations, researchers from Salesforce AI Research introduced a novel method called ReGenesis.
A major challenge in AI research is developing models that efficiently balance fast, intuitive reasoning with slower, more deliberate reasoning. In AI models, this dichotomy mostly presents itself as a trade-off between computational efficiency and accuracy.
With a successful Series Seed funding round of $31 million led by Andreessen Horowitz and support from notable angel investors, Black Forest Labs has positioned itself at the forefront of generative AI research with its open-source FLUX.1 model.
Large Language Models (LLMs) have gained significant attention in AI research due to their impressive capabilities. However, they remain limited in long-term planning and complex problem-solving.
This research field is evolving rapidly as AI researchers explore new methods to enhance LLMs’ capabilities in handling advanced reasoning tasks, particularly in mathematics.
The team of AI researchers and engineers behind the open-source Jan.ai tested its implementation of TensorRT-LLM against the open-source llama.cpp inference engine across a variety of GPUs and CPUs used by the community. Source: Jan.ai
The lack of effective evaluation methods poses a serious problem for AI research and development. Current evaluation frameworks, such as LLM-as-a-Judge, which uses large language models to judge outputs from other AI systems, often fail to account for the entire task-solving process.
A regular expression inference engine that efficiently converts regular expressions to finite automata has been designed and implemented. The researchers achieved competitive GPU utilization and runtimes (in seconds) using both shortest-path and randomized graph traversals.
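To make the automaton-plus-graph-traversal idea concrete, here is a toy sketch (not the researchers' engine): a regular expression compiled to a DFA becomes a labeled graph, and a breadth-first shortest-path traversal over that graph recovers the shortest string the automaton accepts. The DFA below, hand-built for the regex `ab*c`, is purely illustrative.

```python
from collections import deque

# Hand-built toy DFA for the regex ab*c: states are ints, transitions
# map (state, char) -> next state, and state 2 is the accepting state.
transitions = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}
start, accepting = 0, {2}

def shortest_accepted(transitions, start, accepting):
    """BFS over the automaton graph; returns a shortest accepted string."""
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        state, prefix = queue.popleft()
        if state in accepting:
            return prefix  # BFS explores by length, so this is shortest
        for (src, ch), dst in transitions.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, prefix + ch))
    return None  # the automaton accepts nothing

print(shortest_accepted(transitions, start, accepting))  # "ac"
```

Replacing the breadth-first queue with random neighbor selection gives the randomized-traversal variant, which samples arbitrary strings from the language instead of the shortest one.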
A team of researchers from Microsoft Responsible AI Research and Johns Hopkins University proposed Controllable Safety Alignment (CoSA), a framework for efficient inference-time adaptation to diverse safety requirements. The strategy first produces an LLM that is easily controllable for safety.
Moreover, the team found that the fusion windows for commonly used layers and units in LDMs need to be substantially larger on a mobile GPU than what is currently available from commercially available GPU-accelerated ML inference engines.
According to NVIDIA's benchmarks, TensorRT can provide up to 8x faster inference performance and 5x lower total cost of ownership compared to CPU-based inference for large language models like GPT-3. The compiled engine can then be used to perform efficient inference on the GPU, leveraging CUDA for accelerated computation.
Artificial intelligence (AI) and machine learning (ML) revolve around building models capable of learning from data to perform tasks like language processing, image recognition, and making predictions. A significant aspect of AI research focuses on neural networks, particularly transformers.
By leveraging DeciCoder alongside Infery LLM, a dedicated inference engine, users unlock significantly higher throughput, a staggering 3.5x. Overall, DeciCoder is more than just a model; it is a realization of AI efficiency’s potential. The implications of this development are profound.
Meta Lingua’s importance lies in its ability to simplify the experimentation process for NLP researchers. In an era where large language models are at the forefront of AI research, having access to a robust yet simple-to-use tool can make all the difference.
Lin Qiao, formerly head of Meta's PyTorch, is the Co-Founder and CEO of Fireworks AI, a production AI platform built for developers. Fireworks partners with the world's leading generative AI researchers to serve the best models at the fastest speeds.
Meta AI’s effort to push the boundaries of efficient AI modeling highlights the growing emphasis on sustainable, inclusive AI development, a trend that is sure to shape the future of AI research and application.
The result of using these methods and technologies would be an AI-powered inference engine we can query to see the rational support, empirical or otherwise, for the key premises of arguments that bear on important practical decisions.