AI Research, Large Language Models and ML - Artificial Intelligence Zone

Tencent AI Researchers Introduce Hunyuan-T1: A Mamba-Powered Ultra-Large Language Model Redefining Deep Reasoning, Contextual Efficiency, and Human-Centric Reinforcement Learning

Marktechpost

MARCH 29, 2025

Large language models struggle to process and reason over lengthy, complex texts without losing essential context. Traditional models often suffer from context loss, inefficient handling of long-range dependencies, and difficulties aligning with human preferences, affecting the accuracy and efficiency of their responses.

Large Language Models

Large Language Models AI Researcher AI Research ML

NVIDIA AI Researchers Introduce FFN Fusion: A Novel Optimization Technique that Demonstrates How Sequential Computation in Large Language Models LLMs can be Effectively Parallelized

Marktechpost

MARCH 29, 2025

Large language models (LLMs) have become vital across domains, enabling high-performance applications such as natural language generation, scientific research, and conversational agents. All credit for this research goes to the researchers of this project.

Large Language Models

Large Language Models AI Researcher AI Research AI

Inception Unveils Mercury: The First Commercial-Scale Diffusion Large Language Model

Marktechpost

MARCH 8, 2025

Introducing the first-ever commercial-scale diffusion large language models (dLLMs), Inception labs promises a paradigm shift in speed, cost-efficiency, and intelligence for text and code generation tasks. Also,feel free to follow us on Twitter and dont forget to join our 80k+ ML SubReddit.

Large Language Models

Large Language Models Generative AI LLM AI Researcher

Webinars

Automation, Evolved: Your New Playbook For Smarter Knowledge Work

MORE WEBINARS

Meet FinGPT: An Open-Source Financial Large Language Model (LLMs)

Flipboard

JUNE 16, 2023

Large language models have increased due to the ongoing development and advancement of artificial intelligence, which has profoundly impacted the state of natural language processing in various fields. They soon plan to release the trained model. Check Out The Paper and GitHub link.

Large Language Models

Large Language Models Natural Language Processing Artificial Intelligence Artificial Intelligence

Microsoft AI Released LongRoPE2: A Near-Lossless Method to Extend Large Language Model Context Windows to 128K Tokens While Retaining Over 97% Short-Context Accuracy

Marktechpost

MARCH 1, 2025

Large Language Models (LLMs) have advanced significantly, but a key limitation remains their inability to process long-context sequences effectively. While models like GPT-4o and LLaMA3.1 Also,feel free to follow us on Twitter and dont forget to join our 80k+ ML SubReddit.

Large Language Models

Large Language Models Algorithm AI AI

Alibaba Released Babel: An Open Multilingual Large Language Model LLM Serving Over 90% of Global Speakers

Marktechpost

MARCH 6, 2025

Also,feel free to follow us on Twitter and dont forget to join our 80k+ ML SubReddit.

Large Language Models

Large Language Models LLM NLP Data Quality

Researchers at Stanford Introduces LLM-Lasso: A Novel Machine Learning Framework that Leverages Large Language Models (LLMs) to Guide Feature Selection in Lasso ℓ1 Regression

Marktechpost

MARCH 5, 2025

Also,feel free to follow us on Twitter and dont forget to join our 80k+ ML SubReddit.

Large Language Models

Large Language Models LLM Machine Learning Prompt Engineer

Meta AI Introduces MLGym: A New AI Framework and Benchmark for Advancing AI Research Agents

Marktechpost

FEBRUARY 23, 2025

Researchers from the University College London, University of WisconsinMadison, University of Oxford, Meta, and other institutes have introduced a new framework and benchmark for evaluating and developing LLM agents in AI research. Agents execute bash commands, manage history, and integrate external models.

AI Researcher

AI Researcher AI Research Software Engineer AI

NVIDIA AI Researchers Explore Upcycling Large Language Models into Sparse Mixture-of-Experts

Marktechpost

OCTOBER 13, 2024

Don’t Forget to join our 50k+ ML SubReddit [Upcoming Event- Oct 17 202] RetrieveX – The GenAI Data Retrieval Conference (Promoted) The post NVIDIA AI Researchers Explore Upcycling Large Language Models into Sparse Mixture-of-Experts appeared first on MarkTechPost.

Large Language Models

Large Language Models AI Researcher AI Research Natural Language Processing

This AI Paper Unveils the Future of MultiModal Large Language Models (MM-LLMs) – Understanding Their Evolution, Capabilities, and Impact on AI Research

Marktechpost

JANUARY 30, 2024

Recent developments in Multi-Modal (MM) pre-training have helped enhance the capacity of Machine Learning (ML) models to handle and comprehend a variety of data types, including text, pictures, audio, and video. Join our 36k+ ML SubReddit , 41k+ Facebook Community, Discord Channel , and LinkedIn Gr oup.

Large Language Models

Large Language Models AI Researcher AI Research LLM

Google AI Research Introduces Patchscopes: A Revolutionary AI Framework for Decoding and Enhancing the Interpretability of Large Language Models

Marktechpost

JANUARY 14, 2024

Their aptitude to process and generate language has far-reaching consequences in multiple fields, from automated chatbots to advanced data analysis. Grasping the internal workings of these models is critical to improving their efficacy and aligning them with human values and ethics. If you like our work, you will love our newsletter.

Large Language Models

Large Language Models AI Researcher AI Research Neural Network

JPMorgan AI Research Introduces DocLLM: A Lightweight Extension to Traditional Large Language Models Tailored for Generative Reasoning Over Documents with Rich Layouts

Marktechpost

JANUARY 5, 2024

While Document AI (DocAI) has made significant strides in areas such as question answering, categorization, and extraction, real-world applications continue to face persistent hurdles related to accuracy, reliability, contextual understanding, and generalization to new domains. If you like our work, you will love our newsletter.

Large Language Models

Large Language Models AI Researcher AI Research Categorization

CMU AI Researchers Unveil TOFU: A Groundbreaking Machine Learning Benchmark for Data Unlearning in Large Language Models

Marktechpost

JANUARY 15, 2024

Join our 36k+ ML SubReddit , 41k+ Facebook Community, Discord Channel , and LinkedIn Gr oup. Don’t Forget to join our Telegram Channel The post CMU AI Researchers Unveil TOFU: A Groundbreaking Machine Learning Benchmark for Data Unlearning in Large Language Models appeared first on MarkTechPost.

Large Language Models

Large Language Models Machine Learning AI Researcher AI Research

Meta AI Researchers Introduced SWEET-RL and CollaborativeAgentBench: A Step-Wise Reinforcement Learning Framework to Train Multi-Turn Language Agents for Realistic Human-AI Collaboration Tasks

Marktechpost

MARCH 22, 2025

Large language models (LLMs) are rapidly transforming into autonomous agents capable of performing complex tasks that require reasoning, decision-making, and adaptability. All credit for this research goes to the researchers of this project. Check out the Paper , GitHub Page and Dataset.

AI Researcher

AI Researcher AI Research Large Language Models AI

Microsoft AI Research Introduces Generalized Instruction Tuning (called GLAN): A General and Scalable Artificial Intelligence Method for Instruction Tuning of Large Language Models (LLMs)

Marktechpost

MARCH 2, 2024

Large Language Models (LLMs) have significantly evolved in recent times, especially in the areas of text understanding and generation. Join our 38k+ ML SubReddit , 41k+ Facebook Community, Discord Channel , and LinkedIn Gr oup. Don’t Forget to join our Telegram Channel You may also like our FREE AI Courses….

Large Language Models

Large Language Models Artificial Intelligence Artificial Intelligence AI Researcher

This AI Research from Tenyx Explore the Reasoning Abilities of Large Language Models (LLMs) Through Their Geometrical Understanding

Marktechpost

JULY 8, 2024

Large language models (LLMs) have demonstrated remarkable performance across various tasks, with reasoning capabilities being a crucial aspect of their development. Don’t Forget to join our 46k+ ML SubReddit If You are interested in a promotional partnership (content/ad/newsletter), please fill out this form.

Large Language Models

Large Language Models AI Researcher AI Research LLM

Microsoft Researchers Introduce PromptBench: A Pytorch-based Python Package for Evaluation of Large Language Models (LLMs)

Marktechpost

DECEMBER 23, 2023

In the ever-evolving large language models (LLMs), a persistent challenge has been the need for more standardization, hindering effective model comparisons and impeding the need for reevaluation. The absence of a cohesive and comprehensive framework has left researchers navigating a disjointed evaluation terrain.

Large Language Models

Large Language Models Python LLM AI Researcher

Optimizing Training Data Allocation Between Supervised and Preference Finetuning in Large Language Models

Marktechpost

FEBRUARY 23, 2025

Large Language Models (LLMs) face significant challenges in optimizing their post-training methods, particularly in balancing Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) approaches. Also,feel free to follow us on Twitter and dont forget to join our 80k+ ML SubReddit.

Large Language Models

Large Language Models Chatbots LLM AI Research

Researchers from Microsoft and Georgia Tech Introduce VCoder: Versatile Vision Encoders for Multimodal Large Language Models

Marktechpost

DECEMBER 27, 2023

In the evolving landscape of artificial intelligence and machine learning, the integration of visual perception with language processing has become a frontier of innovation. This integration is epitomized in the development of Multimodal Large Language Models (MLLMs), which have shown remarkable prowess in a range of vision-language tasks.

Large Language Models

Large Language Models Artificial Intelligence Artificial Intelligence Machine Learning

Meet LLM360: The First Fully Open-Source and Transparent Large Language Models (LLMs)

Marktechpost

DECEMBER 13, 2023

Open-source Large Language Models (LLMs) such as LLaMA, Falcon, and Mistral offer a range of choices for AI professionals and scholars. LLM360 is an initiative to fully open-source LLMs that advocates for all training code and data, model checkpoints, and intermediate results to be made available to the community.

Large Language Models

Large Language Models LLM AI Researcher AI Research

This AI Paper Explores How Code Integration Elevates Large Language Models to Intelligent Agents

Marktechpost

JANUARY 5, 2024

In a recent research paper, a team of researchers from the University of Illinois Urbana-Champaign has offered a thorough and detailed study of the mutually beneficial relationship that exists between code and Large Language Models (LLMs). If you like our work, you will love our newsletter.

Large Language Models

Large Language Models LLM Artificial Intelligence Artificial Intelligence

Tufa Labs Introduced LADDER: A Recursive Learning Framework Enabling Large Language Models to Self-Improve without Human Intervention

Marktechpost

MARCH 8, 2025

Large Language Models (LLMs) benefit significantly from reinforcement learning techniques, which enable iterative improvements by learning from rewards. However, training these models efficiently remains challenging, as they often require extensive datasets and human supervision to enhance their capabilities.

Large Language Models

Large Language Models OpenAI AI Researcher AI Research

Meet SemiKong: The World’s First Open-Source Semiconductor-Focused LLM

Marktechpost

DECEMBER 27, 2024

Researchers from Meta, AITOMATIC, and other collaborators under the Foundation Models workgroup of the AI Alliance have introduced SemiKong. SemiKong represents the worlds first semiconductor-focused large language model (LLM), designed using the Llama 3.1 Dont Forget to join our 60k+ ML SubReddit.

LLM

LLM Large Language Models AI Tools Automation

Apple AI Research Introduces MM1.5: A New Family of Highly Performant Generalist Multimodal Large Language Models (MLLMs)

Marktechpost

OCTOBER 4, 2024

Multimodal large language models (MLLMs) represent a cutting-edge area in artificial intelligence, combining diverse data modalities like text, images, and even video to build a unified understanding across domains. is poised to address key challenges in multimodal AI. The post Apple AI Research Introduces MM1.5:

Large Language Models

Large Language Models AI Research AI Researcher AI

This AI Paper Introduces Learning from Mistakes (LeMa): Enhancing Mathematical Reasoning in Large Language Models through Error-Driven Learning

Marktechpost

NOVEMBER 16, 2023

Large language models, like GPT-3, learn from vast data, including examples of correct and incorrect language usage. These models are trained on diverse datasets containing a wide range of text from the internet, books, articles, and more. All credit for this research goes to the researchers of this project.

Large Language Models

Large Language Models LLM AI AI

Decoding AI Cognition: Unveiling the Color Perception of Large Language Models through Cognitive Psychology Methods

Marktechpost

FEBRUARY 13, 2024

Researchers are pushing what machines can comprehend and replicate regarding human cognitive processes. A groundbreaking study unveils an approach to peering into the minds of Large Language Models (LLMs), particularly focusing on GPT-4’s understanding of color. If you like our work, you will love our newsletter.

Large Language Models

Large Language Models Neural Network Artificial Intelligence Artificial Intelligence

Deci AI Introduces DeciLM-7B: A Super Fast and Super Accurate 7 Billion-Parameter Large Language Model (LLM)

Marktechpost

DECEMBER 14, 2023

In conclusion, DeciLM-7B is a significant model in Large Language Models. It serves as a guiding force where language models excel not only in precision and efficiency but also in accessibility and versatility. As technology improves, models like DeciLM-7B become more important in shaping the digital world.

Large Language Models

Large Language Models LLM Artificial Intelligence Artificial Intelligence

Meet LLM Surgeon: A New Machine Learning Framework for Unstructured, Semi-Structured, and Structured Pruning of Large Language Models (LLMs)

Marktechpost

DECEMBER 30, 2023

The recent advancements in Artificial Intelligence have enabled the development of Large Language Models (LLMs) with a significantly large number of parameters, with some of them reaching into billions (for example, LLaMA-2 that comes in sizes of 7B, 13B, and even 70B parameters).

Large Language Models

Large Language Models Machine Learning LLM Artificial Intelligence

University of Illinois Researchers Introduce Magicoder: a Series of Fully Open-Source Large Language Models (LLMs) for Code

Marktechpost

DECEMBER 8, 2023

Also, don’t forget to join our 33k+ ML SubReddit , 41k+ Facebook Community, Discord Channel , and Email Newsletter , where we share the latest AI research news, cool AI projects, and more. If you like our work, you will love our newsletter.

Large Language Models

Large Language Models Data Science Categorization Python

No Experience? Here’s How You Can Transform Into an Ethical Artificial Intelligence Developer

Unite.AI

FEBRUARY 6, 2025

AI and machine learning (ML) are reshaping industries and unlocking new opportunities at an incredible pace. There are countless routes to becoming an artificial intelligence (AI) expert, and each persons journey will be shaped by unique experiences, setbacks, and growth.

Artificial Intelligence

Artificial Intelligence Artificial Intelligence ML Responsible AI

Salesforce AI Introduces ReGenesis: A Novel AI Approach to Improving Large Language Model Reasoning Capabilities

Marktechpost

OCTOBER 18, 2024

Large language models (LLMs) have revolutionized how machines process and generate human language, but their ability to reason effectively across diverse tasks remains a significant challenge. In response to these limitations, researchers from Salesforce AI Research introduced a novel method called ReGenesis.

Large Language Models

Large Language Models Inference Engine AI AI

KAIST and DeepAuto AI Researchers Propose InfiniteHiP: A Game-Changing Long-Context LLM Framework for 3M-Token Inference on a Single GPU

Marktechpost

FEBRUARY 16, 2025

In large language models (LLMs), processing extended input sequences demands significant computational and memory resources, leading to slower inference and higher hardware costs. All credit for this research goes to the researchers of this project.

LLM

LLM AI Researcher AI Research Large Language Models

Scale AI and Meta Introduces Defense Llama: The LLM Purpose-Built for American National Security

Marktechpost

NOVEMBER 6, 2024

Until recently, existing large language models (LLMs) have lacked the precision, reliability, and domain-specific knowledge required to effectively support defense and security operations. Meet Defense Llama , an ambitious collaborative project introduced by Scale AI and Meta. Don’t Forget to join our 55k+ ML SubReddit.

LLM

LLM Large Language Models Artificial Intelligence Artificial Intelligence

Meet PowerInfer: A Fast Large Language Model (LLM) on a Single Consumer-Grade GPU that Speeds up Machine Learning Model Inference By 11 Times

Marktechpost

DECEMBER 23, 2023

Generative Large Language Models (LLMs) are well known for their remarkable performance in a variety of tasks, including complex Natural Language Processing (NLP), creative writing, question answering, and code generation. All credit for this research goes to the researchers of this project.

Large Language Models

Large Language Models Machine Learning LLM Natural Language Processing

Balancing Act: The Impact of Format Restrictions on Reasoning in Large Language Models

Marktechpost

AUGUST 9, 2024

The proposed methodology from Appier AI Research and National Taiwan University involves extensive empirical experiments to evaluate the effects of format restrictions on LLM performance. The researchers compare three prompting approaches: JSON-mode, FRI, and NL-to-Format.

Large Language Models

Large Language Models Data Analysis LLM Deep Learning

Stanford Researchers Introduce OctoTools: A Training-Free Open-Source Agentic AI Framework Designed to Tackle Complex Reasoning Across Diverse Domains

Marktechpost

FEBRUARY 22, 2025

Large language models (LLMs) are limited by complex reasoning tasks that require multiple steps, domain-specific knowledge, or external tool integration. To address these challenges, researchers have explored ways to enhance LLM capabilities through external tool usage. Check out the Paper and GitHub Page.

Metadata

Metadata Large Language Models Algorithm AI

Meet Open Deep Search (ODS): A Plug-and-Play Framework Democratizing Search with Open-source Reasoning Agents

Marktechpost

MARCH 27, 2025

The rapid advancements in search engine technologies integrated with large language models (LLMs) have predominantly favored proprietary solutions such as Google’s GPT-4o Search Preview and Perplexity’s Sonar Reasoning Pro. All credit for this research goes to the researchers of this project.

Large Language Models

Large Language Models LLM OpenAI AI Researcher

LLMOps: The Next Frontier for Machine Learning Operations

Unite.AI

FEBRUARY 7, 2024

Machine learning (ML) is a powerful technology that can solve complex problems and deliver customer value. However, ML models are challenging to develop and deploy. This is why Machine Learning Operations (MLOps) has emerged as a paradigm to offer scalable and measurable values to Artificial Intelligence (AI) driven businesses.

Machine Learning

Machine Learning Large Language Models LLM BERT

Can Large Language Models be Trusted for Evaluation? Meet SCALEEVAL: An Agent-Debate-Assisted Meta-Evaluation Framework that Leverages the Capabilities of Multiple Communicative LLM Agents

Marktechpost

FEBRUARY 11, 2024

Despite the utility of large language models (LLMs) across various tasks and scenarios, researchers need help to evaluate LLMs properly in different situations. Join our 36k+ ML SubReddit , 41k+ Facebook Community, Discord Channel , and LinkedIn Gr oup. If you like our work, you will love our newsletter.

Large Language Models

Large Language Models LLM Artificial Intelligence Artificial Intelligence

Block Transformer: Enhancing Inference Efficiency in Large Language Models Through Hierarchical Global-to-Local Modeling

Marktechpost

OCTOBER 3, 2024

Large language models (LLMs) have gained widespread popularity, but their token generation process is computationally expensive due to the self-attention mechanism. This approach adopts hierarchical global-to-local modeling to mitigate the significant KV cache IO bottleneck in batch inference.

Large Language Models

Large Language Models Algorithm AI Research AI Researcher

Meet HuatuoGPT-o1: A Medical LLM Designed for Advanced Medical Reasoning

Marktechpost

DECEMBER 30, 2024

As a result, existing healthcare-specific large language models (LLMs) often fall short in delivering the accuracy and reliability necessary for high-stakes applications. Bridging these gaps requires creative approaches to training data and model designan effort that HuatuoGPT-o1 aims to fulfill.

LLM

LLM Large Language Models Big Data Artificial Intelligence

NVIDIA Researchers Introduce Order-Preserving Retrieval-Augmented Generation (OP-RAG) for Enhanced Long-Context Question Answering with Large Language Models (LLMs)

Marktechpost

SEPTEMBER 10, 2024

Retrieval-augmented generation (RAG), a technique that enhances the efficiency of large language models (LLMs) in handling extensive amounts of text, is critical in natural language processing, particularly in applications such as question-answering, where maintaining the context of information is crucial for generating accurate responses.

Large Language Models

Large Language Models Natural Language Processing AI Researcher AI Research

2024 BAIR Graduate Directory

BAIR

MARCH 11, 2024

. --> Every year, the Berkeley Artificial Intelligence Research (BAIR) Lab graduates some of the most talented and innovative minds in artificial intelligence and machine learning. graduates have each expanded the frontiers of AI research and are now ready to embark on new adventures in academia, industry, and beyond.

Robotics

Robotics Natural Language Processing Machine Learning Deep Learning

Hugging Face Releases Picotron: A Tiny Framework that Solves LLM Training 4D Parallelization

Marktechpost

DECEMBER 19, 2024

The rise of large language models (LLMs) has transformed natural language processing, but training these models comes with significant challenges. Training state-of-the-art models like GPT and Llama requires enormous computational resources and intricate engineering. For instance, Llama-3.1-405B

LLM

LLM Natural Language Processing Large Language Models AI Researcher

Tencent AI Researchers Introduce Hunyuan-T1: A Mamba-Powered Ultra-Large Language Model Redefining Deep Reasoning, Contextual Efficiency, and Human-Centric Reinforcement Learning

NVIDIA AI Researchers Introduce FFN Fusion: A Novel Optimization Technique that Demonstrates How Sequential Computation in Large Language Models LLMs can be Effectively Parallelized

Webinars

Trending Sources

Inception Unveils Mercury: The First Commercial-Scale Diffusion Large Language Model

Webinars

Meet FinGPT: An Open-Source Financial Large Language Model (LLMs)

Microsoft AI Released LongRoPE2: A Near-Lossless Method to Extend Large Language Model Context Windows to 128K Tokens While Retaining Over 97% Short-Context Accuracy

Alibaba Released Babel: An Open Multilingual Large Language Model LLM Serving Over 90% of Global Speakers

Researchers at Stanford Introduces LLM-Lasso: A Novel Machine Learning Framework that Leverages Large Language Models (LLMs) to Guide Feature Selection in Lasso ℓ1 Regression

Meta AI Introduces MLGym: A New AI Framework and Benchmark for Advancing AI Research Agents

NVIDIA AI Researchers Explore Upcycling Large Language Models into Sparse Mixture-of-Experts

This AI Paper Unveils the Future of MultiModal Large Language Models (MM-LLMs) – Understanding Their Evolution, Capabilities, and Impact on AI Research

Google AI Research Introduces Patchscopes: A Revolutionary AI Framework for Decoding and Enhancing the Interpretability of Large Language Models

JPMorgan AI Research Introduces DocLLM: A Lightweight Extension to Traditional Large Language Models Tailored for Generative Reasoning Over Documents with Rich Layouts

CMU AI Researchers Unveil TOFU: A Groundbreaking Machine Learning Benchmark for Data Unlearning in Large Language Models

Meta AI Researchers Introduced SWEET-RL and CollaborativeAgentBench: A Step-Wise Reinforcement Learning Framework to Train Multi-Turn Language Agents for Realistic Human-AI Collaboration Tasks

Microsoft AI Research Introduces Generalized Instruction Tuning (called GLAN): A General and Scalable Artificial Intelligence Method for Instruction Tuning of Large Language Models (LLMs)

This AI Research from Tenyx Explore the Reasoning Abilities of Large Language Models (LLMs) Through Their Geometrical Understanding

Microsoft Researchers Introduce PromptBench: A Pytorch-based Python Package for Evaluation of Large Language Models (LLMs)

Optimizing Training Data Allocation Between Supervised and Preference Finetuning in Large Language Models

Researchers from Microsoft and Georgia Tech Introduce VCoder: Versatile Vision Encoders for Multimodal Large Language Models

Meet LLM360: The First Fully Open-Source and Transparent Large Language Models (LLMs)

This AI Paper Explores How Code Integration Elevates Large Language Models to Intelligent Agents

Tufa Labs Introduced LADDER: A Recursive Learning Framework Enabling Large Language Models to Self-Improve without Human Intervention

Meet SemiKong: The World’s First Open-Source Semiconductor-Focused LLM

Apple AI Research Introduces MM1.5: A New Family of Highly Performant Generalist Multimodal Large Language Models (MLLMs)

This AI Paper Introduces Learning from Mistakes (LeMa): Enhancing Mathematical Reasoning in Large Language Models through Error-Driven Learning

Decoding AI Cognition: Unveiling the Color Perception of Large Language Models through Cognitive Psychology Methods

Deci AI Introduces DeciLM-7B: A Super Fast and Super Accurate 7 Billion-Parameter Large Language Model (LLM)

Meet LLM Surgeon: A New Machine Learning Framework for Unstructured, Semi-Structured, and Structured Pruning of Large Language Models (LLMs)

University of Illinois Researchers Introduce Magicoder: a Series of Fully Open-Source Large Language Models (LLMs) for Code

No Experience? Here’s How You Can Transform Into an Ethical Artificial Intelligence Developer

Salesforce AI Introduces ReGenesis: A Novel AI Approach to Improving Large Language Model Reasoning Capabilities

KAIST and DeepAuto AI Researchers Propose InfiniteHiP: A Game-Changing Long-Context LLM Framework for 3M-Token Inference on a Single GPU

Scale AI and Meta Introduces Defense Llama: The LLM Purpose-Built for American National Security

Meet PowerInfer: A Fast Large Language Model (LLM) on a Single Consumer-Grade GPU that Speeds up Machine Learning Model Inference By 11 Times

Balancing Act: The Impact of Format Restrictions on Reasoning in Large Language Models

Stanford Researchers Introduce OctoTools: A Training-Free Open-Source Agentic AI Framework Designed to Tackle Complex Reasoning Across Diverse Domains

Meet Open Deep Search (ODS): A Plug-and-Play Framework Democratizing Search with Open-source Reasoning Agents

LLMOps: The Next Frontier for Machine Learning Operations

Can Large Language Models be Trusted for Evaluation? Meet SCALEEVAL: An Agent-Debate-Assisted Meta-Evaluation Framework that Leverages the Capabilities of Multiple Communicative LLM Agents

Block Transformer: Enhancing Inference Efficiency in Large Language Models Through Hierarchical Global-to-Local Modeling

Meet HuatuoGPT-o1: A Medical LLM Designed for Advanced Medical Reasoning

NVIDIA Researchers Introduce Order-Preserving Retrieval-Augmented Generation (OP-RAG) for Enhanced Long-Context Question Answering with Large Language Models (LLMs)

2024 BAIR Graduate Directory

Hugging Face Releases Picotron: A Tiny Framework that Solves LLM Training 4D Parallelization

Stay Connected