
Track LLM model evaluation using Amazon SageMaker managed MLflow and FMEval

AWS Machine Learning Blog

Evaluating large language models (LLMs) is crucial as LLM-based systems become increasingly powerful and relevant in our society. Rigorous testing allows us to understand an LLM's capabilities, limitations, and potential biases, and provides actionable feedback to identify and mitigate risk.


5 Tools to Help Build Your LLM Apps

Flipboard

Whether you're a seasoned ML engineer or a new LLM developer, these tools will help you become more productive and accelerate the development and deployment of your AI projects.


Go from Engineer to ML Engineer with Declarative ML

Flipboard

Learn how to easily build any AI model and customize your own LLM in just a few lines of code with a declarative approach to machine learning.


Achieve ~2x speed-up in LLM inference with Medusa-1 on Amazon SageMaker AI

AWS Machine Learning Blog

Researchers developed Medusa, a framework to speed up LLM inference by adding extra heads to predict multiple tokens simultaneously. This post demonstrates how to use Medusa-1, the first version of the framework, to speed up an LLM by fine-tuning it on Amazon SageMaker AI, and confirms the speedup with deployment and a simple load test.


LLM continuous self-instruct fine-tuning framework powered by a compound AI system on Amazon SageMaker

AWS Machine Learning Blog

Fine-tuning a pre-trained large language model (LLM) allows users to customize the model to perform better on domain-specific tasks or align more closely with human preferences. You can use supervised fine-tuning (SFT) and instruction tuning to train the LLM to perform better on specific tasks using human-annotated datasets and instructions.


Future AGI Secures $1.6M to Launch the World’s Most Accurate AI Evaluation Platform

Unite.AI

Future AGI is already delivering impactful results across industries: a Series E sales-tech company used Future AGI's LLM Experimentation Hub to achieve 99% accuracy in its agentic pipeline, compressing weeks of work into just hours.


Using Large Language Models on Amazon Bedrock for multi-step task execution

AWS Machine Learning Blog

The goal of this blog post is to show you how a large language model (LLM) can be used to perform tasks that require multi-step dynamic reasoning and execution. (Fig. 1: Simple execution flow, solution overview.) In a more complex scheme, you can add multiple layers of validation and provide relevant APIs to increase the success rate of the LLM.