To serve their customers, Vitech maintains a repository of information that includes product documentation (user guides, standard operating procedures, runbooks), which is currently scattered across multiple internal platforms (for example, Confluence sites and SharePoint folders).
Evaluating models at regular intervals also allows organizations to stay informed about the latest advancements and make informed decisions about upgrading or switching models. It also allows you to keep track of your ML experiments. In this post, we show how to use FMEval and Amazon SageMaker to programmatically evaluate LLMs.
The SageMaker endpoint (which includes the custom inference code to preprocess the multi-payload request) passes the inference data to the ML model, postprocesses the predictions, and sends a response to the user or application. The information pertaining to the request and response is stored in Amazon S3.
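The pre/postprocessing flow described above can be sketched with the handler-function convention used by SageMaker's inference toolkit (input_fn / predict_fn / output_fn). This is a minimal, self-contained illustration: the payload shape and the stand-in model are assumptions, not the article's actual code.

```python
import json

# Sketch of the custom inference handler pattern: preprocess a
# multi-record payload, run the model, and postprocess the predictions.

def input_fn(request_body, content_type="application/json"):
    """Preprocess: split a multi-record JSON payload into individual inputs."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    payload = json.loads(request_body)
    return payload["instances"]  # e.g. [[1.0, 2.0], [3.0, 4.0]]

def predict_fn(instances, model):
    """Run the model on each preprocessed record."""
    return [model(x) for x in instances]

def output_fn(predictions, accept="application/json"):
    """Postprocess: wrap predictions in a JSON response for the caller."""
    return json.dumps({"predictions": predictions})

# Stand-in model (hypothetical): sum of the input features.
toy_model = lambda features: sum(features)

body = json.dumps({"instances": [[1.0, 2.0], [3.0, 4.0]]})
response = output_fn(predict_fn(input_fn(body), toy_model))
print(response)  # {"predictions": [3.0, 7.0]}
```

In a real endpoint, these functions would be packaged with the model artifact and invoked by the serving container rather than called directly.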
Machine learning (ML) engineers must make trade-offs and prioritize the most important factors for their specific use case and business requirements. You can use advanced parsing options supported by Amazon Bedrock Knowledge Bases for parsing non-textual information from documents using FMs.
In the rapidly evolving healthcare landscape, patients often find themselves navigating a maze of complex medical information, seeking answers to their questions and concerns. However, accessing accurate and comprehensible information can be a daunting task, leading to confusion and frustration.
In this post, we introduce an example to help DevOps engineers manage the entire ML lifecycle—including training and inference—using the same toolkit.
Solution overview
We consider a use case in which an ML engineer configures a SageMaker model building pipeline using a Jupyter notebook.
Structured Query Language (SQL) is a complex language that requires an understanding of databases and metadata. In addition, despite the broader adoption of centralized analytics solutions like data lakes and warehouses, complexity rises with the different table names and other metadata required to create the SQL for the desired sources.
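The metadata dependency described above can be shown concretely: a correct query can only be written after discovering table and column names from the catalog. A minimal sketch using Python's built-in sqlite3 module (the table name `sales_fct` and its columns are hypothetical):

```python
import sqlite3

# Set up a toy database with an assumed table name and schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fct (order_id INTEGER, amount REAL)")
conn.execute("INSERT INTO sales_fct VALUES (1, 10.0), (2, 32.5)")

# Step 1: discover table names from the catalog (sqlite_master in SQLite).
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(tables)  # ['sales_fct']

# Step 2: discover column names for the table we found.
columns = [row[1] for row in conn.execute("PRAGMA table_info(sales_fct)")]
print(columns)  # ['order_id', 'amount']

# Only with that metadata in hand can the aggregate query be formed.
total = conn.execute("SELECT SUM(amount) FROM sales_fct").fetchone()[0]
print(total)  # 42.5
```

The same discovery steps apply, with different catalog views, in warehouse engines; text-to-SQL systems have to perform this metadata lookup before generating a query.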
It automatically keeps track of model artifacts, hyperparameters, and metadata, helping you to reproduce and audit model versions. The SageMaker Pipelines decorator feature helps convert local ML code written as a Python program into one or more pipeline steps. See Provisioned Throughput for Amazon Bedrock for more information.
Introduction to AI and Machine Learning on Google Cloud This course introduces Google Cloud’s AI and ML offerings for predictive and generative projects, covering technologies, products, and tools across the data-to-AI lifecycle.
These graphs inform administrators where teams can further maximize their GPU utilization. In this example, the ML engineering team is borrowing 5 GPUs for their training task. With SageMaker HyperPod, you can additionally set up observability tools of your choice.
When thinking about a tool for metadata storage and management, you should consider general business-related items: pricing model, security, and support.
However, model governance functions in an organization are centralized, and to perform those functions, teams need access to metadata about model lifecycle activities across those accounts for validation, approval, auditing, and monitoring to manage risk and compliance. An experiment collects multiple runs with the same objective.
An ML engineer deploys the model pipeline into the ML team test environment using a shared services CI/CD process. After stakeholder validation, the ML model is deployed to the team's production environment. ML operations This module helps LOBs and ML engineers work on their dev instances of the model deployment template.
It is ideal for ML engineers, data scientists, and technical leaders, providing real-world training for production-ready generative AI using Amazon Bedrock and cloud-native services.
This post guides you through the steps to set up and deploy Studio to standardize ML model development and collaboration with fellow ML engineers and ML scientists. cdk.json – Contains metadata and feature flags. For more information, see Amazon SageMaker Studio.
Let’s demystify this using the following personas and a real-world analogy:
Data and ML engineers (owners and producers) – They lay the groundwork by feeding data into the feature store
Data scientists (consumers) – They extract and utilize this data to craft their models
Data engineers serve as architects sketching the initial blueprint.
If advertisers do not supply this information, the model will infer it based on information from their product listing on amazon.com. Here, Amazon SageMaker Ground Truth allowed ML engineers to easily build the human-in-the-loop workflow (step v).
Unpickling an object can execute malicious code, so it’s crucial to only unpickle data from trusted sources. Cons of saving ML models with JSON: (1) Because JSON supports only a small number of data types, it may not be compatible with sophisticated machine learning models that employ custom data types.
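The trade-off above can be demonstrated in a few lines. This sketch uses a plain dict of parameters as a stand-in for a model: JSON handles the simple types but rejects a custom object, while pickle handles the object at the cost of being unsafe to load from untrusted sources.

```python
import json
import pickle

# Stand-in "model": a dict of simple parameters (illustrative values).
model_params = {"weights": [0.4, -1.2, 3.1], "bias": 0.7}

# JSON: human-readable and safe to load, but limited to basic types.
restored = json.loads(json.dumps(model_params))
print(restored["bias"])  # 0.7

# JSON cannot serialize arbitrary objects such as a custom class instance.
class CustomLayer:
    pass

try:
    json.dumps({"layer": CustomLayer()})
except TypeError as exc:
    print("JSON failed:", type(exc).__name__)  # JSON failed: TypeError

# pickle handles the custom object, but unpickling untrusted bytes can
# execute arbitrary code -- only load blobs you produced yourself.
blob = pickle.dumps({"layer": CustomLayer()})
assert isinstance(pickle.loads(blob)["layer"], CustomLayer)
```

For real models, framework-native formats (or safer serializers) are usually preferable to raw pickle for exactly this reason.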
This is particularly useful for tracking access to sensitive resources such as personally identifiable information (PII), model updates, and other critical activities, enabling enterprises to maintain a robust audit trail and compliance. For more information, see Monitor Amazon Bedrock with Amazon CloudWatch.
During AWS re:Invent 2022, AWS introduced new ML governance tools for Amazon SageMaker that simplify access control and enhance transparency over your ML projects. For more information about improving governance of your ML models, refer to Improve governance of your machine learning models with Amazon SageMaker.
Came to ML from software. Founded neptune.ai , a modular MLOps component for ML metadata storage, aka “experiment tracker + model registry”. Most of our customers are doing ML/MLOps at a reasonable scale, NOT at the hyperscale of big-tech FAANG companies. – How about the ML engineer? Let me explain.
Planet and AWS’s partnership on geospatial ML SageMaker geospatial capabilities empower data scientists and ML engineers to build, train, and deploy models using geospatial data. It also contains each scene’s metadata, its image ID, and a preview image reference. Shital Dhakal is a Sr.
This post is co-written with Jayadeep Pabbisetty, Sr. Specialist Data Engineering at Merck, and Prabakaran Mathaiyan, Sr. ML Engineer at Tiger Analytics. The large machine learning (ML) model development lifecycle requires a scalable model release process similar to that of software development.
Fine-tuning an LLM can be a complex workflow for data scientists and machine learning (ML) engineers to operationalize. This enables tracking and reproducibility of experiments across different runs, allowing for more informed decision-making about which models perform best on specific tasks or domains.
Earth.com didn’t have an in-house ML engineering team, which made it hard to add new datasets featuring new species, release and improve new models, and scale their disjointed ML system. It also persists a manifest file to Amazon S3, including all necessary information to recreate that dataset version.
Solution overview Ground Truth is a fully self-served and managed data labeling service that empowers data scientists, machine learning (ML) engineers, and researchers to build high-quality datasets. For our example use case, we work with the Fashion200K dataset , released at ICCV 2017.
By introducing IP-Restricted presigned URLs, SageMaker Ground Truth empowers you with greater control over data access, so sensitive information remains accessible only to authorized workers within approved locations. For more information or assistance, contact your AWS account team or visit the SageMaker community forums.
We’ll see how this architecture applies to different classes of ML systems, discuss MLOps and testing aspects, and look at some example implementations. Understanding machine learning pipelines Machine learning (ML) pipelines are a key component of ML systems. But what is an ML pipeline?
For more information on this and on setting up a pull-through cache, see the Neuron Monitor User Guide. He is a rising senior at the University of Pennsylvania pursuing dual Bachelor’s degrees in Computer Information Science and Business Analytics in the Jerome Fisher Management and Technology Program.
This post is co-written with Jad Chamoun, Director of Engineering at Forethought Technologies, Inc. and Salina Wu, Senior ML Engineer at Forethought Technologies, Inc. SupportGPT leverages state-of-the-art Information Retrieval (IR) systems and large language models (LLMs) to power over 30 million customer interactions annually.
ML operations, known as MLOps, focus on streamlining, automating, and monitoring ML models throughout their lifecycle. Data scientists, ML engineers, IT staff, and DevOps teams must work together to operationalize models from research to deployment and maintenance. Add the desired GitHub user names as reviewers.
Amazon SageMaker provides purpose-built tools for machine learning operations (MLOps) to help automate and standardize processes across the ML lifecycle. In this post, we describe how Philips partnered with AWS to develop AI ToolSuite—a scalable, secure, and compliant ML platform on SageMaker.
Reports holistically summarize each evaluation in a human-readable way through natural-language explanations, visualizations, and examples, focusing annotators and data scientists on where to optimize their LLMs and helping them make informed decisions. What is FMEval? We use datasets such as BoolQ, NaturalQuestions, and TriviaQA.
These types of data are historical raw data from an ML perspective. For example, each log is written in the format of timestamp, user ID, and event information. These are all implemented as a single ML pipeline using Amazon SageMaker Pipelines , and all the ML training runs are managed via Amazon SageMaker Experiments.
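A minimal sketch of turning such raw log lines into structured records an ML pipeline can consume. The comma-separated layout and the field values are assumptions based only on the "timestamp, user ID, and event information" description above.

```python
from datetime import datetime

# Hypothetical raw log lines in "timestamp,user ID,event" form.
raw_logs = [
    "2024-05-01T12:00:00Z,user-123,page_view",
    "2024-05-01T12:00:05Z,user-456,add_to_cart",
]

def parse_log(line):
    """Split one raw log line into a structured record."""
    ts, user_id, event = line.split(",", 2)
    return {
        # Normalize the trailing "Z" so fromisoformat yields an aware datetime.
        "timestamp": datetime.fromisoformat(ts.replace("Z", "+00:00")),
        "user_id": user_id,
        "event": event,
    }

records = [parse_log(line) for line in raw_logs]
print(records[1]["event"])  # add_to_cart
```

In a pipeline, a preprocessing step like this would typically run over files in S3 before feature engineering and training.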
Experiment trackers like neptune.ai help data scientists systematically record, catalog, and analyze modeling artifacts and experiment metadata. But as you can imagine, storing all this information in a reliable, accessible, and intuitive way can be difficult and tedious.
You are provided with information about entities the Human mentions, if relevant. A session stores metadata and application-specific data known as session attributes. Ryan Gomes is a Data & ML Engineer with the AWS Professional Services Intelligence Practice. He leads the NYC machine learning and AI meetup.
RC: I have had ML engineers tell me, “You don’t need to do feature selection anymore; you can just throw everything at the model and it will figure out what to keep and what to throw away.” So does that mean feature selection is no longer necessary? If not, when should we consider using feature selection?
MLflow is an open-source platform designed to manage the entire machine learning lifecycle, making it easier for ML engineers, data scientists, software developers, and everyone involved in the process. Machine learning operations (MLOps) are a set of practices that automate and simplify machine learning (ML) workflows and deployments.
By directly integrating with Amazon Managed Service for Prometheus and Amazon Managed Grafana and abstracting the management of hardware failures and job resumption, SageMaker HyperPod allows data scientists and ML engineers to focus on model development rather than infrastructure management. You can find more information on the p4de.24xlarge instance type.
We had historical data on past Facebook ads along with the sales information from Shopify. Based on this information, the ad content could be better adjusted to a given target group, resulting in a two times greater conversion rate of Facebook ads. Fast forward a little bit.
So at some point in your project, you need to store all this information so that you can retrieve it whenever needed. For our project, we created a document containing information about each specific module we worked on: for example, data collection, data preprocessing and exploration, modeling, deployment, and monitoring.
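One lightweight way to keep such a per-module project record retrievable is a structured JSON document. A minimal sketch; the module names, fields, and values below are illustrative, not the project's actual document.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical per-module project record: one entry per lifecycle stage.
project_doc = {
    "data_collection": {"owner": "data-eng", "notes": "weekly batch export"},
    "modeling": {"owner": "ds-team", "framework": "xgboost"},
    "monitoring": {"owner": "mlops", "dashboard": "grafana"},
}

# Persist the document so any team member can retrieve it later.
path = Path(tempfile.gettempdir()) / "project_doc.json"
path.write_text(json.dumps(project_doc, indent=2))

# Retrieve whenever needed.
loaded = json.loads(path.read_text())
print(loaded["modeling"]["framework"])  # xgboost
```

In practice the same structure could live in a shared wiki page or a versioned file in the project repository; the point is that each module's context is recorded once and stays queryable.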