Data Science, Metadata and ML Engineer - Artificial Intelligence Zone

From Solo Notebooks to Collaborative Powerhouse: VS Code Extensions for Data Science and ML Teams

Towards AI

AUGUST 7, 2024

In this article, we will explore the essential VS Code extensions that enhance productivity and collaboration for data scientists and machine learning (ML) engineers.

Data Science

Data Science ML ML Engineer Data Scientist

Build a robust text-to-SQL solution generating complex queries, self-correcting, and querying diverse data sources

AWS Machine Learning Blog

FEBRUARY 28, 2024

Structured Query Language (SQL) is a complex language that requires an understanding of databases and metadata. The solution in this post aims to bring enterprise analytics operations to the next level by shortening the path to your data using natural language. Today, generative AI can enable people without SQL knowledge.

Metadata

Metadata Generative AI LLM NLP

Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 20, 2023

Data scientists search and pull features from the central feature store catalog, build models through experiments, and select the best model for promotion. Data scientists create and share new features into the central feature store catalog for reuse.

ML

ML Data Scientist ML Engineer Data Science

Data4ML Preparation Guidelines (Beyond The Basics)

Towards AI

NOVEMBER 8, 2024

Data preparation isn’t just a part of the ML engineering process; it’s the heart of it. To set the stage, let’s examine the nuances between research-phase data and production-phase data. Writing Output: Centralizing data into a structure, like a delta table.

Data Ingestion

Data Ingestion Metadata ML Engineer ML

MLOps Landscape in 2023: Top Tools and Platforms

The MLOps Blog

JUNE 27, 2023

With built-in components and integration with Google Cloud services, Vertex AI simplifies the end-to-end machine learning process, making it easier for data science teams to build and deploy models at scale. Metaflow: Metaflow helps data scientists and machine learning engineers build, manage, and deploy data science projects.

Machine Learning

Machine Learning Metadata Data Scientist Data Quality

Machine Learning Engineering in the Real World

ODSC - Open Data Science

SEPTEMBER 21, 2023

Secondly, to be a successful ML engineer in the real world, you cannot just understand the technology; you must understand the business. We should start by considering the broad elements that should constitute any ML solution, as indicated in the following diagram: Figure 1.2:

Machine Learning

Machine Learning ML Engineer ML Data Science

Fine tune a generative AI application for Amazon Bedrock using Amazon SageMaker Pipeline decorators

AWS Machine Learning Blog

AUGUST 22, 2024

It automatically keeps track of model artifacts, hyperparameters, and metadata, helping you to reproduce and audit model versions. The SageMaker Pipelines decorator feature helps convert local ML code written as a Python program into one or more pipeline steps. SageMaker Pipelines can handle model versioning and lineage tracking.
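
As a rough illustration of the decorator idea described in this excerpt, the sketch below uses plain Python rather than the SageMaker SDK. The `Pipeline` class, its `step` decorator, and the `preprocess` and `train` functions are hypothetical stand-ins, not the SageMaker Pipelines API; they only show how decorating ordinary functions can register them as ordered pipeline steps while leaving them callable locally.

```python
from typing import Any, Callable, List


class Pipeline:
    """Hypothetical, minimal pipeline: an ordered list of registered steps."""

    def __init__(self) -> None:
        self.steps: List[Callable[[Any], Any]] = []

    def step(self, func: Callable[[Any], Any]) -> Callable[[Any], Any]:
        # Register the function as a step and return it unchanged,
        # so it can still be called locally like normal Python code.
        self.steps.append(func)
        return func

    def run(self, data: Any) -> Any:
        # Feed each step's output into the next step, in registration order.
        for func in self.steps:
            data = func(data)
        return data


pipeline = Pipeline()


@pipeline.step
def preprocess(rows: List[float]) -> List[float]:
    # Drop negative values (an illustrative cleaning rule, assumed here).
    return [r for r in rows if r >= 0]


@pipeline.step
def train(rows: List[float]) -> float:
    # Stand-in for model training: the "model" is just the mean.
    return sum(rows) / len(rows)


if __name__ == "__main__":
    print(pipeline.run([1.0, -2.0, 3.0]))
```

In this toy version the steps run in-process; a managed service would instead package each decorated function and execute it remotely, which is outside the scope of this sketch.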

Generative AI

Generative AI Metadata Python ML

First ODSC Europe 2023 Sessions Announced

ODSC - Open Data Science

MARCH 27, 2023

ML Governance: A Lean Approach. Ryan Dawson | Principal Data Engineer | Thoughtworks; Meissane Chami | Senior ML Engineer | Thoughtworks. During this session, you’ll discuss the day-to-day realities of ML Governance. Some of the questions you’ll explore include: How much documentation is appropriate?

Machine Learning

Machine Learning Data Science Data Ingestion Deep Learning

Philips accelerates development of AI-enabled healthcare solutions with an MLOps platform built on Amazon SageMaker

AWS Machine Learning Blog

NOVEMBER 16, 2023

Amazon SageMaker provides purpose-built tools for machine learning operations (MLOps) to help automate and standardize processes across the ML lifecycle. In this post, we describe how Philips partnered with AWS to develop AI ToolSuite—a scalable, secure, and compliant ML platform on SageMaker.

Data Scientist

Data Scientist ML Data Science Automation

Amazon SageMaker Feature Store now supports cross-account sharing, discovery, and access

AWS Machine Learning Blog

FEBRUARY 13, 2024

Let’s demystify this using the following personas and a real-world analogy: Data and ML engineers (owners and producers) – They lay the groundwork by feeding data into the feature store. Data scientists (consumers) – They extract and utilize this data to craft their models. Data engineers serve as architects sketching the initial blueprint.

ML

ML Machine Learning ML Engineer Data Scientist

Best practices for Amazon SageMaker HyperPod task governance

AWS Machine Learning Blog

FEBRUARY 19, 2025

In this post, we provide best practices to maximize the value of SageMaker HyperPod task governance and make the administration and data science experiences seamless. Access control: When working with SageMaker HyperPod task governance, data scientists will assume their specific role.

Data Scientist

Data Scientist Data Science ML Engineer Generative AI

Build an Amazon SageMaker Model Registry approval and promotion workflow with human intervention

AWS Machine Learning Blog

JANUARY 10, 2024

This post is co-written with Jayadeep Pabbisetty, Sr. Specialist Data Engineering at Merck, and Prabakaran Mathaiyan, Sr. ML Engineer at Tiger Analytics. The large machine learning (ML) model development lifecycle requires a scalable model release process similar to that of software development.

ML

ML Machine Learning Data Scientist ETL

Customized model monitoring for near real-time batch inference with Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 28, 2024

You can use this framework as a starting point to monitor your custom metrics or handle other unique requirements for model quality monitoring in your AI/ML applications. Data Scientist at AWS, bringing a breadth of data science, ML engineering, MLOps, and AI/ML architecting to help businesses create scalable solutions on AWS.

ML

ML Metadata Data Scientist DevOps

Use Amazon SageMaker Model Card sharing to improve model governance

AWS Machine Learning Blog

AUGUST 31, 2023

Model cards are intended to be a single source of truth for business and technical metadata about the model that can reliably be used for auditing and documentation purposes. Depending on your governance requirements, Data Science & Dev accounts can be merged into a single AWS account.

ML

ML Data Scientist Machine Learning Data Science

MLOps Is an Extension of DevOps. Not a Fork — My Thoughts on THE MLOPS Paper as an MLOps Startup CEO

The MLOps Blog

JANUARY 23, 2023

Came to ML from software. Founded neptune.ai, a modular MLOps component for ML metadata store, aka “experiment tracker + model registry”. Most of our customers are doing ML/MLOps at a reasonable scale, NOT at the hyperscale of big-tech FAANG companies. How about the ML engineer? Let me explain.

DevOps

DevOps Metadata Software Engineer Data Scientist

Introducing the Topic Tracks for ODSC East 2025: Spotlight on Gen AI, AI Agents, LLMs, & More

ODSC - Open Data Science

FEBRUARY 25, 2025

Topics Include: Agentic AI Design Patterns, LLMs & RAG for Agents, Agent Architectures & Chaining, Evaluating AI Agent Performance, Building with LangChain and LlamaIndex, Real-World Applications of Autonomous Agents. Who Should Attend: Data Scientists, Developers, AI Architects, and ML Engineers seeking to build cutting-edge autonomous systems.

Data Scientist

Data Scientist Large Language Models Machine Learning ML Engineer

Bundesliga Match Fact Ball Recovery Time: Quantifying teams’ success in pressing opponents on AWS

AWS Machine Learning Blog

MARCH 30, 2023

This allows for seamless communication of positional data and various outputs of Bundesliga Match Facts between containers in real time. The match-related data is collected and ingested using DFL’s DataHub. Both the Lambda function and the Fargate container publish the data for further consumption in the relevant MSK topics.

Machine Learning

Machine Learning Data Scientist Data Science Metadata

Achieve operational excellence with well-architected generative AI solutions using Amazon Bedrock

AWS Machine Learning Blog

OCTOBER 2, 2024

Additionally, you can enable model invocation logging to collect invocation logs, full request response data, and metadata for all Amazon Bedrock model API invocations in your AWS account. Leveraging her expertise in Computer Vision and Deep Learning, she empowers customers to harness the power of ML in the AWS cloud efficiently.

Generative AI

Generative AI Data Ingestion AI

Build an end-to-end MLOps pipeline using Amazon SageMaker Pipelines, GitHub, and GitHub Actions

AWS Machine Learning Blog

DECEMBER 13, 2023

ML operations, known as MLOps, focus on streamlining, automating, and monitoring ML models throughout their lifecycle. Data scientists, ML engineers, IT staff, and DevOps teams must work together to operationalize models from research to deployment and maintenance.

ML

ML Automation Metadata Software Development

How Kakao Games automates lifetime value prediction from game data using Amazon SageMaker and AWS Glue

AWS Machine Learning Blog

MARCH 1, 2023

Solution overview: The ML solution for LTV forecasting is composed of four components: the training dataset ETL pipeline, MLOps pipeline, inference dataset ETL pipeline, and ML batch inference. ML engineers no longer need to manage this training metadata separately.

Automation

Automation ETL Data Drift ML

MLflow: Simplifying Machine Learning Experimentation

Viso.ai

MARCH 29, 2024

MLflow is an open-source platform designed to manage the entire machine learning lifecycle, making it easier for ML Engineers, Data Scientists, Software Developers, and everyone involved in the process. MLOps aims to automate and operationalize ML models, enabling smoother transitions to production and deployment.

Machine Learning

Machine Learning ML Automation Data Scientist

Learnings From Building the ML Platform at Mailchimp

The MLOps Blog

OCTOBER 3, 2023

So I was able to get from growth hacking to data analytics, then data analytics to data science, and then data science to MLOps. I switched from analytics to data science, then to machine learning, then to data engineering, then to MLOps. How do I get this model in production?

ML

ML Data Scientist Machine Learning Data Science

Learnings From Building the ML Platform at Stitch Fix

The MLOps Blog

AUGUST 3, 2023

This is Piotr Niedźwiedź and Aurimas Griciūnas from neptune.ai , and you’re listening to ML Platform Podcast. Stefan is a software engineer, data scientist, and has been doing work as an ML engineer. As you’ve been running the ML data platform team, how do you do that? Stefan: Yeah.

ML

ML Data Scientist Software Engineer Machine Learning

FMOps/LLMOps: Operationalize generative AI and differences with MLOps

AWS Machine Learning Blog

SEPTEMBER 1, 2023

These data owners are focused on providing access to their data to multiple business units or teams. Data science team – Data scientists need to focus on creating the best model based on predefined key performance indicators (KPIs) working in notebooks.

Generative AI

Generative AI Prompt Engineering Prompt Engineer ML

How to Build an End-To-End ML Pipeline

The MLOps Blog

MAY 9, 2023

One of the most prevalent complaints we hear from ML engineers in the community is how costly and error-prone it is to manually go through the ML workflow of building and deploying models. Building end-to-end machine learning pipelines lets ML engineers build once, rerun, and reuse many times. Kale v0.7.0.

ML

ML Machine Learning Metadata Data Science

Centralize model governance with SageMaker Model Registry Resource Access Manager sharing

AWS Machine Learning Blog

NOVEMBER 14, 2024

However, model governance functions in an organization are centralized and to perform those functions, teams need access to metadata about model lifecycle activities across those accounts for validation, approval, auditing, and monitoring to manage risk and compliance. An experiment collects multiple runs with the same objective.

ML

ML Machine Learning Auto-complete Auto-classification

Definite Guide to Building a Machine Learning Platform

The MLOps Blog

MARCH 21, 2023

From gathering and processing data to building models through experiments, deploying the best ones, and managing them at scale for continuous value in production—it’s a lot. As the number of ML-powered apps and services grows, it gets overwhelming for data scientists and ML engineers to build and deploy models at scale.

Machine Learning

Machine Learning Data Scientist ML Metadata

How Thomson Reuters built an AI platform using Amazon SageMaker to accelerate delivery of ML projects

AWS Machine Learning Blog

JANUARY 13, 2023

Enabling such a secure, compliant environment in the cloud within minutes relieves data scientists from the burden of handling cloud infrastructure, networking requirements, and security standards measures, to focus instead on the data science problem. The following diagram illustrates this architecture. Model deployment.

ML

ML Data Scientist Machine Learning Metadata

Bring SageMaker Autopilot into your MLOps processes using a custom SageMaker Project

AWS Machine Learning Blog

JUNE 14, 2023

SageMaker Projects helps organizations set up and standardize environments for automating different steps involved in an ML lifecycle. Although notebooks are helpful for model building and experimentation, a team of data scientists and ML engineers sharing code need a more scalable way to maintain code consistency and strict version control.

ML

ML Data Scientist Automation DevOps

Use Kubernetes Operators for new inference capabilities in Amazon SageMaker that reduce LLM deployment costs by 50% on average

AWS Machine Learning Blog

APRIL 19, 2024

environment:
  HF_MODEL_ID: databricks/dolly-v2-7b
  HF_TASK: text-generation
apiVersion: sagemaker.services.k8s.aws/v1alpha1
kind: Model
metadata:
  name: flan-t5-xxl
spec:
  modelName: flan-t5-xxl
  executionRoleARN:
  containers:
    - image: 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.0.1-tgi0.9.3-gpu-py39-cu118-ubuntu20.04

Metadata

Metadata LLM Software Development Machine Learning

Improve governance of models with Amazon SageMaker unified Model Cards and Model Registry

AWS Machine Learning Blog

NOVEMBER 13, 2024

You can now register machine learning (ML) models in Amazon SageMaker Model Registry with Amazon SageMaker Model Cards , making it straightforward to manage governance information for specific model versions directly in SageMaker Model Registry in just a few clicks.

Metadata

Metadata ML Software Engineer Machine Learning
