How Kakao Games automates lifetime value prediction from game data using Amazon SageMaker and AWS Glue

AWS Machine Learning Blog

Challenges: In this section, we discuss challenges around various data sources, data drift caused by internal or external events, and solution reusability. For example, Amazon Forecast supports related time series data like weather, prices, economic indicators, or promotions to reflect internal and external related events.

Managing Dataset Versions in Long-Term ML Projects

The MLOps Blog

However, dataset version management can be a pain for maturing ML teams, mainly due to the following:
1. Managing large data volumes without utilizing data management platforms.
2. Ensuring and maintaining high-quality data.
3. Incorporating additional data sources.
4. The time-consuming process of labeling new data points.

Monitoring Your Time Series Model in Comet

Heartbeat

There are several techniques used for model monitoring with time series data, including data drift detection, which involves monitoring the distribution of the input data over time to detect any changes that may impact the model’s performance.
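
As a rough illustration of monitoring an input distribution over time, the sketch below compares a reference window with a recent window using the two-sample Kolmogorov-Smirnov statistic. This is one common choice, not necessarily the method the article or Comet uses; the function names and the 0.3 threshold are assumptions made for the example.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the empirical CDFs of two samples (0.0 to 1.0)."""
    ref = sorted(reference)
    cur = sorted(current)
    if not ref or not cur:
        raise ValueError("both windows must be non-empty")
    max_gap = 0.0
    # Both empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the largest gap.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def has_drifted(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds an assumed, tunable threshold."""
    return ks_statistic(reference, current) > threshold
```

In practice the reference window would typically be the training data for a feature, and the threshold would be tuned per feature or replaced with a significance test.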

Building ML Platform in Retail and eCommerce

The MLOps Blog

As an example for catalogue data, it’s important to check if the set of mandatory fields like product title, primary image, nutritional values, etc. are present in the data. So, we need to build a verification layer that runs based on a set of rules to verify and validate data before preparing it for model training.
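
A rule-based verification layer of this kind could look roughly like the sketch below. The field names (`product_title`, `primary_image`, `nutritional_values`) and both helper functions are hypothetical, chosen only to mirror the mandatory fields mentioned in the excerpt.

```python
# Hypothetical mandatory fields for a catalogue record (assumed names).
MANDATORY_FIELDS = ("product_title", "primary_image", "nutritional_values")


def missing_fields(record, required=MANDATORY_FIELDS):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in required if record.get(f) in (None, "", [], {})]


def split_valid_invalid(records, required=MANDATORY_FIELDS):
    """Separate records that pass the rule from those that fail it.

    Rejected records are paired with the list of fields they are missing,
    so they can be reported or repaired before model training.
    """
    valid, rejected = [], []
    for record in records:
        missing = missing_fields(record, required)
        if missing:
            rejected.append((record, missing))
        else:
            valid.append(record)
    return valid, rejected
```

Only the valid records would then be passed on to data preparation, while the rejected ones carry enough detail to trace back to the source.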

How United Airlines built a cost-efficient Optical Character Recognition active learning pipeline

AWS Machine Learning Blog

This workflow will be foundational to our unstructured data-based machine learning applications as it will enable us to minimize human labeling effort, deliver strong model performance quickly, and adapt to data drift.” – Jon Nelson, Senior Manager of Data Science and Machine Learning at United Airlines.

How to Build an End-To-End ML Pipeline

The MLOps Blog

Data validation: This step collects the transformed data as input and, through a series of tests and validators, ensures that it meets the criteria for the next component. It checks the data for quality issues and detects outliers and anomalies. For example: Is it too large to fit the infrastructure requirements?
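
A minimal sketch of such a validation step is shown below, assuming a single numeric column, a size limit standing in for the infrastructure constraint, and the interquartile-range rule for outliers. The function names, the `max_rows` default, and the choice of the IQR rule are illustrative assumptions, not the article's implementation.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


def validate_batch(values, max_rows=1_000_000):
    """Run the checks and return a list of human-readable problems.

    An empty list means the batch may proceed to the next component.
    """
    problems = []
    if len(values) > max_rows:
        problems.append(f"batch has {len(values)} rows, limit is {max_rows}")
    # Quartiles are not meaningful on very small batches, so skip the check.
    outliers = iqr_outliers(values) if len(values) >= 4 else []
    if outliers:
        problems.append(f"{len(outliers)} outlier(s) detected: {outliers}")
    return problems
```

A pipeline could halt, or route the batch for review, whenever the returned list is non-empty.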
