Axfood has a structure with multiple decentralized data science teams, each with its own area of responsibility. Together with a central data platform team, the data science teams bring innovation and digital transformation to the organization through AI and ML solutions.
Crucially, the insurance sector is a financially regulated industry where the transparency, explainability, and auditability of algorithms are of key importance to the regulator. Usage risk (inaccuracy): the performance of an AI system depends heavily on the data from which it learns.
This includes features for model explainability, fairness assessment, privacy preservation, and compliance tracking. With built-in components and integration with Google Cloud services, Vertex AI simplifies the end-to-end machine learning process, making it easier for data science teams to build and deploy models at scale.
Data Quality: Now that you’ve learned more about your data and cleaned it up, it’s time to ensure its quality is up to par. With these data exploration tools, you can determine whether your data is accurate, consistent, and reliable.
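As a hedged illustration of what such checks can look like, the sketch below runs a few basic data-quality assertions with pandas; the dataset and column names are hypothetical, not from the article.

```python
# Basic data-quality checks: completeness, uniqueness, and validity.
import pandas as pd

df = pd.DataFrame({
    "user_id": [1, 2, 2, 4],
    "age": [34, -1, 27, 58],  # -1 is an invalid value
    "signup": pd.to_datetime(["2024-01-05", "2024-02-10",
                              "2024-02-10", "2023-12-31"]),
})

report = {
    "missing_values": int(df.isna().sum().sum()),             # completeness
    "duplicate_ids": int(df["user_id"].duplicated().sum()),   # uniqueness
    "invalid_ages": int((~df["age"].between(0, 120)).sum()),  # validity
}
print(report)  # {'missing_values': 0, 'duplicate_ids': 1, 'invalid_ages': 1}
```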
Networking: Always a highlight and crowd-pleaser of ODSC conferences, the networking events Monday through Wednesday were well deserved after long days of data science training sessions. You can also get data science training on demand wherever you are with our Ai+ Training platform. Register now before ticket prices go up!
To explain this limitation, it is important to understand that the chemistry of sensory-based products is largely focused on quality control, i.e., how much of a given analyte is in a given mixture. When it comes to data quality, we realized a valid training set could not be generated from existing commercial or crowd-sourced data.
MLOps practitioners have many options for establishing an MLOps platform; one of them is cloud-based integrated platforms that scale with data science teams. TWCo was looking to scale its ML operations with more transparency and less complexity, allowing for more manageable ML workflows as its data science team grew.
A great deal of effort is spent organizing data and creating reliable metrics the business can use to make better decisions. This creates a daunting backlog of data quality improvements and, sometimes, a graveyard of unused dashboards that have not been updated in years. Let’s start with an example.
If the test or validation data distribution deviates too much from the training data distribution, the model must be retrained, since this is a sign of population drift. Model Interpretability and Explainability: model interpretability and explainability describe how a machine learning model arrives at its predictions or decisions.
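A minimal sketch of how such a drift check might look in practice, assuming numeric features; the two-sample Kolmogorov-Smirnov test and the 0.01 threshold are illustrative choices, not the article’s own method.

```python
# Population-drift check: compare each feature's training and serving
# distributions with a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

def drifted_features(train: np.ndarray, test: np.ndarray, alpha: float = 0.01):
    """Return indices of columns whose distribution shifted (p < alpha)."""
    drifted = []
    for col in range(train.shape[1]):
        stat, p_value = ks_2samp(train[:, col], test[:, col])
        if p_value < alpha:  # reject "same distribution" -> possible drift
            drifted.append(col)
    return drifted

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(5000, 3))
test = train.copy()
test[:, 1] += 0.5  # simulate one shifted feature
print(drifted_features(train, test))  # -> [1], a signal to retrain
```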
Transparency and explainability: making sure that AI systems are transparent, explainable, and accountable. This includes processes for monitoring model performance, managing risks, ensuring data quality, and maintaining transparency and accountability throughout the model’s lifecycle.
At Astronomer, he spearheads the creation of Apache Airflow features specifically designed for ML and AI teams and oversees the internal data science team. Can you share some information about your journey in data science and AI, and how it has shaped your approach to leading engineering and analytics teams?
I am often asked by prospective clients to explain the artificial intelligence (AI) software process, and I have recently been asked by managers with extensive software development and data science experience who wanted to implement MLOps.
As organisations increasingly rely on data-driven insights, effective ETL processes ensure data integrity and quality, enabling informed decision-making. ETL facilitates Data Analytics by transforming raw data into meaningful insights, empowering businesses to uncover trends, track performance, and make strategic decisions.
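To make the ETL idea concrete, here is a minimal, self-contained sketch in pandas; the inline records, table names, and SQLite target are hypothetical stand-ins for a real source and warehouse.

```python
# Minimal ETL sketch: extract raw records, transform them into a clean
# daily-revenue table, and load the result into SQLite for analytics.
import pandas as pd
import sqlite3

# Extract: in practice this would be pd.read_csv(...) or an API call;
# an inline frame stands in here so the sketch runs on its own.
raw = pd.DataFrame({
    "order_id": [1, 2, 3, None],
    "order_date": pd.to_datetime(["2024-03-01", "2024-03-01",
                                  "2024-03-02", "2024-03-02"]),
    "amount": [120.0, -5.0, 80.0, 60.0],
})

# Transform: drop incomplete or invalid rows, then aggregate to daily revenue.
clean = raw.dropna(subset=["order_id", "amount"])
clean = clean[clean["amount"] > 0]
daily = (clean.groupby(clean["order_date"].dt.date)["amount"]
              .sum().reset_index(name="revenue"))

# Load: write the curated table to a warehouse (SQLite stands in here).
with sqlite3.connect("warehouse.db") as conn:
    daily.to_sql("daily_revenue", conn, if_exists="replace", index=False)
```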
With the advent of big data in the modern world, RTOS is becoming increasingly important. As software expert Tim Mangan explains, a purpose-built real-time OS is more suitable for apps that involve heavy data processing. The Big Data and RTOS connection: IoT and embedded devices are among the biggest sources of big data.
In this article, we will delve into the concept of data hygiene, its best practices and key features, and the benefits it offers to businesses. Data hygiene involves validating, cleaning, and enriching data to ensure its accuracy, completeness, and relevance. Large datasets may require significant processing time.
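A minimal sketch of those three steps (validate, clean, enrich) in pandas; the customer records and the country lookup table are hypothetical.

```python
# Data hygiene in three steps: validate, clean, enrich.
import pandas as pd

customers = pd.DataFrame({
    "email": ["a@example.com", "bad-email", "a@example.com", None],
    "country_code": ["SE", "US", "SE", "DE"],
})

# Validate: keep only rows with a plausible e-mail address.
valid = customers[customers["email"].str.contains("@", na=False)]

# Clean: drop exact duplicates.
clean = valid.drop_duplicates()

# Enrich: join a hypothetical reference table of country names.
countries = pd.DataFrame({"country_code": ["SE", "US", "DE"],
                          "country": ["Sweden", "United States", "Germany"]})
enriched = clean.merge(countries, on="country_code", how="left")
print(enriched)
```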
Evaluate the computing resources and development environment that the data science team will need. Large projects, or those involving text, images, or streaming data, may need specialized infrastructure. Data aggregation, such as from hourly to daily or from daily to weekly time steps, may also be required.
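For the aggregation step, a small pandas sketch; the synthetic hourly series below stands in for real sensor or demand data.

```python
# Aggregating a time series from hourly to daily, then daily to weekly.
import numpy as np
import pandas as pd

hourly = pd.Series(
    np.random.default_rng(1).poisson(10, size=24 * 7),
    index=pd.date_range("2024-01-01", periods=24 * 7, freq="h"),
)

daily = hourly.resample("D").sum()    # hourly -> daily totals
weekly = daily.resample("W").mean()   # daily -> weekly averages
print(daily.head())
print(weekly)
```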
In this post, we show how to configure a new OAuth-based authentication feature for using Snowflake in Amazon SageMaker Data Wrangler. Snowflake is a cloud data platform that provides data solutions ranging from data warehousing to data science. Data Wrangler creates the report from the sampled data.
Addressing the Challenges of Generative AI: Data Quality, Governance, and Compliance. One of the major hurdles businesses face when adopting generative AI is data quality. Yves Mulkers stressed the need for clean, reliable data as a foundation for AI success.
Ensuring data quality, governance, and security may slow down or stall ML projects. Data engineering – Identifies the data sources, sets up data ingestion and pipelines, and prepares data using Data Wrangler, then conducts exploratory analysis and data preparation.
To achieve the trust, quality, and reliability necessary for production applications, enterprise data science teams must develop proprietary data for use with specialized models. Data scientists can best improve LLM performance on specific tasks by feeding them the right data prepared in the right way.
Data Science is the process of collecting, analysing, and interpreting large volumes of data to solve complex business problems. A Data Scientist is responsible for analysing and interpreting the data, ensuring it provides valuable insights that help in decision-making.
This architecture design represents a multi-account strategy where ML models are built, trained, and registered in a central model registry within a data science development account (which has more controls than a typical application development account). The following figure depicts a successful run of the training pipeline.
This blog discusses best practices, real-world use cases, security and privacy considerations, and how Data Scientists can use ChatGPT to its full potential. Machine Learning Models: How Data Scientists Use ChatGPT. Data Scientists use ChatGPT as a powerful ally in the ever-evolving field of Data Science.
LLM distillation will become a much more common and important practice for data science teams in 2024, according to a poll of attendees at Snorkel AI’s 2023 Enterprise LLM Virtual Summit. As data science teams reorient around the enduring value of small, deployable models, they’re also learning how LLMs can accelerate data labeling.
The Tangent Information Modeler: Time Series Modeling Reinvented. Philip Wauters | Customer Success Manager and Value Engineer | Tangent Works. Existing techniques for modeling time series data face limitations in scalability, agility, explainability, and accuracy. LLMs in Data Analytics: Can They Match Human Precision?
If you want an overview of the machine learning process, it can be categorized into three broad buckets. Collection of Data: collecting relevant data is key to building a machine learning model, and it isn’t easy to collect a good amount of quality data. You need to know two basic terminologies here: features and labels.
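A tiny sketch to make the two terms concrete; the house-price numbers are invented for illustration.

```python
# Features (model inputs) and labels (the target the model predicts),
# shown with a hypothetical house-price dataset.
import numpy as np
from sklearn.linear_model import LinearRegression

# Features: each row is one house -> [square meters, number of rooms].
X = np.array([[55, 2], [72, 3], [120, 5], [90, 4]])
# Labels: the value we want the model to learn to predict (price).
y = np.array([210_000, 275_000, 480_000, 350_000])

model = LinearRegression().fit(X, y)
print(model.predict([[80, 3]]))  # predicted price for an unseen house
```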
Third-party FMs are expensive to use at scale, and funneling consumer financial data through an open-access foundation model raises serious privacy concerns. Hosting a foundation model or building one from scratch is also no small feat; their massive sizes necessitate enormous computing and data science resources.
The following are some of the primary difficulties for deep learning in software development. Data Quality and Quantity: deep learning models need a lot of labeled, high-quality training data. To prevent biases and overfitting, it is also essential to ensure the data’s diversity and representativeness.
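One quick, hedged proxy for representativeness (my own illustration, not the article’s method) is to inspect label balance before training; the labels below are synthetic.

```python
# Checking label balance as a quick proxy for dataset representativeness.
from collections import Counter

labels = ["cat"] * 900 + ["dog"] * 80 + ["bird"] * 20  # synthetic labels
counts = Counter(labels)
total = sum(counts.values())

for cls, n in counts.most_common():
    print(f"{cls:>5}: {n:4d} ({n / total:.1%})")
# A heavily skewed split like this suggests resampling or reweighting
# before training to avoid a biased model.
```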
By simplifying time series forecasting models and accelerating the AI lifecycle, DataRobot can centralize collaboration across the business, especially between data science and IT teams, and maximize ROI. Prepare your data for time series forecasting. The model training process is not a black box; it includes trust and explainability.
Chip Huyen began by explaining how AI engineering has emerged as a distinct discipline, evolving out of traditional machine learning engineering. This shift has made AI engineering more multidisciplinary, incorporating elements of data science, software engineering, and system design. Focus on data quality over quantity.
All models built within DataRobot MLOps support ethical AI through configurable bias monitoring and are fully explainable and transparent. The built-in data quality assessments and visualization tools result in equitable, fair models that minimize the potential for harm, along with world-class data drift, service health, and accuracy tracking.
Bioinformatics: A Haven for Data Scientists and Machine Learning Engineers. Bioinformatics offers an unparalleled opportunity for data scientists and machine learning engineers to apply their expertise in solving complex biological problems.
If you’re an aspiring Data Science professional, data visualisation will be part of your job role in presenting insights in a visually understandable format. However, if you’re a beginner in the field, you should undertake a beginner-level Data Visualisation course.
It also enables you to evaluate the models using advanced metrics, as if you were a data scientist. We explain the metrics and show techniques to deal with data to obtain better model performance. Quick model is useful when iterating, to more quickly understand the impact of data changes on your model accuracy.
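As a generic, hedged illustration of such advanced metrics (computed here with scikit-learn on synthetic predictions, not with the product’s own tooling):

```python
# Advanced classification metrics on synthetic predictions.
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]                     # hard class predictions
y_score = [0.2, 0.9, 0.4, 0.1, 0.8, 0.6, 0.7, 0.95]   # predicted probabilities

print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
print("f1:       ", f1_score(y_true, y_pred))
print("roc_auc:  ", roc_auc_score(y_true, y_score))
```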
Innovation and New Opportunities: by analyzing data, organizations can uncover new opportunities for innovation and growth. Types of Analytics: Descriptive Analytics, Explained with an Example. Descriptive Analytics summarises and interprets historical data, helping to gain insights into past performance.
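A minimal descriptive-analytics sketch with pandas, summarising a hypothetical sales history rather than any real dataset:

```python
# Descriptive analytics: summarising historical data with pandas.
import pandas as pd

sales = pd.DataFrame({
    "month": ["Jan", "Jan", "Feb", "Feb", "Mar", "Mar"],
    "region": ["North", "South", "North", "South", "North", "South"],
    "revenue": [120, 95, 140, 90, 160, 110],
})

# Summary statistics over the whole history.
print(sales["revenue"].describe())

# Past performance broken down by month and region.
print(sales.pivot_table(values="revenue", index="month",
                        columns="region", aggfunc="sum"))
```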
As you can imagine, data science is a pretty loose term, a big-tent idea overall. Though just about every industry imaginable utilizes the skills of a data-focused professional, each has its own challenges, needs, and desired outcomes. What makes this job title unique is the “Swiss army knife” approach to data.
Revolutionizing Healthcare through Data Science and Machine Learning. Introduction: in the era of digital transformation, healthcare is experiencing a paradigm shift driven by the integration of data science, machine learning, and information technology.
Snorkel AI provides a data-centric AI development platform for AI teams to unlock production-grade model quality and accelerate time-to-value for their investments. Seldon is a deployment solution that helps teams serve, monitor, explain, and manage their ML models in production.
Q1: What are the two main focuses of data science? A1: The two main focuses of data science are Velocity and Variety, which are characteristics of Big Data. Velocity refers to the increasing rate at which data is collected and obtained, while Variety refers to the different types and sources of data.
Suddenly, non-technical users witnessed the LLM-backed chatbot’s ability to regurgitate knowledge, explain jokes, and write poems. “When models are pretrained, data is the main means for customization and fine-tuning of the models,” Gartner® said. The data-centric philosophy goes well beyond the point of training a model.