Data Platform, Information and Metadata - Artificial Intelligence Zone

Data platform trinity: Competitive or complementary?

IBM Journey to AI blog

JANUARY 18, 2023

Data platform architecture has an interesting history. A read-optimized platform that can integrate data from multiple applications emerged. In another decade, the internet and mobile started to generate data of unforeseen volume, variety and velocity. It required a different data platform solution.

Data Platform

Data Platform ETL Metadata Data Discovery

Manage access controls in generative AI-powered search applications using Amazon OpenSearch Service and Amazon Cognito

Flipboard

NOVEMBER 19, 2024

Solution overview: By combining the powerful vector search capabilities of OpenSearch Service with the access control features provided by Amazon Cognito, this solution enables organizations to manage access controls based on custom user attributes and document metadata. For more information, see Getting started with the AWS CDK.

Generative AI

Generative AI Metadata Robotics LLM
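
The access-control idea in the entry above (matching custom user attributes against document metadata) can be illustrated with a minimal sketch. This is plain Python, not the OpenSearch Service or Amazon Cognito API; the `Document` class, the `access` metadata key, and the attribute names are all hypothetical, and a real deployment would more likely apply such a filter inside the search engine than in application code.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # Hypothetical layout: metadata["access"] maps an attribute name
    # to the set of attribute values allowed to see the document.
    metadata: dict = field(default_factory=dict)


def is_visible(user_attributes: dict, doc: Document) -> bool:
    """Return True only if the user satisfies every access rule on the document."""
    required = doc.metadata.get("access", {})
    # A document with no access rules is treated as public (an assumption).
    return all(user_attributes.get(key) in allowed for key, allowed in required.items())


def filter_results(user_attributes: dict, results: list) -> list:
    """Drop search hits the user is not entitled to see."""
    return [doc for doc in results if is_visible(user_attributes, doc)]


docs = [
    Document("d1", "Q3 revenue summary", {"access": {"department": {"finance"}}}),
    Document("d2", "Public holiday calendar"),
    Document(
        "d3",
        "Clinical trial notes",
        {"access": {"department": {"research"}, "clearance": {"high"}}},
    ),
]

user = {"department": "finance", "clearance": "low"}
visible_ids = [doc.doc_id for doc in filter_results(user, docs)]
print(visible_ids)
```

Keeping the rules in document metadata means a new restriction can be added by re-tagging documents rather than changing the filtering code.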

Google AI Introduces Croissant: A Metadata Format for Machine Learning-Ready Datasets

Marktechpost

MARCH 12, 2024

Database metadata can be expressed in various formats, including schema.org and DCAT. Unfortunately, these formats weren’t made with machine learning data in mind. Google has recently introduced Croissant, a new format for metadata in ML-ready datasets. Users can then publish their datasets.

Metadata

Metadata Machine Learning ML Data Discovery

Ken Claffey, CEO of VDURA – Interview Series

Unite.AI

FEBRUARY 6, 2025

Throughout my career, I've been building and refining this unique combination of technical and business insights, which continues to inform my approach to innovation in the industry. Can you share the story behind the creation of the VDURA Data Platform and the key challenges you aimed to address in the AI and HPC landscape?

Data Platform

Data Platform Data Integration Metadata AI

How the right data and AI foundation can empower a successful ESG strategy

IBM Journey to AI blog

APRIL 10, 2023

Everyone would be using the same data set to make informed decisions which may range from goal setting to prioritizing investments in sustainability. Data fabric can help model, integrate and query data sources, build data pipelines, integrate data in near real-time, and run AI-driven applications.

ESG

ESG Metadata AI AI

US Open heralds new era of fan engagement with watsonx and generative AI

IBM Journey to AI blog

AUGUST 17, 2023

Year after year, IBM Consulting works with the United States Tennis Association (USTA) to transform massive amounts of data into meaningful insight for tennis fans. This year, the USTA is using watsonx, IBM’s new AI and data platform for business.

Generative AI

Generative AI Metadata AI AI

Advancing AI trust with new responsible AI tools, capabilities, and resources

AWS Machine Learning Blog

DECEMBER 5, 2024

Automated Reasoning checks help prevent factual errors from hallucinations using sound mathematical, logic-based algorithmic verification and reasoning processes to verify the information generated by a model, so outputs align with provided facts and aren't based on hallucinated or inconsistent data.

Responsible AI

Responsible AI AI Tools AI AI

AI and the future of unstructured data

IBM Journey to AI blog

OCTOBER 14, 2024

Donahue: At the enterprise or company level, “good” data is clean, structured and enriched. This preprocessing pipeline should minimize information loss between the original content and the LLM-ready version. You may ask, “What does that have to do with unstructured data?”

Business Intelligence

Business Intelligence AI AI Data Science

Exploring the AI and data capabilities of watsonx

IBM Journey to AI blog

JULY 17, 2023

These encoder-only architecture models are fast and effective for many enterprise NLP tasks, such as classifying customer feedback and extracting information from large documents. While they require task-specific labeled data for fine tuning, they also offer clients the best cost performance trade-off for non-generative use cases.

Machine Learning

Machine Learning Metadata Automation AI

How Can The Adoption of a Data Platform Simplify Data Governance For An Organization?

Pickl AI

APRIL 14, 2023

Falling into the wrong hands can lead to the illicit use of this data. Hence, adopting a Data Platform that assures complete data security and governance for an organization becomes paramount. In this blog, we are going to discuss more on What are Data platforms & Data Governance.

Data Platform

Data Platform Data Integration Data Ingestion Automation

18 Data Profiling Tools Every Developer Must Know

Marktechpost

JUNE 5, 2024

Data profiling is a crucial tool for evaluating data quality. It entails analyzing, cleansing, transforming, and modeling data to find valuable information, improve data quality, and assist in better decision-making. What is Data Profiling?

Data Quality

Data Quality Metadata Data Integration ETL

AI that’s ready for business starts with data that’s ready for AI

IBM Journey to AI blog

JULY 3, 2024

Open is creating a foundation for storing, managing, integrating and accessing data built on open and interoperable capabilities that span hybrid cloud deployments, data storage, data formats, query engines, governance and metadata.

Data Quality

Data Quality Metadata Business Intelligence AI

Discover insights from Box with the Amazon Q Box connector

AWS Machine Learning Blog

AUGUST 8, 2024

However, extracting meaningful information from the vast amount of Box data can be challenging without the right tools and capabilities. This enables you to quickly understand the main points and find relevant information in your documents without having to scan through individual document descriptions manually.

Metadata

Metadata Generative AI ML IDP

Ground truth generation and review best practices for evaluating generative AI question-answering with FMEval

AWS Machine Learning Blog

MARCH 5, 2025

To ensure the highest quality measurement of your question answering application against ground truth, the evaluation metrics implementation must inform ground truth curation. By following these guidelines, data teams can implement high fidelity ground truth generation for question-answering use case evaluation with FMEval.

Generative AI

Generative AI LLM AI AI

Principal Financial Group uses AWS Post Call Analytics solution to extract omnichannel customer insights

AWS Machine Learning Blog

NOVEMBER 15, 2023

Content redaction: Each customer audio interaction is recorded as a stereo WAV file, but could potentially include sensitive information such as HIPAA-protected and personally identifiable information (PII). Scalability: This architecture needed to immediately scale to thousands of calls per day and millions of calls per year.

Data Ingestion

Data Ingestion Metadata NLP Data Scientist

Demand forecasting at Getir built with Amazon Forecast

AWS Machine Learning Blog

MAY 15, 2023

Among those algorithms, deep/neural networks are more suitable for e-commerce forecasting problems as they accept item metadata features, forward-looking features for campaign and marketing activities, and – most importantly – related time series features. We are able to forecast over 10,000 SKUs daily in all the countries we serve.

Neural Network

Neural Network Convolutional Neural Networks Metadata Data Scientist

How to Build Machine Learning Systems With a Feature Store

The MLOps Blog

JANUARY 26, 2024

A feature store is a data platform that supports the creation and use of feature data throughout the lifecycle of an ML model, from creating features that can be reused across many models to model training to model inference (making predictions). It can also transform incoming data on the fly. What is a feature store?

Machine Learning

Machine Learning Metadata ML Python
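
The feature store description in the entry above (features computed once, then reused for training and inference, with incoming data transformed on the fly) can be sketched as a toy in-memory class. This is an illustrative assumption, not the API of any real feature store product; `InMemoryFeatureStore` and its method names are hypothetical.

```python
from typing import Callable, Dict, List


class InMemoryFeatureStore:
    """Toy feature store: registered transforms run at ingest time,
    and the stored values are served to both training and inference."""

    def __init__(self) -> None:
        self._transforms: Dict[str, Callable[[dict], float]] = {}
        self._values: Dict[str, Dict[str, float]] = {}

    def register(self, name: str, transform: Callable[[dict], float]) -> None:
        """Define a reusable feature as a function of a raw record."""
        self._transforms[name] = transform

    def ingest(self, entity_id: str, raw: dict) -> None:
        """Transform an incoming raw record and store the feature values."""
        row = self._values.setdefault(entity_id, {})
        for name, transform in self._transforms.items():
            row[name] = transform(raw)

    def get_features(self, entity_id: str, names: List[str]) -> List[float]:
        """Return stored feature values in the requested order."""
        row = self._values[entity_id]
        return [row[name] for name in names]


store = InMemoryFeatureStore()
store.register("order_total", lambda raw: sum(raw["prices"]))
store.register("order_size", lambda raw: float(len(raw["prices"])))

store.ingest("user_42", {"prices": [10.0, 5.5, 4.5]})

feature_names = ["order_total", "order_size"]
training_row = store.get_features("user_42", feature_names)
serving_row = store.get_features("user_42", feature_names)
print(training_row, serving_row)
```

Because training and serving read from the same stored values, the two code paths cannot drift apart in how a feature is computed, which is the main consistency benefit the entry alludes to.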

The Sequence Pulse: The Architecture Powering Data Drift Detection at Uber

TheSequence

JULY 5, 2023

The Architecture: The D3 architecture comprises several core systems managed by Uber's Data Platform, which play a crucial role in maintaining data quality. Databook provides valuable insights into datasets, including column information, lineage, data quality tests, metrics, SLAs, data consistency, duplicates, and more.

Data Drift

Data Drift Data Quality Metadata Data Platform

Governing the ML lifecycle at scale, Part 1: A framework for architecting ML workloads using Amazon SageMaker

AWS Machine Learning Blog

OCTOBER 20, 2023

Data lake foundations: This module helps data lake admins set up a data lake to ingest data, curate datasets, and use the AWS Lake Formation governance model for managing fine-grained data access across accounts and users using a centralized data catalog, data access policies, and tag-based access controls.

ML

ML Data Scientist ML Engineer Data Science

Data Fabric & Data Mesh: Two Approaches, One Data-Driven Destiny

Heartbeat

DECEMBER 7, 2023

Data should have an independent team responsible for its creation, delivery, and sustainability. This team should consist of experts who know the business domain where the data comes from and should be something other than general-purpose Information and Communication Technologies (ICT) teams. The domain of the data.

Metadata

Metadata Data Platform Deep Learning Data Quality

Comcast’s data-centric approach to speech interfaces

Snorkel AI

FEBRUARY 13, 2023

Media Analytics, where we analyze all the broadcast content, as well as live content, that we’re distributing to extract additional metadata from this data and make it available to other systems to create new interactive experiences, or for further insights into how customers are using our streaming services.

Metadata

Metadata Machine Learning Deep Learning BERT

Discover the Snowflake Architecture With All its Pros and Cons- NIX United

Mlearning.ai

FEBRUARY 16, 2023

Today, companies are facing a continual need to store tremendous volumes of data. The demand for information repositories enabling business intelligence and analytics is growing exponentially, giving birth to cloud solutions. The tool’s high storage capacity is perfect for keeping large information volumes.

Business Intelligence

Business Intelligence Data Ingestion Metadata Machine Learning

Use the Amazon SageMaker and Salesforce Data Cloud integration to power your Salesforce apps with AI/ML

AWS Machine Learning Blog

AUGUST 4, 2023

The real-time inference call data is first passed to the SageMaker Data Wrangler container in the inference pipeline, where it is preprocessed and passed to the trained model for product recommendation. For more information, refer to Creating roles and attaching policies (console). Creating the dataset may take some time.

ML

ML Categorization AI AI

LLMOps: What It Is, Why It Matters, and How to Implement It

The MLOps Blog

MARCH 12, 2024

Retrieval Augmented Generation (RAG) enables LLMs to extract and synthesize information like an advanced search engine. Tools range from data platforms to vector databases, embedding providers, fine-tuning platforms, prompt engineering, evaluation tools, orchestration frameworks, observability platforms, and LLM API gateways.

Prompt Engineer

Prompt Engineer Prompt Engineering Large Language Models LLM

A brief history of Data Engineering: From IDS to Real-Time streaming

Artificial Corner

JUNE 6, 2023

However, it was inflexible and could not handle many-to-many relationships or complex relationships between data, which limited its use in more complex applications. Hierarchical databases, such as IBM’s Information Management System (IMS), were widely used in early mainframe database management systems.

Data Mining

Data Mining Big Data ETL Machine Learning

Learnings From Building the ML Platform at Stitch Fix

The MLOps Blog

AUGUST 3, 2023

Stefan is a software engineer and data scientist who has been working as an ML engineer. He also ran the data platform at his previous company and is a co-creator of the open-source framework Hamilton. As you’ve been running the ML data platform team, how do you do that? Stefan: Yeah. Thanks for having me.

ML

ML Data Scientist Software Engineer Machine Learning

Learnings From Building the ML Platform at Mailchimp

The MLOps Blog

OCTOBER 3, 2023

There’s no component that stores metadata about this feature store? Mikiko Bazeley: In the case of the literal feature store, all it does is store features and metadata. We’re assuming that data scientists, for the most part, don’t want to write transformations elsewhere. Mikiko Bazeley: 100%. Do you need Airflow?

ML

ML Data Scientist Machine Learning Data Science

Definite Guide to Building a Machine Learning Platform

The MLOps Blog

MARCH 21, 2023

To make that possible, your data scientists would need to store enough details about the environment the model was created in and the related metadata so that the model could be recreated with the same or similar outcomes. You need to build your ML platform with experimentation and general workflow reproducibility in mind.

Machine Learning

Machine Learning Data Scientist ML Metadata

Inference AudioCraft MusicGen models using Amazon SageMaker

AWS Machine Learning Blog

AUGUST 6, 2024

Dependent workflows can poll these topics to make informed decisions based on the inference outcomes. Asynchronous music generation: As soon as the response metadata is sent to the client, the asynchronous inference begins the music generation. He specializes in building data platforms and architecting seamless data ecosystems.

Auto-complete

Auto-complete Metadata Generative AI Deep Learning

Exploring the Power of Data Warehouse Functionality

Pickl AI

JUNE 11, 2024

Summary: A data warehouse is a central information hub that stores and organizes vast amounts of data from different sources within an organization. Unlike operational databases focused on daily tasks, data warehouses are designed for analysis, enabling historical trend exploration and informed decision-making.

ETL

ETL Data Mining Data Integration Actionable Intelligence

Connect, share, and query where your data sits using Amazon SageMaker Unified Studio

Flipboard

MARCH 21, 2025

Traditionally, answering this question would involve multiple data exports, complex extract, transform, and load (ETL) processes, and careful data synchronization across systems. The table metadata is managed by Data Catalog. This is a SageMaker Lakehouse managed catalog backed by RMS storage.

Metadata

Metadata ETL Data Analysis Big Data

Data democratization: How data architecture can drive business decisions and AI initiatives

IBM Journey to AI blog

AUGUST 4, 2023

It’s often described as a way to simply increase data access, but the transition is about far more than that. When effectively implemented, a data democracy simplifies the data stack, eliminates data gatekeepers, and makes the company’s comprehensive data platform easily accessible by different teams via a user-friendly dashboard.

Machine Learning

Machine Learning Metadata Automation AI

Search enterprise data assets using LLMs backed by knowledge graphs

Flipboard

NOVEMBER 27, 2024

Customers want to search through all of the data and applications across their organization, and they want to see the provenance information for all of the documents retrieved. The application needs to search through the catalog and show the metadata information related to all of the data assets that are relevant to the search context.

Metadata

Metadata Auto-complete Data Discovery ML Engineer

Democratize ML on Salesforce Data Cloud with no-code Amazon SageMaker Canvas

AWS Machine Learning Blog

NOVEMBER 27, 2023

Salesforce Data Cloud and Einstein Studio: Salesforce Data Cloud is a data platform that provides businesses with real-time updates of their customer data from any touch point. Einstein Studio is a gateway to AI tools on Salesforce Data Cloud. Salesforce adds a “__c” to all the Data Cloud object fields.

ML

ML Data Scientist Explainability Natural Language Processing
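
The field-naming note in the entry above (Salesforce adding "__c" to Data Cloud object fields) matters when column names are mapped to model inputs. A minimal sketch, assuming the marker is appended as a suffix and using hypothetical field names, shows one way to normalize such names before training:

```python
CUSTOM_SUFFIX = "__c"


def strip_custom_suffix(field_name: str) -> str:
    """Remove a trailing "__c" marker, leaving other names untouched."""
    if field_name.endswith(CUSTOM_SUFFIX):
        return field_name[: -len(CUSTOM_SUFFIX)]
    return field_name


def normalize_record(record: dict) -> dict:
    """Return a copy of the record with suffix-free field names."""
    return {strip_custom_suffix(name): value for name, value in record.items()}


# Hypothetical exported record; the field names are illustrative only.
record = {"age__c": 34, "club_member__c": True, "id": "001"}
clean = normalize_record(record)
print(clean)
```

The reverse mapping would be needed when writing predictions back, since the platform expects the original suffixed names.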
