With a growing library of long-form video content, DPG Media recognizes the importance of efficiently managing and enhancing video metadata such as actor information, genre, episode summaries, the mood of the video, and more. AI-driven video data analysis was required to generate detailed, accurate, and high-quality metadata.
Introduction to LAION-DISCO-12M: To address this gap, LAION AI has released LAION-DISCO-12M, a collection of 12 million links to publicly available YouTube samples, paired with metadata designed to support foundational machine learning research in audio and music.
This post is part of an ongoing series about governing the machine learning (ML) lifecycle at scale. The data mesh architecture aims to increase the return on investments in data teams, processes, and technology, ultimately driving business value through innovative analytics and ML projects across the enterprise.
Research papers and engineering documents often contain a wealth of information in the form of mathematical formulas, charts, and graphs. Navigating these unstructured documents to find relevant information can be a tedious and time-consuming task, especially when dealing with large volumes of data.
Amazon Q Business, a new generative AI-powered assistant, can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in an enterprise's systems. Furthermore, that data might contain sensitive information or personally identifiable information (PII) requiring redaction.
For instance, as a marketing manager for a video-on-demand company, you might want to send personalized email messages tailored to each individual user, taking into account their demographic information, such as gender and age, and their viewing preferences. This can include information like the title, description, or movie genre.
This conversational agent offers a new, intuitive way to access the extensive quantity of seed product information and enable seed recommendations. It provides farmers and sales representatives with an additional tool to quickly retrieve relevant seed information, complementing their expertise and supporting collaborative, informed decision-making.
Building a Data Foundation for the Future: According to a recent KPMG survey, 67% of business leaders expect AI to fundamentally transform their businesses within the next two years, and 85% believe data quality will be the biggest bottleneck to progress.
Unstructured data is information that doesn’t conform to a predefined schema or isn’t organized according to a preset data model. Unstructured information may have a little or a lot of structure but in ways that are unexpected or inconsistent. Additionally, we show how to use AWS AI/ML services for analyzing unstructured data.
When building machine learning (ML) models using preexisting datasets, experts in the field must first familiarize themselves with the data, decipher its structure, and determine which subset to use as features. The sheer range of data formats is a basic barrier that slows advancement in ML.
Today's organizations face a critical challenge with the fragmentation of vital information across multiple environments. This solution helps streamline information retrieval, enhance collaboration, and significantly boost overall operational efficiency, offering a glimpse into the future of intelligent enterprise information management.
Knowledge bases effectively bridge the gap between the broad knowledge encapsulated within foundation models and the specialized, domain-specific information that businesses possess, enabling a truly customized and valuable generative artificial intelligence (AI) experience.
Solution overview: By combining the powerful vector search capabilities of OpenSearch Service with the access control features provided by Amazon Cognito, this solution enables organizations to manage access controls based on custom user attributes and document metadata. For more information, see Getting started with the AWS CDK.
These datasets encompass millions of hours of music, over 10 million recordings and compositions accompanied by comprehensive metadata, including key, tempo, instrumentation, keywords, moods, energies, chords, and more, facilitating training and commercial usage. GCX provides datasets with over 4.4
Integration with the AWS Well-Architected Tool pre-populates workload information and initial assessment responses. Metadata filtering is used to improve retrieval accuracy. The WAFR Accelerator application retrieves the review status from the DynamoDB table to keep the user informed.
It often requires managing multiple machine learning (ML) models, designing complex workflows, and integrating diverse data sources into production-ready formats. In a world where, according to Gartner, over 80% of enterprise data is unstructured, enterprises need a better way to extract meaningful information to fuel innovation.
in Information Systems Engineering from Ben Gurion University and an MBA from the Technion, Israel Institute of Technology. Along the way, I’ve learned different best practices – from how to manage a team to how to inform the proper strategy – that have shaped how I lead at Deep Instinct. ML is unfit for the task.
AI/ML and generative AI: Computer vision and intelligent insights As drones capture video footage, raw data is processed through AI-powered models running on Amazon Elastic Compute Cloud (Amazon EC2) instances. It even aids in synthetic training data generation, refining our ML models for improved accuracy.
This solution uses decorators in your application code to capture and log metadata such as input prompts, output results, run time, and custom metadata, offering enhanced security, ease of use, flexibility, and integration with native AWS services.
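The decorator pattern described above can be sketched in a few lines of Python. The names (`log_invocation`, `answer`) and the print-based sink are illustrative assumptions, not the solution's actual API; a real deployment would ship the record to a logging service instead.

```python
import functools
import json
import time

def log_invocation(func):
    """Capture input prompt, output result, and run time as structured metadata."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        record = {
            "function": func.__name__,
            "input": args[0] if args else None,
            "output": result,
            "runtime_ms": round((time.perf_counter() - start) * 1000, 2),
        }
        print(json.dumps(record))  # illustrative sink; in practice, send to a log store
        return result
    return wrapper

@log_invocation
def answer(prompt):
    # stand-in for a model invocation
    return f"echo: {prompt}"
```

Decorating the application's entry points this way keeps the logging concern out of the business logic itself.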
Structured data, which follows a fixed pattern such as information stored in columns within databases, and unstructured data, which lacks a specific form or pattern (text, images, or social media posts), both continue to grow as organizations produce and consume them.
We've also added new citation metrics to the already-powerful RAG evaluation suite, including citation precision and citation coverage, to help you better assess how accurately your RAG system uses retrieved information. You must provide a knowledgeBaseIdentifier for every output. Fields marked with ? are optional.
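As a rough intuition for what such citation metrics measure, here is one plausible formulation: precision as the fraction of cited passages that are actually relevant, and coverage as the fraction of relevant passages that get cited. These definitions are an illustrative sketch, not the exact formulas used by the evaluation suite.

```python
def citation_precision(cited, relevant):
    """Fraction of cited passages that appear in the relevant set."""
    if not cited:
        return 0.0
    return len(set(cited) & set(relevant)) / len(set(cited))

def citation_coverage(cited, relevant):
    """Fraction of relevant passages that the answer actually cites."""
    if not relevant:
        return 0.0
    return len(set(cited) & set(relevant)) / len(set(relevant))
```

For example, if an answer cites passages ["a", "b"] but only ["a", "c"] are relevant, both precision and coverage come out to 0.5.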
In this post, we discuss how to use LLMs from Amazon Bedrock to not only extract text, but also understand information available in images. Solution overview In this post, we demonstrate how to use models on Amazon Bedrock to retrieve information from images, tables, and scanned documents. 90B Vision model.
Often support for metadata filtering alongside vector search Popular vector databases include FAISS (Facebook AI Similarity Search), Pinecone, Weaviate, Milvus, and Chroma. The language model generates a response informed by both its parameters and the retrieved information Benefits of RAG include: 1.
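To make the "vector search plus metadata filtering" idea concrete, here is a minimal sketch using plain NumPy cosine similarity rather than any particular vector database; the function name and metadata schema are assumptions for illustration only.

```python
import numpy as np

def search(query_vec, vectors, metadata, source=None, k=2):
    """Cosine-similarity search with optional pre-filtering on a metadata field."""
    # Apply the metadata filter first, then rank the survivors by similarity.
    candidates = [i for i, m in enumerate(metadata)
                  if source is None or m["source"] == source]
    scored = []
    for i in candidates:
        v = vectors[i]
        sim = float(np.dot(query_vec, v) /
                    (np.linalg.norm(query_vec) * np.linalg.norm(v)))
        scored.append((sim, i))
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

# Toy corpus: three 2-d "embeddings" with a source tag each.
vecs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
meta = [{"source": "a"}, {"source": "b"}, {"source": "a"}]
```

Production systems replace the linear scan with an approximate index (as in FAISS or the managed databases named above), but the filter-then-rank flow is the same.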
This comprehensive security setup addresses LLM10:2025 Unbound Consumption and LLM02:2025 Sensitive Information Disclosure, making sure that applications remain both resilient and secure. In the physical architecture diagram, the application controller is the LLM orchestrator AWS Lambda function.
Google Drive supports storing a range of document types. Emails contain a wealth of information found in different places, such as the subject line, the message content, or even attachments. It can be tailored to specific business needs by connecting to company data, information, and systems through over 40 built-in connectors.
Amazon Bedrock offers fine-tuning capabilities that allow you to customize these pre-trained models using proprietary call transcript data, facilitating high accuracy and relevance without the need for extensive machine learning (ML) expertise. Architecture The following diagram illustrates the solution architecture.
Veritone’s current media search and retrieval system relies on keyword matching of metadata generated from ML services, including information related to faces, sentiment, and objects. We use the Amazon Titan Text and Multimodal Embeddings models to embed the metadata and the video frames and index them in OpenSearch Service.
Building ML infrastructure and integrating ML models with the larger business are major bottlenecks to AI adoption [1,2,3]. IBM Db2 can help solve these problems with its built-in ML infrastructure. Db2 Warehouse on Cloud also supports these ML features.
Employees and managers see different levels of company policy information, with managers getting additional access to confidential data like performance review and compensation details. The role information is also used to configure metadata filtering in the knowledge bases to generate relevant responses.
Retrieval Augmented Generation (RAG) represents a cutting-edge advancement in Artificial Intelligence, particularly in NLP and Information Retrieval (IR). This integration allows LLMs to perform more accurately and effectively in knowledge-intensive tasks, especially where proprietary or up-to-date information is crucial.
GPT-3, LaMDA, PaLM, BLOOM, and LLaMA are just a few examples of large language models (LLMs) that have demonstrated their ability to store and apply vast amounts of information. Finally, unlike image search engines, they do not examine the query image against a large corpus of images tagged with different metadata.
This approach has two primary shortcomings. Missed contextual signals: without considering metadata such as source URLs, LMs overlook important contextual information that could guide their understanding of a text's intent or quality. MeCo leverages readily available metadata, such as source URLs, during the pre-training phase.
Real-world applications vary in inference requirements for their artificial intelligence and machine learning (AI/ML) solutions to optimize performance and reduce costs. SageMaker Model Monitor monitors the quality of SageMaker ML models in production. Your client applications invoke this endpoint to get inferences from the model.
By using synthetic data, enterprises can train AI models, conduct analyses, and develop applications without the risk of exposing sensitive information. Validation challenges Verifying the quality and representation of synthetic data often requires comparison with real data, which can be problematic when working with sensitive information.
It stores information such as job ID, status, creation time, and other metadata, and the DynamoDB table lets you track the job status and related metadata. The invoked Lambda function creates new job entries in the DynamoDB table with the status set to Pending.
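The shape of such a job record can be sketched as follows; the field names and the table name in the comment are assumptions for illustration, not the post's actual schema.

```python
import uuid
from datetime import datetime, timezone

def new_job_entry(job_type):
    """Build a job record as it might be written to the DynamoDB table."""
    return {
        "job_id": str(uuid.uuid4()),
        "job_type": job_type,
        "status": "Pending",  # Lambda creates entries in the Pending state
        "created_at": datetime.now(timezone.utc).isoformat(),
    }

# With boto3, a record like this would be written roughly as:
# boto3.resource("dynamodb").Table("jobs").put_item(Item=new_job_entry("ingest"))
```

Downstream steps then update the `status` attribute as the job progresses.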
A main issue with PDF processing is that these documents store information optimally for visual presentation rather than logical reading order. This toolkit integrates text-based and visual information, allowing for superior extraction accuracy compared to conventional OCR methods.
These meetings often involve exchanging information and discussing actions that one or more parties must take after the session. This engine uses artificial intelligence (AI) and machine learning (ML) services and generative AI on AWS to extract transcripts, produce a summary, and provide a sentiment for the call.
These methods have the potential to greatly exaggerate published results, deceiving the scientific community and the general public about the actual effectiveness of ML models. Due to the intricacy of ML research, which includes pre-training, post-training, and evaluation stages, there is much potential for QRPs.
To serve their customers, Vitech maintains a repository of information that includes product documentation (user guides, standard operating procedures, runbooks), which is currently scattered across multiple internal platforms (for example, Confluence sites and SharePoint folders).
langsmith==0.0.43
pgvector==0.2.3
streamlit==1.28.0
Large language models (LLMs) have unlocked new possibilities for extracting information from unstructured text data. SageMaker JumpStart is a machine learning (ML) hub with foundation models (FMs), built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks.
Customers of every size and industry are innovating on AWS by infusing machine learning (ML) into their products and services. Recent developments in generative AI models have further accelerated the need for ML adoption across industries.
In today's information-rich world, finding relevant documents quickly is crucial. Let's add the ability to filter our search results by metadata:

def filtered_search(query, filter_source=None, n_results=5):
    """Search with optional filtering by source."""
Evaluating models at regular intervals also keeps organizations up to date on the latest advancements and supports informed decisions about upgrading or switching models. SageMaker is a data, analytics, and AI/ML platform, which we will use in conjunction with FMEval to streamline the evaluation process.
There are two metrics used to evaluate retrieval. Context relevance: evaluates whether the retrieved information directly addresses the query's intent. It requires ground truth texts for comparison to assess recall and completeness of retrieved information. Implement metadata filtering, adding contextual layers to chunk retrieval.