Data Ingestion, Data Platform and ETL - Artificial Intelligence Zone

Data Ingestion

Data Platform

ETL

How Rocket Companies modernized their data science solution on AWS

AWS Machine Learning Blog

FEBRUARY 21, 2025

Rockets legacy data science architecture is shown in the following diagram. The diagram depicts the flow; the key components are detailed below: Data Ingestion: Data is ingested into the system using Attunity data ingestion in Spark SQL.

Data Science

Data Science Data Scientist Data Ingestion DevOps

Data architecture strategy for data quality

IBM Journey to AI blog

JANUARY 5, 2023

The first generation of data architectures represented by enterprise data warehouse and business intelligence platforms were characterized by thousands of ETL jobs, tables, and reports that only a small group of specialized data engineers understood, resulting in an under-realized positive impact on the business.

Data Quality

Data Quality Metadata Big Data ETL

Join 15,000+

professionals

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

The Intersection of AI and Sales: Personalization Without Compromise

How to Achieve High-Accuracy Results When Using LLMs

Relevance, Reach, Revenue: How to Turn Marketing Trends From Hype to High-Impact

MORE WEBINARS

Trending Sources

Improving air quality with generative AI

AWS Machine Learning Blog

JUNE 18, 2024

This manual synchronization process, hindered by disparate data formats, is resource-intensive, limiting the potential for widespread data orchestration. The platform, although functional, deals with CSV and JSON files containing hundreds of thousands of rows from various manufacturers, demanding substantial effort for data ingestion.

Generative AI

Generative AI Data Ingestion Python LLM

Webinars

The Intersection of AI and Sales: Personalization Without Compromise

How to Achieve High-Accuracy Results When Using LLMs

Relevance, Reach, Revenue: How to Turn Marketing Trends From Hype to High-Impact

MORE WEBINARS

Differentiation: Microsoft Fabric vs Power BI

Pickl AI

DECEMBER 16, 2024

Whether you aim for comprehensive data integration or impactful visual insights, this comparison will clarify the best fit for your goals. Key Takeaways Microsoft Fabric is a full-scale data platform, while Power BI focuses on visualising insights. Fabric suits large enterprises; Power BI fits team-level reporting needs.

ETL

ETL Data Ingestion Data Integration Machine Learning

Comparing Tools For Data Processing Pipelines

The MLOps Blog

MARCH 15, 2023

A typical data pipeline involves the following steps or processes through which the data passes before being consumed by a downstream process, such as an ML model training process. Data Ingestion : Involves raw data collection from origin and storage using architectures such as batch, streaming or event-driven.

Categorization

Categorization ETL Data Integration Automation

Popular Data Transformation Tools: Importance and Best Practices

Pickl AI

OCTOBER 10, 2024

Its drag-and-drop interface makes it user-friendly, allowing data engineers to build complex workflows without extensive coding knowledge. Nifi excels in data ingestion, routing, transformation, and system-to-system data flow management. AWS Glue AWS Glue is a fully managed ETL service provided by Amazon Web Services.

ETL

ETL Data Quality Machine Learning Business Intelligence

How to Build Machine Learning Systems With a Feature Store

The MLOps Blog

JANUARY 26, 2024

Keeping track of how exactly the incoming data (the feature pipeline’s input) has to be transformed and ensuring that each model receives the features precisely how it saw them during training is one of the hardest parts of architecting ML systems. This is where feature stores come in. What is a feature store?

Machine Learning

Machine Learning Metadata ML Python

Drowning in Data? A Data Lake May Be Your Lifesaver

ODSC - Open Data Science

SEPTEMBER 29, 2023

Arjuna Chala, associate vice president, HPCC Systems For those not familiar with the HPCC Systems data lake platform, can you describe your organization and the development history behind HPCC Systems? They were interested in creating a data platform capable of managing a sizable number of datasets.

Big Data

Big Data ETL Data Science Data Ingestion

TransOrg’s Cloud Data Engineering Services on AWS, GCP & Snowflake

TransOrg Analytics

SEPTEMBER 24, 2024

Data Foundation on AWS Amazon S3: Scalable storage foundation for data lakes. AWS Lake Formation: Simplify the process of creating and managing a secure data lake. Amazon Redshift: Fast, scalable data warehouse for analytics. AWS Glue: Fully managed ETL service for easy data preparation and integration.

ETL

ETL LLM Data Ingestion Automation