🤖

The Autonomous Data Engineer

When AI becomes your data engineering colleague

A YouTube series and open source ecosystem exploring how AI agents can work autonomously with data platforms.

The YouTube Series

The Autonomous Data Engineer is a video series documenting the journey of building an AI-based autonomous data engineer. Each episode explores a different aspect: from metadata extraction to autonomous pipeline navigation, from lineage tracking to code generation.

It's not a tutorial. It's a logbook — showing what works, what doesn't, and what happens when you give an AI agent real access to a data ecosystem.

Watch on YouTube →

ade-core

Agentic Data Engineering Framework

A framework that enables AI agents to work autonomously with data platforms by providing structured context through consolidated metadata.

📊 Multi-platform metadata

Extracts and consolidates metadata from Databricks, Power BI, and other platforms into a single queryable catalog.
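
As a rough illustration of what consolidating metadata into a single queryable catalog can look like, here is a minimal sketch. It is not ade-core's actual implementation: the `CatalogEntry` schema, the `from_databricks` and `from_powerbi` helpers, the raw record shapes, and the SQLite backing store are all assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One normalized metadata record (hypothetical schema, not ade-core's)."""
    platform: str      # e.g. "databricks" or "powerbi"
    object_type: str   # e.g. "table", "notebook", "measure"
    name: str
    description: str


def from_databricks(raw: dict) -> CatalogEntry:
    # Assumed raw shape: {"full_name": ..., "kind": ..., "comment": ...}
    return CatalogEntry("databricks", raw["kind"], raw["full_name"], raw.get("comment", ""))


def from_powerbi(raw: dict) -> CatalogEntry:
    # Assumed raw shape: {"dataset": ..., "measure": ..., "description": ...}
    name = f'{raw["dataset"]}.{raw["measure"]}'
    return CatalogEntry("powerbi", "measure", name, raw.get("description", ""))


def build_catalog(entries: list[CatalogEntry]) -> sqlite3.Connection:
    """Load normalized entries into one in-memory table that SQL can query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog (platform TEXT, object_type TEXT, name TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO catalog VALUES (?, ?, ?, ?)",
        [(e.platform, e.object_type, e.name, e.description) for e in entries],
    )
    conn.commit()
    return conn


# Illustrative records only; real extractors would call each platform's API.
entries = [
    from_databricks({"full_name": "sales.orders", "kind": "table", "comment": "Raw orders"}),
    from_powerbi({"dataset": "Sales", "measure": "Total Revenue", "description": "Sum of order amounts"}),
]
catalog = build_catalog(entries)
rows = catalog.execute("SELECT platform, name FROM catalog ORDER BY platform").fetchall()
```

The point of the normalization step is that every platform ends up in the same shape, so one query can span all of them.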

🔗 Lineage tracking

Parses source code to identify dependencies between notebooks, tables, and DAX measures.
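
To make the idea of parsing source code for dependencies concrete, here is a small regex-based sketch that pulls table reads and writes out of a PySpark notebook cell. The patterns and the `extract_lineage` function are hypothetical and deliberately naive; how ade-core actually parses notebooks or DAX is not described here, and a production parser would need to handle far more syntax.

```python
import re

# Hypothetical patterns covering only a few common ways to reference tables.
READ_PATTERNS = [
    re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE),
    re.compile(r"spark\.(?:read\.)?table\(\s*[\"']([\w.]+)[\"']\s*\)"),
]
WRITE_PATTERNS = [
    re.compile(r"\bINSERT\s+(?:INTO|OVERWRITE)\s+(?:TABLE\s+)?([A-Za-z_][\w.]*)", re.IGNORECASE),
    re.compile(r"\.saveAsTable\(\s*[\"']([\w.]+)[\"']\s*\)"),
]


def extract_lineage(source: str) -> dict[str, set[str]]:
    """Return the tables a piece of source code reads from and writes to."""
    reads = {name for pattern in READ_PATTERNS for name in pattern.findall(source)}
    writes = {name for pattern in WRITE_PATTERNS for name in pattern.findall(source)}
    # A table that is written is reported as an output, not as an input.
    return {"reads": reads - writes, "writes": writes}


# Illustrative notebook cell; table names are made up for the example.
notebook = '''
orders = spark.table("sales.orders")
customers = spark.sql("SELECT * FROM crm.customers c JOIN crm.regions r ON c.region_id = r.id")
orders.join(customers, "customer_id").write.mode("overwrite").saveAsTable("gold.customer_orders")
'''

lineage = extract_lineage(notebook)
```

Each notebook then contributes edges of the form "reads table X" and "writes table Y", which is enough to chain notebooks and tables into a dependency graph.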

🔍 Full-text search

Searches across all platforms with a single query and integrates with Claude through an MCP server.
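
A single query across platforms can be sketched with a tiny in-memory inverted index. This is only an illustration of the search idea: the `SearchIndex` class, its key format, and the sample entries are assumptions, the real project may use a different search backend, and the MCP integration is not shown.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    """Minimal inverted index: token -> set of catalog keys (hypothetical)."""

    def __init__(self) -> None:
        self._postings: dict[str, set[str]] = defaultdict(set)

    def add(self, key: str, text: str) -> None:
        for token in tokenize(text):
            self._postings[token].add(key)

    def search(self, query: str) -> list[str]:
        tokens = tokenize(query)
        if not tokens:
            return []
        # Keep only entries that contain every query token (AND semantics).
        hits = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        return sorted(hits)


# Illustrative entries from two platforms, keyed as "platform:name".
index = SearchIndex()
index.add("databricks:sales.orders", "sales orders table raw order lines")
index.add("powerbi:Sales.Total Revenue", "Total Revenue measure sum of order amounts in Sales")
index.add("databricks:etl_customers", "notebook loading customers")

results = index.search("sales order")
```

Because every entry lives in the same index regardless of origin, one query returns matches from Databricks and Power BI together.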

🖥️ Web UI

Includes a Streamlit interface for exploring the catalog visually, with no CLI needed.

Python, Databricks, Power BI, MCP Server, Claude, Streamlit
View on GitHub →

The Vision

Modern data engineering is fragmented: notebooks on one platform, semantic models on another, pipelines everywhere. Each system has its own metadata, its own interface, its own language.

ADE was born to solve this problem — not with another dashboard, but with an AI agent that understands context, navigates dependencies, and works side by side with the data engineer.

The goal isn't to replace the data engineer. It's to give them a colleague that never sleeps, never forgets, and can traverse the entire stack in seconds.
