Oracle AI Data Platform Workbench Engineer Agent

Oracle AI Data Platform (AIDP) Workbench engineer agent for Claude Code — a 37-skill agent that operates the full Spark/Delta lakehouse in natural language. Discovers your catalog into a grounding cache, turns plain English into accurate Spark SQL, and runs the lifecycle (CREATE…

development · ai · agent
By Oracle
3722 · Updated 3 days ago · Python · UPL-1.0

Installation

/plugin install oracle-ai-data-platform-workbench-engineer-agent@claude-plugins-official

How to install

  1. Open Claude Code in your terminal
  2. Run the installation command above
  3. The plugin will be enabled automatically
  4. Use the plugin's features in your Claude Code sessions

Oracle AI Data Platform Workbench Samples

This repository contains a curated collection of sample notebooks demonstrating how to build data pipelines, run machine learning workloads, and integrate AI capabilities using Oracle AI Data Platform (AIDP) Workbench — a unified, governed workspace for data engineering, ML, and AI development powered by Apache Spark.

What is Oracle AI Data Platform Workbench?

Oracle AI Data Platform Workbench is a unified, governed workspace for building, managing, and deploying AI and data-driven solutions. It brings together notebooks, agent development, orchestration, and catalog management in a single collaborative platform — empowering teams to explore data, fine-tune models, and operationalize AI with trust and speed.

Learn more about AIDP Workbench →


Repository Structure

oracle-aidp-samples/
├── getting-started/          # Foundational notebooks for new users
│   ├── Delta_Lake/           # Delta Lake feature walkthroughs
│   └── migration/            # Migrating workloads to AIDP
├── data-engineering/
│   ├── ingestion/            # Connectors and data loading patterns
│   └── transformation/       # Pipeline architectures and table formats
│       ├── liquid-clustering/
│       ├── medallion-lake/
│       ├── scd/
│       └── streaming/
├── ai/
│   ├── agent-flows/          # Agent orchestration and scheduling
│   └── ml-datascience/       # ML, LLM, and AI service integrations
└── shared-utils/             # Reusable utilities and data generators

Sample Catalog

Getting Started

Foundational examples to help you get up and running on AIDP Workbench.

| Notebook | Description |
| --- | --- |
| Access ALH Data | Write and query data in Oracle Autonomous AI Lakehouse (ALH) using PySpark insertInto and SQL INSERT statements with external catalogs. |
| Access Object Storage Data | Read and write data from OCI Object Storage using direct access, external volumes, and external tables. |
| Analyse Data Using PySpark | PySpark fundamentals: catalog and schema setup, table creation, data insertion, schema exploration, and matplotlib visualizations. |
| Analyse Data Using SQL | Core SQL operations on AIDP including DataFrame creation, transformations, aggregations, and simple visualizations. |
| ALH External Catalog MERGE | End-to-end MERGE workflow into an ALH table via an AIDP external catalog: insert/update/delete with merge keys and OOS-staging skip optimization. |

Delta Lake

| Notebook | Description |
| --- | --- |
| Use Delta Lake Table | Comprehensive guide covering Delta table operations: updates, merges, time travel, liquid clustering, and vacuuming. |
| Delta Change Data Feed | Capture row-level changes (inserts, updates, deletes) from Delta tables for CDC, incremental processing, and streaming pipelines. |
| Handle Schema Evolution | Add and evolve columns in Delta tables without rewriting existing data, leveraging automatic schema evolution. |
| Delta UniForm Tables | Create Delta UniForm tables that automatically synchronize Iceberg metadata for cross-format interoperability. |
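
As a rough illustration of two of the operations listed above (time travel and the change data feed), the sketch below builds the corresponding Delta Lake SQL statements as plain strings. The helper names and the table name `demo.sales.orders` are hypothetical and do not come from the sample notebooks; the statements follow open-source Delta Lake syntax, and whether a given AIDP runtime accepts them unchanged is an assumption to verify.

```python
from typing import Optional


def enable_cdf_sql(table: str) -> str:
    """Statement that turns on the change data feed for an existing Delta table."""
    return f"ALTER TABLE {table} SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"


def time_travel_sql(table: str, version: int) -> str:
    """Query a Delta table as it looked at a given commit version."""
    return f"SELECT * FROM {table} VERSION AS OF {version}"


def table_changes_sql(table: str, start_version: int,
                      end_version: Optional[int] = None) -> str:
    """Read row-level changes between two versions (end version is optional)."""
    args = [f"'{table}'", str(start_version)]
    if end_version is not None:
        args.append(str(end_version))
    return f"SELECT * FROM table_changes({', '.join(args)})"


# Hypothetical table name, used only for illustration.
for stmt in (
    enable_cdf_sql("demo.sales.orders"),
    time_travel_sql("demo.sales.orders", 3),
    table_changes_sql("demo.sales.orders", 2, 5),
):
    print(stmt)
```

In a notebook with an active Spark session, each string could be passed to `spark.sql(...)`. Building the statements separately keeps them easy to review before they run.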

Migration

| Notebook | Description |
| --- | --- |
| Migrate Files from Databricks to AIDP | Recursively export notebooks and files from a Databricks workspace to AIDP using the databricks-sdk library. |
| Download from Git to AIDP | Download notebooks and files from a Git repository as a ZIP archive and extract them directly into an AIDP workspace volume. |

Data Engineering — Ingestion

Patterns for connecting to and loading data from a wide range of sources.

| Notebook | Description |
| --- | --- |
| Read/Write Oracle Ecosystem Connectors | Connect to Oracle Database, Oracle Exadata, ALH, and ATP with external catalog support and SQL pushdown. |
| Read/Write External Ecosystem Connectors | Read/write operations with Hive Metastore, Microsoft SQL Server, PostgreSQL, and MySQL. |
| Read-Only Ingestion Connectors | Use read-only connectors for MySQL HeatWave, REST APIs, Oracle Fusion BICC, Kafka, and other sources. |
| Connect Using Custom JDBC Driver | Integrate custom JDBC drivers (e.g., SQLite, Snowflake) with Spark for connecting to databases not bundled by default. |
| Execute Oracle ALH SQL | Execute SQL statements directly against Oracle ALH using the oracledb Python package. |
| Ingest Data Using YAML | Config-driven ingestion from cloud storage (CSV, JSON) and JDBC sources with schema validation and data quality checks. |
| Ingest from Multi-Cloud | Ingest data from Azure Data Lake Storage (ADLS) and AWS S3 with proper JAR configuration and credential management. |
| Ingest into Apache Iceberg (OCI Native) | End-to-end Apache Iceberg workflow: table creation, querying, schema evolution, time travel, and metadata inspection using OCI native protocol and Hadoop catalog. |
| Pipe-Delimited File Ingestion | Read pipe-delimited (`\|`) files from OCI Object Storage and register them as external tables. |
| Read Excel Files | Read Excel (.xlsx) files using the Spark Excel connector and convert them to Spark DataFrames or CSV. |
| Streaming from OCI Streaming Service | Consume messages from OCI Streaming (Kafka-compatible) using Spark Structured Streaming with SASL/OAUTHBearer authentication. |
| Streaming from Volume Path | Process CSV files from a workspace volume using one-time micro-batch streaming with Trigger.Once(). |
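
To make the pipe-delimited pattern concrete, here is a small local sketch that parses `|`-separated text with Python's standard `csv` module. It is not the notebook's code: the sample data and the function name are made up, and it runs without Spark or Object Storage. In Spark the same idea is usually expressed by setting the reader's `sep` option to `|`.

```python
import csv
import io
from typing import Dict, List


def read_pipe_delimited(text: str) -> List[Dict[str, str]]:
    """Parse pipe-delimited text whose first line is a header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    return [dict(row) for row in reader]


# Made-up sample standing in for a file read from storage.
sample = "id|name|city\n1|Ada|London\n2|Grace|New York\n"
rows = read_pipe_delimited(sample)

for row in rows:
    print(row)

# Rough Spark equivalent (assumes an active `spark` session and a real path):
#   spark.read.option("header", True).option("sep", "|").csv(path)
```

Every value comes back as a string, so type casting and validation remain a separate step.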

Data Engineering — Transformation

Architectural patterns and pipeline templates for data transformation at scale.

Medallion Architecture

Implements the Bronze → Silver → Gold lakehouse pattern with data quality checks and aggregations. Industry variants available:

| Notebook | Industry |
| --- | --- |
| Education | Education analytics pipeline |
| Energy | Energy consumption and reporting |
| Financial Services | Financial transactions and risk |
| Healthcare | Patient records and clinical data |
| Hospitality | Hotel bookings and guest analytics |
| Insurance | Policy and claims processing |
| Manufacturing | Production line and quality data |
| Media | Content engagement and subscriptions |
| Real Estate | Property listings and transactions |
| Retail | Sales, inventory, and customer data |
| Telecommunications | Network usage and customer churn |
| Transportation | Logistics and fleet tracking |
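
The Bronze → Silver → Gold flow can be sketched without Spark. The example below is a simplified stand-in, not taken from any of the industry notebooks: the records, field names, and quality rules are invented for illustration. Bronze holds raw records, Silver applies data quality checks (required fields, type casting, de-duplication), and Gold aggregates the cleaned rows.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Bronze: raw records exactly as they arrived (all values are made up).
BRONZE: List[Dict[str, Optional[str]]] = [
    {"order_id": "1", "store": "north", "amount": "10.50"},
    {"order_id": "2", "store": "south", "amount": "abc"},    # unparseable amount
    {"order_id": "1", "store": "north", "amount": "10.50"},  # duplicate order
    {"order_id": "3", "store": "north", "amount": "4.50"},
    {"order_id": "4", "store": None, "amount": "7.00"},      # missing store
]


def to_silver(bronze: List[Dict[str, Optional[str]]]) -> List[dict]:
    """Apply quality checks: required store, numeric amount, unique order_id."""
    seen = set()
    silver = []
    for rec in bronze:
        if not rec.get("store"):
            continue  # drop rows with no store
        try:
            amount = float(rec["amount"])
        except (TypeError, ValueError):
            continue  # drop rows whose amount is not a number
        if rec["order_id"] in seen:
            continue  # drop repeated orders
        seen.add(rec["order_id"])
        silver.append({"order_id": int(rec["order_id"]),
                       "store": rec["store"],
                       "amount": amount})
    return silver


def to_gold(silver: List[dict]) -> Dict[str, float]:
    """Aggregate cleaned rows into total sales per store."""
    totals: Dict[str, float] = defaultdict(float)
    for rec in silver:
        totals[rec["store"]] += rec["amount"]
    return dict(totals)


silver = to_silver(BRONZE)
gold = to_gold(silver)
print(silver)
print(gold)
```

In a lakehouse each layer would typically be its own table, so rejected rows stay traceable in Bronze.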

Delta Liquid Clustering

Demonstrates Delta Lake liquid clustering for automatic query optimization and data layout management. Industry variants available:

| Notebook | Industry |
| --- | --- |
| Education | Student performance analytics with ML prediction |
| Energy | Smart grid monitoring and anomaly detection |
| Financial Services | Transaction analytics and reporting |
| Healthcare | Patient data access patterns |
| Hospitality | Booking and occupancy analytics |
| Insurance | Claims and policy data optimization |
| Manufacturing | Production and quality metrics |
| Media | Content and engagement data |
| Real Estate | Property and transaction data |
| Retail | Sales and inventory analytics |
| Telecommunications | Network and customer usage data |
| Transportation | Fleet and logistics optimization |
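
For orientation, liquid clustering is declared on a Delta table with a `CLUSTER BY` clause in place of fixed partitions. The sketch below builds that DDL as a string; the helper names, table name, and columns are hypothetical, and the syntax follows open-source Delta Lake, so support in a particular AIDP runtime should be confirmed before relying on it.

```python
from typing import Dict, Sequence


def create_clustered_table_sql(table: str, columns: Dict[str, str],
                               cluster_by: Sequence[str]) -> str:
    """Build a CREATE TABLE statement for a Delta table with liquid clustering."""
    unknown = [c for c in cluster_by if c not in columns]
    if unknown:
        raise ValueError(f"clustering columns not in schema: {unknown}")
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    keys = ", ".join(cluster_by)
    return f"CREATE TABLE {table} ({cols}) USING DELTA CLUSTER BY ({keys})"


def optimize_sql(table: str) -> str:
    """OPTIMIZE triggers clustering of data already written to the table."""
    return f"OPTIMIZE {table}"


# Hypothetical table and columns, for illustration only.
ddl = create_clustered_table_sql(
    "demo.fleet.trips",
    {"trip_id": "BIGINT", "city": "STRING", "started_at": "TIMESTAMP"},
    ["city", "started_at"],
)
print(ddl)
print(optimize_sql("demo.fleet.trips"))
```

Checking the clustering columns against the schema up front catches typos before the statement reaches the engine.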

Apache Iceberg UniForm Liquid Clustering

Combines Delta UniForm with Apache Iceberg Liquid Clustering for open-format, cross-engine table optimization. Industry variants available:

| Notebook | Industry |
| --- | --- |
| Education | Student performance data |
| Energy | Grid and sensor data |
| Financial Services | Transaction and risk data |
| Healthcare | Clinical and patient records |
[Hospitality](data-engineering/transformation/liquid-
