L’Oréal | Data Engineering | Weeks to 30 Minutes

Hybrid Cloud ETL Architecture for Anaplan Process

Architecting a semi-automated Python pipeline on Google Colab to bypass Zone automation restrictions and accelerate financial consolidation.

Python | Google Colab | Anaplan | Google Drive

The Challenge

The Channel Development team was trapped in "Excel Hell." They spent weeks every month manually converting raw data (Promotion Reports, Invoices, Scan Sales) into Anaplan-ready upload files.

  • Volume: Millions of rows across multiple disparate sources.
  • Constraint: Strict Zone policies blocked standard cloud automation tools (like Airflow or Cloud Functions) for this specific workflow, so the team depended on local laptops that crashed under the load.

The Stack

  • Compute: Google Colab (Python)
  • Storage: Google Drive
  • Transformation: Pandas (Data Munging)
  • Target: Anaplan (Financial Planning)

The Architecture

[Image: architecture diagram]

The "Hard Part"

Compliant Semi-Automation

The engineering blocker wasn't the code; it was compliance. We could not deploy an "always-on" server, so I architected a "Cloud-Based Semi-Automation" pattern:

  • The Loophole: Google Colab was whitelisted for "Ad-Hoc Analysis."
  • The Hack: We built a production-grade ETL pipeline inside a Colab notebook. The user only needs to click "Run All" (manual trigger) to execute complex cloud-based transformations (automated logic). This satisfied the "no automated triggers" policy while leveraging cloud compute power.
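To make the "Run All" pattern concrete, here is a minimal sketch of what such a notebook's core logic could look like. It is not the production code: the column names, the `COLUMN_MAP` mapping, the target schema, and the function names `load_source` and `run_pipeline` are all hypothetical, chosen only to illustrate reading several raw exports from one folder, mapping them to a shared schema, and writing a single upload file.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping from source-specific headers to one shared schema.
COLUMN_MAP = {
    "Customer": "customer",
    "Account": "customer",
    "SKU": "sku",
    "Item": "sku",
    "Amount": "amount",
    "Net Sales": "amount",
}
# Hypothetical target layout for the upload file.
TARGET_COLUMNS = ["source", "customer", "sku", "amount"]


def load_source(path: Path) -> pd.DataFrame:
    """Read one raw export and rename its columns to the shared schema."""
    df = pd.read_csv(path).rename(columns=COLUMN_MAP)
    df["source"] = path.stem  # record which file each row came from
    return df[TARGET_COLUMNS]


def run_pipeline(input_dir: Path, output_path: Path) -> pd.DataFrame:
    """Combine every CSV in input_dir into a single upload file."""
    files = sorted(input_dir.glob("*.csv"))
    if not files:
        raise FileNotFoundError(f"No CSV files found in {input_dir}")
    combined = pd.concat([load_source(f) for f in files], ignore_index=True)
    # Aggregate to one row per source, customer and SKU.
    result = combined.groupby(
        ["source", "customer", "sku"], as_index=False
    )["amount"].sum()
    result.to_csv(output_path, index=False)
    return result
```

In a Colab notebook, an earlier cell would typically mount Google Drive with `drive.mount("/content/drive")` from `google.colab`, and the final cell would call `run_pipeline` with folders under that mount point. Because the trigger is the user clicking "Run All", nothing runs on a schedule.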

The Result

  • Speed: Reduced data processing time from weeks to 30 minutes.
  • Reliability: Eliminated version control issues and Excel crashes.
  • Scalability: The script now handles full historical datasets that previously broke local Excel instances.
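One common way to keep memory use flat on multi-million-row files is to read them in chunks and aggregate as you go. The sketch below illustrates that general technique with pandas; it is an assumption about how such scaling can be done, not a description of the actual script, and the function name `aggregate_in_chunks` and its parameters are hypothetical.

```python
import pandas as pd


def aggregate_in_chunks(csv_path, keys, value, chunksize=100_000):
    """Sum `value` per `keys` without loading the whole file at once."""
    partials = []
    # With chunksize set, read_csv yields DataFrames of at most that many rows.
    for chunk in pd.read_csv(csv_path, chunksize=chunksize):
        partials.append(chunk.groupby(keys, as_index=False)[value].sum())
    if not partials:
        return pd.DataFrame(columns=[*keys, value])
    # Partial sums for the same key can appear in several chunks,
    # so group once more to merge them.
    merged = pd.concat(partials, ignore_index=True)
    return merged.groupby(keys, as_index=False)[value].sum()
```

This works because a sum can be computed piecewise: each chunk is reduced to a small table of partial totals, and only those partial totals are held in memory together.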