Safe ML on sensitive customer data - without humans touching production.
FLab introduces a strict separation between human experimentation and automated production execution. Researchers iterate quickly in controlled Labs; production artifacts are produced only by audited, versioned Projects.
The problem
Many organizations want to build Data/ML products using customer data, but enabling ML work often means granting broad human access to sensitive datasets and production artifacts.
What typically happens today
- Engineers and researchers get direct access to production datasets “to move fast”.
- Training and evaluation happen in notebooks, with partial reproducibility.
- Artifacts are copied between systems with weak lineage and incomplete audit trails.
- Governance is bolted on later via ad-hoc IAM policies and process.
What the business actually needs
- Fast iteration for ML engineers and researchers.
- Strict access control, least privilege, and consistent audit logs.
- Deterministic production pipelines with versioned data + code + configs.
- Clear separation between experimentation and production outputs.
Labs (human plane)
Controlled environments for ad-hoc experimentation - designed to be productive, but intentionally not powerful enough to exfiltrate raw production data or publish final artifacts.
- Ephemeral compute with strict egress controls
- Access to curated / masked / aggregated datasets
- Experiment tracking and reproducible runs
- Outputs are proposals (code/config), not production artifacts
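The "curated / masked" datasets above can be sketched as a one-way transformation applied before data ever reaches a Lab. The snippet below is a minimal illustration, not FLab's actual masking implementation: the field names and the truncated-hash scheme are assumptions, chosen so researchers can still join and aggregate on an identifier without reading the raw value.

```python
import hashlib

def masked_view(rows, pii_fields=("email", "name")):
    """Return a copy of `rows` with PII fields replaced by stable one-way
    hashes. Illustrative sketch only: field names and hashing scheme are
    assumptions, not FLab's real policy."""
    out = []
    for row in rows:
        clean = dict(row)  # never mutate the source record
        for field in pii_fields:
            if field in clean:
                digest = hashlib.sha256(str(clean[field]).encode()).hexdigest()
                clean[field] = digest[:12]  # stable token, joinable but opaque
        out.append(clean)
    return out
```

Because the hash is deterministic, two rows with the same email still group together in a Lab analysis, yet the raw address never leaves the production boundary.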
Projects (machine plane)
Automated pipelines that are the only path to production data access and production-grade artifacts. Everything is versioned, signed, and audited.
- Non-interactive execution (CI-style)
- Data access gated by policy + approvals
- Immutable versioned artifacts with lineage
- Promotion workflows (staging → prod) with auditability
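A promotion workflow like the one above can be reduced to a small, auditable state transition. This is a hedged sketch under assumed conventions (two distinct approvers, a fixed staging → prod path, an append-only audit list); the real gate would sit behind the platform's policy engine.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # immutable: artifacts are never mutated in place
class Artifact:
    name: str
    version: str
    stage: str = "staging"

def promote(artifact, approvals, required=2, audit_log=None):
    """Promote an artifact from staging to prod if enough distinct
    approvers signed off. Returns a NEW Artifact; the staged version is
    untouched. Approval policy and stage names are assumptions."""
    if artifact.stage != "staging":
        raise ValueError("artifact is not in staging")
    approvers = sorted(set(approvals))
    if len(approvers) < required:
        raise PermissionError("insufficient distinct approvals")
    if audit_log is not None:
        audit_log.append({"event": "promote", "artifact": artifact.name,
                          "version": artifact.version, "approvers": approvers})
    return Artifact(artifact.name, artifact.version, stage="prod")
```

The frozen dataclass mirrors the "immutable versioned artifacts" property: promotion produces a new record with lineage in the audit log rather than editing the old one.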
How it works
A minimal end-to-end loop, designed to eliminate “manual production ML”.
- Researchers iterate in Labs using curated or policy-approved views.
- They publish changes as code/config (e.g., Git commits), plus experiment metadata.
- Projects execute training/evaluation in automated pipelines with controlled access to raw data.
- Artifacts are produced once, versioned, and promoted through controlled stages.
- Audit logs capture who changed what, which data was accessed, and how artifacts were created.
In other words: people can explore, but only automation can produce truth.
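The loop above can be sketched as a single non-interactive run that captures its full lineage. Everything here is a hypothetical shape, not FLab's API: the point is that the artifact's identity is derived deterministically from commit + config + data version, so "produced once, versioned" and "audit logs capture how artifacts were created" fall out of the same structure.

```python
import hashlib
import json

def run_pipeline(commit_sha, config, data_version, audit_log):
    """Non-interactive Project run (sketch). The code commit, config, and
    data version that produced the artifact are recorded as lineage; the
    artifact id is a deterministic function of those inputs."""
    lineage = {"commit": commit_sha, "config": config, "data": data_version}
    # Same inputs -> same id: reruns cannot silently produce a new "truth".
    artifact_id = hashlib.sha256(
        json.dumps(lineage, sort_keys=True).encode()).hexdigest()[:16]
    audit_log.append({"event": "train", "artifact": artifact_id, **lineage})
    return artifact_id
```

A rerun with identical inputs yields the same artifact id, which is what makes the pipeline, rather than any person, the source of record.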
Built for Cloud
FLab is designed as a cloud-native control plane with secure, isolated execution environments. Managed AWS services provide the primitives needed for identity, isolation, storage, and auditability.
Core building blocks
- Compute isolation for Labs and Projects (container or VM-based)
- Object storage for versioned datasets and artifacts
- Central policy + identity integration
- Immutable audit logs and observability
Note: FLab integrates with existing data lakes and orchestration tools; the key innovation is the platform model enforcing the boundary between interactive work and production execution.
Contact
Interested in hearing more, reviewing the model, or sharing your security, governance, or platform requirements?
Email: dmytro@flab.dev