Architecture

Sergio Soto edited this page Apr 8, 2026 · 2 revisions

Overview

oiio-proxy-generator is a stateless serverless function on VAST DataEngine. It processes EXR/DPX files as they are ingested into a VAST S3 bucket, generating color-correct thumbnails and review proxies for the SpaceHarbor MAM application.

Event Flow

                    VAST S3 Bucket
                         |
                  [.exr/.dpx uploaded]
                         v
              DataEngine Element Trigger
          (ElementCreated, suffix: .exr)
                         |
                    [VastEvent]
                         v
          oiio-proxy-generator container
         +-------------------------------+
         |  init(ctx)                    |
         |    - S3 client (boto3)        |
         |    - VastDB session           |
         |    - DDL table check          |
         |    - Tool availability check  |
         +-------------------------------+
         |  handler(ctx, event)          |
         |    1. Parse VastEvent         |
         |    2. Extract bucket/key      |
         |    3. S3 GET source to /tmp   |
         |    4. Detect colorspace       |
         |    5. linear->sRGB->thumb     |
         |    6. linear->Rec709->proxy   |
         |    7. S3 PUT outputs          |
         |       (ContentType + tags)    |
         |    8. VastDB INSERT           |
         |    9. Kafka publish           |
         |   10. Cleanup /tmp            |
         +-------------------------------+
                    |          |
           +-------+          +--------+
           v                           v
     VAST DataBase              Kafka Broker
   (proxy_outputs)         (spaceharbor.proxy)
           |
           v
    SpaceHarbor App
  (presigned URL delivery)
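
The ordering of handler steps 3 to 10 can be sketched as below. This is a minimal illustration, not the code in main.py: the `process_element` function, the `steps` mapping, and its step names are hypothetical stand-ins for the real S3, oiiotool, VastDB, and Kafka calls. It assumes cleanup should run even when an earlier step fails, which is why the work happens inside `try`/`finally`.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict


def process_element(bucket: str, key: str, steps: Dict[str, Callable]) -> None:
    """Run the pipeline steps in order inside a private scratch directory.

    `steps` maps hypothetical step names to callables, so the real
    S3/oiiotool/VastDB/Kafka code can be swapped for fakes in tests.
    """
    workdir = Path(tempfile.mkdtemp(prefix="proxygen-"))
    try:
        source = workdir / Path(key).name
        steps["download"](bucket, key, source)                   # 3. S3 GET source
        colorspace = steps["detect"](source)                     # 4. detect colorspace
        thumb = steps["thumbnail"](source, colorspace, workdir)  # 5. thumbnail
        proxy = steps["proxy"](source, colorspace, workdir)      # 6. proxy
        steps["upload"](bucket, key, thumb, proxy)               # 7. S3 PUT outputs
        steps["persist"](bucket, key, thumb, proxy)              # 8. VastDB INSERT
        steps["publish"](bucket, key)                            # 9. Kafka publish
    finally:
        # 10. Cleanup runs on success and on failure alike.
        shutil.rmtree(workdir, ignore_errors=True)
```

Using one scratch directory per invocation, rather than fixed names under /tmp, also keeps concurrent invocations from overwriting each other's files.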

Data Flow

S3 bucket
  |
  +-- GET source.exr ---------> /tmp/source.exr
  |                                |
  |                                +-- oiiotool colorconvert + resize --> /tmp/thumb.jpg
  |                                |
  |                                +-- oiiotool colorconvert + resize --> /tmp/intermediate.png
  |                                                                         |
  |                                                                   ffmpeg H.264
  |                                                                         |
  |                                                                    /tmp/proxy.mp4
  |
  +-- PUT .proxies/thumb.jpg <--- /tmp/thumb.jpg  (ContentType: image/jpeg)
  |
  +-- PUT .proxies/proxy.mp4 <--- /tmp/proxy.mp4  (ContentType: video/mp4)
  |
  cleanup /tmp/*
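
One way to derive the output keys under the .proxies/ sibling prefix is sketched here. The diagram above shows only `.proxies/thumb.jpg` and `.proxies/proxy.mp4`; the `proxy_output_keys` helper and the `<stem>_thumb.jpg` / `<stem>_proxy.mp4` naming are assumptions made so that two sources in the same directory do not collide, not the project's confirmed scheme.

```python
from pathlib import PurePosixPath
from typing import Dict


def proxy_output_keys(source_key: str) -> Dict[str, str]:
    """Map a source object key to hypothetical thumbnail/proxy keys.

    Outputs land in a `.proxies/` prefix next to the source object.
    The per-source file naming is an assumption for illustration.
    """
    src = PurePosixPath(source_key)
    proxies = src.parent / ".proxies"
    return {
        "thumbnail": str(proxies / f"{src.stem}_thumb.jpg"),
        "proxy": str(proxies / f"{src.stem}_proxy.mp4"),
    }
```

For example, `proxy_output_keys("shots/sh010/plate.exr")` yields `shots/sh010/.proxies/plate_thumb.jpg` and `shots/sh010/.proxies/plate_proxy.mp4`.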

Module Responsibilities

| Module | Purpose |
| --- | --- |
| main.py | DataEngine handler: init/handler, S3 I/O, orchestration, cleanup |
| ocio_transform.py | Color space detection and transform via oiiotool CLI |
| oiio_processor.py | Thumbnail (oiiotool resize + JPEG) and proxy (oiiotool + ffmpeg H.264) |
| publisher.py | Kafka ProxyGeneratedEvent publishing |
| vast_db_persistence.py | VastDB proxy_outputs table: schema, DDL, persistence |

Color Space Pipeline

Source EXR (detected: linear / sRGB / Rec709)
    |
    +--[oiiotool --colorconvert linear sRGB]----> resize 256x256 --> JPEG thumbnail
    |
    +--[oiiotool --colorconvert linear Rec709]--> resize 1920x1080 --> PNG intermediate
                                                        |
                                                  ffmpeg H.264
                                                  -pix_fmt yuv420p
                                                  -movflags +faststart
                                                        |
                                                  MP4 proxy

Color management mode: by default, oiiotool's built-in transforms are used (no OCIO config required), which support linear, sRGB, and Rec709. If OCIO_CONFIG_PATH is set and points to a valid config, OCIO transforms are used instead, which adds support for colorspace names defined in that config, such as ACEScg and ARRI LogC.
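
The command lines behind this pipeline might look like the following sketch. The function names are hypothetical, and `libx264` is an assumed encoder choice (the page only says H.264); the flags themselves (`--colorconvert`, `--resize`, `-o` for oiiotool, and `-pix_fmt`, `-movflags` for ffmpeg) are standard options of those tools. The functions only build argument lists, so they can be inspected without either tool installed.

```python
from typing import List


def thumbnail_cmd(src: str, dst: str, src_space: str = "linear") -> List[str]:
    """oiiotool call for the sRGB JPEG thumbnail (256x256)."""
    return ["oiiotool", src, "--colorconvert", src_space, "sRGB",
            "--resize", "256x256", "-o", dst]


def intermediate_cmd(src: str, dst: str, src_space: str = "linear") -> List[str]:
    """oiiotool call for the Rec709 PNG intermediate (1920x1080)."""
    return ["oiiotool", src, "--colorconvert", src_space, "Rec709",
            "--resize", "1920x1080", "-o", dst]


def encode_cmd(png: str, mp4: str) -> List[str]:
    """ffmpeg call that encodes the intermediate to an H.264 MP4.

    libx264 is an assumption; +faststart moves the moov atom to the
    front of the file so playback can begin before the download ends.
    """
    return ["ffmpeg", "-y", "-i", png, "-c:v", "libx264",
            "-pix_fmt", "yuv420p", "-movflags", "+faststart", mp4]
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Note that `--resize` with an explicit width and height does not preserve aspect ratio; oiiotool's `--fit` is the option to reach for if letterboxing is preferred.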

Detection priority:

  1. Explicit colorspace EXR attribute
  2. oiio:ColorSpace attribute
  3. Chromaticities heuristic
  4. Fallback: linear
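
The priority order above can be expressed as a small lookup, sketched here under stated assumptions. `detect_colorspace` and `_canonical` are hypothetical names, the metadata is assumed to arrive as a plain mapping of attribute names to values, and the chromaticities heuristic is passed in as a callable because this page does not describe its rule.

```python
from typing import Callable, Mapping, Optional

# Spellings accepted for the three built-in colorspaces (assumed).
_CANONICAL = {"linear": "linear", "srgb": "sRGB", "rec709": "Rec709"}


def _canonical(value: object) -> Optional[str]:
    """Normalize a colorspace name; return None if unrecognized."""
    if not isinstance(value, str):
        return None
    return _CANONICAL.get(value.strip().lower().replace(".", ""))


def detect_colorspace(
    attrs: Mapping[str, object],
    heuristic: Optional[Callable[[object], Optional[str]]] = None,
) -> str:
    """Apply the documented priority: explicit attribute, oiio:ColorSpace,
    chromaticities heuristic, then the linear fallback."""
    for name in ("colorspace", "oiio:ColorSpace"):
        found = _canonical(attrs.get(name))
        if found:
            return found
    chroma = attrs.get("chromaticities")
    if chroma is not None and heuristic is not None:
        found = _canonical(heuristic(chroma))
        if found:
            return found
    return "linear"
```

In this sketch an unrecognized name falls through to the next rule instead of raising, which is one possible policy; with an OCIO config active, a wider set of names would need to be accepted.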

S3 Output Tagging

Outputs are uploaded with ContentType and S3 tags:

| Field | Value | Purpose |
| --- | --- | --- |
| ContentType | image/jpeg or video/mp4 | Object metadata (not a tag); tells the browser how to handle the presigned URL response |
| media_type | thumbnail or proxy | VAST Catalog discoverability |
| generator | oiio-proxy-generator | Provenance tracking |
| version | 2.0.0 | Version tracking |
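
A sketch of how these values could be assembled for boto3's `put_object`, which takes `ContentType` as a keyword and `Tagging` as a URL-encoded query string. The `put_object_args` helper is hypothetical; the tag names and values are the ones listed above.

```python
from typing import Dict
from urllib.parse import urlencode

_CONTENT_TYPES = {"thumbnail": "image/jpeg", "proxy": "video/mp4"}


def put_object_args(bucket: str, key: str, media_type: str,
                    version: str = "2.0.0") -> Dict[str, str]:
    """Build keyword arguments for S3 put_object (Body is added by the caller)."""
    tags = {
        "media_type": media_type,
        "generator": "oiio-proxy-generator",
        "version": version,
    }
    return {
        "Bucket": bucket,
        "Key": key,
        "ContentType": _CONTENT_TYPES[media_type],  # KeyError on unknown type
        "Tagging": urlencode(tags),
    }
```

With a boto3 client `s3` and an open file `fh`, the upload would then be `s3.put_object(Body=fh, **put_object_args(bucket, key, "thumbnail"))`.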

Trigger Safety (Infinite Loop Prevention)

The element trigger uses a .exr suffix filter. Since proxy outputs end in .jpg and .mp4, they do not match the trigger and cannot cause an infinite processing loop. The handler also validates extensions as a defense-in-depth measure.
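
The handler-side check could be as small as the sketch below. `should_process` and `SOURCE_EXTENSIONS` are hypothetical names; the extension set follows the EXR/DPX scope stated in the Overview, and the additional `.proxies` path check is an assumption that goes beyond the extension validation described here.

```python
from pathlib import PurePosixPath

# Source formats named in the Overview; assumed to be the accepted set.
SOURCE_EXTENSIONS = {".exr", ".dpx"}


def should_process(key: str) -> bool:
    """Return True only for source images outside the .proxies/ prefix."""
    path = PurePosixPath(key)
    if ".proxies" in path.parts:
        # Extra guard (assumption): never reprocess generated outputs.
        return False
    return path.suffix.lower() in SOURCE_EXTENSIONS
```

A handler would call this first and return early for any key that fails the check, so a misconfigured trigger filter cannot start a loop.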

Design Decisions

| Decision | Rationale |
| --- | --- |
| S3 download + upload (not NFS) | DataEngine pods don't have NFS mounts; S3 is the only available I/O path |
| oiiotool built-in color (not OCIO) | ACES config not available in Ubuntu Jammy repos; built-in linear/sRGB/Rec709 is sufficient for review proxies |
| Two-step proxy (oiiotool + ffmpeg) | oiiotool handles EXR/DPX reading and color; ffmpeg handles H.264 encoding with faststart |
| .proxies/ sibling prefix | Same bucket as source; hidden on NFS (dotfile); no extra view needed |
| ContentType on S3 PUT | Browsers need MIME type to render presigned URL responses correctly |
| -movflags +faststart | Browser can stream proxy without downloading entire file |
| Shared file_id with exr-inspector | Enables JOINs across metadata and proxy tables without coordination |
| CNB buildpack (not custom Dockerfile) | Consistent with exr-inspector; VAST runtime SDK included automatically |
