REVIEWER PACKAGE
Version 0.2.0 · Build date 2026-02-27
Project deadweight-argon · arg0n.dev

Overview

What It Is

Argon is a browser-based motion analysis and character animation pipeline. Upload reference footage → Argon extracts skeleton pose and facial expression data → apply that data to drive a character image in your visual style.

The entire pipeline runs on serverless GPUs in the cloud (Modal.com). The interface lives on Vercel. Collaborators open arg0n.dev — nothing to install.

Designed for animator/designer workflows: non-technical users interact through a React UI, technical users can hit the API directly. CivitAI LoRAs are downloadable and cached server-side for injection into any generation or transfer request.

Three Core Operations

01
Analyze
Extract body skeleton (DWPose), 468-point face landmarks (MediaPipe), expression coefficients (16 AUs + PAD vector), and segmentation masks (BiRefNet) from any reference image or video.
02
Transfer
Apply extracted motion and expression data to a character still via LivePortrait. Supports single-frame expression transfer and full beat-synced animated sequences.
03
Generate
Text-to-image and pose-conditioned generation via ComfyUI (SDXL/Flux). CivitAI LoRAs injected at inference time via ComfyUI LoraLoader nodes.

Architecture

arg0n.dev (Vercel · Create React App)
  |
  | Vercel rewrite: /api/:path* → Modal URL
  |
  ↓
Modal.com (serverless GPU · L4/A10G)
  +-- ArgonRuntime cls (warm container, 300s idle timeout)
  |   +-- MediaPipe face mesh (init on container start)
  |   +-- ComfyUI (DWPose · LivePortrait · BiRefNet)
  |   +-- CivitAI LoRA downloader → Modal Volume
  +-- FastAPI HTTP layer (matches argon-server.js contract)
  +-- modal.Dict → cross-request job state
  +-- modal.Volume → /models/loras/ persistent LoRA cache

Key Design Decisions

No Vercel Serverless Functions
CRA project — no built-in API routes. Vercel rewrites proxy all /api/* calls directly to Modal. Zero additional infrastructure.
Dual-mode: Mock + GPU
Every endpoint returns valid mock data when ComfyUI models aren't loaded. Frontend dev continues without GPU wait. Data contract identical between mock and real.
LoRA Persistence
Modal Volume at /models/loras/ persists across deploys and restarts. Download a CivitAI LoRA once — cached forever for all sessions.
Stateless API, Stateful Jobs
modal.Dict shared KV across containers. Long jobs return jobId immediately; clients poll /api/jobs/:id.
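A minimal sketch of the client side of this contract, assuming the job record exposes `status` plus `result`/`error` fields (the exact payload shape is not specified here). The status function is injected so the loop can run without a network:

```javascript
// Poll a long-running job until it settles. `fetchStatus` is injected so the
// loop is testable; in the app it would wrap fetch(`/api/jobs/${jobId}`).
async function pollJob(jobId, fetchStatus, { intervalMs = 1000, maxAttempts = 60 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchStatus(jobId); // assumed shape: { status, result?, error? }
    if (job.status === "done") return job.result;
    if (job.status === "error") throw new Error(job.error);
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job ${jobId} timed out after ${maxAttempts} polls`);
}
```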

Tech Stack

Layer          Technology                                    Notes
Frontend       React 18, ReactFlow, Framer Motion, Three.js  CRA — Vercel
API client     argon-client.js                               Relative /api/* in prod, localhost:7860 in dev
GPU backend    Modal.com (Python)                            FastAPI + modal.asgi_app() — L4 default
Inference      ComfyUI + custom nodes                        DWPose, LivePortraitKJ, BiRefNet-ZHO
Face analysis  MediaPipe                                     468-point face mesh, always available
LoRA source    CivitAI API                                   Download by versionId, cache to Modal Volume
Job state      modal.Dict                                    Cross-container shared KV
Model storage  modal.Volume                                  Persistent /models/ across deploys
Local dev      argon-server.js (Node.js)                     Zero-dep mock on port 7860

Capabilities

Expression System

Core data primitive: ExpressionCoefficients — 16 continuous 0–1 values extracted from face analysis, used to drive character animation.

// ExpressionCoefficients
{
  jaw, mouthOpen, mouthCornerUp, mouthCornerDown,
  lipPucker, lipStretch,
  browInner, browOuter, browFurrow,
  eyeWide, eyeClose, eyeSquint,
  cheekRaise, noseFlair, noseWrinkle,
  intensity,  // overall expression energy

  emotionVector: {
    valence:   -1.0 to 1.0,  // +/- sentiment
    arousal:    0.0 to 1.0,  // calm to excited
    dominance:  0.0 to 1.0,  // passive to dominant
  },

  emotionClass: {  // softmax
    neutral, happy, sad, angry,
    surprised, fearful, disgusted, contempt
  }
}
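Because every coefficient is a continuous 0–1 value, two ExpressionCoefficients objects can be interpolated directly. A hypothetical helper (not part of the Argon API) for easing between keyframe expressions:

```javascript
// Linearly blend the numeric fields of two ExpressionCoefficients objects.
// Nested objects (emotionVector, emotionClass) are skipped.
// t = 0 returns a's values, t = 1 returns b's.
function blendExpressions(a, b, t) {
  const out = {};
  for (const key of Object.keys(a)) {
    if (typeof a[key] === "number" && typeof b[key] === "number") {
      out[key] = a[key] * (1 - t) + b[key] * t;
    }
  }
  return out;
}
```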

Beat Sync

Pass a BeatEmotionCurve into sequence transfer — expression intensity peaks on downbeats.

// BeatEmotionCurve
{
  trackDurationMs: 180000,
  beats: [
    { timeMs: 0,    strength: 1.0 },
    { timeMs: 500,  strength: 0.8 },
    { timeMs: 1000, strength: 1.0 },
    ...
  ]
}
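The sequence endpoint is described as multiplying each frame's expression coefficients by a proximity-to-beat scalar. The exact falloff Argon uses is not specified here; a sketch assuming a linear falloff over a `windowMs` window and a `floor` intensity between beats:

```javascript
// Compute the expression-intensity multiplier for one frame timestamp.
// Peaks at beat.strength exactly on a beat, decays linearly to `floor`
// at `windowMs` away, and stays at `floor` between beats.
function beatScalar(timeMs, beats, { windowMs = 250, floor = 0.4 } = {}) {
  let best = floor;
  for (const beat of beats) {
    const dist = Math.abs(timeMs - beat.timeMs);
    if (dist < windowMs) {
      const proximity = 1 - dist / windowMs; // 1.0 on the beat, → 0 at the window edge
      best = Math.max(best, floor + (beat.strength - floor) * proximity);
    }
  }
  return best;
}
```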

LoRA Injection

// any generate/transfer request
{
  loraPaths: [
    "/models/loras/style_12345.safetensors",
    "/models/loras/face_67890.safetensors"
  ]
}

Theoretical Workflows

Character Animation from Reference Video

Drive a character with a real performer's motion and expression, in your visual style.

1

Analyze → Motion

Upload reference video. Argon extracts body skeleton + expression per frame. Returns MotionTrack ID.

2

Transfer → Sequence

Upload character still. Paste MotionTrack ID. Add optional style prompt + LoRA paths.

3

Poll → Collect Frames

Argon renders each frame: DWPose conditioning + LivePortrait expression drive.

4

Composite / Export

Use frames in your pipeline. Or trigger BiRefNet segmentation for alpha isolation.

Output: Frame-by-frame animation in your character's style, driven by real human performance.
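The four steps above can be sketched against an injected API client (`api.post` and `api.waitForJob` are hypothetical names; in the app this role is played by argon-client.js):

```javascript
// Workflow end to end: analyze reference video → transfer onto character
// still → poll for rendered frames. Endpoint paths follow the API Reference.
async function animateFromReference(api, referenceVideo, characterStill, loraPaths) {
  // 1. Analyze: extract a MotionTrack from the reference video (async job)
  const { trackId, jobId } = await api.post("/api/analyze/motion", { video: referenceVideo });
  await api.waitForJob(jobId);
  // 2. Transfer: drive the character still with the extracted track (async job)
  const seq = await api.post("/api/transfer/sequence", {
    image: characterStill,
    trackId,
    loraPaths,
  });
  // 3. Poll until every frame is rendered
  const { frames } = await api.waitForJob(seq.jobId);
  // 4. Hand frames to the compositing / export pipeline
  return frames;
}
```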

Beat-Synced Music Video

Character expressiveness breathes with music — emotion peaks on downbeats.

1

Export Beat Timestamps

From DAW or BPM tool: array of { timeMs, strength } per beat.

2

Analyze → Motion

Upload reference footage or use mock MotionTrack.

3

Transfer → Sequence with beatCurve

Argon multiplies each frame's expression coefficients by proximity-to-beat scalar.

Output: Expression energy synced to music rhythm — performance energy, not lip sync.

CivitAI Style Injection

Generate frames in any visual style. Download once, cached forever.

1

Find LoRA on CivitAI

Grab versionId from model URL: civitai.com/models/XXXX?modelVersionId=YYYYY

2

POST /api/loras/download

Pass { versionId }. Downloads, caches to Modal Volume. ~30 seconds. Returns file path.

3

Use in Generate or Transfer

Pass path as loraPaths: [path]. ComfyUI LoraLoader injects at inference time.

LoRA library builds up over time on Modal Volume — shared across all team sessions.
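Steps 1–3 as a fetch sketch. The request field (`versionId`) and `loraPaths` follow the doc; the `filePath` response key is an assumption:

```javascript
// Download a CivitAI LoRA to the Modal Volume, then use its cached path in a
// generation request. Returns the generation response (an async job to poll).
async function downloadAndUseLora(versionId, prompt) {
  const download = await fetch("/api/loras/download", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ versionId }),
  });
  // assumed response key; path pattern per doc: /models/loras/style_12345.safetensors
  const { filePath } = await download.json();
  const generate = await fetch("/api/generate/image", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt, loraPaths: [filePath] }),
  });
  return generate.json(); // async endpoint — assumed to return a jobId to poll
}
```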

Character Sheet Generation

Consistent identity across all pose angles.

1

Analyze → Face

Upload reference portrait. Returns identityHash string.

2

Generate → Pose (FRONT)

Character prompt + LoRAs + identityLock: hash.

3

Repeat Each Angle

THREE_QUARTER, SIDE, BACK — same identityLock hash.

Output: Front / 3/4 / side / back with consistent character identity.
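A hypothetical request builder for steps 2–3 (the angle names and `identityLock` field follow the workflow text; the exact request schema lives in ARGON-ANALYSIS-SCHEMA.md):

```javascript
// Build one pose-generation request per sheet angle, reusing the same
// identityLock hash (from POST /api/analyze/face) so identity stays consistent.
const SHEET_ANGLES = ["FRONT", "THREE_QUARTER", "SIDE", "BACK"];

function buildSheetRequests(prompt, loraPaths, identityHash) {
  return SHEET_ANGLES.map((angle) => ({
    prompt,
    loraPaths,
    angle,
    identityLock: identityHash,
  }));
}
```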

Segmentation for Compositing

Clean alpha masks for compositing pipelines.

1

Analyze → Segment

Upload image. Select region: face, hair, body, or full.

2

Receive Mask

BiRefNet returns base64 PNG alpha mask. Works on photo-real and illustrated styles.

3

Composite

Use in After Effects, DaVinci, or any compositing tool.

BiRefNet is strong on illustration/anime art — better than roto on non-photorealistic content.

API Reference

All endpoints at /api/* — Vercel rewrites proxy to Modal. Auth-free during dev phase.
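The rewrite described above amounts to a vercel.json entry like the following sketch (the Modal URL is a placeholder):

```json
{
  "rewrites": [
    {
      "source": "/api/:path*",
      "destination": "https://<your-modal-app>.modal.run/api/:path*"
    }
  ]
}
```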

Method  Endpoint                  Sync/Async  Description
GET     /api/health               sync        Backend status
POST    /api/analyze/motion       async       Extract MotionTrack from video/image — returns trackId + jobId
POST    /api/analyze/expression   sync        16 expression coefficients + emotion vector
POST    /api/analyze/face         sync        468 MediaPipe landmarks + ARKit 52 blendshapes
POST    /api/analyze/segment      sync        BiRefNet mask for face/hair/body/full
POST    /api/transfer/expression  sync        LivePortrait single-frame expression drive
POST    /api/transfer/sequence    async       Beat-synced animated sequence with LoRA injection
POST    /api/generate/image       async       Text-to-image via ComfyUI (SDXL/Flux) + LoRA
POST    /api/generate/video       async       Video generation (Dream Machine backend)
POST    /api/generate/pose        async       Pose-conditioned character generation
POST    /api/loras/download       sync        Download CivitAI LoRA by versionId to Volume
GET     /api/loras                sync        List cached LoRAs
GET     /api/jobs/:jobId          sync        Poll job: queued / running / done / error
GET     /api/events               SSE         Real-time job updates
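As an alternative to polling, /api/events streams job updates over SSE. A browser-side sketch using the standard EventSource API (the payload shape is an assumption):

```javascript
// Subscribe to real-time job updates from /api/events. Returns the EventSource
// so the caller can close() it once the job of interest settles.
function watchJobs(onUpdate) {
  const source = new EventSource("/api/events");
  source.onmessage = (event) => {
    // assumed payload shape: { jobId, status, progress? }
    onUpdate(JSON.parse(event.data));
  };
  return source;
}
```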

Build Status

Infrastructure
DONE Modal GPU routing
DONE Vercel rewrite proxy
DONE Modal Volume (LoRA cache)
DONE modal.Dict job state
DONE SSE event stream
DONE Async job polling
DONE argon-client.js
DONE CivitAI LoRA download
Analysis Models
DONE MediaPipe face mesh
DONE Expression extraction (16 AUs)
DONE PAD emotion vector
DONE Mock MotionTrack (dev)
DONE DWPose body skeleton
DONE BiRefNet segmentation
DONE ARKit blendshapes (full)
DONE Beat proximity modulation
Generation / Transfer
DONE LivePortrait drive
DONE ComfyUI image gen
DONE Pose-conditioned gen
WIP Sequence rendering
DONE React UI (pipeline wired)
DONE Video gen (Dream Machine)
All core engines live; sequence rendering (multi-frame) is in progress.
v0.2.0 status: All analysis models live (DWPose, BiRefNet, ARKit, MediaPipe). ComfyUI workflows deployed (image gen, pose-conditioned, LivePortrait drive). React pipeline UI fully wired — CharacterSheet, CharacterPipeline, ScriptStoryboard, and VideoEditor all call argon-server directly via argon-client. Next: live end-to-end test against Modal GPU, sequence rendering multi-frame optimization.

Repository

GitHub: ChopperD00/deadweight-argon · branch: main

File                      Purpose                                                                           Lines
modal_server.py           Modal GPU backend. FastAPI + ArgonRuntime class + all endpoints.                  ~400
comfy_workflows.py        ComfyUI workflow builders: DWPose extraction, LivePortrait drive, SDXL/Flux gen.  ~300
comfy_helpers.py          Shared ComfyUI utilities: node builders, polling, image encode/decode.            ~150
argon-server.js           Zero-dep Node.js mock server for local dev. Mirrors modal_server.py API exactly.  ~700
src/lib/argon-client.js   React API client. Env-aware, full interface for all endpoints + SSE + polling.    ~250
vercel.json               CRA build config + /api/* rewrite to Modal URL.                                   ~20
.env.example              All required env vars documented.                                                 ~20
DEPLOYMENT.md             Step-by-step: modal setup → deploy → update vercel.json.                          ~100
USER-GUIDE.md             Non-technical guide with 6 theoretical workflows + glossary.                      ~250
ARGON-ANALYSIS-SCHEMA.md  Full TypeScript interface spec for all data contracts.                            ~500