Adding -insecure flag to hermes dashboard
This commit is contained in:
@@ -0,0 +1,3 @@
|
||||
---
|
||||
description: Creative content generation — ASCII art, hand-drawn style diagrams, and visual design tools.
|
||||
---
|
||||
@@ -0,0 +1,147 @@
|
||||
---
|
||||
name: architecture-diagram
|
||||
description: Generate dark-themed SVG diagrams of software systems and cloud infrastructure as standalone HTML files with inline SVG graphics. Semantic component colors (cyan=frontend, emerald=backend, violet=database, amber=cloud/AWS, rose=security, orange=message bus), JetBrains Mono font, grid background. Best suited for software architecture, cloud/VPC topology, microservice maps, service-mesh diagrams, database + API layer diagrams, security groups, message buses — anything that fits a tech-infra deck with a dark aesthetic. If a more specialized diagramming skill exists for the subject (scientific, educational, hand-drawn, animated, etc.), prefer that — otherwise this skill can also serve as a general-purpose SVG diagram fallback. Based on Cocoon AI's architecture-diagram-generator (MIT).
|
||||
version: 1.0.0
|
||||
author: Cocoon AI (hello@cocoon-ai.com), ported by Hermes Agent
|
||||
license: MIT
|
||||
dependencies: []
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [architecture, diagrams, SVG, HTML, visualization, infrastructure, cloud]
|
||||
related_skills: [concept-diagrams, excalidraw]
|
||||
---
|
||||
|
||||
# Architecture Diagram Skill
|
||||
|
||||
Generate professional, dark-themed technical architecture diagrams as standalone HTML files with inline SVG graphics. No external tools, no API keys, no rendering libraries — just write the HTML file and open it in a browser.
|
||||
|
||||
## Scope
|
||||
|
||||
**Best suited for:**
|
||||
- Software system architecture (frontend / backend / database layers)
|
||||
- Cloud infrastructure (VPC, regions, subnets, managed services)
|
||||
- Microservice / service-mesh topology
|
||||
- Database + API map, deployment diagrams
|
||||
- Anything with a tech-infra subject that fits a dark, grid-backed aesthetic
|
||||
|
||||
**Look elsewhere first for:**
|
||||
- Physics, chemistry, math, biology, or other scientific subjects
|
||||
- Physical objects (vehicles, hardware, anatomy, cross-sections)
|
||||
- Floor plans, narrative journeys, educational / textbook-style visuals
|
||||
- Hand-drawn whiteboard sketches (consider `excalidraw`)
|
||||
- Animated explainers (consider an animation skill)
|
||||
|
||||
If a more specialized skill is available for the subject, prefer that. If none fits, this skill can also serve as a general SVG diagram fallback — the output will just carry the dark tech aesthetic described below.
|
||||
|
||||
Based on [Cocoon AI's architecture-diagram-generator](https://github.com/Cocoon-AI/architecture-diagram-generator) (MIT).
|
||||
|
||||
## Workflow
|
||||
|
||||
1. User describes their system architecture (components, connections, technologies)
|
||||
2. Generate the HTML file following the design system below
|
||||
3. Save with `write_file` to a `.html` file (e.g. `~/architecture-diagram.html`)
|
||||
4. User opens in any browser — works offline, no dependencies
|
||||
|
||||
### Output Location
|
||||
|
||||
Save diagrams to a user-specified path, or default to the current working directory:
|
||||
```
|
||||
./[project-name]-architecture.html
|
||||
```
|
||||
|
||||
### Preview
|
||||
|
||||
After saving, suggest the user open it:
|
||||
```bash
|
||||
# macOS
|
||||
open ./my-architecture.html
|
||||
# Linux
|
||||
xdg-open ./my-architecture.html
|
||||
```
|
||||
|
||||
## Design System & Visual Language
|
||||
|
||||
### Color Palette (Semantic Mapping)
|
||||
|
||||
Use specific `rgba` fills and hex strokes to categorize components:
|
||||
|
||||
| Component Type | Fill (rgba) | Stroke (Hex) |
|
||||
| :--- | :--- | :--- |
|
||||
| **Frontend** | `rgba(8, 51, 68, 0.4)` | `#22d3ee` (cyan-400) |
|
||||
| **Backend** | `rgba(6, 78, 59, 0.4)` | `#34d399` (emerald-400) |
|
||||
| **Database** | `rgba(76, 29, 149, 0.4)` | `#a78bfa` (violet-400) |
|
||||
| **AWS/Cloud** | `rgba(120, 53, 15, 0.3)` | `#fbbf24` (amber-400) |
|
||||
| **Security** | `rgba(136, 19, 55, 0.4)` | `#fb7185` (rose-400) |
|
||||
| **Message Bus** | `rgba(251, 146, 60, 0.3)` | `#fb923c` (orange-400) |
|
||||
| **External** | `rgba(30, 41, 59, 0.5)` | `#94a3b8` (slate-400) |
|
||||
|
||||
### Typography & Background
|
||||
- **Font:** JetBrains Mono (Monospace), loaded from Google Fonts
|
||||
- **Sizes:** 12px (Names), 9px (Sublabels), 8px (Annotations), 7px (Tiny labels)
|
||||
- **Background:** Slate-950 (`#020617`) with a subtle 40px grid pattern
|
||||
|
||||
```svg
|
||||
<!-- Background Grid Pattern -->
|
||||
<pattern id="grid" width="40" height="40" patternUnits="userSpaceOnUse">
|
||||
<path d="M 40 0 L 0 0 0 40" fill="none" stroke="#1e293b" stroke-width="0.5"/>
|
||||
</pattern>
|
||||
```
|
||||
|
||||
## Technical Implementation Details
|
||||
|
||||
### Component Rendering
|
||||
Components are rounded rectangles (`rx="6"`) with 1.5px strokes. To prevent arrows from showing through semi-transparent fills, use a **double-rect masking technique**:
|
||||
1. Draw an opaque background rect (`#0f172a`)
|
||||
2. Draw the semi-transparent styled rect on top
|
||||
|
||||
### Connection Rules
|
||||
- **Z-Order:** Draw arrows *early* in the SVG (after the grid) so they render behind component boxes
|
||||
- **Arrowheads:** Defined via SVG markers
|
||||
- **Security Flows:** Use dashed lines in rose color (`#fb7185`)
|
||||
- **Boundaries:**
|
||||
- *Security Groups:* Dashed (`4,4`), rose color
|
||||
- *Regions:* Large dashed (`8,4`), amber color, `rx="12"`
|
||||
|
||||
### Spacing & Layout Logic
|
||||
- **Standard Height:** 60px (Services); 80-120px (Large components)
|
||||
- **Vertical Gap:** Minimum 40px between components
|
||||
- **Message Buses:** Must be placed *in the gap* between services, not overlapping them
|
||||
- **Legend Placement:** **CRITICAL.** Must be placed outside all boundary boxes. Calculate the lowest Y-coordinate of all boundaries and place the legend at least 20px below it.
|
||||
|
||||
## Document Structure
|
||||
|
||||
The generated HTML file follows a four-part layout:
|
||||
1. **Header:** Title with a pulsing dot indicator and subtitle
|
||||
2. **Main SVG:** The diagram contained within a rounded border card
|
||||
3. **Summary Cards:** A grid of three cards below the diagram for high-level details
|
||||
4. **Footer:** Minimal metadata
|
||||
|
||||
### Info Card Pattern
|
||||
```html
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<div class="card-dot cyan"></div>
|
||||
<h3>Title</h3>
|
||||
</div>
|
||||
<ul>
|
||||
<li>• Item one</li>
|
||||
<li>• Item two</li>
|
||||
</ul>
|
||||
</div>
|
||||
```
|
||||
|
||||
## Output Requirements
|
||||
- **Single File:** One self-contained `.html` file
|
||||
- **No External Dependencies:** All CSS and SVG must be inline (except Google Fonts)
|
||||
- **No JavaScript:** Use pure CSS for any animations (like pulsing dots)
|
||||
- **Compatibility:** Must render correctly in any modern web browser
|
||||
|
||||
## Template Reference
|
||||
|
||||
Load the full HTML template for the exact structure, CSS, and SVG component examples:
|
||||
|
||||
```
|
||||
skill_view(name="architecture-diagram", file_path="templates/template.html")
|
||||
```
|
||||
|
||||
The template contains working examples of every component type (frontend, backend, database, cloud, security), arrow styles (standard, dashed, curved), security groups, region boundaries, and the legend — use it as your structural reference when generating diagrams.
|
||||
@@ -0,0 +1,319 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>[PROJECT NAME] Architecture Diagram</title>
|
||||
<link href="https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet">
|
||||
<style>
|
||||
* {
|
||||
margin: 0;
|
||||
padding: 0;
|
||||
box-sizing: border-box;
|
||||
}
|
||||
|
||||
body {
|
||||
font-family: 'JetBrains Mono', monospace;
|
||||
background: #020617;
|
||||
min-height: 100vh;
|
||||
padding: 2rem;
|
||||
color: white;
|
||||
}
|
||||
|
||||
.container {
|
||||
max-width: 1200px;
|
||||
margin: 0 auto;
|
||||
}
|
||||
|
||||
.header {
|
||||
margin-bottom: 2rem;
|
||||
}
|
||||
|
||||
.header-row {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 1rem;
|
||||
margin-bottom: 0.5rem;
|
||||
}
|
||||
|
||||
.pulse-dot {
|
||||
width: 12px;
|
||||
height: 12px;
|
||||
background: #22d3ee;
|
||||
border-radius: 50%;
|
||||
animation: pulse 2s infinite;
|
||||
}
|
||||
|
||||
@keyframes pulse {
|
||||
0%, 100% { opacity: 1; }
|
||||
50% { opacity: 0.5; }
|
||||
}
|
||||
|
||||
h1 {
|
||||
font-size: 1.5rem;
|
||||
font-weight: 700;
|
||||
letter-spacing: -0.025em;
|
||||
}
|
||||
|
||||
.subtitle {
|
||||
color: #94a3b8;
|
||||
font-size: 0.875rem;
|
||||
margin-left: 1.75rem;
|
||||
}
|
||||
|
||||
.diagram-container {
|
||||
background: rgba(15, 23, 42, 0.5);
|
||||
border-radius: 1rem;
|
||||
border: 1px solid #1e293b;
|
||||
padding: 1.5rem;
|
||||
overflow-x: auto;
|
||||
}
|
||||
|
||||
svg {
|
||||
width: 100%;
|
||||
min-width: 900px;
|
||||
display: block;
|
||||
}
|
||||
|
||||
.cards {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
|
||||
gap: 1rem;
|
||||
margin-top: 2rem;
|
||||
}
|
||||
|
||||
.card {
|
||||
background: rgba(15, 23, 42, 0.5);
|
||||
border-radius: 0.75rem;
|
||||
border: 1px solid #1e293b;
|
||||
padding: 1.25rem;
|
||||
}
|
||||
|
||||
.card-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 0.5rem;
|
||||
margin-bottom: 0.75rem;
|
||||
}
|
||||
|
||||
.card-dot {
|
||||
width: 8px;
|
||||
height: 8px;
|
||||
border-radius: 50%;
|
||||
}
|
||||
|
||||
.card-dot.cyan { background: #22d3ee; }
|
||||
.card-dot.emerald { background: #34d399; }
|
||||
.card-dot.violet { background: #a78bfa; }
|
||||
.card-dot.amber { background: #fbbf24; }
|
||||
.card-dot.rose { background: #fb7185; }
|
||||
|
||||
.card h3 {
|
||||
font-size: 0.875rem;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.card ul {
|
||||
list-style: none;
|
||||
color: #94a3b8;
|
||||
font-size: 0.75rem;
|
||||
}
|
||||
|
||||
.card li {
|
||||
margin-bottom: 0.375rem;
|
||||
}
|
||||
|
||||
.footer {
|
||||
text-align: center;
|
||||
margin-top: 1.5rem;
|
||||
color: #475569;
|
||||
font-size: 0.75rem;
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="container">
|
||||
<!-- Header -->
|
||||
<div class="header">
|
||||
<div class="header-row">
|
||||
<div class="pulse-dot"></div>
|
||||
<h1>[PROJECT NAME] Architecture</h1>
|
||||
</div>
|
||||
<p class="subtitle">[Subtitle description]</p>
|
||||
</div>
|
||||
|
||||
<!-- Main Diagram -->
|
||||
<div class="diagram-container">
|
||||
<svg viewBox="0 0 1000 680">
|
||||
<!-- Definitions -->
|
||||
<defs>
|
||||
<marker id="arrowhead" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto">
|
||||
<polygon points="0 0, 10 3.5, 0 7" fill="#64748b" />
|
||||
</marker>
|
||||
<pattern id="grid" width="40" height="40" patternUnits="userSpaceOnUse">
|
||||
<path d="M 40 0 L 0 0 0 40" fill="none" stroke="#1e293b" stroke-width="0.5"/>
|
||||
</pattern>
|
||||
</defs>
|
||||
|
||||
<!-- Background Grid -->
|
||||
<rect width="100%" height="100%" fill="url(#grid)" />
|
||||
|
||||
<!-- =================================================================
|
||||
COMPONENT EXAMPLES - Copy and customize these patterns
|
||||
================================================================= -->
|
||||
|
||||
<!-- External/Generic Component -->
|
||||
<rect x="30" y="280" width="100" height="50" rx="6" fill="rgba(30, 41, 59, 0.5)" stroke="#94a3b8" stroke-width="1.5"/>
|
||||
<text x="80" y="300" fill="white" font-size="11" font-weight="600" text-anchor="middle">Users</text>
|
||||
<text x="80" y="316" fill="#94a3b8" font-size="9" text-anchor="middle">Browser/Mobile</text>
|
||||
|
||||
<!-- Security Component -->
|
||||
<rect x="30" y="80" width="100" height="60" rx="6" fill="rgba(136, 19, 55, 0.4)" stroke="#fb7185" stroke-width="1.5"/>
|
||||
<text x="80" y="105" fill="white" font-size="11" font-weight="600" text-anchor="middle">Auth Provider</text>
|
||||
<text x="80" y="121" fill="#94a3b8" font-size="9" text-anchor="middle">OAuth 2.0</text>
|
||||
|
||||
<!-- Region/Cloud Boundary -->
|
||||
<rect x="160" y="40" width="820" height="620" rx="12" fill="rgba(251, 191, 36, 0.05)" stroke="#fbbf24" stroke-width="1" stroke-dasharray="8,4"/>
|
||||
<text x="172" y="58" fill="#fbbf24" font-size="10" font-weight="600">AWS Region: us-west-2</text>
|
||||
|
||||
<!-- AWS/Cloud Service -->
|
||||
<rect x="200" y="280" width="110" height="50" rx="6" fill="rgba(120, 53, 15, 0.3)" stroke="#fbbf24" stroke-width="1.5"/>
|
||||
<text x="255" y="300" fill="white" font-size="11" font-weight="600" text-anchor="middle">CloudFront</text>
|
||||
<text x="255" y="316" fill="#94a3b8" font-size="9" text-anchor="middle">CDN</text>
|
||||
|
||||
<!-- Multi-line AWS Component (S3 Buckets example) -->
|
||||
<rect x="200" y="380" width="110" height="100" rx="6" fill="rgba(120, 53, 15, 0.3)" stroke="#fbbf24" stroke-width="1.5"/>
|
||||
<text x="255" y="400" fill="white" font-size="11" font-weight="600" text-anchor="middle">S3 Buckets</text>
|
||||
<text x="255" y="420" fill="#94a3b8" font-size="8" text-anchor="middle">• bucket-one</text>
|
||||
<text x="255" y="434" fill="#94a3b8" font-size="8" text-anchor="middle">• bucket-two</text>
|
||||
<text x="255" y="448" fill="#94a3b8" font-size="8" text-anchor="middle">• bucket-three</text>
|
||||
<text x="255" y="466" fill="#fbbf24" font-size="7" text-anchor="middle">OAI Protected</text>
|
||||
|
||||
<!-- Security Group (dashed boundary) -->
|
||||
<rect x="350" y="265" width="120" height="80" rx="8" fill="transparent" stroke="#fb7185" stroke-width="1" stroke-dasharray="4,4"/>
|
||||
<text x="358" y="279" fill="#fb7185" font-size="8">sg-name :port</text>
|
||||
|
||||
<!-- Component inside security group -->
|
||||
<rect x="360" y="280" width="100" height="50" rx="6" fill="rgba(120, 53, 15, 0.3)" stroke="#fbbf24" stroke-width="1.5"/>
|
||||
<text x="410" y="300" fill="white" font-size="11" font-weight="600" text-anchor="middle">Load Balancer</text>
|
||||
<text x="410" y="316" fill="#94a3b8" font-size="9" text-anchor="middle">HTTPS :443</text>
|
||||
|
||||
<!-- Backend Component -->
|
||||
<rect x="510" y="280" width="110" height="50" rx="6" fill="rgba(6, 78, 59, 0.4)" stroke="#34d399" stroke-width="1.5"/>
|
||||
<text x="565" y="300" fill="white" font-size="11" font-weight="600" text-anchor="middle">API Server</text>
|
||||
<text x="565" y="316" fill="#94a3b8" font-size="9" text-anchor="middle">FastAPI :8000</text>
|
||||
|
||||
<!-- Database Component -->
|
||||
<rect x="700" y="280" width="120" height="50" rx="6" fill="rgba(76, 29, 149, 0.4)" stroke="#a78bfa" stroke-width="1.5"/>
|
||||
<text x="760" y="300" fill="white" font-size="11" font-weight="600" text-anchor="middle">Database</text>
|
||||
<text x="760" y="316" fill="#94a3b8" font-size="9" text-anchor="middle">PostgreSQL</text>
|
||||
|
||||
<!-- Frontend Component -->
|
||||
<rect x="200" y="520" width="200" height="110" rx="8" fill="rgba(8, 51, 68, 0.4)" stroke="#22d3ee" stroke-width="1.5"/>
|
||||
<text x="300" y="545" fill="white" font-size="12" font-weight="600" text-anchor="middle">Frontend</text>
|
||||
<text x="300" y="565" fill="#94a3b8" font-size="9" text-anchor="middle">React + TypeScript</text>
|
||||
<text x="300" y="580" fill="#94a3b8" font-size="9" text-anchor="middle">Additional detail</text>
|
||||
<text x="300" y="595" fill="#94a3b8" font-size="9" text-anchor="middle">More info</text>
|
||||
<text x="300" y="615" fill="#22d3ee" font-size="8" text-anchor="middle">domain.example.com</text>
|
||||
|
||||
<!-- =================================================================
|
||||
ARROW EXAMPLES
|
||||
================================================================= -->
|
||||
|
||||
<!-- Standard arrow with label -->
|
||||
<line x1="130" y1="305" x2="198" y2="305" stroke="#22d3ee" stroke-width="1.5" marker-end="url(#arrowhead)"/>
|
||||
<text x="164" y="299" fill="#94a3b8" font-size="9" text-anchor="middle">HTTPS</text>
|
||||
|
||||
<!-- Simple arrow (no label) -->
|
||||
<line x1="310" y1="305" x2="358" y2="305" stroke="#22d3ee" stroke-width="1.5" marker-end="url(#arrowhead)"/>
|
||||
|
||||
<!-- Vertical arrow -->
|
||||
<line x1="255" y1="330" x2="255" y2="378" stroke="#fbbf24" stroke-width="1.5" marker-end="url(#arrowhead)"/>
|
||||
<text x="270" y="358" fill="#94a3b8" font-size="9">OAI</text>
|
||||
|
||||
<!-- Dashed arrow (for auth/security flows) -->
|
||||
<line x1="460" y1="305" x2="508" y2="305" stroke="#34d399" stroke-width="1.5" marker-end="url(#arrowhead)"/>
|
||||
<line x1="620" y1="305" x2="698" y2="305" stroke="#a78bfa" stroke-width="1.5" marker-end="url(#arrowhead)"/>
|
||||
<text x="655" y="299" fill="#94a3b8" font-size="9">TLS</text>
|
||||
|
||||
<!-- Curved path for auth flow -->
|
||||
<path d="M 80 140 L 80 200 Q 80 220 100 220 L 200 220 Q 220 220 220 240 L 220 278" fill="none" stroke="#fb7185" stroke-width="1.5" stroke-dasharray="5,5"/>
|
||||
<text x="150" y="210" fill="#fb7185" font-size="8">JWT + PKCE</text>
|
||||
|
||||
<!-- =================================================================
|
||||
LEGEND
|
||||
================================================================= -->
|
||||
<text x="720" y="70" fill="white" font-size="10" font-weight="600">Legend</text>
|
||||
|
||||
<rect x="720" y="82" width="16" height="10" rx="2" fill="rgba(8, 51, 68, 0.4)" stroke="#22d3ee" stroke-width="1"/>
|
||||
<text x="742" y="90" fill="#94a3b8" font-size="8">Frontend</text>
|
||||
|
||||
<rect x="720" y="98" width="16" height="10" rx="2" fill="rgba(6, 78, 59, 0.4)" stroke="#34d399" stroke-width="1"/>
|
||||
<text x="742" y="106" fill="#94a3b8" font-size="8">Backend</text>
|
||||
|
||||
<rect x="720" y="114" width="16" height="10" rx="2" fill="rgba(120, 53, 15, 0.3)" stroke="#fbbf24" stroke-width="1"/>
|
||||
<text x="742" y="122" fill="#94a3b8" font-size="8">Cloud Service</text>
|
||||
|
||||
<rect x="720" y="130" width="16" height="10" rx="2" fill="rgba(76, 29, 149, 0.4)" stroke="#a78bfa" stroke-width="1"/>
|
||||
<text x="742" y="138" fill="#94a3b8" font-size="8">Database</text>
|
||||
|
||||
<rect x="720" y="146" width="16" height="10" rx="2" fill="rgba(136, 19, 55, 0.4)" stroke="#fb7185" stroke-width="1"/>
|
||||
<text x="742" y="154" fill="#94a3b8" font-size="8">Security</text>
|
||||
|
||||
<line x1="720" y1="168" x2="736" y2="168" stroke="#fb7185" stroke-width="1" stroke-dasharray="3,3"/>
|
||||
<text x="742" y="171" fill="#94a3b8" font-size="8">Auth Flow</text>
|
||||
|
||||
<rect x="720" y="178" width="16" height="10" rx="2" fill="transparent" stroke="#fb7185" stroke-width="1" stroke-dasharray="3,3"/>
|
||||
<text x="742" y="186" fill="#94a3b8" font-size="8">Security Group</text>
|
||||
</svg>
|
||||
</div>
|
||||
|
||||
<!-- Info Cards -->
|
||||
<div class="cards">
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<div class="card-dot rose"></div>
|
||||
<h3>Card Title 1</h3>
|
||||
</div>
|
||||
<ul>
|
||||
<li>• Item one</li>
|
||||
<li>• Item two</li>
|
||||
<li>• Item three</li>
|
||||
<li>• Item four</li>
|
||||
</ul>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<div class="card-dot amber"></div>
|
||||
<h3>Card Title 2</h3>
|
||||
</div>
|
||||
<ul>
|
||||
<li>• Item one</li>
|
||||
<li>• Item two</li>
|
||||
<li>• Item three</li>
|
||||
<li>• Item four</li>
|
||||
</ul>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-header">
|
||||
<div class="card-dot violet"></div>
|
||||
<h3>Card Title 3</h3>
|
||||
</div>
|
||||
<ul>
|
||||
<li>• Item one</li>
|
||||
<li>• Item two</li>
|
||||
<li>• Item three</li>
|
||||
<li>• Item four</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Footer -->
|
||||
<p class="footer">
|
||||
[Project Name] • [Additional metadata]
|
||||
</p>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
@@ -0,0 +1,321 @@
|
||||
---
|
||||
name: ascii-art
|
||||
description: Generate ASCII art using pyfiglet (571 fonts), cowsay, boxes, toilet, image-to-ascii, remote APIs (asciified, ascii.co.uk), and LLM fallback. No API keys required.
|
||||
version: 4.0.0
|
||||
author: 0xbyt4, Hermes Agent
|
||||
license: MIT
|
||||
dependencies: []
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [ASCII, Art, Banners, Creative, Unicode, Text-Art, pyfiglet, figlet, cowsay, boxes]
|
||||
related_skills: [excalidraw]
|
||||
|
||||
---
|
||||
|
||||
# ASCII Art Skill
|
||||
|
||||
Multiple tools for different ASCII art needs. All tools are local CLI programs or free REST APIs — no API keys required.
|
||||
|
||||
## Tool 1: Text Banners (pyfiglet — local)
|
||||
|
||||
Render text as large ASCII art banners. 571 built-in fonts.
|
||||
|
||||
### Setup
|
||||
|
||||
```bash
|
||||
pip install pyfiglet --break-system-packages -q
|
||||
```
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
python3 -m pyfiglet "YOUR TEXT" -f slant
|
||||
python3 -m pyfiglet "TEXT" -f doom -w 80 # Set width
|
||||
python3 -m pyfiglet --list_fonts # List all 571 fonts
|
||||
```
|
||||
|
||||
### Recommended fonts
|
||||
|
||||
| Style | Font | Best for |
|
||||
|-------|------|----------|
|
||||
| Clean & modern | `slant` | Project names, headers |
|
||||
| Bold & blocky | `doom` | Titles, logos |
|
||||
| Big & readable | `big` | Banners |
|
||||
| Classic banner | `banner3` | Wide displays |
|
||||
| Compact | `small` | Subtitles |
|
||||
| Cyberpunk | `cyberlarge` | Tech themes |
|
||||
| 3D effect | `3-d` | Splash screens |
|
||||
| Gothic | `gothic` | Dramatic text |
|
||||
|
||||
### Tips
|
||||
|
||||
- Preview 2-3 fonts and let the user pick their favorite
|
||||
- Short text (1-8 chars) works best with detailed fonts like `doom` or `block`
|
||||
- Long text works better with compact fonts like `small` or `mini`
|
||||
|
||||
## Tool 2: Text Banners (asciified API — remote, no install)
|
||||
|
||||
Free REST API that converts text to ASCII art. 250+ FIGlet fonts. Returns plain text directly — no parsing needed. Use this when pyfiglet is not installed or as a quick alternative.
|
||||
|
||||
### Usage (via terminal curl)
|
||||
|
||||
```bash
|
||||
# Basic text banner (default font)
|
||||
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello+World"
|
||||
|
||||
# With a specific font
|
||||
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Slant"
|
||||
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Doom"
|
||||
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Star+Wars"
|
||||
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=3-D"
|
||||
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=Hello&font=Banner3"
|
||||
|
||||
# List all available fonts (returns JSON array)
|
||||
curl -s "https://asciified.thelicato.io/api/v2/fonts"
|
||||
```
|
||||
|
||||
### Tips
|
||||
|
||||
- URL-encode spaces as `+` in the text parameter
|
||||
- The response is plain text ASCII art — no JSON wrapping, ready to display
|
||||
- Font names are case-sensitive; use the fonts endpoint to get exact names
|
||||
- Works from any terminal with curl — no Python or pip needed
|
||||
|
||||
## Tool 3: Cowsay (Message Art)
|
||||
|
||||
Classic tool that wraps text in a speech bubble with an ASCII character.
|
||||
|
||||
### Setup
|
||||
|
||||
```bash
|
||||
sudo apt install cowsay -y # Debian/Ubuntu
|
||||
# brew install cowsay # macOS
|
||||
```
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
cowsay "Hello World"
|
||||
cowsay -f tux "Linux rules" # Tux the penguin
|
||||
cowsay -f dragon "Rawr!" # Dragon
|
||||
cowsay -f stegosaurus "Roar!" # Stegosaurus
|
||||
cowthink "Hmm..." # Thought bubble
|
||||
cowsay -l # List all characters
|
||||
```
|
||||
|
||||
### Available characters (50+)
|
||||
|
||||
`beavis.zen`, `bong`, `bunny`, `cheese`, `daemon`, `default`, `dragon`,
|
||||
`dragon-and-cow`, `elephant`, `eyes`, `flaming-skull`, `ghostbusters`,
|
||||
`hellokitty`, `kiss`, `kitty`, `koala`, `luke-koala`, `mech-and-cow`,
|
||||
`meow`, `moofasa`, `moose`, `ren`, `sheep`, `skeleton`, `small`,
|
||||
`stegosaurus`, `stimpy`, `supermilker`, `surgery`, `three-eyes`,
|
||||
`turkey`, `turtle`, `tux`, `udder`, `vader`, `vader-koala`, `www`
|
||||
|
||||
### Eye/tongue modifiers
|
||||
|
||||
```bash
|
||||
cowsay -b "Borg" # =_= eyes
|
||||
cowsay -d "Dead" # x_x eyes
|
||||
cowsay -g "Greedy" # $_$ eyes
|
||||
cowsay -p "Paranoid" # @_@ eyes
|
||||
cowsay -s "Stoned" # *_* eyes
|
||||
cowsay -w "Wired" # O_O eyes
|
||||
cowsay -e "OO" "Msg" # Custom eyes
|
||||
cowsay -T "U " "Msg" # Custom tongue
|
||||
```
|
||||
|
||||
## Tool 4: Boxes (Decorative Borders)
|
||||
|
||||
Draw decorative ASCII art borders/frames around any text. 70+ built-in designs.
|
||||
|
||||
### Setup
|
||||
|
||||
```bash
|
||||
sudo apt install boxes -y # Debian/Ubuntu
|
||||
# brew install boxes # macOS
|
||||
```
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
echo "Hello World" | boxes # Default box
|
||||
echo "Hello World" | boxes -d stone # Stone border
|
||||
echo "Hello World" | boxes -d parchment # Parchment scroll
|
||||
echo "Hello World" | boxes -d cat # Cat border
|
||||
echo "Hello World" | boxes -d dog # Dog border
|
||||
echo "Hello World" | boxes -d unicornsay # Unicorn
|
||||
echo "Hello World" | boxes -d diamonds # Diamond pattern
|
||||
echo "Hello World" | boxes -d c-cmt # C-style comment
|
||||
echo "Hello World" | boxes -d html-cmt # HTML comment
|
||||
echo "Hello World" | boxes -a c # Center text
|
||||
boxes -l # List all 70+ designs
|
||||
```
|
||||
|
||||
### Combine with pyfiglet or asciified
|
||||
|
||||
```bash
|
||||
python3 -m pyfiglet "HERMES" -f slant | boxes -d stone
|
||||
# Or without pyfiglet installed:
|
||||
curl -s "https://asciified.thelicato.io/api/v2/ascii?text=HERMES&font=Slant" | boxes -d stone
|
||||
```
|
||||
|
||||
## Tool 5: TOIlet (Colored Text Art)
|
||||
|
||||
Like pyfiglet but with ANSI color effects and visual filters. Great for terminal eye candy.
|
||||
|
||||
### Setup
|
||||
|
||||
```bash
|
||||
sudo apt install toilet toilet-fonts -y # Debian/Ubuntu
|
||||
# brew install toilet # macOS
|
||||
```
|
||||
|
||||
### Usage
|
||||
|
||||
```bash
|
||||
toilet "Hello World" # Basic text art
|
||||
toilet -f bigmono12 "Hello" # Specific font
|
||||
toilet --gay "Rainbow!" # Rainbow coloring
|
||||
toilet --metal "Metal!" # Metallic effect
|
||||
toilet -F border "Bordered" # Add border
|
||||
toilet -F border --gay "Fancy!" # Combined effects
|
||||
toilet -f pagga "Block" # Block-style font (unique to toilet)
|
||||
toilet -F list # List available filters
|
||||
```
|
||||
|
||||
### Filters
|
||||
|
||||
`crop`, `gay` (rainbow), `metal`, `flip`, `flop`, `180`, `left`, `right`, `border`
|
||||
|
||||
**Note**: toilet outputs ANSI escape codes for colors — works in terminals but may not render in all contexts (e.g., plain text files, some chat platforms).
|
||||
|
||||
## Tool 6: Image to ASCII Art
|
||||
|
||||
Convert images (PNG, JPEG, GIF, WEBP) to ASCII art.
|
||||
|
||||
### Option A: ascii-image-converter (recommended, modern)
|
||||
|
||||
```bash
|
||||
# Install
|
||||
sudo snap install ascii-image-converter
|
||||
# OR: go install github.com/TheZoraiz/ascii-image-converter@latest
|
||||
```
|
||||
|
||||
```bash
|
||||
ascii-image-converter image.png # Basic
|
||||
ascii-image-converter image.png -C # Color output
|
||||
ascii-image-converter image.png -d 60,30 # Set dimensions
|
||||
ascii-image-converter image.png -b # Braille characters
|
||||
ascii-image-converter image.png -n # Negative/inverted
|
||||
ascii-image-converter https://url/image.jpg # Direct URL
|
||||
ascii-image-converter image.png --save-txt out # Save as text
|
||||
```
|
||||
|
||||
### Option B: jp2a (lightweight, JPEG only)
|
||||
|
||||
```bash
|
||||
sudo apt install jp2a -y
|
||||
jp2a --width=80 image.jpg
|
||||
jp2a --colors image.jpg # Colorized
|
||||
```
|
||||
|
||||
## Tool 7: Search Pre-Made ASCII Art
|
||||
|
||||
Search curated ASCII art from the web. Use `terminal` with `curl`.
|
||||
|
||||
### Source A: ascii.co.uk (recommended for pre-made art)
|
||||
|
||||
Large collection of classic ASCII art organized by subject. Art is inside HTML `<pre>` tags. Fetch the page with curl, then extract art with a small Python snippet.
|
||||
|
||||
**URL pattern:** `https://ascii.co.uk/art/{subject}`
|
||||
|
||||
**Step 1 — Fetch the page:**
|
||||
|
||||
```bash
|
||||
curl -s 'https://ascii.co.uk/art/cat' -o /tmp/ascii_art.html
|
||||
```
|
||||
|
||||
**Step 2 — Extract art from pre tags:**
|
||||
|
||||
```python
|
||||
import re, html
|
||||
with open('/tmp/ascii_art.html') as f:
|
||||
text = f.read()
|
||||
arts = re.findall(r'<pre[^>]*>(.*?)</pre>', text, re.DOTALL)
|
||||
for art in arts:
|
||||
clean = re.sub(r'<[^>]+>', '', art)
|
||||
clean = html.unescape(clean).strip()
|
||||
if len(clean) > 30:
|
||||
print(clean)
|
||||
print('\n---\n')
|
||||
```
|
||||
|
||||
**Available subjects** (use as URL path):
|
||||
- Animals: `cat`, `dog`, `horse`, `bird`, `fish`, `dragon`, `snake`, `rabbit`, `elephant`, `dolphin`, `butterfly`, `owl`, `wolf`, `bear`, `penguin`, `turtle`
|
||||
- Objects: `car`, `ship`, `airplane`, `rocket`, `guitar`, `computer`, `coffee`, `beer`, `cake`, `house`, `castle`, `sword`, `crown`, `key`
|
||||
- Nature: `tree`, `flower`, `sun`, `moon`, `star`, `mountain`, `ocean`, `rainbow`
|
||||
- Characters: `skull`, `robot`, `angel`, `wizard`, `pirate`, `ninja`, `alien`
|
||||
- Holidays: `christmas`, `halloween`, `valentine`
|
||||
|
||||
**Tips:**
|
||||
- Preserve artist signatures/initials — important etiquette
|
||||
- Multiple art pieces per page — pick the best one for the user
|
||||
- Works reliably via curl, no JavaScript needed
|
||||
|
||||
### Source B: GitHub Octocat API (fun easter egg)
|
||||
|
||||
Returns a random GitHub Octocat with a wise quote. No auth needed.
|
||||
|
||||
```bash
|
||||
curl -s https://api.github.com/octocat
|
||||
```
|
||||
|
||||
## Tool 8: Fun ASCII Utilities (via curl)
|
||||
|
||||
These free services return ASCII art directly — great for fun extras.
|
||||
|
||||
### QR Codes as ASCII Art
|
||||
|
||||
```bash
|
||||
curl -s "qrenco.de/Hello+World"
|
||||
curl -s "qrenco.de/https://example.com"
|
||||
```
|
||||
|
||||
### Weather as ASCII Art
|
||||
|
||||
```bash
|
||||
curl -s "wttr.in/London" # Full weather report with ASCII graphics
|
||||
curl -s "wttr.in/Moon" # Moon phase in ASCII art
|
||||
curl -s "v2.wttr.in/London" # Detailed version
|
||||
```
|
||||
|
||||
## Tool 9: LLM-Generated Custom Art (Fallback)
|
||||
|
||||
When tools above don't have what's needed, generate ASCII art directly using these Unicode characters:
|
||||
|
||||
### Character Palette
|
||||
|
||||
**Box Drawing:** `╔ ╗ ╚ ╝ ║ ═ ╠ ╣ ╦ ╩ ╬ ┌ ┐ └ ┘ │ ─ ├ ┤ ┬ ┴ ┼ ╭ ╮ ╰ ╯`
|
||||
|
||||
**Block Elements:** `░ ▒ ▓ █ ▄ ▀ ▌ ▐ ▖ ▗ ▘ ▝ ▚ ▞`
|
||||
|
||||
**Geometric & Symbols:** `◆ ◇ ◈ ● ○ ◉ ■ □ ▲ △ ▼ ▽ ★ ☆ ✦ ✧ ◀ ▶ ◁ ▷ ⬡ ⬢ ⌂`
|
||||
|
||||
### Rules
|
||||
|
||||
- Max width: 60 characters per line (terminal-safe)
|
||||
- Max height: 15 lines for banners, 25 for scenes
|
||||
- Monospace only: output must render correctly in fixed-width fonts
|
||||
|
||||
## Decision Flow
|
||||
|
||||
1. **Text as a banner** → pyfiglet if installed, otherwise asciified API via curl
|
||||
2. **Wrap a message in fun character art** → cowsay
|
||||
3. **Add decorative border/frame** → boxes (can combine with pyfiglet/asciified)
|
||||
4. **Art of a specific thing** (cat, rocket, dragon) → ascii.co.uk via curl + parsing
|
||||
5. **Convert an image to ASCII** → ascii-image-converter or jp2a
|
||||
6. **QR code** → qrenco.de via curl
|
||||
7. **Weather/moon art** → wttr.in via curl
|
||||
8. **Something custom/creative** → LLM generation with Unicode palette
|
||||
9. **Any tool not installed** → install it, or fall back to next option
|
||||
@@ -0,0 +1,290 @@
|
||||
# ☤ ASCII Video
|
||||
|
||||
Renders any content as colored ASCII character video. Audio, video, images, text, or pure math in, MP4/GIF/PNG sequence out. Full RGB color per character cell, 1080p 24fps default. No GPU.
|
||||
|
||||
Built for [Hermes Agent](https://github.com/NousResearch/hermes-agent). Usable in any coding agent. Canonical source lives here; synced to [`NousResearch/hermes-agent/skills/creative/ascii-video`](https://github.com/NousResearch/hermes-agent/tree/main/skills/creative/ascii-video) via PR.
|
||||
|
||||
## What this is
|
||||
|
||||
A skill that teaches an agent how to build single-file Python renderers for ASCII video from scratch. The agent gets the full pipeline: grid system, font rasterization, effect library, shader chain, audio analysis, parallel encoding. It writes the renderer, runs it, gets video.
|
||||
|
||||
The output is actual video. Not terminal escape codes. Frames are computed as grids of colored characters, composited onto pixel canvases with pre-rasterized font bitmaps, post-processed through shaders, piped to ffmpeg.
|
||||
|
||||
## Modes
|
||||
|
||||
| Mode | Input | Output |
|
||||
|------|-------|--------|
|
||||
| Video-to-ASCII | A video file | ASCII recreation of the footage |
|
||||
| Audio-reactive | An audio file | Visuals driven by frequency bands, beats, energy |
|
||||
| Generative | Nothing | Procedural animation from math |
|
||||
| Hybrid | Video + audio | ASCII video with audio-reactive overlays |
|
||||
| Lyrics/text | Audio + timed text (SRT) | Karaoke-style text with effects |
|
||||
| TTS narration | Text quotes + API key | Narrated video with typewriter text and generated speech |
|
||||
|
||||
## Pipeline
|
||||
|
||||
Every mode follows the same 6-stage path:
|
||||
|
||||
```
|
||||
INPUT --> ANALYZE --> SCENE_FN --> TONEMAP --> SHADE --> ENCODE
|
||||
```
|
||||
|
||||
1. **Input** loads source material (or nothing for generative).
|
||||
2. **Analyze** extracts per-frame features. Audio gets 6-band FFT, RMS, spectral centroid, flatness, flux, beat detection with exponential decay. Video gets luminance, edges, motion.
|
||||
3. **Scene function** returns a pixel canvas directly. Composes multiple character grids at different densities, value/hue fields, pixel blend modes. This is where the visuals happen.
|
||||
4. **Tonemap** does adaptive percentile-based brightness normalization with per-scene gamma. ASCII on black is inherently dark. Linear multipliers don't work. This does.
|
||||
5. **Shade** runs a `ShaderChain` (38 composable shaders) plus a `FeedbackBuffer` for temporal recursion with spatial transforms.
|
||||
6. **Encode** pipes raw RGB frames to ffmpeg for H.264 encoding. Segments concatenated, audio muxed.
|
||||
|
||||
## Grid system
|
||||
|
||||
Characters render on fixed-size grids. Layer multiple densities for depth.
|
||||
|
||||
| Size | Font | Grid at 1080p | Use |
|
||||
|------|------|---------------|-----|
|
||||
| xs | 8px | 400x108 | Ultra-dense data fields |
|
||||
| sm | 10px | 320x83 | Rain, starfields |
|
||||
| md | 16px | 192x56 | Default balanced |
|
||||
| lg | 20px | 160x45 | Readable text |
|
||||
| xl | 24px | 137x37 | Large titles |
|
||||
| xxl | 40px | 80x22 | Giant minimal |
|
||||
|
||||
Rendering the same scene on `sm` and `lg` then screen-blending them creates natural texture interference. Fine detail shows through gaps in coarse characters. Most scenes use two or three grids.
|
||||
|
||||
## Character palettes (24)
|
||||
|
||||
Each sorted dark-to-bright, each a different visual texture. Validated against the font at init so broken glyphs get dropped silently.
|
||||
|
||||
| Family | Examples | Feel |
|
||||
|--------|----------|------|
|
||||
| Density ramps | ` .:-=+#@█` | Classic ASCII art gradient |
|
||||
| Block elements | ` ░▒▓█▄▀▐▌` | Chunky, digital |
|
||||
| Braille | ` ⠁⠂⠃...⠿` | Fine-grained pointillism |
|
||||
| Dots | ` ⋅∘∙●◉◎` | Smooth, organic |
|
||||
| Stars | ` ·✧✦✩✨★✶` | Sparkle, celestial |
|
||||
| Half-fills | ` ◔◑◕◐◒◓◖◗◙` | Directional fill progression |
|
||||
| Crosshatch | ` ▣▤▥▦▧▨▩` | Hatched density ramp |
|
||||
| Math | ` ·∘∙•°±×÷≈≠≡∞∫∑Ω` | Scientific, abstract |
|
||||
| Box drawing | ` ─│┌┐└┘├┤┬┴┼` | Structural, circuit-like |
|
||||
| Katakana | ` ·ヲァィゥェォャュ...` | Matrix rain |
|
||||
| Greek | ` αβγδεζηθ...ω` | Classical, academic |
|
||||
| Runes | ` ᚠᚢᚦᚱᚷᛁᛇᛒᛖᛚᛞᛟ` | Mystical, ancient |
|
||||
| Alchemical | ` ☉☽♀♂♃♄♅♆♇` | Esoteric |
|
||||
| Arrows | ` ←↑→↓↔↕↖↗↘↙` | Directional, kinetic |
|
||||
| Music | ` ♪♫♬♩♭♮♯○●` | Musical |
|
||||
| Project-specific | ` .·~=≈∞⚡☿✦★⊕◊◆▲▼●■` | Themed per project |
|
||||
|
||||
Custom palettes are built per project to match the content.
|
||||
|
||||
## Color strategies
|
||||
|
||||
| Strategy | How it maps hue | Good for |
|
||||
|----------|----------------|----------|
|
||||
| Angle-mapped | Position angle from center | Rainbow radial effects |
|
||||
| Distance-mapped | Distance from center | Depth, tunnels |
|
||||
| Frequency-mapped | Audio spectral centroid | Timbral shifting |
|
||||
| Value-mapped | Brightness level | Heat maps, fire |
|
||||
| Time-cycled | Slow rotation over time | Ambient, chill |
|
||||
| Source-sampled | Original video pixel colors | Video-to-ASCII |
|
||||
| Palette-indexed | Discrete lookup table | Retro, flat graphic |
|
||||
| Temperature | Warm-to-cool blend | Emotional tone |
|
||||
| Complementary | Hue + opposite | Bold, dramatic |
|
||||
| Triadic | Three equidistant hues | Psychedelic, vibrant |
|
||||
| Analogous | Neighboring hues | Harmonious, subtle |
|
||||
| Monochrome | Fixed hue, vary S/V | Noir, focused |
|
||||
|
||||
Plus 10 discrete RGB palettes (neon, pastel, cyberpunk, vaporwave, earth, ice, blood, forest, mono-green, mono-amber).
|
||||
|
||||
Full OKLAB/OKLCH color system: sRGB↔linear↔OKLAB conversion pipeline, perceptually uniform gradient interpolation, and color harmony generation (complementary, triadic, analogous, split-complementary, tetradic).
|
||||
|
||||
## Value field generators (21)
|
||||
|
||||
Value fields are the core visual building blocks. Each produces a 2D float array in [0, 1] mapping every grid cell to a brightness value.
|
||||
|
||||
### Trigonometric (12)
|
||||
|
||||
| Field | Description |
|
||||
|-------|-------------|
|
||||
| Sine field | Layered multi-sine interference, general-purpose background |
|
||||
| Smooth noise | Multi-octave sine approximation of Perlin noise |
|
||||
| Rings | Concentric rings, bass-driven count and wobble |
|
||||
| Spiral | Logarithmic spiral arms, configurable arm count/tightness |
|
||||
| Tunnel | Infinite depth perspective (inverse distance) |
|
||||
| Vortex | Twisting radial pattern, distance modulates angle |
|
||||
| Interference | N overlapping sine waves creating moire |
|
||||
| Aurora | Horizontal flowing bands |
|
||||
| Ripple | Concentric waves from configurable source points |
|
||||
| Plasma | Sum of sines at multiple orientations/speeds |
|
||||
| Diamond | Diamond/checkerboard pattern |
|
||||
| Noise/static | Random per-cell per-frame flicker |
|
||||
|
||||
### Noise-based (4)
|
||||
|
||||
| Field | Description |
|
||||
|-------|-------------|
|
||||
| Value noise | Smooth organic noise, no axis-alignment artifacts |
|
||||
| fBM | Fractal Brownian Motion — octaved noise for clouds, terrain, smoke |
|
||||
| Domain warp | Inigo Quilez technique — fBM-driven coordinate distortion for flowing organic forms |
|
||||
| Voronoi | Moving seed points with distance, edge, and cell-ID output modes |
|
||||
|
||||
### Simulation-based (4)
|
||||
|
||||
| Field | Description |
|
||||
|-------|-------------|
|
||||
| Reaction-diffusion | Gray-Scott with 7 presets: coral, spots, worms, labyrinths, mitosis, pulsating, chaos |
|
||||
| Cellular automata | Game of Life + 4 rule variants with analog fade trails |
|
||||
| Strange attractors | Clifford, De Jong, Bedhead — iterated point systems binned to density fields |
|
||||
| Temporal noise | 3D noise that morphs in-place without directional drift |
|
||||
|
||||
### SDF-based
|
||||
|
||||
7 signed distance field primitives (circle, box, ring, line, triangle, star, heart) with smooth boolean combinators (union, intersection, subtraction, smooth union/subtraction) and infinite tiling. Render as solid fills or glowing outlines.
|
||||
|
||||
## Hue field generators (9)
|
||||
|
||||
Determine per-cell color independent of brightness: fixed hue, angle-mapped rainbow, distance gradient, time-cycled rotation, audio spectral centroid, horizontal/vertical gradients, plasma variation, perceptually uniform OKLCH rainbow.
|
||||
|
||||
## Coordinate transforms (11)
|
||||
|
||||
UV-space transforms applied before effect evaluation: rotate, scale, skew, tile (with mirror seaming), polar, inverse-polar, twist (rotation increasing with distance), fisheye, wave displacement, Möbius conformal transformation. `make_tgrid()` wraps transformed coordinates into a grid object.
|
||||
|
||||
## Particle systems (9)
|
||||
|
||||
| Type | Behavior |
|
||||
|------|----------|
|
||||
| Explosion | Beat-triggered radial burst with gravity and life decay |
|
||||
| Embers | Rising from bottom with horizontal drift |
|
||||
| Dissolving cloud | Spreading outward with accelerating fade |
|
||||
| Starfield | 3D projected, Z-depth stars approaching with streak trails |
|
||||
| Orbit | Circular/elliptical paths around center |
|
||||
| Gravity well | Attracted toward configurable point sources |
|
||||
| Boid flocking | Separation/alignment/cohesion with spatial hash for O(n) neighbors |
|
||||
| Flow-field | Steered by gradient of any value field |
|
||||
| Trail particles | Fading lines between current and previous positions |
|
||||
|
||||
14 themed particle character sets (energy, spark, leaf, snow, rain, bubble, data, hex, binary, rune, zodiac, dot, dash).
|
||||
|
||||
## Temporal coherence
|
||||
|
||||
10 easing functions (linear, quad, cubic, expo, elastic, bounce — in/out/in-out). Keyframe interpolation with eased transitions. Value field morphing (smooth crossfade between fields). Value field sequencing (cycle through fields with crossfade). Temporal noise (3D noise evolving smoothly in-place).
|
||||
|
||||
## Shader pipeline
|
||||
|
||||
38 composable shaders, applied to the pixel canvas after character rendering. Configurable per section.
|
||||
|
||||
| Category | Shaders |
|
||||
|----------|---------|
|
||||
| Geometry | CRT barrel, pixelate, wave distort, displacement map, kaleidoscope, mirror (h/v/quad/diag) |
|
||||
| Channel | Chromatic aberration (beat-reactive), channel shift, channel swap, RGB split radial |
|
||||
| Color | Invert, posterize, threshold, solarize, hue rotate, saturation, color grade, color wobble, color ramp |
|
||||
| Glow/Blur | Bloom, edge glow, soft focus, radial blur |
|
||||
| Noise | Film grain (beat-reactive), static noise |
|
||||
| Lines/Patterns | Scanlines, halftone |
|
||||
| Tone | Vignette, contrast, gamma, levels, brightness |
|
||||
| Glitch/Data | Glitch bands (beat-reactive), block glitch, pixel sort, data bend |
|
||||
|
||||
12 color tint presets: warm, cool, matrix green, amber, sepia, neon pink, ice, blood, forest, void, sunset, neutral.
|
||||
|
||||
7 mood presets for common shader combos:
|
||||
|
||||
| Mood | Shaders |
|
||||
|------|---------|
|
||||
| Retro terminal | CRT + scanlines + grain + amber/green tint |
|
||||
| Clean modern | Light bloom + subtle vignette |
|
||||
| Glitch art | Heavy chromatic + glitch bands + color wobble |
|
||||
| Cinematic | Bloom + vignette + grain + color grade |
|
||||
| Dreamy | Heavy bloom + soft focus + color wobble |
|
||||
| Harsh/industrial | High contrast + grain + scanlines, no bloom |
|
||||
| Psychedelic | Color wobble + chromatic + kaleidoscope mirror |
|
||||
|
||||
## Blend modes and composition
|
||||
|
||||
20 pixel blend modes for layering canvases: normal, add, subtract, multiply, screen, overlay, softlight, hardlight, difference, exclusion, colordodge, colorburn, linearlight, vividlight, pin_light, hard_mix, lighten, darken, grain_extract, grain_merge. Both sRGB and linear-light blending supported.
|
||||
|
||||
**Feedback buffer.** Temporal recursion — each frame blends with a transformed version of the previous frame. 7 spatial transforms: zoom, shrink, rotate CW/CCW, shift up/down, mirror. Optional per-frame hue shift for rainbow trails. Configurable decay, blend mode, and opacity per scene.
|
||||
|
||||
**Masking.** 16 mask types for spatial compositing: shape masks (circle, rect, ring, gradients), procedural masks (any value field as a mask, text stencils), animated masks (iris open/close, wipe, dissolve), boolean operations (union, intersection, subtraction, invert).
|
||||
|
||||
**Transitions.** Crossfade, directional wipe, radial wipe, dissolve, glitch cut.
|
||||
|
||||
## Scene design patterns
|
||||
|
||||
Compositional patterns for making scenes that look intentional rather than random.
|
||||
|
||||
**Layer hierarchy.** Background (dim atmosphere, dense grid), content (main visual, standard grid), accent (sparse highlights, coarse grid). Three distinct roles, not three competing layers.
|
||||
|
||||
**Directional parameter arcs.** The defining parameter of each scene ramps, accelerates, or builds over its duration. Progress-based formulas (linear, ease-out, step reveal) replace aimless `sin(t)` oscillation.
|
||||
|
||||
**Scene concepts.** Scenes built around visual metaphors (emergence, descent, collision, entropy) with motivated layer/palette/feedback choices. Not named after their effects.
|
||||
|
||||
**Compositional techniques.** Counter-rotating dual systems, wave collision, progressive fragmentation (voronoi cells multiplying over time), entropy (geometry consumed by reaction-diffusion), staggered layer entry (crescendo buildup).
|
||||
|
||||
## Hardware adaptation
|
||||
|
||||
Auto-detects CPU count, RAM, platform, ffmpeg. Adapts worker count, resolution, FPS.
|
||||
|
||||
| Profile | Resolution | FPS | When |
|
||||
|---------|-----------|-----|------|
|
||||
| `draft` | 960x540 | 12 | Check timing/layout |
|
||||
| `preview` | 1280x720 | 15 | Review effects |
|
||||
| `production` | 1920x1080 | 24 | Final output |
|
||||
| `max` | 3840x2160 | 30 | Ultra-high |
|
||||
| `auto` | Detected | 24 | Adapts to hardware + duration |
|
||||
|
||||
`auto` estimates render time and downgrades if it would take over an hour. Low-memory systems drop to 720p automatically.
|
||||
|
||||
### Render times (1080p 24fps, ~180ms/frame/worker)
|
||||
|
||||
| Duration | 4 workers | 8 workers | 16 workers |
|
||||
|----------|-----------|-----------|------------|
|
||||
| 30s | ~3 min | ~2 min | ~1 min |
|
||||
| 2 min | ~13 min | ~7 min | ~4 min |
|
||||
| 5 min | ~33 min | ~17 min | ~9 min |
|
||||
| 10 min | ~65 min | ~33 min | ~17 min |
|
||||
|
||||
720p roughly halves these. 4K roughly quadruples them.
|
||||
|
||||
## Known pitfalls
|
||||
|
||||
**Brightness.** ASCII characters are small bright dots on black. Most frame pixels are background. Linear `* N` multipliers clip highlights and wash out. Use `tonemap()` with per-scene gamma instead. Default gamma 0.75, solarize scenes 0.55, posterize 0.50.
|
||||
|
||||
**Render bottleneck.** The per-cell Python loop compositing font bitmaps runs at ~100-150ms/frame. Unavoidable without Cython/C. Everything else must be vectorized numpy. Python for-loops over rows/cols in effect functions will tank performance.
|
||||
|
||||
**ffmpeg deadlock.** Never `stderr=subprocess.PIPE` on long-running encodes. Buffer fills at ~64KB, process hangs. Redirect stderr to a file.
|
||||
|
||||
**Font cell height.** Pillow's `textbbox()` returns wrong height on macOS. Use `font.getmetrics()` for `ascent + descent`.
|
||||
|
||||
**Font compatibility.** Not all Unicode renders in all fonts. Palettes validated at init, blank glyphs silently removed.
|
||||
|
||||
## Requirements
|
||||
|
||||
◆ Python 3.10+
|
||||
◆ NumPy, Pillow, SciPy (audio modes)
|
||||
◆ ffmpeg on PATH
|
||||
◆ A monospace font (Menlo, Courier, Monaco, auto-detected)
|
||||
◆ Optional: OpenCV, ElevenLabs API key (TTS mode)
|
||||
|
||||
## File structure
|
||||
|
||||
```
|
||||
├── SKILL.md # Modes, workflow, creative direction
|
||||
├── README.md # This file
|
||||
└── references/
|
||||
├── architecture.md # Grid system, fonts, palettes, color, _render_vf()
|
||||
├── effects.md # Value fields, hue fields, backgrounds, particles
|
||||
├── shaders.md # 38 shaders, ShaderChain, tint presets, transitions
|
||||
├── composition.md # Blend modes, multi-grid, tonemap, FeedbackBuffer
|
||||
├── scenes.md # Scene protocol, SCENES table, render_clip(), examples
|
||||
├── design-patterns.md # Layer hierarchy, directional arcs, scene concepts
|
||||
├── inputs.md # Audio analysis, video sampling, text, TTS
|
||||
├── optimization.md # Hardware detection, vectorized patterns, parallelism
|
||||
└── troubleshooting.md # Broadcasting traps, blend pitfalls, diagnostics
|
||||
```
|
||||
|
||||
## Projects built with this
|
||||
|
||||
✦ 85-second highlight reel. 15 scenes (14×5s + 15s crescendo finale), randomized order, directional parameter arcs, layer hierarchy composition. Showcases the full effect vocabulary: fBM, voronoi fragmentation, reaction-diffusion, cellular automata, dual counter-rotating spirals, wave collision, domain warping, tunnel descent, kaleidoscope symmetry, boid flocking, fire simulation, glitch corruption, and a 7-layer crescendo buildup.
|
||||
|
||||
✦ Audio-reactive music visualizer. 3.5 min, 8 sections with distinct effects, beat-triggered particles and glitch, cycling palettes.
|
||||
|
||||
✦ TTS narrated testimonial video. 23 quotes, per-quote ElevenLabs voices, background music at 15% wide stereo, per-clip re-rendering for iterative editing.
|
||||
@@ -0,0 +1,232 @@
|
||||
---
|
||||
name: ascii-video
|
||||
description: "Production pipeline for ASCII art video — any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering. Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output."
|
||||
---
|
||||
|
||||
# ASCII Video Production Pipeline
|
||||
|
||||
## Creative Standard
|
||||
|
||||
This is visual art. ASCII characters are the medium; cinema is the standard.
|
||||
|
||||
**Before writing a single line of code**, articulate the creative concept. What is the mood? What visual story does this tell? What makes THIS project different from every other ASCII video? The user's prompt is a starting point — interpret it with creative ambition, not literal transcription.
|
||||
|
||||
**First-render excellence is non-negotiable.** The output must be visually striking without requiring revision rounds. If something looks generic, flat, or like "AI-generated ASCII art," it is wrong — rethink the creative concept before shipping.
|
||||
|
||||
**Go beyond the reference vocabulary.** The effect catalogs, shader presets, and palette libraries in the references are a starting vocabulary. For every project, combine, modify, and invent new patterns. The catalog is a palette of paints — you write the painting.
|
||||
|
||||
**Be proactively creative.** Extend the skill's vocabulary when the project calls for it. If the references don't have what the vision demands, build it. Include at least one visual moment the user didn't ask for but will appreciate — a transition, an effect, a color choice that elevates the whole piece.
|
||||
|
||||
**Cohesive aesthetic over technical correctness.** All scenes in a video must feel connected by a unifying visual language — shared color temperature, related character palettes, consistent motion vocabulary. A technically correct video where every scene uses a random different effect is an aesthetic failure.
|
||||
|
||||
**Dense, layered, considered.** Every frame should reward viewing. Never flat black backgrounds. Always multi-grid composition. Always per-scene variation. Always intentional color.
|
||||
|
||||
## Modes
|
||||
|
||||
| Mode | Input | Output | Reference |
|
||||
|------|-------|--------|-----------|
|
||||
| **Video-to-ASCII** | Video file | ASCII recreation of source footage | `references/inputs.md` § Video Sampling |
|
||||
| **Audio-reactive** | Audio file | Generative visuals driven by audio features | `references/inputs.md` § Audio Analysis |
|
||||
| **Generative** | None (or seed params) | Procedural ASCII animation | `references/effects.md` |
|
||||
| **Hybrid** | Video + audio | ASCII video with audio-reactive overlays | Both input refs |
|
||||
| **Lyrics/text** | Audio + text/SRT | Timed text with visual effects | `references/inputs.md` § Text/Lyrics |
|
||||
| **TTS narration** | Text quotes + TTS API | Narrated testimonial/quote video with typed text | `references/inputs.md` § TTS Integration |
|
||||
|
||||
## Stack
|
||||
|
||||
Single self-contained Python script per project. No GPU required.
|
||||
|
||||
| Layer | Tool | Purpose |
|
||||
|-------|------|---------|
|
||||
| Core | Python 3.10+, NumPy | Math, array ops, vectorized effects |
|
||||
| Signal | SciPy | FFT, peak detection (audio modes) |
|
||||
| Imaging | Pillow (PIL) | Font rasterization, frame decoding, image I/O |
|
||||
| Video I/O | ffmpeg (CLI) | Decode input, encode output, mux audio |
|
||||
| Parallel | concurrent.futures | N workers for batch/clip rendering |
|
||||
| TTS | ElevenLabs API (optional) | Generate narration clips |
|
||||
| Optional | OpenCV | Video frame sampling, edge detection |
|
||||
|
||||
## Pipeline Architecture
|
||||
|
||||
Every mode follows the same 6-stage pipeline:
|
||||
|
||||
```
|
||||
INPUT → ANALYZE → SCENE_FN → TONEMAP → SHADE → ENCODE
|
||||
```
|
||||
|
||||
1. **INPUT** — Load/decode source material (video frames, audio samples, images, or nothing)
|
||||
2. **ANALYZE** — Extract per-frame features (audio bands, video luminance/edges, motion vectors)
|
||||
3. **SCENE_FN** — Scene function renders to pixel canvas (`uint8 H,W,3`). Composes multiple character grids via `_render_vf()` + pixel blend modes. See `references/composition.md`
|
||||
4. **TONEMAP** — Percentile-based adaptive brightness normalization. See `references/composition.md` § Adaptive Tonemap
|
||||
5. **SHADE** — Post-processing via `ShaderChain` + `FeedbackBuffer`. See `references/shaders.md`
|
||||
6. **ENCODE** — Pipe raw RGB frames to ffmpeg for H.264/GIF encoding
|
||||
|
||||
## Creative Direction
|
||||
|
||||
### Aesthetic Dimensions
|
||||
|
||||
| Dimension | Options | Reference |
|
||||
|-----------|---------|-----------|
|
||||
| **Character palette** | Density ramps, block elements, symbols, scripts (katakana, Greek, runes, braille), project-specific | `architecture.md` § Palettes |
|
||||
| **Color strategy** | HSV, OKLAB/OKLCH, discrete RGB palettes, auto-generated harmony, monochrome, temperature | `architecture.md` § Color System |
|
||||
| **Background texture** | Sine fields, fBM noise, domain warp, voronoi, reaction-diffusion, cellular automata, video | `effects.md` |
|
||||
| **Primary effects** | Rings, spirals, tunnel, vortex, waves, interference, aurora, fire, SDFs, strange attractors | `effects.md` |
|
||||
| **Particles** | Sparks, snow, rain, bubbles, runes, orbits, flocking boids, flow-field followers, trails | `effects.md` § Particles |
|
||||
| **Shader mood** | Retro CRT, clean modern, glitch art, cinematic, dreamy, industrial, psychedelic | `shaders.md` |
|
||||
| **Grid density** | xs(8px) through xxl(40px), mixed per layer | `architecture.md` § Grid System |
|
||||
| **Coordinate space** | Cartesian, polar, tiled, rotated, fisheye, Möbius, domain-warped | `effects.md` § Transforms |
|
||||
| **Feedback** | Zoom tunnel, rainbow trails, ghostly echo, rotating mandala, color evolution | `composition.md` § Feedback |
|
||||
| **Masking** | Circle, ring, gradient, text stencil, animated iris/wipe/dissolve | `composition.md` § Masking |
|
||||
| **Transitions** | Crossfade, wipe, dissolve, glitch cut, iris, mask-based reveal | `shaders.md` § Transitions |
|
||||
|
||||
### Per-Section Variation
|
||||
|
||||
Never use the same config for the entire video. For each section/scene:
|
||||
- **Different background effect** (or compose 2-3)
|
||||
- **Different character palette** (match the mood)
|
||||
- **Different color strategy** (or at minimum a different hue)
|
||||
- **Vary shader intensity** (more bloom during peaks, more grain during quiet)
|
||||
- **Different particle types** if particles are active
|
||||
|
||||
### Project-Specific Invention
|
||||
|
||||
For every project, invent at least one of:
|
||||
- A custom character palette matching the theme
|
||||
- A custom background effect (combine/modify existing building blocks)
|
||||
- A custom color palette (discrete RGB set matching the brand/mood)
|
||||
- A custom particle character set
|
||||
- A novel scene transition or visual moment
|
||||
|
||||
Don't just pick from the catalog. The catalog is vocabulary — you write the poem.
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: Creative Vision
|
||||
|
||||
Before any code, articulate the creative concept:
|
||||
|
||||
- **Mood/atmosphere**: What should the viewer feel? Energetic, meditative, chaotic, elegant, ominous?
|
||||
- **Visual story**: What happens over the duration? Build tension? Transform? Dissolve?
|
||||
- **Color world**: Warm/cool? Monochrome? Neon? Earth tones? What's the dominant hue?
|
||||
- **Character texture**: Dense data? Sparse stars? Organic dots? Geometric blocks?
|
||||
- **What makes THIS different**: What's the one thing that makes this project unique?
|
||||
- **Emotional arc**: How do scenes progress? Open with energy, build to climax, resolve?
|
||||
|
||||
Map the user's prompt to aesthetic choices. A "chill lo-fi visualizer" demands different everything from a "glitch cyberpunk data stream."
|
||||
|
||||
### Step 2: Technical Design
|
||||
|
||||
- **Mode** — which of the 6 modes above
|
||||
- **Resolution** — landscape 1920x1080 (default), portrait 1080x1920, square 1080x1080 @ 24fps
|
||||
- **Hardware detection** — auto-detect cores/RAM, set quality profile. See `references/optimization.md`
|
||||
- **Sections** — map timestamps to scene functions, each with its own effect/palette/color/shader config
|
||||
- **Output format** — MP4 (default), GIF (640x360 @ 15fps), PNG sequence
|
||||
|
||||
### Step 3: Build the Script
|
||||
|
||||
Single Python file. Components (with references):
|
||||
|
||||
1. **Hardware detection + quality profile** — `references/optimization.md`
|
||||
2. **Input loader** — mode-dependent; `references/inputs.md`
|
||||
3. **Feature analyzer** — audio FFT, video luminance, or synthetic
|
||||
4. **Grid + renderer** — multi-density grids with bitmap cache; `references/architecture.md`
|
||||
5. **Character palettes** — multiple per project; `references/architecture.md` § Palettes
|
||||
6. **Color system** — HSV + discrete RGB + harmony generation; `references/architecture.md` § Color
|
||||
7. **Scene functions** — each returns `canvas (uint8 H,W,3)`; `references/scenes.md`
|
||||
8. **Tonemap** — adaptive brightness normalization; `references/composition.md`
|
||||
9. **Shader pipeline** — `ShaderChain` + `FeedbackBuffer`; `references/shaders.md`
|
||||
10. **Scene table + dispatcher** — time → scene function + config; `references/scenes.md`
|
||||
11. **Parallel encoder** — N-worker clip rendering with ffmpeg pipes
|
||||
12. **Main** — orchestrate full pipeline
|
||||
|
||||
### Step 4: Quality Verification
|
||||
|
||||
- **Test frames first**: render single frames at key timestamps before full render
|
||||
- **Brightness check**: `canvas.mean() > 8` for all ASCII content. If dark, lower gamma
|
||||
- **Visual coherence**: do all scenes feel like they belong to the same video?
|
||||
- **Creative vision check**: does the output match the concept from Step 1? If it looks generic, go back
|
||||
|
||||
## Critical Implementation Notes
|
||||
|
||||
### Brightness — Use `tonemap()`, Not Linear Multipliers
|
||||
|
||||
This is the #1 visual issue. ASCII on black is inherently dark. **Never use `canvas * N` multipliers** — they clip highlights. Use adaptive tonemap:
|
||||
|
||||
```python
|
||||
def tonemap(canvas, gamma=0.75):
|
||||
f = canvas.astype(np.float32)
|
||||
lo, hi = np.percentile(f[::4, ::4], [1, 99.5])
|
||||
if hi - lo < 10: hi = lo + 10
|
||||
f = np.clip((f - lo) / (hi - lo), 0, 1) ** gamma
|
||||
return (f * 255).astype(np.uint8)
|
||||
```
|
||||
|
||||
Pipeline: `scene_fn() → tonemap() → FeedbackBuffer → ShaderChain → ffmpeg`
|
||||
|
||||
Per-scene gamma: default 0.75, solarize 0.55, posterize 0.50, bright scenes 0.85. Use `screen` blend (not `overlay`) for dark layers.
|
||||
|
||||
### Font Cell Height
|
||||
|
||||
macOS Pillow: `textbbox()` returns wrong height. Use `font.getmetrics()`: `cell_height = ascent + descent`. See `references/troubleshooting.md`.
|
||||
|
||||
### ffmpeg Pipe Deadlock
|
||||
|
||||
Never `stderr=subprocess.PIPE` with long-running ffmpeg — buffer fills at 64KB and deadlocks. Redirect to file. See `references/troubleshooting.md`.
|
||||
|
||||
### Font Compatibility
|
||||
|
||||
Not all Unicode chars render in all fonts. Validate palettes at init — render each char, check for blank output. See `references/troubleshooting.md`.
|
||||
|
||||
### Per-Clip Architecture
|
||||
|
||||
For segmented videos (quotes, scenes, chapters), render each as a separate clip file for parallel rendering and selective re-rendering. See `references/scenes.md`.
|
||||
|
||||
## Performance Targets
|
||||
|
||||
| Component | Budget |
|
||||
|-----------|--------|
|
||||
| Feature extraction | 1-5ms |
|
||||
| Effect function | 2-15ms |
|
||||
| Character render | 80-150ms (bottleneck) |
|
||||
| Shader pipeline | 5-25ms |
|
||||
| **Total** | ~100-200ms/frame |
|
||||
|
||||
## References
|
||||
|
||||
| File | Contents |
|
||||
|------|----------|
|
||||
| `references/architecture.md` | Grid system, resolution presets, font selection, character palettes (20+), color system (HSV + OKLAB + discrete RGB + harmony generation), `_render_vf()` helper, GridLayer class |
|
||||
| `references/composition.md` | Pixel blend modes (20 modes), `blend_canvas()`, multi-grid composition, adaptive `tonemap()`, `FeedbackBuffer`, `PixelBlendStack`, masking/stencil system |
|
||||
| `references/effects.md` | Effect building blocks: value field generators, hue fields, noise/fBM/domain warp, voronoi, reaction-diffusion, cellular automata, SDFs, strange attractors, particle systems, coordinate transforms, temporal coherence |
|
||||
| `references/shaders.md` | `ShaderChain`, `_apply_shader_step()` dispatch, 38 shader catalog, audio-reactive scaling, transitions, tint presets, output format encoding, terminal rendering |
|
||||
| `references/scenes.md` | Scene protocol, `Renderer` class, `SCENES` table, `render_clip()`, beat-synced cutting, parallel rendering, design patterns (layer hierarchy, directional arcs, visual metaphors, compositional techniques), complete scene examples at every complexity level, scene design checklist |
|
||||
| `references/inputs.md` | Audio analysis (FFT, bands, beats), video sampling, image conversion, text/lyrics, TTS integration (ElevenLabs, voice assignment, audio mixing) |
|
||||
| `references/optimization.md` | Hardware detection, quality profiles, vectorized patterns, parallel rendering, memory management, performance budgets |
|
||||
| `references/troubleshooting.md` | NumPy broadcasting traps, blend mode pitfalls, multiprocessing/pickling, brightness diagnostics, ffmpeg issues, font problems, common mistakes |
|
||||
|
||||
---
|
||||
|
||||
## Creative Divergence (use only when user requests experimental/creative/unique output)
|
||||
|
||||
If the user asks for creative, experimental, surprising, or unconventional output, select the strategy that best fits and reason through its steps BEFORE generating code.
|
||||
|
||||
- **Forced Connections** — when the user wants cross-domain inspiration ("make it look organic," "industrial aesthetic")
|
||||
- **Conceptual Blending** — when the user names two things to combine ("ocean meets music," "space + calligraphy")
|
||||
- **Oblique Strategies** — when the user is maximally open ("surprise me," "something I've never seen")
|
||||
|
||||
### Forced Connections
|
||||
1. Pick a domain unrelated to the visual goal (weather systems, microbiology, architecture, fluid dynamics, textile weaving)
|
||||
2. List its core visual/structural elements (erosion → gradual reveal; mitosis → splitting duplication; weaving → interlocking patterns)
|
||||
3. Map those elements onto ASCII characters and animation patterns
|
||||
4. Synthesize — what does "erosion" or "crystallization" look like in a character grid?
|
||||
|
||||
### Conceptual Blending
|
||||
1. Name two distinct visual/conceptual spaces (e.g., ocean waves + sheet music)
|
||||
2. Map correspondences (crests = high notes, troughs = rests, foam = staccato)
|
||||
3. Blend selectively — keep the most interesting mappings, discard forced ones
|
||||
4. Develop emergent properties that exist only in the blend
|
||||
|
||||
### Oblique Strategies
|
||||
1. Draw one: "Honor thy error as a hidden intention" / "Use an old idea" / "What would your closest friend do?" / "Emphasize the flaws" / "Turn it upside down" / "Only a part, not the whole" / "Reverse"
|
||||
2. Interpret the directive against the current ASCII animation challenge
|
||||
3. Apply the lateral insight to the visual design before writing code
|
||||
@@ -0,0 +1,802 @@
|
||||
# Architecture Reference
|
||||
|
||||
> **See also:** composition.md · effects.md · scenes.md · shaders.md · inputs.md · optimization.md · troubleshooting.md
|
||||
|
||||
## Grid System
|
||||
|
||||
### Resolution Presets
|
||||
|
||||
```python
|
||||
RESOLUTION_PRESETS = {
|
||||
"landscape": (1920, 1080), # 16:9 — YouTube, default
|
||||
"portrait": (1080, 1920), # 9:16 — TikTok, Reels, Stories
|
||||
"square": (1080, 1080), # 1:1 — Instagram feed
|
||||
"ultrawide": (2560, 1080), # 21:9 — cinematic
|
||||
"landscape4k":(3840, 2160), # 16:9 — 4K
|
||||
"portrait4k": (2160, 3840), # 9:16 — 4K portrait
|
||||
}
|
||||
|
||||
def get_resolution(preset="landscape", custom=None):
|
||||
"""Returns (VW, VH) tuple."""
|
||||
if custom:
|
||||
return custom
|
||||
return RESOLUTION_PRESETS.get(preset, RESOLUTION_PRESETS["landscape"])
|
||||
```
|
||||
|
||||
### Multi-Density Grids
|
||||
|
||||
Pre-initialize multiple grid sizes. Switch per section for visual variety. Grid dimensions auto-compute from resolution:
|
||||
|
||||
**Landscape (1920x1080):**
|
||||
|
||||
| Key | Font Size | Grid (cols x rows) | Use |
|
||||
|-----|-----------|-------------------|-----|
|
||||
| xs | 8 | 400x108 | Ultra-dense data fields |
|
||||
| sm | 10 | 320x83 | Dense detail, rain, starfields |
|
||||
| md | 16 | 192x56 | Default balanced, transitions |
|
||||
| lg | 20 | 160x45 | Quote/lyric text (readable at 1080p) |
|
||||
| xl | 24 | 137x37 | Short quotes, large titles |
|
||||
| xxl | 40 | 80x22 | Giant text, minimal |
|
||||
|
||||
**Portrait (1080x1920):**
|
||||
|
||||
| Key | Font Size | Grid (cols x rows) | Use |
|
||||
|-----|-----------|-------------------|-----|
|
||||
| xs | 8 | 225x192 | Ultra-dense, tall data columns |
|
||||
| sm | 10 | 180x148 | Dense detail, vertical rain |
|
||||
| md | 16 | 112x100 | Default balanced |
|
||||
| lg | 20 | 90x80 | Readable text (~30 chars/line centered) |
|
||||
| xl | 24 | 75x66 | Short quotes, stacked |
|
||||
| xxl | 40 | 45x39 | Giant text, minimal |
|
||||
|
||||
**Square (1080x1080):**
|
||||
|
||||
| Key | Font Size | Grid (cols x rows) | Use |
|
||||
|-----|-----------|-------------------|-----|
|
||||
| sm | 10 | 180x83 | Dense detail |
|
||||
| md | 16 | 112x56 | Default balanced |
|
||||
| lg | 20 | 90x45 | Readable text |
|
||||
|
||||
**Key differences in portrait mode:**
|
||||
- Fewer columns (90 at `lg` vs 160) — lines must be shorter or wrap
|
||||
- Many more rows (80 at `lg` vs 45) — vertical stacking is natural
|
||||
- Aspect ratio correction flips: `asp = cw / ch` still works but the visual emphasis is vertical
|
||||
- Radial effects appear as tall ellipses unless corrected
|
||||
- Vertical effects (rain, embers, fire columns) are naturally enhanced
|
||||
- Horizontal effects (spectrum bars, waveforms) need rotation or compression
|
||||
|
||||
**Grid sizing for text in portrait**: Use `lg` (20px) for 2-3 word lines. Max comfortable line length is ~25-30 chars. For longer quotes, break aggressively into many short lines stacked vertically — portrait has vertical space to spare. `xl` (24px) works for single words or very short phrases.
|
||||
|
||||
Grid dimensions: `cols = VW // cell_width`, `rows = VH // cell_height`.
|
||||
|
||||
### Font Selection
|
||||
|
||||
Don't hardcode a single font. Choose fonts to match the project's mood. Monospace fonts are required for grid alignment but vary widely in personality:
|
||||
|
||||
| Font | Personality | Platform |
|
||||
|------|-------------|----------|
|
||||
| Menlo | Clean, neutral, Apple-native | macOS |
|
||||
| Monaco | Retro terminal, compact | macOS |
|
||||
| Courier New | Classic typewriter, wide | Cross-platform |
|
||||
| SF Mono | Modern, tight spacing | macOS |
|
||||
| Consolas | Windows native, clean | Windows |
|
||||
| JetBrains Mono | Developer, ligature-ready | Install |
|
||||
| Fira Code | Geometric, modern | Install |
|
||||
| IBM Plex Mono | Corporate, authoritative | Install |
|
||||
| Source Code Pro | Adobe, balanced | Install |
|
||||
|
||||
**Font detection at init**: probe available fonts and fall back gracefully:
|
||||
|
||||
```python
|
||||
import platform
|
||||
|
||||
def find_font(preferences):
|
||||
"""Try fonts in order, return first that exists."""
|
||||
for name, path in preferences:
|
||||
if os.path.exists(path):
|
||||
return path
|
||||
raise FileNotFoundError(f"No monospace font found. Tried: {[p for _,p in preferences]}")
|
||||
|
||||
FONT_PREFS_MACOS = [
|
||||
("Menlo", "/System/Library/Fonts/Menlo.ttc"),
|
||||
("Monaco", "/System/Library/Fonts/Monaco.ttf"),
|
||||
("SF Mono", "/System/Library/Fonts/SFNSMono.ttf"),
|
||||
("Courier", "/System/Library/Fonts/Courier.ttc"),
|
||||
]
|
||||
FONT_PREFS_LINUX = [
|
||||
("DejaVu Sans Mono", "/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf"),
|
||||
("Liberation Mono", "/usr/share/fonts/truetype/liberation/LiberationMono-Regular.ttf"),
|
||||
("Noto Sans Mono", "/usr/share/fonts/truetype/noto/NotoSansMono-Regular.ttf"),
|
||||
("Ubuntu Mono", "/usr/share/fonts/truetype/ubuntu/UbuntuMono-R.ttf"),
|
||||
]
|
||||
FONT_PREFS_WINDOWS = [
|
||||
("Consolas", r"C:\Windows\Fonts\consola.ttf"),
|
||||
("Courier New", r"C:\Windows\Fonts\cour.ttf"),
|
||||
("Lucida Console", r"C:\Windows\Fonts\lucon.ttf"),
|
||||
("Cascadia Code", os.path.expandvars(r"%LOCALAPPDATA%\Microsoft\Windows\Fonts\CascadiaCode.ttf")),
|
||||
("Cascadia Mono", os.path.expandvars(r"%LOCALAPPDATA%\Microsoft\Windows\Fonts\CascadiaMono.ttf")),
|
||||
]
|
||||
|
||||
def _get_font_prefs():
|
||||
s = platform.system()
|
||||
if s == "Darwin":
|
||||
return FONT_PREFS_MACOS
|
||||
elif s == "Windows":
|
||||
return FONT_PREFS_WINDOWS
|
||||
return FONT_PREFS_LINUX
|
||||
|
||||
FONT_PREFS = _get_font_prefs()
|
||||
```
|
||||
|
||||
**Multi-font rendering**: use different fonts for different layers (e.g., monospace for background, a bolder variant for overlay text). Each GridLayer owns its own font:
|
||||
|
||||
```python
|
||||
grid_bg = GridLayer(find_font(FONT_PREFS), 16) # background
|
||||
grid_text = GridLayer(find_font(BOLD_PREFS), 20) # readable text
|
||||
```
|
||||
|
||||
### Collecting All Characters
|
||||
|
||||
Before initializing grids, gather all characters that need bitmap pre-rasterization:
|
||||
|
||||
```python
|
||||
all_chars = set()
|
||||
for pal in [PAL_DEFAULT, PAL_DENSE, PAL_BLOCKS, PAL_RUNE, PAL_KATA,
|
||||
PAL_GREEK, PAL_MATH, PAL_DOTS, PAL_BRAILLE, PAL_STARS,
|
||||
PAL_HALFFILL, PAL_HATCH, PAL_BINARY, PAL_MUSIC, PAL_BOX,
|
||||
PAL_CIRCUIT, PAL_ARROWS, PAL_HERMES]: # ... all palettes used in project
|
||||
all_chars.update(pal)
|
||||
# Add any overlay text characters
|
||||
all_chars.update("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 .,-:;!?/|")
|
||||
all_chars.discard(" ") # space is never rendered
|
||||
```
|
||||
|
||||
### GridLayer Initialization
|
||||
|
||||
Each grid pre-computes coordinate arrays for vectorized effect math. The grid automatically adapts to any resolution (landscape, portrait, square):
|
||||
|
||||
```python
|
||||
class GridLayer:
|
||||
def __init__(self, font_path, font_size, vw=None, vh=None):
|
||||
"""Initialize grid for any resolution.
|
||||
vw, vh: video width/height in pixels. Defaults to global VW, VH."""
|
||||
vw = vw or VW; vh = vh or VH
|
||||
self.vw = vw; self.vh = vh
|
||||
|
||||
self.font = ImageFont.truetype(font_path, font_size)
|
||||
asc, desc = self.font.getmetrics()
|
||||
bbox = self.font.getbbox("M")
|
||||
self.cw = bbox[2] - bbox[0] # character cell width
|
||||
self.ch = asc + desc # CRITICAL: not textbbox height
|
||||
|
||||
self.cols = vw // self.cw
|
||||
self.rows = vh // self.ch
|
||||
self.ox = (vw - self.cols * self.cw) // 2 # centering
|
||||
self.oy = (vh - self.rows * self.ch) // 2
|
||||
|
||||
# Aspect ratio metadata
|
||||
self.aspect = vw / vh # >1 = landscape, <1 = portrait, 1 = square
|
||||
self.is_portrait = vw < vh
|
||||
self.is_landscape = vw > vh
|
||||
|
||||
# Index arrays
|
||||
self.rr = np.arange(self.rows, dtype=np.float32)[:, None]
|
||||
self.cc = np.arange(self.cols, dtype=np.float32)[None, :]
|
||||
|
||||
# Polar coordinates (aspect-corrected)
|
||||
cx, cy = self.cols / 2.0, self.rows / 2.0
|
||||
asp = self.cw / self.ch
|
||||
self.dx = self.cc - cx
|
||||
self.dy = (self.rr - cy) * asp
|
||||
self.dist = np.sqrt(self.dx**2 + self.dy**2)
|
||||
self.angle = np.arctan2(self.dy, self.dx)
|
||||
|
||||
# Normalized (0-1 range) -- for distance falloff
|
||||
self.dx_n = (self.cc - cx) / max(self.cols, 1)
|
||||
self.dy_n = (self.rr - cy) / max(self.rows, 1) * asp
|
||||
self.dist_n = np.sqrt(self.dx_n**2 + self.dy_n**2)
|
||||
|
||||
# Pre-rasterize all characters to float32 bitmaps
|
||||
self.bm = {}
|
||||
for c in all_chars:
|
||||
img = Image.new("L", (self.cw, self.ch), 0)
|
||||
ImageDraw.Draw(img).text((0, 0), c, fill=255, font=self.font)
|
||||
self.bm[c] = np.array(img, dtype=np.float32) / 255.0
|
||||
```
|
||||
|
||||
### Character Render Loop
|
||||
|
||||
The bottleneck. Composites pre-rasterized bitmaps onto pixel canvas:
|
||||
|
||||
```python
|
||||
def render(self, chars, colors, canvas=None):
|
||||
if canvas is None:
|
||||
canvas = np.zeros((VH, VW, 3), dtype=np.uint8)
|
||||
for row in range(self.rows):
|
||||
y = self.oy + row * self.ch
|
||||
if y + self.ch > VH: break
|
||||
for col in range(self.cols):
|
||||
c = chars[row, col]
|
||||
if c == " ": continue
|
||||
x = self.ox + col * self.cw
|
||||
if x + self.cw > VW: break
|
||||
a = self.bm[c] # float32 bitmap
|
||||
canvas[y:y+self.ch, x:x+self.cw] = np.maximum(
|
||||
canvas[y:y+self.ch, x:x+self.cw],
|
||||
(a[:, :, None] * colors[row, col]).astype(np.uint8))
|
||||
return canvas
|
||||
```
|
||||
|
||||
Use `np.maximum` for additive blending (brighter chars overwrite dimmer ones, never darken).
|
||||
|
||||
### Multi-Layer Rendering
|
||||
|
||||
Render multiple grids onto the same canvas for depth:
|
||||
|
||||
```python
|
||||
canvas = np.zeros((VH, VW, 3), dtype=np.uint8)
|
||||
canvas = grid_lg.render(bg_chars, bg_colors, canvas) # background layer
|
||||
canvas = grid_md.render(main_chars, main_colors, canvas) # main layer
|
||||
canvas = grid_sm.render(detail_chars, detail_colors, canvas) # detail overlay
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Character Palettes
|
||||
|
||||
### Design Principles
|
||||
|
||||
Character palettes are the primary visual texture of ASCII video. They control not just brightness mapping but the entire visual feel. Design palettes intentionally:
|
||||
|
||||
- **Visual weight**: characters sorted by the amount of ink/pixels they fill. Space is always index 0.
|
||||
- **Coherence**: characters within a palette should belong to the same visual family.
|
||||
- **Density curve**: the brightness-to-character mapping is nonlinear. Dense palettes (many chars) give smoother gradients; sparse palettes (5-8 chars) give posterized/graphic looks.
|
||||
- **Rendering compatibility**: every character in the palette must exist in the font. Test at init and remove missing glyphs.
|
||||
|
||||
### Palette Library
|
||||
|
||||
Organized by visual family. Mix and match per project -- don't default to PAL_DEFAULT for everything.
|
||||
|
||||
#### Density / Brightness Palettes
|
||||
```python
|
||||
PAL_DEFAULT = " .`'-:;!><=+*^~?/|(){}[]#&$@%" # classic ASCII art
|
||||
PAL_DENSE = " .:;+=xX$#@\u2588" # simple 11-level ramp
|
||||
PAL_MINIMAL = " .:-=+#@" # 8-level, graphic
|
||||
PAL_BINARY = " \u2588" # 2-level, extreme contrast
|
||||
PAL_GRADIENT = " \u2591\u2592\u2593\u2588" # 4-level block gradient
|
||||
```
|
||||
|
||||
#### Unicode Block Elements
|
||||
```python
|
||||
PAL_BLOCKS = " \u2591\u2592\u2593\u2588\u2584\u2580\u2590\u258c" # standard blocks
|
||||
PAL_BLOCKS_EXT = " \u2596\u2597\u2598\u2599\u259a\u259b\u259c\u259d\u259e\u259f\u2591\u2592\u2593\u2588" # quadrant blocks (more detail)
|
||||
PAL_SHADE = " \u2591\u2592\u2593\u2588\u2587\u2586\u2585\u2584\u2583\u2582\u2581" # vertical fill progression
|
||||
```
|
||||
|
||||
#### Symbolic / Thematic
|
||||
```python
|
||||
PAL_MATH = " \u00b7\u2218\u2219\u2022\u00b0\u00b1\u2213\u00d7\u00f7\u2248\u2260\u2261\u2264\u2265\u221e\u222b\u2211\u220f\u221a\u2207\u2202\u2206\u03a9" # math symbols
|
||||
PAL_BOX = " \u2500\u2502\u250c\u2510\u2514\u2518\u251c\u2524\u252c\u2534\u253c\u2550\u2551\u2554\u2557\u255a\u255d\u2560\u2563\u2566\u2569\u256c" # box drawing
|
||||
PAL_CIRCUIT = " .\u00b7\u2500\u2502\u250c\u2510\u2514\u2518\u253c\u25cb\u25cf\u25a1\u25a0\u2206\u2207\u2261" # circuit board
|
||||
PAL_RUNE = " .\u16a0\u16a2\u16a6\u16b1\u16b7\u16c1\u16c7\u16d2\u16d6\u16da\u16de\u16df" # elder futhark runes
|
||||
PAL_ALCHEMIC = " \u2609\u263d\u2640\u2642\u2643\u2644\u2645\u2646\u2647\u2648\u2649\u264a\u264b" # planetary/alchemical symbols
|
||||
PAL_ZODIAC = " \u2648\u2649\u264a\u264b\u264c\u264d\u264e\u264f\u2650\u2651\u2652\u2653" # zodiac
|
||||
PAL_ARROWS = " \u2190\u2191\u2192\u2193\u2194\u2195\u2196\u2197\u2198\u2199\u21a9\u21aa\u21bb\u27a1" # directional arrows
|
||||
PAL_MUSIC = " \u266a\u266b\u266c\u2669\u266d\u266e\u266f\u25cb\u25cf" # musical notation
|
||||
```
|
||||
|
||||
#### Script / Writing System
|
||||
```python
|
||||
PAL_KATA = " \u00b7\uff66\uff67\uff68\uff69\uff6a\uff6b\uff6c\uff6d\uff6e\uff6f\uff70\uff71\uff72\uff73\uff74\uff75\uff76\uff77" # katakana halfwidth (matrix rain)
|
||||
PAL_GREEK = " \u03b1\u03b2\u03b3\u03b4\u03b5\u03b6\u03b7\u03b8\u03b9\u03ba\u03bb\u03bc\u03bd\u03be\u03c0\u03c1\u03c3\u03c4\u03c6\u03c8\u03c9" # Greek lowercase
|
||||
PAL_CYRILLIC = " \u0430\u0431\u0432\u0433\u0434\u0435\u0436\u0437\u0438\u043a\u043b\u043c\u043d\u043e\u043f\u0440\u0441\u0442\u0443\u0444\u0445\u0446\u0447\u0448" # Cyrillic lowercase
|
||||
PAL_ARABIC = " \u0627\u0628\u062a\u062b\u062c\u062d\u062e\u062f\u0630\u0631\u0632\u0633\u0634\u0635\u0636\u0637" # Arabic letters (isolated forms)
|
||||
```
|
||||
|
||||
#### Dot / Point Progressions
|
||||
```python
|
||||
PAL_DOTS = " ⋅∘∙●◉◎◆✦★" # dot size progression
|
||||
PAL_BRAILLE = " ⠁⠂⠃⠄⠅⠆⠇⠈⠉⠊⠋⠌⠍⠎⠏⠐⠑⠒⠓⠔⠕⠖⠗⠘⠙⠚⠛⠜⠝⠞⠟⠿" # braille patterns
|
||||
PAL_STARS = " ·✧✦✩✨★✶✳✸" # star progression
|
||||
PAL_HALFFILL = " ◔◑◕◐◒◓◖◗◙" # directional half-fill progression
|
||||
PAL_HATCH = " ▣▤▥▦▧▨▩" # crosshatch density ramp
|
||||
```
|
||||
|
||||
#### Project-Specific (examples -- invent new ones per project)
|
||||
```python
|
||||
PAL_HERMES = " .\u00b7~=\u2248\u221e\u26a1\u263f\u2726\u2605\u2295\u25ca\u25c6\u25b2\u25bc\u25cf\u25a0" # mythology/tech blend
|
||||
PAL_OCEAN = " ~\u2248\u2248\u2248\u223c\u2307\u2248\u224b\u224c\u2248" # water/wave characters
|
||||
PAL_ORGANIC = " .\u00b0\u2218\u2022\u25e6\u25c9\u2742\u273f\u2741\u2743" # growing/botanical
|
||||
PAL_MACHINE = " _\u2500\u2502\u250c\u2510\u253c\u2261\u25a0\u2588\u2593\u2592\u2591" # mechanical/industrial
|
||||
```
|
||||
|
||||
### Creating Custom Palettes
|
||||
|
||||
When designing for a project, build palettes from the content's theme:
|
||||
|
||||
1. **Choose a visual family** (dots, blocks, symbols, script)
|
||||
2. **Sort by visual weight** -- render each char at target font size, count lit pixels, sort ascending
|
||||
3. **Test at target grid size** -- some chars collapse to blobs at small sizes
|
||||
4. **Validate in font** -- remove chars the font can't render:
|
||||
|
||||
```python
|
||||
def validate_palette(pal, font):
|
||||
"""Remove characters the font can't render."""
|
||||
valid = []
|
||||
for c in pal:
|
||||
if c == " ":
|
||||
valid.append(c)
|
||||
continue
|
||||
img = Image.new("L", (20, 20), 0)
|
||||
ImageDraw.Draw(img).text((0, 0), c, fill=255, font=font)
|
||||
if np.array(img).max() > 0: # char actually rendered something
|
||||
valid.append(c)
|
||||
return "".join(valid)
|
||||
```
|
||||
|
||||
### Mapping Values to Characters
|
||||
|
||||
```python
|
||||
def val2char(v, mask, pal=PAL_DEFAULT):
|
||||
"""Map float array (0-1) to character array using palette."""
|
||||
n = len(pal)
|
||||
idx = np.clip((v * n).astype(int), 0, n - 1)
|
||||
out = np.full(v.shape, " ", dtype="U1")
|
||||
for i, ch in enumerate(pal):
|
||||
out[mask & (idx == i)] = ch
|
||||
return out
|
||||
```
|
||||
|
||||
**Nonlinear mapping** for different visual curves:
|
||||
|
||||
```python
|
||||
def val2char_gamma(v, mask, pal, gamma=1.0):
|
||||
"""Gamma-corrected palette mapping. gamma<1 = brighter, gamma>1 = darker."""
|
||||
v_adj = np.power(np.clip(v, 0, 1), gamma)
|
||||
return val2char(v_adj, mask, pal)
|
||||
|
||||
def val2char_step(v, mask, pal, thresholds):
|
||||
"""Custom threshold mapping. thresholds = list of float breakpoints."""
|
||||
out = np.full(v.shape, pal[0], dtype="U1")
|
||||
for i, thr in enumerate(thresholds):
|
||||
out[mask & (v > thr)] = pal[min(i + 1, len(pal) - 1)]
|
||||
return out
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Color System
|
||||
|
||||
### HSV->RGB (Vectorized)
|
||||
|
||||
All color computation in HSV for intuitive control, converted at render time:
|
||||
|
||||
```python
|
||||
def hsv2rgb(h, s, v):
|
||||
"""Vectorized HSV->RGB. h,s,v are numpy arrays. Returns (R,G,B) uint8 arrays."""
|
||||
h = h % 1.0
|
||||
c = v * s; x = c * (1 - np.abs((h*6) % 2 - 1)); m = v - c
|
||||
# ... 6 sector assignment ...
|
||||
return (np.clip((r+m)*255, 0, 255).astype(np.uint8),
|
||||
np.clip((g+m)*255, 0, 255).astype(np.uint8),
|
||||
np.clip((b+m)*255, 0, 255).astype(np.uint8))
|
||||
```
|
||||
|
||||
### Color Mapping Strategies
|
||||
|
||||
Don't default to a single strategy. Choose based on the visual intent:
|
||||
|
||||
| Strategy | Hue source | Effect | Good for |
|
||||
|----------|------------|--------|----------|
|
||||
| Angle-mapped | `g.angle / (2*pi)` | Rainbow around center | Radial effects, kaleidoscopes |
|
||||
| Distance-mapped | `g.dist_n * 0.3` | Gradient from center | Tunnels, depth effects |
|
||||
| Frequency-mapped | `f["cent"] * 0.2` | Timbral color shifting | Audio-reactive |
|
||||
| Value-mapped | `val * 0.15` | Brightness-dependent hue | Fire, heat maps |
|
||||
| Time-cycled | `t * rate` | Slow color rotation | Ambient, chill |
|
||||
| Source-sampled | Video frame pixel colors | Preserve original color | Video-to-ASCII |
|
||||
| Palette-indexed | Discrete color lookup | Flat graphic style | Retro, pixel art |
|
||||
| Temperature | Blend between warm/cool | Emotional tone | Mood-driven scenes |
|
||||
| Complementary | `hue` and `hue + 0.5` | High contrast | Bold, dramatic |
|
||||
| Triadic | `hue`, `hue + 0.33`, `hue + 0.66` | Vibrant, balanced | Psychedelic |
|
||||
| Analogous | `hue +/- 0.08` | Harmonious, subtle | Elegant, cohesive |
|
||||
| Monochrome | Fixed hue, vary S and V | Restrained, focused | Noir, minimal |
|
||||
|
||||
### Color Palettes (Discrete RGB)
|
||||
|
||||
For non-HSV workflows -- direct RGB color sets for graphic/retro looks:
|
||||
|
||||
```python
|
||||
# Named color palettes -- use for flat/graphic styles or per-character coloring
|
||||
COLORS_NEON = [(255,0,102), (0,255,153), (102,0,255), (255,255,0), (0,204,255)]
|
||||
COLORS_PASTEL = [(255,179,186), (255,223,186), (255,255,186), (186,255,201), (186,225,255)]
|
||||
COLORS_MONO_GREEN = [(0,40,0), (0,80,0), (0,140,0), (0,200,0), (0,255,0)]
|
||||
COLORS_MONO_AMBER = [(40,20,0), (80,50,0), (140,90,0), (200,140,0), (255,191,0)]
|
||||
COLORS_CYBERPUNK = [(255,0,60), (0,255,200), (180,0,255), (255,200,0)]
|
||||
COLORS_VAPORWAVE = [(255,113,206), (1,205,254), (185,103,255), (5,255,161)]
|
||||
COLORS_EARTH = [(86,58,26), (139,90,43), (189,154,91), (222,193,136), (245,230,193)]
|
||||
COLORS_ICE = [(200,230,255), (150,200,240), (100,170,230), (60,130,210), (30,80,180)]
|
||||
COLORS_BLOOD = [(80,0,0), (140,10,10), (200,20,20), (255,50,30), (255,100,80)]
|
||||
COLORS_FOREST = [(10,30,10), (20,60,15), (30,100,20), (50,150,30), (80,200,50)]
|
||||
|
||||
def rgb_palette_map(val, mask, palette):
|
||||
"""Map float array (0-1) to RGB colors from a discrete palette."""
|
||||
n = len(palette)
|
||||
idx = np.clip((val * n).astype(int), 0, n - 1)
|
||||
R = np.zeros(val.shape, dtype=np.uint8)
|
||||
G = np.zeros(val.shape, dtype=np.uint8)
|
||||
B = np.zeros(val.shape, dtype=np.uint8)
|
||||
for i, (r, g, b) in enumerate(palette):
|
||||
m = mask & (idx == i)
|
||||
R[m] = r; G[m] = g; B[m] = b
|
||||
return R, G, B
|
||||
```
|
||||
|
||||
### OKLAB Color Space (Perceptually Uniform)
|
||||
|
||||
HSV hue is perceptually non-uniform: green occupies far more visual range than blue. OKLAB / OKLCH provide perceptually even color steps — hue increments of 0.1 look equally different regardless of starting hue. Use OKLAB for:
|
||||
- Gradient interpolation (no unwanted intermediate hues)
|
||||
- Color harmony generation (perceptually balanced palettes)
|
||||
- Smooth color transitions over time
|
||||
|
||||
```python
|
||||
# --- sRGB <-> Linear sRGB ---
|
||||
|
||||
def srgb_to_linear(c):
|
||||
"""Convert sRGB [0,1] to linear light. c: float32 array."""
|
||||
return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
|
||||
|
||||
def linear_to_srgb(c):
|
||||
"""Convert linear light to sRGB [0,1]."""
|
||||
return np.where(c <= 0.0031308, c * 12.92, 1.055 * np.power(np.maximum(c, 0), 1/2.4) - 0.055)
|
||||
|
||||
# --- Linear sRGB <-> OKLAB ---
|
||||
|
||||
def linear_rgb_to_oklab(r, g, b):
|
||||
"""Linear sRGB to OKLAB. r,g,b: float32 arrays [0,1].
|
||||
Returns (L, a, b) where L=[0,1], a,b=[-0.4, 0.4] approx."""
|
||||
l_ = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b
|
||||
m_ = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b
|
||||
s_ = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b
|
||||
l_c = np.cbrt(l_); m_c = np.cbrt(m_); s_c = np.cbrt(s_)
|
||||
L = 0.2104542553 * l_c + 0.7936177850 * m_c - 0.0040720468 * s_c
|
||||
a = 1.9779984951 * l_c - 2.4285922050 * m_c + 0.4505937099 * s_c
|
||||
b_ = 0.0259040371 * l_c + 0.7827717662 * m_c - 0.8086757660 * s_c
|
||||
return L, a, b_
|
||||
|
||||
def oklab_to_linear_rgb(L, a, b):
|
||||
"""OKLAB to linear sRGB. Returns (r, g, b) float32 arrays [0,1]."""
|
||||
l_ = L + 0.3963377774 * a + 0.2158037573 * b
|
||||
m_ = L - 0.1055613458 * a - 0.0638541728 * b
|
||||
s_ = L - 0.0894841775 * a - 1.2914855480 * b
|
||||
l_c = l_ ** 3; m_c = m_ ** 3; s_c = s_ ** 3
|
||||
r = +4.0767416621 * l_c - 3.3077115913 * m_c + 0.2309699292 * s_c
|
||||
g = -1.2684380046 * l_c + 2.6097574011 * m_c - 0.3413193965 * s_c
|
||||
b_ = -0.0041960863 * l_c - 0.7034186147 * m_c + 1.7076147010 * s_c
|
||||
return np.clip(r, 0, 1), np.clip(g, 0, 1), np.clip(b_, 0, 1)
|
||||
|
||||
# --- Convenience: sRGB uint8 <-> OKLAB ---
|
||||
|
||||
def rgb_to_oklab(R, G, B):
|
||||
"""sRGB uint8 arrays to OKLAB."""
|
||||
r = srgb_to_linear(R.astype(np.float32) / 255.0)
|
||||
g = srgb_to_linear(G.astype(np.float32) / 255.0)
|
||||
b = srgb_to_linear(B.astype(np.float32) / 255.0)
|
||||
return linear_rgb_to_oklab(r, g, b)
|
||||
|
||||
def oklab_to_rgb(L, a, b):
|
||||
"""OKLAB to sRGB uint8 arrays."""
|
||||
r, g, b_ = oklab_to_linear_rgb(L, a, b)
|
||||
R = np.clip(linear_to_srgb(r) * 255, 0, 255).astype(np.uint8)
|
||||
G = np.clip(linear_to_srgb(g) * 255, 0, 255).astype(np.uint8)
|
||||
B = np.clip(linear_to_srgb(b_) * 255, 0, 255).astype(np.uint8)
|
||||
return R, G, B
|
||||
|
||||
# --- OKLCH (cylindrical form of OKLAB) ---
|
||||
|
||||
def oklab_to_oklch(L, a, b):
|
||||
"""OKLAB to OKLCH. Returns (L, C, H) where H is in [0, 1] (normalized)."""
|
||||
C = np.sqrt(a**2 + b**2)
|
||||
H = (np.arctan2(b, a) / (2 * np.pi)) % 1.0
|
||||
return L, C, H
|
||||
|
||||
def oklch_to_oklab(L, C, H):
|
||||
"""OKLCH to OKLAB. H in [0, 1]."""
|
||||
angle = H * 2 * np.pi
|
||||
a = C * np.cos(angle)
|
||||
b = C * np.sin(angle)
|
||||
return L, a, b
|
||||
```
|
||||
|
||||
### Gradient Interpolation (OKLAB vs HSV)
|
||||
|
||||
Interpolating colors through OKLAB avoids the hue detours that HSV produces:
|
||||
|
||||
```python
|
||||
def lerp_oklab(color_a, color_b, t_array):
|
||||
"""Interpolate between two sRGB colors through OKLAB.
|
||||
color_a, color_b: (R, G, B) tuples 0-255
|
||||
t_array: float32 array [0,1] — interpolation parameter per pixel.
|
||||
Returns (R, G, B) uint8 arrays."""
|
||||
La, aa, ba = rgb_to_oklab(
|
||||
np.full_like(t_array, color_a[0], dtype=np.uint8),
|
||||
np.full_like(t_array, color_a[1], dtype=np.uint8),
|
||||
np.full_like(t_array, color_a[2], dtype=np.uint8))
|
||||
Lb, ab, bb = rgb_to_oklab(
|
||||
np.full_like(t_array, color_b[0], dtype=np.uint8),
|
||||
np.full_like(t_array, color_b[1], dtype=np.uint8),
|
||||
np.full_like(t_array, color_b[2], dtype=np.uint8))
|
||||
L = La + (Lb - La) * t_array
|
||||
a = aa + (ab - aa) * t_array
|
||||
b = ba + (bb - ba) * t_array
|
||||
return oklab_to_rgb(L, a, b)
|
||||
|
||||
def lerp_oklch(color_a, color_b, t_array, short_path=True):
|
||||
"""Interpolate through OKLCH (preserves chroma, smooth hue path).
|
||||
short_path: take the shorter arc around the hue wheel."""
|
||||
La, aa, ba = rgb_to_oklab(
|
||||
np.full_like(t_array, color_a[0], dtype=np.uint8),
|
||||
np.full_like(t_array, color_a[1], dtype=np.uint8),
|
||||
np.full_like(t_array, color_a[2], dtype=np.uint8))
|
||||
Lb, ab, bb = rgb_to_oklab(
|
||||
np.full_like(t_array, color_b[0], dtype=np.uint8),
|
||||
np.full_like(t_array, color_b[1], dtype=np.uint8),
|
||||
np.full_like(t_array, color_b[2], dtype=np.uint8))
|
||||
L1, C1, H1 = oklab_to_oklch(La, aa, ba)
|
||||
L2, C2, H2 = oklab_to_oklch(Lb, ab, bb)
|
||||
# Shortest hue path
|
||||
if short_path:
|
||||
dh = H2 - H1
|
||||
dh = np.where(dh > 0.5, dh - 1.0, np.where(dh < -0.5, dh + 1.0, dh))
|
||||
H = (H1 + dh * t_array) % 1.0
|
||||
else:
|
||||
H = H1 + (H2 - H1) * t_array
|
||||
L = L1 + (L2 - L1) * t_array
|
||||
C = C1 + (C2 - C1) * t_array
|
||||
Lout, aout, bout = oklch_to_oklab(L, C, H)
|
||||
return oklab_to_rgb(Lout, aout, bout)
|
||||
```
|
||||
|
||||
### Color Harmony Generation
|
||||
|
||||
Auto-generate harmonious palettes from a seed color:
|
||||
|
||||
```python
|
||||
def harmony_complementary(seed_rgb):
|
||||
"""Two colors: seed + opposite hue."""
|
||||
L, a, b = rgb_to_oklab(np.array([seed_rgb[0]]), np.array([seed_rgb[1]]), np.array([seed_rgb[2]]))
|
||||
_, C, H = oklab_to_oklch(L, a, b)
|
||||
return [seed_rgb, _oklch_to_srgb_tuple(L[0], C[0], (H[0] + 0.5) % 1.0)]
|
||||
|
||||
def harmony_triadic(seed_rgb):
|
||||
"""Three colors: seed + two at 120-degree offsets."""
|
||||
L, a, b = rgb_to_oklab(np.array([seed_rgb[0]]), np.array([seed_rgb[1]]), np.array([seed_rgb[2]]))
|
||||
_, C, H = oklab_to_oklch(L, a, b)
|
||||
return [seed_rgb,
|
||||
_oklch_to_srgb_tuple(L[0], C[0], (H[0] + 0.333) % 1.0),
|
||||
_oklch_to_srgb_tuple(L[0], C[0], (H[0] + 0.667) % 1.0)]
|
||||
|
||||
def harmony_analogous(seed_rgb, spread=0.08, n=5):
|
||||
"""N colors spread evenly around seed hue."""
|
||||
L, a, b = rgb_to_oklab(np.array([seed_rgb[0]]), np.array([seed_rgb[1]]), np.array([seed_rgb[2]]))
|
||||
_, C, H = oklab_to_oklch(L, a, b)
|
||||
offsets = np.linspace(-spread * (n-1)/2, spread * (n-1)/2, n)
|
||||
return [_oklch_to_srgb_tuple(L[0], C[0], (H[0] + off) % 1.0) for off in offsets]
|
||||
|
||||
def harmony_split_complementary(seed_rgb, split=0.08):
|
||||
"""Three colors: seed + two flanking the complement."""
|
||||
L, a, b = rgb_to_oklab(np.array([seed_rgb[0]]), np.array([seed_rgb[1]]), np.array([seed_rgb[2]]))
|
||||
_, C, H = oklab_to_oklch(L, a, b)
|
||||
comp = (H[0] + 0.5) % 1.0
|
||||
return [seed_rgb,
|
||||
_oklch_to_srgb_tuple(L[0], C[0], (comp - split) % 1.0),
|
||||
_oklch_to_srgb_tuple(L[0], C[0], (comp + split) % 1.0)]
|
||||
|
||||
def harmony_tetradic(seed_rgb):
|
||||
"""Four colors: two complementary pairs at 90-degree offset."""
|
||||
L, a, b = rgb_to_oklab(np.array([seed_rgb[0]]), np.array([seed_rgb[1]]), np.array([seed_rgb[2]]))
|
||||
_, C, H = oklab_to_oklch(L, a, b)
|
||||
return [seed_rgb,
|
||||
_oklch_to_srgb_tuple(L[0], C[0], (H[0] + 0.25) % 1.0),
|
||||
_oklch_to_srgb_tuple(L[0], C[0], (H[0] + 0.5) % 1.0),
|
||||
_oklch_to_srgb_tuple(L[0], C[0], (H[0] + 0.75) % 1.0)]
|
||||
|
||||
def _oklch_to_srgb_tuple(L, C, H):
|
||||
"""Helper: single OKLCH -> sRGB (R,G,B) int tuple."""
|
||||
La = np.array([L]); Ca = np.array([C]); Ha = np.array([H])
|
||||
Lo, ao, bo = oklch_to_oklab(La, Ca, Ha)
|
||||
R, G, B = oklab_to_rgb(Lo, ao, bo)
|
||||
return (int(R[0]), int(G[0]), int(B[0]))
|
||||
```
|
||||
|
||||
### OKLAB Hue Fields
|
||||
|
||||
Drop-in replacements for `hf_*` generators that produce perceptually uniform hue variation:
|
||||
|
||||
```python
|
||||
def hf_oklch_angle(offset=0.0, chroma=0.12, lightness=0.7):
|
||||
"""OKLCH hue mapped to angle from center. Perceptually uniform rainbow.
|
||||
Returns (R, G, B) uint8 color array instead of a float hue.
|
||||
NOTE: Use with _render_vf_rgb() variant, not standard _render_vf()."""
|
||||
def fn(g, f, t, S):
|
||||
H = (g.angle / (2 * np.pi) + offset + t * 0.05) % 1.0
|
||||
L = np.full_like(H, lightness)
|
||||
C = np.full_like(H, chroma)
|
||||
Lo, ao, bo = oklch_to_oklab(L, C, H)
|
||||
R, G, B = oklab_to_rgb(Lo, ao, bo)
|
||||
return mkc(R, G, B, g.rows, g.cols)
|
||||
return fn
|
||||
```
|
||||
|
||||
### Compositing Helpers
|
||||
|
||||
```python
|
||||
def mkc(R, G, B, rows, cols):
|
||||
"""Pack 3 uint8 arrays into (rows, cols, 3) color array."""
|
||||
o = np.zeros((rows, cols, 3), dtype=np.uint8)
|
||||
o[:,:,0] = R; o[:,:,1] = G; o[:,:,2] = B
|
||||
return o
|
||||
|
||||
def layer_over(base_ch, base_co, top_ch, top_co):
|
||||
"""Composite top layer onto base. Non-space chars overwrite."""
|
||||
m = top_ch != " "
|
||||
base_ch[m] = top_ch[m]; base_co[m] = top_co[m]
|
||||
return base_ch, base_co
|
||||
|
||||
def layer_blend(base_co, top_co, alpha):
|
||||
"""Alpha-blend top color layer onto base. alpha is float array (0-1) or scalar."""
|
||||
if isinstance(alpha, (int, float)):
|
||||
alpha = np.full(base_co.shape[:2], alpha, dtype=np.float32)
|
||||
a = alpha[:,:,None]
|
||||
return np.clip(base_co * (1 - a) + top_co * a, 0, 255).astype(np.uint8)
|
||||
|
||||
def stamp(ch, co, text, row, col, color=(255,255,255)):
|
||||
"""Write text string at position."""
|
||||
for i, c in enumerate(text):
|
||||
cc = col + i
|
||||
if 0 <= row < ch.shape[0] and 0 <= cc < ch.shape[1]:
|
||||
ch[row, cc] = c; co[row, cc] = color
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Section System
|
||||
|
||||
Map time ranges to effect functions + shader configs + grid sizes:
|
||||
|
||||
```python
|
||||
SECTIONS = [
|
||||
(0.0, "void"), (3.94, "starfield"), (21.0, "matrix"),
|
||||
(46.0, "drop"), (130.0, "glitch"), (187.0, "outro"),
|
||||
]
|
||||
|
||||
FX_DISPATCH = {"void": fx_void, "starfield": fx_starfield, ...}
|
||||
SECTION_FX = {"void": {"vignette": 0.3, "bloom": 170}, ...}
|
||||
SECTION_GRID = {"void": "md", "starfield": "sm", "drop": "lg", ...}
|
||||
SECTION_MIRROR = {"drop": "h", "bass_rings": "quad"}
|
||||
|
||||
def get_section(t):
|
||||
sec = SECTIONS[0][1]
|
||||
for ts, name in SECTIONS:
|
||||
if t >= ts: sec = name
|
||||
return sec
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Parallel Encoding
|
||||
|
||||
Split frames across N workers. Each pipes raw RGB to its own ffmpeg subprocess:
|
||||
|
||||
```python
|
||||
def render_batch(batch_id, frame_start, frame_end, features, seg_path):
|
||||
r = Renderer()
|
||||
cmd = ["ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "rgb24",
|
||||
"-s", f"{VW}x{VH}", "-r", str(FPS), "-i", "pipe:0",
|
||||
"-c:v", "libx264", "-preset", "fast", "-crf", "18",
|
||||
"-pix_fmt", "yuv420p", seg_path]
|
||||
|
||||
# CRITICAL: stderr to file, not pipe
|
||||
stderr_fh = open(os.path.join(workdir, f"err_{batch_id:02d}.log"), "w")
|
||||
pipe = subprocess.Popen(cmd, stdin=subprocess.PIPE,
|
||||
stdout=subprocess.DEVNULL, stderr=stderr_fh)
|
||||
|
||||
for fi in range(frame_start, frame_end):
|
||||
t = fi / FPS
|
||||
sec = get_section(t)
|
||||
f = {k: float(features[k][fi]) for k in features}
|
||||
ch, co = FX_DISPATCH[sec](r, f, t)
|
||||
canvas = r.render(ch, co)
|
||||
canvas = apply_mirror(canvas, sec, f)
|
||||
canvas = apply_shaders(canvas, sec, f, t)
|
||||
pipe.stdin.write(canvas.tobytes())
|
||||
|
||||
pipe.stdin.close()
|
||||
pipe.wait()
|
||||
stderr_fh.close()
|
||||
```
|
||||
|
||||
Concatenate segments + mux audio:
|
||||
|
||||
```python
|
||||
# Write concat file
|
||||
with open(concat_path, "w") as cf:
|
||||
for seg in segments:
|
||||
cf.write(f"file '{seg}'\n")
|
||||
|
||||
subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", concat_path,
|
||||
"-i", audio_path, "-c:v", "copy", "-c:a", "aac", "-b:a", "192k",
|
||||
"-shortest", output_path])
|
||||
```
|
||||
|
||||
## Effect Function Contract
|
||||
|
||||
### v2 Protocol (Current)
|
||||
|
||||
Every scene function: `(r, f, t, S) -> canvas_uint8` — where `r` = Renderer, `f` = features dict, `t` = time float, `S` = persistent state dict
|
||||
|
||||
```python
|
||||
def fx_example(r, f, t, S):
|
||||
"""Scene function returns a full pixel canvas (uint8 H,W,3).
|
||||
Scenes have full control over multi-grid rendering and pixel-level composition.
|
||||
"""
|
||||
# Render multiple layers at different grid densities
|
||||
canvas_a = _render_vf(r, "md", vf_plasma, hf_angle(0.0), PAL_DENSE, f, t, S)
|
||||
canvas_b = _render_vf(r, "sm", vf_vortex, hf_time_cycle(0.1), PAL_RUNE, f, t, S)
|
||||
|
||||
# Pixel-level blend
|
||||
result = blend_canvas(canvas_a, canvas_b, "screen", 0.8)
|
||||
return result
|
||||
```
|
||||
|
||||
See `references/scenes.md` for the full scene protocol, the Renderer class, `_render_vf()` helper, and complete scene examples.
|
||||
|
||||
See `references/composition.md` for blend modes, tone mapping, feedback buffers, and multi-grid composition.
|
||||
|
||||
### v1 Protocol (Legacy)
|
||||
|
||||
Simple scenes that use a single grid can still return `(chars, colors)` and let the caller handle rendering, but the v2 canvas protocol is preferred for all new code.
|
||||
|
||||
```python
|
||||
def fx_simple(r, f, t, S):
|
||||
g = r.get_grid("md")
|
||||
val = np.sin(g.dist * 0.1 - t * 3) * f.get("bass", 0.3) * 2
|
||||
val = np.clip(val, 0, 1); mask = val > 0.03
|
||||
ch = val2char(val, mask, PAL_DEFAULT)
|
||||
R, G, B = hsv2rgb(np.full_like(val, 0.6), np.full_like(val, 0.7), val)
|
||||
co = mkc(R, G, B, g.rows, g.cols)
|
||||
return g.render(ch, co) # returns canvas directly
|
||||
```
|
||||
|
||||
### Persistent State
|
||||
|
||||
Effects that need state across frames (particles, rain columns) use the `S` dict parameter (which is `r.S` — same object, but passed explicitly for clarity):
|
||||
|
||||
```python
|
||||
def fx_with_state(r, f, t, S):
|
||||
if "particles" not in S:
|
||||
S["particles"] = initialize_particles()
|
||||
update_particles(S["particles"])
|
||||
# ...
|
||||
```
|
||||
|
||||
State persists across frames within a single scene/clip. Each worker process (and each scene) gets its own independent state.
|
||||
|
||||
### Helper Functions
|
||||
|
||||
```python
|
||||
def hsv2rgb_scalar(h, s, v):
|
||||
"""Single-value HSV to RGB. Returns (R, G, B) tuple of ints 0-255."""
|
||||
h = h % 1.0
|
||||
c = v * s; x = c * (1 - abs((h * 6) % 2 - 1)); m = v - c
|
||||
if h * 6 < 1: r, g, b = c, x, 0
|
||||
elif h * 6 < 2: r, g, b = x, c, 0
|
||||
elif h * 6 < 3: r, g, b = 0, c, x
|
||||
elif h * 6 < 4: r, g, b = 0, x, c
|
||||
elif h * 6 < 5: r, g, b = x, 0, c
|
||||
else: r, g, b = c, 0, x
|
||||
return (int((r+m)*255), int((g+m)*255), int((b+m)*255))
|
||||
|
||||
def log(msg):
|
||||
"""Print timestamped log message."""
|
||||
print(msg, flush=True)
|
||||
```
|
||||
@@ -0,0 +1,892 @@
|
||||
# Composition & Brightness Reference
|
||||
|
||||
The composable system is the core of visual complexity. It operates at three levels: pixel-level blend modes, multi-grid composition, and adaptive brightness management. This document covers all three, plus the masking/stencil system for spatial control.
|
||||
|
||||
> **See also:** architecture.md · effects.md · scenes.md · shaders.md · troubleshooting.md
|
||||
|
||||
## Pixel-Level Blend Modes
|
||||
|
||||
### The `blend_canvas()` Function
|
||||
|
||||
All blending operates on full pixel canvases (`uint8 H,W,3`). Internally converts to float32 [0,1] for precision, blends, lerps by opacity, converts back.
|
||||
|
||||
```python
|
||||
def blend_canvas(base, top, mode="normal", opacity=1.0):
|
||||
af = base.astype(np.float32) / 255.0
|
||||
bf = top.astype(np.float32) / 255.0
|
||||
fn = BLEND_MODES.get(mode, BLEND_MODES["normal"])
|
||||
result = fn(af, bf)
|
||||
if opacity < 1.0:
|
||||
result = af * (1 - opacity) + result * opacity
|
||||
return np.clip(result * 255, 0, 255).astype(np.uint8)
|
||||
```
|
||||
|
||||
### 20 Blend Modes
|
||||
|
||||
```python
|
||||
BLEND_MODES = {
|
||||
# Basic arithmetic
|
||||
"normal": lambda a, b: b,
|
||||
"add": lambda a, b: np.clip(a + b, 0, 1),
|
||||
"subtract": lambda a, b: np.clip(a - b, 0, 1),
|
||||
"multiply": lambda a, b: a * b,
|
||||
"screen": lambda a, b: 1 - (1 - a) * (1 - b),
|
||||
|
||||
# Contrast
|
||||
"overlay": lambda a, b: np.where(a < 0.5, 2*a*b, 1 - 2*(1-a)*(1-b)),
|
||||
"softlight": lambda a, b: (1 - 2*b)*a*a + 2*b*a,
|
||||
"hardlight": lambda a, b: np.where(b < 0.5, 2*a*b, 1 - 2*(1-a)*(1-b)),
|
||||
|
||||
# Difference
|
||||
"difference": lambda a, b: np.abs(a - b),
|
||||
"exclusion": lambda a, b: a + b - 2*a*b,
|
||||
|
||||
# Dodge / burn
|
||||
"colordodge": lambda a, b: np.clip(a / (1 - b + 1e-6), 0, 1),
|
||||
"colorburn": lambda a, b: np.clip(1 - (1 - a) / (b + 1e-6), 0, 1),
|
||||
|
||||
# Light
|
||||
"linearlight": lambda a, b: np.clip(a + 2*b - 1, 0, 1),
|
||||
"vividlight": lambda a, b: np.where(b < 0.5,
|
||||
np.clip(1 - (1-a)/(2*b + 1e-6), 0, 1),
|
||||
np.clip(a / (2*(1-b) + 1e-6), 0, 1)),
|
||||
"pin_light": lambda a, b: np.where(b < 0.5,
|
||||
np.minimum(a, 2*b), np.maximum(a, 2*b - 1)),
|
||||
"hard_mix": lambda a, b: np.where(a + b >= 1.0, 1.0, 0.0),
|
||||
|
||||
# Compare
|
||||
"lighten": lambda a, b: np.maximum(a, b),
|
||||
"darken": lambda a, b: np.minimum(a, b),
|
||||
|
||||
# Grain
|
||||
"grain_extract": lambda a, b: np.clip(a - b + 0.5, 0, 1),
|
||||
"grain_merge": lambda a, b: np.clip(a + b - 0.5, 0, 1),
|
||||
}
|
||||
```
|
||||
|
||||
### Blend Mode Selection Guide
|
||||
|
||||
**Modes that brighten** (safe for dark inputs):
|
||||
- `screen` — always brightens. Two 50% gray layers screen to 75%. The go-to safe blend.
|
||||
- `add` — simple addition, clips at white. Good for sparkles, glows, particle overlays.
|
||||
- `colordodge` — extreme brightening at overlap zones. Can blow out. Use low opacity (0.3-0.5).
|
||||
- `linearlight` — aggressive brightening. Similar to add but with offset.
|
||||
|
||||
**Modes that darken** (avoid with dark inputs):
|
||||
- `multiply` — darkens everything. Only use when both layers are already bright.
|
||||
- `overlay` — darkens when base < 0.5, brightens when base > 0.5. Crushes dark inputs: `2 * 0.12 * 0.12 = 0.03`. Use `screen` instead for dark material.
|
||||
- `colorburn` — extreme darkening at overlap zones.
|
||||
|
||||
**Modes that create contrast**:
|
||||
- `softlight` — gentle contrast. Good for subtle texture overlay.
|
||||
- `hardlight` — strong contrast. Like overlay but keyed on the top layer.
|
||||
- `vividlight` — very aggressive contrast. Use sparingly.
|
||||
|
||||
**Modes that create color effects**:
|
||||
- `difference` — XOR-like patterns. Two identical layers difference to black; offset layers create wild colors. Great for psychedelic looks.
|
||||
- `exclusion` — softer version of difference. Creates complementary color patterns.
|
||||
- `hard_mix` — posterizes to pure black/white/saturated color at intersections.
|
||||
|
||||
**Modes for texture blending**:
|
||||
- `grain_extract` / `grain_merge` — extract a texture from one layer, apply it to another.
|
||||
|
||||
### Multi-Layer Chaining
|
||||
|
||||
```python
|
||||
# Pattern: render layers -> blend sequentially
|
||||
canvas_a = _render_vf(r, "md", vf_plasma, hf_angle(0.0), PAL_DENSE, f, t, S)
|
||||
canvas_b = _render_vf(r, "sm", vf_vortex, hf_time_cycle(0.1), PAL_RUNE, f, t, S)
|
||||
canvas_c = _render_vf(r, "lg", vf_rings, hf_distance(), PAL_BLOCKS, f, t, S)
|
||||
|
||||
result = blend_canvas(canvas_a, canvas_b, "screen", 0.8)
|
||||
result = blend_canvas(result, canvas_c, "difference", 0.6)
|
||||
```
|
||||
|
||||
Order matters: `screen(A, B)` is commutative, but `difference(screen(A,B), C)` differs from `difference(A, screen(B,C))`.
|
||||
|
||||
### Linear-Light Blend Modes
|
||||
|
||||
Standard `blend_canvas()` operates in sRGB space — the raw byte values. This is fine for most uses, but sRGB is perceptually non-linear: blending in sRGB darkens midtones and shifts hues slightly. For physically accurate blending (matching how light actually combines), convert to linear light first.
|
||||
|
||||
Uses `srgb_to_linear()` / `linear_to_srgb()` from `architecture.md` § OKLAB Color System.
|
||||
|
||||
```python
|
||||
def blend_canvas_linear(base, top, mode="normal", opacity=1.0):
|
||||
"""Blend in linear light space for physically accurate results.
|
||||
|
||||
Identical API to blend_canvas(), but converts sRGB → linear before
|
||||
blending and linear → sRGB after. More expensive (~2x) due to the
|
||||
gamma conversions, but produces correct results for additive blending,
|
||||
screen, and any mode where brightness matters.
|
||||
"""
|
||||
af = srgb_to_linear(base.astype(np.float32) / 255.0)
|
||||
bf = srgb_to_linear(top.astype(np.float32) / 255.0)
|
||||
fn = BLEND_MODES.get(mode, BLEND_MODES["normal"])
|
||||
result = fn(af, bf)
|
||||
if opacity < 1.0:
|
||||
result = af * (1 - opacity) + result * opacity
|
||||
result = linear_to_srgb(np.clip(result, 0, 1))
|
||||
return np.clip(result * 255, 0, 255).astype(np.uint8)
|
||||
```
|
||||
|
||||
**When to use `blend_canvas_linear()` vs `blend_canvas()`:**
|
||||
|
||||
| Scenario | Use | Why |
|
||||
|----------|-----|-----|
|
||||
| Screen-blending two bright layers | `linear` | sRGB screen over-brightens highlights |
|
||||
| Add mode for glow/bloom effects | `linear` | Additive light follows linear physics |
|
||||
| Blending text overlay at low opacity | `srgb` | Perceptual blending looks more natural for text |
|
||||
| Multiply for shadow/darkening | `srgb` | Differences are minimal for darken ops |
|
||||
| Color-critical work (matching reference) | `linear` | Avoids sRGB hue shifts in midtones |
|
||||
| Performance-critical inner loop | `srgb` | ~2x faster, good enough for most ASCII art |
|
||||
|
||||
**Batch version** for compositing many layers (converts once, blends multiple, converts back):
|
||||
|
||||
```python
|
||||
def blend_many_linear(layers, modes, opacities):
|
||||
"""Blend a stack of layers in linear light space.
|
||||
|
||||
Args:
|
||||
layers: list of uint8 (H,W,3) canvases
|
||||
modes: list of blend mode strings (len = len(layers) - 1)
|
||||
opacities: list of floats (len = len(layers) - 1)
|
||||
Returns:
|
||||
uint8 (H,W,3) canvas
|
||||
"""
|
||||
# Convert all to linear at once
|
||||
linear = [srgb_to_linear(l.astype(np.float32) / 255.0) for l in layers]
|
||||
result = linear[0]
|
||||
for i in range(1, len(linear)):
|
||||
fn = BLEND_MODES.get(modes[i-1], BLEND_MODES["normal"])
|
||||
blended = fn(result, linear[i])
|
||||
op = opacities[i-1]
|
||||
if op < 1.0:
|
||||
blended = result * (1 - op) + blended * op
|
||||
result = np.clip(blended, 0, 1)
|
||||
result = linear_to_srgb(result)
|
||||
return np.clip(result * 255, 0, 255).astype(np.uint8)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Multi-Grid Composition
|
||||
|
||||
This is the core visual technique. Rendering the same conceptual scene at different grid densities (character sizes) creates natural texture interference, because characters at different scales overlap at different spatial frequencies.
|
||||
|
||||
### Why It Works
|
||||
|
||||
- `sm` grid (10pt font): 320x83 characters. Fine detail, dense texture.
|
||||
- `md` grid (16pt): 192x56 characters. Medium density.
|
||||
- `lg` grid (20pt): 160x45 characters. Coarse, chunky characters.
|
||||
|
||||
When you render a plasma field on `sm` and a vortex on `lg`, then screen-blend them, the fine plasma texture shows through the gaps in the coarse vortex characters. The result has more visual complexity than either layer alone.
|
||||
|
||||
### The `_render_vf()` Helper
|
||||
|
||||
This is the workhorse function. It takes a value field + hue field + palette + grid, renders to a complete pixel canvas:
|
||||
|
||||
```python
|
||||
def _render_vf(r, grid_key, val_fn, hue_fn, pal, f, t, S, sat=0.8, threshold=0.03):
|
||||
"""Render a value field + hue field to a pixel canvas via a named grid.
|
||||
|
||||
Args:
|
||||
r: Renderer instance (has .get_grid())
|
||||
grid_key: "xs", "sm", "md", "lg", "xl", "xxl"
|
||||
val_fn: (g, f, t, S) -> float32 [0,1] array (rows, cols)
|
||||
hue_fn: callable (g, f, t, S) -> float32 hue array, OR float scalar
|
||||
pal: character palette string
|
||||
f: feature dict
|
||||
t: time in seconds
|
||||
S: persistent state dict
|
||||
sat: HSV saturation (0-1)
|
||||
threshold: minimum value to render (below = space)
|
||||
|
||||
Returns:
|
||||
uint8 array (VH, VW, 3) — full pixel canvas
|
||||
"""
|
||||
g = r.get_grid(grid_key)
|
||||
val = np.clip(val_fn(g, f, t, S), 0, 1)
|
||||
mask = val > threshold
|
||||
ch = val2char(val, mask, pal)
|
||||
|
||||
# Hue: either a callable or a fixed float
|
||||
if callable(hue_fn):
|
||||
h = hue_fn(g, f, t, S) % 1.0
|
||||
else:
|
||||
h = np.full((g.rows, g.cols), float(hue_fn), dtype=np.float32)
|
||||
|
||||
# CRITICAL: broadcast to full shape and copy (see Troubleshooting)
|
||||
h = np.broadcast_to(h, (g.rows, g.cols)).copy()
|
||||
|
||||
R, G, B = hsv2rgb(h, np.full_like(val, sat), val)
|
||||
co = mkc(R, G, B, g.rows, g.cols)
|
||||
return g.render(ch, co)
|
||||
```
|
||||
|
||||
### Grid Combination Strategies
|
||||
|
||||
| Combination | Effect | Good For |
|
||||
|-------------|--------|----------|
|
||||
| `sm` + `lg` | Maximum contrast between fine detail and chunky blocks | Bold, graphic looks |
|
||||
| `sm` + `md` | Subtle texture layering, similar scales | Organic, flowing looks |
|
||||
| `md` + `lg` + `xs` | Three-scale interference, maximum complexity | Psychedelic, dense |
|
||||
| `sm` + `sm` (different effects) | Same scale, pattern interference only | Moire, interference |
|
||||
|
||||
### Complete Multi-Grid Scene Example
|
||||
|
||||
```python
|
||||
def fx_psychedelic(r, f, t, S):
|
||||
"""Three-layer multi-grid scene with beat-reactive kaleidoscope."""
|
||||
# Layer A: plasma on medium grid with rainbow hue
|
||||
canvas_a = _render_vf(r, "md",
|
||||
lambda g, f, t, S: vf_plasma(g, f, t, S) * 1.3,
|
||||
hf_angle(0.0), PAL_DENSE, f, t, S, sat=0.8)
|
||||
|
||||
# Layer B: vortex on small grid with cycling hue
|
||||
canvas_b = _render_vf(r, "sm",
|
||||
lambda g, f, t, S: vf_vortex(g, f, t, S, twist=5.0) * 1.2,
|
||||
hf_time_cycle(0.1), PAL_RUNE, f, t, S, sat=0.7)
|
||||
|
||||
# Layer C: rings on large grid with distance hue
|
||||
canvas_c = _render_vf(r, "lg",
|
||||
lambda g, f, t, S: vf_rings(g, f, t, S, n_base=8, spacing_base=3) * 1.4,
|
||||
hf_distance(0.3, 0.02), PAL_BLOCKS, f, t, S, sat=0.9)
|
||||
|
||||
# Blend: A screened with B, then difference with C
|
||||
result = blend_canvas(canvas_a, canvas_b, "screen", 0.8)
|
||||
result = blend_canvas(result, canvas_c, "difference", 0.6)
|
||||
|
||||
# Beat-triggered kaleidoscope
|
||||
if f.get("bdecay", 0) > 0.3:
|
||||
result = sh_kaleidoscope(result.copy(), folds=6)
|
||||
|
||||
return result
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Adaptive Tone Mapping
|
||||
|
||||
### The Brightness Problem
|
||||
|
||||
ASCII characters are small bright dots on a black background. Most pixels in any frame are background (black). This means:
|
||||
- Mean frame brightness is inherently low (often 5-30 out of 255)
|
||||
- Different effect combinations produce wildly different brightness levels
|
||||
- A spiral scene might be 50 mean, while a fire scene is 9 mean
|
||||
- Linear multipliers (e.g., `canvas * 2.0`) either leave dark scenes dark or blow out bright scenes
|
||||
|
||||
### The `tonemap()` Function
|
||||
|
||||
Replaces linear brightness multipliers with adaptive per-frame normalization + gamma correction:
|
||||
|
||||
```python
|
||||
def tonemap(canvas, target_mean=90, gamma=0.75, black_point=2, white_point=253):
|
||||
"""Adaptive tone-mapping: normalizes + gamma-corrects so no frame is
|
||||
fully dark or washed out.
|
||||
|
||||
1. Compute 1st and 99.5th percentile on 4x subsample (16x fewer values,
|
||||
negligible accuracy loss, major speedup at 1080p+)
|
||||
2. Stretch that range to [0, 1]
|
||||
3. Apply gamma curve (< 1 lifts shadows, > 1 darkens)
|
||||
4. Rescale to [black_point, white_point]
|
||||
"""
|
||||
f = canvas.astype(np.float32)
|
||||
sub = f[::4, ::4] # 4x subsample: ~390K values vs ~6.2M at 1080p
|
||||
lo = np.percentile(sub, 1)
|
||||
hi = np.percentile(sub, 99.5)
|
||||
if hi - lo < 10:
|
||||
hi = max(hi, lo + 10) # near-uniform frame fallback
|
||||
f = np.clip((f - lo) / (hi - lo), 0.0, 1.0)
|
||||
np.power(f, gamma, out=f) # in-place: avoids allocation
|
||||
np.multiply(f, (white_point - black_point), out=f)
|
||||
np.add(f, black_point, out=f)
|
||||
return np.clip(f, 0, 255).astype(np.uint8)
|
||||
```
|
||||
|
||||
### Why Gamma, Not Linear
|
||||
|
||||
Linear multiplier `* 2.0`:
|
||||
```
|
||||
input 10 -> output 20 (still dark)
|
||||
input 100 -> output 200 (ok)
|
||||
input 200 -> output 255 (clipped, lost detail)
|
||||
```
|
||||
|
||||
Gamma 0.75 after normalization:
|
||||
```
|
||||
input 0.04 -> output 0.08 (lifted from invisible to visible)
|
||||
input 0.39 -> output 0.50 (moderate lift)
|
||||
input 0.78 -> output 0.84 (gentle lift, no clipping)
|
||||
```
|
||||
|
||||
Gamma < 1 compresses the highlights and expands the shadows. This is exactly what we need: lift dark ASCII content into visibility without blowing out the bright parts.
|
||||
|
||||
### Pipeline Ordering
|
||||
|
||||
The pipeline in `render_clip()` is:
|
||||
|
||||
```
|
||||
scene_fn(r, f, t, S) -> canvas
|
||||
|
|
||||
tonemap(canvas, gamma=scene_gamma)
|
||||
|
|
||||
FeedbackBuffer.apply(canvas, ...)
|
||||
|
|
||||
ShaderChain.apply(canvas, f=f, t=t)
|
||||
|
|
||||
ffmpeg pipe
|
||||
```
|
||||
|
||||
Tonemap runs BEFORE feedback and shaders. This means:
|
||||
- Feedback operates on normalized data (consistent behavior regardless of scene brightness)
|
||||
- Shaders like solarize, posterize, contrast operate on properly-ranged data
|
||||
- The brightness shader in the chain is no longer needed (tonemap handles it)
|
||||
|
||||
### Per-Scene Gamma Tuning
|
||||
|
||||
Default gamma is 0.75. Scenes that apply destructive post-processing need more aggressive lift because the destruction happens after tonemap:
|
||||
|
||||
| Scene Type | Recommended Gamma | Why |
|
||||
|------------|-------------------|-----|
|
||||
| Standard effects | 0.75 | Default, works for most scenes |
|
||||
| Solarize post-process | 0.50-0.60 | Solarize inverts bright pixels, reducing overall brightness |
|
||||
| Posterize post-process | 0.50-0.55 | Posterize quantizes, often crushing mid-values to black |
|
||||
| Heavy difference blending | 0.60-0.70 | Difference mode creates many near-zero pixels |
|
||||
| Already bright scenes | 0.85-1.0 | Don't over-boost scenes that are naturally bright |
|
||||
|
||||
Configure via the scene table:
|
||||
|
||||
```python
|
||||
SCENES = [
|
||||
{"start": 9.17, "end": 11.25, "name": "fire", "gamma": 0.55,
|
||||
"fx": fx_fire, "shaders": [("solarize", {"threshold": 200}), ...]},
|
||||
{"start": 25.96, "end": 27.29, "name": "diamond", "gamma": 0.5,
|
||||
"fx": fx_diamond, "shaders": [("bloom", {"thr": 90}), ...]},
|
||||
]
|
||||
```
|
||||
|
||||
### Brightness Verification
|
||||
|
||||
After rendering, spot-check frame brightness:
|
||||
|
||||
```python
|
||||
# In test-frame mode
|
||||
canvas = scene["fx"](r, feat, t, r.S)
|
||||
canvas = tonemap(canvas, gamma=scene.get("gamma", 0.75))
|
||||
chain = ShaderChain()
|
||||
for sn, kw in scene.get("shaders", []):
|
||||
chain.add(sn, **kw)
|
||||
canvas = chain.apply(canvas, f=feat, t=t)
|
||||
print(f"Mean brightness: {canvas.astype(float).mean():.1f}, max: {canvas.max()}")
|
||||
```
|
||||
|
||||
Target ranges after tonemap + shaders:
|
||||
- Quiet/ambient scenes: mean 30-60
|
||||
- Active scenes: mean 40-100
|
||||
- Climax/peak scenes: mean 60-150
|
||||
- If mean < 20: gamma is too high or a shader is destroying brightness
|
||||
- If mean > 180: gamma is too low or add is stacking too much
|
||||
|
||||
---
|
||||
|
||||
## FeedbackBuffer Spatial Transforms
|
||||
|
||||
The feedback buffer stores the previous frame and blends it into the current frame with decay. Spatial transforms applied to the buffer before blending create the illusion of motion in the feedback trail.
|
||||
|
||||
### Implementation
|
||||
|
||||
```python
|
||||
class FeedbackBuffer:
|
||||
def __init__(self):
|
||||
self.buf = None
|
||||
|
||||
def apply(self, canvas, decay=0.85, blend="screen", opacity=0.5,
|
||||
transform=None, transform_amt=0.02, hue_shift=0.0):
|
||||
if self.buf is None:
|
||||
self.buf = canvas.astype(np.float32) / 255.0
|
||||
return canvas
|
||||
|
||||
# Decay old buffer
|
||||
self.buf *= decay
|
||||
|
||||
# Spatial transform
|
||||
if transform:
|
||||
self.buf = self._transform(self.buf, transform, transform_amt)
|
||||
|
||||
# Hue shift the feedback for rainbow trails
|
||||
if hue_shift > 0:
|
||||
self.buf = self._hue_shift(self.buf, hue_shift)
|
||||
|
||||
# Blend feedback into current frame
|
||||
result = blend_canvas(canvas,
|
||||
np.clip(self.buf * 255, 0, 255).astype(np.uint8),
|
||||
blend, opacity)
|
||||
|
||||
# Update buffer with current frame
|
||||
self.buf = result.astype(np.float32) / 255.0
|
||||
return result
|
||||
|
||||
def _transform(self, buf, transform, amt):
|
||||
h, w = buf.shape[:2]
|
||||
if transform == "zoom":
|
||||
# Zoom in: sample from slightly inside (creates expanding tunnel)
|
||||
m = int(h * amt); n = int(w * amt)
|
||||
if m > 0 and n > 0:
|
||||
cropped = buf[m:-m or None, n:-n or None]
|
||||
# Resize back to full (nearest-neighbor for speed)
|
||||
buf = np.array(Image.fromarray(
|
||||
np.clip(cropped * 255, 0, 255).astype(np.uint8)
|
||||
).resize((w, h), Image.NEAREST)).astype(np.float32) / 255.0
|
||||
elif transform == "shrink":
|
||||
# Zoom out: pad edges, shrink center
|
||||
m = int(h * amt); n = int(w * amt)
|
||||
small = np.array(Image.fromarray(
|
||||
np.clip(buf * 255, 0, 255).astype(np.uint8)
|
||||
).resize((w - 2*n, h - 2*m), Image.NEAREST))
|
||||
new = np.zeros((h, w, 3), dtype=np.uint8)
|
||||
new[m:m+small.shape[0], n:n+small.shape[1]] = small
|
||||
buf = new.astype(np.float32) / 255.0
|
||||
elif transform == "rotate_cw":
|
||||
# Small clockwise rotation via affine
|
||||
angle = amt * 10 # amt=0.005 -> 0.05 degrees per frame
|
||||
cy, cx = h / 2, w / 2
|
||||
Y = np.arange(h, dtype=np.float32)[:, None]
|
||||
X = np.arange(w, dtype=np.float32)[None, :]
|
||||
cos_a, sin_a = np.cos(angle), np.sin(angle)
|
||||
sx = (X - cx) * cos_a + (Y - cy) * sin_a + cx
|
||||
sy = -(X - cx) * sin_a + (Y - cy) * cos_a + cy
|
||||
sx = np.clip(sx.astype(int), 0, w - 1)
|
||||
sy = np.clip(sy.astype(int), 0, h - 1)
|
||||
buf = buf[sy, sx]
|
||||
elif transform == "rotate_ccw":
|
||||
angle = -amt * 10
|
||||
cy, cx = h / 2, w / 2
|
||||
Y = np.arange(h, dtype=np.float32)[:, None]
|
||||
X = np.arange(w, dtype=np.float32)[None, :]
|
||||
cos_a, sin_a = np.cos(angle), np.sin(angle)
|
||||
sx = (X - cx) * cos_a + (Y - cy) * sin_a + cx
|
||||
sy = -(X - cx) * sin_a + (Y - cy) * cos_a + cy
|
||||
sx = np.clip(sx.astype(int), 0, w - 1)
|
||||
sy = np.clip(sy.astype(int), 0, h - 1)
|
||||
buf = buf[sy, sx]
|
||||
elif transform == "shift_up":
|
||||
pixels = max(1, int(h * amt))
|
||||
buf = np.roll(buf, -pixels, axis=0)
|
||||
buf[-pixels:] = 0 # black fill at bottom
|
||||
elif transform == "shift_down":
|
||||
pixels = max(1, int(h * amt))
|
||||
buf = np.roll(buf, pixels, axis=0)
|
||||
buf[:pixels] = 0
|
||||
elif transform == "mirror_h":
|
||||
buf = buf[:, ::-1]
|
||||
return buf
|
||||
|
||||
def _hue_shift(self, buf, amount):
|
||||
"""Rotate hues of the feedback buffer. Operates on float32 [0,1]."""
|
||||
rgb = np.clip(buf * 255, 0, 255).astype(np.uint8)
|
||||
hsv = np.zeros_like(buf)
|
||||
# Simple approximate RGB->HSV->shift->RGB
|
||||
r, g, b = buf[:,:,0], buf[:,:,1], buf[:,:,2]
|
||||
mx = np.maximum(np.maximum(r, g), b)
|
||||
mn = np.minimum(np.minimum(r, g), b)
|
||||
delta = mx - mn + 1e-10
|
||||
# Hue
|
||||
h = np.where(mx == r, ((g - b) / delta) % 6,
|
||||
np.where(mx == g, (b - r) / delta + 2, (r - g) / delta + 4))
|
||||
h = (h / 6 + amount) % 1.0
|
||||
# Reconstruct with shifted hue (simplified)
|
||||
s = delta / (mx + 1e-10)
|
||||
v = mx
|
||||
c = v * s; x = c * (1 - np.abs((h * 6) % 2 - 1)); m = v - c
|
||||
ro = np.zeros_like(h); go = np.zeros_like(h); bo = np.zeros_like(h)
|
||||
for lo, hi, rv, gv, bv in [(0,1,c,x,0),(1,2,x,c,0),(2,3,0,c,x),
|
||||
(3,4,0,x,c),(4,5,x,0,c),(5,6,c,0,x)]:
|
||||
mask = ((h*6) >= lo) & ((h*6) < hi)
|
||||
ro[mask] = rv[mask] if not isinstance(rv, (int,float)) else rv
|
||||
go[mask] = gv[mask] if not isinstance(gv, (int,float)) else gv
|
||||
bo[mask] = bv[mask] if not isinstance(bv, (int,float)) else bv
|
||||
return np.stack([ro+m, go+m, bo+m], axis=2)
|
||||
```
|
||||
|
||||
### Feedback Presets
|
||||
|
||||
| Preset | Config | Visual Effect |
|
||||
|--------|--------|---------------|
|
||||
| Infinite zoom tunnel | `decay=0.8, blend="screen", transform="zoom", transform_amt=0.015` | Expanding ring patterns |
|
||||
| Rainbow trails | `decay=0.7, blend="screen", transform="zoom", transform_amt=0.01, hue_shift=0.02` | Psychedelic color trails |
|
||||
| Ghostly echo | `decay=0.9, blend="add", opacity=0.15, transform="shift_up", transform_amt=0.01` | Faint upward smearing |
|
||||
| Kaleidoscopic recursion | `decay=0.75, blend="screen", transform="rotate_cw", transform_amt=0.005, hue_shift=0.01` | Rotating mandala feedback |
|
||||
| Color evolution | `decay=0.8, blend="difference", opacity=0.4, hue_shift=0.03` | Frame-to-frame color XOR |
|
||||
| Rising heat haze | `decay=0.5, blend="add", opacity=0.2, transform="shift_up", transform_amt=0.02` | Hot air shimmer |
|
||||
|
||||
---
|
||||
|
||||
## Masking / Stencil System
|
||||
|
||||
Masks are float32 arrays `(rows, cols)` or `(VH, VW)` in range [0, 1]. They control where effects are visible: 1.0 = fully visible, 0.0 = fully hidden. Use masks to create figure/ground relationships, focal points, and shaped reveals.
|
||||
|
||||
### Shape Masks
|
||||
|
||||
```python
|
||||
def mask_circle(g, cx_frac=0.5, cy_frac=0.5, radius=0.3, feather=0.05):
|
||||
"""Circular mask centered at (cx_frac, cy_frac) in normalized coords.
|
||||
feather: width of soft edge (0 = hard cutoff)."""
|
||||
asp = g.cw / g.ch if hasattr(g, 'cw') else 1.0
|
||||
dx = (g.cc / g.cols - cx_frac)
|
||||
dy = (g.rr / g.rows - cy_frac) * asp
|
||||
d = np.sqrt(dx**2 + dy**2)
|
||||
if feather > 0:
|
||||
return np.clip(1.0 - (d - radius) / feather, 0, 1)
|
||||
return (d <= radius).astype(np.float32)
|
||||
|
||||
def mask_rect(g, x0=0.2, y0=0.2, x1=0.8, y1=0.8, feather=0.03):
|
||||
"""Rectangular mask. Coordinates in [0,1] normalized."""
|
||||
dx = np.maximum(x0 - g.cc / g.cols, g.cc / g.cols - x1)
|
||||
dy = np.maximum(y0 - g.rr / g.rows, g.rr / g.rows - y1)
|
||||
d = np.maximum(dx, dy)
|
||||
if feather > 0:
|
||||
return np.clip(1.0 - d / feather, 0, 1)
|
||||
return (d <= 0).astype(np.float32)
|
||||
|
||||
def mask_ring(g, cx_frac=0.5, cy_frac=0.5, inner_r=0.15, outer_r=0.35,
|
||||
feather=0.03):
|
||||
"""Ring / annulus mask."""
|
||||
inner = mask_circle(g, cx_frac, cy_frac, inner_r, feather)
|
||||
outer = mask_circle(g, cx_frac, cy_frac, outer_r, feather)
|
||||
return outer - inner
|
||||
|
||||
def mask_gradient_h(g, start=0.0, end=1.0):
|
||||
"""Left-to-right gradient mask."""
|
||||
return np.clip((g.cc / g.cols - start) / (end - start + 1e-10), 0, 1).astype(np.float32)
|
||||
|
||||
def mask_gradient_v(g, start=0.0, end=1.0):
|
||||
"""Top-to-bottom gradient mask."""
|
||||
return np.clip((g.rr / g.rows - start) / (end - start + 1e-10), 0, 1).astype(np.float32)
|
||||
|
||||
def mask_gradient_radial(g, cx_frac=0.5, cy_frac=0.5, inner=0.0, outer=0.5):
|
||||
"""Radial gradient mask — bright at center, dark at edges."""
|
||||
d = np.sqrt((g.cc / g.cols - cx_frac)**2 + (g.rr / g.rows - cy_frac)**2)
|
||||
return np.clip(1.0 - (d - inner) / (outer - inner + 1e-10), 0, 1)
|
||||
```
|
||||
|
||||
### Value Field as Mask
|
||||
|
||||
Use any `vf_*` function's output as a spatial mask:
|
||||
|
||||
```python
|
||||
def mask_from_vf(vf_result, threshold=0.5, feather=0.1):
|
||||
"""Convert a value field to a mask by thresholding.
|
||||
feather: smooth edge width around threshold."""
|
||||
if feather > 0:
|
||||
return np.clip((vf_result - threshold + feather) / (2 * feather), 0, 1)
|
||||
return (vf_result > threshold).astype(np.float32)
|
||||
|
||||
def mask_select(mask, vf_a, vf_b):
|
||||
"""Spatial conditional: show vf_a where mask is 1, vf_b where mask is 0.
|
||||
mask: float32 [0,1] array. Intermediate values blend."""
|
||||
return vf_a * mask + vf_b * (1 - mask)
|
||||
```
|
||||
|
||||
### Text Stencil
|
||||
|
||||
Render text to a mask. Effects are visible only through the letterforms:
|
||||
|
||||
```python
|
||||
def mask_text(grid, text, row_frac=0.5, font=None, font_size=None):
|
||||
"""Render text string as a float32 mask [0,1] at grid resolution.
|
||||
Characters = 1.0, background = 0.0.
|
||||
|
||||
row_frac: vertical position as fraction of grid height.
|
||||
font: PIL ImageFont (defaults to grid's font if None).
|
||||
font_size: override font size for the mask text (for larger stencil text).
|
||||
"""
|
||||
from PIL import Image, ImageDraw, ImageFont
|
||||
|
||||
f = font or grid.font
|
||||
if font_size and font != grid.font:
|
||||
f = ImageFont.truetype(font.path, font_size)
|
||||
|
||||
# Render text to image at pixel resolution, then downsample to grid
|
||||
img = Image.new("L", (grid.cols * grid.cw, grid.ch), 0)
|
||||
draw = ImageDraw.Draw(img)
|
||||
bbox = draw.textbbox((0, 0), text, font=f)
|
||||
tw = bbox[2] - bbox[0]
|
||||
x = (grid.cols * grid.cw - tw) // 2
|
||||
draw.text((x, 0), text, fill=255, font=f)
|
||||
row_mask = np.array(img, dtype=np.float32) / 255.0
|
||||
|
||||
# Place in full grid mask
|
||||
mask = np.zeros((grid.rows, grid.cols), dtype=np.float32)
|
||||
target_row = int(grid.rows * row_frac)
|
||||
# Downsample rendered text to grid cells
|
||||
for c in range(grid.cols):
|
||||
px = c * grid.cw
|
||||
if px + grid.cw <= row_mask.shape[1]:
|
||||
cell = row_mask[:, px:px + grid.cw]
|
||||
if cell.mean() > 0.1:
|
||||
mask[target_row, c] = cell.mean()
|
||||
return mask
|
||||
|
||||
def mask_text_block(grid, lines, start_row_frac=0.3, font=None):
|
||||
"""Multi-line text stencil. Returns full grid mask."""
|
||||
mask = np.zeros((grid.rows, grid.cols), dtype=np.float32)
|
||||
for i, line in enumerate(lines):
|
||||
row_frac = start_row_frac + i / grid.rows
|
||||
line_mask = mask_text(grid, line, row_frac, font)
|
||||
mask = np.maximum(mask, line_mask)
|
||||
return mask
|
||||
```
|
||||
|
||||
### Animated Masks
|
||||
|
||||
Masks that change over time for reveals, wipes, and morphing:
|
||||
|
||||
```python
|
||||
def mask_iris(g, t, t_start, t_end, cx_frac=0.5, cy_frac=0.5,
|
||||
max_radius=0.7, ease_fn=None):
|
||||
"""Iris open/close: circle that grows from 0 to max_radius.
|
||||
ease_fn: easing function (default: ease_in_out_cubic from effects.md)."""
|
||||
if ease_fn is None:
|
||||
ease_fn = lambda x: x * x * (3 - 2 * x) # smoothstep fallback
|
||||
progress = np.clip((t - t_start) / (t_end - t_start), 0, 1)
|
||||
radius = ease_fn(progress) * max_radius
|
||||
return mask_circle(g, cx_frac, cy_frac, radius, feather=0.03)
|
||||
|
||||
def mask_wipe_h(g, t, t_start, t_end, direction="right"):
|
||||
"""Horizontal wipe reveal."""
|
||||
progress = np.clip((t - t_start) / (t_end - t_start), 0, 1)
|
||||
if direction == "left":
|
||||
progress = 1 - progress
|
||||
return mask_gradient_h(g, start=progress - 0.05, end=progress + 0.05)
|
||||
|
||||
def mask_wipe_v(g, t, t_start, t_end, direction="down"):
|
||||
"""Vertical wipe reveal."""
|
||||
progress = np.clip((t - t_start) / (t_end - t_start), 0, 1)
|
||||
if direction == "up":
|
||||
progress = 1 - progress
|
||||
return mask_gradient_v(g, start=progress - 0.05, end=progress + 0.05)
|
||||
|
||||
def mask_dissolve(g, t, t_start, t_end, seed=42):
|
||||
"""Random pixel dissolve — noise threshold sweeps from 0 to 1."""
|
||||
progress = np.clip((t - t_start) / (t_end - t_start), 0, 1)
|
||||
rng = np.random.RandomState(seed)
|
||||
noise = rng.random((g.rows, g.cols)).astype(np.float32)
|
||||
return (noise < progress).astype(np.float32)
|
||||
```
|
||||
|
||||
### Mask Boolean Operations
|
||||
|
||||
```python
|
||||
def mask_union(a, b):
|
||||
"""OR — visible where either mask is active."""
|
||||
return np.maximum(a, b)
|
||||
|
||||
def mask_intersect(a, b):
|
||||
"""AND — visible only where both masks are active."""
|
||||
return np.minimum(a, b)
|
||||
|
||||
def mask_subtract(a, b):
|
||||
"""A minus B — visible where A is active but B is not."""
|
||||
return np.clip(a - b, 0, 1)
|
||||
|
||||
def mask_invert(m):
|
||||
"""NOT — flip mask."""
|
||||
return 1.0 - m
|
||||
```
|
||||
|
||||
### Applying Masks to Canvases
|
||||
|
||||
```python
|
||||
def apply_mask_canvas(canvas, mask, bg_canvas=None):
|
||||
"""Apply a grid-resolution mask to a pixel canvas.
|
||||
Expands mask from (rows, cols) to (VH, VW) via nearest-neighbor.
|
||||
|
||||
canvas: uint8 (VH, VW, 3)
|
||||
mask: float32 (rows, cols) [0,1]
|
||||
bg_canvas: what shows through where mask=0. None = black.
|
||||
"""
|
||||
# Expand mask to pixel resolution
|
||||
mask_px = np.repeat(np.repeat(mask, canvas.shape[0] // mask.shape[0] + 1, axis=0),
|
||||
canvas.shape[1] // mask.shape[1] + 1, axis=1)
|
||||
mask_px = mask_px[:canvas.shape[0], :canvas.shape[1]]
|
||||
|
||||
if bg_canvas is not None:
|
||||
return np.clip(canvas * mask_px[:, :, None] +
|
||||
bg_canvas * (1 - mask_px[:, :, None]), 0, 255).astype(np.uint8)
|
||||
return np.clip(canvas * mask_px[:, :, None], 0, 255).astype(np.uint8)
|
||||
|
||||
def apply_mask_vf(vf_a, vf_b, mask):
|
||||
"""Apply mask at value-field level — blend two value fields spatially.
|
||||
All arrays are (rows, cols) float32."""
|
||||
return vf_a * mask + vf_b * (1 - mask)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## PixelBlendStack
|
||||
|
||||
Higher-level wrapper for multi-layer compositing:
|
||||
|
||||
```python
|
||||
class PixelBlendStack:
|
||||
def __init__(self):
|
||||
self.layers = []
|
||||
|
||||
def add(self, canvas, mode="normal", opacity=1.0):
|
||||
self.layers.append((canvas, mode, opacity))
|
||||
return self
|
||||
|
||||
def composite(self):
|
||||
if not self.layers:
|
||||
return np.zeros((VH, VW, 3), dtype=np.uint8)
|
||||
result = self.layers[0][0]
|
||||
for canvas, mode, opacity in self.layers[1:]:
|
||||
result = blend_canvas(result, canvas, mode, opacity)
|
||||
return result
|
||||
```
|
||||
|
||||
## Text Backdrop (Readability Mask)
|
||||
|
||||
When placing readable text over busy multi-grid ASCII backgrounds, the text will blend into the background and become illegible. **Always apply a dark backdrop behind text regions.**
|
||||
|
||||
The technique: compute the bounding box of all text glyphs, create a gaussian-blurred dark mask covering that area with padding, and multiply the background by `(1 - mask * darkness)` before rendering text on top.
|
||||
|
||||
```python
|
||||
from scipy.ndimage import gaussian_filter
|
||||
|
||||
def apply_text_backdrop(canvas, glyphs, padding=80, darkness=0.75):
|
||||
"""Darken the background behind text for readability.
|
||||
|
||||
Call AFTER rendering background, BEFORE rendering text.
|
||||
|
||||
Args:
|
||||
canvas: (VH, VW, 3) uint8 background
|
||||
glyphs: list of {"x": float, "y": float, ...} glyph positions
|
||||
padding: pixel padding around text bounding box
|
||||
darkness: 0.0 = no darkening, 1.0 = fully black
|
||||
Returns:
|
||||
darkened canvas (uint8)
|
||||
"""
|
||||
if not glyphs:
|
||||
return canvas
|
||||
xs = [g['x'] for g in glyphs]
|
||||
ys = [g['y'] for g in glyphs]
|
||||
x0 = max(0, int(min(xs)) - padding)
|
||||
y0 = max(0, int(min(ys)) - padding)
|
||||
x1 = min(VW, int(max(xs)) + padding + 50) # extra for char width
|
||||
y1 = min(VH, int(max(ys)) + padding + 60) # extra for char height
|
||||
|
||||
# Soft dark mask with gaussian blur for feathered edges
|
||||
mask = np.zeros((VH, VW), dtype=np.float32)
|
||||
mask[y0:y1, x0:x1] = 1.0
|
||||
mask = gaussian_filter(mask, sigma=padding * 0.6)
|
||||
|
||||
factor = 1.0 - mask * darkness
|
||||
return (canvas.astype(np.float32) * factor[:, :, np.newaxis]).astype(np.uint8)
|
||||
```
|
||||
|
||||
### Usage in render pipeline
|
||||
|
||||
Insert between background rendering and text rendering:
|
||||
|
||||
```python
|
||||
# 1. Render background (multi-grid ASCII effects)
|
||||
bg = render_background(cfg, t)
|
||||
|
||||
# 2. Darken behind text region
|
||||
bg = apply_text_backdrop(bg, frame_glyphs, padding=80, darkness=0.75)
|
||||
|
||||
# 3. Render text on top (now readable against dark backdrop)
|
||||
bg = text_renderer.render(bg, frame_glyphs, color=(255, 255, 255))
|
||||
```
|
||||
|
||||
Combine with **reverse vignette** (see shaders.md) for scenes where text is always centered — the reverse vignette provides a persistent center-dark zone, while the backdrop handles per-frame glyph positions.
|
||||
|
||||
## External Layout Oracle Pattern
|
||||
|
||||
For text-heavy videos where text needs to dynamically reflow around obstacles (shapes, icons, other text), use an external layout engine to pre-compute glyph positions and feed them into the Python renderer via JSON.
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
Layout Engine (browser/Node.js) → layouts.json → Python ASCII Renderer
|
||||
↑ ↑
|
||||
Computes per-frame Reads glyph positions,
|
||||
glyph (x,y) positions renders as ASCII chars
|
||||
with obstacle-aware reflow with full effect pipeline
|
||||
```
|
||||
|
||||
### JSON interchange format
|
||||
|
||||
```json
|
||||
{
|
||||
"meta": {
|
||||
"canvas_width": 1080, "canvas_height": 1080,
|
||||
"fps": 24, "total_frames": 1248,
|
||||
"fonts": {
|
||||
"body": {"charW": 12.04, "charH": 24, "fontSize": 20},
|
||||
"hero": {"charW": 24.08, "charH": 48, "fontSize": 40}
|
||||
}
|
||||
},
|
||||
"scenes": [
|
||||
{
|
||||
"id": "scene_name",
|
||||
"start_frame": 0, "end_frame": 96,
|
||||
"frames": {
|
||||
"0": {
|
||||
"glyphs": [
|
||||
{"char": "H", "x": 287.1, "y": 400.0, "alpha": 1.0},
|
||||
{"char": "e", "x": 311.2, "y": 400.0, "alpha": 1.0}
|
||||
],
|
||||
"obstacles": [
|
||||
{"type": "circle", "cx": 540, "cy": 540, "r": 80},
|
||||
{"type": "rect", "x": 300, "y": 500, "w": 120, "h": 80}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### When to use
|
||||
|
||||
- Text that dynamically reflows around moving objects
|
||||
- Per-glyph animation (reveal, scatter, physics)
|
||||
- Variable typography that needs precise measurement
|
||||
- Any case where Python's Pillow text layout is insufficient
|
||||
|
||||
### When NOT to use
|
||||
|
||||
- Static centered text (just use PIL `draw.text()` directly)
|
||||
- Text that only fades in/out without spatial animation
|
||||
- Simple typewriter effects (handle in Python with a character counter)
|
||||
|
||||
### Running the oracle
|
||||
|
||||
Use Playwright to run the layout engine in a headless browser:
|
||||
|
||||
```javascript
|
||||
// extract.mjs
|
||||
import { chromium } from 'playwright';
|
||||
const browser = await chromium.launch({ headless: true });
|
||||
const page = await browser.newPage();
|
||||
await page.goto(`file://${oraclePath}`);
|
||||
await page.waitForFunction(() => window.__ORACLE_DONE__ === true, null, { timeout: 60000 });
|
||||
const result = await page.evaluate(() => window.__ORACLE_RESULT__);
|
||||
writeFileSync('layouts.json', JSON.stringify(result));
|
||||
await browser.close();
|
||||
```
|
||||
|
||||
### Consuming in Python
|
||||
|
||||
```python
|
||||
# In the renderer, map pixel positions to the canvas:
|
||||
for glyph in frame_data['glyphs']:
|
||||
char, px, py = glyph['char'], glyph['x'], glyph['y']
|
||||
alpha = glyph.get('alpha', 1.0)
|
||||
# Render using PIL draw.text() at exact pixel position
|
||||
draw.text((px, py), char, fill=(int(255*alpha),)*3, font=font)
|
||||
```
|
||||
|
||||
Obstacles from the JSON can also be rendered as glowing ASCII shapes (circles, rectangles) to visualize the reflow zones.
|
||||
File diff suppressed because it is too large
Load Diff
@@ -0,0 +1,685 @@
|
||||
# Input Sources
|
||||
|
||||
> **See also:** architecture.md · effects.md · scenes.md · shaders.md · optimization.md · troubleshooting.md
|
||||
|
||||
## Audio Analysis
|
||||
|
||||
### Loading
|
||||
|
||||
```python
|
||||
tmp = tempfile.mktemp(suffix=".wav")
|
||||
subprocess.run(["ffmpeg", "-y", "-i", input_path, "-ac", "1", "-ar", "22050",
|
||||
"-sample_fmt", "s16", tmp], capture_output=True, check=True)
|
||||
with wave.open(tmp) as wf:
|
||||
sr = wf.getframerate()
|
||||
raw = wf.readframes(wf.getnframes())
|
||||
samples = np.frombuffer(raw, dtype=np.int16).astype(np.float32) / 32768.0
|
||||
```
|
||||
|
||||
### Per-Frame FFT
|
||||
|
||||
```python
|
||||
hop = sr // fps # samples per frame
|
||||
win = hop * 2 # analysis window (2x hop for overlap)
|
||||
window = np.hanning(win)
|
||||
freqs = rfftfreq(win, 1.0 / sr)
|
||||
|
||||
bands = {
|
||||
"sub": (freqs >= 20) & (freqs < 80),
|
||||
"bass": (freqs >= 80) & (freqs < 250),
|
||||
"lomid": (freqs >= 250) & (freqs < 500),
|
||||
"mid": (freqs >= 500) & (freqs < 2000),
|
||||
"himid": (freqs >= 2000)& (freqs < 6000),
|
||||
"hi": (freqs >= 6000),
|
||||
}
|
||||
```
|
||||
|
||||
For each frame: extract chunk, apply window, FFT, compute band energies.
|
||||
|
||||
### Feature Set
|
||||
|
||||
| Feature | Formula | Controls |
|
||||
|---------|---------|----------|
|
||||
| `rms` | `sqrt(mean(chunk²))` | Overall loudness/energy |
|
||||
| `sub`..`hi` | `sqrt(mean(band_magnitudes²))` | Per-band energy |
|
||||
| `centroid` | `sum(freq*mag) / sum(mag)` | Brightness/timbre |
|
||||
| `flatness` | `geomean(mag) / mean(mag)` | Noise vs tone |
|
||||
| `flux` | `sum(max(0, mag - prev_mag))` | Transient strength |
|
||||
| `sub_r`..`hi_r` | `band / sum(all_bands)` | Spectral shape (volume-independent) |
|
||||
| `cent_d` | `abs(gradient(centroid))` | Timbral change rate |
|
||||
| `beat` | Flux peak detection | Binary beat onset |
|
||||
| `bdecay` | Exponential decay from beats | Smooth beat pulse (0→1→0) |
|
||||
|
||||
**Band ratios are critical** — they decouple spectral shape from volume, so a quiet bass section and a loud bass section both read as "bassy" rather than just "loud" vs "quiet".
|
||||
|
||||
### Smoothing
|
||||
|
||||
EMA prevents visual jitter:
|
||||
|
||||
```python
|
||||
def ema(arr, alpha):
|
||||
out = np.empty_like(arr); out[0] = arr[0]
|
||||
for i in range(1, len(arr)):
|
||||
out[i] = alpha * arr[i] + (1 - alpha) * out[i-1]
|
||||
return out
|
||||
|
||||
# Slow-moving features (alpha=0.12): centroid, flatness, band ratios, cent_d
|
||||
# Fast-moving features (alpha=0.3): rms, flux, raw bands
|
||||
```
|
||||
|
||||
### Beat Detection
|
||||
|
||||
```python
|
||||
flux_smooth = np.convolve(flux, np.ones(5)/5, mode="same")
|
||||
peaks, _ = signal.find_peaks(flux_smooth, height=0.15, distance=fps//5, prominence=0.05)
|
||||
|
||||
beat = np.zeros(n_frames)
|
||||
bdecay = np.zeros(n_frames, dtype=np.float32)
|
||||
for p in peaks:
|
||||
beat[p] = 1.0
|
||||
for d in range(fps // 2):
|
||||
if p + d < n_frames:
|
||||
bdecay[p + d] = max(bdecay[p + d], math.exp(-d * 2.5 / (fps // 2)))
|
||||
```
|
||||
|
||||
`bdecay` gives smooth 0→1→0 pulse per beat, decaying over ~0.5s. Use for flash/glitch/mirror triggers.
|
||||
|
||||
### Normalization
|
||||
|
||||
After computing all frames, normalize each feature to 0-1:
|
||||
|
||||
```python
|
||||
for k in features:
|
||||
a = features[k]
|
||||
lo, hi = a.min(), a.max()
|
||||
features[k] = (a - lo) / (hi - lo + 1e-10)
|
||||
```
|
||||
|
||||
## Video Sampling
|
||||
|
||||
### Frame Extraction
|
||||
|
||||
```python
|
||||
# Method 1: ffmpeg pipe (memory efficient)
|
||||
cmd = ["ffmpeg", "-i", input_video, "-f", "rawvideo", "-pix_fmt", "rgb24",
|
||||
"-s", f"{target_w}x{target_h}", "-r", str(fps), "-"]
|
||||
pipe = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
|
||||
frame_size = target_w * target_h * 3
|
||||
for fi in range(n_frames):
|
||||
raw = pipe.stdout.read(frame_size)
|
||||
if len(raw) < frame_size: break
|
||||
frame = np.frombuffer(raw, dtype=np.uint8).reshape(target_h, target_w, 3)
|
||||
# process frame...
|
||||
|
||||
# Method 2: OpenCV (if available)
|
||||
cap = cv2.VideoCapture(input_video)
|
||||
```
|
||||
|
||||
### Luminance-to-Character Mapping
|
||||
|
||||
Convert video pixels to ASCII characters based on brightness:
|
||||
|
||||
```python
|
||||
def frame_to_ascii(frame_rgb, grid, pal=PAL_DEFAULT):
|
||||
"""Convert video frame to character + color arrays."""
|
||||
rows, cols = grid.rows, grid.cols
|
||||
# Resize frame to grid dimensions
|
||||
small = np.array(Image.fromarray(frame_rgb).resize((cols, rows), Image.LANCZOS))
|
||||
# Luminance
|
||||
lum = (0.299 * small[:,:,0] + 0.587 * small[:,:,1] + 0.114 * small[:,:,2]) / 255.0
|
||||
# Map to chars
|
||||
chars = val2char(lum, lum > 0.02, pal)
|
||||
# Colors: use source pixel colors, scaled by luminance for visibility
|
||||
colors = np.clip(small * np.clip(lum[:,:,None] * 1.5 + 0.3, 0.3, 1), 0, 255).astype(np.uint8)
|
||||
return chars, colors
|
||||
```
|
||||
|
||||
### Edge-Weighted Character Mapping
|
||||
|
||||
Use edge detection for more detail in contour regions:
|
||||
|
||||
```python
|
||||
def frame_to_ascii_edges(frame_rgb, grid, pal=PAL_DEFAULT, edge_pal=PAL_BOX):
|
||||
gray = np.mean(frame_rgb, axis=2)
|
||||
small_gray = resize(gray, (grid.rows, grid.cols))
|
||||
lum = small_gray / 255.0
|
||||
|
||||
# Sobel edge detection
|
||||
gx = np.abs(small_gray[:, 2:] - small_gray[:, :-2])
|
||||
gy = np.abs(small_gray[2:, :] - small_gray[:-2, :])
|
||||
edge = np.zeros_like(small_gray)
|
||||
edge[:, 1:-1] += gx; edge[1:-1, :] += gy
|
||||
edge = np.clip(edge / edge.max(), 0, 1)
|
||||
|
||||
# Edge regions get box drawing chars, flat regions get brightness chars
|
||||
is_edge = edge > 0.15
|
||||
chars = val2char(lum, lum > 0.02, pal)
|
||||
edge_chars = val2char(edge, is_edge, edge_pal)
|
||||
chars[is_edge] = edge_chars[is_edge]
|
||||
|
||||
return chars, colors
|
||||
```
|
||||
|
||||
### Motion Detection
|
||||
|
||||
Detect pixel changes between frames for motion-reactive effects:
|
||||
|
||||
```python
|
||||
prev_frame = None
|
||||
def compute_motion(frame):
|
||||
global prev_frame
|
||||
if prev_frame is None:
|
||||
prev_frame = frame.astype(np.float32)
|
||||
return np.zeros(frame.shape[:2])
|
||||
diff = np.abs(frame.astype(np.float32) - prev_frame).mean(axis=2)
|
||||
prev_frame = frame.astype(np.float32) * 0.7 + prev_frame * 0.3 # smoothed
|
||||
return np.clip(diff / 30.0, 0, 1) # normalized motion map
|
||||
```
|
||||
|
||||
Use motion map to drive particle emission, glitch intensity, or character density.
|
||||
|
||||
### Video Feature Extraction
|
||||
|
||||
Per-frame features analogous to audio features, for driving effects:
|
||||
|
||||
```python
|
||||
def analyze_video_frame(frame_rgb):
|
||||
gray = np.mean(frame_rgb, axis=2)
|
||||
return {
|
||||
"brightness": gray.mean() / 255.0,
|
||||
"contrast": gray.std() / 128.0,
|
||||
"edge_density": compute_edge_density(gray),
|
||||
"motion": compute_motion(frame_rgb).mean(),
|
||||
"dominant_hue": compute_dominant_hue(frame_rgb),
|
||||
"color_variance": compute_color_variance(frame_rgb),
|
||||
}
|
||||
```
|
||||
|
||||
## Image Sequence
|
||||
|
||||
### Static Image to ASCII
|
||||
|
||||
Same as single video frame conversion. For animated sequences:
|
||||
|
||||
```python
|
||||
import glob
|
||||
frames = sorted(glob.glob("frames/*.png"))
|
||||
for fi, path in enumerate(frames):
|
||||
img = np.array(Image.open(path).resize((VW, VH)))
|
||||
chars, colors = frame_to_ascii(img, grid, pal)
|
||||
```
|
||||
|
||||
### Image as Texture Source
|
||||
|
||||
Use an image as a background texture that effects modulate:
|
||||
|
||||
```python
|
||||
def load_texture(path, grid):
|
||||
img = np.array(Image.open(path).resize((grid.cols, grid.rows)))
|
||||
lum = np.mean(img, axis=2) / 255.0
|
||||
return lum, img # luminance for char mapping, RGB for colors
|
||||
```
|
||||
|
||||
## Text / Lyrics
|
||||
|
||||
### SRT Parsing
|
||||
|
||||
```python
|
||||
import re
|
||||
def parse_srt(path):
|
||||
"""Returns [(start_sec, end_sec, text), ...]"""
|
||||
entries = []
|
||||
with open(path) as f:
|
||||
content = f.read()
|
||||
blocks = content.strip().split("\n\n")
|
||||
for block in blocks:
|
||||
lines = block.strip().split("\n")
|
||||
if len(lines) >= 3:
|
||||
times = lines[1]
|
||||
m = re.match(r"(\d+):(\d+):(\d+),(\d+) --> (\d+):(\d+):(\d+),(\d+)", times)
|
||||
if m:
|
||||
g = [int(x) for x in m.groups()]
|
||||
start = g[0]*3600 + g[1]*60 + g[2] + g[3]/1000
|
||||
end = g[4]*3600 + g[5]*60 + g[6] + g[7]/1000
|
||||
text = " ".join(lines[2:])
|
||||
entries.append((start, end, text))
|
||||
return entries
|
||||
```
|
||||
|
||||
### Lyrics Display Modes
|
||||
|
||||
- **Typewriter**: characters appear left-to-right over the time window
|
||||
- **Fade-in**: whole line fades from dark to bright
|
||||
- **Flash**: appear instantly on beat, fade out
|
||||
- **Scatter**: characters start at random positions, converge to final position
|
||||
- **Wave**: text follows a sine wave path
|
||||
|
||||
```python
|
||||
def lyrics_typewriter(ch, co, text, row, col, t, t_start, t_end, color):
|
||||
"""Reveal characters progressively over time window."""
|
||||
progress = np.clip((t - t_start) / (t_end - t_start), 0, 1)
|
||||
n_visible = int(len(text) * progress)
|
||||
stamp(ch, co, text[:n_visible], row, col, color)
|
||||
```
|
||||
|
||||
## Generative (No Input)
|
||||
|
||||
For pure generative ASCII art, the "features" dict is synthesized from time:
|
||||
|
||||
```python
|
||||
def synthetic_features(t, bpm=120):
|
||||
"""Generate audio-like features from time alone."""
|
||||
beat_period = 60.0 / bpm
|
||||
beat_phase = (t % beat_period) / beat_period
|
||||
return {
|
||||
"rms": 0.5 + 0.3 * math.sin(t * 0.5),
|
||||
"bass": 0.5 + 0.4 * math.sin(t * 2 * math.pi / beat_period),
|
||||
"sub": 0.3 + 0.3 * math.sin(t * 0.8),
|
||||
"mid": 0.4 + 0.3 * math.sin(t * 1.3),
|
||||
"hi": 0.3 + 0.2 * math.sin(t * 2.1),
|
||||
"cent": 0.5 + 0.2 * math.sin(t * 0.3),
|
||||
"flat": 0.4,
|
||||
"flux": 0.3 + 0.2 * math.sin(t * 3),
|
||||
"beat": 1.0 if beat_phase < 0.05 else 0.0,
|
||||
"bdecay": max(0, 1.0 - beat_phase * 4),
|
||||
# ratios
|
||||
"sub_r": 0.2, "bass_r": 0.25, "lomid_r": 0.15,
|
||||
"mid_r": 0.2, "himid_r": 0.12, "hi_r": 0.08,
|
||||
"cent_d": 0.1,
|
||||
}
|
||||
```
|
||||
|
||||
## TTS Integration
|
||||
|
||||
For narrated videos (testimonials, quotes, storytelling), generate speech audio per segment and mix with background music.
|
||||
|
||||
### ElevenLabs Voice Generation
|
||||
|
||||
```python
|
||||
import requests, time, os
|
||||
|
||||
def generate_tts(text, voice_id, api_key, output_path, model="eleven_multilingual_v2"):
|
||||
"""Generate TTS audio via ElevenLabs API. Streams response to disk."""
|
||||
# Skip if already generated (idempotent re-runs)
|
||||
if os.path.exists(output_path) and os.path.getsize(output_path) > 1000:
|
||||
return
|
||||
|
||||
url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
|
||||
headers = {"xi-api-key": api_key, "Content-Type": "application/json"}
|
||||
data = {
|
||||
"text": text,
|
||||
"model_id": model,
|
||||
"voice_settings": {
|
||||
"stability": 0.65,
|
||||
"similarity_boost": 0.80,
|
||||
"style": 0.15,
|
||||
"use_speaker_boost": True,
|
||||
},
|
||||
}
|
||||
resp = requests.post(url, json=data, headers=headers, stream=True)
|
||||
resp.raise_for_status()
|
||||
with open(output_path, "wb") as f:
|
||||
for chunk in resp.iter_content(chunk_size=4096):
|
||||
f.write(chunk)
|
||||
time.sleep(0.3) # rate limit: avoid 429s on batch generation
|
||||
```
|
||||
|
||||
Voice settings notes:
|
||||
- `stability` 0.65 gives natural variation without drift. Lower (0.3-0.5) for more expressive reads, higher (0.7-0.9) for monotone/narration.
|
||||
- `similarity_boost` 0.80 keeps it close to the voice profile. Lower for more generic sound.
|
||||
- `style` 0.15 adds slight stylistic variation. Keep low (0-0.2) for straightforward reads.
|
||||
- `use_speaker_boost` True improves clarity at the cost of slightly more processing time.
|
||||
|
||||
### Voice Pool
|
||||
|
||||
ElevenLabs has ~20 built-in voices. Use multiple voices for variety across quotes. Reference pool:
|
||||
|
||||
```python
|
||||
VOICE_POOL = [
|
||||
("JBFqnCBsd6RMkjVDRZzb", "George"),
|
||||
("nPczCjzI2devNBz1zQrb", "Brian"),
|
||||
("pqHfZKP75CvOlQylNhV4", "Bill"),
|
||||
("CwhRBWXzGAHq8TQ4Fs17", "Roger"),
|
||||
("cjVigY5qzO86Huf0OWal", "Eric"),
|
||||
("onwK4e9ZLuTAKqWW03F9", "Daniel"),
|
||||
("IKne3meq5aSn9XLyUdCD", "Charlie"),
|
||||
("iP95p4xoKVk53GoZ742B", "Chris"),
|
||||
("bIHbv24MWmeRgasZH58o", "Will"),
|
||||
("TX3LPaxmHKxFdv7VOQHJ", "Liam"),
|
||||
("SAz9YHcvj6GT2YYXdXww", "River"),
|
||||
("EXAVITQu4vr4xnSDxMaL", "Sarah"),
|
||||
("Xb7hH8MSUJpSbSDYk0k2", "Alice"),
|
||||
("pFZP5JQG7iQjIQuC4Bku", "Lily"),
|
||||
("XrExE9yKIg1WjnnlVkGX", "Matilda"),
|
||||
("FGY2WhTYpPnrIDTdsKH5", "Laura"),
|
||||
("SOYHLrjzK2X1ezoPC6cr", "Harry"),
|
||||
("hpp4J3VqNfWAUOO0d1Us", "Bella"),
|
||||
("N2lVS1w4EtoT3dr4eOWO", "Callum"),
|
||||
("cgSgspJ2msm6clMCkdW9", "Jessica"),
|
||||
("pNInz6obpgDQGcFmaJgB", "Adam"),
|
||||
]
|
||||
```
|
||||
|
||||
### Voice Assignment
|
||||
|
||||
Shuffle deterministically so re-runs produce the same voice mapping:
|
||||
|
||||
```python
|
||||
import random as _rng
|
||||
|
||||
def assign_voices(n_quotes, voice_pool, seed=42):
|
||||
"""Assign a different voice to each quote, cycling if needed."""
|
||||
r = _rng.Random(seed)
|
||||
ids = [v[0] for v in voice_pool]
|
||||
r.shuffle(ids)
|
||||
return [ids[i % len(ids)] for i in range(n_quotes)]
|
||||
```
|
||||
|
||||
### Pronunciation Control
|
||||
|
||||
TTS text must be separate from display text. The display text has line breaks for visual layout; the TTS text is a flat sentence with phonetic fixes.
|
||||
|
||||
Common fixes:
|
||||
- Brand names: spell phonetically ("Nous" -> "Noose", "nginx" -> "engine-x")
|
||||
- Abbreviations: expand ("API" -> "A P I", "CLI" -> "C L I")
|
||||
- Technical terms: add phonetic hints
|
||||
- Punctuation for pacing: periods create pauses, commas create slight pauses
|
||||
|
||||
```python
|
||||
# Display text: line breaks control visual layout
|
||||
QUOTES = [
|
||||
("It can do far more than the Claws,\nand you don't need to buy a Mac Mini.\nNous Research has a winner here.", "Brian Roemmele"),
|
||||
]
|
||||
|
||||
# TTS text: flat, phonetically corrected for speech
|
||||
QUOTES_TTS = [
|
||||
"It can do far more than the Claws, and you don't need to buy a Mac Mini. Noose Research has a winner here.",
|
||||
]
|
||||
# Keep both arrays in sync -- same indices
|
||||
```
|
||||
|
||||
### Audio Pipeline
|
||||
|
||||
1. Generate individual TTS clips (MP3 per quote, skipping existing)
|
||||
2. Convert each to WAV (mono, 22050 Hz) for duration measurement and concatenation
|
||||
3. Calculate timing: intro pad + speech + gaps + outro pad = target duration
|
||||
4. Concatenate into single TTS track with silence padding
|
||||
5. Mix with background music
|
||||
|
||||
```python
|
||||
def build_tts_track(tts_clips, target_duration, intro_pad=5.0, outro_pad=4.0):
|
||||
"""Concatenate TTS clips with calculated gaps, pad to target duration.
|
||||
|
||||
Returns:
|
||||
timing: list of (start_time, end_time, quote_index) tuples
|
||||
"""
|
||||
sr = 22050
|
||||
|
||||
# Convert MP3s to WAV for duration and sample-level concatenation
|
||||
durations = []
|
||||
for clip in tts_clips:
|
||||
wav = clip.replace(".mp3", ".wav")
|
||||
subprocess.run(
|
||||
["ffmpeg", "-y", "-i", clip, "-ac", "1", "-ar", str(sr),
|
||||
"-sample_fmt", "s16", wav],
|
||||
capture_output=True, check=True)
|
||||
result = subprocess.run(
|
||||
["ffprobe", "-v", "error", "-show_entries", "format=duration",
|
||||
"-of", "csv=p=0", wav],
|
||||
capture_output=True, text=True)
|
||||
durations.append(float(result.stdout.strip()))
|
||||
|
||||
# Calculate gap to fill target duration
|
||||
total_speech = sum(durations)
|
||||
n_gaps = len(tts_clips) - 1
|
||||
remaining = target_duration - total_speech - intro_pad - outro_pad
|
||||
gap = max(1.0, remaining / max(1, n_gaps))
|
||||
|
||||
# Build timing and concatenate samples
|
||||
timing = []
|
||||
t = intro_pad
|
||||
all_audio = [np.zeros(int(sr * intro_pad), dtype=np.int16)]
|
||||
|
||||
for i, dur in enumerate(durations):
|
||||
wav = tts_clips[i].replace(".mp3", ".wav")
|
||||
with wave.open(wav) as wf:
|
||||
samples = np.frombuffer(wf.readframes(wf.getnframes()), dtype=np.int16)
|
||||
timing.append((t, t + dur, i))
|
||||
all_audio.append(samples)
|
||||
t += dur
|
||||
if i < len(tts_clips) - 1:
|
||||
all_audio.append(np.zeros(int(sr * gap), dtype=np.int16))
|
||||
t += gap
|
||||
|
||||
all_audio.append(np.zeros(int(sr * outro_pad), dtype=np.int16))
|
||||
|
||||
# Pad or trim to exactly target_duration
|
||||
full = np.concatenate(all_audio)
|
||||
target_samples = int(sr * target_duration)
|
||||
if len(full) < target_samples:
|
||||
full = np.pad(full, (0, target_samples - len(full)))
|
||||
else:
|
||||
full = full[:target_samples]
|
||||
|
||||
# Write concatenated TTS track
|
||||
with wave.open("tts_full.wav", "w") as wf:
|
||||
wf.setnchannels(1)
|
||||
wf.setsampwidth(2)
|
||||
wf.setframerate(sr)
|
||||
wf.writeframes(full.tobytes())
|
||||
|
||||
return timing
|
||||
```
|
||||
|
||||
### Audio Mixing
|
||||
|
||||
Mix TTS (center) with background music (wide stereo, low volume). The filter chain:
|
||||
1. TTS mono duplicated to both channels (centered)
|
||||
2. BGM loudness-normalized, volume reduced to 15%, stereo widened with `extrastereo`
|
||||
3. Mixed together with dropout transition for smooth endings
|
||||
|
||||
```python
|
||||
def mix_audio(tts_path, bgm_path, output_path, bgm_volume=0.15):
|
||||
"""Mix TTS centered with BGM panned wide stereo."""
|
||||
filter_complex = (
|
||||
# TTS: mono -> stereo center
|
||||
"[0:a]aformat=sample_fmts=fltp:sample_rates=44100:channel_layouts=mono,"
|
||||
"pan=stereo|c0=c0|c1=c0[tts];"
|
||||
# BGM: normalize loudness, reduce volume, widen stereo
|
||||
f"[1:a]aformat=sample_fmts=fltp:sample_rates=44100:channel_layouts=stereo,"
|
||||
f"loudnorm=I=-16:TP=-1.5:LRA=11,"
|
||||
f"volume={bgm_volume},"
|
||||
f"extrastereo=m=2.5[bgm];"
|
||||
# Mix with smooth dropout at end
|
||||
"[tts][bgm]amix=inputs=2:duration=longest:dropout_transition=3,"
|
||||
"aformat=sample_fmts=s16:sample_rates=44100:channel_layouts=stereo[out]"
|
||||
)
|
||||
cmd = [
|
||||
"ffmpeg", "-y",
|
||||
"-i", tts_path,
|
||||
"-i", bgm_path,
|
||||
"-filter_complex", filter_complex,
|
||||
"-map", "[out]", output_path,
|
||||
]
|
||||
subprocess.run(cmd, capture_output=True, check=True)
|
||||
```
|
||||
|
||||
### Per-Quote Visual Style
|
||||
|
||||
Cycle through visual presets per quote for variety. Each preset defines a background effect, color scheme, and text color:
|
||||
|
||||
```python
|
||||
QUOTE_STYLES = [
|
||||
{"hue": 0.08, "accent": 0.7, "bg": "spiral", "text_rgb": (255, 220, 140)}, # warm gold
|
||||
{"hue": 0.55, "accent": 0.6, "bg": "rings", "text_rgb": (180, 220, 255)}, # cool blue
|
||||
{"hue": 0.75, "accent": 0.7, "bg": "wave", "text_rgb": (220, 180, 255)}, # purple
|
||||
{"hue": 0.35, "accent": 0.6, "bg": "matrix", "text_rgb": (140, 255, 180)}, # green
|
||||
{"hue": 0.95, "accent": 0.8, "bg": "fire", "text_rgb": (255, 180, 160)}, # red/coral
|
||||
{"hue": 0.12, "accent": 0.5, "bg": "interference", "text_rgb": (255, 240, 200)}, # amber
|
||||
{"hue": 0.60, "accent": 0.7, "bg": "tunnel", "text_rgb": (160, 210, 255)}, # cyan
|
||||
{"hue": 0.45, "accent": 0.6, "bg": "aurora", "text_rgb": (180, 255, 220)}, # teal
|
||||
]
|
||||
|
||||
style = QUOTE_STYLES[quote_index % len(QUOTE_STYLES)]
|
||||
```
|
||||
|
||||
This guarantees no two adjacent quotes share the same look, even without randomness.
|
||||
|
||||
### Typewriter Text Rendering
|
||||
|
||||
Display quote text character-by-character synced to speech progress. Recently revealed characters are brighter, creating a "just typed" glow:
|
||||
|
||||
```python
|
||||
def render_typewriter(ch, co, lines, block_start, cols, progress, total_chars, text_rgb, t):
|
||||
"""Overlay typewriter text onto character/color grids.
|
||||
progress: 0.0 (nothing visible) to 1.0 (all text visible)."""
|
||||
chars_visible = int(total_chars * min(1.0, progress * 1.2)) # slight overshoot for snappy feel
|
||||
tr, tg, tb = text_rgb
|
||||
char_count = 0
|
||||
for li, line in enumerate(lines):
|
||||
row = block_start + li
|
||||
col = (cols - len(line)) // 2
|
||||
for ci, c in enumerate(line):
|
||||
if char_count < chars_visible:
|
||||
age = chars_visible - char_count
|
||||
bri_factor = min(1.0, 0.5 + 0.5 / (1 + age * 0.015)) # newer = brighter
|
||||
hue_shift = math.sin(char_count * 0.3 + t * 2) * 0.05
|
||||
stamp(ch, co, c, row, col + ci,
|
||||
(int(min(255, tr * bri_factor * (1.0 + hue_shift))),
|
||||
int(min(255, tg * bri_factor)),
|
||||
int(min(255, tb * bri_factor * (1.0 - hue_shift)))))
|
||||
char_count += 1
|
||||
|
||||
# Blinking cursor at insertion point
|
||||
if progress < 1.0 and int(t * 3) % 2 == 0:
|
||||
# Find cursor position (char_count == chars_visible)
|
||||
cc = 0
|
||||
for li, line in enumerate(lines):
|
||||
for ci, c in enumerate(line):
|
||||
if cc == chars_visible:
|
||||
stamp(ch, co, "\u258c", block_start + li,
|
||||
(cols - len(line)) // 2 + ci, (255, 220, 100))
|
||||
return
|
||||
cc += 1
|
||||
```
|
||||
|
||||
### Feature Analysis on Mixed Audio
|
||||
|
||||
Run the standard audio analysis (FFT, beat detection) on the final mixed track so visual effects react to both TTS and music:
|
||||
|
||||
```python
|
||||
# Analyze mixed_final.wav (not individual tracks)
|
||||
features = analyze_audio("mixed_final.wav", fps=24)
|
||||
```
|
||||
|
||||
Visuals pulse with both the music beats and the speech energy.
|
||||
|
||||
---
|
||||
|
||||
## Audio-Video Sync Verification
|
||||
|
||||
After rendering, verify that visual beat markers align with actual audio beats. Drift accumulates from frame timing errors, ffmpeg concat boundaries, and rounding in `fi / fps`.
|
||||
|
||||
### Beat Timestamp Extraction
|
||||
|
||||
```python
|
||||
def extract_beat_timestamps(features, fps, threshold=0.5):
|
||||
"""Extract timestamps where beat feature exceeds threshold."""
|
||||
beat = features["beat"]
|
||||
timestamps = []
|
||||
for fi in range(len(beat)):
|
||||
if beat[fi] > threshold:
|
||||
timestamps.append(fi / fps)
|
||||
return timestamps
|
||||
|
||||
def extract_visual_beat_timestamps(video_path, fps, brightness_jump=30):
|
||||
"""Detect visual beats by brightness jumps between consecutive frames.
|
||||
Returns timestamps where mean brightness increases by more than threshold."""
|
||||
import subprocess
|
||||
cmd = ["ffmpeg", "-i", video_path, "-f", "rawvideo", "-pix_fmt", "gray", "-"]
|
||||
proc = subprocess.run(cmd, capture_output=True)
|
||||
frames = np.frombuffer(proc.stdout, dtype=np.uint8)
|
||||
# Infer frame dimensions from total byte count
|
||||
n_pixels = len(frames)
|
||||
# For 1080p: 1920*1080 pixels per frame
|
||||
# Auto-detect from video metadata is more robust:
|
||||
probe = subprocess.run(
|
||||
["ffprobe", "-v", "error", "-select_streams", "v:0",
|
||||
"-show_entries", "stream=width,height",
|
||||
"-of", "csv=p=0", video_path],
|
||||
capture_output=True, text=True)
|
||||
w, h = map(int, probe.stdout.strip().split(","))
|
||||
ppf = w * h # pixels per frame
|
||||
n_frames = n_pixels // ppf
|
||||
frames = frames[:n_frames * ppf].reshape(n_frames, ppf)
|
||||
means = frames.mean(axis=1)
|
||||
|
||||
timestamps = []
|
||||
for i in range(1, len(means)):
|
||||
if means[i] - means[i-1] > brightness_jump:
|
||||
timestamps.append(i / fps)
|
||||
return timestamps
|
||||
```
|
||||
|
||||
### Sync Report
|
||||
|
||||
```python
|
||||
def sync_report(audio_beats, visual_beats, tolerance_ms=50):
|
||||
"""Compare audio beat timestamps to visual beat timestamps.
|
||||
|
||||
Args:
|
||||
audio_beats: list of timestamps (seconds) from audio analysis
|
||||
visual_beats: list of timestamps (seconds) from video brightness analysis
|
||||
tolerance_ms: max acceptable drift in milliseconds
|
||||
|
||||
Returns:
|
||||
dict with matched/unmatched/drift statistics
|
||||
"""
|
||||
tolerance = tolerance_ms / 1000.0
|
||||
matched = []
|
||||
unmatched_audio = []
|
||||
unmatched_visual = list(visual_beats)
|
||||
|
||||
for at in audio_beats:
|
||||
best_match = None
|
||||
best_delta = float("inf")
|
||||
for vt in unmatched_visual:
|
||||
delta = abs(at - vt)
|
||||
if delta < best_delta:
|
||||
best_delta = delta
|
||||
best_match = vt
|
||||
if best_match is not None and best_delta < tolerance:
|
||||
matched.append({"audio": at, "visual": best_match, "drift_ms": best_delta * 1000})
|
||||
unmatched_visual.remove(best_match)
|
||||
else:
|
||||
unmatched_audio.append(at)
|
||||
|
||||
drifts = [m["drift_ms"] for m in matched]
|
||||
return {
|
||||
"matched": len(matched),
|
||||
"unmatched_audio": len(unmatched_audio),
|
||||
"unmatched_visual": len(unmatched_visual),
|
||||
"total_audio_beats": len(audio_beats),
|
||||
"total_visual_beats": len(visual_beats),
|
||||
"mean_drift_ms": np.mean(drifts) if drifts else 0,
|
||||
"max_drift_ms": np.max(drifts) if drifts else 0,
|
||||
"p95_drift_ms": np.percentile(drifts, 95) if len(drifts) > 1 else 0,
|
||||
}
|
||||
|
||||
# Usage:
|
||||
audio_beats = extract_beat_timestamps(features, fps=24)
|
||||
visual_beats = extract_visual_beat_timestamps("output.mp4", fps=24)
|
||||
report = sync_report(audio_beats, visual_beats)
|
||||
print(f"Matched: {report['matched']}/{report['total_audio_beats']} beats")
|
||||
print(f"Mean drift: {report['mean_drift_ms']:.1f}ms, Max: {report['max_drift_ms']:.1f}ms")
|
||||
# Target: mean drift < 20ms, max drift < 42ms (1 frame at 24fps)
|
||||
```
|
||||
|
||||
### Common Sync Issues
|
||||
|
||||
| Symptom | Cause | Fix |
|
||||
|---------|-------|-----|
|
||||
| Consistent late visual beats | ffmpeg concat adds frames at boundaries | Use `-vsync cfr` flag; pad segments to exact frame count |
|
||||
| Drift increases over time | Floating-point accumulation in `t = fi / fps` | Use integer frame counter, compute `t` fresh each frame |
|
||||
| Random missed beats | Beat threshold too high / feature smoothing too aggressive | Lower threshold; reduce EMA alpha for beat feature |
|
||||
| Beats land on wrong frame | Off-by-one in frame indexing | Verify: frame 0 = t=0, frame 1 = t=1/fps (not t=0) |
|
||||
@@ -0,0 +1,688 @@
|
||||
# Optimization Reference
|
||||
|
||||
> **See also:** architecture.md · composition.md · scenes.md · shaders.md · inputs.md · troubleshooting.md
|
||||
|
||||
## Hardware Detection
|
||||
|
||||
Detect the user's hardware at script startup and adapt rendering parameters automatically. Never hardcode worker counts or resolution.
|
||||
|
||||
### CPU and Memory Detection
|
||||
|
||||
```python
|
||||
import multiprocessing
|
||||
import platform
|
||||
import shutil
|
||||
import os
|
||||
|
||||
def detect_hardware():
|
||||
"""Detect hardware capabilities and return render config."""
|
||||
cpu_count = multiprocessing.cpu_count()
|
||||
|
||||
# Leave 1-2 cores free for OS + ffmpeg encoding
|
||||
if cpu_count >= 16:
|
||||
workers = cpu_count - 2
|
||||
elif cpu_count >= 8:
|
||||
workers = cpu_count - 1
|
||||
elif cpu_count >= 4:
|
||||
workers = cpu_count - 1
|
||||
else:
|
||||
workers = max(1, cpu_count)
|
||||
|
||||
# Memory detection (platform-specific)
|
||||
try:
|
||||
if platform.system() == "Darwin":
|
||||
import subprocess
|
||||
mem_bytes = int(subprocess.check_output(["sysctl", "-n", "hw.memsize"]).strip())
|
||||
elif platform.system() == "Linux":
|
||||
with open("/proc/meminfo") as f:
|
||||
for line in f:
|
||||
if line.startswith("MemTotal"):
|
||||
mem_bytes = int(line.split()[1]) * 1024
|
||||
break
|
||||
else:
|
||||
mem_bytes = 8 * 1024**3 # assume 8GB on unknown
|
||||
except Exception:
|
||||
mem_bytes = 8 * 1024**3
|
||||
|
||||
mem_gb = mem_bytes / (1024**3)
|
||||
|
||||
# Each worker uses ~50-150MB depending on grid sizes
|
||||
# Cap workers if memory is tight
|
||||
mem_per_worker_mb = 150
|
||||
max_workers_by_mem = int(mem_gb * 1024 * 0.6 / mem_per_worker_mb) # use 60% of RAM
|
||||
workers = min(workers, max_workers_by_mem)
|
||||
|
||||
# ffmpeg availability and codec support
|
||||
has_ffmpeg = shutil.which("ffmpeg") is not None
|
||||
|
||||
return {
|
||||
"cpu_count": cpu_count,
|
||||
"workers": workers,
|
||||
"mem_gb": mem_gb,
|
||||
"platform": platform.system(),
|
||||
"arch": platform.machine(),
|
||||
"has_ffmpeg": has_ffmpeg,
|
||||
}
|
||||
```
|
||||
|
||||
### Adaptive Quality Profiles
|
||||
|
||||
Scale resolution, FPS, CRF, and grid density based on hardware:
|
||||
|
||||
```python
|
||||
def quality_profile(hw, target_duration_s, user_preference="auto"):
|
||||
"""
|
||||
Returns render settings adapted to hardware.
|
||||
user_preference: "auto", "draft", "preview", "production", "max"
|
||||
"""
|
||||
if user_preference == "draft":
|
||||
return {"vw": 960, "vh": 540, "fps": 12, "crf": 28, "workers": min(4, hw["workers"]),
|
||||
"grid_scale": 0.5, "shaders": "minimal", "particles_max": 200}
|
||||
|
||||
if user_preference == "preview":
|
||||
return {"vw": 1280, "vh": 720, "fps": 15, "crf": 25, "workers": hw["workers"],
|
||||
"grid_scale": 0.75, "shaders": "standard", "particles_max": 500}
|
||||
|
||||
if user_preference == "max":
|
||||
return {"vw": 3840, "vh": 2160, "fps": 30, "crf": 15, "workers": hw["workers"],
|
||||
"grid_scale": 2.0, "shaders": "full", "particles_max": 3000}
|
||||
|
||||
# "production" or "auto"
|
||||
# Auto-detect: estimate render time, downgrade if it would take too long
|
||||
n_frames = int(target_duration_s * 24)
|
||||
est_seconds_per_frame = 0.18 # ~180ms at 1080p
|
||||
est_total_s = n_frames * est_seconds_per_frame / max(1, hw["workers"])
|
||||
|
||||
if hw["mem_gb"] < 4 or hw["cpu_count"] <= 2:
|
||||
# Low-end: 720p, 15fps
|
||||
return {"vw": 1280, "vh": 720, "fps": 15, "crf": 23, "workers": hw["workers"],
|
||||
"grid_scale": 0.75, "shaders": "standard", "particles_max": 500}
|
||||
|
||||
if est_total_s > 3600: # would take over an hour
|
||||
# Downgrade to 720p to speed up
|
||||
return {"vw": 1280, "vh": 720, "fps": 24, "crf": 20, "workers": hw["workers"],
|
||||
"grid_scale": 0.75, "shaders": "standard", "particles_max": 800}
|
||||
|
||||
# Standard production: 1080p 24fps
|
||||
return {"vw": 1920, "vh": 1080, "fps": 24, "crf": 20, "workers": hw["workers"],
|
||||
"grid_scale": 1.0, "shaders": "full", "particles_max": 1200}
|
||||
|
||||
|
||||
def apply_quality_profile(profile):
|
||||
"""Set globals from quality profile."""
|
||||
global VW, VH, FPS, N_WORKERS
|
||||
VW = profile["vw"]
|
||||
VH = profile["vh"]
|
||||
FPS = profile["fps"]
|
||||
N_WORKERS = profile["workers"]
|
||||
# Grid sizes scale with resolution
|
||||
# CRF passed to ffmpeg encoder
|
||||
# Shader set determines which post-processing is active
|
||||
```
|
||||
|
||||
### CLI Integration
|
||||
|
||||
```python
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument("--quality", choices=["draft", "preview", "production", "max", "auto"],
|
||||
default="auto", help="Render quality preset")
|
||||
parser.add_argument("--aspect", choices=["landscape", "portrait", "square"],
|
||||
default="landscape", help="Aspect ratio preset")
|
||||
parser.add_argument("--workers", type=int, default=0, help="Override worker count (0=auto)")
|
||||
parser.add_argument("--resolution", type=str, default="", help="Override resolution e.g. 1280x720")
|
||||
args = parser.parse_args()
|
||||
|
||||
hw = detect_hardware()
|
||||
if args.workers > 0:
|
||||
hw["workers"] = args.workers
|
||||
profile = quality_profile(hw, target_duration, args.quality)
|
||||
|
||||
# Apply aspect ratio preset (before manual resolution override)
|
||||
ASPECT_PRESETS = {
|
||||
"landscape": (1920, 1080),
|
||||
"portrait": (1080, 1920),
|
||||
"square": (1080, 1080),
|
||||
}
|
||||
if args.aspect != "landscape" and not args.resolution:
|
||||
profile["vw"], profile["vh"] = ASPECT_PRESETS[args.aspect]
|
||||
|
||||
if args.resolution:
|
||||
w, h = args.resolution.split("x")
|
||||
profile["vw"], profile["vh"] = int(w), int(h)
|
||||
apply_quality_profile(profile)
|
||||
|
||||
log(f"Hardware: {hw['cpu_count']} cores, {hw['mem_gb']:.1f}GB RAM, {hw['platform']}")
|
||||
log(f"Render: {profile['vw']}x{profile['vh']} @{profile['fps']}fps, "
|
||||
f"CRF {profile['crf']}, {profile['workers']} workers")
|
||||
```
|
||||
|
||||
### Portrait Mode Considerations
|
||||
|
||||
Portrait (1080x1920) has the same pixel count as landscape 1080p, so performance is equivalent. But composition patterns differ:
|
||||
|
||||
| Concern | Landscape | Portrait |
|
||||
|---------|-----------|----------|
|
||||
| Grid cols at `lg` | 160 | 90 |
|
||||
| Grid rows at `lg` | 45 | 80 |
|
||||
| Max text line chars | ~50 centered | ~25-30 centered |
|
||||
| Vertical rain | Short travel | Long, dramatic travel |
|
||||
| Horizontal spectrum | Full width | Needs rotation or compression |
|
||||
| Radial effects | Natural circles | Tall ellipses (aspect correction handles this) |
|
||||
| Particle explosions | Wide spread | Tall spread |
|
||||
| Text stacking | 3-4 lines comfortable | 8-10 lines comfortable |
|
||||
| Quote layout | 2-3 wide lines | 5-6 short lines |
|
||||
|
||||
**Portrait-optimized patterns:**
|
||||
- Vertical rain/matrix effects are naturally enhanced — longer column travel
|
||||
- Fire columns rise through more screen space
|
||||
- Rising embers/particles have more vertical runway
|
||||
- Text can be stacked more aggressively with more lines
|
||||
- Radial effects work if aspect correction is applied (GridLayer handles this automatically)
|
||||
- Spectrum bars can be rotated 90 degrees (vertical bars from bottom)
|
||||
|
||||
**Portrait text layout:**
|
||||
```python
|
||||
def layout_text_portrait(text, max_chars_per_line=25, grid=None):
|
||||
"""Break text into short lines for portrait display."""
|
||||
words = text.split()
|
||||
lines = []; current = ""
|
||||
for w in words:
|
||||
if len(current) + len(w) + 1 > max_chars_per_line:
|
||||
lines.append(current.strip())
|
||||
current = w + " "
|
||||
else:
|
||||
current += w + " "
|
||||
if current.strip():
|
||||
lines.append(current.strip())
|
||||
return lines
|
||||
```
|
||||
|
||||
## Performance Budget
|
||||
|
||||
Target: 100-200ms per frame (5-10 fps single-threaded, 40-80 fps across 8 workers).
|
||||
|
||||
| Component | Time | Notes |
|
||||
|-----------|------|-------|
|
||||
| Feature extraction | 1-5ms | Pre-computed for all frames before render |
|
||||
| Effect function | 2-15ms | Vectorized numpy, avoid Python loops |
|
||||
| Character render | 80-150ms | **Bottleneck** -- per-cell Python loop |
|
||||
| Shader pipeline | 5-25ms | Depends on active shaders |
|
||||
| ffmpeg encode | ~5ms | Amortized by pipe buffering |
|
||||
|
||||
## Bitmap Pre-Rasterization
|
||||
|
||||
Rasterize every character at init, not per-frame:
|
||||
|
||||
```python
|
||||
# At init time -- done once
|
||||
for c in all_characters:
|
||||
img = Image.new("L", (cell_w, cell_h), 0)
|
||||
ImageDraw.Draw(img).text((0, 0), c, fill=255, font=font)
|
||||
bitmaps[c] = np.array(img, dtype=np.float32) / 255.0 # float32 for fast multiply
|
||||
|
||||
# At render time -- fast lookup
|
||||
bitmap = bitmaps[char]
|
||||
canvas[y:y+ch, x:x+cw] = np.maximum(canvas[y:y+ch, x:x+cw],
|
||||
(bitmap[:,:,None] * color).astype(np.uint8))
|
||||
```
|
||||
|
||||
Collect all characters from all palettes + overlay text into the init set. Lazy-init for any missed characters.
|
||||
|
||||
## Pre-Rendered Background Textures
|
||||
|
||||
Alternative to `_render_vf()` for backgrounds where characters don't need to change every frame. Pre-bake a static ASCII texture once at init, then multiply by a per-cell color field each frame. One matrix multiply vs thousands of bitmap blits.
|
||||
|
||||
Use when: background layer uses a fixed character palette and only color/brightness varies per frame. NOT suitable for layers where character selection depends on a changing value field.
|
||||
|
||||
### Init: Bake the Texture
|
||||
|
||||
```python
|
||||
# In GridLayer.__init__:
|
||||
self._bg_row_idx = np.clip(
|
||||
(np.arange(VH) - self.oy) // self.ch, 0, self.rows - 1
|
||||
)
|
||||
self._bg_col_idx = np.clip(
|
||||
(np.arange(VW) - self.ox) // self.cw, 0, self.cols - 1
|
||||
)
|
||||
self._bg_textures = {}
|
||||
|
||||
def make_bg_texture(self, palette):
|
||||
"""Pre-render a static ASCII texture (grayscale float32) once."""
|
||||
if palette not in self._bg_textures:
|
||||
texture = np.zeros((VH, VW), dtype=np.float32)
|
||||
rng = random.Random(12345)
|
||||
ch_list = [c for c in palette if c != " " and c in self.bm]
|
||||
if not ch_list:
|
||||
ch_list = list(self.bm.keys())[:5]
|
||||
for row in range(self.rows):
|
||||
y = self.oy + row * self.ch
|
||||
if y + self.ch > VH:
|
||||
break
|
||||
for col in range(self.cols):
|
||||
x = self.ox + col * self.cw
|
||||
if x + self.cw > VW:
|
||||
break
|
||||
bm = self.bm[rng.choice(ch_list)]
|
||||
texture[y:y+self.ch, x:x+self.cw] = bm
|
||||
self._bg_textures[palette] = texture
|
||||
return self._bg_textures[palette]
|
||||
```
|
||||
|
||||
### Render: Color Field x Cached Texture
|
||||
|
||||
```python
|
||||
def render_bg(self, color_field, palette=PAL_CIRCUIT):
|
||||
"""Fast background: pre-rendered ASCII texture * per-cell color field.
|
||||
color_field: (rows, cols, 3) uint8. Returns (VH, VW, 3) uint8."""
|
||||
texture = self.make_bg_texture(palette)
|
||||
# Expand cell colors to pixel coords via pre-computed index maps
|
||||
color_px = color_field[
|
||||
self._bg_row_idx[:, None], self._bg_col_idx[None, :]
|
||||
].astype(np.float32)
|
||||
return (texture[:, :, None] * color_px).astype(np.uint8)
|
||||
```
|
||||
|
||||
### Usage in a Scene
|
||||
|
||||
```python
|
||||
# Build per-cell color from effect fields (cheap — rows*cols, not VH*VW)
|
||||
hue = ((t * 0.05 + val * 0.2) % 1.0).astype(np.float32)
|
||||
R, G, B = hsv2rgb(hue, np.full_like(val, 0.5), val)
|
||||
color_field = mkc(R, G, B, g.rows, g.cols) # (rows, cols, 3) uint8
|
||||
|
||||
# Render background — single matrix multiply, no per-cell loop
|
||||
canvas_bg = g.render_bg(color_field, PAL_DENSE)
|
||||
```
|
||||
|
||||
The texture init loop runs once and is cached per palette. Per-frame cost is one fancy-index lookup + one broadcast multiply — orders of magnitude faster than the per-cell bitmap blit loop in `render()` for dense backgrounds.
|
||||
|
||||
## Coordinate Array Caching
|
||||
|
||||
Pre-compute all grid-relative coordinate arrays at init, not per-frame:
|
||||
|
||||
```python
|
||||
# These are O(rows*cols) and used in every effect
|
||||
self.rr = np.arange(rows)[:, None] # row indices
|
||||
self.cc = np.arange(cols)[None, :] # col indices
|
||||
self.dist = np.sqrt(dx**2 + dy**2) # distance from center
|
||||
self.angle = np.arctan2(dy, dx) # angle from center
|
||||
self.dist_n = ... # normalized distance
|
||||
```
|
||||
|
||||
## Vectorized Effect Patterns
|
||||
|
||||
### Avoid Per-Cell Python Loops in Effects
|
||||
|
||||
The render loop (compositing bitmaps) is unavoidably per-cell. But effect functions must be fully vectorized numpy -- never iterate over rows/cols in Python.
|
||||
|
||||
Bad (O(rows*cols) Python loop):
|
||||
```python
|
||||
for r in range(rows):
|
||||
for c in range(cols):
|
||||
val[r, c] = math.sin(c * 0.1 + t) * math.cos(r * 0.1 - t)
|
||||
```
|
||||
|
||||
Good (vectorized):
|
||||
```python
|
||||
val = np.sin(g.cc * 0.1 + t) * np.cos(g.rr * 0.1 - t)
|
||||
```
|
||||
|
||||
### Vectorized Matrix Rain
|
||||
|
||||
The naive per-column per-trail-pixel loop is the second biggest bottleneck after the render loop. Use numpy fancy indexing:
|
||||
|
||||
```python
|
||||
# Instead of nested Python loops over columns and trail pixels:
|
||||
# Build row index arrays for all active trail pixels at once
|
||||
all_rows = []
|
||||
all_cols = []
|
||||
all_fades = []
|
||||
for c in range(cols):
|
||||
head = int(S["ry"][c])
|
||||
trail_len = S["rln"][c]
|
||||
for i in range(trail_len):
|
||||
row = head - i
|
||||
if 0 <= row < rows:
|
||||
all_rows.append(row)
|
||||
all_cols.append(c)
|
||||
all_fades.append(1.0 - i / trail_len)
|
||||
|
||||
# Vectorized assignment
|
||||
ar = np.array(all_rows)
|
||||
ac = np.array(all_cols)
|
||||
af = np.array(all_fades, dtype=np.float32)
|
||||
# Assign chars and colors in bulk using fancy indexing
|
||||
ch[ar, ac] = ... # vectorized char assignment
|
||||
co[ar, ac, 1] = (af * bri * 255).astype(np.uint8) # green channel
|
||||
```
|
||||
|
||||
### Vectorized Fire Columns
|
||||
|
||||
Same pattern -- accumulate index arrays, assign in bulk:
|
||||
|
||||
```python
|
||||
fire_val = np.zeros((rows, cols), dtype=np.float32)
|
||||
for fi in range(n_cols):
|
||||
fx_c = int((fi * cols / n_cols + np.sin(t * 2 + fi * 0.7) * 3) % cols)
|
||||
height = int(energy * rows * 0.7)
|
||||
dy = np.arange(min(height, rows))
|
||||
fr = rows - 1 - dy
|
||||
frac = dy / max(height, 1)
|
||||
# Width spread: base columns wider at bottom
|
||||
for dx in range(-1, 2): # 3-wide columns
|
||||
c = fx_c + dx
|
||||
if 0 <= c < cols:
|
||||
fire_val[fr, c] = np.maximum(fire_val[fr, c],
|
||||
(1 - frac * 0.6) * (0.5 + rms * 0.5))
|
||||
# Now map fire_val to chars and colors in one vectorized pass
|
||||
```
|
||||
|
||||
## PIL String Rendering for Text-Heavy Scenes
|
||||
|
||||
Alternative to per-cell bitmap blitting when rendering many long text strings (scrolling tickers, typewriter sequences, idea floods). Uses PIL's native `ImageDraw.text()` which renders an entire string in one C call, vs one Python-loop bitmap blit per character.
|
||||
|
||||
Typical win: a scene with 56 ticker rows renders 56 PIL `text()` calls instead of ~10K individual bitmap blits.
|
||||
|
||||
Use when: scene renders many rows of readable text strings. NOT suitable for sparse or spatially-scattered single characters (use normal `render()` for those).
|
||||
|
||||
```python
|
||||
from PIL import Image, ImageDraw
|
||||
|
||||
def render_text_layer(grid, rows_data, font):
|
||||
"""Render dense text rows via PIL instead of per-cell bitmap blitting.
|
||||
|
||||
Args:
|
||||
grid: GridLayer instance (for oy, ch, ox, font metrics)
|
||||
rows_data: list of (row_index, text_string, rgb_tuple) — one per row
|
||||
font: PIL ImageFont instance (grid.font)
|
||||
|
||||
Returns:
|
||||
uint8 array (VH, VW, 3) — canvas with rendered text
|
||||
"""
|
||||
img = Image.new("RGB", (VW, VH), (0, 0, 0))
|
||||
draw = ImageDraw.Draw(img)
|
||||
for row_idx, text, color in rows_data:
|
||||
y = grid.oy + row_idx * grid.ch
|
||||
if y + grid.ch > VH:
|
||||
break
|
||||
draw.text((grid.ox, y), text, fill=color, font=font)
|
||||
return np.array(img)
|
||||
```
|
||||
|
||||
### Usage in a Ticker Scene
|
||||
|
||||
```python
|
||||
# Build ticker data (text + color per row)
|
||||
rows_data = []
|
||||
for row in range(n_tickers):
|
||||
text = build_ticker_text(row, t) # scrolling substring
|
||||
color = hsv2rgb_scalar(hue, 0.85, bri) # (R, G, B) tuple
|
||||
rows_data.append((row, text, color))
|
||||
|
||||
# One PIL pass instead of thousands of bitmap blits
|
||||
canvas_tickers = render_text_layer(g_md, rows_data, g_md.font)
|
||||
|
||||
# Blend with other layers normally
|
||||
result = blend_canvas(canvas_bg, canvas_tickers, "screen", 0.9)
|
||||
```
|
||||
|
||||
This is purely a rendering optimization — same visual output, fewer draw calls. The grid's `render()` method is still needed for sparse character fields where characters are placed individually based on value fields.
|
||||
|
||||
## Bloom Optimization
|
||||
|
||||
**Do NOT use `scipy.ndimage.uniform_filter`** -- measured at 424ms/frame.
|
||||
|
||||
Use 4x downsample + manual box blur instead -- 84ms/frame (5x faster):
|
||||
|
||||
```python
|
||||
sm = canvas[::4, ::4].astype(np.float32) # 4x downsample
|
||||
br = np.where(sm > threshold, sm, 0)
|
||||
for _ in range(3): # 3-pass manual box blur
|
||||
p = np.pad(br, ((1,1),(1,1),(0,0)), mode='edge')
|
||||
br = (p[:-2,:-2] + p[:-2,1:-1] + p[:-2,2:] +
|
||||
p[1:-1,:-2] + p[1:-1,1:-1] + p[1:-1,2:] +
|
||||
p[2:,:-2] + p[2:,1:-1] + p[2:,2:]) / 9.0
|
||||
bl = np.repeat(np.repeat(br, 4, axis=0), 4, axis=1)[:H, :W]
|
||||
```
|
||||
|
||||
## Vignette Caching
|
||||
|
||||
Distance field is resolution- and strength-dependent, never changes per frame:
|
||||
|
||||
```python
|
||||
_vig_cache = {}
|
||||
def sh_vignette(canvas, strength):
|
||||
key = (canvas.shape[0], canvas.shape[1], round(strength, 2))
|
||||
if key not in _vig_cache:
|
||||
Y = np.linspace(-1, 1, H)[:, None]
|
||||
X = np.linspace(-1, 1, W)[None, :]
|
||||
_vig_cache[key] = np.clip(1.0 - np.sqrt(X**2+Y**2) * strength, 0.15, 1).astype(np.float32)
|
||||
return np.clip(canvas * _vig_cache[key][:,:,None], 0, 255).astype(np.uint8)
|
||||
```
|
||||
|
||||
Same pattern for CRT barrel distortion (cache remap coordinates).
|
||||
|
||||
## Film Grain Optimization
|
||||
|
||||
Generate noise at half resolution, tile up:
|
||||
|
||||
```python
|
||||
noise = np.random.randint(-amt, amt+1, (H//2, W//2, 1), dtype=np.int16)
|
||||
noise = np.repeat(np.repeat(noise, 2, axis=0), 2, axis=1)[:H, :W]
|
||||
```
|
||||
|
||||
2x blocky grain looks like film grain and costs 1/4 the random generation.
|
||||
|
||||
## Parallel Rendering
|
||||
|
||||
### Worker Architecture
|
||||
|
||||
```python
|
||||
hw = detect_hardware()
|
||||
N_WORKERS = hw["workers"]
|
||||
|
||||
# Batch splitting (for non-clip architectures)
|
||||
batch_size = (n_frames + N_WORKERS - 1) // N_WORKERS
|
||||
batches = [(i, i*batch_size, min((i+1)*batch_size, n_frames), features, seg_path) ...]
|
||||
|
||||
with multiprocessing.Pool(N_WORKERS) as pool:
|
||||
segments = pool.starmap(render_batch, batches)
|
||||
```
|
||||
|
||||
### Per-Clip Parallelism (Preferred for Segmented Videos)
|
||||
|
||||
```python
|
||||
from concurrent.futures import ProcessPoolExecutor, as_completed
|
||||
|
||||
with ProcessPoolExecutor(max_workers=N_WORKERS) as pool:
|
||||
futures = {pool.submit(render_clip, seg, features, path): seg["id"]
|
||||
for seg, path in clip_args}
|
||||
for fut in as_completed(futures):
|
||||
clip_id = futures[fut]
|
||||
try:
|
||||
fut.result()
|
||||
log(f" {clip_id} done")
|
||||
except Exception as e:
|
||||
log(f" {clip_id} FAILED: {e}")
|
||||
```
|
||||
|
||||
### Worker Isolation
|
||||
|
||||
Each worker:
|
||||
- Creates its own `Renderer` instance (with full grid + bitmap init)
|
||||
- Opens its own ffmpeg subprocess
|
||||
- Has independent random seed (`random.seed(batch_id * 10000)`)
|
||||
- Writes to its own segment file and stderr log
|
||||
|
||||
### ffmpeg Pipe Safety
|
||||
|
||||
**CRITICAL**: Never `stderr=subprocess.PIPE` with long-running ffmpeg. The stderr buffer fills at ~64KB and deadlocks:
|
||||
|
||||
```python
|
||||
# WRONG -- will deadlock
|
||||
pipe = subprocess.Popen(cmd, stdin=subprocess.PIPE, stderr=subprocess.PIPE)
|
||||
|
||||
# RIGHT -- stderr to file
|
||||
stderr_fh = open(err_path, "w")
|
||||
pipe = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.DEVNULL, stderr=stderr_fh)
|
||||
# ... write all frames ...
|
||||
pipe.stdin.close()
|
||||
pipe.wait()
|
||||
stderr_fh.close()
|
||||
```
|
||||
|
||||
### Concatenation
|
||||
|
||||
```python
|
||||
with open(concat_file, "w") as cf:
|
||||
for seg in segments:
|
||||
cf.write(f"file '{seg}'\n")
|
||||
|
||||
cmd = ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", concat_file]
|
||||
if audio_path:
|
||||
cmd += ["-i", audio_path, "-c:v", "copy", "-c:a", "aac", "-b:a", "192k", "-shortest"]
|
||||
else:
|
||||
cmd += ["-c:v", "copy"]
|
||||
cmd.append(output_path)
|
||||
subprocess.run(cmd, capture_output=True, check=True)
|
||||
```
|
||||
|
||||
## Particle System Performance
|
||||
|
||||
Cap particle counts based on quality profile:
|
||||
|
||||
| System | Low | Standard | High |
|
||||
|--------|-----|----------|------|
|
||||
| Explosion | 300 | 1000 | 2500 |
|
||||
| Embers | 500 | 1500 | 3000 |
|
||||
| Starfield | 300 | 800 | 1500 |
|
||||
| Dissolve | 200 | 600 | 1200 |
|
||||
|
||||
Cull by truncating lists:
|
||||
```python
|
||||
MAX_PARTICLES = profile.get("particles_max", 1200)
|
||||
if len(S["px"]) > MAX_PARTICLES:
|
||||
for k in ("px", "py", "vx", "vy", "life", "char"):
|
||||
S[k] = S[k][-MAX_PARTICLES:] # keep newest
|
||||
```
|
||||
|
||||
## Memory Management
|
||||
|
||||
- Feature arrays: pre-computed for all frames, shared across workers via fork semantics (COW)
|
||||
- Canvas: allocated once per worker, reused (`np.zeros(...)`)
|
||||
- Character arrays: allocated per frame (cheap -- rows*cols U1 strings)
|
||||
- Bitmap cache: ~500KB per grid size, initialized once per worker
|
||||
|
||||
Total memory per worker: ~50-150MB. Total: ~400-800MB for 8 workers.
|
||||
|
||||
For low-memory systems (< 4GB), reduce worker count and use smaller grids.
|
||||
|
||||
## Brightness Verification
|
||||
|
||||
After render, spot-check brightness at sample timestamps:
|
||||
|
||||
```python
|
||||
for t in [2, 30, 60, 120, 180]:
|
||||
cmd = ["ffmpeg", "-ss", str(t), "-i", output_path,
|
||||
"-frames:v", "1", "-f", "rawvideo", "-pix_fmt", "rgb24", "-"]
|
||||
r = subprocess.run(cmd, capture_output=True)
|
||||
arr = np.frombuffer(r.stdout, dtype=np.uint8)
|
||||
print(f"t={t}s mean={arr.mean():.1f} max={arr.max()}")
|
||||
```
|
||||
|
||||
Target: mean > 5 for quiet sections, mean > 15 for active sections. If consistently below, increase brightness floor in effects and/or global boost multiplier.
|
||||
|
||||
## Render Time Estimates
|
||||
|
||||
Scale with hardware. Baseline: 1080p, 24fps, ~180ms/frame/worker.
|
||||
|
||||
| Duration | Frames | 4 workers | 8 workers | 16 workers |
|
||||
|----------|--------|-----------|-----------|------------|
|
||||
| 30s | 720 | ~3 min | ~2 min | ~1 min |
|
||||
| 2 min | 2,880 | ~13 min | ~7 min | ~4 min |
|
||||
| 3.5 min | 5,040 | ~23 min | ~12 min | ~6 min |
|
||||
| 5 min | 7,200 | ~33 min | ~17 min | ~9 min |
|
||||
| 10 min | 14,400 | ~65 min | ~33 min | ~17 min |
|
||||
|
||||
At 720p: multiply times by ~0.5. At 4K: multiply by ~4.
|
||||
|
||||
Heavier effects (many particles, dense grids, extra shader passes) add ~20-50%.
|
||||
|
||||
---
|
||||
|
||||
## Temp File Cleanup
|
||||
|
||||
Rendering generates intermediate files that accumulate across runs. Clean up after the final concat/mux step.
|
||||
|
||||
### Files to Clean
|
||||
|
||||
| File type | Source | Location |
|
||||
|-----------|--------|----------|
|
||||
| WAV extracts | `ffmpeg -i input.mp3 ... tmp.wav` | `tempfile.mktemp()` or project dir |
|
||||
| Segment clips | `render_clip()` output | `segments/seg_00.mp4` etc. |
|
||||
| Concat list | ffmpeg concat demuxer input | `segments/concat.txt` |
|
||||
| ffmpeg stderr logs | piped to file for debugging | `*.log` in project dir |
|
||||
| Feature cache | pickled numpy arrays | `*.pkl` or `*.npz` |
|
||||
|
||||
### Cleanup Function
|
||||
|
||||
```python
|
||||
import glob
|
||||
import tempfile
|
||||
import shutil
|
||||
|
||||
def cleanup_render_artifacts(segments_dir="segments", keep_final=True):
|
||||
"""Remove intermediate files after successful render.
|
||||
|
||||
Call this AFTER verifying the final output exists and plays correctly.
|
||||
|
||||
Args:
|
||||
segments_dir: directory containing segment clips and concat list
|
||||
keep_final: if True, only delete intermediates (not the final output)
|
||||
"""
|
||||
removed = []
|
||||
|
||||
# 1. Segment clips
|
||||
if os.path.isdir(segments_dir):
|
||||
shutil.rmtree(segments_dir)
|
||||
removed.append(f"directory: {segments_dir}")
|
||||
|
||||
# 2. Temporary WAV files
|
||||
for wav in glob.glob("*.wav"):
|
||||
if wav.startswith("tmp") or wav.startswith("extracted_"):
|
||||
os.remove(wav)
|
||||
removed.append(wav)
|
||||
|
||||
# 3. ffmpeg stderr logs
|
||||
for log in glob.glob("ffmpeg_*.log"):
|
||||
os.remove(log)
|
||||
removed.append(log)
|
||||
|
||||
# 4. Feature cache (optional — useful to keep for re-renders)
|
||||
# for cache in glob.glob("features_*.npz"):
|
||||
# os.remove(cache)
|
||||
# removed.append(cache)
|
||||
|
||||
print(f"Cleaned {len(removed)} artifacts: {removed}")
|
||||
return removed
|
||||
```
|
||||
|
||||
### Integration with Render Pipeline
|
||||
|
||||
Call cleanup at the end of the main render script, after the final output is verified:
|
||||
|
||||
```python
|
||||
# At end of main()
|
||||
if os.path.exists(output_path) and os.path.getsize(output_path) > 1000:
|
||||
cleanup_render_artifacts(segments_dir="segments")
|
||||
print(f"Done. Output: {output_path}")
|
||||
else:
|
||||
print("WARNING: final output missing or empty — skipping cleanup")
|
||||
```
|
||||
|
||||
### Temp File Best Practices
|
||||
|
||||
- Use `tempfile.mkdtemp()` for segment directories — avoids polluting the project dir
|
||||
- Name WAV extracts with `tempfile.mktemp(suffix=".wav")` so they're in the OS temp dir
|
||||
- For debugging, set `KEEP_INTERMEDIATES=1` env var to skip cleanup
|
||||
- Feature caches (`.npz`) are cheap to store and expensive to recompute — default to keeping them
|
||||
File diff suppressed because it is too large
Load Diff
File diff suppressed because it is too large
Load Diff
@@ -0,0 +1,367 @@
|
||||
# Troubleshooting Reference
|
||||
|
||||
> **See also:** composition.md · architecture.md · shaders.md · scenes.md · optimization.md
|
||||
|
||||
## Quick Diagnostic
|
||||
|
||||
| Symptom | Likely Cause | Fix |
|
||||
|---------|-------------|-----|
|
||||
| All black output | tonemap gamma too high or no effects rendering | Lower gamma to 0.5, check scene_fn returns non-zero canvas |
|
||||
| Washed out / too bright | Linear brightness multiplier instead of tonemap | Replace `canvas * N` with `tonemap(canvas, gamma=0.75)` |
|
||||
| ffmpeg hangs mid-render | stderr=subprocess.PIPE deadlock | Redirect stderr to file |
|
||||
| "read-only" array error | broadcast_to view without .copy() | Add `.copy()` after broadcast_to |
|
||||
| PicklingError | Lambda or closure in SCENES table | Define all fx_* at module level |
|
||||
| Random dark holes in output | Font missing Unicode glyphs | Validate palettes at init |
|
||||
| Audio-visual desync | Frame timing accumulation | Use integer frame counter, compute t fresh each frame |
|
||||
| Single-color flat output | Hue field shape mismatch | Ensure h,s,v arrays all (rows,cols) before hsv2rgb |
|
||||
| Text unreadable over busy bg | No contrast between text and background | Use `apply_text_backdrop()` (composition.md) + `reverse_vignette` shader (shaders.md) |
|
||||
| Text garbled/mirrored | Kaleidoscope or mirror shader applied to text scene | **Never apply kaleidoscope, mirror_h/v/quad/diag to scenes with readable text** — radial folding destroys legibility. Apply these only to background layers or text-free scenes |
|
||||
|
||||
Common bugs, gotchas, and platform-specific issues encountered during ASCII video development.
|
||||
|
||||
## NumPy Broadcasting
|
||||
|
||||
### The `broadcast_to().copy()` Trap
|
||||
|
||||
Hue field generators often return arrays that are broadcast views — they have shape `(1, cols)` or `(rows, 1)` that numpy broadcasts to `(rows, cols)`. These views are **read-only**. If any downstream code tries to modify them in-place (e.g., `h %= 1.0`), numpy raises:
|
||||
|
||||
```
|
||||
ValueError: output array is read-only
|
||||
```
|
||||
|
||||
**Fix**: Always `.copy()` after `broadcast_to()`:
|
||||
|
||||
```python
|
||||
h = np.broadcast_to(h, (g.rows, g.cols)).copy()
|
||||
```
|
||||
|
||||
This is especially important in `_render_vf()` where hue arrays flow through `hsv2rgb()`.
|
||||
|
||||
### The `+=` vs `+` Trap
|
||||
|
||||
Broadcasting also fails with in-place operators when operand shapes don't match exactly:
|
||||
|
||||
```python
|
||||
# FAILS if result is (rows,1) and operand is (rows, cols)
|
||||
val += np.sin(g.cc * 0.02 + t * 0.3) * 0.5
|
||||
|
||||
# WORKS — creates a new array
|
||||
val = val + np.sin(g.cc * 0.02 + t * 0.3) * 0.5
|
||||
```
|
||||
|
||||
The `vf_plasma()` function had this bug. Use `+` instead of `+=` when mixing different-shaped arrays.
|
||||
|
||||
### Shape Mismatch in `hsv2rgb()`
|
||||
|
||||
`hsv2rgb(h, s, v)` requires all three arrays to have identical shapes. If `h` is `(1, cols)` and `s` is `(rows, cols)`, the function crashes or produces wrong output.
|
||||
|
||||
**Fix**: Ensure all inputs are broadcast and copied to `(rows, cols)` before calling.
|
||||
|
||||
---
|
||||
|
||||
## Blend Mode Pitfalls
|
||||
|
||||
### Overlay Crushes Dark Inputs
|
||||
|
||||
`overlay(a, b) = 2*a*b` when `a < 0.5`. Two values of 0.12 produce `2 * 0.12 * 0.12 = 0.03`. The result is darker than either input.
|
||||
|
||||
**Impact**: If both layers are dark (which ASCII art usually is), overlay produces near-black output.
|
||||
|
||||
**Fix**: Use `screen` for dark source material. Screen always brightens: `1 - (1-a)*(1-b)`.
|
||||
|
||||
### Colordodge Division by Zero
|
||||
|
||||
`colordodge(a, b) = a / (1 - b)`. When `b = 1.0` (pure white pixels), this divides by zero.
|
||||
|
||||
**Fix**: Add epsilon: `a / (1 - b + 1e-6)`. The implementation in `BLEND_MODES` should include this.
|
||||
|
||||
### Colorburn Division by Zero
|
||||
|
||||
`colorburn(a, b) = 1 - (1-a) / b`. When `b = 0` (pure black pixels), this divides by zero.
|
||||
|
||||
**Fix**: Add epsilon: `1 - (1-a) / (b + 1e-6)`.
|
||||
|
||||
### Multiply Always Darkens
|
||||
|
||||
`multiply(a, b) = a * b`. Since both operands are [0,1], the result is always <= min(a,b). Never use multiply as a feedback blend mode — the frame goes black within a few frames.
|
||||
|
||||
**Fix**: Use `screen` for feedback, or `add` with low opacity.
|
||||
|
||||
---
|
||||
|
||||
## Multiprocessing
|
||||
|
||||
### Pickling Constraints
|
||||
|
||||
`ProcessPoolExecutor` serializes function arguments via pickle. This constrains what you can pass to workers:
|
||||
|
||||
| Can Pickle | Cannot Pickle |
|
||||
|-----------|---------------|
|
||||
| Module-level functions (`def fx_foo():`) | Lambdas (`lambda x: x + 1`) |
|
||||
| Dicts, lists, numpy arrays | Closures (functions defined inside functions) |
|
||||
| Class instances (with `__reduce__`) | Instance methods |
|
||||
| Strings, numbers | File handles, sockets |
|
||||
|
||||
**Impact**: All scene functions referenced in the SCENES table must be defined at module level with `def`. If you use a lambda or closure, you get:
|
||||
|
||||
```
|
||||
_pickle.PicklingError: Can't pickle <function <lambda> at 0x...>
|
||||
```
|
||||
|
||||
**Fix**: Define all scene functions at module top level. Lambdas used inside `_render_vf()` as val_fn/hue_fn are fine because they execute within the worker process — they're not pickled across process boundaries.
|
||||
|
||||
### macOS spawn vs Linux fork
|
||||
|
||||
On macOS, `multiprocessing` defaults to `spawn` (full serialization). On Linux, it defaults to `fork` (copy-on-write). This means:
|
||||
|
||||
- **macOS**: Feature arrays are serialized per worker (~57KB for 30s video, but scales with duration). Each worker re-imports the entire module.
|
||||
- **Linux**: Feature arrays are shared via COW. Workers inherit the parent's memory.
|
||||
|
||||
**Impact**: On macOS, module-level code (like `detect_hardware()`) runs in every worker process. If it has side effects (e.g., subprocess calls), those happen N+1 times.
|
||||
|
||||
### Per-Worker State Isolation
|
||||
|
||||
Each worker creates its own:
|
||||
- `Renderer` instance (with fresh grid cache)
|
||||
- `FeedbackBuffer` (feedback doesn't cross scene boundaries)
|
||||
- Random seed (`random.seed(hash(seg_id) + 42)`)
|
||||
|
||||
This means:
|
||||
- Particle state doesn't carry between scenes (expected)
|
||||
- Feedback trails reset at scene cuts (expected)
|
||||
- `np.random` state is NOT seeded by `random.seed()` — they use separate RNGs
|
||||
|
||||
**Fix for deterministic noise**: Use `np.random.RandomState(seed)` explicitly:
|
||||
|
||||
```python
|
||||
rng = np.random.RandomState(hash(seg_id) + 42)
|
||||
noise = rng.random((rows, cols))
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Brightness Issues
|
||||
|
||||
### Dark Scenes After Tonemap
|
||||
|
||||
If a scene is still dark after tonemap, check:
|
||||
|
||||
1. **Gamma too high**: Lower gamma (0.5-0.6) for scenes with destructive post-processing
|
||||
2. **Shader destroying brightness**: Solarize, posterize, or contrast adjustments in the shader chain can undo tonemap's work. Move destructive shaders earlier in the chain, or increase gamma to compensate.
|
||||
3. **Feedback with multiply**: Multiply feedback darkens every frame. Switch to screen or add.
|
||||
4. **Overlay blend in scene**: If the scene function uses `blend_canvas(..., "overlay", ...)` with dark layers, switch to screen.
|
||||
|
||||
### Diagnostic: Test-Frame Brightness
|
||||
|
||||
```bash
|
||||
python reel.py --test-frame 10.0
|
||||
# Output: Mean brightness: 44.3, max: 255
|
||||
```
|
||||
|
||||
If mean < 20, the scene needs attention. Common fixes:
|
||||
- Lower gamma in the SCENES entry
|
||||
- Change internal blend modes from overlay/multiply to screen/add
|
||||
- Increase value field multipliers (e.g., `vf_plasma(...) * 1.5`)
|
||||
- Check that the shader chain doesn't have an aggressive solarize or threshold
|
||||
|
||||
### v1 Brightness Pattern (Deprecated)
|
||||
|
||||
The old pattern used a linear multiplier:
|
||||
|
||||
```python
|
||||
# OLD — don't use
|
||||
canvas = np.clip(canvas.astype(np.float32) * 2.0, 0, 255).astype(np.uint8)
|
||||
```
|
||||
|
||||
This fails because:
|
||||
- Dark scenes (mean 8): `8 * 2.0 = 16` — still dark
|
||||
- Bright scenes (mean 130): `130 * 2.0 = 255` — clipped, lost detail
|
||||
|
||||
Use `tonemap()` instead. See `composition.md` § Adaptive Tone Mapping.
|
||||
|
||||
---
|
||||
|
||||
## ffmpeg Issues
|
||||
|
||||
### Pipe Deadlock
|
||||
|
||||
The #1 production bug. If you use `stderr=subprocess.PIPE`:
|
||||
|
||||
```python
|
||||
# DEADLOCK — stderr buffer fills at 64KB, blocks ffmpeg, blocks your writes
|
||||
pipe = subprocess.Popen(cmd, stdin=subprocess.PIPE, stderr=subprocess.PIPE)
|
||||
```
|
||||
|
||||
**Fix**: Always redirect stderr to a file:
|
||||
|
||||
```python
|
||||
stderr_fh = open(err_path, "w")
|
||||
pipe = subprocess.Popen(cmd, stdin=subprocess.PIPE,
|
||||
stdout=subprocess.DEVNULL, stderr=stderr_fh)
|
||||
```
|
||||
|
||||
### Frame Count Mismatch
|
||||
|
||||
If the number of frames written to the pipe doesn't match what ffmpeg expects (based on `-r` and duration), the output may have:
|
||||
- Missing frames at the end
|
||||
- Incorrect duration
|
||||
- Audio-video desync
|
||||
|
||||
**Fix**: Calculate frame count explicitly: `n_frames = int(duration * FPS)`. Don't use `range(int(start*FPS), int(end*FPS))` without verifying the total matches.
|
||||
|
||||
### Concat Fails with "unsafe file name"
|
||||
|
||||
```
|
||||
[concat @ ...] Unsafe file name
|
||||
```
|
||||
|
||||
**Fix**: Always use `-safe 0`:
|
||||
```python
|
||||
["ffmpeg", "-f", "concat", "-safe", "0", "-i", concat_path, ...]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Font Issues
|
||||
|
||||
### Cell Height (macOS Pillow)
|
||||
|
||||
`textbbox()` and `getbbox()` return incorrect heights on some macOS Pillow versions. Use `getmetrics()`:
|
||||
|
||||
```python
|
||||
ascent, descent = font.getmetrics()
|
||||
cell_height = ascent + descent # correct
|
||||
# NOT: font.getbbox("M")[3] # wrong on some versions
|
||||
```
|
||||
|
||||
### Missing Unicode Glyphs
|
||||
|
||||
Not all fonts render all Unicode characters. If a palette character isn't in the font, the glyph renders as a blank or tofu box, appearing as a dark hole in the output.
|
||||
|
||||
**Fix**: Validate at init:
|
||||
|
||||
```python
|
||||
all_chars = set()
|
||||
for pal in [PAL_DEFAULT, PAL_DENSE, PAL_RUNE, ...]:
|
||||
all_chars.update(pal)
|
||||
|
||||
valid_chars = set()
|
||||
for c in all_chars:
|
||||
if c == " ":
|
||||
valid_chars.add(c)
|
||||
continue
|
||||
img = Image.new("L", (20, 20), 0)
|
||||
ImageDraw.Draw(img).text((0, 0), c, fill=255, font=font)
|
||||
if np.array(img).max() > 0:
|
||||
valid_chars.add(c)
|
||||
else:
|
||||
log(f"WARNING: '{c}' (U+{ord(c):04X}) missing from font")
|
||||
```
|
||||
|
||||
### Platform Font Paths
|
||||
|
||||
| Platform | Common Paths |
|
||||
|----------|-------------|
|
||||
| macOS | `/System/Library/Fonts/Menlo.ttc`, `/System/Library/Fonts/Monaco.ttf` |
|
||||
| Linux | `/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf` |
|
||||
| Windows | `C:\Windows\Fonts\consola.ttf` (Consolas) |
|
||||
|
||||
Always probe multiple paths and fall back gracefully. See `architecture.md` § Font Selection.
|
||||
|
||||
---
|
||||
|
||||
## Performance
|
||||
|
||||
### Slow Shaders
|
||||
|
||||
Some shaders use Python loops and are very slow at 1080p:
|
||||
|
||||
| Shader | Issue | Fix |
|
||||
|--------|-------|-----|
|
||||
| `wave_distort` | Per-row Python loop | Use vectorized fancy indexing |
|
||||
| `halftone` | Triple-nested loop | Vectorize with block reduction |
|
||||
| `matrix rain` | Per-column per-trail loop | Accumulate index arrays, bulk assign |
|
||||
|
||||
### Render Time Scaling
|
||||
|
||||
If render is taking much longer than expected:
|
||||
1. Check grid count — each extra grid adds ~100-150ms/frame for init
|
||||
2. Check particle count — cap at quality-appropriate limits
|
||||
3. Check shader count — each shader adds 2-25ms
|
||||
4. Check for accidental Python loops in effects (should be numpy only)
|
||||
|
||||
---
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
### Using `r.S` vs the `S` Parameter
|
||||
|
||||
The v2 scene protocol passes `S` (the state dict) as an explicit parameter. But `S` IS `r.S` — they're the same object. Both work:
|
||||
|
||||
```python
|
||||
def fx_scene(r, f, t, S):
|
||||
S["counter"] = S.get("counter", 0) + 1 # via parameter (preferred)
|
||||
r.S["counter"] = r.S.get("counter", 0) + 1 # via renderer (also works)
|
||||
```
|
||||
|
||||
Use the `S` parameter for clarity. The explicit parameter makes it obvious that the function has persistent state.
|
||||
|
||||
### Forgetting to Handle Empty Feature Values
|
||||
|
||||
Audio features default to 0.0 if the audio is silent. Use `.get()` with sensible defaults:
|
||||
|
||||
```python
|
||||
energy = f.get("bass", 0.3) # default to 0.3, not 0
|
||||
```
|
||||
|
||||
If you default to 0, effects go blank during silence.
|
||||
|
||||
### Writing New Files Instead of Editing Existing State
|
||||
|
||||
A common bug in particle systems: creating new arrays every frame instead of updating persistent state.
|
||||
|
||||
```python
|
||||
# WRONG — particles reset every frame
|
||||
S["px"] = []
|
||||
for _ in range(100):
|
||||
S["px"].append(random.random())
|
||||
|
||||
# RIGHT — only initialize once, update each frame
|
||||
if "px" not in S:
|
||||
S["px"] = []
|
||||
# ... emit new particles based on beats
|
||||
# ... update existing particles
|
||||
```
|
||||
|
||||
### Not Clipping Value Fields
|
||||
|
||||
Value fields should be [0, 1]. If they exceed this range, `val2char()` produces index errors:
|
||||
|
||||
```python
|
||||
# WRONG — vf_plasma() * 1.5 can exceed 1.0
|
||||
val = vf_plasma(g, f, t, S) * 1.5
|
||||
|
||||
# RIGHT — clip after scaling
|
||||
val = np.clip(vf_plasma(g, f, t, S) * 1.5, 0, 1)
|
||||
```
|
||||
|
||||
The `_render_vf()` helper clips automatically, but if you're building custom scenes, clip explicitly.
|
||||
|
||||
## Brightness Best Practices
|
||||
|
||||
- Dense animated backgrounds — never flat black, always fill the grid
|
||||
- Vignette minimum clamped to 0.15 (not 0.12)
|
||||
- Bloom threshold 130 (not 170) so more pixels contribute to glow
|
||||
- Use `screen` blend mode (not `overlay`) for dark ASCII layers — overlay squares dark values: `2 * 0.12 * 0.12 = 0.03`
|
||||
- FeedbackBuffer decay minimum 0.5 — below that, feedback disappears too fast to see
|
||||
- Value field floor: `vf * 0.8 + 0.05` ensures no cell is truly zero
|
||||
- Per-scene gamma overrides: default 0.75, solarize 0.55, posterize 0.50, bright scenes 0.85
|
||||
- Test frames early: render single frames at key timestamps before committing to full render
|
||||
|
||||
**Quick checklist before full render:**
|
||||
1. Render 3 test frames (start, middle, end)
|
||||
2. Check `canvas.mean() > 8` after tonemap
|
||||
3. Check no scene is visually flat black
|
||||
4. Verify per-section variation (different bg/palette/color per scene)
|
||||
5. Confirm shader chain includes bloom (threshold 130)
|
||||
6. Confirm vignette strength ≤ 0.25
|
||||
@@ -0,0 +1,77 @@
|
||||
# Port Notes — baoyu-comic
|
||||
|
||||
Ported from [JimLiu/baoyu-skills](https://github.com/JimLiu/baoyu-skills) v1.56.1.
|
||||
|
||||
## Changes from upstream
|
||||
|
||||
### SKILL.md adaptations
|
||||
|
||||
| Change | Upstream | Hermes |
|
||||
|--------|----------|--------|
|
||||
| Metadata namespace | `openclaw` | `hermes` (with `tags` + `homepage`) |
|
||||
| Trigger | Slash commands / CLI flags | Natural language skill matching |
|
||||
| User config | EXTEND.md file (project/user/XDG paths) | Removed — not part of Hermes infra |
|
||||
| User prompts | `AskUserQuestion` (batched) | `clarify` tool (one question at a time) |
|
||||
| Image generation | baoyu-imagine (Bun/TypeScript, supports `--ref`) | `image_generate` — **prompt-only**, returns a URL; no reference image input; agent must download the URL to the output directory |
|
||||
| PDF assembly | `scripts/merge-to-pdf.ts` (Bun + `pdf-lib`) | Removed — the PDF merge step is out of scope for this port; pages are delivered as PNGs only |
|
||||
| Platform support | Linux/macOS/Windows/WSL/PowerShell | Linux/macOS only |
|
||||
| File operations | Generic instructions | Hermes file tools (`write_file`, `read_file`) |
|
||||
|
||||
### Structural removals
|
||||
|
||||
- **`references/config/` directory** (removed entirely):
|
||||
- `first-time-setup.md` — blocking first-time setup flow for EXTEND.md
|
||||
- `preferences-schema.md` — EXTEND.md YAML schema
|
||||
- `watermark-guide.md` — watermark config (tied to EXTEND.md)
|
||||
- **`scripts/` directory** (removed entirely): upstream's `merge-to-pdf.ts` depended on `pdf-lib`, which is not declared anywhere in the Hermes repo. Rather than add a new dependency, the port drops PDF assembly and delivers per-page PNGs.
|
||||
- **Workflow Step 8 (Merge to PDF)** removed from `workflow.md`; Step 9 (Completion report) renumbered to Step 8.
|
||||
- **Workflow Step 1.1** — "Load Preferences (EXTEND.md)" section removed from `workflow.md`; steps 1.2/1.3 renumbered to 1.1/1.2.
|
||||
- **Generic "User Input Tools" and "Image Generation Tools" preambles** — SKILL.md no longer lists fallback rules for multiple possible tools; it references `clarify` and `image_generate` directly.
|
||||
|
||||
### Image generation strategy changes
|
||||
|
||||
`image_generate`'s schema accepts only `prompt` and `aspect_ratio` (`landscape` | `portrait` | `square`). Upstream's reference-image flow (`--ref characters.png` for character consistency, plus user-supplied refs for style/palette/scene) does not map to this tool, so the workflow was restructured:
|
||||
|
||||
- **Character sheet PNG** is still generated for multi-page comics, but it is repositioned as a **human-facing review artifact** (for visual verification) and a reference for later regenerations / manual prompt edits. Page prompts themselves are built from the **text descriptions** in `characters/characters.md` (embedded inline during Step 5). `image_generate` never sees the PNG as a visual input.
|
||||
- **User-supplied reference images** are reduced to `style` / `palette` / `scene` trait extraction — traits are embedded in the prompt body; the image files themselves are kept only for provenance under `refs/`.
|
||||
- **Page prompts** now mandate that character descriptions are embedded inline (copied from `characters/characters.md`) — this is the only mechanism left to enforce cross-page character consistency.
|
||||
- **Download step** — after every `image_generate` call, the returned URL is fetched to disk (e.g., `curl -fsSL "<url>" -o <target>.png`) and verified before the workflow advances.
|
||||
|
||||
### SKILL.md reductions
|
||||
|
||||
- CLI option columns (`--art`, `--tone`, `--layout`, `--aspect`, `--lang`, `--ref`, `--storyboard-only`, `--prompts-only`, `--images-only`, `--regenerate`) converted to plain-English option descriptions.
|
||||
- Preset files (`presets/*.md`) and `ohmsha-guide.md`: `` `--style X` `` / `` `--art X --tone Y` `` shorthand rewritten to `art=X, tone=Y` + natural-language references.
|
||||
- `partial-workflows.md`: per-skill slash command invocations rewritten as user-intent cues; PDF-related outputs removed.
|
||||
- `auto-selection.md`: priority order dropped the EXTEND.md tier.
|
||||
- `analysis-framework.md`: language-priority comment updated (user option → conversation → source).
|
||||
|
||||
### File naming convention
|
||||
|
||||
Source content pasted by the user is saved as `source-{slug}.md`, where `{slug}` is the kebab-case topic slug used for the output directory. Backups follow the same pattern with a `-backup-YYYYMMDD-HHMMSS` suffix. SKILL.md and `workflow.md` now agree on this single convention.
|
||||
|
||||
### What was preserved verbatim
|
||||
|
||||
- All 6 art-style definitions (`references/art-styles/`)
|
||||
- All 7 tone definitions (`references/tones/`)
|
||||
- All 7 layout definitions (`references/layouts/`)
|
||||
- Core templates: `character-template.md`, `storyboard-template.md`, `base-prompt.md`
|
||||
- Preset bodies (only the first few intro lines adapted; special rules unchanged)
|
||||
- Author, version, homepage attribution
|
||||
|
||||
## Syncing with upstream
|
||||
|
||||
To pull upstream updates:
|
||||
|
||||
```bash
|
||||
# Compare versions
|
||||
curl -sL https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu-comic/SKILL.md | head -5
|
||||
# Look for the version: line
|
||||
|
||||
# Diff a reference file
|
||||
diff <(curl -sL https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu-comic/references/art-styles/manga.md) \
|
||||
references/art-styles/manga.md
|
||||
```
|
||||
|
||||
Art-style, tone, and layout reference files can usually be overwritten directly (they're upstream-verbatim). `SKILL.md`, `references/workflow.md`, `references/partial-workflows.md`, `references/auto-selection.md`, `references/analysis-framework.md`, `references/ohmsha-guide.md`, and `references/presets/*.md` must be manually merged since they contain Hermes-specific adaptations.
|
||||
|
||||
If upstream adds a Hermes-compatible PDF merge step (no extra npm deps), restore `scripts/` and reintroduce Step 8 in `workflow.md`.
|
||||
@@ -0,0 +1,246 @@
|
||||
---
|
||||
name: baoyu-comic
|
||||
description: Knowledge comic creator supporting multiple art styles and tones. Creates original educational comics with detailed panel layouts and sequential image generation. Use when user asks to create "知识漫画", "教育漫画", "biography comic", "tutorial comic", or "Logicomix-style comic".
|
||||
version: 1.56.1
|
||||
author: 宝玉 (JimLiu)
|
||||
license: MIT
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [comic, knowledge-comic, creative, image-generation]
|
||||
homepage: https://github.com/JimLiu/baoyu-skills#baoyu-comic
|
||||
---
|
||||
|
||||
# Knowledge Comic Creator
|
||||
|
||||
Adapted from [baoyu-comic](https://github.com/JimLiu/baoyu-skills) for Hermes Agent's tool ecosystem.
|
||||
|
||||
Create original knowledge comics with flexible art style × tone combinations.
|
||||
|
||||
## When to Use
|
||||
|
||||
Trigger this skill when the user asks to create a knowledge/educational comic, biography comic, tutorial comic, or uses terms like "知识漫画", "教育漫画", or "Logicomix-style". The user provides content (text, file path, URL, or topic) and optionally specifies art style, tone, layout, aspect ratio, or language.
|
||||
|
||||
## Reference Images
|
||||
|
||||
Hermes' `image_generate` tool is **prompt-only** — it accepts a text prompt and an aspect ratio, and returns an image URL. It does **NOT** accept reference images. When the user supplies a reference image, use it to **extract traits in text** that get embedded in every page prompt:
|
||||
|
||||
**Intake**: Accept file paths when the user provides them (or pastes images in conversation).
|
||||
- File path(s) → copy to `refs/NN-ref-{slug}.{ext}` alongside the comic output for provenance
|
||||
- Pasted image with no path → ask the user for the path via `clarify`, or extract style traits verbally as a text fallback
|
||||
- No reference → skip this section
|
||||
|
||||
**Usage modes** (per reference):
|
||||
|
||||
| Usage | Effect |
|
||||
|-------|--------|
|
||||
| `style` | Extract style traits (line treatment, texture, mood) and append to every page's prompt body |
|
||||
| `palette` | Extract hex colors and append to every page's prompt body |
|
||||
| `scene` | Extract scene composition or subject notes and append to the relevant page(s) |
|
||||
|
||||
**Record in each page's prompt frontmatter** when refs exist:
|
||||
|
||||
```yaml
|
||||
references:
|
||||
- ref_id: 01
|
||||
filename: 01-ref-scene.png
|
||||
usage: style
|
||||
traits: "muted earth tones, soft-edged ink wash, low-contrast backgrounds"
|
||||
```
|
||||
|
||||
Character consistency is driven by **text descriptions** in `characters/characters.md` (written in Step 3) that get embedded inline in every page prompt (Step 5). The optional PNG character sheet generated in Step 7.1 is a human-facing review artifact, not an input to `image_generate`.
|
||||
|
||||
## Options
|
||||
|
||||
### Visual Dimensions
|
||||
|
||||
| Option | Values | Description |
|
||||
|--------|--------|-------------|
|
||||
| Art | ligne-claire (default), manga, realistic, ink-brush, chalk, minimalist | Art style / rendering technique |
|
||||
| Tone | neutral (default), warm, dramatic, romantic, energetic, vintage, action | Mood / atmosphere |
|
||||
| Layout | standard (default), cinematic, dense, splash, mixed, webtoon, four-panel | Panel arrangement |
|
||||
| Aspect | 3:4 (default, portrait), 4:3 (landscape), 16:9 (widescreen) | Page aspect ratio |
|
||||
| Language | auto (default), zh, en, ja, etc. | Output language |
|
||||
| Refs | File paths | Reference images used for style / palette trait extraction (not passed to the image model). See [Reference Images](#reference-images) above. |
|
||||
|
||||
### Partial Workflow Options
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| Storyboard only | Generate storyboard only, skip prompts and images |
|
||||
| Prompts only | Generate storyboard + prompts, skip images |
|
||||
| Images only | Generate images from existing prompts directory |
|
||||
| Regenerate N | Regenerate specific page(s) only (e.g., `3` or `2,5,8`) |
|
||||
|
||||
Details: [references/partial-workflows.md](references/partial-workflows.md)
|
||||
|
||||
### Art, Tone & Preset Catalogue
|
||||
|
||||
- **Art styles** (6): `ligne-claire`, `manga`, `realistic`, `ink-brush`, `chalk`, `minimalist`. Full definitions at `references/art-styles/<style>.md`.
|
||||
- **Tones** (7): `neutral`, `warm`, `dramatic`, `romantic`, `energetic`, `vintage`, `action`. Full definitions at `references/tones/<tone>.md`.
|
||||
- **Presets** (5) with special rules beyond plain art+tone:
|
||||
|
||||
| Preset | Equivalent | Hook |
|
||||
|--------|-----------|------|
|
||||
| `ohmsha` | manga + neutral | Visual metaphors, no talking heads, gadget reveals |
|
||||
| `wuxia` | ink-brush + action | Qi effects, combat visuals, atmospheric |
|
||||
| `shoujo` | manga + romantic | Decorative elements, eye details, romantic beats |
|
||||
| `concept-story` | manga + warm | Visual symbol system, growth arc, dialogue+action balance |
|
||||
| `four-panel` | minimalist + neutral + four-panel layout | 起承转合 structure, B&W + spot color, stick-figure characters |
|
||||
|
||||
Full rules at `references/presets/<preset>.md` — load the file when a preset is picked.
|
||||
|
||||
- **Compatibility matrix** and **content-signal → preset** table live in [references/auto-selection.md](references/auto-selection.md). Read it before recommending combinations in Step 2.
|
||||
|
||||
## File Structure
|
||||
|
||||
Output directory: `comic/{topic-slug}/`
|
||||
- Slug: 2-4 words kebab-case from topic (e.g., `alan-turing-bio`)
|
||||
- Conflict: append timestamp (e.g., `turing-story-20260118-143052`)
|
||||
|
||||
**Contents**:
|
||||
| File | Description |
|
||||
|------|-------------|
|
||||
| `source-{slug}.md` | Saved source content (kebab-case slug matches the output directory) |
|
||||
| `analysis.md` | Content analysis |
|
||||
| `storyboard.md` | Storyboard with panel breakdown |
|
||||
| `characters/characters.md` | Character definitions |
|
||||
| `characters/characters.png` | Character reference sheet (downloaded from `image_generate`) |
|
||||
| `prompts/NN-{cover\|page}-[slug].md` | Generation prompts |
|
||||
| `NN-{cover\|page}-[slug].png` | Generated images (downloaded from `image_generate`) |
|
||||
| `refs/NN-ref-{slug}.{ext}` | User-supplied reference images (optional, for provenance) |
|
||||
|
||||
## Language Handling
|
||||
|
||||
**Detection Priority**:
|
||||
1. User-specified language (explicit option)
|
||||
2. User's conversation language
|
||||
3. Source content language
|
||||
|
||||
**Rule**: Use user's input language for ALL interactions:
|
||||
- Storyboard outlines and scene descriptions
|
||||
- Image generation prompts
|
||||
- User selection options and confirmations
|
||||
- Progress updates, questions, errors, summaries
|
||||
|
||||
Technical terms remain in English.
|
||||
|
||||
## Workflow
|
||||
|
||||
### Progress Checklist
|
||||
|
||||
```
|
||||
Comic Progress:
|
||||
- [ ] Step 1: Setup & Analyze
|
||||
- [ ] 1.1 Analyze content
|
||||
- [ ] 1.2 Check existing directory
|
||||
- [ ] Step 2: Confirmation - Style & options ⚠️ REQUIRED
|
||||
- [ ] Step 3: Generate storyboard + characters
|
||||
- [ ] Step 4: Review outline (conditional)
|
||||
- [ ] Step 5: Generate prompts
|
||||
- [ ] Step 6: Review prompts (conditional)
|
||||
- [ ] Step 7: Generate images
|
||||
- [ ] 7.1 Generate character sheet (if needed) → characters/characters.png
|
||||
- [ ] 7.2 Generate pages (with character descriptions embedded in prompt)
|
||||
- [ ] Step 8: Completion report
|
||||
```
|
||||
|
||||
### Flow
|
||||
|
||||
```
|
||||
Input → Analyze → [Check Existing?] → [Confirm: Style + Reviews] → Storyboard → [Review?] → Prompts → [Review?] → Images → Complete
|
||||
```
|
||||
|
||||
### Step Summary
|
||||
|
||||
| Step | Action | Key Output |
|
||||
|------|--------|------------|
|
||||
| 1.1 | Analyze content | `analysis.md`, `source-{slug}.md` |
|
||||
| 1.2 | Check existing directory | Handle conflicts |
|
||||
| 2 | Confirm style, focus, audience, reviews | User preferences |
|
||||
| 3 | Generate storyboard + characters | `storyboard.md`, `characters/` |
|
||||
| 4 | Review outline (if requested) | User approval |
|
||||
| 5 | Generate prompts | `prompts/*.md` |
|
||||
| 6 | Review prompts (if requested) | User approval |
|
||||
| 7.1 | Generate character sheet (if needed) | `characters/characters.png` |
|
||||
| 7.2 | Generate pages | `*.png` files |
|
||||
| 8 | Completion report | Summary |
|
||||
|
||||
### User Questions
|
||||
|
||||
Use the `clarify` tool to confirm options. Since `clarify` handles one question at a time, ask the most important question first and proceed sequentially. See [references/workflow.md](references/workflow.md) for the full Step 2 question set.
|
||||
|
||||
**Timeout handling (CRITICAL)**: `clarify` can return `"The user did not provide a response within the time limit. Use your best judgement to make the choice and proceed."` — this is NOT user consent to default everything.
|
||||
|
||||
- Treat it as a default **for that one question only**. Continue asking the remaining Step 2 questions in sequence; each question is an independent consent point.
|
||||
- **Surface the default to the user visibly** in your next message so they have a chance to correct it: e.g. `"Style: defaulted to ohmsha preset (clarify timed out). Say the word to switch."` — an unreported default is indistinguishable from never having asked.
|
||||
- Do NOT collapse Step 2 into a single "use all defaults" pass after one timeout. If the user is genuinely absent, they will be equally absent for all five questions — but they can correct visible defaults when they return, and cannot correct invisible ones.
|
||||
|
||||
### Step 7: Image Generation
|
||||
|
||||
Use Hermes' built-in `image_generate` tool for all image rendering. Its schema accepts only `prompt` and `aspect_ratio` (`landscape` | `portrait` | `square`); it **returns a URL**, not a local file. Every generated page or character sheet must therefore be downloaded to the output directory.
|
||||
|
||||
**Prompt file requirement (hard)**: write each image's full, final prompt to a standalone file under `prompts/` (naming: `NN-{type}-[slug].md`) BEFORE calling `image_generate`. The prompt file is the reproducibility record.
|
||||
|
||||
**Aspect ratio mapping** — the storyboard's `aspect_ratio` field maps to `image_generate`'s format as follows:
|
||||
|
||||
| Storyboard ratio | `image_generate` format |
|
||||
|------------------|-------------------------|
|
||||
| `3:4`, `9:16`, `2:3` | `portrait` |
|
||||
| `4:3`, `16:9`, `3:2` | `landscape` |
|
||||
| `1:1` | `square` |
|
||||
|
||||
**Download step** — after every `image_generate` call:
|
||||
1. Read the URL from the tool result
|
||||
2. Fetch the image bytes using an **absolute** output path, e.g.
|
||||
`curl -fsSL "<url>" -o /abs/path/to/comic/<slug>/NN-page-<slug>.png`
|
||||
3. Verify the file exists and is non-empty at that exact path before proceeding to the next page
|
||||
|
||||
**Never rely on shell CWD persistence for `-o` paths.** The terminal tool's persistent-shell CWD can change between batches (session expiry, `TERMINAL_LIFETIME_SECONDS`, a failed `cd` that leaves you in the wrong directory). `curl -o relative/path.png` is a silent footgun: if CWD has drifted, the file lands somewhere else with no error. **Always pass a fully-qualified absolute path to `-o`**, or pass `workdir=<abs path>` to the terminal tool. Incident Apr 2026: pages 06-09 of a 10-page comic landed at the repo root instead of `comic/<slug>/` because batch 3 inherited a stale CWD from batch 2 and `curl -o 06-page-skills.png` wrote to the wrong directory. The agent then spent several turns claiming the files existed where they didn't.
|
||||
|
||||
**7.1 Character sheet** — generate it (to `characters/characters.png`, aspect `landscape`) when the comic is multi-page with recurring characters. Skip for simple presets (e.g., four-panel minimalist) or single-page comics. The prompt file at `characters/characters.md` must exist before invoking `image_generate`. The rendered PNG is a **human-facing review artifact** (so the user can visually verify character design) and a reference for later regenerations or manual prompt edits — it does **not** drive Step 7.2. Page prompts are already written in Step 5 from the **text descriptions** in `characters/characters.md`; `image_generate` cannot accept images as visual input.
|
||||
|
||||
**7.2 Pages** — each page's prompt MUST already be at `prompts/NN-{cover|page}-[slug].md` before invoking `image_generate`. Because `image_generate` is prompt-only, character consistency is enforced by **embedding character descriptions (sourced from `characters/characters.md`) inline in every page prompt during Step 5**. The embedding is done uniformly whether or not a PNG sheet is produced in 7.1; the PNG is only a review/regeneration aid.
|
||||
|
||||
**Backup rule**: existing `prompts/…md` and `…png` files → rename with `-backup-YYYYMMDD-HHMMSS` suffix before regenerating.
|
||||
|
||||
Full step-by-step workflow (analysis, storyboard, review gates, regeneration variants): [references/workflow.md](references/workflow.md).
|
||||
|
||||
## References
|
||||
|
||||
**Core Templates**:
|
||||
- [analysis-framework.md](references/analysis-framework.md) - Deep content analysis
|
||||
- [character-template.md](references/character-template.md) - Character definition format
|
||||
- [storyboard-template.md](references/storyboard-template.md) - Storyboard structure
|
||||
- [ohmsha-guide.md](references/ohmsha-guide.md) - Ohmsha manga specifics
|
||||
|
||||
**Style Definitions**:
|
||||
- `references/art-styles/` - Art styles (ligne-claire, manga, realistic, ink-brush, chalk, minimalist)
|
||||
- `references/tones/` - Tones (neutral, warm, dramatic, romantic, energetic, vintage, action)
|
||||
- `references/presets/` - Presets with special rules (ohmsha, wuxia, shoujo, concept-story, four-panel)
|
||||
- `references/layouts/` - Layouts (standard, cinematic, dense, splash, mixed, webtoon, four-panel)
|
||||
|
||||
**Workflow**:
|
||||
- [workflow.md](references/workflow.md) - Full workflow details
|
||||
- [auto-selection.md](references/auto-selection.md) - Content signal analysis
|
||||
- [partial-workflows.md](references/partial-workflows.md) - Partial workflow options
|
||||
|
||||
## Page Modification
|
||||
|
||||
| Action | Steps |
|
||||
|--------|-------|
|
||||
| **Edit** | **Update prompt file FIRST** → regenerate image → download new PNG |
|
||||
| **Add** | Create prompt at position → generate with character descriptions embedded → renumber subsequent → update storyboard |
|
||||
| **Delete** | Remove files → renumber subsequent → update storyboard |
|
||||
|
||||
**IMPORTANT**: When updating pages, ALWAYS update the prompt file (`prompts/NN-{cover|page}-[slug].md`) FIRST before regenerating. This ensures changes are documented and reproducible.
|
||||
|
||||
## Pitfalls
|
||||
|
||||
- Image generation: 10-30 seconds per page; auto-retry once on failure
|
||||
- **Always download** the URL returned by `image_generate` to a local PNG — downstream tooling (and the user's review) expects files in the output directory, not ephemeral URLs
|
||||
- **Use absolute paths for `curl -o`** — never rely on persistent-shell CWD across batches. Silent footgun: files land in the wrong directory and subsequent `ls` on the intended path shows nothing. See Step 7 "Download step".
|
||||
- Use stylized alternatives for sensitive public figures
|
||||
- **Step 2 confirmation required** - do not skip
|
||||
- **Steps 4/6 conditional** - only if user requested in Step 2
|
||||
- **Step 7.1 character sheet** - recommended for multi-page comics, optional for simple presets. The PNG is a review/regeneration aid; page prompts (written in Step 5) use the text descriptions in `characters/characters.md`, not the PNG. `image_generate` does not accept images as visual input
|
||||
- **Strip secrets** — scan source content for API keys, tokens, or credentials before writing any output file
|
||||
@@ -0,0 +1,176 @@
|
||||
# Comic Content Analysis Framework
|
||||
|
||||
Deep analysis framework for transforming source content into effective visual storytelling.
|
||||
|
||||
## Purpose
|
||||
|
||||
Before creating a comic, thoroughly analyze the source material to:
|
||||
- Identify the target audience and their needs
|
||||
- Determine what value the comic will deliver
|
||||
- Extract narrative potential for visual storytelling
|
||||
- Plan character arcs and key moments
|
||||
|
||||
## Analysis Dimensions
|
||||
|
||||
### 1. Core Content (Understanding "What")
|
||||
|
||||
**Central Message**
|
||||
- What is the single most important idea readers should take away?
|
||||
- Can you express it in one sentence?
|
||||
|
||||
**Key Concepts**
|
||||
- What are the essential concepts readers must understand?
|
||||
- How should these concepts be visualized?
|
||||
- Which concepts need simplified explanations?
|
||||
|
||||
**Content Structure**
|
||||
- How is the source material organized?
|
||||
- What is the natural narrative arc?
|
||||
- Where are the climax and turning points?
|
||||
|
||||
**Evidence & Examples**
|
||||
- What concrete examples, data, or stories support the main ideas?
|
||||
- Which examples translate well to visual panels?
|
||||
- What can be shown rather than told?
|
||||
|
||||
### 2. Context & Background (Understanding "Why")
|
||||
|
||||
**Source Origin**
|
||||
- Who created this content? What is their perspective?
|
||||
- What was the original purpose?
|
||||
- Is there bias to be aware of?
|
||||
|
||||
**Historical/Cultural Context**
|
||||
- When and where does the story take place?
|
||||
- What background knowledge do readers need?
|
||||
- What period-specific visual elements are required?
|
||||
|
||||
**Underlying Assumptions**
|
||||
- What does the source assume readers already know?
|
||||
- What implicit beliefs or values are present?
|
||||
- Should the comic challenge or reinforce these?
|
||||
|
||||
### 3. Audience Analysis
|
||||
|
||||
**Primary Audience**
|
||||
- Who will read this comic?
|
||||
- What is their existing knowledge level?
|
||||
- What are their interests and motivations?
|
||||
|
||||
**Secondary Audiences**
|
||||
- Who else might benefit from this comic?
|
||||
- How might their needs differ?
|
||||
|
||||
**Reader Questions**
|
||||
- What questions will readers have?
|
||||
- What misconceptions might they bring?
|
||||
- What "aha moments" can we create?
|
||||
|
||||
### 4. Value Proposition
|
||||
|
||||
**Knowledge Value**
|
||||
- What will readers learn?
|
||||
- What new perspectives will they gain?
|
||||
- How will this change their understanding?
|
||||
|
||||
**Emotional Value**
|
||||
- What emotions should readers feel?
|
||||
- What connections will they make with characters?
|
||||
- What will make this memorable?
|
||||
|
||||
**Practical Value**
|
||||
- Can readers apply what they learn?
|
||||
- What actions might this inspire?
|
||||
- What conversations might it spark?
|
||||
|
||||
### 5. Narrative Potential
|
||||
|
||||
**Story Arc Candidates**
|
||||
- What natural narratives exist in the content?
|
||||
- Where is the conflict or tension?
|
||||
- What transformations occur?
|
||||
|
||||
**Character Potential**
|
||||
- Who are the key figures?
|
||||
- What are their motivations and obstacles?
|
||||
- How do they change throughout?
|
||||
|
||||
**Visual Opportunities**
|
||||
- What scenes have strong visual potential?
|
||||
- Where can abstract concepts become concrete images?
|
||||
- What metaphors can be visualized?
|
||||
|
||||
**Dramatic Moments**
|
||||
- What are the breakthrough/revelation moments?
|
||||
- Where are the emotional peaks?
|
||||
- What creates tension and release?
|
||||
|
||||
### 6. Adaptation Considerations
|
||||
|
||||
**What to Keep**
|
||||
- Essential facts and ideas
|
||||
- Key quotes or moments
|
||||
- Core emotional beats
|
||||
|
||||
**What to Simplify**
|
||||
- Complex explanations
|
||||
- Dense technical details
|
||||
- Lengthy descriptions
|
||||
|
||||
**What to Expand**
|
||||
- Brief mentions that deserve more attention
|
||||
- Implied emotions or relationships
|
||||
- Visual details not in source
|
||||
|
||||
**What to Omit**
|
||||
- Tangential information
|
||||
- Redundant examples
|
||||
- Content that doesn't serve the narrative
|
||||
|
||||
## Output Format
|
||||
|
||||
Analysis results should be saved to `analysis.md` with:
|
||||
|
||||
1. **YAML Front Matter**: Metadata (title, topic, time_span, source_language, user_language, aspect_ratio, recommended_page_count, recommended_art, recommended_tone, recommended_layout)
|
||||
2. **Target Audience**: Primary, secondary, tertiary audiences with their needs
|
||||
3. **Value Proposition**: What readers will gain (knowledge, emotional, practical)
|
||||
4. **Core Themes**: Table with theme, narrative potential, visual opportunity
|
||||
5. **Key Figures & Story Arcs**: Character profiles with arcs, visual identity, key moments
|
||||
6. **Content Signals**: Style and layout recommendations based on content type
|
||||
7. **Recommended Approaches**: Narrative approaches ranked by suitability
|
||||
|
||||
### YAML Front Matter Example
|
||||
|
||||
```yaml
|
||||
---
|
||||
title: "Alan Turing: The Father of Computing"
|
||||
topic: alan-turing-biography
|
||||
time_span: 1912-1954
|
||||
source_language: en
|
||||
user_language: zh # User-specified or detected from conversation
|
||||
aspect_ratio: "3:4"
|
||||
recommended_page_count: 16
|
||||
recommended_art: ligne-claire # ligne-claire|manga|realistic|ink-brush|chalk
|
||||
recommended_tone: neutral # neutral|warm|dramatic|romantic|energetic|vintage|action
|
||||
recommended_layout: mixed # standard|cinematic|dense|splash|mixed|webtoon
|
||||
---
|
||||
```
|
||||
|
||||
### Language Fields
|
||||
|
||||
| Field | Description |
|
||||
|-------|-------------|
|
||||
| `source_language` | Detected language of source content |
|
||||
| `user_language` | Output language for comic (user-specified option > conversation language > source_language) |
|
||||
|
||||
## Analysis Checklist
|
||||
|
||||
Before proceeding to storyboard:
|
||||
|
||||
- [ ] Can I state the core message in one sentence?
|
||||
- [ ] Do I know exactly who will read this comic?
|
||||
- [ ] Have I identified at least 3 ways this comic provides value?
|
||||
- [ ] Are there clear protagonists with compelling arcs?
|
||||
- [ ] Have I found at least 5 visually powerful moments?
|
||||
- [ ] Do I understand what to keep, simplify, expand, and omit?
|
||||
- [ ] Have I identified the emotional peaks and valleys?
|
||||
@@ -0,0 +1,101 @@
|
||||
# chalk
|
||||
|
||||
粉笔画风 - Chalkboard aesthetic with hand-drawn warmth
|
||||
|
||||
## Overview
|
||||
|
||||
Classic classroom chalkboard aesthetic with hand-drawn chalk illustrations. Nostalgic educational feel with imperfect, sketchy lines that capture the warmth of traditional teaching.
|
||||
|
||||
## Line Work
|
||||
|
||||
- Sketchy, imperfect hand-drawn lines
|
||||
- Chalk texture on all strokes
|
||||
- Varying line weight from chalk pressure
|
||||
- Soft edges, no sharp digital lines
|
||||
- Visible chalk dust effects
|
||||
|
||||
## Character Design
|
||||
|
||||
- Simplified, friendly character designs
|
||||
- Stick figures to semi-detailed range
|
||||
- Expressive through simple gestures
|
||||
- Approachable, non-intimidating
|
||||
- Educational presenter style
|
||||
|
||||
## Background
|
||||
|
||||
- Chalkboard Black (#1A1A1A) or Dark Green-Black (#1C2B1C)
|
||||
- Realistic chalkboard texture
|
||||
- Subtle scratches and dust particles
|
||||
- Faint eraser marks for authenticity
|
||||
- Wooden frame border optional
|
||||
|
||||
## Typography
|
||||
|
||||
- Hand-drawn chalk lettering style
|
||||
- Visible chalk texture on text
|
||||
- Imperfect baseline adds authenticity
|
||||
- White or bright colored chalk for emphasis
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Hand-drawn chalk illustrations
|
||||
- Chalk dust effects around elements
|
||||
- Doodles: stars, arrows, underlines, circles
|
||||
- Mathematical formulas and diagrams
|
||||
- Eraser smudges and chalk residue
|
||||
- Stick figures and simple icons
|
||||
- Connection lines with hand-drawn feel
|
||||
|
||||
## Default Color Palette
|
||||
|
||||
| Role | Color | Hex |
|
||||
|------|-------|-----|
|
||||
| Background | Chalkboard Black | #1A1A1A |
|
||||
| Alt Background | Green-Black | #1C2B1C |
|
||||
| Primary Text | Chalk White | #F5F5F5 |
|
||||
| Accent 1 | Chalk Yellow | #FFE566 |
|
||||
| Accent 2 | Chalk Pink | #FF9999 |
|
||||
| Accent 3 | Chalk Blue | #66B3FF |
|
||||
| Accent 4 | Chalk Green | #90EE90 |
|
||||
| Accent 5 | Chalk Orange | #FFB366 |
|
||||
|
||||
## Style Rules
|
||||
|
||||
### Do
|
||||
- Maintain authentic chalk texture on all elements
|
||||
- Use imperfect, hand-drawn quality throughout
|
||||
- Add subtle chalk dust and smudge effects
|
||||
- Create visual hierarchy with color variety
|
||||
- Include playful doodles and annotations
|
||||
|
||||
### Don't
|
||||
- Use perfect geometric shapes
|
||||
- Create clean digital-looking lines
|
||||
- Add photorealistic elements
|
||||
- Use gradients or glossy effects
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Authentic chalk texture throughout
|
||||
- ✓ Imperfect, hand-drawn quality
|
||||
- ✓ Readable despite sketchy style
|
||||
- ✓ Nostalgic classroom feel
|
||||
- ✓ Effective color hierarchy
|
||||
- ✓ Playful educational aesthetic
|
||||
|
||||
## Compatibility
|
||||
|
||||
| Tone | Fit | Notes |
|
||||
|------|-----|-------|
|
||||
| neutral | ✓✓ | Classic educational |
|
||||
| warm | ✓✓ | Nostalgic feel |
|
||||
| dramatic | ✗ | Style mismatch |
|
||||
| vintage | ✓ | Old school feel |
|
||||
| romantic | ✗ | Style mismatch |
|
||||
| energetic | ✓✓ | Fun learning |
|
||||
| action | ✗ | Style mismatch |
|
||||
|
||||
## Best For
|
||||
|
||||
Educational content, tutorials, classroom themes, teaching materials, workshops, informal learning, knowledge sharing
|
||||
@@ -0,0 +1,97 @@
|
||||
# ink-brush
|
||||
|
||||
水墨画风 - Chinese ink brush aesthetics with dynamic strokes
|
||||
|
||||
## Overview
|
||||
|
||||
Traditional Chinese ink brush painting style adapted for comics. Combines calligraphic brush strokes with ink wash effects. Creates atmospheric, artistic visuals rooted in East Asian aesthetics.
|
||||
|
||||
## Line Work
|
||||
|
||||
- 2-3px dynamic brush strokes with varying weight
|
||||
- Ink wash effects, traditional Chinese brush feel
|
||||
- Bold, confident strokes with sharp edges
|
||||
- Flowing lines for fabric and hair
|
||||
- Pressure-sensitive stroke variation
|
||||
|
||||
## Character Design
|
||||
|
||||
- Realistic human proportions (7.5-8 head heights)
|
||||
- Defined features with ink brush definition
|
||||
- Dynamic poses capturing movement
|
||||
- Flowing hair and clothing in motion
|
||||
- Traditional attire options (robes, hanfu)
|
||||
- Intense, expressive faces
|
||||
|
||||
## Brush Techniques
|
||||
|
||||
| Technique | Usage |
|
||||
|-----------|-------|
|
||||
| Bold strokes | Character outlines |
|
||||
| Fine lines | Details, hair |
|
||||
| Ink wash | Atmosphere, shadows |
|
||||
| Dry brush | Texture, aging |
|
||||
| Splatter | Impact, drama |
|
||||
|
||||
## Background Treatment
|
||||
|
||||
- Dramatic landscapes: mountains, waterfalls, temples
|
||||
- Ink wash atmospheric effects
|
||||
- Misty, layered depth
|
||||
- Traditional architecture elements
|
||||
- High contrast silhouettes
|
||||
- Negative space as design element
|
||||
|
||||
## Color Approach
|
||||
|
||||
- Ink gradients as primary
|
||||
- Limited accent colors
|
||||
- Traditional Chinese palette
|
||||
- Atmospheric color washes
|
||||
- High contrast compositions
|
||||
|
||||
## Default Color Palette
|
||||
|
||||
| Role | Color | Hex |
|
||||
|------|-------|-----|
|
||||
| Primary | Deep black ink | #1A1A1A |
|
||||
| Accent | Crimson red | #8B0000 |
|
||||
| Accent | Imperial gold | #D4AF37 |
|
||||
| Skin | Natural tan | #D4A574 |
|
||||
| Background | Misty gray | #9CA3AF |
|
||||
| Background | Earth tone | #8B7355 |
|
||||
| Wash | Ink gradient | #2D3748 |
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Calligraphic text integration
|
||||
- Seal stamps (optional)
|
||||
- Ink splatter effects
|
||||
- Flowing fabric trails
|
||||
- Atmospheric mist
|
||||
- Mountain silhouettes
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Dynamic brush stroke quality
|
||||
- ✓ Authentic ink wash atmosphere
|
||||
- ✓ High contrast compositions
|
||||
- ✓ Flowing movement in fabric/hair
|
||||
- ✓ Traditional aesthetic elements
|
||||
- ✓ Atmospheric depth
|
||||
|
||||
## Compatibility
|
||||
|
||||
| Tone | Fit | Notes |
|
||||
|------|-----|-------|
|
||||
| neutral | ✓ | Contemplative stories |
|
||||
| warm | ✓ | Nostalgic, gentle |
|
||||
| dramatic | ✓✓ | High contrast |
|
||||
| vintage | ✓✓ | Historical pieces |
|
||||
| romantic | ✗ | Style mismatch |
|
||||
| energetic | ✗ | Too refined |
|
||||
| action | ✓✓ | Martial arts |
|
||||
|
||||
## Best For
|
||||
|
||||
Chinese historical stories, martial arts, traditional tales, contemplative narratives, artistic adaptations
|
||||
+75
@@ -0,0 +1,75 @@
|
||||
# ligne-claire
|
||||
|
||||
清线画风 - Uniform lines, flat colors, European comic tradition
|
||||
|
||||
## Overview
|
||||
|
||||
Classic European comic style originating from Hergé's Tintin. Characterized by clean, uniform outlines and flat color fills without gradients. Creates a timeless, accessible aesthetic suitable for educational and narrative content.
|
||||
|
||||
## Line Work
|
||||
|
||||
- Uniform, clean outlines with consistent weight (2px)
|
||||
- No hatching or cross-hatching for shading
|
||||
- Sharp, precise edges on all elements
|
||||
- Black ink outlines on all figures and objects
|
||||
- Shadows indicated through flat color areas, not line techniques
|
||||
|
||||
## Character Design
|
||||
|
||||
- Slightly stylized/cartoonish characters with realistic proportions
|
||||
- Distinctive, recognizable facial features
|
||||
- Expressive faces with clear emotions
|
||||
- Period-appropriate clothing with attention to detail
|
||||
- Consistent character appearance across panels
|
||||
- 6-7 head height proportions
|
||||
|
||||
## Background Treatment
|
||||
|
||||
- Detailed, realistic backgrounds with architectural accuracy
|
||||
- Period-specific props and technology
|
||||
- Clear spatial depth and perspective
|
||||
- Environmental storytelling through details
|
||||
- Contrast between simplified characters and detailed backgrounds
|
||||
|
||||
## Color Approach
|
||||
|
||||
- Flat colors without gradients (true to Ligne Claire tradition)
|
||||
- Limited palette per page for cohesion
|
||||
- Colors support narrative mood
|
||||
- Consistent lighting logic within scenes
|
||||
|
||||
## Default Color Palette
|
||||
|
||||
| Role | Color | Hex |
|
||||
|------|-------|-----|
|
||||
| Primary Blue | Clean blue | #3182CE |
|
||||
| Primary Red | Classic red | #E53E3E |
|
||||
| Primary Yellow | Warm yellow | #ECC94B |
|
||||
| Skin | Warm tan | #F7CFAE |
|
||||
| Background Light | Light cream | #FFFAF0 |
|
||||
| Background Sky | Sky blue | #BEE3F8 |
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Clean, uniform line weight throughout
|
||||
- ✓ Flat colors without gradients
|
||||
- ✓ Detailed backgrounds, stylized characters
|
||||
- ✓ Clear panel borders and reading flow
|
||||
- ✓ Hand-drawn text style
|
||||
- ✓ Proper perspective in environments
|
||||
|
||||
## Compatibility
|
||||
|
||||
| Tone | Fit | Notes |
|
||||
|------|-----|-------|
|
||||
| neutral | ✓✓ | Classic combination |
|
||||
| warm | ✓✓ | Nostalgic stories |
|
||||
| dramatic | ✓ | Works with high contrast |
|
||||
| vintage | ✓ | Period pieces |
|
||||
| romantic | ✗ | Style mismatch |
|
||||
| energetic | ✓ | Lighter stories |
|
||||
| action | ✗ | Lacks dynamic lines |
|
||||
|
||||
## Best For
|
||||
|
||||
Educational content, balanced narratives, biography comics, historical stories
|
||||
@@ -0,0 +1,93 @@
|
||||
# manga
|
||||
|
||||
日漫画风 - Anime/manga aesthetics with expressive characters
|
||||
|
||||
## Overview
|
||||
|
||||
Japanese manga art style characterized by large expressive eyes, dynamic poses, and visual emotion indicators. Versatile style that works across genres from educational to romantic to action.
|
||||
|
||||
## Line Work
|
||||
|
||||
- Clean, smooth lines (1.5-2px)
|
||||
- Expressive weight variation for emphasis
|
||||
- Smooth curves, dynamic strokes
|
||||
- Speed lines and motion effects available
|
||||
- Screen tone effects for atmosphere
|
||||
|
||||
## Character Design
|
||||
|
||||
- Anime/manga proportions: larger eyes, expressive faces
|
||||
- 5-7 head height proportions (varies by sub-style)
|
||||
- Clear emotional indicators (!, ?, sweat drops, sparkles)
|
||||
- Dynamic poses and gestures
|
||||
- Detailed hair with individual strands
|
||||
- Fashionable clothing with natural folds
|
||||
|
||||
## Eye Styles
|
||||
|
||||
| Type | Description |
|
||||
|------|-------------|
|
||||
| Standard | Medium-large, 2-3 highlights |
|
||||
| Educational | Friendly, approachable eyes |
|
||||
| Dramatic | Intense, detailed irises |
|
||||
| Cute | Very large, sparkly eyes |
|
||||
|
||||
## Background Treatment
|
||||
|
||||
- Simplified during dialogue/explanation
|
||||
- Detailed for establishing shots
|
||||
- Screen tone gradients for mood
|
||||
- Abstract backgrounds for emotional moments
|
||||
- Technical diagrams styled as displays
|
||||
|
||||
## Color Approach
|
||||
|
||||
- Clean, bright anime colors
|
||||
- Soft gradients on skin
|
||||
- Vibrant palette options
|
||||
- Light and shadow with soft transitions
|
||||
- Color coding for character identification
|
||||
|
||||
## Default Color Palette
|
||||
|
||||
| Role | Color | Hex |
|
||||
|------|-------|-----|
|
||||
| Primary Blue | Bright blue | #4299E1 |
|
||||
| Primary Orange | Warm orange | #ED8936 |
|
||||
| Primary Green | Soft green | #68D391 |
|
||||
| Skin | Anime warm | #FEEBC8 |
|
||||
| Background | Clean white | #FFFFFF |
|
||||
| Highlight | Golden | #FFD700 |
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Speech bubbles: rounded (normal), spiky (excitement)
|
||||
- Sound effects integrated visually
|
||||
- Emotion symbols (sweat drops, anger marks, hearts)
|
||||
- Speed lines and motion blur
|
||||
- Sparkle and glow effects
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Expressive character faces
|
||||
- ✓ Clean, consistent line work
|
||||
- ✓ Dynamic poses and compositions
|
||||
- ✓ Appropriate use of manga conventions
|
||||
- ✓ Readable panel flow
|
||||
- ✓ Consistent character designs
|
||||
|
||||
## Compatibility
|
||||
|
||||
| Tone | Fit | Notes |
|
||||
|------|-----|-------|
|
||||
| neutral | ✓✓ | Educational manga |
|
||||
| warm | ✓ | Slice of life |
|
||||
| dramatic | ✓ | Intense moments |
|
||||
| romantic | ✓✓ | Shoujo style |
|
||||
| energetic | ✓✓ | Shonen style |
|
||||
| vintage | ✗ | Style mismatch |
|
||||
| action | ✓✓ | Battle manga |
|
||||
|
||||
## Best For
|
||||
|
||||
Educational tutorials, romance, action, coming-of-age, technical explanations, youth-oriented content
|
||||
@@ -0,0 +1,84 @@
|
||||
# minimalist
|
||||
|
||||
极简画风 - Clean black line art, limited spot color, simplified stick-figure characters
|
||||
|
||||
## Overview
|
||||
|
||||
Minimalist cartoon illustration characterized by clean black line art on white background with very limited spot color for emphasis. Characters are simplified to near-stick-figure abstraction, focusing on gesture and concept rather than anatomical detail. Designed for business allegory, quick-read educational content, and concept illustration.
|
||||
|
||||
## Line Work
|
||||
|
||||
- Clean, uniform black lines (1.5-2px)
|
||||
- No hatching, cross-hatching, or shading techniques
|
||||
- Minimal detail — every line serves a purpose
|
||||
- Bold outlines for characters, thinner lines for props/labels
|
||||
- No decorative flourishes or ornamental lines
|
||||
|
||||
## Character Design
|
||||
|
||||
- Highly simplified, stick-figure-like business characters
|
||||
- Circle or oval heads with minimal facial features (dot eyes, simple line mouth)
|
||||
- Body as simple geometric shapes or line constructions
|
||||
- Distinguishing features through props only (tie, hat, briefcase, glasses)
|
||||
- No anatomical detail — expressive through posture and gesture
|
||||
- 4-5 head height proportions (squat, iconic)
|
||||
|
||||
## Background Treatment
|
||||
|
||||
- Mostly blank/white — negative space is a design element
|
||||
- Minimal environmental cues (a line for ground, simple desk outline)
|
||||
- Concept labels and text annotations replace detailed environments
|
||||
- Icons and symbols over realistic rendering
|
||||
- No perspective or spatial depth
|
||||
|
||||
## Color Approach
|
||||
|
||||
- Primarily black and white (90%+ of the image)
|
||||
- 1-2 spot accent colors for emphasis on key concepts
|
||||
- Accent color used sparingly: highlighting key objects, text labels, concept indicators
|
||||
- No gradients, no shading, no color fills on backgrounds
|
||||
- Color draws the eye to the most important element in each panel
|
||||
|
||||
## Default Color Palette
|
||||
|
||||
| Role | Color | Hex |
|
||||
|------|-------|-----|
|
||||
| Primary | Black ink | `#1A1A1A` |
|
||||
| Background | Clean white | `#FFFFFF` |
|
||||
| Accent 1 | Spot orange | `#FF6B35` |
|
||||
| Accent 2 | Spot blue (optional) | `#3182CE` |
|
||||
| Text labels | Dark gray | `#4A4A4A` |
|
||||
| Panel border | Medium gray | `#666666` |
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Text labels with accent-color backgrounds or underlines for key terms
|
||||
- Simple icons: arrows, circles, checkmarks, crosses
|
||||
- Concept highlight boxes with spot color
|
||||
- Minimal speech bubbles (simple oval or rectangle, thin black outline)
|
||||
- No sound effects, no motion lines, no screen tones
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Clean, purposeful line work with no unnecessary detail
|
||||
- ✓ 90%+ black-and-white with strategic spot color
|
||||
- ✓ Simplified characters readable at small sizes
|
||||
- ✓ Text labels integrated naturally into panels
|
||||
- ✓ Strong negative space usage
|
||||
- ✓ Every element serves the narrative point
|
||||
|
||||
## Compatibility
|
||||
|
||||
| Tone | Fit | Notes |
|
||||
|------|-----|-------|
|
||||
| neutral | ✓✓ | Ideal for business/educational content |
|
||||
| warm | ✓ | Works for gentle stories, slight warmth in accent |
|
||||
| energetic | ✓ | Works for punchy, high-energy content |
|
||||
| dramatic | ✗ | Style too stripped down for dramatic intensity |
|
||||
| vintage | ✗ | Minimalist aesthetic conflicts with aged/textured look |
|
||||
| romantic | ✗ | No capacity for decorative/soft elements |
|
||||
| action | ✗ | No dynamic line capability for speed/impact |
|
||||
|
||||
## Best For
|
||||
|
||||
Business allegory, management fables, short concept illustration, four-panel comic strips, quick-insight education, social media content
|
||||
@@ -0,0 +1,89 @@
|
||||
# realistic
|
||||
|
||||
写实画风 - Digital painting with realistic proportions and lighting
|
||||
|
||||
## Overview
|
||||
|
||||
Full-color realistic manga style using digital painting techniques. Features anatomically accurate characters, rich gradients, and detailed environmental rendering. Sophisticated aesthetic for mature audiences.
|
||||
|
||||
## Line Work
|
||||
|
||||
- Clean, precise outlines with clear contours
|
||||
- Uniform line weight for character definition
|
||||
- No excessive hatching - rely on color for depth
|
||||
- Smooth curves and realistic anatomical lines
|
||||
- Ligne Claire influence: clean but not simplified
|
||||
|
||||
## Character Design
|
||||
|
||||
- Realistic human proportions (7-8 head heights)
|
||||
- Anatomically accurate features and expressions
|
||||
- Detailed facial structure without exaggeration
|
||||
- Natural poses and body language
|
||||
- Consistent appearance across panels
|
||||
- Subtle expressions rather than manga-style
|
||||
|
||||
## Rendering Style
|
||||
|
||||
- Full-color digital painting with rich gradients
|
||||
- Soft shadow transitions on skin and fabric
|
||||
- Realistic material textures (glass, liquid, fabric, wood)
|
||||
- Detailed hair with natural shine and volume
|
||||
- Environmental lighting affects all elements
|
||||
- NOT flat cel-shading - smooth color blending
|
||||
|
||||
## Background Treatment
|
||||
|
||||
- Highly detailed, realistic environments
|
||||
- Accurate perspective and spatial depth
|
||||
- Atmospheric lighting (warm indoor, cool outdoor)
|
||||
- Professional settings rendered with precision
|
||||
- Props and objects with realistic textures
|
||||
|
||||
## Color Approach
|
||||
|
||||
- Rich gradients for depth and volume
|
||||
- Realistic lighting with warm/cool contrast
|
||||
- Material-specific rendering
|
||||
- Subtle color temperature shifts
|
||||
- Professional, sophisticated palette
|
||||
|
||||
## Default Color Palette
|
||||
|
||||
| Role | Color | Hex |
|
||||
|------|-------|-----|
|
||||
| Skin Light | Natural warm | #F5D6C6 |
|
||||
| Skin Shadow | Warm shadow | #E8C4B0 |
|
||||
| Environment | Warm wood | #8B7355 |
|
||||
| Environment Cool | Cool stone | #9CA3AF |
|
||||
| Accent | Wine red | #722F37 |
|
||||
| Accent Gold | Gold | #D4AF37 |
|
||||
| Light Warm | Amber | #FFB347 |
|
||||
| Light Cool | Cool blue | #B0C4DE |
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Anatomically accurate proportions
|
||||
- ✓ Smooth color gradients (not flat fills)
|
||||
- ✓ Realistic material textures
|
||||
- ✓ Detailed, atmospheric backgrounds
|
||||
- ✓ Natural lighting with soft shadows
|
||||
- ✓ Expressive but subtle expressions
|
||||
- ✓ Professional aesthetic
|
||||
- ✓ Clean speech bubbles
|
||||
|
||||
## Compatibility
|
||||
|
||||
| Tone | Fit | Notes |
|
||||
|------|-----|-------|
|
||||
| neutral | ✓✓ | Professional content |
|
||||
| warm | ✓✓ | Nostalgic stories |
|
||||
| dramatic | ✓✓ | High drama |
|
||||
| vintage | ✓✓ | Period pieces |
|
||||
| romantic | ✗ | Style mismatch |
|
||||
| energetic | ✗ | Too refined |
|
||||
| action | ✓ | Serious action |
|
||||
|
||||
## Best For
|
||||
|
||||
Professional topics (wine, food, business), lifestyle content, adult narratives, documentary-style, mature educational guides
|
||||
@@ -0,0 +1,71 @@
|
||||
# Auto Selection
|
||||
|
||||
Content signals determine default art + tone + layout (or preset).
|
||||
|
||||
## Content Signal Matrix
|
||||
|
||||
| Content Signals | Art Style | Tone | Layout | Preset |
|
||||
|-----------------|-----------|------|--------|--------|
|
||||
| Tutorial, how-to, beginner | manga | neutral | webtoon | **ohmsha** |
|
||||
| Computing, AI, programming | manga | neutral | dense | **ohmsha** |
|
||||
| Technical explanation, educational | manga | neutral | webtoon | **ohmsha** |
|
||||
| Pre-1950, classical, ancient | realistic | vintage | cinematic | - |
|
||||
| Personal story, mentor | ligne-claire | warm | standard | - |
|
||||
| Psychology, motivation, self-help, coaching | manga | warm | standard | **concept-story** |
|
||||
| Business narrative, management, leadership | manga | warm | standard | **concept-story** |
|
||||
| Conflict, breakthrough | (inherit) | dramatic | splash | - |
|
||||
| Wine, food, lifestyle | realistic | neutral | cinematic | - |
|
||||
| Martial arts, wuxia, xianxia | ink-brush | action | splash | **wuxia** |
|
||||
| Romance, love, school life | manga | romantic | standard | **shoujo** |
|
||||
| Business allegory, fable, parable, short insight, 四格 | minimalist | neutral | four-panel | **four-panel** |
|
||||
| Biography, balanced | ligne-claire | neutral | mixed | - |
|
||||
|
||||
## Preset Recommendation Rules
|
||||
|
||||
**When preset is recommended**: Load `presets/{preset}.md` and apply all special rules.
|
||||
|
||||
### ohmsha
|
||||
- **Triggers**: Tutorial, technical, educational, computing, programming, how-to, beginner
|
||||
- **Special rules**: Visual metaphors, NO talking heads, gadget reveals, Doraemon-style characters
|
||||
- **Base**: manga + neutral + webtoon/dense
|
||||
|
||||
### wuxia
|
||||
- **Triggers**: Martial arts, wuxia, xianxia, cultivation, swordplay
|
||||
- **Special rules**: Qi effects, combat visuals, atmospheric elements
|
||||
- **Base**: ink-brush + action + splash
|
||||
|
||||
### shoujo
|
||||
- **Triggers**: Romance, love story, school life, emotional drama
|
||||
- **Special rules**: Decorative elements, eye details, romantic beats
|
||||
- **Base**: manga + romantic + standard
|
||||
|
||||
### concept-story
|
||||
- **Triggers**: Psychology, motivation, self-help, business narrative, management, leadership, personal growth, coaching, soft skills, abstract concept through story
|
||||
- **Special rules**: Visual symbol system, growth arc, dialogue+action balance, original characters
|
||||
- **Base**: manga + warm + standard
|
||||
|
||||
### four-panel
|
||||
- **Triggers**: Business allegory, fable, parable, short insight, four-panel, 四格, 四格漫画, single-page comic, minimalist comic strip
|
||||
- **Special rules**: Strict 起承转合 4-panel structure, B&W + spot color, simplified stick-figure characters, single-page story
|
||||
- **Base**: minimalist + neutral + four-panel
|
||||
|
||||
## Compatibility Matrix
|
||||
|
||||
Art Style × Tone combinations work best when matched appropriately:
|
||||
|
||||
| Art Style | ✓✓ Best | ✓ Works | ✗ Avoid |
|
||||
|-----------|---------|---------|---------|
|
||||
| ligne-claire | neutral, warm | dramatic, vintage, energetic | romantic, action |
|
||||
| manga | neutral, romantic, energetic, action | warm, dramatic | vintage |
|
||||
| realistic | neutral, warm, dramatic, vintage | action | romantic, energetic |
|
||||
| ink-brush | neutral, dramatic, action, vintage | warm | romantic, energetic |
|
||||
| chalk | neutral, warm, energetic | vintage | dramatic, action, romantic |
|
||||
| minimalist | neutral | warm, energetic | dramatic, vintage, romantic, action |
|
||||
|
||||
**Note**: Art Style × Tone × Layout can be freely combined. Incompatible combinations work but may produce unexpected results.
|
||||
|
||||
## Priority Order
|
||||
|
||||
1. User-specified options (art / tone / style)
|
||||
2. Content signal analysis → auto-selection
|
||||
3. Fallback: ligne-claire + neutral + standard
|
||||
@@ -0,0 +1,98 @@
|
||||
Create a knowledge biography comic page following these guidelines:
|
||||
|
||||
## Image Specifications
|
||||
|
||||
- **Type**: Comic book page with multiple panels
|
||||
- **Orientation**: Portrait (vertical)
|
||||
- **Aspect Ratio**: 2:3
|
||||
- **Style**: See style-specific reference for visual guidelines
|
||||
|
||||
## Panel Structure
|
||||
|
||||
### Panel Borders
|
||||
- Clean black lines (1-2px) around each panel
|
||||
- White gutters between panels (8-12px)
|
||||
- Panels arranged for clear reading flow
|
||||
- Variety in panel sizes for visual rhythm
|
||||
|
||||
### Panel Composition
|
||||
- Clear focal points in each panel
|
||||
- Proper use of foreground, midground, background
|
||||
- Camera angles vary: eye level, bird's eye, low angle, close-up, wide shot
|
||||
- Action flows logically between panels
|
||||
- Negative space used intentionally
|
||||
|
||||
## Text Elements
|
||||
|
||||
### Speech Bubbles
|
||||
- **Dialogue**: Oval/elliptical bubbles with pointed tails
|
||||
- White fill with thin black outline
|
||||
- Tail points clearly to speaker
|
||||
- Hand-lettered style font (not computer-generated)
|
||||
|
||||
### Narrator Boxes
|
||||
- **Fourth Wall/Narrator**: Rectangular boxes
|
||||
- Often positioned at panel edges (top or bottom)
|
||||
- Slightly different fill color (cream or light yellow)
|
||||
- Used for commentary, time jumps, explanations
|
||||
|
||||
### Thought Bubbles
|
||||
- Cloud-shaped with bubble trail leading to thinker
|
||||
- Softer outline than speech bubbles
|
||||
- For internal monologue
|
||||
|
||||
### Caption Bars
|
||||
- Rectangular bars at panel edges
|
||||
- Time and place information
|
||||
- "Meanwhile...", "Three years later..." type transitions
|
||||
- Darker fill with white text, or vice versa
|
||||
|
||||
### Typography
|
||||
- Hand-drawn lettering style throughout
|
||||
- Bold for emphasis and key terms
|
||||
- Consistent letter sizing
|
||||
- Chinese text: use full-width punctuation "",。!
|
||||
- Clear hierarchy: titles > dialogue > captions
|
||||
|
||||
## Scientific/Concept Visualization
|
||||
|
||||
When depicting abstract concepts:
|
||||
|
||||
| Concept | Visual Metaphor |
|
||||
|---------|----------------|
|
||||
| Neural networks | Glowing nodes connected by clean lines |
|
||||
| Data flow | Luminous particles along simple paths |
|
||||
| Algorithms | Geometric patterns, building blocks |
|
||||
| Logic/proof | Interlocking puzzle pieces |
|
||||
| Discovery | Light breaking through darkness |
|
||||
| Uncertainty | Forking paths, question marks |
|
||||
| Time | Clock motifs, calendar pages |
|
||||
|
||||
- Integrate diagrams naturally into narrative panels
|
||||
- Use inset panels or thought-bubble style for explanations
|
||||
- Simplified iconography over realistic depiction
|
||||
|
||||
## Fourth Wall / Narrator Character
|
||||
|
||||
When depicting narrator characters addressing the reader:
|
||||
- Character may look directly out of panel
|
||||
- Can appear in "present day" framing scenes
|
||||
- Distinct visual treatment from main timeline
|
||||
- Often at page edges or in dedicated panels
|
||||
- May comment on or question the events shown
|
||||
|
||||
## Historical Accuracy
|
||||
|
||||
- Research period-specific details: costumes, technology, architecture
|
||||
- Show aging naturally for characters across time periods
|
||||
- Iconic items and locations rendered recognizably
|
||||
- Balance accuracy with stylization
|
||||
|
||||
## Language
|
||||
|
||||
- All text in Chinese (中文) unless source material is in another language
|
||||
- Use Chinese full-width punctuation: "",。!
|
||||
|
||||
---
|
||||
|
||||
Please generate the comic page based on the content provided below:
|
||||
@@ -0,0 +1,180 @@
|
||||
# Character Definition Template
|
||||
|
||||
## Character Document Format
|
||||
|
||||
Create `characters/characters.md` with the following structure:
|
||||
|
||||
```markdown
|
||||
# Character Definitions - [Comic Title]
|
||||
|
||||
**Style**: [selected style]
|
||||
**Art Direction**: [Ligne Claire / Manga / etc.]
|
||||
|
||||
---
|
||||
|
||||
## Character 1: [Name]
|
||||
|
||||
**Role**: [Protagonist / Mentor / Antagonist / Narrator]
|
||||
**Age**: [approximate age or age range in story]
|
||||
|
||||
**Appearance**:
|
||||
- Face shape: [oval/square/round]
|
||||
- Hair: [color, style, length]
|
||||
- Eyes: [color, shape, distinctive features]
|
||||
- Build: [height, body type]
|
||||
- Distinguishing features: [glasses, beard, scar, etc.]
|
||||
|
||||
**Costume**:
|
||||
- Default outfit: [detailed description]
|
||||
- Color palette: [primary colors for this character]
|
||||
- Accessories: [hat, bag, tools, etc.]
|
||||
|
||||
**Expression Range**:
|
||||
- Neutral: [description]
|
||||
- Happy/Excited: [description]
|
||||
- Thinking/Confused: [description]
|
||||
- Determined: [description]
|
||||
|
||||
**Visual Reference Notes**:
|
||||
[Any specific artistic direction]
|
||||
|
||||
---
|
||||
|
||||
## Character 2: [Name]
|
||||
...
|
||||
```
|
||||
|
||||
## Reference Sheet Image Prompt
|
||||
|
||||
After character definitions, include a prompt for generating the reference sheet:
|
||||
|
||||
```markdown
|
||||
## Reference Sheet Prompt
|
||||
|
||||
Character reference sheet in [style] style, clean lines, flat colors:
|
||||
|
||||
[ROW 1 - Character Name]:
|
||||
- Front view: [detailed description]
|
||||
- 3/4 view: [description]
|
||||
- Expression sheet: Neutral | Happy | Focused | Worried
|
||||
|
||||
[ROW 2 - Character Name]:
|
||||
...
|
||||
|
||||
COLOR PALETTE:
|
||||
- [Character 1]: [colors]
|
||||
- [Character 2]: [colors]
|
||||
|
||||
White background, clear labels under each character.
|
||||
```
|
||||
|
||||
## Example: Turing Biography
|
||||
|
||||
```markdown
|
||||
# Character Definitions - The Imitation Game
|
||||
|
||||
**Style**: classic (Ligne Claire)
|
||||
**Art Direction**: Clean lines, muted colors, period-accurate details
|
||||
|
||||
---
|
||||
|
||||
## Character 1: Alan Turing
|
||||
|
||||
**Role**: Protagonist
|
||||
**Age**: 25-40 (varies across story)
|
||||
|
||||
**Appearance**:
|
||||
- Face shape: Oval, slightly angular
|
||||
- Hair: Dark brown, wavy, slightly disheveled
|
||||
- Eyes: Deep-set, intense gaze
|
||||
- Build: Tall, lean, slightly awkward posture
|
||||
- Distinguishing features: Prominent brow, thoughtful expression
|
||||
|
||||
**Costume**:
|
||||
- Default outfit: Tweed jacket with elbow patches, white shirt, no tie
|
||||
- Color palette: Muted browns, navy blue, cream
|
||||
- Accessories: Occasionally a pipe, papers/notebooks
|
||||
|
||||
**Expression Range**:
|
||||
- Neutral: Thoughtful, slightly distant
|
||||
- Happy/Excited: Eureka moment, eyes bright, subtle smile
|
||||
- Thinking/Confused: Furrowed brow, looking at abstract space
|
||||
- Determined: Jaw set, focused eyes
|
||||
|
||||
---
|
||||
|
||||
## Character 2: The Bombe Machine
|
||||
|
||||
**Role**: Supporting (anthropomorphized)
|
||||
**Appearance**:
|
||||
- Large brass and wood cabinet
|
||||
- Dial "eyes" that can express states
|
||||
- Paper tape "mouth"
|
||||
- Indicator lights for emotions
|
||||
|
||||
**Expression Range**:
|
||||
- Processing: Spinning dials, humming
|
||||
- Success: Lights up warmly
|
||||
- Stuck: Smoke wisps, stuttering
|
||||
|
||||
---
|
||||
|
||||
## Reference Sheet Prompt
|
||||
|
||||
Character reference sheet in Ligne Claire style, clean lines, flat colors:
|
||||
|
||||
TOP ROW - Alan Turing:
|
||||
- Front view: Young man, 30s, short dark wavy hair, thoughtful expression, wearing tweed jacket with elbow patches, white shirt
|
||||
- 3/4 view: Same character, slight smile, showing profile of nose
|
||||
- Expression sheet: Neutral | Excited (eureka moment) | Focused (working) | Worried
|
||||
|
||||
BOTTOM ROW - The Bombe Machine (anthropomorphized):
|
||||
- Bombe machine as character: Large, brass and wood, dial "eyes", paper tape "mouth"
|
||||
- Expressions: Processing (spinning dials) | Success (lights up) | Stuck (smoke wisps)
|
||||
|
||||
COLOR PALETTE:
|
||||
- Turing: Muted browns (#8B7355), navy blue (#2C3E50), cream (#F5F5DC)
|
||||
- Machine: Brass (#B5A642), mahogany (#4E2728), emerald indicators (#2ECC71)
|
||||
|
||||
White background, clear labels under each character.
|
||||
```
|
||||
|
||||
## Handling Age Variants
|
||||
|
||||
For biographies spanning many years, define age variants:
|
||||
|
||||
```markdown
|
||||
## Alan Turing - Age Variants
|
||||
|
||||
### Young (1920s, age 10-18)
|
||||
- Boyish features, round face
|
||||
- School uniform (Sherborne)
|
||||
- Curious, eager expression
|
||||
|
||||
### Adult (1930s-40s, age 25-35)
|
||||
- Angular face, defined jaw
|
||||
- Tweed jacket, rumpled appearance
|
||||
- Intense, focused expression
|
||||
|
||||
### Later (1950s, age 40+)
|
||||
- Slightly weathered
|
||||
- More casual dress
|
||||
- Thoughtful, sometimes melancholic
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
| Practice | Description |
|
||||
|----------|-------------|
|
||||
| Be specific | "Short dark wavy hair, parted left" not just "dark hair" |
|
||||
| Use distinguishing features | Glasses, scars, accessories that identify character |
|
||||
| Define color codes | Use specific color names or hex codes |
|
||||
| Include age markers | Wrinkles, posture, clothing style matching era |
|
||||
| Reference real people | For historical figures, note "based on 1940s photographs" |
|
||||
|
||||
## Why Character Reference Matters
|
||||
|
||||
Without unified character definition, AI generates inconsistent appearances. The reference sheet provides:
|
||||
1. Visual anchors for consistent features
|
||||
2. Color palettes for consistent coloring
|
||||
3. Expression documentation for emotional portrayals
|
||||
@@ -0,0 +1,23 @@
|
||||
# cinematic
|
||||
|
||||
Wide panels, filmic feel
|
||||
|
||||
## Panel Structure
|
||||
|
||||
- **Panels per page**: 2-4
|
||||
- **Structure**: Horizontal emphasis, wide aspect panels
|
||||
- **Gutters**: Generous spacing (12-15px)
|
||||
|
||||
## Grid Configuration
|
||||
|
||||
- 1-2 columns, horizontal emphasis
|
||||
- Panel sizes: Wide aspect ratios (3:1, 4:1)
|
||||
- Reading flow: Horizontal sweep, filmic rhythm
|
||||
|
||||
## Best For
|
||||
|
||||
Establishing shots, dramatic moments, landscapes
|
||||
|
||||
## Best Style Pairings
|
||||
|
||||
dramatic, classic, sepia
|
||||
@@ -0,0 +1,23 @@
|
||||
# dense
|
||||
|
||||
Information-rich, educational focus
|
||||
|
||||
## Panel Structure
|
||||
|
||||
- **Panels per page**: 6-9
|
||||
- **Structure**: Compact grid, smaller panels
|
||||
- **Gutters**: Tight spacing (4-6px)
|
||||
|
||||
## Grid Configuration
|
||||
|
||||
- 3 columns × 3 rows
|
||||
- Panel sizes: Compact, uniform
|
||||
- Reading flow: Rapid progression, information-rich
|
||||
|
||||
## Best For
|
||||
|
||||
Technical explanations, complex narratives, timelines
|
||||
|
||||
## Best Style Pairings
|
||||
|
||||
ohmsha, vibrant
|
||||
@@ -0,0 +1,40 @@
|
||||
# four-panel
|
||||
|
||||
四格漫画 - Strict 2×2 grid, single-page story
|
||||
|
||||
## Panel Structure
|
||||
|
||||
- **Panels per page**: 4 (exactly, no variation)
|
||||
- **Structure**: Strict 2×2 equal grid
|
||||
- **Gutters**: Consistent white space (8-10px), uniform on all sides
|
||||
|
||||
## Grid Configuration
|
||||
|
||||
- 2 columns × 2 rows, all panels identical size
|
||||
- Panel sizes: Exactly equal (each panel = 25% of content area)
|
||||
- Reading flow: Z-pattern — Panel 1 (top-left) → Panel 2 (top-right) → Panel 3 (bottom-left) → Panel 4 (bottom-right)
|
||||
|
||||
## Narrative Structure
|
||||
|
||||
Each panel serves a specific narrative role (起承转合 / kishōtenketsu):
|
||||
|
||||
| Panel | Position | Role | Purpose |
|
||||
|-------|----------|------|---------|
|
||||
| 1 | Top-left | 起 Setup | Establish situation, introduce characters/problem |
|
||||
| 2 | Top-right | 承 Development | Build on setup, add complication or attempt |
|
||||
| 3 | Bottom-left | 转 Turn | Twist, key insight, or reversal — the pivotal moment |
|
||||
| 4 | Bottom-right | 合 Conclusion | Resolution, punchline, or takeaway |
|
||||
|
||||
## Aspect Ratio
|
||||
|
||||
- Recommended page aspect: **4:3** (landscape)
|
||||
- Landscape gives each panel a comfortable wide rectangle
|
||||
- Portrait (3:4) makes panels tall and narrow — avoid for this layout
|
||||
|
||||
## Best For
|
||||
|
||||
Business allegory, quick-insight education, social media comics, fables, parables, single-concept explanation
|
||||
|
||||
## Best Style Pairings
|
||||
|
||||
minimalist, ligne-claire, chalk
|
||||
@@ -0,0 +1,23 @@
|
||||
# mixed
|
||||
|
||||
Dynamic, varied rhythm
|
||||
|
||||
## Panel Structure
|
||||
|
||||
- **Panels per page**: 3-7 (varies)
|
||||
- **Structure**: Intentionally varied for pacing
|
||||
- **Gutters**: Dynamic spacing
|
||||
|
||||
## Grid Configuration
|
||||
|
||||
- Intentionally irregular
|
||||
- Panel sizes: Varied for pacing and emphasis
|
||||
- Reading flow: Guides eye through varied rhythm
|
||||
|
||||
## Best For
|
||||
|
||||
Action sequences, emotional arcs, complex stories
|
||||
|
||||
## Best Style Pairings
|
||||
|
||||
dramatic, vibrant, ohmsha
|
||||
@@ -0,0 +1,23 @@
|
||||
# splash
|
||||
|
||||
Impact-focused, key moments
|
||||
|
||||
## Panel Structure
|
||||
|
||||
- **Panels per page**: 1-2 large + 2-3 small
|
||||
- **Structure**: Dominant splash with supporting panels
|
||||
- **Gutters**: Varied for emphasis
|
||||
|
||||
## Grid Configuration
|
||||
|
||||
- 1 dominant panel + 2-3 supporting
|
||||
- Panel sizes: 50-70% splash, remainder small
|
||||
- Reading flow: Splash dominates, supporting panels accent
|
||||
|
||||
## Best For
|
||||
|
||||
Revelations, breakthroughs, chapter openings
|
||||
|
||||
## Best Style Pairings
|
||||
|
||||
dramatic, classic, vibrant
|
||||
@@ -0,0 +1,23 @@
|
||||
# standard
|
||||
|
||||
Classic comic grid, versatile
|
||||
|
||||
## Panel Structure
|
||||
|
||||
- **Panels per page**: 4-6
|
||||
- **Structure**: Regular grid with occasional variation
|
||||
- **Gutters**: Consistent white space (8-10px)
|
||||
|
||||
## Grid Configuration
|
||||
|
||||
- 2-3 columns × 2-3 rows
|
||||
- Panel sizes: Mostly equal, occasional variation
|
||||
- Reading flow: Left→right, top→bottom (Z-pattern)
|
||||
|
||||
## Best For
|
||||
|
||||
Narrative flow, dialogue scenes
|
||||
|
||||
## Best Style Pairings
|
||||
|
||||
classic, warm, sepia
|
||||
@@ -0,0 +1,30 @@
|
||||
# webtoon
|
||||
|
||||
Vertical scrolling comic (竖版条漫)
|
||||
|
||||
## Panel Structure
|
||||
|
||||
- **Panels per page**: 3-5 vertically stacked
|
||||
- **Structure**: Single column, vertical flow optimized for scrolling
|
||||
- **Gutters**: Generous vertical spacing (20-40px), panels often bleed horizontally
|
||||
|
||||
## Grid Configuration
|
||||
|
||||
- Single column, vertical stack
|
||||
- Panel sizes: Full width, variable height (1:1 to 1:2 aspect)
|
||||
- Reading flow: Top→bottom continuous scroll
|
||||
|
||||
## Special Features
|
||||
|
||||
- Panels can extend beyond frame for dramatic effect
|
||||
- Generous whitespace between beats
|
||||
- Character close-ups alternate with wide explanation panels
|
||||
- "Float" effect - elements can exist between panels
|
||||
|
||||
## Best For
|
||||
|
||||
Ohmsha-style tutorials, mobile reading, step-by-step guides
|
||||
|
||||
## Best Style Pairings
|
||||
|
||||
ohmsha, vibrant
|
||||
@@ -0,0 +1,85 @@
|
||||
# Ohmsha Manga Guide Style
|
||||
|
||||
Guidelines for educational manga comics using the `ohmsha` preset.
|
||||
|
||||
## Character Setup
|
||||
|
||||
| Role | Default | Traits |
|
||||
|------|---------|--------|
|
||||
| Student (Role A) | 大雄 | Confused, asks basic but crucial questions, represents reader |
|
||||
| Mentor (Role B) | 哆啦A梦 | Knowledgeable, patient, uses gadgets as technical metaphors |
|
||||
| Antagonist (Role C, optional) | 胖虎 | Represents misunderstanding, or "noise" in the data |
|
||||
|
||||
Custom characters: ask the user for role → name mappings (e.g., `Student:小明, Mentor:教授, Antagonist:Bug怪`).
|
||||
|
||||
## Character Reference Sheet Style
|
||||
|
||||
For Ohmsha style, use manga/anime style with:
|
||||
- Exaggerated expressions for educational clarity
|
||||
- Simple, distinctive silhouettes
|
||||
- Bright, saturated color palettes
|
||||
- Chibi/SD (super-deformed) variants for comedic reactions
|
||||
|
||||
## Outline Spec Block
|
||||
|
||||
Every ohmsha outline must start with:
|
||||
|
||||
```markdown
|
||||
【漫画规格单】
|
||||
- Language: [Same as input content]
|
||||
- Style: Ohmsha (Manga Guide), Full Color
|
||||
- Layout: Vertical Scrolling Comic (竖版条漫)
|
||||
- Characters: [List character names and roles]
|
||||
- Character Reference: characters/characters.png
|
||||
- Page Limit: ≤20 pages
|
||||
```
|
||||
|
||||
## Visual Metaphor Rules (Critical)
|
||||
|
||||
**NEVER** create "talking heads" panels. Every technical concept must become:
|
||||
|
||||
1. **A tangible gadget/prop** - Something characters can hold, use, demonstrate
|
||||
2. **An action scene** - Characters doing something that illustrates the concept
|
||||
3. **A visual environment** - Stepping into a metaphorical space
|
||||
|
||||
### Examples
|
||||
|
||||
| Concept | Bad (Talking Heads) | Good (Visual Metaphor) |
|
||||
|---------|---------------------|------------------------|
|
||||
| Word embeddings | Characters discussing vectors | 哆啦A梦拿出"词向量压缩机",把书本压缩成彩色小球 |
|
||||
| Gradient descent | Explaining math formula | 大雄在山谷地形上滚球,寻找最低点 |
|
||||
| Neural network | Diagram on whiteboard | 角色走进由发光节点组成的网络迷宫 |
|
||||
|
||||
## Page Title Convention
|
||||
|
||||
Avoid AI-style "Title: Subtitle" format. Use narrative descriptions:
|
||||
|
||||
- ❌ "Page 3: Introduction to Neural Networks"
|
||||
- ✓ "Page 3: 大雄被海量单词淹没,哆啦A梦拿出'词向量压缩机'"
|
||||
|
||||
## Ending Requirements
|
||||
|
||||
- NO generic endings ("What will you choose?", "Thanks for reading")
|
||||
- End with: Technical summary moment OR character achieving a small goal
|
||||
- Final panel: Sense of accomplishment, not open-ended question
|
||||
|
||||
### Good Endings
|
||||
|
||||
- Student successfully applies learned concept
|
||||
- Visual callback to opening problem, now solved
|
||||
- Mentor gives summary while student demonstrates understanding
|
||||
|
||||
### Bad Endings
|
||||
|
||||
- "What do you think?" open questions
|
||||
- "Thanks for reading this tutorial"
|
||||
- Cliffhanger without resolution
|
||||
|
||||
## Layout Preference
|
||||
|
||||
Ohmsha style typically uses:
|
||||
- `webtoon` (vertical scrolling) - Primary choice
|
||||
- `dense` - For information-heavy sections
|
||||
- `mixed` - For varied pacing
|
||||
|
||||
Avoid `cinematic` and `splash` for educational content.
|
||||
@@ -0,0 +1,106 @@
|
||||
# Partial Workflows
|
||||
|
||||
Options to run specific parts of the workflow. Trigger these via natural language (e.g., "just the storyboard", "regenerate page 3").
|
||||
|
||||
## Options Summary
|
||||
|
||||
| Option | Steps Executed | Output |
|
||||
|--------|----------------|--------|
|
||||
| Storyboard only | 1-3 | `storyboard.md` + `characters/` |
|
||||
| Prompts only | 1-5 | + `prompts/*.md` |
|
||||
| Images only | 7-8 | + images |
|
||||
| Regenerate N | 7 (partial) | Specific page(s) |
|
||||
|
||||
---
|
||||
|
||||
## Storyboard-only
|
||||
|
||||
Generate storyboard and characters without prompts or images.
|
||||
|
||||
**User cue**: "storyboard only", "just the outline", "don't generate images yet".
|
||||
|
||||
**Workflow**: Steps 1-3 only (stop after storyboard + characters)
|
||||
|
||||
**Output**:
|
||||
- `analysis.md`
|
||||
- `storyboard.md`
|
||||
- `characters/characters.md`
|
||||
|
||||
**Use case**: Review and edit the storyboard before generating images. Useful for:
|
||||
- Getting feedback on the narrative structure
|
||||
- Making manual adjustments to panel layouts
|
||||
- Defining custom characters
|
||||
|
||||
---
|
||||
|
||||
## Prompts-only
|
||||
|
||||
Generate storyboard, characters, and prompts without images.
|
||||
|
||||
**User cue**: "prompts only", "write the prompts but don't generate yet".
|
||||
|
||||
**Workflow**: Steps 1-5 (generate prompts, skip images)
|
||||
|
||||
**Output**:
|
||||
- `analysis.md`
|
||||
- `storyboard.md`
|
||||
- `characters/characters.md`
|
||||
- `prompts/*.md`
|
||||
|
||||
**Use case**: Review and edit prompts before image generation. Useful for:
|
||||
- Fine-tuning image generation prompts
|
||||
- Ensuring visual consistency before committing to generation
|
||||
- Making style adjustments at the prompt level
|
||||
|
||||
---
|
||||
|
||||
## Images-only
|
||||
|
||||
Generate images from existing prompts (starts at Step 7).
|
||||
|
||||
**User cue**: "generate images from existing prompts", "run the images now" (pointing at an existing `comic/topic-slug/` directory).
|
||||
|
||||
**Workflow**: Skip to Step 7, then 8
|
||||
|
||||
**Prerequisites** (must exist in directory):
|
||||
- `prompts/` directory with page prompt files
|
||||
- `storyboard.md` with style information
|
||||
- `characters/characters.md` with character definitions
|
||||
|
||||
**Output**:
|
||||
- `characters/characters.png` (if not exists)
|
||||
- `NN-{cover|page}-[slug].png` images
|
||||
|
||||
**Use case**: Re-generate images after editing prompts. Useful for:
|
||||
- Recovering from failed image generation
|
||||
- Trying different image generation settings
|
||||
- Regenerating after manual prompt edits
|
||||
|
||||
---
|
||||
|
||||
## Regenerate
|
||||
|
||||
Regenerate specific pages only.
|
||||
|
||||
**User cue**: "regenerate page 3", "redo pages 2, 5, 8", "regenerate the cover".
|
||||
|
||||
**Workflow**:
|
||||
1. Read existing prompts for specified pages
|
||||
2. Regenerate images only for those pages via `image_generate`
|
||||
3. Download each returned URL and overwrite the existing PNG
|
||||
|
||||
**Prerequisites** (must exist):
|
||||
- `prompts/NN-{cover|page}-[slug].md` for specified pages
|
||||
- `characters/characters.md` (for agent-side consistency checks, if it was used originally)
|
||||
|
||||
**Output**:
|
||||
- Regenerated `NN-{cover|page}-[slug].png` for specified pages
|
||||
|
||||
**Use case**: Fix specific pages without regenerating entire comic. Useful for:
|
||||
- Fixing a single problematic page
|
||||
- Iterating on specific visuals
|
||||
- Regenerating pages after prompt edits
|
||||
|
||||
**Page numbering**:
|
||||
- `0` = Cover page
|
||||
- `1-N` = Content pages
|
||||
+121
@@ -0,0 +1,121 @@
|
||||
# concept-story
|
||||
|
||||
概念故事预设 - Narrative comics that visualize abstract concepts through character-driven stories
|
||||
|
||||
## Base Configuration
|
||||
|
||||
| Dimension | Value |
|
||||
|-----------|-------|
|
||||
| Art Style | manga |
|
||||
| Tone | warm |
|
||||
| Layout | standard (default) |
|
||||
|
||||
Equivalent to: art=manga, tone=warm
|
||||
|
||||
## Unique Rules
|
||||
|
||||
This preset includes special rules beyond the art+tone combination. When the `concept-story` preset is selected, ALL rules below must be applied.
|
||||
|
||||
### Concept Visualization System (CRITICAL)
|
||||
|
||||
Each major abstract concept SHOULD have a recurring visual symbol/metaphor:
|
||||
|
||||
| Concept Type | Visualization Approach |
|
||||
|-------------|----------------------|
|
||||
| Psychological need | Tangible object character holds or discovers (e.g., glowing energy ball = competence) |
|
||||
| Management principle | Environmental metaphor character navigates (e.g., ship wheel = autonomy) |
|
||||
| Growth/development | Living organic symbol that transforms (e.g., seed → flowering plant = relatedness) |
|
||||
| Abstract framework | Spatial structure characters can enter or observe |
|
||||
| Emotional state | Color/lighting shift in the scene atmosphere |
|
||||
|
||||
**Unlike ohmsha**: Dialogue panels are allowed and expected. The goal is to COMBINE visual metaphors WITH dialogue, not replace dialogue entirely.
|
||||
|
||||
**Pattern**: "Dialogue introduces idea" → "Visual metaphor illustrates it" → "Character reacts/applies it"
|
||||
|
||||
### Visual Symbol Continuity
|
||||
|
||||
Symbols must persist across the story:
|
||||
|
||||
| Stage | Treatment |
|
||||
|-------|-----------|
|
||||
| Introduction | Symbol appears with soft glow effect when concept is first mentioned |
|
||||
| Recurrence | Same symbol reappears in background or character interaction when concept is referenced |
|
||||
| Resolution | ALL symbols gather in the final composition, showing integration of learned concepts |
|
||||
|
||||
**Storyboard requirement**: Include a Symbol Mapping Table defining concept → visual symbol before panel breakdown.
|
||||
|
||||
### Character Archetypes (Flexible)
|
||||
|
||||
Create original characters based on content domain. No fixed defaults:
|
||||
|
||||
| Role | Archetype | Visual Cues |
|
||||
|------|-----------|------------|
|
||||
| Protagonist | Learner/worker facing a challenge | Modern professional or student, relatable, starts with constrained posture |
|
||||
| Mentor | Experienced guide who teaches through experience | Slightly older, calm demeanor, warm color accents |
|
||||
| Catalyst | Person or event that triggers transformation | Can be a colleague, situation, challenge, or opportunity |
|
||||
|
||||
**IMPORTANT**: Characters are created fresh each time based on the source content's domain (business, psychology, education, etc.). No default character set.
|
||||
|
||||
### Narrative Arc Structure
|
||||
|
||||
Enforce a five-stage growth arc:
|
||||
|
||||
| Act | Structure | Visual Tone |
|
||||
|-----|-----------|------------|
|
||||
| Opening | Protagonist stuck in routine, faces frustration | Muted warm tones, tight framing, constrained compositions |
|
||||
| Inciting moment | Mentor appears or opportunity arrives | Brightness increases, panels open up |
|
||||
| Learning | Concepts introduced through visual metaphors | Rich warm palette, symbols introduced one by one |
|
||||
| Turning point | Protagonist applies knowledge, faces test | Contrast increases, dynamic compositions |
|
||||
| Transformation | Growth demonstrated, new understanding visible | Full warm palette, expansive composition, all symbols present |
|
||||
|
||||
### Dialogue + Action Balance
|
||||
|
||||
- Dialogue is encouraged and expected (unlike ohmsha's NO talking heads rule)
|
||||
- Every page should combine at least one dialogue panel with at least one visual/action panel
|
||||
- Avoid pure "lecture" pages where a character explains for 4+ panels straight
|
||||
- When a character explains a concept verbally, the NEXT panel should visualize it
|
||||
|
||||
**Wrong approach**: Four consecutive panels of mentor lecturing at protagonist
|
||||
**Right approach**: Mentor introduces concept → visual metaphor panel → protagonist reacts → applies understanding
|
||||
|
||||
### Scene Atmosphere Rules
|
||||
|
||||
| Scene Type | Atmosphere |
|
||||
|------------|-----------|
|
||||
| Problem/frustration | Cool muted tones over warm base, tight framing, cluttered environment |
|
||||
| Mentoring moment | Golden hour lighting, open composition, warm indoor glow |
|
||||
| Concept visualization | Soft glow effects, clean simplified backgrounds, symbol spotlight |
|
||||
| Growth/transformation | Warm light expanding outward, character posture opening up |
|
||||
| Resolution | Full warm palette, spacious composition, all visual symbols visible |
|
||||
|
||||
### Ending Requirements
|
||||
|
||||
Final page MUST include:
|
||||
|
||||
1. Protagonist demonstrating transformed understanding (not just being told)
|
||||
2. Visual callback showing contrast with opening state (e.g., wilted plant → thriving plant)
|
||||
3. All concept symbols visible together in the composition
|
||||
4. A forward-looking element suggesting ongoing growth (not a closed ending)
|
||||
|
||||
### Page Title Convention
|
||||
|
||||
Every page MUST have a narrative title:
|
||||
|
||||
**Wrong**: "Chapter 3: Self-Determination Theory"
|
||||
**Right**: "The Day Xiao Ming Found His Own Engine"
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Each major concept has a recurring visual symbol
|
||||
- ✓ Dialogue and visual metaphors work together (not one replacing the other)
|
||||
- ✓ Clear growth arc from problem to transformation
|
||||
- ✓ Original characters suited to the content domain
|
||||
- ✓ Warm, professional atmosphere throughout
|
||||
- ✓ Visual symbols recur and accumulate through the story
|
||||
- ✓ Final page integrates all concept symbols with transformation callback
|
||||
|
||||
## Best For
|
||||
|
||||
Psychology concepts, business/management principles, motivation theory, personal development,
|
||||
self-help content, leadership frameworks, coaching narratives, soft skill education,
|
||||
abstract concept explanation through character-driven stories
|
||||
@@ -0,0 +1,107 @@
|
||||
# four-panel
|
||||
|
||||
四格漫画预设 - Minimalist four-panel business allegory comics
|
||||
|
||||
## Base Configuration
|
||||
|
||||
| Dimension | Value |
|
||||
|-----------|-------|
|
||||
| Art Style | minimalist |
|
||||
| Tone | neutral |
|
||||
| Layout | four-panel (default) |
|
||||
| Aspect | 4:3 (landscape) |
|
||||
|
||||
Equivalent to: art=minimalist, tone=neutral, layout=four-panel, aspect=4:3
|
||||
|
||||
## Unique Rules
|
||||
|
||||
This preset includes special rules beyond the art+tone combination. When the `four-panel` preset is selected, ALL rules below must be applied.
|
||||
|
||||
### 起承转合 Narrative Structure (CRITICAL)
|
||||
|
||||
Every comic MUST follow the four-panel 起承转合 structure:
|
||||
|
||||
| Panel | Role | Requirements |
|
||||
|-------|------|-------------|
|
||||
| 1 (起 Setup) | Introduce the situation | Show character(s) in a recognizable context. Establish the "normal" state or problem |
|
||||
| 2 (承 Development) | Build on the setup | Add complication, show an attempt, or introduce the concept. Stakes become clearer |
|
||||
| 3 (转 Turn) | The twist or key insight | **Most important panel.** Show the unexpected reversal, contrast, or "aha" moment that makes the allegory work |
|
||||
| 4 (合 Conclusion) | Resolution and takeaway | Show the result, consequence, or lesson learned. Can be a visual punchline or summary |
|
||||
|
||||
**CRITICAL**: Do NOT deviate from exactly 4 panels. No 5th panel, no title panel, no footer panel within the image.
|
||||
|
||||
### Single-Page Story Rule (CRITICAL)
|
||||
|
||||
- The entire story is told in ONE page with exactly 4 panels
|
||||
- Page count: always 1 (plus optional cover)
|
||||
- No multi-page four-panel stories — if content requires more, create multiple separate four-panel comics
|
||||
- Storyboard structure: Cover (optional) + 1 page
|
||||
|
||||
### Accent Color System
|
||||
|
||||
- The image is primarily black-and-white line art
|
||||
- Use exactly 1-2 spot colors per strip (default: orange `#FF6B35`)
|
||||
- Rules:
|
||||
- Key concept label or object: filled with accent color or outlined in accent
|
||||
- Panel 3 (转 Turn) should have the strongest color emphasis
|
||||
- Characters remain B&W — color is for concepts/objects/labels only
|
||||
- Consistent accent color across all 4 panels (do not switch colors between panels)
|
||||
|
||||
### Character Design Rules
|
||||
|
||||
- Simplified stick-figure-like characters
|
||||
- Distinguish characters through simple props: ties, glasses, hats, briefcases, aprons
|
||||
- No detailed faces — dot eyes, line mouth at most
|
||||
- Characters should be generic enough to represent archetypes (the manager, the employee, the customer)
|
||||
- Maximum 2-3 characters per strip
|
||||
|
||||
### Text in Panels
|
||||
|
||||
- Chinese text for dialogue and labels (or match source language)
|
||||
- Keep text minimal — 1-2 short lines per panel maximum
|
||||
- Key concept terms can be highlighted with accent color background
|
||||
- No narrator boxes — dialogue and labels only
|
||||
- Speech bubbles: simple rectangles or ovals, thin black outline
|
||||
|
||||
### Optional Title & Caption
|
||||
|
||||
- A brief descriptive title above the 4 panels
|
||||
- An optional one-line caption/moral below the panels
|
||||
- These are part of the page composition, not separate panels
|
||||
|
||||
### Character Archetypes (Flexible)
|
||||
|
||||
Create simple stick-figure characters based on content. No fixed defaults:
|
||||
|
||||
| Role | Archetype | Visual Cues |
|
||||
|------|-----------|------------|
|
||||
| Protagonist | Worker/employee facing a situation | Simple figure, minimal distinguishing feature (glasses, tie) |
|
||||
| Authority | Boss/manager/expert | Slightly larger figure, or prop like pointer/clipboard |
|
||||
| Object | The concept itself | Labeled object, icon, or highlighted text with accent color |
|
||||
|
||||
### Prompt Template
|
||||
|
||||
When generating image prompts for four-panel comics, include these keywords:
|
||||
|
||||
> A minimalist, clean line art digital comic strip in a four-panel grid layout (2×2). The style is simplified cartoon illustration with clear black outlines and a minimal color palette of black, white, and specific spot [accent color] for key concepts.
|
||||
|
||||
Each panel description should specify:
|
||||
- Panel position (Top Left / Top Right / Bottom Left / Bottom Right)
|
||||
- Character poses and gestures (simple, stick-figure style)
|
||||
- Dialogue text in Chinese (hand-drawn style)
|
||||
- Any accent-colored elements (concept labels, key objects)
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Exactly 4 panels in strict 2×2 grid
|
||||
- ✓ 起承转合 narrative arc clearly present
|
||||
- ✓ 90%+ black-and-white with strategic spot color
|
||||
- ✓ Simplified stick-figure characters
|
||||
- ✓ Key concept visually highlighted with accent color
|
||||
- ✓ Text is minimal and in Chinese (or source language)
|
||||
- ✓ Single complete story in one page
|
||||
- ✓ Panel 3 delivers a clear "turn" or insight
|
||||
|
||||
## Best For
|
||||
|
||||
Business allegory, management fables, short insights, workplace parables, concept contrasts, social media educational content, quick-read comics
|
||||
@@ -0,0 +1,114 @@
|
||||
# ohmsha
|
||||
|
||||
Ohmsha预设 - Educational manga with visual metaphors
|
||||
|
||||
## Base Configuration
|
||||
|
||||
| Dimension | Value |
|
||||
|-----------|-------|
|
||||
| Art Style | manga |
|
||||
| Tone | neutral |
|
||||
| Layout | webtoon (default) |
|
||||
|
||||
Equivalent to: art=manga, tone=neutral
|
||||
|
||||
## Unique Rules
|
||||
|
||||
This preset includes special rules beyond the art+tone combination. When the `ohmsha` preset is selected, ALL rules below must be applied.
|
||||
|
||||
### Visual Metaphor Requirements (CRITICAL)
|
||||
|
||||
Every technical concept MUST be visualized as a metaphor:
|
||||
|
||||
| Concept Type | Visualization Approach |
|
||||
|-------------|----------------------|
|
||||
| Algorithm | Gadget/machine that demonstrates the process |
|
||||
| Data structure | Physical space characters can enter/explore |
|
||||
| Mathematical formula | Transformation visible in environment |
|
||||
| Abstract process | Tangible flow of particles/objects |
|
||||
|
||||
**Wrong approach**: Character points at blackboard explaining
|
||||
**Right approach**: Character uses "Concept Visualizer" gadget, steps into metaphorical space
|
||||
|
||||
### Visual Metaphor Examples
|
||||
|
||||
| Concept | Wrong (Talking Head) | Right (Visual Metaphor) |
|
||||
|---------|---------------------|------------------------|
|
||||
| Attention mechanism | Character points at formula on blackboard | "Attention Flashlight" gadget illuminates key words in dark room |
|
||||
| Gradient descent | "The algorithm minimizes loss" | Character rides ball rolling down mountain valley |
|
||||
| Neural network | Diagram with arrows | Living network of glowing creatures passing messages |
|
||||
| Overfitting | "The model memorized the data" | Character wearing clothes that fit only one specific pose |
|
||||
|
||||
### Character Roles (Required)
|
||||
|
||||
**DEFAULT: Use Doraemon characters** unless user explicitly specifies custom characters.
|
||||
|
||||
| Role | Default Character | Visual | Traits |
|
||||
|------|-------------------|--------|--------|
|
||||
| Student (Role A) | 大雄 (Nobita) | Boy, 10yo, round glasses, black hair, yellow shirt, navy shorts | Confused, asks basic but crucial questions, represents reader |
|
||||
| Mentor (Role B) | 哆啦A梦 (Doraemon) | Blue robot cat, white belly, 4D pocket, red nose, golden bell | Knowledgeable, patient, uses gadgets as technical metaphors |
|
||||
| Challenge (Role C) | 胖虎 (Gian) | Stocky boy, small eyes, orange shirt | Represents misunderstanding, or "noise" in the data |
|
||||
| Support (Role D) | 静香 (Shizuka) | Cute girl, black short hair, pink dress | Asks clarifying questions, provides alternative perspectives |
|
||||
|
||||
**IMPORTANT**: These Doraemon characters ARE the default for ohmsha preset. Generate character definitions using these exact characters unless user requests otherwise.
|
||||
|
||||
To use custom characters: ask the user to provide role → character mappings (e.g., `Student:小明, Mentor:教授`).
|
||||
|
||||
### Page Title Convention
|
||||
|
||||
Every page MUST have a narrative title (not section header):
|
||||
|
||||
**Wrong**: "Chapter 1: Introduction to Transformers"
|
||||
**Right**: "The Day Nobita Couldn't Understand Anyone"
|
||||
|
||||
### Gadget Reveal Pattern
|
||||
|
||||
When introducing a concept:
|
||||
|
||||
1. Student expresses confusion with visual indicator (?, spiral eyes)
|
||||
2. Mentor dramatically produces gadget with sparkle effects
|
||||
3. Gadget name announced in bold with explanation
|
||||
4. Demonstration begins - student enters metaphorical space
|
||||
|
||||
### Ending Requirements
|
||||
|
||||
Final page MUST include:
|
||||
|
||||
1. Student demonstrating understanding (applying the concept)
|
||||
2. Callback to opening problem (now resolved)
|
||||
3. Mentor's satisfied expression
|
||||
4. Optional: hint at next topic
|
||||
|
||||
### NO Talking Heads Rule
|
||||
|
||||
**Critical**: Characters must DO things, not just explain.
|
||||
|
||||
Every panel should show:
|
||||
- Action being performed
|
||||
- Metaphor being demonstrated
|
||||
- Character interaction with concept-space
|
||||
- NOT: two characters facing each other talking
|
||||
|
||||
### Special Visual Elements
|
||||
|
||||
| Element | Usage |
|
||||
|---------|-------|
|
||||
| Gadget reveals | Dramatic unveiling with sparkle effects |
|
||||
| Concept spaces | Rounded borders, glowing edges for "imagination mode" |
|
||||
| Information displays | Holographic UI style for technical details |
|
||||
| Aha moments | Radial lines, light burst effects |
|
||||
| Confusion | Spiral eyes, question marks floating above head |
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Every concept is a visual metaphor
|
||||
- ✓ Characters are DOING things, not just talking
|
||||
- ✓ Clear student/mentor dynamic
|
||||
- ✓ Gadgets and props drive the explanation
|
||||
- ✓ Expressive manga-style emotions
|
||||
- ✓ Information density through visual design, not text walls
|
||||
- ✓ Narrative page titles
|
||||
|
||||
## Reference
|
||||
|
||||
For complete guidelines, see `references/ohmsha-guide.md`
|
||||
@@ -0,0 +1,116 @@
|
||||
# shoujo
|
||||
|
||||
少女预设 - Classic shoujo manga with romantic aesthetics
|
||||
|
||||
## Base Configuration
|
||||
|
||||
| Dimension | Value |
|
||||
|-----------|-------|
|
||||
| Art Style | manga |
|
||||
| Tone | romantic |
|
||||
| Layout | standard (default) |
|
||||
|
||||
Equivalent to: art=manga, tone=romantic
|
||||
|
||||
## Unique Rules
|
||||
|
||||
This preset includes special rules beyond the art+tone combination. When the `shoujo` preset is selected, ALL rules below must be applied.
|
||||
|
||||
### Decorative Elements (Required)
|
||||
|
||||
Every emotional moment must include decorative elements:
|
||||
|
||||
| Emotion | Required Decorations |
|
||||
|---------|---------------------|
|
||||
| Love | Floating hearts, sparkles, rose petals |
|
||||
| Longing | Feathers, bubbles, distant sparkles |
|
||||
| Joy | Flowers blooming, light bursts, stars |
|
||||
| Sadness | Falling petals, fading sparkles |
|
||||
| Shyness | Soft sparkles, floating bubbles |
|
||||
| Realization | Radiating lines with sparkles |
|
||||
|
||||
### Eye Detail Requirements
|
||||
|
||||
Eyes are critical in shoujo style:
|
||||
|
||||
| Aspect | Treatment |
|
||||
|--------|-----------|
|
||||
| Size | Larger than standard manga (1.2x) |
|
||||
| Highlights | Multiple (3-5), placed for emotion |
|
||||
| Reflection | Scene reflection in emotional moments |
|
||||
| Sparkle | Built-in sparkle effects |
|
||||
| Tears | Crystalline, detailed teardrops |
|
||||
|
||||
### Character Beauty Standards
|
||||
|
||||
| Feature | Treatment |
|
||||
|---------|-----------|
|
||||
| Hair | Flowing, detailed strands, shine highlights |
|
||||
| Skin | Porcelain, soft blush on cheeks |
|
||||
| Lips | Soft, slightly glossy |
|
||||
| Hands | Elegant, expressive gestures |
|
||||
| Posture | Graceful, elegant poses |
|
||||
|
||||
### Background Effects
|
||||
|
||||
**Abstract backgrounds** for emotional moments:
|
||||
|
||||
| Moment Type | Background |
|
||||
|-------------|-----------|
|
||||
| Love confession | Soft gradient + floating flowers |
|
||||
| Shock | Screen tone speed lines + sparkles |
|
||||
| Memory | Dreamy blur + scattered petals |
|
||||
| Realization | Radial lines + light burst |
|
||||
| Intimate | Soft focus + floating elements |
|
||||
|
||||
### Panel Flow
|
||||
|
||||
- Overlap panels for intimate moments
|
||||
- Break panel borders for emotional impact
|
||||
- Float decorative elements between panels
|
||||
- Use screen tone gradients for mood
|
||||
- Irregular panel shapes for drama
|
||||
|
||||
### Emotional Beat Timing
|
||||
|
||||
Slow down pacing for emotional impact:
|
||||
|
||||
| Scene Type | Panel Treatment |
|
||||
|------------|-----------------|
|
||||
| Confession | Multiple small panels, then splash |
|
||||
| Eye contact | Close-up sequence |
|
||||
| Touch | Slow-motion panel breakdown |
|
||||
| Realization | Build-up panels then impact |
|
||||
|
||||
### Color Palette Application
|
||||
|
||||
| Scene Type | Palette |
|
||||
|------------|---------|
|
||||
| Romantic | Pink, lavender, rose gold |
|
||||
| Happy | Soft yellow, peach, sky blue |
|
||||
| Sad | Pale blue, silver, gray lavender |
|
||||
| Dramatic | Deep rose, purple, contrast |
|
||||
|
||||
### Screen Tone Usage
|
||||
|
||||
| Mood | Tone Pattern |
|
||||
|------|-------------|
|
||||
| Neutral | Clean, minimal |
|
||||
| Romantic | Soft gradient overlays |
|
||||
| Dramatic | Heavy contrast tones |
|
||||
| Dreamy | Soft dot patterns |
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Large, sparkling detailed eyes
|
||||
- ✓ Decorative elements in emotional moments
|
||||
- ✓ Flowing, beautiful character designs
|
||||
- ✓ Soft, pastel color palette
|
||||
- ✓ Elegant panel compositions
|
||||
- ✓ Screen tone mood effects
|
||||
- ✓ Romantic atmosphere throughout
|
||||
- ✓ Beautiful, expressive poses
|
||||
|
||||
## Best For
|
||||
|
||||
Romance stories, coming-of-age, friendship narratives, school life, emotional drama, love stories
|
||||
@@ -0,0 +1,110 @@
|
||||
# wuxia
|
||||
|
||||
武侠预设 - Hong Kong martial arts comic style
|
||||
|
||||
## Base Configuration
|
||||
|
||||
| Dimension | Value |
|
||||
|-----------|-------|
|
||||
| Art Style | ink-brush |
|
||||
| Tone | action |
|
||||
| Layout | splash (default) |
|
||||
|
||||
Equivalent to: art=ink-brush, tone=action
|
||||
|
||||
## Unique Rules
|
||||
|
||||
This preset includes special rules beyond the art+tone combination. When the `wuxia` preset is selected, ALL rules below must be applied.
|
||||
|
||||
### Qi/Energy Effects (Required)
|
||||
|
||||
Martial arts power must be visible through qi effects:
|
||||
|
||||
| Effect Type | Visual Treatment |
|
||||
|-------------|-----------------|
|
||||
| Internal qi | Glowing aura around character |
|
||||
| External qi | Visible energy projection |
|
||||
| Qi clash | Radiating impact waves |
|
||||
| Qi absorption | Flowing particles toward character |
|
||||
| Hidden power | Subtle glow in eyes/fists |
|
||||
|
||||
### Energy Colors
|
||||
|
||||
| Qi Type | Color |
|
||||
|---------|-------|
|
||||
| Righteous | Blue (#4299E1), Gold (#FFD700) |
|
||||
| Fierce | Red (#DC2626), Orange (#EA580C) |
|
||||
| Evil | Purple (#7C3AED), Green (#16A34A) |
|
||||
| Pure | White, Silver |
|
||||
| Ancient | Gold with particles |
|
||||
|
||||
### Combat Visual Language
|
||||
|
||||
**Impact moments** must include:
|
||||
|
||||
1. Speed lines radiating from impact point
|
||||
2. Flying debris (stone, wood, cloth)
|
||||
3. Shockwave rings
|
||||
4. Dust/energy clouds
|
||||
5. Hair and clothing blown back
|
||||
|
||||
### Movement Depiction
|
||||
|
||||
| Speed Level | Visual Treatment |
|
||||
|-------------|-----------------|
|
||||
| Normal | Standard pose |
|
||||
| Fast | Motion blur, speed lines |
|
||||
| Lightning | Afterimages, multiple positions |
|
||||
| Teleport | Fade effect, particle trail |
|
||||
|
||||
### Environmental Integration
|
||||
|
||||
Backgrounds must support action:
|
||||
|
||||
| Environment | Combat Enhancement |
|
||||
|-------------|-------------------|
|
||||
| Mountains | Crumbling peaks from impacts |
|
||||
| Forest | Exploding trees, flying leaves |
|
||||
| Water | Dramatic splashes, walking on water |
|
||||
| Temple | Breaking pillars, flying tiles |
|
||||
| Cliff | Dramatic falls, wind effects |
|
||||
|
||||
### Character Pose Guidelines
|
||||
|
||||
- Dynamic warrior stances with weight distribution
|
||||
- Flowing robes and hair showing movement
|
||||
- Muscle tension visible in action
|
||||
- Feet planted or in dynamic motion
|
||||
- Traditional martial arts postures
|
||||
|
||||
### Weapon Effects
|
||||
|
||||
| Weapon | Visual Treatment |
|
||||
|--------|-----------------|
|
||||
| Sword | Trailing light arc, blade glow |
|
||||
| Palm | Qi projection, wind effect |
|
||||
| Staff | Spinning blur, impact ripples |
|
||||
| Whip | Flowing energy trail |
|
||||
|
||||
### Atmospheric Elements
|
||||
|
||||
Always include:
|
||||
- Floating particles (leaves, petals, dust)
|
||||
- Ink wash mist for depth
|
||||
- Wind direction indicators
|
||||
- Dramatic sky/weather when appropriate
|
||||
|
||||
## Quality Markers
|
||||
|
||||
- ✓ Dynamic action poses with sense of motion
|
||||
- ✓ Ink brush aesthetic in line work
|
||||
- ✓ Visible qi/energy effects
|
||||
- ✓ High contrast dramatic lighting
|
||||
- ✓ Atmospheric backgrounds with Chinese elements
|
||||
- ✓ Flowing fabric and hair movement
|
||||
- ✓ Impactful combat moments
|
||||
- ✓ Speed lines and impact effects
|
||||
|
||||
## Best For
|
||||
|
||||
Martial arts stories, Chinese historical fiction, wuxia/xianxia adaptations, action-heavy narratives
|
||||
@@ -0,0 +1,143 @@
|
||||
# Storyboard Template
|
||||
|
||||
## Storyboard Document Format
|
||||
|
||||
```markdown
|
||||
---
|
||||
title: "[Comic Title]"
|
||||
topic: "[topic description]"
|
||||
time_span: "[e.g., 1912-1954]"
|
||||
narrative_approach: "[chronological/thematic/character-focused]"
|
||||
recommended_style: "[style name]"
|
||||
recommended_layout: "[layout name or varies]"
|
||||
aspect_ratio: "3:4" # 3:4 (portrait), 4:3 (landscape), 16:9 (widescreen)
|
||||
language: "[zh/en/ja/etc.]"
|
||||
page_count: [N]
|
||||
generated: "YYYY-MM-DD HH:mm"
|
||||
---
|
||||
|
||||
# [Comic Title] - Knowledge Comic Storyboard
|
||||
|
||||
**Character Reference**: characters/characters.png
|
||||
|
||||
---
|
||||
|
||||
## Cover
|
||||
|
||||
**Filename**: 00-cover-[slug].png
|
||||
**Core Message**: [one-liner]
|
||||
|
||||
**Visual Design**:
|
||||
- Title typography style
|
||||
- Main visual composition
|
||||
- Color scheme
|
||||
- Subtitle / time span notation
|
||||
|
||||
**Visual Prompt**:
|
||||
[Detailed image generation prompt]
|
||||
|
||||
---
|
||||
|
||||
## Page 1 / N
|
||||
|
||||
**Filename**: 01-page-[slug].png
|
||||
**Layout**: [standard/cinematic/dense/splash/mixed]
|
||||
**Narrative Layer**: [Main narrative / Narrator layer / Mixed]
|
||||
**Core Message**: [What this page conveys]
|
||||
|
||||
### Panel Layout
|
||||
|
||||
**Panel Count**: X
|
||||
**Layout Type**: [grid/irregular/splash]
|
||||
|
||||
#### Panel 1 (Size: 1/3 page, Position: Top)
|
||||
|
||||
**Scene**: [Time, location]
|
||||
**Image Description**:
|
||||
- Camera angle: [bird's eye / low angle / eye level / close-up / wide shot]
|
||||
- Characters: [pose, expression, action]
|
||||
- Environment: [scene details, period markers]
|
||||
- Lighting: [atmosphere description]
|
||||
- Color tone: [palette reference]
|
||||
|
||||
**Text Elements**:
|
||||
- Dialogue bubble (oval): "Character line"
|
||||
- Narrator box (rectangular): 「Narrator commentary」
|
||||
- Caption bar: [Background info text]
|
||||
|
||||
#### Panel 2...
|
||||
|
||||
**Page Hook**: [Cliffhanger or transition at page end]
|
||||
|
||||
**Visual Prompt**:
|
||||
[Full page image generation prompt]
|
||||
|
||||
---
|
||||
|
||||
## Page 2 / N
|
||||
...
|
||||
```
|
||||
|
||||
## Cover Design Principles
|
||||
|
||||
- Academic gravitas with visual appeal
|
||||
- Title typography reflecting knowledge/science theme
|
||||
- Composition hinting at core theme (character silhouette, iconic symbol, concept diagram)
|
||||
- Subtitle or time span for epic scope
|
||||
|
||||
## Panel Composition Guidelines
|
||||
|
||||
| Panel Type | Recommended Count | Usage |
|
||||
|-----------|-------------------|-------|
|
||||
| Main narrative | 3-5 per page | Story progression |
|
||||
| Concept diagram | 1-2 per page | Visualize abstractions |
|
||||
| Narrator panel | 0-1 per page | Commentary, transition |
|
||||
| Splash (full/half) | Occasional | Major moments |
|
||||
|
||||
## Panel Size Reference
|
||||
|
||||
- **Full page (Splash)**: Major moments, key breakthroughs
|
||||
- **Half page**: Important scenes, turning points
|
||||
- **1/3 page**: Standard narrative panels
|
||||
- **1/4 or smaller**: Quick progression, sequential action
|
||||
|
||||
## Concept Visualization Techniques
|
||||
|
||||
Transform abstract concepts into concrete visuals:
|
||||
|
||||
| Abstract Concept | Visual Approach |
|
||||
|-----------------|-----------------|
|
||||
| Neural network | Glowing nodes with connecting lines |
|
||||
| Gradient descent | Ball rolling down valley terrain |
|
||||
| Data flow | Luminous particles flowing through pipes |
|
||||
| Algorithm iteration | Ascending spiral staircase |
|
||||
| Breakthrough moment | Shattering barrier, piercing light |
|
||||
| Logical proof | Building blocks assembling |
|
||||
| Uncertainty | Forking paths, fog, multiple shadows |
|
||||
|
||||
## Text Element Design
|
||||
|
||||
| Text Type | Style | Usage |
|
||||
|-----------|-------|-------|
|
||||
| Character dialogue | Oval speech bubble | Main narrative speech |
|
||||
| Narrator commentary | Rectangular box | Explanation, commentary |
|
||||
| Caption bar | Edge-mounted rectangle | Time, location info |
|
||||
| Thought bubble | Cloud shape | Character inner monologue |
|
||||
| Term label | Bold / special color | First appearance of technical terms |
|
||||
|
||||
## Prompt Structure for Consistency
|
||||
|
||||
Each page prompt should include character reference:
|
||||
|
||||
```
|
||||
[CHARACTER REFERENCE]
|
||||
(Key details from characters.md for characters in this page)
|
||||
|
||||
[PAGE CONTENT]
|
||||
(Specific scene, panel layout, and visual elements)
|
||||
|
||||
[CONSISTENCY REMINDER]
|
||||
Maintain exact character appearances as defined in character reference.
|
||||
- [Character A]: [key identifying features]
|
||||
- [Character B]: [key identifying features]
|
||||
```
|
||||
@@ -0,0 +1,110 @@
|
||||
# action
|
||||
|
||||
动作基调 - Speed, impact, power
|
||||
|
||||
## Overview
|
||||
|
||||
High-impact action atmosphere with dynamic movement, combat effects, and powerful visual energy. Creates visceral, exciting sequences.
|
||||
|
||||
## Mood Characteristics
|
||||
|
||||
- Speed and motion
|
||||
- Power and impact
|
||||
- Combat intensity
|
||||
- Physical energy
|
||||
- Visceral excitement
|
||||
|
||||
## Color Modifiers
|
||||
|
||||
When applied to any art style:
|
||||
|
||||
| Adjustment | Direction |
|
||||
|------------|-----------|
|
||||
| Saturation | High contrast |
|
||||
| Contrast | Maximum |
|
||||
| Temperature | Variable per effect |
|
||||
| Brightness | Dynamic range |
|
||||
|
||||
## Action Effects
|
||||
|
||||
**Combat/motion effects** (apply liberally):
|
||||
|
||||
| Effect | Usage |
|
||||
|--------|-------|
|
||||
| Speed lines | Motion, velocity |
|
||||
| Impact bursts | Hits, collisions |
|
||||
| Shockwaves | Powerful impacts |
|
||||
| Flying debris | Environmental destruction |
|
||||
| Dust clouds | Ground impacts |
|
||||
| Motion blur | Fast movement |
|
||||
| Afterimages | Super speed |
|
||||
|
||||
## Special Effects
|
||||
|
||||
| Effect Type | Visual Approach |
|
||||
|------------|-----------------|
|
||||
| Energy attacks | Glowing, radiating |
|
||||
| Physical impacts | Radiating lines, debris |
|
||||
| Movement | Speed lines, blur |
|
||||
| Atmosphere | Flying particles, wind |
|
||||
|
||||
## Effect Colors
|
||||
|
||||
| Effect | Color | Hex |
|
||||
|--------|-------|-----|
|
||||
| Energy glow | Blue | #4299E1 |
|
||||
| Fire/power | Gold | #FFD700 |
|
||||
| Impact | White burst | #FFFFFF |
|
||||
| Blood/intensity | Deep red | #8B0000 |
|
||||
|
||||
## Lighting
|
||||
|
||||
- Dynamic, shifting
|
||||
- Impact flashes
|
||||
- Energy glow sources
|
||||
- Rim lighting on figures
|
||||
- Dramatic contrast
|
||||
|
||||
## Emotional Range
|
||||
|
||||
| Emotion | Expression |
|
||||
|---------|-----------|
|
||||
| Determination | Fierce focus |
|
||||
| Rage | Intense, powerful |
|
||||
| Triumph | Victorious pose |
|
||||
| Struggle | Strained effort |
|
||||
|
||||
## Composition
|
||||
|
||||
- Dynamic angles
|
||||
- Extreme perspectives
|
||||
- Panel-breaking layouts
|
||||
- Asymmetric designs
|
||||
- Impact-focused framing
|
||||
|
||||
## Pose Guidelines
|
||||
|
||||
- Dynamic warrior poses
|
||||
- Weight and momentum visible
|
||||
- Muscle tension shown
|
||||
- Flow of movement captured
|
||||
- Impact points emphasized
|
||||
|
||||
## Best For
|
||||
|
||||
- Martial arts combat
|
||||
- Action sequences
|
||||
- Sports moments
|
||||
- Physical challenges
|
||||
- Battle scenes
|
||||
- Climactic confrontations
|
||||
|
||||
## Combination Notes
|
||||
|
||||
Works especially well with:
|
||||
- ink-brush: wuxia combat
|
||||
- manga: shonen battles
|
||||
|
||||
Avoid with:
|
||||
- chalk: style mismatch
|
||||
- ligne-claire: style mismatch (too static)
|
||||
@@ -0,0 +1,95 @@
|
||||
# dramatic
|
||||
|
||||
戏剧基调 - High contrast, intense, powerful moments
|
||||
|
||||
## Overview
|
||||
|
||||
High-impact dramatic tone for pivotal moments, conflicts, and breakthroughs. Uses strong contrast and intense compositions to create emotional power.
|
||||
|
||||
## Mood Characteristics
|
||||
|
||||
- Tension and intensity
|
||||
- Pivotal moments
|
||||
- Conflict and resolution
|
||||
- Breakthrough discoveries
|
||||
- Emotional climaxes
|
||||
|
||||
## Color Modifiers
|
||||
|
||||
When applied to any art style:
|
||||
|
||||
| Adjustment | Direction |
|
||||
|------------|-----------|
|
||||
| Saturation | High (vibrant or deep) |
|
||||
| Contrast | Maximum |
|
||||
| Temperature | Varies for effect |
|
||||
| Brightness | Strong highlights, deep shadows |
|
||||
|
||||
## Contrast Approach
|
||||
|
||||
- Sharp light/dark divisions
|
||||
- Minimal mid-tones
|
||||
- Stark compositions
|
||||
- Silhouette potential
|
||||
- Rim lighting effects
|
||||
|
||||
## Accent Colors
|
||||
|
||||
- Deep navy (#1A365D)
|
||||
- Crimson (#9B2C2C)
|
||||
- Stark white
|
||||
- Heavy blacks
|
||||
- Limited palette per scene
|
||||
|
||||
## Lighting
|
||||
|
||||
- Dramatic single-source
|
||||
- High contrast shadows
|
||||
- Rim lighting on characters
|
||||
- Spotlight effects
|
||||
- Chiaroscuro influence
|
||||
|
||||
## Emotional Range
|
||||
|
||||
| Emotion | Expression |
|
||||
|---------|-----------|
|
||||
| Anger | Intense, defined features |
|
||||
| Determination | Strong, focused gaze |
|
||||
| Shock | Wide eyes, stark lighting |
|
||||
| Triumph | Powerful, elevated pose |
|
||||
|
||||
## Composition
|
||||
|
||||
- Angular, dynamic layouts
|
||||
- Dramatic camera angles
|
||||
- Low/high viewpoints
|
||||
- Diagonal compositions
|
||||
- Negative space for impact
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Speed lines for tension
|
||||
- Impact effects
|
||||
- Dramatic backgrounds (storms, fire)
|
||||
- Silhouettes
|
||||
- Light burst effects
|
||||
- Environmental drama
|
||||
|
||||
## Best For
|
||||
|
||||
- Pivotal discoveries
|
||||
- Conflict scenes
|
||||
- Climactic moments
|
||||
- Breakthrough realizations
|
||||
- Emotional confrontations
|
||||
- Historical turning points
|
||||
|
||||
## Combination Notes
|
||||
|
||||
Works especially well with:
|
||||
- realistic: powerful drama
|
||||
- ink-brush: martial arts climax
|
||||
- ligne-claire: historical pivots
|
||||
- manga: shonen battles
|
||||
|
||||
Avoid with: chalk (style mismatch)
|
||||
@@ -0,0 +1,105 @@
|
||||
# energetic
|
||||
|
||||
活力基调 - Bright, dynamic, exciting
|
||||
|
||||
## Overview
|
||||
|
||||
High-energy atmosphere for exciting, discovery-filled content. Bright colors, dynamic compositions, and movement create engaging visuals for younger audiences.
|
||||
|
||||
## Mood Characteristics
|
||||
|
||||
- Excitement and wonder
|
||||
- Discovery and learning
|
||||
- Energy and enthusiasm
|
||||
- Movement and action
|
||||
- Youthful spirit
|
||||
|
||||
## Color Modifiers
|
||||
|
||||
When applied to any art style:
|
||||
|
||||
| Adjustment | Direction |
|
||||
|------------|-----------|
|
||||
| Saturation | High (vibrant) |
|
||||
| Contrast | Medium-high |
|
||||
| Temperature | Variable, punchy |
|
||||
| Brightness | Bright, clean |
|
||||
|
||||
## Color Palette
|
||||
|
||||
Shift toward vibrant tones:
|
||||
|
||||
| Role | Color | Hex |
|
||||
|------|-------|-----|
|
||||
| Primary Red | Bright red | #F56565 |
|
||||
| Primary Yellow | Sunny yellow | #F6E05E |
|
||||
| Primary Blue | Sky blue | #63B3ED |
|
||||
| Accent 1 | Magenta | #D53F8C |
|
||||
| Accent 2 | Lime green | #68D391 |
|
||||
| Background | Clean white | #FFFFFF |
|
||||
| Background Alt | Bright pastels | Various |
|
||||
|
||||
## Lighting
|
||||
|
||||
- Bright, clear lighting
|
||||
- Clean shadows
|
||||
- High energy
|
||||
- Spotlight effects for emphasis
|
||||
- Dynamic light sources
|
||||
|
||||
## Dynamic Elements
|
||||
|
||||
**Energy effects** (add to compositions):
|
||||
|
||||
| Element | Usage |
|
||||
|---------|-------|
|
||||
| Speed lines | Motion, excitement |
|
||||
| Sparkles | Discoveries |
|
||||
| Burst effects | Aha moments |
|
||||
| Motion blur | Fast action |
|
||||
| Star bursts | Emphasis |
|
||||
| Sweat drops | Effort/surprise |
|
||||
|
||||
## Emotional Range
|
||||
|
||||
| Emotion | Expression |
|
||||
|---------|-----------|
|
||||
| Excitement | Wide eyes, big smile |
|
||||
| Surprise | Dramatic reaction |
|
||||
| Determination | Intense focus |
|
||||
| Wonder | Sparkling eyes |
|
||||
|
||||
## Composition
|
||||
|
||||
- Dynamic angles
|
||||
- Action-oriented layouts
|
||||
- Movement emphasis
|
||||
- Clean, punchy designs
|
||||
- Energy flows
|
||||
|
||||
## Visual Style
|
||||
|
||||
- Expressive, animated characters
|
||||
- Wide eyes, big reactions
|
||||
- Dynamic poses
|
||||
- Motion and action focus
|
||||
- Simplified backgrounds for energy
|
||||
|
||||
## Best For
|
||||
|
||||
- Science explanations
|
||||
- "Aha" moments
|
||||
- Young audience content
|
||||
- Discovery narratives
|
||||
- Learning adventures
|
||||
- Action tutorials
|
||||
|
||||
## Combination Notes
|
||||
|
||||
Works especially well with:
|
||||
- manga: shonen energy
|
||||
- chalk: fun education
|
||||
|
||||
Avoid with:
|
||||
- realistic: style mismatch
|
||||
- ink-brush: style mismatch
|
||||
@@ -0,0 +1,63 @@
|
||||
# neutral
|
||||
|
||||
中性基调 - Balanced, rational, educational
|
||||
|
||||
## Overview
|
||||
|
||||
Default balanced tone suitable for educational and informative content. Neither overly emotional nor cold - creates accessible, professional atmosphere.
|
||||
|
||||
## Mood Characteristics
|
||||
|
||||
- Balanced emotional register
|
||||
- Clear, rational presentation
|
||||
- Educational focus
|
||||
- Professional but approachable
|
||||
- Objective storytelling
|
||||
|
||||
## Color Modifiers
|
||||
|
||||
When applied to any art style:
|
||||
|
||||
| Adjustment | Direction |
|
||||
|------------|-----------|
|
||||
| Saturation | Standard (no shift) |
|
||||
| Contrast | Balanced |
|
||||
| Temperature | Neutral |
|
||||
| Brightness | Slightly bright |
|
||||
|
||||
## Lighting
|
||||
|
||||
- Even, clear lighting
|
||||
- Minimal dramatic shadows
|
||||
- Consistent across panels
|
||||
- Natural light sources
|
||||
- No extreme contrast
|
||||
|
||||
## Emotional Range
|
||||
|
||||
| Emotion | Expression Level |
|
||||
|---------|-----------------|
|
||||
| Joy | Moderate smile |
|
||||
| Concern | Thoughtful expression |
|
||||
| Surprise | Mild widening of eyes |
|
||||
| Frustration | Slight frown |
|
||||
|
||||
## Composition
|
||||
|
||||
- Balanced panel layouts
|
||||
- Clear focal points
|
||||
- Readable hierarchies
|
||||
- Standard framing
|
||||
- Functional compositions
|
||||
|
||||
## Best For
|
||||
|
||||
- Educational content
|
||||
- Technical tutorials
|
||||
- Informative biographies
|
||||
- Documentary style
|
||||
- Professional topics
|
||||
|
||||
## Usage Notes
|
||||
|
||||
Neutral is the default tone. Combine with any art style for baseline professional output. Most versatile tone option.
|
||||
@@ -0,0 +1,100 @@
|
||||
# romantic
|
||||
|
||||
浪漫基调 - Soft, beautiful, emotionally delicate
|
||||
|
||||
## Overview
|
||||
|
||||
Soft, dreamy atmosphere for romantic and emotionally delicate content. Features decorative elements, sparkles, and beautiful compositions that emphasize feeling and beauty.
|
||||
|
||||
## Mood Characteristics
|
||||
|
||||
- Romance and love
|
||||
- Beauty and elegance
|
||||
- Emotional delicacy
|
||||
- Dreams and hopes
|
||||
- Youth and idealism
|
||||
|
||||
## Color Modifiers
|
||||
|
||||
When applied to any art style:
|
||||
|
||||
| Adjustment | Direction |
|
||||
|------------|-----------|
|
||||
| Saturation | Soft pastels |
|
||||
| Contrast | Low, gentle |
|
||||
| Temperature | Slightly warm pink |
|
||||
| Brightness | Soft, glowing |
|
||||
|
||||
## Color Palette
|
||||
|
||||
Shift toward romantic tones:
|
||||
|
||||
| Role | Color | Hex |
|
||||
|------|-------|-----|
|
||||
| Primary | Soft pink | #FFB6C1 |
|
||||
| Secondary | Lavender | #E6E6FA |
|
||||
| Accent | Rose | #FF69B4 |
|
||||
| Highlight | Pearl white | #FFFAF0 |
|
||||
| Gold | Gold sparkle | #FFD700 |
|
||||
| Skin | Porcelain | #FFF5EE |
|
||||
| Blush | Soft blush | #FFE4E1 |
|
||||
| Background | Soft cream | #FFF8DC |
|
||||
|
||||
## Lighting
|
||||
|
||||
- Soft, diffused light
|
||||
- Glowing effects
|
||||
- Backlighting halos
|
||||
- Sparkle highlights
|
||||
- Dreamy atmospheres
|
||||
|
||||
## Decorative Elements
|
||||
|
||||
**Essential decorations** (add to compositions):
|
||||
|
||||
| Element | Usage |
|
||||
|---------|-------|
|
||||
| Flower petals | Floating, framing |
|
||||
| Sparkles | Emotional highlights |
|
||||
| Bubbles | Dreamy moments |
|
||||
| Feathers | Gentle floating |
|
||||
| Stars | Night scenes, wonder |
|
||||
| Hearts | Love emphasis |
|
||||
| Light halos | Character highlights |
|
||||
|
||||
## Emotional Range
|
||||
|
||||
| Emotion | Expression |
|
||||
|---------|-----------|
|
||||
| Love | Soft gaze, blush |
|
||||
| Longing | Distant, beautiful sadness |
|
||||
| Joy | Radiant smile, sparkles |
|
||||
| Shyness | Downcast eyes, blush |
|
||||
|
||||
## Composition
|
||||
|
||||
- Elegant, flowing layouts
|
||||
- Soft focus backgrounds
|
||||
- Characters framed by decorations
|
||||
- Beautiful angles (3/4 profiles)
|
||||
- Screen tone gradients
|
||||
|
||||
## Best For
|
||||
|
||||
- Romance stories
|
||||
- Coming-of-age
|
||||
- Friendship narratives
|
||||
- Emotional drama
|
||||
- School life
|
||||
- Beautiful moments
|
||||
|
||||
## Combination Notes
|
||||
|
||||
Works especially well with:
|
||||
- manga: classic shoujo style
|
||||
|
||||
Avoid with:
|
||||
- realistic: style mismatch
|
||||
- ink-brush: style mismatch
|
||||
- ligne-claire: style mismatch
|
||||
- chalk: style mismatch
|
||||
@@ -0,0 +1,104 @@
|
||||
# vintage
|
||||
|
||||
复古基调 - Historical, aged, period authenticity
|
||||
|
||||
## Overview
|
||||
|
||||
Historical atmosphere with aged paper effects and period-appropriate aesthetics. Creates sense of time, authenticity, and historical distance.
|
||||
|
||||
## Mood Characteristics
|
||||
|
||||
- Historical authenticity
|
||||
- Period distance
|
||||
- Archival quality
|
||||
- Time and memory
|
||||
- Classical elegance
|
||||
|
||||
## Color Modifiers
|
||||
|
||||
When applied to any art style:
|
||||
|
||||
| Adjustment | Direction |
|
||||
|------------|-----------|
|
||||
| Saturation | Reduced, muted |
|
||||
| Contrast | Medium, aged |
|
||||
| Temperature | Sepia shift |
|
||||
| Brightness | Slightly faded |
|
||||
|
||||
## Color Palette
|
||||
|
||||
Shift toward aged tones:
|
||||
|
||||
| Role | Color | Hex |
|
||||
|------|-------|-----|
|
||||
| Primary | Sepia brown | #8B7355 |
|
||||
| Background | Aged paper | #F5E6D3 |
|
||||
| Accent 1 | Faded teal | #6B8E8E |
|
||||
| Accent 2 | Muted burgundy | #7B3F3F |
|
||||
| Ink | Aged black | #3D3D3D |
|
||||
| Yellowed | Paper yellow | #F5DEB3 |
|
||||
|
||||
## Visual Effects
|
||||
|
||||
**Aging effects** (apply subtly):
|
||||
|
||||
| Effect | Application |
|
||||
|--------|-------------|
|
||||
| Paper aging | Background texture |
|
||||
| Faded edges | Vignette effect |
|
||||
| Dust specks | Subtle overlay |
|
||||
| Yellowing | Color shift |
|
||||
| Wear marks | Corner/edge details |
|
||||
|
||||
## Period Elements
|
||||
|
||||
- Historical typography
|
||||
- Period-accurate details
|
||||
- Archival presentation
|
||||
- Classical compositions
|
||||
- Formal framing
|
||||
|
||||
## Lighting
|
||||
|
||||
- Natural, period-appropriate
|
||||
- Oil lamp/candle warmth
|
||||
- Soft, diffused light
|
||||
- Indoor historical lighting
|
||||
- Photographic quality
|
||||
|
||||
## Emotional Range
|
||||
|
||||
| Emotion | Expression |
|
||||
|---------|-----------|
|
||||
| Dignity | Formal, composed |
|
||||
| Sorrow | Restrained, elegant |
|
||||
| Pride | Classical posture |
|
||||
| Wisdom | Aged grace |
|
||||
|
||||
## Composition
|
||||
|
||||
- Classical framing
|
||||
- Formal compositions
|
||||
- Period-appropriate staging
|
||||
- Documentary style
|
||||
- Historical accuracy priority
|
||||
|
||||
## Best For
|
||||
|
||||
- Pre-1950s stories
|
||||
- Classical science history
|
||||
- Historical biographies
|
||||
- Period pieces
|
||||
- Documentary comics
|
||||
- Archival narratives
|
||||
|
||||
## Combination Notes
|
||||
|
||||
Works especially well with:
|
||||
- realistic: period drama
|
||||
- ligne-claire: historical adventure
|
||||
- ink-brush: classical Asian stories
|
||||
|
||||
Avoid with:
|
||||
- manga: style mismatch (too modern)
|
||||
- chalk: style mismatch (modern educational)
|
||||
@@ -0,0 +1,94 @@
|
||||
# warm
|
||||
|
||||
温馨基调 - Nostalgic, personal, comforting
|
||||
|
||||
## Overview
|
||||
|
||||
Warm, inviting atmosphere for personal stories and nostalgic content. Creates emotional connection through cozy aesthetics and comforting visuals.
|
||||
|
||||
## Mood Characteristics
|
||||
|
||||
- Nostalgic feeling
|
||||
- Personal, intimate atmosphere
|
||||
- Comforting and healing
|
||||
- Memory and reflection
|
||||
- Gentle emotional warmth
|
||||
|
||||
## Color Modifiers
|
||||
|
||||
When applied to any art style:
|
||||
|
||||
| Adjustment | Direction |
|
||||
|------------|-----------|
|
||||
| Saturation | Slightly reduced |
|
||||
| Contrast | Softer |
|
||||
| Temperature | Warm shift (+15%) |
|
||||
| Brightness | Soft, golden |
|
||||
|
||||
## Color Temperature
|
||||
|
||||
Shift palette toward warm tones:
|
||||
|
||||
| Original | Warm Shift |
|
||||
|----------|-----------|
|
||||
| Cool blue | Soft teal |
|
||||
| Pure white | Cream |
|
||||
| Gray | Warm gray |
|
||||
| Black | Soft charcoal |
|
||||
|
||||
## Accent Colors
|
||||
|
||||
- Golden yellow (#D69E2E)
|
||||
- Soft orange (#DD6B20)
|
||||
- Warm brown (#8B6F47)
|
||||
- Sunset tones
|
||||
|
||||
## Lighting
|
||||
|
||||
- Golden hour lighting
|
||||
- Soft, diffused light
|
||||
- Warm indoor glow
|
||||
- Candle/lamp warmth
|
||||
- Gentle shadows
|
||||
|
||||
## Emotional Range
|
||||
|
||||
| Emotion | Expression |
|
||||
|---------|-----------|
|
||||
| Joy | Genuine warm smile |
|
||||
| Sadness | Gentle melancholy |
|
||||
| Love | Soft, tender expressions |
|
||||
| Memory | Distant, reflective gaze |
|
||||
|
||||
## Composition
|
||||
|
||||
- Intimate framing
|
||||
- Cozy environments
|
||||
- Soft focus backgrounds
|
||||
- Welcoming spaces
|
||||
- Personal moments highlighted
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Warm light rays
|
||||
- Soft edges
|
||||
- Nostalgic props (old photos, keepsakes)
|
||||
- Comfort objects (blankets, tea cups)
|
||||
- Nature elements (autumn leaves, sunset)
|
||||
|
||||
## Best For
|
||||
|
||||
- Personal stories
|
||||
- Childhood memories
|
||||
- Mentorship narratives
|
||||
- Family histories
|
||||
- Gentle biographies
|
||||
- Healing journeys
|
||||
|
||||
## Combination Notes
|
||||
|
||||
Works especially well with:
|
||||
- ligne-claire: nostalgic European comics
|
||||
- realistic: touching human stories
|
||||
- manga: slice-of-life warmth
|
||||
- chalk: nostalgic education
|
||||
@@ -0,0 +1,401 @@
|
||||
# Complete Workflow
|
||||
|
||||
Full workflow for generating knowledge comics.
|
||||
|
||||
## Progress Checklist
|
||||
|
||||
Copy and track progress:
|
||||
|
||||
```
|
||||
Comic Progress:
|
||||
- [ ] Step 1: Setup & Analyze
|
||||
- [ ] 1.1 Analyze content
|
||||
- [ ] 1.2 Check existing ⚠️ REQUIRED
|
||||
- [ ] Step 2: Confirmation - Style & options ⚠️ REQUIRED
|
||||
- [ ] Step 3: Generate storyboard + characters
|
||||
- [ ] Step 4: Review outline (conditional)
|
||||
- [ ] Step 5: Generate prompts
|
||||
- [ ] Step 6: Review prompts (conditional)
|
||||
- [ ] Step 7: Generate images
|
||||
- [ ] 7.1 Character sheet (if needed)
|
||||
- [ ] 7.2 Generate pages
|
||||
- [ ] Step 8: Completion report
|
||||
```
|
||||
|
||||
## Flow Diagram
|
||||
|
||||
```
|
||||
Input → Analyze → [Check Existing?] → [Confirm: Style + Reviews] → Storyboard → [Review Outline?] → Prompts → [Review Prompts?] → Images → Complete
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Setup & Analyze
|
||||
|
||||
### 1.1 Analyze Content → `analysis.md`
|
||||
|
||||
Read source content, save it if needed, and perform deep analysis.
|
||||
|
||||
**Actions**:
|
||||
1. **Save source content** (if not already a file):
|
||||
- If user provides a file path: use as-is
|
||||
- If user pastes content: save to `source-{slug}.md` in the target directory using `write_file`, where `{slug}` is the kebab-case topic slug used for the output directory
|
||||
- **Backup rule**: If `source-{slug}.md` already exists, rename it to `source-{slug}-backup-YYYYMMDD-HHMMSS.md` before writing
|
||||
2. Read source content
|
||||
3. **Deep analysis** following `analysis-framework.md`:
|
||||
- Target audience identification
|
||||
- Value proposition for readers
|
||||
- Core themes and narrative potential
|
||||
- Key figures and their story arcs
|
||||
4. Detect source language
|
||||
5. **Determine language**:
|
||||
- If user specified a language → use it
|
||||
- Else → use detected source language or user's conversation language
|
||||
6. Determine recommended page count:
|
||||
- Short story: 5-8 pages
|
||||
- Medium complexity: 9-15 pages
|
||||
- Full biography: 16-25 pages
|
||||
7. Analyze content signals for art/tone/layout recommendations
|
||||
8. **Save to `analysis.md`** using `write_file`
|
||||
|
||||
**analysis.md Format**: YAML front matter (title, topic, time_span, source_language, user_language, aspect_ratio, recommended_page_count, recommended_art, recommended_tone) + sections for Target Audience, Value Proposition, Core Themes, Key Figures & Story Arcs, Content Signals, Recommended Approaches. See `analysis-framework.md` for full template.
|
||||
|
||||
### 1.2 Check Existing Content ⚠️ REQUIRED
|
||||
|
||||
**MUST execute before proceeding to Step 2.**
|
||||
|
||||
Check if the output directory exists (e.g., via `test -d "comic/{topic-slug}"`).
|
||||
|
||||
**If directory exists**, use `clarify`:
|
||||
|
||||
```
|
||||
question: "Existing content found at comic/{topic-slug}. How to proceed?"
|
||||
options:
|
||||
- "Regenerate storyboard — Keep images, regenerate storyboard and characters only"
|
||||
- "Regenerate images — Keep storyboard, regenerate images only"
|
||||
- "Backup and regenerate — Backup to {slug}-backup-{timestamp}, then regenerate all"
|
||||
- "Exit — Cancel, keep existing content unchanged"
|
||||
```
|
||||
|
||||
Save result and handle accordingly:
|
||||
- **Regenerate storyboard**: Skip to Step 3, preserve `prompts/` and images
|
||||
- **Regenerate images**: Skip to Step 7, use existing prompts
|
||||
- **Backup and regenerate**: Move directory, start fresh from Step 2
|
||||
- **Exit**: End workflow immediately
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Confirmation - Style & Options ⚠️
|
||||
|
||||
**Purpose**: Select visual style + decide whether to review outline before generation. **Do NOT skip.**
|
||||
|
||||
**Display summary first**:
|
||||
- Content type + topic identified
|
||||
- Key figures extracted
|
||||
- Time span detected
|
||||
- Recommended page count
|
||||
- Language (detected or user-specified)
|
||||
- **Recommended style**: [art] + [tone] (based on content signals)
|
||||
|
||||
**Use `clarify` one question at a time**, in priority order:
|
||||
|
||||
> **Timeout handling (CRITICAL)**: if `clarify` returns `"The user did not provide a response within the time limit. Use your best judgement..."`, that is a per-question default, NOT blanket consent. Continue to the next question in the sequence — do not bail out of Step 2. Then, in your next user-visible message, explicitly surface every default that was taken (e.g. `"Defaulted style → ohmsha, narrative focus → concept explanation, audience → developers (clarify timed out on all three). Say the word to redirect."`). An unreported default is indistinguishable to the user from "the agent never asked."
|
||||
|
||||
### Question 1: Visual Style
|
||||
|
||||
If a preset is recommended (see `auto-selection.md`), show it first:
|
||||
|
||||
```
|
||||
question: "Which visual style for this comic?"
|
||||
options:
|
||||
- "[preset name] preset (Recommended) — [preset description] with special rules"
|
||||
- "[recommended art] + [recommended tone] (Recommended) — Best match for your content"
|
||||
- "ligne-claire + neutral — Classic educational, Logicomix style"
|
||||
- "ohmsha preset — Educational manga with visual metaphors, gadgets, NO talking heads"
|
||||
- "Custom — Specify your own art + tone or preset"
|
||||
```
|
||||
|
||||
**Preset vs Art+Tone**: Presets include special rules beyond art+tone. `ohmsha` = manga + neutral + visual metaphor rules + character roles + NO talking heads. Plain `manga + neutral` does NOT include these rules.
|
||||
|
||||
### Question 2: Narrative Focus
|
||||
|
||||
```
|
||||
question: "What should the comic emphasize? (Pick the primary focus; mention others in a follow-up if needed)"
|
||||
options:
|
||||
- "Biography/life story — Follow a person's journey through key life events"
|
||||
- "Concept explanation — Break down complex ideas visually"
|
||||
- "Historical event — Dramatize important historical moments"
|
||||
- "Tutorial/how-to — Step-by-step educational guide"
|
||||
```
|
||||
|
||||
### Question 3: Target Audience
|
||||
|
||||
```
|
||||
question: "Who is the primary reader?"
|
||||
options:
|
||||
- "General readers — Broad appeal, accessible content"
|
||||
- "Students/learners — Educational focus, clear explanations"
|
||||
- "Industry professionals — Technical depth, domain knowledge"
|
||||
- "Children/young readers — Simplified language, engaging visuals"
|
||||
```
|
||||
|
||||
### Question 4: Outline Review
|
||||
|
||||
```
|
||||
question: "Do you want to review the outline before image generation?"
|
||||
options:
|
||||
- "Yes, let me review (Recommended) — Review storyboard and characters before generating images"
|
||||
- "No, generate directly — Skip outline review, start generating immediately"
|
||||
```
|
||||
|
||||
### Question 5: Prompt Review
|
||||
|
||||
```
|
||||
question: "Review prompts before generating images?"
|
||||
options:
|
||||
- "Yes, review prompts (Recommended) — Review image generation prompts before generating"
|
||||
- "No, skip prompt review — Proceed directly to image generation"
|
||||
```
|
||||
|
||||
**After responses**:
|
||||
1. Update `analysis.md` with user preferences
|
||||
2. **Store `skip_outline_review`** flag based on Question 4 response
|
||||
3. **Store `skip_prompt_review`** flag based on Question 5 response
|
||||
4. → Step 3
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Generate Storyboard + Characters
|
||||
|
||||
Create storyboard and character definitions using the confirmed style from Step 2.
|
||||
|
||||
**Loading Style References**:
|
||||
- Art style: `art-styles/{art}.md`
|
||||
- Tone: `tones/{tone}.md`
|
||||
- If preset (ohmsha/wuxia/shoujo/concept-story/four-panel): also load `presets/{preset}.md`
|
||||
|
||||
**Generate**:
|
||||
|
||||
1. **Storyboard** (`storyboard.md`):
|
||||
- YAML front matter with art_style, tone, layout, aspect_ratio
|
||||
- Cover design
|
||||
- Each page: layout, panel breakdown, visual prompts
|
||||
- **Written in user's preferred language** (from Step 1)
|
||||
- Reference: `storyboard-template.md`
|
||||
- **If using preset**: Load and apply preset rules from `presets/`
|
||||
|
||||
2. **Character definitions** (`characters/characters.md`):
|
||||
- Visual specs matching the art style (in user's preferred language)
|
||||
- Include Reference Sheet Prompt for later image generation
|
||||
- Reference: `character-template.md`
|
||||
- **If using ohmsha preset**: Use default Doraemon characters (see below)
|
||||
|
||||
**Ohmsha Default Characters** (use these unless user specifies custom characters):
|
||||
|
||||
| Role | Character | Visual Description |
|
||||
|------|-----------|-------------------|
|
||||
| Student | 大雄 (Nobita) | Japanese boy, 10yo, round glasses, black hair parted in middle, yellow shirt, navy shorts |
|
||||
| Mentor | 哆啦 A 梦 (Doraemon) | Round blue robot cat, big white eyes, red nose, whiskers, white belly with 4D pocket, golden bell, no ears |
|
||||
| Challenge | 胖虎 (Gian) | Stocky boy, rough features, small eyes, orange shirt |
|
||||
| Support | 静香 (Shizuka) | Cute girl, black short hair, pink dress, gentle expression |
|
||||
|
||||
These are the canonical ohmsha-style characters. Do NOT create custom characters for ohmsha unless explicitly requested.
|
||||
|
||||
**After generation**:
|
||||
- If `skip_outline_review` is true → Skip Step 4, go directly to Step 5
|
||||
- If `skip_outline_review` is false → Continue to Step 4
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Review Outline (Conditional)
|
||||
|
||||
**Skip this step** if user selected "No, generate directly" in Step 2.
|
||||
|
||||
**Purpose**: User reviews and confirms storyboard + characters before generation.
|
||||
|
||||
**Display**:
|
||||
- Page count and structure
|
||||
- Art style + Tone combination
|
||||
- Page-by-page summary (Cover → P1 → P2...)
|
||||
- Character list with brief descriptions
|
||||
|
||||
**Use `clarify`**:
|
||||
|
||||
```
|
||||
question: "Ready to generate images with this outline?"
|
||||
options:
|
||||
- "Yes, proceed (Recommended) — Generate character sheet and comic pages"
|
||||
- "Edit storyboard first — I'll modify storyboard.md before continuing"
|
||||
- "Edit characters first — I'll modify characters/characters.md before continuing"
|
||||
- "Edit both — I'll modify both files before continuing"
|
||||
```
|
||||
|
||||
**After response**:
|
||||
1. If user wants to edit → Wait for user to finish editing, then ask again
|
||||
2. If user confirms → Continue to Step 5
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Generate Prompts
|
||||
|
||||
Create image generation prompts for all pages.
|
||||
|
||||
**Style Reference Loading**:
|
||||
- Read `art-styles/{art}.md` for rendering guidelines
|
||||
- Read `tones/{tone}.md` for mood/color adjustments
|
||||
- If preset: Read `presets/{preset}.md` for special rules
|
||||
|
||||
**For each page (cover + pages)**:
|
||||
1. Create prompt following art style + tone guidelines
|
||||
2. **Embed character descriptions** inline (copy relevant traits from `characters/characters.md`) — `image_generate` is prompt-only, so the prompt text is the sole vehicle for character consistency
|
||||
3. Save to `prompts/NN-{cover|page}-[slug].md` using `write_file`
|
||||
- **Backup rule**: If prompt file exists, rename to `prompts/NN-{cover|page}-[slug]-backup-YYYYMMDD-HHMMSS.md`
|
||||
|
||||
**Prompt File Format**:
|
||||
```markdown
|
||||
# Page NN: [Title]
|
||||
|
||||
## Visual Style
|
||||
Art: [art style] | Tone: [tone] | Layout: [layout type]
|
||||
|
||||
## Character Reference (embedded inline — maintain exact traits below)
|
||||
- [Character A]: [detailed visual traits from characters/characters.md]
|
||||
- [Character B]: [detailed visual traits from characters/characters.md]
|
||||
|
||||
## Panel Breakdown
|
||||
[From storyboard.md - panel descriptions, actions, dialogue]
|
||||
|
||||
## Generation Prompt
|
||||
[Combined prompt passed to image_generate]
|
||||
```
|
||||
|
||||
**After generation**:
|
||||
- If `skip_prompt_review` is true → Skip Step 6, go directly to Step 7
|
||||
- If `skip_prompt_review` is false → Continue to Step 6
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Review Prompts (Conditional)
|
||||
|
||||
**Skip this step** if user selected "No, skip prompt review" in Step 2.
|
||||
|
||||
**Purpose**: User reviews and confirms prompts before image generation.
|
||||
|
||||
**Display prompt summary table**:
|
||||
|
||||
| Page | Title | Key Elements |
|
||||
|------|-------|--------------|
|
||||
| Cover | [title] | [main visual] |
|
||||
| P1 | [title] | [key elements] |
|
||||
| ... | ... | ... |
|
||||
|
||||
**Use `clarify`**:
|
||||
|
||||
```
|
||||
question: "Ready to generate images with these prompts?"
|
||||
options:
|
||||
- "Yes, proceed (Recommended) — Generate all comic page images"
|
||||
- "Edit prompts first — I'll modify prompts/*.md before continuing"
|
||||
- "Regenerate prompts — Regenerate all prompts with different approach"
|
||||
```
|
||||
|
||||
**After response**:
|
||||
1. If user wants to edit → Wait for user to finish editing, then ask again
|
||||
2. If user wants to regenerate → Go back to Step 5
|
||||
3. If user confirms → Continue to Step 7
|
||||
|
||||
---
|
||||
|
||||
## Step 7: Generate Images
|
||||
|
||||
With confirmed prompts from Step 5/6, use the `image_generate` tool. The tool accepts only `prompt` and `aspect_ratio` (`landscape` | `portrait` | `square`) and **returns a URL** — it does not accept reference images and does not write local files. Every invocation must be followed by a download step.
|
||||
|
||||
**Aspect ratio mapping** — map the storyboard's `aspect_ratio` to the tool's enum:
|
||||
|
||||
| Storyboard ratio | `image_generate` format |
|
||||
|------------------|-------------------------|
|
||||
| `3:4`, `9:16`, `2:3` | `portrait` |
|
||||
| `4:3`, `16:9`, `3:2` | `landscape` |
|
||||
| `1:1` | `square` |
|
||||
|
||||
**Download procedure** (run after every successful `image_generate` call):
|
||||
|
||||
1. Extract the `url` field from the tool result
|
||||
2. Fetch it to disk, e.g. `curl -fsSL "<url>" -o comic/{slug}/<target>.png`
|
||||
3. Verify the file is non-empty (`test -s <target>.png`); on failure, retry the generation once
|
||||
|
||||
### 7.1 Generate Character Reference Sheet (conditional)
|
||||
|
||||
Character sheet is recommended for multi-page comics with recurring characters, but **NOT required** for all presets.
|
||||
|
||||
**When to generate**:
|
||||
|
||||
| Condition | Action |
|
||||
|-----------|--------|
|
||||
| Multi-page comic with detailed/recurring characters | Generate character sheet (recommended) |
|
||||
| Preset with simplified characters (e.g., four-panel minimalist) | Skip — prompt descriptions are sufficient |
|
||||
| Single-page comic | Skip unless characters are complex |
|
||||
|
||||
**When generating**:
|
||||
1. Use Reference Sheet Prompt from `characters/characters.md`
|
||||
2. **Backup rule**: If `characters/characters.png` exists, rename to `characters/characters-backup-YYYYMMDD-HHMMSS.png`
|
||||
3. Call `image_generate` with `landscape` format
|
||||
4. Download the returned URL → save to `characters/characters.png`
|
||||
|
||||
**Important**: the downloaded sheet is a **human-facing review artifact** (so the user can visually verify character design) and a reference for later regenerations or manual prompt edits. It does **not** drive Step 7.2 — page prompts were already written in Step 5 from the text descriptions in `characters/characters.md`. `image_generate` cannot accept images as visual input, so the text is the sole cross-page consistency mechanism.
|
||||
|
||||
### 7.2 Generate Comic Pages
|
||||
|
||||
**Before generating any page**:
|
||||
1. Confirm each prompt file exists at `prompts/NN-{cover|page}-[slug].md`
|
||||
2. Confirm that each prompt has character descriptions embedded inline (see Step 5). `image_generate` is prompt-only, so the prompt text is the sole consistency mechanism.
|
||||
|
||||
**Page Generation Strategy**: every page prompt must embed character descriptions (sourced from `characters/characters.md`) inline. This is done during Step 5, uniformly whether or not the PNG sheet was produced in 7.1 — the PNG is only a review/regeneration aid, never a generation input.
|
||||
|
||||
**Example embedded prompt** (`prompts/01-page-xxx.md`):
|
||||
|
||||
```markdown
|
||||
# Page 01: [Title]
|
||||
|
||||
## Character Reference (embedded inline — maintain consistency)
|
||||
- 大雄:Japanese boy, round glasses, yellow shirt, navy shorts, worried expression...
|
||||
- 哆啦 A 梦:Round blue robot cat, white belly, red nose, golden bell, 4D pocket...
|
||||
|
||||
## Page Content
|
||||
[Original page prompt body — panels, dialogue, visual metaphors]
|
||||
```
|
||||
|
||||
**For each page (cover + pages)**:
|
||||
1. Read prompt from `prompts/NN-{cover|page}-[slug].md`
|
||||
2. **Backup rule**: If image file exists, rename to `NN-{cover|page}-[slug]-backup-YYYYMMDD-HHMMSS.png`
|
||||
3. Call `image_generate` with the prompt text and mapped aspect ratio
|
||||
4. Download the returned URL → save to `NN-{cover|page}-[slug].png`
|
||||
5. Report progress after each generation: "Generated X/N: [page title]"
|
||||
|
||||
---
|
||||
|
||||
## Step 8: Completion Report
|
||||
|
||||
```
|
||||
Comic Complete!
|
||||
Title: [title] | Art: [art] | Tone: [tone] | Pages: [count] | Aspect: [ratio] | Language: [lang]
|
||||
Location: [path]
|
||||
✓ source-{slug}.md (if content was pasted)
|
||||
✓ analysis.md
|
||||
✓ characters.png (if generated)
|
||||
✓ 00-cover-[slug].png ... NN-page-[slug].png
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Page Modification
|
||||
|
||||
| Action | Steps |
|
||||
|--------|-------|
|
||||
| **Edit** | Update prompt → Regenerate image → Download new PNG |
|
||||
| **Add** | Create prompt at position → Generate image → Download PNG → Renumber subsequent (NN+1) → Update storyboard |
|
||||
| **Delete** | Remove files → Renumber subsequent (NN-1) → Update storyboard |
|
||||
|
||||
**File naming**: `NN-{cover|page}-[slug].png` (e.g., `03-page-enigma-machine.png`)
|
||||
- Slugs: kebab-case, unique, derived from content
|
||||
- Renumbering: Update NN prefix only, slugs unchanged
|
||||
@@ -0,0 +1,43 @@
|
||||
# Port Notes — baoyu-infographic
|
||||
|
||||
Ported from [JimLiu/baoyu-skills](https://github.com/JimLiu/baoyu-skills) v1.56.1.
|
||||
|
||||
## Changes from upstream
|
||||
|
||||
Only `SKILL.md` was modified. All 45 reference files are verbatim copies.
|
||||
|
||||
### SKILL.md adaptations
|
||||
|
||||
| Change | Upstream | Hermes |
|
||||
|--------|----------|--------|
|
||||
| Metadata namespace | `openclaw` | `hermes` |
|
||||
| Trigger | `/baoyu-infographic` slash command | Natural language skill matching |
|
||||
| User config | EXTEND.md file (project/user/XDG paths) | Removed — not part of Hermes infra |
|
||||
| User prompts | `AskUserQuestion` (batched) | `clarify` tool (one at a time) |
|
||||
| Image generation | baoyu-imagine (Bun/TypeScript) | `image_generate` tool |
|
||||
| Platform support | Linux/macOS/Windows/WSL/PowerShell | Linux/macOS only |
|
||||
| File operations | Bash commands | Hermes file tools (write_file, read_file) |
|
||||
|
||||
### What was preserved
|
||||
|
||||
- All layout definitions (21 files)
|
||||
- All style definitions (21 files)
|
||||
- Core reference files (analysis-framework, base-prompt, structured-content-template)
|
||||
- Recommended combinations table
|
||||
- Keyword shortcuts table
|
||||
- Core principles and workflow structure
|
||||
- Author, version, homepage attribution
|
||||
|
||||
## Syncing with upstream
|
||||
|
||||
To pull upstream updates:
|
||||
```bash
|
||||
# Compare versions
|
||||
curl -sL https://raw.githubusercontent.com/JimLiu/baoyu-skills/main/skills/baoyu-infographic/SKILL.md | head -5
|
||||
# Look for version: line
|
||||
|
||||
# Diff reference files
|
||||
diff <(curl -sL https://raw.githubusercontent.com/.../references/layouts/bento-grid.md) references/layouts/bento-grid.md
|
||||
```
|
||||
|
||||
Reference files can be overwritten directly (they're unchanged from upstream). SKILL.md must be manually merged since it contains Hermes-specific adaptations.
|
||||
@@ -0,0 +1,236 @@
|
||||
---
|
||||
name: baoyu-infographic
|
||||
description: Generate professional infographics with 21 layout types and 21 visual styles. Analyzes content, recommends layout×style combinations, and generates publication-ready infographics. Use when user asks to create "infographic", "visual summary", "信息图", "可视化", or "高密度信息大图".
|
||||
version: 1.56.1
|
||||
author: 宝玉 (JimLiu)
|
||||
license: MIT
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [infographic, visual-summary, creative, image-generation]
|
||||
homepage: https://github.com/JimLiu/baoyu-skills#baoyu-infographic
|
||||
---
|
||||
|
||||
# Infographic Generator
|
||||
|
||||
Adapted from [baoyu-infographic](https://github.com/JimLiu/baoyu-skills) for Hermes Agent's tool ecosystem.
|
||||
|
||||
Two dimensions: **layout** (information structure) × **style** (visual aesthetics). Freely combine any layout with any style.
|
||||
|
||||
## When to Use
|
||||
|
||||
Trigger this skill when the user asks to create an infographic, visual summary, information graphic, or uses terms like "信息图", "可视化", or "高密度信息大图". The user provides content (text, file path, URL, or topic) and optionally specifies layout, style, aspect ratio, or language.
|
||||
|
||||
## Options
|
||||
|
||||
| Option | Values |
|
||||
|--------|--------|
|
||||
| Layout | 21 options (see Layout Gallery), default: bento-grid |
|
||||
| Style | 21 options (see Style Gallery), default: craft-handmade |
|
||||
| Aspect | Named: landscape (16:9), portrait (9:16), square (1:1). Custom: any W:H ratio (e.g., 3:4, 4:3, 2.35:1) |
|
||||
| Language | en, zh, ja, etc. |
|
||||
|
||||
## Layout Gallery
|
||||
|
||||
| Layout | Best For |
|
||||
|--------|----------|
|
||||
| `linear-progression` | Timelines, processes, tutorials |
|
||||
| `binary-comparison` | A vs B, before-after, pros-cons |
|
||||
| `comparison-matrix` | Multi-factor comparisons |
|
||||
| `hierarchical-layers` | Pyramids, priority levels |
|
||||
| `tree-branching` | Categories, taxonomies |
|
||||
| `hub-spoke` | Central concept with related items |
|
||||
| `structural-breakdown` | Exploded views, cross-sections |
|
||||
| `bento-grid` | Multiple topics, overview (default) |
|
||||
| `iceberg` | Surface vs hidden aspects |
|
||||
| `bridge` | Problem-solution |
|
||||
| `funnel` | Conversion, filtering |
|
||||
| `isometric-map` | Spatial relationships |
|
||||
| `dashboard` | Metrics, KPIs |
|
||||
| `periodic-table` | Categorized collections |
|
||||
| `comic-strip` | Narratives, sequences |
|
||||
| `story-mountain` | Plot structure, tension arcs |
|
||||
| `jigsaw` | Interconnected parts |
|
||||
| `venn-diagram` | Overlapping concepts |
|
||||
| `winding-roadmap` | Journey, milestones |
|
||||
| `circular-flow` | Cycles, recurring processes |
|
||||
| `dense-modules` | High-density modules, data-rich guides |
|
||||
|
||||
Full definitions: `references/layouts/<layout>.md`
|
||||
|
||||
## Style Gallery
|
||||
|
||||
| Style | Description |
|
||||
|-------|-------------|
|
||||
| `craft-handmade` | Hand-drawn, paper craft (default) |
|
||||
| `claymation` | 3D clay figures, stop-motion |
|
||||
| `kawaii` | Japanese cute, pastels |
|
||||
| `storybook-watercolor` | Soft painted, whimsical |
|
||||
| `chalkboard` | Chalk on black board |
|
||||
| `cyberpunk-neon` | Neon glow, futuristic |
|
||||
| `bold-graphic` | Comic style, halftone |
|
||||
| `aged-academia` | Vintage science, sepia |
|
||||
| `corporate-memphis` | Flat vector, vibrant |
|
||||
| `technical-schematic` | Blueprint, engineering |
|
||||
| `origami` | Folded paper, geometric |
|
||||
| `pixel-art` | Retro 8-bit |
|
||||
| `ui-wireframe` | Grayscale interface mockup |
|
||||
| `subway-map` | Transit diagram |
|
||||
| `ikea-manual` | Minimal line art |
|
||||
| `knolling` | Organized flat-lay |
|
||||
| `lego-brick` | Toy brick construction |
|
||||
| `pop-laboratory` | Blueprint grid, coordinate markers, lab precision |
|
||||
| `morandi-journal` | Hand-drawn doodle, warm Morandi tones |
|
||||
| `retro-pop-grid` | 1970s retro pop art, Swiss grid, thick outlines |
|
||||
| `hand-drawn-edu` | Macaron pastels, hand-drawn wobble, stick figures |
|
||||
|
||||
Full definitions: `references/styles/<style>.md`
|
||||
|
||||
## Recommended Combinations
|
||||
|
||||
| Content Type | Layout + Style |
|
||||
|--------------|----------------|
|
||||
| Timeline/History | `linear-progression` + `craft-handmade` |
|
||||
| Step-by-step | `linear-progression` + `ikea-manual` |
|
||||
| A vs B | `binary-comparison` + `corporate-memphis` |
|
||||
| Hierarchy | `hierarchical-layers` + `craft-handmade` |
|
||||
| Overlap | `venn-diagram` + `craft-handmade` |
|
||||
| Conversion | `funnel` + `corporate-memphis` |
|
||||
| Cycles | `circular-flow` + `craft-handmade` |
|
||||
| Technical | `structural-breakdown` + `technical-schematic` |
|
||||
| Metrics | `dashboard` + `corporate-memphis` |
|
||||
| Educational | `bento-grid` + `chalkboard` |
|
||||
| Journey | `winding-roadmap` + `storybook-watercolor` |
|
||||
| Categories | `periodic-table` + `bold-graphic` |
|
||||
| Product Guide | `dense-modules` + `morandi-journal` |
|
||||
| Technical Guide | `dense-modules` + `pop-laboratory` |
|
||||
| Trendy Guide | `dense-modules` + `retro-pop-grid` |
|
||||
| Educational Diagram | `hub-spoke` + `hand-drawn-edu` |
|
||||
| Process Tutorial | `linear-progression` + `hand-drawn-edu` |
|
||||
|
||||
Default: `bento-grid` + `craft-handmade`
|
||||
|
||||
## Keyword Shortcuts
|
||||
|
||||
When user input contains these keywords, **auto-select** the associated layout and offer associated styles as top recommendations in Step 3. Skip content-based layout inference for matched keywords.
|
||||
|
||||
If a shortcut has **Prompt Notes**, append them to the generated prompt (Step 5) as additional style instructions.
|
||||
|
||||
| User Keyword | Layout | Recommended Styles | Default Aspect | Prompt Notes |
|
||||
|--------------|--------|--------------------|----------------|--------------|
|
||||
| 高密度信息大图 / high-density-info | `dense-modules` | `morandi-journal`, `pop-laboratory`, `retro-pop-grid` | portrait | — |
|
||||
| 信息图 / infographic | `bento-grid` | `craft-handmade` | landscape | Minimalist: clean canvas, ample whitespace, no complex background textures. Simple cartoon elements and icons only. |
|
||||
|
||||
## Output Structure
|
||||
|
||||
```
|
||||
infographic/{topic-slug}/
|
||||
├── source-{slug}.{ext}
|
||||
├── analysis.md
|
||||
├── structured-content.md
|
||||
├── prompts/infographic.md
|
||||
└── infographic.png
|
||||
```
|
||||
|
||||
Slug: 2-4 words kebab-case from topic. Conflict: append `-YYYYMMDD-HHMMSS`.
|
||||
|
||||
## Core Principles
|
||||
|
||||
- Preserve source data faithfully — no summarization or rephrasing (but **strip any credentials, API keys, tokens, or secrets** before including in outputs)
|
||||
- Define learning objectives before structuring content
|
||||
- Structure for visual communication (headlines, labels, visual elements)
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: Analyze Content
|
||||
|
||||
**Load references**: Read `references/analysis-framework.md` from this skill.
|
||||
|
||||
1. Save source content (file path or paste → `source.md` using `write_file`)
|
||||
- **Backup rule**: If `source.md` exists, rename to `source-backup-YYYYMMDD-HHMMSS.md`
|
||||
2. Analyze: topic, data type, complexity, tone, audience
|
||||
3. Detect source language and user language
|
||||
4. Extract design instructions from user input
|
||||
5. Save analysis to `analysis.md`
|
||||
- **Backup rule**: If `analysis.md` exists, rename to `analysis-backup-YYYYMMDD-HHMMSS.md`
|
||||
|
||||
See `references/analysis-framework.md` for detailed format.
|
||||
|
||||
### Step 2: Generate Structured Content → `structured-content.md`
|
||||
|
||||
Transform content into infographic structure:
|
||||
1. Title and learning objectives
|
||||
2. Sections with: key concept, content (verbatim), visual element, text labels
|
||||
3. Data points (all statistics/quotes copied exactly)
|
||||
4. Design instructions from user
|
||||
|
||||
**Rules**: Markdown only. No new information. Preserve data faithfully. Strip any credentials or secrets from output.
|
||||
|
||||
See `references/structured-content-template.md` for detailed format.
|
||||
|
||||
### Step 3: Recommend Combinations
|
||||
|
||||
**3.1 Check Keyword Shortcuts first**: If user input matches a keyword from the **Keyword Shortcuts** table, auto-select the associated layout and prioritize associated styles as top recommendations. Skip content-based layout inference.
|
||||
|
||||
**3.2 Otherwise**, recommend 3-5 layout×style combinations based on:
|
||||
- Data structure → matching layout
|
||||
- Content tone → matching style
|
||||
- Audience expectations
|
||||
- User design instructions
|
||||
|
||||
### Step 4: Confirm Options
|
||||
|
||||
Use the `clarify` tool to confirm options with the user. Since `clarify` handles one question at a time, ask the most important question first:
|
||||
|
||||
**Q1 — Combination**: Present 3+ layout×style combos with rationale. Ask user to pick one.
|
||||
|
||||
**Q2 — Aspect**: Ask for aspect ratio preference (landscape/portrait/square or custom W:H).
|
||||
|
||||
**Q3 — Language** (only if source ≠ user language): Ask which language the text content should use.
|
||||
|
||||
### Step 5: Generate Prompt → `prompts/infographic.md`
|
||||
|
||||
**Backup rule**: If `prompts/infographic.md` exists, rename to `prompts/infographic-backup-YYYYMMDD-HHMMSS.md`
|
||||
|
||||
**Load references**: Read the selected layout from `references/layouts/<layout>.md` and style from `references/styles/<style>.md`.
|
||||
|
||||
Combine:
|
||||
1. Layout definition from `references/layouts/<layout>.md`
|
||||
2. Style definition from `references/styles/<style>.md`
|
||||
3. Base template from `references/base-prompt.md`
|
||||
4. Structured content from Step 2
|
||||
5. All text in confirmed language
|
||||
|
||||
**Aspect ratio resolution** for `{{ASPECT_RATIO}}`:
|
||||
- Named presets → ratio string: landscape→`16:9`, portrait→`9:16`, square→`1:1`
|
||||
- Custom W:H ratios → use as-is (e.g., `3:4`, `4:3`, `2.35:1`)
|
||||
|
||||
Save the assembled prompt to `prompts/infographic.md` using `write_file`.
|
||||
|
||||
### Step 6: Generate Image
|
||||
|
||||
Use the `image_generate` tool with the assembled prompt from Step 5.
|
||||
|
||||
- Map aspect ratio to image_generate's format: `16:9` → `landscape`, `9:16` → `portrait`, `1:1` → `square`
|
||||
- For custom ratios, pick the closest named aspect
|
||||
- On failure, auto-retry once
|
||||
- Save the resulting image URL/path to the output directory
|
||||
|
||||
### Step 7: Output Summary
|
||||
|
||||
Report: topic, layout, style, aspect, language, output path, files created.
|
||||
|
||||
## References
|
||||
|
||||
- `references/analysis-framework.md` — Analysis methodology
|
||||
- `references/structured-content-template.md` — Content format
|
||||
- `references/base-prompt.md` — Prompt template
|
||||
- `references/layouts/<layout>.md` — 21 layout definitions
|
||||
- `references/styles/<style>.md` — 21 style definitions
|
||||
|
||||
## Pitfalls
|
||||
|
||||
1. **Data integrity is paramount** — never summarize, paraphrase, or alter source statistics. "73% increase" must stay "73% increase", not "significant increase".
|
||||
2. **Strip secrets** — always scan source content for API keys, tokens, or credentials before including in any output file.
|
||||
3. **One message per section** — each infographic section should convey one clear concept. Overloading sections reduces readability.
|
||||
4. **Style consistency** — the style definition from the references file must be applied consistently across the entire infographic. Don't mix styles.
|
||||
5. **image_generate aspect ratios** — the tool only supports `landscape`, `portrait`, and `square`. Custom ratios like `3:4` should map to the nearest option (portrait in that case).
|
||||
+182
@@ -0,0 +1,182 @@
|
||||
# Infographic Content Analysis Framework
|
||||
|
||||
Deep analysis framework applying instructional design principles to infographic creation.
|
||||
|
||||
## Purpose
|
||||
|
||||
Before creating an infographic, thoroughly analyze the source material to:
|
||||
- Understand the content at a deep level
|
||||
- Identify clear learning objectives for the viewer
|
||||
- Structure information for maximum clarity and retention
|
||||
- Match content to optimal layout×style combinations
|
||||
- Preserve all source data verbatim
|
||||
|
||||
## Instructional Design Mindset
|
||||
|
||||
Approach content analysis as a **world-class instructional designer**:
|
||||
|
||||
| Principle | Application |
|
||||
|-----------|-------------|
|
||||
| **Deep Understanding** | Read the entire document before analyzing any part |
|
||||
| **Learner-Centered** | Focus on what the viewer needs to understand |
|
||||
| **Visual Storytelling** | Use visuals to communicate, not just decorate |
|
||||
| **Cognitive Load** | Simplify complex ideas without losing accuracy |
|
||||
| **Data Integrity** | Never alter, summarize, or paraphrase source facts |
|
||||
|
||||
## Analysis Dimensions
|
||||
|
||||
### 1. Content Type Classification
|
||||
|
||||
| Type | Characteristics | Best Layout | Best Style |
|
||||
|------|-----------------|-------------|------------|
|
||||
| **Timeline/History** | Sequential events, dates, progression | linear-progression | craft-handmade, aged-academia |
|
||||
| **Process/Tutorial** | Step-by-step instructions, how-to | linear-progression, winding-roadmap | ikea-manual, technical-schematic |
|
||||
| **Comparison** | A vs B, pros/cons, before-after | binary-comparison, comparison-matrix | corporate-memphis, bold-graphic |
|
||||
| **Hierarchy** | Levels, priorities, pyramids | hierarchical-layers, tree-branching | craft-handmade, corporate-memphis |
|
||||
| **Relationships** | Connections, overlaps, influences | venn-diagram, hub-spoke, jigsaw | craft-handmade, subway-map |
|
||||
| **Data/Metrics** | Statistics, KPIs, measurements | dashboard, periodic-table | corporate-memphis, technical-schematic |
|
||||
| **Cycle/Loop** | Recurring processes, feedback loops | circular-flow | craft-handmade, technical-schematic |
|
||||
| **System/Structure** | Components, architecture, anatomy | structural-breakdown, bento-grid | technical-schematic, ikea-manual |
|
||||
| **Journey/Narrative** | Stories, user flows, milestones | winding-roadmap, story-mountain | storybook-watercolor, comic-strip |
|
||||
| **Overview/Summary** | Multiple topics, feature highlights | bento-grid, periodic-table, dense-modules | chalkboard, bold-graphic |
|
||||
| **Product/Buying Guide** | Multi-dimension comparisons, specs, pitfalls | dense-modules | morandi-journal, pop-laboratory, retro-pop-grid |
|
||||
|
||||
### 2. Learning Objective Identification
|
||||
|
||||
Every infographic should have 1-3 clear learning objectives.
|
||||
|
||||
**Good Learning Objectives**:
|
||||
- Specific and measurable
|
||||
- Focus on what the viewer will understand, not just see
|
||||
- Written from the viewer's perspective
|
||||
|
||||
**Format**: "After viewing this infographic, the viewer will understand..."
|
||||
|
||||
| Content Aspect | Objective Type |
|
||||
|----------------|----------------|
|
||||
| Core concept | "...what [topic] is and why it matters" |
|
||||
| Process | "...how to [accomplish something]" |
|
||||
| Comparison | "...the key differences between [A] and [B]" |
|
||||
| Relationships | "...how [elements] connect to each other" |
|
||||
| Data | "...the significance of [key statistics]" |
|
||||
|
||||
### 3. Audience Analysis
|
||||
|
||||
| Factor | Questions | Impact |
|
||||
|--------|-----------|--------|
|
||||
| **Knowledge Level** | What do they already know? | Determines complexity depth |
|
||||
| **Context** | Why are they viewing this? | Determines emphasis points |
|
||||
| **Expectations** | What do they hope to learn? | Determines success criteria |
|
||||
| **Visual Preferences** | Professional, playful, technical? | Influences style choice |
|
||||
|
||||
### 4. Complexity Assessment
|
||||
|
||||
| Level | Indicators | Layout Recommendation |
|
||||
|-------|------------|----------------------|
|
||||
| **Simple** (3-5 points) | Few main concepts, clear relationships | sparse layouts, single focus |
|
||||
| **Moderate** (6-8 points) | Multiple concepts, some relationships | balanced layouts, clear sections |
|
||||
| **Complex** (9+ points) | Many concepts, intricate relationships | dense layouts, multiple sections |
|
||||
|
||||
### 5. Visual Opportunity Mapping
|
||||
|
||||
Identify what can be shown rather than told:
|
||||
|
||||
| Content Element | Visual Treatment |
|
||||
|-----------------|------------------|
|
||||
| Numbers/Statistics | Large, highlighted numerals |
|
||||
| Comparisons | Side-by-side, split screen |
|
||||
| Processes | Arrows, numbered steps, flow |
|
||||
| Hierarchies | Pyramids, layers, size differences |
|
||||
| Relationships | Lines, connections, overlapping shapes |
|
||||
| Categories | Color coding, grouping, sections |
|
||||
| Timelines | Horizontal/vertical progression |
|
||||
| Quotes | Callout boxes, quotation marks |
|
||||
|
||||
### 6. Data Verbatim Extraction
|
||||
|
||||
**Critical**: All factual information must be preserved exactly as written in the source.
|
||||
|
||||
| Data Type | Handling Rule |
|
||||
|-----------|---------------|
|
||||
| **Statistics** | Copy exactly: "73%" not "about 70%" |
|
||||
| **Quotes** | Copy word-for-word with attribution |
|
||||
| **Names** | Preserve exact spelling |
|
||||
| **Dates** | Keep original format |
|
||||
| **Technical Terms** | Do not simplify or substitute |
|
||||
| **Lists** | Preserve order and wording |
|
||||
|
||||
**Never**:
|
||||
- Round numbers
|
||||
- Paraphrase quotes
|
||||
- Substitute simpler words
|
||||
- Add implied information
|
||||
- Remove context that affects meaning
|
||||
|
||||
## Output Format
|
||||
|
||||
Save analysis results to `analysis.md`:
|
||||
|
||||
```yaml
|
||||
---
|
||||
title: "[Main topic title]"
|
||||
topic: "[educational/technical/business/creative/etc.]"
|
||||
data_type: "[timeline/hierarchy/comparison/process/etc.]"
|
||||
complexity: "[simple/moderate/complex]"
|
||||
point_count: [number of main points]
|
||||
source_language: "[detected language]"
|
||||
user_language: "[user's language]"
|
||||
---
|
||||
|
||||
## Main Topic
|
||||
[1-2 sentence summary of what this content is about]
|
||||
|
||||
## Learning Objectives
|
||||
After viewing this infographic, the viewer should understand:
|
||||
1. [Primary objective]
|
||||
2. [Secondary objective]
|
||||
3. [Tertiary objective if applicable]
|
||||
|
||||
## Target Audience
|
||||
- **Knowledge Level**: [Beginner/Intermediate/Expert]
|
||||
- **Context**: [Why they're viewing this]
|
||||
- **Expectations**: [What they hope to learn]
|
||||
|
||||
## Content Type Analysis
|
||||
- **Data Structure**: [How information relates to itself]
|
||||
- **Key Relationships**: [What connects to what]
|
||||
- **Visual Opportunities**: [What can be shown rather than told]
|
||||
|
||||
## Key Data Points (Verbatim)
|
||||
[All statistics, quotes, and critical facts exactly as they appear in source]
|
||||
- "[Exact data point 1]"
|
||||
- "[Exact data point 2]"
|
||||
- "[Exact quote with attribution]"
|
||||
|
||||
## Layout × Style Signals
|
||||
- Content type: [type] → suggests [layout]
|
||||
- Tone: [tone] → suggests [style]
|
||||
- Audience: [audience] → suggests [style]
|
||||
- Complexity: [level] → suggests [layout density]
|
||||
|
||||
## Design Instructions (from user input)
|
||||
[Any style, color, layout, or visual preferences extracted from user's steering prompt]
|
||||
|
||||
## Recommended Combinations
|
||||
1. **[Layout] + [Style]** (Recommended): [Brief rationale]
|
||||
2. **[Layout] + [Style]**: [Brief rationale]
|
||||
3. **[Layout] + [Style]**: [Brief rationale]
|
||||
```
|
||||
|
||||
## Analysis Checklist
|
||||
|
||||
Before proceeding to structured content generation:
|
||||
|
||||
- [ ] Have I read the entire source document?
|
||||
- [ ] Can I summarize the main topic in 1-2 sentences?
|
||||
- [ ] Have I identified 1-3 clear learning objectives?
|
||||
- [ ] Do I understand the target audience?
|
||||
- [ ] Have I classified the content type correctly?
|
||||
- [ ] Have I extracted all data points verbatim?
|
||||
- [ ] Have I identified visual opportunities?
|
||||
- [ ] Have I extracted design instructions from user input?
|
||||
- [ ] Have I recommended 3 layout×style combinations?
|
||||
@@ -0,0 +1,43 @@
|
||||
Create a professional infographic following these specifications:
|
||||
|
||||
## Image Specifications
|
||||
|
||||
- **Type**: Infographic
|
||||
- **Layout**: {{LAYOUT}}
|
||||
- **Style**: {{STYLE}}
|
||||
- **Aspect Ratio**: {{ASPECT_RATIO}}
|
||||
- **Language**: {{LANGUAGE}}
|
||||
|
||||
## Core Principles
|
||||
|
||||
- Follow the layout structure precisely for information architecture
|
||||
- Apply style aesthetics consistently throughout
|
||||
- If content involves sensitive or copyrighted figures, create stylistically similar alternatives
|
||||
- Keep information concise, highlight keywords and core concepts
|
||||
- Use ample whitespace for visual clarity
|
||||
- Maintain clear visual hierarchy
|
||||
|
||||
## Text Requirements
|
||||
|
||||
- All text must match the specified style treatment
|
||||
- Main titles should be prominent and readable
|
||||
- Key concepts should be visually emphasized
|
||||
- Labels should be clear and appropriately sized
|
||||
- Use the specified language for all text content
|
||||
|
||||
## Layout Guidelines
|
||||
|
||||
{{LAYOUT_GUIDELINES}}
|
||||
|
||||
## Style Guidelines
|
||||
|
||||
{{STYLE_GUIDELINES}}
|
||||
|
||||
---
|
||||
|
||||
Generate the infographic based on the content below:
|
||||
|
||||
{{CONTENT}}
|
||||
|
||||
Text labels (in {{LANGUAGE}}):
|
||||
{{TEXT_LABELS}}
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# bento-grid
|
||||
|
||||
Modular grid layout with varied cell sizes, like a bento box.
|
||||
|
||||
## Structure
|
||||
|
||||
- Grid of rectangular cells
|
||||
- Mixed cell sizes (1x1, 2x1, 1x2, 2x2)
|
||||
- No strict symmetry required
|
||||
- Hero cell for main point
|
||||
- Supporting cells around it
|
||||
|
||||
## Best For
|
||||
|
||||
- Multiple topic overview
|
||||
- Feature highlights
|
||||
- Dashboard summaries
|
||||
- Portfolio displays
|
||||
- Mixed content types
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Clear cell boundaries
|
||||
- Varied cell backgrounds
|
||||
- Icons or illustrations per cell
|
||||
- Consistent padding/margins
|
||||
- Visual hierarchy through size
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Main title at top
|
||||
- Cell titles within each cell
|
||||
- Brief content per cell
|
||||
- Minimal text, maximum visual
|
||||
- CTA or summary in prominent cell
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `craft-handmade`: Friendly overviews (default)
|
||||
- `corporate-memphis`: Business summaries
|
||||
- `pixel-art`: Retro feature grids
|
||||
+48
@@ -0,0 +1,48 @@
|
||||
# binary-comparison
|
||||
|
||||
Side-by-side comparison of two items, states, or concepts.
|
||||
|
||||
## Structure
|
||||
|
||||
- Vertical divider splitting image in half
|
||||
- Left side: Item A / Before / Pro
|
||||
- Right side: Item B / After / Con
|
||||
- Mirrored layout for easy comparison
|
||||
- Clear visual distinction between sides
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Focus | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Before-After** | Transformation over time | Temporal change, improvement |
|
||||
| **A vs B** | Feature comparison | Direct contrast, differences |
|
||||
| **Pro-Con** | Advantages/disadvantages | Balanced evaluation |
|
||||
|
||||
## Best For
|
||||
|
||||
- Before/after transformations
|
||||
- Product or option comparisons
|
||||
- Pros and cons analysis
|
||||
- Old vs new comparisons
|
||||
- Two perspectives on a topic
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Strong vertical dividing line or gradient
|
||||
- Contrasting colors per side
|
||||
- Matching element positions for comparison
|
||||
- VS symbol or divider decoration
|
||||
- Transformation arrow for before-after
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Main title centered at top
|
||||
- Side labels (A/B, Before/After)
|
||||
- Corresponding points aligned horizontally
|
||||
- Summary at bottom if needed
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `corporate-memphis`: Business comparisons
|
||||
- `bold-graphic`: High-contrast dramatic comparisons
|
||||
- `craft-handmade`: Friendly explainers
|
||||
@@ -0,0 +1,41 @@
|
||||
# bridge
|
||||
|
||||
Gap-crossing structure connecting problem to solution or current to future state.
|
||||
|
||||
## Structure
|
||||
|
||||
- Left side: current state/problem
|
||||
- Right side: desired state/solution
|
||||
- Bridge element spanning the gap
|
||||
- Gap representing challenge/obstacle
|
||||
- Bridge elements as steps/methods
|
||||
|
||||
## Best For
|
||||
|
||||
- Problem to solution journeys
|
||||
- Current vs future state
|
||||
- Gap analysis
|
||||
- Transformation bridges
|
||||
- Strategic initiatives
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Two distinct platforms/sides
|
||||
- Visible gap or chasm
|
||||
- Bridge structure with supports
|
||||
- Icons representing each side
|
||||
- Stepping stones or bridge planks
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Left label (From/Problem/Current)
|
||||
- Right label (To/Solution/Future)
|
||||
- Bridge elements labeled
|
||||
- Gap description below
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `cartoon-hand-drawn`: Friendly journeys
|
||||
- `corporate-memphis`: Business transformations
|
||||
- `isometric-3d`: Technical transitions
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# circular-flow
|
||||
|
||||
Cyclic process showing continuous or recurring steps.
|
||||
|
||||
## Structure
|
||||
|
||||
- Circular arrangement
|
||||
- Steps around the circle
|
||||
- Arrows showing direction
|
||||
- No clear start/end (continuous)
|
||||
- Center can hold main concept
|
||||
|
||||
## Best For
|
||||
|
||||
- Recurring processes
|
||||
- Feedback loops
|
||||
- Lifecycle stages
|
||||
- Continuous improvement
|
||||
- Natural cycles
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Circle or ring shape
|
||||
- Directional arrows
|
||||
- Step nodes evenly spaced
|
||||
- Icons per step
|
||||
- Optional center element
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Step labels at each node
|
||||
- Brief descriptions near nodes
|
||||
- Center concept if applicable
|
||||
- Cycle name
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `cartoon-hand-drawn`: Friendly cycles
|
||||
- `corporate-memphis`: Business processes
|
||||
- `subway-map`: Transit-style cycles
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# comic-strip
|
||||
|
||||
Sequential narrative panels telling a story or explaining a concept.
|
||||
|
||||
## Structure
|
||||
|
||||
- Multiple panels in sequence
|
||||
- Left-to-right, top-to-bottom reading
|
||||
- Characters or subjects in scenes
|
||||
- Speech/thought bubbles
|
||||
- Panel borders clearly defined
|
||||
|
||||
## Best For
|
||||
|
||||
- Storytelling explanations
|
||||
- User journey narratives
|
||||
- Scenario illustrations
|
||||
- Step sequences with context
|
||||
- Before/during/after stories
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Panel frames
|
||||
- Speech and thought bubbles
|
||||
- Sound effects (optional)
|
||||
- Characters with expressions
|
||||
- Scene backgrounds
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Dialogue in speech bubbles
|
||||
- Narration in caption boxes
|
||||
- Sound effects integrated
|
||||
- Panel numbers if needed
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `graphic-novel`: Dramatic narratives
|
||||
- `kawaii`: Cute character stories
|
||||
- `cartoon-hand-drawn`: Friendly explanations
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# comparison-matrix
|
||||
|
||||
Grid-based multi-factor comparison across multiple items.
|
||||
|
||||
## Structure
|
||||
|
||||
- Table/grid layout
|
||||
- Rows: items being compared
|
||||
- Columns: comparison criteria
|
||||
- Cells: scores, checks, or values
|
||||
- Header row and column clearly marked
|
||||
|
||||
## Best For
|
||||
|
||||
- Product feature comparisons
|
||||
- Tool/software evaluations
|
||||
- Multi-criteria decisions
|
||||
- Specification sheets
|
||||
- Rating comparisons
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Clear grid lines or cell boundaries
|
||||
- Checkmarks, X marks, or scores in cells
|
||||
- Color coding for quick scanning
|
||||
- Icons for criteria categories
|
||||
- Highlight for recommended option
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Item names in first column
|
||||
- Criteria in header row
|
||||
- Brief values in cells
|
||||
- Legend if using symbols
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `corporate-memphis`: Business tool comparisons
|
||||
- `ui-wireframe`: Technical feature matrices
|
||||
- `blueprint`: Specification comparisons
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# dashboard
|
||||
|
||||
Multi-metric display with charts, numbers, and KPI indicators.
|
||||
|
||||
## Structure
|
||||
|
||||
- Multiple data widgets
|
||||
- Charts, graphs, numbers
|
||||
- Grid or modular layout
|
||||
- Key metrics prominent
|
||||
- Status indicators
|
||||
|
||||
## Best For
|
||||
|
||||
- KPI summaries
|
||||
- Performance metrics
|
||||
- Analytics overviews
|
||||
- Status reports
|
||||
- Data snapshots
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Chart types (bar, line, pie, gauge)
|
||||
- Big numbers for KPIs
|
||||
- Trend arrows (up/down)
|
||||
- Color-coded status (green/red)
|
||||
- Clean data visualization
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Widget titles above each section
|
||||
- Metric labels and values
|
||||
- Units clearly shown
|
||||
- Time period indicated
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `corporate-memphis`: Business dashboards
|
||||
- `ui-wireframe`: Technical dashboards
|
||||
- `cyberpunk-neon`: Futuristic displays
|
||||
+72
@@ -0,0 +1,72 @@
|
||||
# dense-modules
|
||||
|
||||
High-density modular layout with 6-7 typed information modules packed with concrete data.
|
||||
|
||||
## Structure
|
||||
|
||||
- 6-7 distinct modules per image, each serving a specific information function
|
||||
- Every module contains concrete data: brand names, numbers, percentages, parameters
|
||||
- Minimal whitespace—compact spacing prioritized over breathing room
|
||||
- Smaller text acceptable to maximize information density
|
||||
- Each module identified by coordinate label or section marker (e.g., MOD-1, SEC-A)
|
||||
|
||||
## Module Archetypes
|
||||
|
||||
| Module | Purpose | Content Requirements |
|
||||
|--------|---------|---------------------|
|
||||
| **Brand/Selection Array** | Grid of options with recommendations | 4-8 items with icons, names, brief descriptions; highlight "best choice" |
|
||||
| **Specification Scale** | Quality/measurement gauge | 3-5 levels with precise numerical increments, quality indicators (emoji faces, checkmarks) |
|
||||
| **Deep Dive/Detail** | Technical breakdown of key item | Zoom-in callouts, internal components, cross-section or exploded view |
|
||||
| **Scenario Comparison** | Side-by-side use cases | 3-6 scenarios with specific recommendations and data per scenario |
|
||||
| **Identification Tips** | How-to checklist | 3-5 inspection methods: look/test/check/ask format |
|
||||
| **Warning/Pitfall Zone** | Critical mistakes to avoid | 3-5 pitfalls with consequences, 1-2 correct approaches; high visual contrast |
|
||||
| **Quick Reference** | Compact summary | Dense table, one-line summaries, decision flowchart, or key takeaways |
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Focus | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Coordinate-labeled** | Precision and systematicity | Each module has alphanumeric coordinate (A-01, B-05, C-12), ruler/axis markers |
|
||||
| **Grid-cell** | Order and structure | Modules in strict rectangular cells divided by thick lines, Swiss grid feel |
|
||||
| **Free-flowing** | Organic density | Magazine-style layout with dotted frames, varying module sizes, connected by arrows |
|
||||
|
||||
## Best For
|
||||
|
||||
- Product selection guides and buying guides
|
||||
- Multi-dimensional comparison content
|
||||
- Data-rich educational materials
|
||||
- "Avoid pitfalls" / "complete guide" formats
|
||||
- Content targeting platforms like Xiaohongshu with high-density visual requirements
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Module boundary markers (thick lines, dotted frames, or coordinate grids)
|
||||
- Quality indicators per module (emoji faces, checkmarks, crosses, crowns)
|
||||
- Data callout boxes with highlighted numbers
|
||||
- Comparison arrows and progression indicators
|
||||
- Warning/alert visual markers for pitfall modules
|
||||
- Metadata in corners (page numbers, timestamps, small barcodes)
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Main title at top, prominent and impactful
|
||||
- Subtitle with module count ("X大维度全面解析...")
|
||||
- Module headers inside colored badges or labeled frames
|
||||
- Body text compact, multiple columns within modules
|
||||
- Numbers highlighted with accent colors, slightly larger than body text
|
||||
|
||||
## Information Density Rules
|
||||
|
||||
- Every corner should contain useful information or metadata
|
||||
- No decorative-only empty space
|
||||
- Text size may be reduced to fit more content—information over font size
|
||||
- Each module must have specific data points, not generic descriptions
|
||||
- Balance between density and readability: dense but organized
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `pop-laboratory`: Technical precision with coordinate markers and blueprint grid
|
||||
- `morandi-journal`: Hand-drawn warmth with doodle illustrations and organic frames
|
||||
- `retro-pop-grid`: 1970s pop art with strict grid cells and bold contrast
|
||||
- `corporate-memphis`: Clean business feel for product comparisons
|
||||
- `technical-schematic`: Engineering precision for technical product guides
|
||||
@@ -0,0 +1,41 @@
|
||||
# funnel
|
||||
|
||||
Narrowing stages showing conversion, filtering, or refinement process.
|
||||
|
||||
## Structure
|
||||
|
||||
- Wide top (input/start)
|
||||
- Narrow bottom (output/result)
|
||||
- Horizontal layers for stages
|
||||
- Progressive narrowing
|
||||
- 3-6 stages typically
|
||||
|
||||
## Best For
|
||||
|
||||
- Sales/marketing funnels
|
||||
- Conversion processes
|
||||
- Filtering/selection
|
||||
- Recruitment pipelines
|
||||
- Decision processes
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Funnel shape clearly defined
|
||||
- Distinct colors per stage
|
||||
- Width indicates volume/quantity
|
||||
- Stage icons or symbols
|
||||
- Numbers/percentages per stage
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Stage names inside or beside
|
||||
- Metrics/numbers per stage
|
||||
- Input label at top
|
||||
- Output label at bottom
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `corporate-memphis`: Marketing funnels
|
||||
- `isometric-3d`: Technical pipelines
|
||||
- `cartoon-hand-drawn`: Educational funnels
|
||||
+48
@@ -0,0 +1,48 @@
|
||||
# hierarchical-layers
|
||||
|
||||
Nested layers showing levels of importance, influence, or proximity.
|
||||
|
||||
## Structure
|
||||
|
||||
- Multiple layers from core to periphery
|
||||
- Core/top: most important/central
|
||||
- Outer/bottom: decreasing importance
|
||||
- 3-7 levels typically
|
||||
- Clear boundaries between levels
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Shape | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Pyramid** | Triangle, vertical | Top-down hierarchy, quantity |
|
||||
| **Concentric** | Rings, radial | Center-out influence, proximity |
|
||||
|
||||
## Best For
|
||||
|
||||
- Maslow's hierarchy style concepts
|
||||
- Priority and importance levels
|
||||
- Spheres of influence
|
||||
- Organizational structures
|
||||
- Stakeholder analysis
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Distinct color per level
|
||||
- Icons or illustrations per tier
|
||||
- Size indicates importance/quantity
|
||||
- Labels inside or beside layers
|
||||
- Decorative apex/center element
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top or side
|
||||
- Level names inside each tier
|
||||
- Brief descriptions outside
|
||||
- Quantities or percentages if relevant
|
||||
- Legend for color meanings
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `craft-handmade`: Playful layered concepts
|
||||
- `corporate-memphis`: Business hierarchies
|
||||
- `technical-schematic`: Technical 3D pyramids
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# hub-spoke
|
||||
|
||||
Central concept with radiating connections to related items.
|
||||
|
||||
## Structure
|
||||
|
||||
- Central hub (main concept)
|
||||
- Spokes radiating outward
|
||||
- Nodes at spoke ends (related concepts)
|
||||
- Even or weighted distribution
|
||||
- Optional secondary connections
|
||||
|
||||
## Best For
|
||||
|
||||
- Central theme with components
|
||||
- Product features around core
|
||||
- Team roles around project
|
||||
- Ecosystem mapping
|
||||
- Mind maps
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Prominent central hub
|
||||
- Clear spoke lines
|
||||
- Consistent node styling
|
||||
- Icons representing each spoke item
|
||||
- Optional grouping colors
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Core concept in center hub
|
||||
- Spoke item labels at nodes
|
||||
- Brief descriptions near nodes
|
||||
- Connection labels on spokes if needed
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `cartoon-hand-drawn`: Friendly concept maps
|
||||
- `corporate-memphis`: Business ecosystems
|
||||
- `subway-map`: Network-style connections
|
||||
@@ -0,0 +1,41 @@
|
||||
# iceberg
|
||||
|
||||
Surface vs hidden depths, visible vs underlying factors.
|
||||
|
||||
## Structure
|
||||
|
||||
- Waterline dividing visible/hidden
|
||||
- Tip above water (obvious/surface)
|
||||
- Larger mass below (hidden/deep)
|
||||
- Proportional to emphasize hidden depth
|
||||
- Optional layers within underwater section
|
||||
|
||||
## Best For
|
||||
|
||||
- Surface vs root causes
|
||||
- Visible vs invisible work
|
||||
- Symptoms vs underlying issues
|
||||
- Public vs private aspects
|
||||
- Known vs unknown factors
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Clear water/surface line
|
||||
- Above: smaller, brighter
|
||||
- Below: larger, darker/deeper
|
||||
- Wave or water texture
|
||||
- Gradient showing depth
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Surface items above waterline
|
||||
- Hidden items below, larger
|
||||
- Waterline label optional
|
||||
- Depth indicators for layers
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `cartoon-hand-drawn`: Friendly metaphor
|
||||
- `storybook-watercolor`: Artistic depth
|
||||
- `graphic-novel`: Dramatic revelation
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# isometric-map
|
||||
|
||||
3D-style spatial layout showing locations, relationships, or journey through space.
|
||||
|
||||
## Structure
|
||||
|
||||
- Isometric 3D perspective
|
||||
- Locations as buildings/landmarks
|
||||
- Paths connecting locations
|
||||
- Spatial relationships visible
|
||||
- Bird's eye view angle
|
||||
|
||||
## Best For
|
||||
|
||||
- Office/campus layouts
|
||||
- City/ecosystem maps
|
||||
- User journey maps
|
||||
- System architecture
|
||||
- Process landscapes
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Consistent isometric angle (30°)
|
||||
- 3D buildings or objects
|
||||
- Pathways and roads
|
||||
- Labels floating above
|
||||
- Mini scenes at locations
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top corner
|
||||
- Location labels above objects
|
||||
- Path labels along routes
|
||||
- Legend for symbols
|
||||
- Scale indicator if relevant
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `isometric-3d`: Clean technical maps
|
||||
- `pixel-art`: Retro game-style maps
|
||||
- `lego-brick`: Playful location maps
|
||||
@@ -0,0 +1,41 @@
|
||||
# jigsaw
|
||||
|
||||
Interlocking puzzle pieces showing how parts fit together.
|
||||
|
||||
## Structure
|
||||
|
||||
- Puzzle pieces that interlock
|
||||
- Each piece represents a component
|
||||
- Connections show relationships
|
||||
- Can be assembled or exploded view
|
||||
- Missing piece highlights gaps
|
||||
|
||||
## Best For
|
||||
|
||||
- Component relationships
|
||||
- Team/skill fit
|
||||
- Strategy pieces
|
||||
- Integration concepts
|
||||
- Completeness assessments
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Classic puzzle piece shapes
|
||||
- Distinct colors per piece
|
||||
- Interlocking edges visible
|
||||
- Icons or labels per piece
|
||||
- Optional missing piece
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Piece labels inside or beside
|
||||
- Connection descriptions
|
||||
- Missing piece explanation
|
||||
- Assembly context
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `cartoon-hand-drawn`: Friendly integration concepts
|
||||
- `paper-cutout`: Tactile puzzle feel
|
||||
- `corporate-memphis`: Business strategy pieces
|
||||
+48
@@ -0,0 +1,48 @@
|
||||
# linear-progression
|
||||
|
||||
Sequential progression showing steps, timeline, or chronological events.
|
||||
|
||||
## Structure
|
||||
|
||||
- Linear arrangement (horizontal or vertical)
|
||||
- Nodes/markers at key points
|
||||
- Connecting line or path between nodes
|
||||
- Clear start and end points
|
||||
- Directional flow indicators
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Focus | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Timeline** | Chronological events, dates | Time markers, period labels |
|
||||
| **Process** | Action steps, numbered sequence | Step numbers, action icons |
|
||||
|
||||
## Best For
|
||||
|
||||
- Step-by-step tutorials and how-tos
|
||||
- Historical timelines and evolution
|
||||
- Project milestones and roadmaps
|
||||
- Workflow documentation
|
||||
- Onboarding processes
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Numbered steps or date markers
|
||||
- Arrows or connectors showing direction
|
||||
- Icons representing each step/event
|
||||
- Consistent node spacing
|
||||
- Progress indicators optional
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Step/event titles at each node
|
||||
- Brief descriptions below nodes
|
||||
- Dates or numbers clearly visible
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `craft-handmade`: Friendly tutorials and timelines
|
||||
- `ikea-manual`: Clean assembly instructions
|
||||
- `corporate-memphis`: Business process flows
|
||||
- `aged-academia`: Historical discoveries
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# periodic-table
|
||||
|
||||
Grid of categorized elements with consistent cell formatting.
|
||||
|
||||
## Structure
|
||||
|
||||
- Rectangular grid
|
||||
- Each cell is one element
|
||||
- Color-coded categories
|
||||
- Consistent cell format
|
||||
- Optional grouping gaps
|
||||
|
||||
## Best For
|
||||
|
||||
- Categorized collections
|
||||
- Tool/resource catalogs
|
||||
- Skill matrices
|
||||
- Element collections
|
||||
- Reference guides
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Uniform cell sizes
|
||||
- Category colors
|
||||
- Symbol/abbreviation prominent
|
||||
- Small icon per cell
|
||||
- Category legend
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Cell: symbol, name, brief info
|
||||
- Category names in legend
|
||||
- Optional row/column headers
|
||||
- Footnotes for special cases
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `pop-art`: Vibrant element grids
|
||||
- `pixel-art`: Retro collection displays
|
||||
- `corporate-memphis`: Business tool catalogs
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# story-mountain
|
||||
|
||||
Plot structure visualization showing rising action, climax, and resolution.
|
||||
|
||||
## Structure
|
||||
|
||||
- Mountain/arc shape
|
||||
- Rising slope (build-up)
|
||||
- Peak (climax)
|
||||
- Falling slope (resolution)
|
||||
- Start and end at base level
|
||||
|
||||
## Best For
|
||||
|
||||
- Narrative structures
|
||||
- Project lifecycles
|
||||
- Tension/release patterns
|
||||
- Emotional journeys
|
||||
- Campaign arcs
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Mountain or arc curve
|
||||
- Points along the path
|
||||
- Climax visually emphasized
|
||||
- Slope steepness meaningful
|
||||
- Base camps or milestones
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Stage labels along path
|
||||
- Climax prominently labeled
|
||||
- Brief descriptions at points
|
||||
- Start/end clearly marked
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `storybook-watercolor`: Narrative journeys
|
||||
- `cartoon-hand-drawn`: Educational plot diagrams
|
||||
- `graphic-novel`: Dramatic story arcs
|
||||
+48
@@ -0,0 +1,48 @@
|
||||
# structural-breakdown
|
||||
|
||||
Internal structure visualization with labeled parts or layers.
|
||||
|
||||
## Structure
|
||||
|
||||
- Central subject (object, system, body)
|
||||
- Parts or layers clearly shown
|
||||
- Labels with callout lines
|
||||
- Exploded or cutaway view
|
||||
- Optional zoomed detail sections
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | View Type | Visual Emphasis |
|
||||
|---------|-----------|-----------------|
|
||||
| **Exploded** | Parts separated outward | Component relationships |
|
||||
| **Cross-section** | Sliced/cutaway view | Internal layers, composition |
|
||||
|
||||
## Best For
|
||||
|
||||
- Product part breakdowns
|
||||
- Anatomy explanations
|
||||
- System components
|
||||
- Device teardowns
|
||||
- Material composition
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Main subject clearly rendered
|
||||
- Callout lines with dots/arrows
|
||||
- Label boxes at endpoints
|
||||
- Numbered parts optionally
|
||||
- Layer boundaries or separation
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Part/layer labels at callouts
|
||||
- Brief descriptions in boxes
|
||||
- Legend for numbered systems
|
||||
- Depth/thickness if relevant
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `technical-schematic`: Technical schematics
|
||||
- `aged-academia`: Classic anatomical style
|
||||
- `craft-handmade`: Friendly breakdowns
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# tree-branching
|
||||
|
||||
Hierarchical structure branching from root to leaves, showing categories and subcategories.
|
||||
|
||||
## Structure
|
||||
|
||||
- Root/trunk at top or left
|
||||
- Branches splitting into sub-branches
|
||||
- Leaves as terminal nodes
|
||||
- Clear parent-child relationships
|
||||
- Balanced or organic branching
|
||||
|
||||
## Best For
|
||||
|
||||
- Taxonomies and classifications
|
||||
- Decision trees
|
||||
- Organizational charts
|
||||
- File/folder structures
|
||||
- Family trees
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Connecting lines showing relationships
|
||||
- Nodes at branch points
|
||||
- Icons or labels at each node
|
||||
- Color coding by branch
|
||||
- Visual weight decreasing toward leaves
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Root concept prominently labeled
|
||||
- Branch and leaf labels
|
||||
- Optional descriptions at key nodes
|
||||
- Legend for categories
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `cartoon-hand-drawn`: Friendly taxonomies
|
||||
- `da-vinci-notebook`: Scientific classifications
|
||||
- `origami`: Geometric tree structures
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# venn-diagram
|
||||
|
||||
Overlapping circles showing relationships, commonalities, and differences.
|
||||
|
||||
## Structure
|
||||
|
||||
- 2-3 overlapping circles
|
||||
- Each circle is a category/concept
|
||||
- Overlaps show shared elements
|
||||
- Center shows common to all
|
||||
- Unique areas for exclusives
|
||||
|
||||
## Best For
|
||||
|
||||
- Concept relationships
|
||||
- Skill overlaps
|
||||
- Market segments
|
||||
- Comparative analysis
|
||||
- Finding common ground
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Translucent circle fills
|
||||
- Clear overlap regions
|
||||
- Distinct colors per circle
|
||||
- Icons in regions
|
||||
- Boundary labels
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Circle labels outside or on edge
|
||||
- Items in appropriate regions
|
||||
- Overlap region labels
|
||||
- Legend if needed
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `cartoon-hand-drawn`: Friendly concept overlaps
|
||||
- `corporate-memphis`: Business segment analysis
|
||||
- `pop-art`: High-contrast comparisons
|
||||
+41
@@ -0,0 +1,41 @@
|
||||
# winding-roadmap
|
||||
|
||||
Curved path showing journey with milestones and checkpoints.
|
||||
|
||||
## Structure
|
||||
|
||||
- S-curve or winding path
|
||||
- Milestones along the path
|
||||
- Start and destination points
|
||||
- Side elements (obstacles, helpers)
|
||||
- Progress indicators
|
||||
|
||||
## Best For
|
||||
|
||||
- Project roadmaps
|
||||
- Career paths
|
||||
- Customer journeys
|
||||
- Learning paths
|
||||
- Strategy timelines
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Curving road or river
|
||||
- Milestone markers/flags
|
||||
- Scene elements along path
|
||||
- Vehicle/character on journey
|
||||
- Destination landmark
|
||||
|
||||
## Text Placement
|
||||
|
||||
- Title at top
|
||||
- Milestone labels at each point
|
||||
- Path section names
|
||||
- Destination description
|
||||
- Optional timeline indicators
|
||||
|
||||
## Recommended Pairings
|
||||
|
||||
- `storybook-watercolor`: Whimsical journeys
|
||||
- `cartoon-hand-drawn`: Friendly roadmaps
|
||||
- `isometric-3d`: Technical project paths
|
||||
+244
@@ -0,0 +1,244 @@
|
||||
# Structured Content Template
|
||||
|
||||
Template for generating structured infographic content that informs the visual designer.
|
||||
|
||||
## Purpose
|
||||
|
||||
This document bridges content analysis and visual design:
|
||||
- Transforms source material into designer-ready format
|
||||
- Organizes learning objectives into visual sections
|
||||
- Preserves all source data verbatim
|
||||
- Separates content from design instructions
|
||||
|
||||
## Instructional Design Process
|
||||
|
||||
### Phase 1: High-Level Outline
|
||||
|
||||
1. **Title**: Capture the essence in a compelling headline
|
||||
2. **Overview**: Brief description (1-2 sentences)
|
||||
3. **Learning Objectives**: List what the viewer will understand
|
||||
|
||||
### Phase 2: Section Development
|
||||
|
||||
For each learning objective:
|
||||
|
||||
1. **Key Concept**: One-sentence summary of the section
|
||||
2. **Content**: Points extracted verbatim from source
|
||||
3. **Visual Element**: What should be shown visually
|
||||
4. **Text Labels**: Exact text for headlines, subheads, labels
|
||||
|
||||
### Phase 3: Data Integrity Check
|
||||
|
||||
Verify all source data is:
|
||||
- Copied exactly (no paraphrasing)
|
||||
- Attributed correctly (for quotes)
|
||||
- Formatted consistently
|
||||
|
||||
## Critical Rules
|
||||
|
||||
| Rule | Requirement | Example |
|
||||
|------|-------------|---------|
|
||||
| **Output format** | Markdown only | Use proper headers, lists, code blocks |
|
||||
| **Tone** | Expert trainer | Knowledgeable, clear, encouraging |
|
||||
| **No new information** | Only source content | Don't add examples not in source |
|
||||
| **Verbatim data** | Exact copies | "73% increase" not "significant increase" |
|
||||
|
||||
## Structured Content Format
|
||||
|
||||
```markdown
|
||||
# [Infographic Title]
|
||||
|
||||
## Overview
|
||||
[Brief description of what this infographic conveys - 1-2 sentences]
|
||||
|
||||
## Learning Objectives
|
||||
The viewer will understand:
|
||||
1. [Primary objective]
|
||||
2. [Secondary objective]
|
||||
3. [Tertiary objective if applicable]
|
||||
|
||||
---
|
||||
|
||||
## Section 1: [Section Title]
|
||||
|
||||
**Key Concept**: [One-sentence summary of this section]
|
||||
|
||||
**Content**:
|
||||
- [Point 1 - verbatim from source]
|
||||
- [Point 2 - verbatim from source]
|
||||
- [Point 3 - verbatim from source]
|
||||
|
||||
**Visual Element**: [Description of what to show visually]
|
||||
- Type: [icon/chart/illustration/diagram/photo]
|
||||
- Subject: [what it depicts]
|
||||
- Treatment: [how it should be presented]
|
||||
|
||||
**Text Labels**:
|
||||
- Headline: "[Exact text for headline]"
|
||||
- Subhead: "[Exact text for subhead]"
|
||||
- Labels: "[Label 1]", "[Label 2]", "[Label 3]"
|
||||
|
||||
---
|
||||
|
||||
## Section 2: [Section Title]
|
||||
|
||||
**Key Concept**: [One-sentence summary]
|
||||
|
||||
**Content**:
|
||||
- [Point 1]
|
||||
- [Point 2]
|
||||
|
||||
**Visual Element**: [Description]
|
||||
|
||||
**Text Labels**:
|
||||
- Headline: "[text]"
|
||||
- Labels: "[Label 1]", "[Label 2]"
|
||||
|
||||
---
|
||||
|
||||
[Continue for each section...]
|
||||
|
||||
---
|
||||
|
||||
## Data Points (Verbatim)
|
||||
|
||||
All statistics, numbers, and quotes exactly as they appear in source:
|
||||
|
||||
### Statistics
|
||||
- "[Exact statistic 1]"
|
||||
- "[Exact statistic 2]"
|
||||
- "[Exact statistic 3]"
|
||||
|
||||
### Quotes
|
||||
- "[Exact quote]" — [Attribution]
|
||||
|
||||
### Key Terms
|
||||
- **[Term 1]**: [Definition from source]
|
||||
- **[Term 2]**: [Definition from source]
|
||||
|
||||
---
|
||||
|
||||
## Design Instructions
|
||||
|
||||
Extracted from user's steering prompt:
|
||||
|
||||
### Style Preferences
|
||||
- [Any color preferences]
|
||||
- [Any mood/aesthetic preferences]
|
||||
- [Any artistic style preferences]
|
||||
|
||||
### Layout Preferences
|
||||
- [Any structure preferences]
|
||||
- [Any organization preferences]
|
||||
|
||||
### Other Requirements
|
||||
- [Any other visual requirements from user]
|
||||
- [Target platform if specified]
|
||||
- [Brand guidelines if any]
|
||||
```
|
||||
|
||||
## Section Types by Content
|
||||
|
||||
### For Process/Steps
|
||||
|
||||
```markdown
|
||||
## Section N: Step N - [Step Title]
|
||||
|
||||
**Key Concept**: [What this step accomplishes]
|
||||
|
||||
**Content**:
|
||||
- Action: [What to do]
|
||||
- Details: [How to do it]
|
||||
- Note: [Important consideration]
|
||||
|
||||
**Visual Element**:
|
||||
- Type: numbered step icon
|
||||
- Subject: [visual representing the action]
|
||||
- Arrow: leads to next step
|
||||
|
||||
**Text Labels**:
|
||||
- Headline: "Step N: [Title]"
|
||||
- Action: "[Imperative verb + object]"
|
||||
```
|
||||
|
||||
### For Comparison
|
||||
|
||||
```markdown
|
||||
## Section N: [Item A] vs [Item B]
|
||||
|
||||
**Key Concept**: [What distinguishes them]
|
||||
|
||||
**Content**:
|
||||
| Aspect | [Item A] | [Item B] |
|
||||
|--------|----------|----------|
|
||||
| [Factor 1] | [Value] | [Value] |
|
||||
| [Factor 2] | [Value] | [Value] |
|
||||
|
||||
**Visual Element**:
|
||||
- Type: split comparison
|
||||
- Left: [Item A representation]
|
||||
- Right: [Item B representation]
|
||||
|
||||
**Text Labels**:
|
||||
- Headline: "[Item A] vs [Item B]"
|
||||
- Left label: "[Item A name]"
|
||||
- Right label: "[Item B name]"
|
||||
```
|
||||
|
||||
### For Hierarchy
|
||||
|
||||
```markdown
|
||||
## Section N: [Level Name]
|
||||
|
||||
**Key Concept**: [What this level represents]
|
||||
|
||||
**Content**:
|
||||
- Position: [Top/Middle/Bottom]
|
||||
- Priority: [Importance level]
|
||||
- Contains: [Elements at this level]
|
||||
|
||||
**Visual Element**:
|
||||
- Type: layer/tier
|
||||
- Size: [relative to other levels]
|
||||
- Position: [where in hierarchy]
|
||||
|
||||
**Text Labels**:
|
||||
- Level title: "[Name]"
|
||||
- Description: "[Brief description]"
|
||||
```
|
||||
|
||||
### For Data/Statistics
|
||||
|
||||
```markdown
|
||||
## Section N: [Metric Name]
|
||||
|
||||
**Key Concept**: [What this data shows]
|
||||
|
||||
**Content**:
|
||||
- Value: [Exact number/percentage]
|
||||
- Context: [What it means]
|
||||
- Comparison: [Benchmark if any]
|
||||
|
||||
**Visual Element**:
|
||||
- Type: [chart/number highlight/gauge]
|
||||
- Emphasis: [how to draw attention]
|
||||
|
||||
**Text Labels**:
|
||||
- Main number: "[Exact value]"
|
||||
- Label: "[Metric name]"
|
||||
- Context: "[Brief context]"
|
||||
```
|
||||
|
||||
## Quality Checklist
|
||||
|
||||
Before finalizing structured content:
|
||||
|
||||
- [ ] Title captures the main message
|
||||
- [ ] Learning objectives are clear and measurable
|
||||
- [ ] Each section maps to an objective
|
||||
- [ ] All content is verbatim from source
|
||||
- [ ] Visual elements are clearly described
|
||||
- [ ] Text labels are specified exactly
|
||||
- [ ] Data points are collected and verified
|
||||
- [ ] Design instructions are separated
|
||||
- [ ] No new information has been added
|
||||
+36
@@ -0,0 +1,36 @@
|
||||
# aged-academia
|
||||
|
||||
Historical scientific illustration with aged paper aesthetic.
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Sepia brown (#704214), aged ink, muted earth tones
|
||||
- Background: Parchment (#F4E4BC), yellowed paper texture
|
||||
- Accents: Faded red annotations, iron gall ink spots
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Focus | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Notebook** | Personal sketches, inventions | Cursive notes, margin annotations |
|
||||
| **Specimen** | Scientific classification | Numbered diagrams, Latin labels |
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Aged paper texture overlay
|
||||
- Detailed cross-hatching and line work
|
||||
- Scientific illustration precision
|
||||
- Study notes and annotations
|
||||
- Specimen plate or sketch aesthetic
|
||||
- Numbered diagram elements
|
||||
|
||||
## Typography
|
||||
|
||||
- Handwritten cursive or serif fonts
|
||||
- Scientific annotations
|
||||
- Small caps for labels
|
||||
- Italics for scientific names
|
||||
|
||||
## Best For
|
||||
|
||||
Scientific education, biology topics, historical explanations, inventions, nature documentation
|
||||
+36
@@ -0,0 +1,36 @@
|
||||
# bold-graphic
|
||||
|
||||
High-contrast comic style with bold outlines and dramatic visuals.
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Bold primaries - red, yellow, blue, black
|
||||
- Background: White, halftone patterns, dramatic shadows
|
||||
- Accents: Spot colors, neon highlights
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Focus | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Graphic-novel** | Dramatic narratives | Action lines, hatching, panels |
|
||||
| **Pop-art** | High-energy impact | Halftone dots, Warhol repetition |
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Bold black outlines
|
||||
- High contrast compositions
|
||||
- Halftone dot patterns
|
||||
- Comic panel borders optional
|
||||
- Action lines and motion
|
||||
- Speech bubbles and sound effects
|
||||
|
||||
## Typography
|
||||
|
||||
- Comic book lettering
|
||||
- Impact fonts for emphasis
|
||||
- POW/BANG effects for pop-art
|
||||
- Caption boxes for narrative
|
||||
|
||||
## Best For
|
||||
|
||||
Attention-grabbing content, dramatic narratives, pop culture, marketing, high-energy presentations
|
||||
+61
@@ -0,0 +1,61 @@
|
||||
# chalkboard
|
||||
|
||||
Black chalkboard background with colorful chalk drawing style
|
||||
|
||||
## Design Aesthetic
|
||||
|
||||
Classic classroom chalkboard aesthetic with hand-drawn chalk illustrations. Nostalgic educational feel with imperfect, sketchy lines that capture the warmth of traditional teaching. Colorful chalk creates visual hierarchy while maintaining the authentic chalkboard experience.
|
||||
|
||||
## Background
|
||||
|
||||
- Color: Chalkboard Black (#1A1A1A) or Dark Green-Black (#1C2B1C)
|
||||
- Texture: Realistic chalkboard texture with subtle scratches, dust particles, and faint eraser marks
|
||||
|
||||
## Typography
|
||||
|
||||
Hand-drawn chalk lettering style with visible chalk texture. Imperfect baseline adds authenticity. White or bright colored chalk for emphasis.
|
||||
|
||||
## Color Palette
|
||||
|
||||
| Role | Color | Hex | Usage |
|
||||
|------|-------|-----|-------|
|
||||
| Background | Chalkboard Black | #1A1A1A | Primary background |
|
||||
| Alt Background | Green-Black | #1C2B1C | Traditional green board |
|
||||
| Primary Text | Chalk White | #F5F5F5 | Main text, outlines |
|
||||
| Accent 1 | Chalk Yellow | #FFE566 | Highlights, emphasis |
|
||||
| Accent 2 | Chalk Pink | #FF9999 | Secondary highlights |
|
||||
| Accent 3 | Chalk Blue | #66B3FF | Diagrams, links |
|
||||
| Accent 4 | Chalk Green | #90EE90 | Success, nature |
|
||||
| Accent 5 | Chalk Orange | #FFB366 | Warnings, energy |
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Hand-drawn chalk illustrations with sketchy, imperfect lines
|
||||
- Chalk dust effects around text and key elements
|
||||
- Doodles: stars, arrows, underlines, circles, checkmarks
|
||||
- Mathematical formulas and simple diagrams
|
||||
- Eraser smudges and chalk residue textures
|
||||
- Wooden frame border optional
|
||||
- Stick figures and simple icons
|
||||
- Connection lines with hand-drawn feel
|
||||
|
||||
## Style Rules
|
||||
|
||||
### Do
|
||||
|
||||
- Maintain authentic chalk texture on all elements
|
||||
- Use imperfect, hand-drawn quality throughout
|
||||
- Add subtle chalk dust and smudge effects
|
||||
- Create visual hierarchy with color variety
|
||||
- Include playful doodles and annotations
|
||||
|
||||
### Don't
|
||||
|
||||
- Use perfect geometric shapes
|
||||
- Create clean digital-looking lines
|
||||
- Add photorealistic elements
|
||||
- Use gradients or glossy effects
|
||||
|
||||
## Best For
|
||||
|
||||
Educational content, tutorials, classroom themes, teaching materials, workshops, informal learning sessions, knowledge sharing
|
||||
+29
@@ -0,0 +1,29 @@
|
||||
# claymation
|
||||
|
||||
3D clay figure aesthetic with stop-motion charm
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Saturated clay colors - bright but slightly muted
|
||||
- Background: Neutral studio backdrop, soft gradients
|
||||
- Accents: Complementary clay colors, shiny highlights
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Clay/plasticine texture on all objects
|
||||
- Fingerprint marks and imperfections
|
||||
- Rounded, sculpted forms
|
||||
- Soft shadows
|
||||
- Stop-motion staging
|
||||
- Miniature set aesthetic
|
||||
|
||||
## Typography
|
||||
|
||||
- Extruded clay letters
|
||||
- Dimensional, rounded text
|
||||
- Playful and chunky
|
||||
- Embedded in clay scenes
|
||||
|
||||
## Best For
|
||||
|
||||
Playful explanations, children's content, stop-motion narratives, friendly processes
|
||||
+29
@@ -0,0 +1,29 @@
|
||||
# corporate-memphis
|
||||
|
||||
Flat vector people with vibrant geometric fills
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Bright, saturated - purple, orange, teal, yellow
|
||||
- Background: White or light pastels
|
||||
- Accents: Gradient fills, geometric patterns
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Flat vector illustration
|
||||
- Disproportionate human figures
|
||||
- Abstract body shapes
|
||||
- Floating geometric elements
|
||||
- No outlines, solid fills
|
||||
- Plant and object accents
|
||||
|
||||
## Typography
|
||||
|
||||
- Clean sans-serif
|
||||
- Bold headings
|
||||
- Professional but friendly
|
||||
- Minimal decoration
|
||||
|
||||
## Best For
|
||||
|
||||
Business presentations, tech products, marketing materials, corporate training
|
||||
+44
@@ -0,0 +1,44 @@
|
||||
# craft-handmade (DEFAULT)
|
||||
|
||||
Hand-drawn and paper craft aesthetic with warm, organic feel.
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Warm pastels, soft saturated colors, craft paper tones
|
||||
- Background: Light cream (#FFF8F0), textured paper (#F5F0E6)
|
||||
- Accents: Bold highlights, construction paper colors
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Focus | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Hand-drawn** | Cartoon illustration | Simple icons, slightly imperfect lines |
|
||||
| **Paper-cutout** | Layered paper craft | Drop shadows, torn edges, texture |
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Hand-drawn or cut-paper quality
|
||||
- Organic, slightly imperfect shapes
|
||||
- Layered depth with shadows (paper variant)
|
||||
- Simple cartoon elements and icons
|
||||
- Character illustrations (people, personalities in cartoon form)
|
||||
- Ample whitespace, clean composition
|
||||
- Keywords and core concepts highlighted
|
||||
- **Strictly hand-drawn—no realistic or photographic elements**
|
||||
|
||||
## Style Enforcement
|
||||
|
||||
- All imagery must maintain cartoon/illustrated aesthetic
|
||||
- Replace real photos or realistic figures with hand-drawn equivalents
|
||||
- Maintain consistent line weight and illustration style throughout
|
||||
|
||||
## Typography
|
||||
|
||||
- Hand-drawn or casual font style
|
||||
- Clear, readable labels
|
||||
- Keywords emphasized with larger/bolder text
|
||||
- Cut-out letter style for paper variant
|
||||
|
||||
## Best For
|
||||
|
||||
Educational content, general explanations, friendly infographics, children's content, playful hierarchies
|
||||
+29
@@ -0,0 +1,29 @@
|
||||
# cyberpunk-neon
|
||||
|
||||
Neon glow on dark backgrounds, futuristic aesthetic
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Neon pink (#FF00FF), cyan (#00FFFF), electric blue
|
||||
- Background: Deep black (#0A0A0A), dark purple gradients
|
||||
- Accents: Neon glow effects, chrome reflections
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Glowing neon outlines
|
||||
- Dark atmospheric backgrounds
|
||||
- Digital glitch effects
|
||||
- Circuit patterns
|
||||
- Holographic elements
|
||||
- Rain and reflections
|
||||
|
||||
## Typography
|
||||
|
||||
- Glowing neon text
|
||||
- Digital/tech fonts
|
||||
- Flickering effects
|
||||
- Outlined glow letters
|
||||
|
||||
## Best For
|
||||
|
||||
Tech futures, gaming content, digital culture, futuristic concepts, night aesthetics
|
||||
+63
@@ -0,0 +1,63 @@
|
||||
# hand-drawn-edu
|
||||
|
||||
Hand-drawn educational infographic with macaron pastel color blocks on warm cream paper texture.
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Background: Warm cream (#F5F0E8) with subtle paper grain texture
|
||||
- Primary text: Deep charcoal (#2D2D2D) for headlines, outlines
|
||||
- Macaron Blue: #A8D8EA for cool-toned information zones
|
||||
- Macaron Mint: #B5E5CF for growth/positive zones
|
||||
- Macaron Lavender: #D5C6E0 for abstract/concept zones
|
||||
- Macaron Peach: #FFD5C2 for warm-toned zones
|
||||
- Accent: Coral Red (#E8655A) for key data, warnings, emphasis
|
||||
- Muted annotations: Warm gray (#6B6B6B) for secondary labels
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Macaron pastel rounded cards as distinct information zones
|
||||
- Hand-drawn wavy connection lines and arrows with small text labels
|
||||
- Simple stick-figure characters and cartoon icons to humanize concepts
|
||||
- Doodle decorations: small stars, underlines, spirals, sparkles
|
||||
- Color fills don't completely fill outlines — preserve casual hand-drawn feel
|
||||
- Dashed borders for secondary or contained zones
|
||||
- Small icon doodles (clipboard, lock, checkmark, lightbulb) to reinforce concepts
|
||||
- Bold centered quote or takeaway at the bottom
|
||||
- Slight hand-drawn wobble on all lines and shapes
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Focus | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Sketch-notes** | Concept mapping | More stick figures, thought bubbles, connecting arrows |
|
||||
| **Pastel cards** | Structured info | Cleaner macaron blocks, less doodle, more white space |
|
||||
|
||||
## Typography
|
||||
|
||||
- Main title: Bold hand-drawn lettering with organic strokes, large confident letterforms with slight wobble
|
||||
- Section headers: Hand-lettered text on or inside macaron color blocks
|
||||
- Body text: Clear handwritten print style, legible but not mechanical
|
||||
- Annotations: Warm gray (#6B6B6B), smaller, neat handwritten labels
|
||||
- Keywords: Bold emphasis within body text
|
||||
|
||||
## Style Enforcement
|
||||
|
||||
- All lines must have slight hand-drawn wobble — no perfect geometry
|
||||
- Each information zone uses a distinct macaron color block
|
||||
- Maintain consistent wobble quality across all shapes and lines
|
||||
- Include at least one simple cartoon character or stick figure
|
||||
- Generous white space between zones — each zone should breathe
|
||||
- Maximum 4 macaron colors per infographic
|
||||
|
||||
## Avoid
|
||||
|
||||
- Perfect geometric shapes or straight lines
|
||||
- Photorealistic elements or stock illustration style
|
||||
- Pure white backgrounds
|
||||
- Flat vector icons or digital-precision graphics
|
||||
- Overcrowded layouts — let zones breathe
|
||||
- Corporate or clinical aesthetic
|
||||
|
||||
## Best For
|
||||
|
||||
Educational diagrams, process explainers, concept maps, knowledge summaries, tutorial walkthroughs, onboarding visuals
|
||||
+29
@@ -0,0 +1,29 @@
|
||||
# ikea-manual
|
||||
|
||||
Minimal line art assembly instruction style
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Black lines, minimal fills
|
||||
- Background: White or cream paper
|
||||
- Accents: Red for warnings, blue for highlights
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Simple line drawings
|
||||
- Numbered step sequences
|
||||
- Arrow indicators
|
||||
- Exploded assembly views
|
||||
- Wordless communication
|
||||
- Stick figures for scale
|
||||
|
||||
## Typography
|
||||
|
||||
- Minimal text
|
||||
- Step numbers prominent
|
||||
- Universal symbols
|
||||
- Simple sans-serif when needed
|
||||
|
||||
## Best For
|
||||
|
||||
Step-by-step instructions, assembly guides, how-to content, universal communication
|
||||
@@ -0,0 +1,29 @@
|
||||
# kawaii
|
||||
|
||||
Japanese cute style with big eyes and pastel colors
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Soft pastels - pink (#FFB6C1), mint (#98D8C8), lavender (#E6E6FA)
|
||||
- Background: Light pink or cream, sparkle overlays
|
||||
- Accents: Bright pops, star and heart shapes
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Big sparkly eyes on characters
|
||||
- Rounded, soft shapes
|
||||
- Blushing cheeks
|
||||
- Sparkles and stars scattered
|
||||
- Cute animal characters
|
||||
- Chibi proportions
|
||||
|
||||
## Typography
|
||||
|
||||
- Rounded, bubbly fonts
|
||||
- Cute decorations on letters
|
||||
- Hearts and stars in text
|
||||
- Soft, friendly appearance
|
||||
|
||||
## Best For
|
||||
|
||||
Cute tutorials, children's education, lifestyle content, character-driven explanations
|
||||
@@ -0,0 +1,29 @@
|
||||
# knolling
|
||||
|
||||
Organized flat-lay with top-down arrangement
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Object's natural colors
|
||||
- Background: Solid color - black, white, or colored surface
|
||||
- Accents: Shadows, subtle highlights
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Top-down camera angle
|
||||
- Objects arranged at 90° angles
|
||||
- Equal spacing between items
|
||||
- Clean organization
|
||||
- Symmetry and order
|
||||
- No overlapping items
|
||||
|
||||
## Typography
|
||||
|
||||
- Clean labels
|
||||
- Positioned outside objects
|
||||
- Connecting lines to items
|
||||
- Minimal, catalog-style
|
||||
|
||||
## Best For
|
||||
|
||||
Product collections, tool inventories, gear layouts, organized overviews
|
||||
+29
@@ -0,0 +1,29 @@
|
||||
# lego-brick
|
||||
|
||||
Toy brick construction with playful aesthetic
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Classic LEGO colors - red, blue, yellow, green, white
|
||||
- Background: Light gray baseplate or white
|
||||
- Accents: Bright primary pops, shiny studs
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Visible brick studs
|
||||
- Modular construction
|
||||
- Minifigure characters
|
||||
- Building instruction style
|
||||
- Stackable elements
|
||||
- Plastic sheen
|
||||
|
||||
## Typography
|
||||
|
||||
- Blocky, bold fonts
|
||||
- LEGO instruction style
|
||||
- Step numbers
|
||||
- Playful appearance
|
||||
|
||||
## Best For
|
||||
|
||||
Building concepts, modular systems, playful education, children's content
|
||||
+60
@@ -0,0 +1,60 @@
|
||||
# morandi-journal
|
||||
|
||||
Hand-drawn doodle illustration with warm Morandi color tones and cozy bullet journal aesthetic.
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Background: Warm cream/beige with subtle paper texture (#F5F0E6)
|
||||
- Primary: Muted teal/sage green (#7BA3A8) for headers and frames
|
||||
- Secondary: Warm terracotta/orange (#D4956A) for highlights and numbers
|
||||
- Line art: Dark charcoal brown (#4A4540)
|
||||
- Soft highlights: Pale yellow (#F5E6C8)
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Hand-drawn doodle illustrations with organic, slightly imperfect ink lines
|
||||
- Washi tape strip decorations (diagonal stripes pattern, beige and brown)
|
||||
- Rounded card containers for brand/option items
|
||||
- Hand-drawn rulers, scales, and progress bars with emoji quality indicators
|
||||
- Smiley/frowny faces as quality markers (😊✓ 😐 ☹️✗)
|
||||
- Dotted line frames around sections
|
||||
- Connecting arrows and dotted lines between modules
|
||||
- Corner decorations: tiny houses, stars, sparkles, clouds
|
||||
- Wavy line dividers between sections
|
||||
- Callout bubbles for tips
|
||||
- Magnifying glass icons for identification tips
|
||||
- Thumbs up/down icons (hand-drawn style)
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Focus | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Cozy journal** | Maximum warmth | More washi tape, stickers, decorative doodles |
|
||||
| **Clean sketch** | Readability | Cleaner lines, less decoration, more structured |
|
||||
|
||||
## Typography
|
||||
|
||||
- Main title: Bold hand-lettered calligraphy style with decorative flourishes
|
||||
- Module headers: Clean handwritten text in white on dark teal rounded badge (#6B9080)
|
||||
- Body text: Neat handwritten print style, easy to read
|
||||
- Numbers: Highlighted in terracotta (#D4956A), slightly larger than body
|
||||
|
||||
## Style Enforcement
|
||||
|
||||
- All imagery must maintain hand-drawn/doodle aesthetic—no digital precision
|
||||
- Organic, slightly imperfect shapes throughout
|
||||
- Sketch-like quality with visible line weight variations
|
||||
- Warm and cozy journal feel, not clinical or corporate
|
||||
|
||||
## Avoid
|
||||
|
||||
- Flat vector icons or emoji
|
||||
- Clean geometric shapes
|
||||
- Stock illustration style
|
||||
- Strict grid layout
|
||||
- Pure white background
|
||||
- Digital/corporate look
|
||||
|
||||
## Best For
|
||||
|
||||
Product selection guides, lifestyle content, educational overviews, consumer-facing comparison content, Xiaohongshu-style posts
|
||||
@@ -0,0 +1,29 @@
|
||||
# origami
|
||||
|
||||
Folded paper forms with geometric precision
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Solid origami paper colors - red, blue, green, gold
|
||||
- Background: White or soft gray, subtle shadows
|
||||
- Accents: Paper fold highlights, crisp shadows
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Geometric folded shapes
|
||||
- Visible fold lines
|
||||
- Cast shadows showing depth
|
||||
- Paper texture
|
||||
- Angular, faceted forms
|
||||
- Low-poly aesthetic
|
||||
|
||||
## Typography
|
||||
|
||||
- Clean geometric fonts
|
||||
- Angular letterforms
|
||||
- Folded paper text effect
|
||||
- Minimal, precise labels
|
||||
|
||||
## Best For
|
||||
|
||||
Geometric concepts, transformation topics, Japanese themes, abstract representations
|
||||
+29
@@ -0,0 +1,29 @@
|
||||
# pixel-art
|
||||
|
||||
Retro 8-bit gaming aesthetic
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Limited palette - NES/SNES colors
|
||||
- Background: Black or dark blue, scanlines optional
|
||||
- Accents: Bright pixel highlights, CRT glow
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Visible pixel grid
|
||||
- Limited color count per sprite
|
||||
- 8-bit or 16-bit style
|
||||
- Retro game UI elements
|
||||
- Pixel-perfect edges
|
||||
- Dithering for gradients
|
||||
|
||||
## Typography
|
||||
|
||||
- Pixel fonts
|
||||
- Blocky letterforms
|
||||
- Game UI style text
|
||||
- Score/stat display style
|
||||
|
||||
## Best For
|
||||
|
||||
Gaming topics, nostalgia content, developer audiences, retro tech themes
|
||||
+48
@@ -0,0 +1,48 @@
|
||||
# pop-laboratory
|
||||
|
||||
Lab manual precision meets pop art color impact—coordinate systems, technical diagrams, and fluorescent accents on blueprint grid.
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Background: Professional grayish-white with faint blueprint grid texture (#F2F2F2)
|
||||
- Primary: Muted teal/sage green (#B8D8BE) for major functional blocks and data zones
|
||||
- High-alert accent: Vibrant fluorescent pink (#E91E63) strictly for warnings, critical data, or "winner" highlights
|
||||
- Marker highlights: Vivid lemon yellow (#FFF200) as translucent highlighter effect for keywords
|
||||
- Line art: Ultra-fine charcoal brown (#2D2926) for technical grids, coordinates, and hairlines
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Coordinate-style labels on every module (e.g., R-20, G-02, SEC-08)
|
||||
- Technical diagrams: exploded views, cross-sections with anchor points, architectural skeletal lines
|
||||
- Vertical/horizontal rulers with precise markers (0.5mm, 1.8mm, 45°)
|
||||
- "Marker-over-print" effect: color blocks slightly offset from text, postmodern print feel
|
||||
- Cross-hair targets, mathematical symbols (Σ, Δ, ∞), directional arrows (X/Y axis)
|
||||
- Microscopic detail annotations alongside macroscopic bold headers
|
||||
- Corner metadata: tiny barcodes, timestamps, technical parameters
|
||||
- High contrast between massive bold headers and tiny 8pt-style annotations
|
||||
|
||||
## Typography
|
||||
|
||||
- Headers: Bold brutalist characters, high visual impact
|
||||
- Body: Professional sans-serif or crisp technical print
|
||||
- Numbers: Large, highlighted with yellow or blue to stand out
|
||||
- Annotations: Ultra-crisp, small technical labels
|
||||
|
||||
## Style Enforcement
|
||||
|
||||
- Strictly systematic color usage: only teal, pink, yellow, charcoal—no rainbow palette
|
||||
- Sufficient fine grid lines and coordinate annotations throughout
|
||||
- Maintain tension between large impactful headers and small precise parameters
|
||||
- Lab manual aesthetic: mix of microscopic details and macroscopic data
|
||||
|
||||
## Avoid
|
||||
|
||||
- Cute or cartoonish doodles
|
||||
- Soft pastels or generic textures
|
||||
- Empty white space
|
||||
- Flat vector stock icons
|
||||
- Organic or hand-drawn imperfections
|
||||
|
||||
## Best For
|
||||
|
||||
Technical product guides, specification comparisons, precision-focused data visualization, engineering-adjacent content
|
||||
+47
@@ -0,0 +1,47 @@
|
||||
# retro-pop-grid
|
||||
|
||||
1970s retro pop art with strict Swiss international grid, thick black outlines, and flat color blocks.
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Background: Warm vintage cream/beige (#F5F0E6)
|
||||
- Flat accents: Salmon pink, sky blue, mustard yellow, mint green—all muted retro tones
|
||||
- Contrast blocks: Solid pure black (#000000) and solid pure white (#FFFFFF) used strategically for extreme contrast
|
||||
- Line art and outlines: Solid thick black
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Uniform thick black outlines on all illustrations, text boxes, and grid dividers
|
||||
- Pure 2D flat vector aesthetic with subtle screen print texture
|
||||
- Strict Swiss international grid: poster divided into square and rectangular cells by thick black lines
|
||||
- Black-background cells with white text for warnings or key categories (inverted contrast)
|
||||
- Geometric fill patterns in empty cells: checkerboards, diagonal lines, dots
|
||||
- Flat abstract symbols, warning signs, keyholes, stars, arrows
|
||||
- Vintage comic-style smiley/frowny faces for quality indicators
|
||||
- Colored cells used for breathing room—some with minimal/no content
|
||||
|
||||
## Typography
|
||||
|
||||
- Headers: Bold brutalist or retro thick display fonts, high legibility
|
||||
- Body: Clean sans-serif, structured typographic alignment
|
||||
- Decorative English text acceptable for stylistic labels ("WARNING", "INFO", "BEST")
|
||||
- All content text in specified language
|
||||
|
||||
## Style Enforcement
|
||||
|
||||
- Absolutely no gradients, shading, drop shadows, or 3D effects
|
||||
- Everything anchored in grid cells—no floating or unorganized elements
|
||||
- Maintain 1970s retro pop art and underground comic illustration feel
|
||||
- Visual density balanced with rhythmic grid—some cells intentionally sparse for contrast
|
||||
|
||||
## Avoid
|
||||
|
||||
- 3D rendering, realistic details, gradients, soft shadows
|
||||
- Soft, thin, or sketch-like pencil lines
|
||||
- Free-flowing, unorganized, or floating layouts (everything must be grid-anchored)
|
||||
- Pure white background canvas
|
||||
- Organic or hand-drawn imperfections
|
||||
|
||||
## Best For
|
||||
|
||||
Trendy product guides, design-conscious content, visually striking comparisons, content targeting design-savvy audiences, bold social media posts
|
||||
+29
@@ -0,0 +1,29 @@
|
||||
# storybook-watercolor
|
||||
|
||||
Soft hand-painted illustration with whimsical charm
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Soft watercolor washes - muted blues, greens, warm earth
|
||||
- Background: Watercolor paper texture, white or cream
|
||||
- Accents: Deeper pigment pools, splatter effects
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Visible brushstrokes
|
||||
- Soft color bleeds and gradients
|
||||
- White space as design element
|
||||
- Delicate line work over washes
|
||||
- Natural, organic shapes
|
||||
- Dreamy, atmospheric quality
|
||||
|
||||
## Typography
|
||||
|
||||
- Elegant hand-lettering
|
||||
- Watercolor-style text
|
||||
- Flowing, organic letterforms
|
||||
- Integrated with illustrations
|
||||
|
||||
## Best For
|
||||
|
||||
Storytelling, emotional journeys, nature topics, children's education, artistic presentations
|
||||
+29
@@ -0,0 +1,29 @@
|
||||
# subway-map
|
||||
|
||||
Transit diagram style with colored lines and stations
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Transit line colors - red, blue, green, yellow, orange
|
||||
- Background: White or light gray
|
||||
- Accents: Station dots, interchange markers
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Colored route lines
|
||||
- 45° and 90° angles only
|
||||
- Station circle markers
|
||||
- Interchange symbols
|
||||
- Simplified geography
|
||||
- Line thickness hierarchy
|
||||
|
||||
## Typography
|
||||
|
||||
- Clean sans-serif
|
||||
- Station name labels
|
||||
- Line number/name badges
|
||||
- Horizontal or angled text
|
||||
|
||||
## Best For
|
||||
|
||||
Journey maps, process flows, network diagrams, route explanations
|
||||
+36
@@ -0,0 +1,36 @@
|
||||
# technical-schematic
|
||||
|
||||
Technical diagrams with engineering precision and clean geometry.
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Blues (#2563EB), teals, grays, white lines
|
||||
- Background: Deep blue (#1E3A5F), white, or light gray with grid
|
||||
- Accents: Amber highlights (#F59E0B), cyan callouts
|
||||
|
||||
## Variants
|
||||
|
||||
| Variant | Focus | Visual Emphasis |
|
||||
|---------|-------|-----------------|
|
||||
| **Blueprint** | Engineering schematics | White on blue, measurements, grid |
|
||||
| **Isometric** | 3D spatial representation | 30° angle blocks, clean fills |
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Geometric precision throughout
|
||||
- Grid pattern or isometric angle
|
||||
- Dimension lines and measurements
|
||||
- Technical symbols and annotations
|
||||
- Clean vector shapes
|
||||
- Consistent stroke weights
|
||||
|
||||
## Typography
|
||||
|
||||
- Technical stencil or clean sans-serif
|
||||
- All-caps labels
|
||||
- Measurement annotations
|
||||
- Floating labels for isometric
|
||||
|
||||
## Best For
|
||||
|
||||
Technical architecture, system diagrams, engineering specs, product breakdowns, data visualization
|
||||
+29
@@ -0,0 +1,29 @@
|
||||
# ui-wireframe
|
||||
|
||||
Grayscale interface mockup style
|
||||
|
||||
## Color Palette
|
||||
|
||||
- Primary: Grays - light (#E5E5E5), medium (#9CA3AF), dark (#374151)
|
||||
- Background: White (#FFFFFF), light gray
|
||||
- Accents: Blue for interactive (#3B82F6), red for emphasis
|
||||
|
||||
## Visual Elements
|
||||
|
||||
- Wireframe boxes and placeholders
|
||||
- X marks for image placeholders
|
||||
- Simple line icons
|
||||
- Grid-based layout
|
||||
- Annotation callouts
|
||||
- Redline specifications
|
||||
|
||||
## Typography
|
||||
|
||||
- System fonts
|
||||
- Placeholder "Lorem ipsum"
|
||||
- UI label style
|
||||
- Sans-serif throughout
|
||||
|
||||
## Best For
|
||||
|
||||
Product designs, UI explanations, app concepts, user flow diagrams
|
||||
@@ -0,0 +1,147 @@
|
||||
---
|
||||
name: ideation
|
||||
title: Creative Ideation — Constraint-Driven Project Generation
|
||||
description: "Generate project ideas through creative constraints. Use when the user says 'I want to build something', 'give me a project idea', 'I'm bored', 'what should I make', 'inspire me', or any variant of 'I have tools but no direction'. Works for code, art, hardware, writing, tools, and anything that can be made."
|
||||
version: 1.0.0
|
||||
author: SHL0MS
|
||||
license: MIT
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [Creative, Ideation, Projects, Brainstorming, Inspiration]
|
||||
category: creative
|
||||
requires_toolsets: []
|
||||
---
|
||||
|
||||
# Creative Ideation
|
||||
|
||||
Generate project ideas through creative constraints. Constraint + direction = creativity.
|
||||
|
||||
## How It Works
|
||||
|
||||
1. **Pick a constraint** from the library below — random, or matched to the user's domain/mood
|
||||
2. **Interpret it broadly** — a coding prompt can become a hardware project, an art prompt can become a CLI tool
|
||||
3. **Generate 3 concrete project ideas** that satisfy the constraint
|
||||
4. **If they pick one, build it** — create the project, write the code, ship it
|
||||
|
||||
## The Rule
|
||||
|
||||
Every prompt is interpreted as broadly as possible. "Does this include X?" → Yes. The prompts provide direction and mild constraint. Without either, there is no creativity.
|
||||
|
||||
## Constraint Library
|
||||
|
||||
### For Developers
|
||||
|
||||
**Solve your own itch:**
|
||||
Build the tool you wished existed this week. Under 50 lines. Ship it today.
|
||||
|
||||
**Automate the annoying thing:**
|
||||
What's the most tedious part of your workflow? Script it away. Two hours to fix a problem that costs you five minutes a day.
|
||||
|
||||
**The CLI tool that should exist:**
|
||||
Think of a command you've wished you could type. `git undo-that-thing-i-just-did`. `docker why-is-this-broken`. `npm explain-yourself`. Now build it.
|
||||
|
||||
**Nothing new except glue:**
|
||||
Make something entirely from existing APIs, libraries, and datasets. The only original contribution is how you connect them.
|
||||
|
||||
**Frankenstein week:**
|
||||
Take something that does X and make it do Y. A git repo that plays music. A Dockerfile that generates poetry. A cron job that sends compliments.
|
||||
|
||||
**Subtract:**
|
||||
How much can you remove from a codebase before it breaks? Strip a tool to its minimum viable function. Delete until only the essence remains.
|
||||
|
||||
**High concept, low effort:**
|
||||
A deep idea, lazily executed. The concept should be brilliant. The implementation should take an afternoon. If it takes longer, you're overthinking it.
|
||||
|
||||
### For Makers & Artists
|
||||
|
||||
**Blatantly copy something:**
|
||||
Pick something you admire — a tool, an artwork, an interface. Recreate it from scratch. The learning is in the gap between your version and theirs.
|
||||
|
||||
**One million of something:**
|
||||
One million is both a lot and not that much. One million pixels is a 1MB photo. One million API calls is a Tuesday. One million of anything becomes interesting at scale.
|
||||
|
||||
**Make something that dies:**
|
||||
A website that loses a feature every day. A chatbot that forgets. A countdown to nothing. An exercise in rot, killing, or letting go.
|
||||
|
||||
**Do a lot of math:**
|
||||
Generative geometry, shader golf, mathematical art, computational origami. Time to re-learn what an arcsin is.
|
||||
|
||||
### For Anyone
|
||||
|
||||
**Text is the universal interface:**
|
||||
Build something where text is the only interface. No buttons, no graphics, just words in and words out. Text can go in and out of almost anything.
|
||||
|
||||
**Start at the punchline:**
|
||||
Think of something that would be a funny sentence. Work backwards to make it real. "I taught my thermostat to gaslight me" → now build it.
|
||||
|
||||
**Hostile UI:**
|
||||
Make something intentionally painful to use. A password field that requires 47 conditions. A form where every label lies. A CLI that judges your commands.
|
||||
|
||||
**Take two:**
|
||||
Remember an old project. Do it again from scratch. No looking at the original. See what changed about how you think.
|
||||
|
||||
See `references/full-prompt-library.md` for 30+ additional constraints across communication, scale, philosophy, transformation, and more.
|
||||
|
||||
## Matching Constraints to Users
|
||||
|
||||
| User says | Pick from |
|
||||
|-----------|-----------|
|
||||
| "I want to build something" (no direction) | Random — any constraint |
|
||||
| "I'm learning [language]" | Blatantly copy something, Automate the annoying thing |
|
||||
| "I want something weird" | Hostile UI, Frankenstein week, Start at the punchline |
|
||||
| "I want something useful" | Solve your own itch, The CLI that should exist, Automate the annoying thing |
|
||||
| "I want something beautiful" | Do a lot of math, One million of something |
|
||||
| "I'm burned out" | High concept low effort, Make something that dies |
|
||||
| "Weekend project" | Nothing new except glue, Start at the punchline |
|
||||
| "I want a challenge" | One million of something, Subtract, Take two |
|
||||
|
||||
## Output Format
|
||||
|
||||
```
|
||||
## Constraint: [Name]
|
||||
> [The constraint, one sentence]
|
||||
|
||||
### Ideas
|
||||
|
||||
1. **[One-line pitch]**
|
||||
[2-3 sentences: what you'd build and why it's interesting]
|
||||
⏱ [weekend / week / month] • 🔧 [stack]
|
||||
|
||||
2. **[One-line pitch]**
|
||||
[2-3 sentences]
|
||||
⏱ ... • 🔧 ...
|
||||
|
||||
3. **[One-line pitch]**
|
||||
[2-3 sentences]
|
||||
⏱ ... • 🔧 ...
|
||||
```
|
||||
|
||||
## Example
|
||||
|
||||
```
|
||||
## Constraint: The CLI tool that should exist
|
||||
> Think of a command you've wished you could type. Now build it.
|
||||
|
||||
### Ideas
|
||||
|
||||
1. **`git whatsup` — show what happened while you were away**
|
||||
Compares your last active commit to HEAD and summarizes what changed,
|
||||
who committed, and what PRs merged. Like a morning standup from your repo.
|
||||
⏱ weekend • 🔧 Python, GitPython, click
|
||||
|
||||
2. **`explain 503` — HTTP status codes for humans**
|
||||
Pipe any status code or error message and get a plain-English explanation
|
||||
with common causes and fixes. Pulls from a curated database, not an LLM.
|
||||
⏱ weekend • 🔧 Rust or Go, static dataset
|
||||
|
||||
3. **`deps why <package>` — why is this in my dependency tree**
|
||||
Traces a transitive dependency back to the direct dependency that pulled
|
||||
it in. Answers "why do I have 47 copies of lodash" in one command.
|
||||
⏱ weekend • 🔧 Node.js, npm/yarn lockfile parsing
|
||||
```
|
||||
|
||||
After the user picks one, start building — create the project, write the code, iterate.
|
||||
|
||||
## Attribution
|
||||
|
||||
Constraint approach inspired by [wttdotm.com/prompts.html](https://wttdotm.com/prompts.html). Adapted and expanded for software development and general-purpose ideation.
|
||||
+110
@@ -0,0 +1,110 @@
|
||||
# Full Prompt Library
|
||||
|
||||
Extended constraint library beyond the core set in SKILL.md. Load these when the user wants more variety or a specific category.
|
||||
|
||||
## Communication & Connection
|
||||
|
||||
**Create a means of distribution:**
|
||||
The project works when you can use what you made to give something to somebody else.
|
||||
|
||||
**Make a way to communicate:**
|
||||
The project works when you can hold a conversation with someone else using what you created. Not chat — something weirder.
|
||||
|
||||
**Write a love letter:**
|
||||
To a person, a programming language, a game, a place, a tool. On paper, in code, in music, in light. Mail it.
|
||||
|
||||
**Mail chess / Asynchronous games:**
|
||||
Something turn-based played with no time limit. No requirement to be there at the same time. The game happens in the gaps.
|
||||
|
||||
**Twitch plays X:**
|
||||
A group of people share control over something. Collective input, emergent behavior.
|
||||
|
||||
## Screens & Interfaces
|
||||
|
||||
**Something for your desktop:**
|
||||
You spend a lot of time there. Spruce it up. A custom clock, a pet that lives in your terminal, a wallpaper that changes based on your git activity.
|
||||
|
||||
**One screen, two screen, old screen, new screen:**
|
||||
Take something you associate with one screen and put it on a very different one. DOOM on a smart fridge. A spreadsheet on a watch. A terminal in a painting.
|
||||
|
||||
**Make a mirror:**
|
||||
Something that reflects the viewer back at themselves. A website that shows your browsing history. A CLI that prints your git sins.
|
||||
|
||||
## Philosophy & Concept
|
||||
|
||||
**Code as koan, koan as code:**
|
||||
What is the sound of one hand clapping? A program that answers a question it wasn't asked. A function that returns before it's called.
|
||||
|
||||
**The useless tree:**
|
||||
Make something useless. Deliberately, completely, beautifully useless. No utility. No purpose. No point. That's the point.
|
||||
|
||||
**Artificial stupidity:**
|
||||
Make fun of AI by showcasing its faults. Mistrain it. Lie to it. Build the opposite of what AI is supposed to be good at.
|
||||
|
||||
**"I use technology in order to hate it properly":**
|
||||
Make something inspired by the tension between loving and hating your tools.
|
||||
|
||||
**The more things change, the more they stay the same:**
|
||||
Reflect on time, difference, and similarity.
|
||||
|
||||
## Transformation
|
||||
|
||||
**Translate:**
|
||||
Take something meant for one audience and make it understandable by another. A research paper as a children's book. An API as a board game. A song as an architecture diagram.
|
||||
|
||||
**I mean, I GUESS you could store something that way:**
|
||||
The project works when you can save and open something. Store data in DNS caches. Encode a novel in emoji. Write a file system on top of something that isn't a file system.
|
||||
|
||||
**I mean, I GUESS those could be pixels:**
|
||||
The project works when you can display an image. Render anything visual in a medium that wasn't meant for rendering.
|
||||
|
||||
## Identity & Reflection
|
||||
|
||||
**Make a self-portrait:**
|
||||
Be yourself? Be fake? Be real? In code, in data, in sound, in a directory structure.
|
||||
|
||||
**Make a pun:**
|
||||
The stupider the better. Physical, digital, linguistic, visual. The project IS the joke.
|
||||
|
||||
**Doors, walls, borders, barriers, boundaries:**
|
||||
Things that intermediate two places: opening, closing, permeating, excluding, combining.
|
||||
|
||||
## Scale & Repetition
|
||||
|
||||
**Lists!:**
|
||||
Itemizations, taxonomies, exhaustive recountings, iterations. This one. A list of list of lists.
|
||||
|
||||
**Did you mean *recursion*?**
|
||||
Did you mean recursion?
|
||||
|
||||
**Animals:**
|
||||
Lions, and tigers, and bears. Crab logic gates. Fish plays the stock market.
|
||||
|
||||
**Cats:**
|
||||
Where would the internet be without them.
|
||||
|
||||
## Starting Points
|
||||
|
||||
**An idea that comes from a book:**
|
||||
Read something. Make something inspired by it.
|
||||
|
||||
**Go to a museum:**
|
||||
Project ensues.
|
||||
|
||||
**NPC loot:**
|
||||
What do you drop when you die? What do you take on your journey? Build the item.
|
||||
|
||||
**Mythological objects and entities:**
|
||||
Pandora's box, the ocarina of time, the palantir. Build the artifact.
|
||||
|
||||
**69:**
|
||||
Nice. Make something with the joke being the number 69.
|
||||
|
||||
**Office Space printer scene:**
|
||||
Capture the same energy. Channel the catharsis of destroying the thing that frustrates you.
|
||||
|
||||
**Borges week:**
|
||||
Something inspired by the Argentine. The library of babel. The map that is the territory.
|
||||
|
||||
**Lights!:**
|
||||
LED throwies, light installations, illuminated anything. Make something that glows.
|
||||
@@ -0,0 +1,196 @@
|
||||
---
|
||||
name: design-md
|
||||
description: Author, validate, diff, and export DESIGN.md files — Google's open-source format spec that gives coding agents a persistent, structured understanding of a design system (tokens + rationale in one file). Use when building a design system, porting style rules between projects, generating UI with consistent brand, or auditing accessibility/contrast.
|
||||
version: 1.0.0
|
||||
author: Hermes Agent
|
||||
license: MIT
|
||||
metadata:
|
||||
hermes:
|
||||
tags: [design, design-system, tokens, ui, accessibility, wcag, tailwind, dtcg, google]
|
||||
related_skills: [popular-web-designs, excalidraw, architecture-diagram]
|
||||
---
|
||||
|
||||
# DESIGN.md Skill
|
||||
|
||||
DESIGN.md is Google's open spec (Apache-2.0, `google-labs-code/design.md`) for
|
||||
describing a visual identity to coding agents. One file combines:
|
||||
|
||||
- **YAML front matter** — machine-readable design tokens (normative values)
|
||||
- **Markdown body** — human-readable rationale, organized into canonical sections
|
||||
|
||||
Tokens give exact values. Prose tells agents *why* those values exist and how to
|
||||
apply them. The CLI (`npx @google/design.md`) lints structure + WCAG contrast,
|
||||
diffs versions for regressions, and exports to Tailwind or W3C DTCG JSON.
|
||||
|
||||
## When to use this skill
|
||||
|
||||
- User asks for a DESIGN.md file, design tokens, or a design system spec
|
||||
- User wants consistent UI/brand across multiple projects or tools
|
||||
- User pastes an existing DESIGN.md and asks to lint, diff, export, or extend it
|
||||
- User asks to port a style guide into a format agents can consume
|
||||
- User wants contrast / WCAG accessibility validation on their color palette
|
||||
|
||||
For purely visual inspiration or layout examples, use `popular-web-designs`
|
||||
instead. This skill is for the *formal spec file* itself.
|
||||
|
||||
## File anatomy
|
||||
|
||||
```md
|
||||
---
|
||||
version: alpha
|
||||
name: Heritage
|
||||
description: Architectural minimalism meets journalistic gravitas.
|
||||
colors:
|
||||
primary: "#1A1C1E"
|
||||
secondary: "#6C7278"
|
||||
tertiary: "#B8422E"
|
||||
neutral: "#F7F5F2"
|
||||
typography:
|
||||
h1:
|
||||
fontFamily: Public Sans
|
||||
fontSize: 3rem
|
||||
fontWeight: 700
|
||||
lineHeight: 1.1
|
||||
letterSpacing: "-0.02em"
|
||||
body-md:
|
||||
fontFamily: Public Sans
|
||||
fontSize: 1rem
|
||||
rounded:
|
||||
sm: 4px
|
||||
md: 8px
|
||||
lg: 16px
|
||||
spacing:
|
||||
sm: 8px
|
||||
md: 16px
|
||||
lg: 24px
|
||||
components:
|
||||
button-primary:
|
||||
backgroundColor: "{colors.tertiary}"
|
||||
textColor: "#FFFFFF"
|
||||
rounded: "{rounded.sm}"
|
||||
padding: 12px
|
||||
button-primary-hover:
|
||||
backgroundColor: "{colors.primary}"
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Architectural Minimalism meets Journalistic Gravitas...
|
||||
|
||||
## Colors
|
||||
|
||||
- **Primary (#1A1C1E):** Deep ink for headlines and core text.
|
||||
- **Tertiary (#B8422E):** "Boston Clay" — the sole driver for interaction.
|
||||
|
||||
## Typography
|
||||
|
||||
Public Sans for everything except small all-caps labels...
|
||||
|
||||
## Components
|
||||
|
||||
`button-primary` is the only high-emphasis action on a page...
|
||||
```
|
||||
|
||||
## Token types
|
||||
|
||||
| Type | Format | Example |
|
||||
|------|--------|---------|
|
||||
| Color | `#` + hex (sRGB) | `"#1A1C1E"` |
|
||||
| Dimension | number + unit (`px`, `em`, `rem`) | `48px`, `-0.02em` |
|
||||
| Token reference | `{path.to.token}` | `{colors.primary}` |
|
||||
| Typography | object with `fontFamily`, `fontSize`, `fontWeight`, `lineHeight`, `letterSpacing`, `fontFeature`, `fontVariation` | see above |
|
||||
|
||||
Component property whitelist: `backgroundColor`, `textColor`, `typography`,
|
||||
`rounded`, `padding`, `size`, `height`, `width`. Variants (hover, active,
|
||||
pressed) are **separate component entries** with related key names
|
||||
(`button-primary-hover`), not nested.
|
||||
|
||||
## Canonical section order
|
||||
|
||||
Sections are optional, but present ones MUST appear in this order. Duplicate
|
||||
headings reject the file.
|
||||
|
||||
1. Overview (alias: Brand & Style)
|
||||
2. Colors
|
||||
3. Typography
|
||||
4. Layout (alias: Layout & Spacing)
|
||||
5. Elevation & Depth (alias: Elevation)
|
||||
6. Shapes
|
||||
7. Components
|
||||
8. Do's and Don'ts
|
||||
|
||||
Unknown sections are preserved, not errored. Unknown token names are accepted
|
||||
if the value type is valid. Unknown component properties produce a warning.
|
||||
|
||||
## Workflow: authoring a new DESIGN.md
|
||||
|
||||
1. **Ask the user** (or infer) the brand tone, accent color, and typography
|
||||
direction. If they provided a site, image, or vibe, translate it to the
|
||||
token shape above.
|
||||
2. **Write `DESIGN.md`** in their project root using `write_file`. Always
|
||||
include `name:` and `colors:`; other sections optional but encouraged.
|
||||
3. **Use token references** (`{colors.primary}`) in the `components:` section
|
||||
instead of re-typing hex values. Keeps the palette single-source.
|
||||
4. **Lint it** (see below). Fix any broken references or WCAG failures
|
||||
before returning.
|
||||
5. **If the user has an existing project**, also write Tailwind or DTCG
|
||||
exports next to the file (`tailwind.theme.json`, `tokens.json`).
|
||||
|
||||
## Workflow: lint / diff / export
|
||||
|
||||
The CLI is `@google/design.md` (Node). Use `npx` — no global install needed.
|
||||
|
||||
```bash
|
||||
# Validate structure + token references + WCAG contrast
|
||||
npx -y @google/design.md lint DESIGN.md
|
||||
|
||||
# Compare two versions, fail on regression (exit 1 = regression)
|
||||
npx -y @google/design.md diff DESIGN.md DESIGN-v2.md
|
||||
|
||||
# Export to Tailwind theme JSON
|
||||
npx -y @google/design.md export --format tailwind DESIGN.md > tailwind.theme.json
|
||||
|
||||
# Export to W3C DTCG (Design Tokens Format Module) JSON
|
||||
npx -y @google/design.md export --format dtcg DESIGN.md > tokens.json
|
||||
|
||||
# Print the spec itself — useful when injecting into an agent prompt
|
||||
npx -y @google/design.md spec --rules-only --format json
|
||||
```
|
||||
|
||||
All commands accept `-` for stdin. `lint` returns exit 1 on errors. Use the
|
||||
`--format json` flag and parse the output if you need to report findings
|
||||
structurally.
|
||||
|
||||
### Lint rule reference (what the 7 rules catch)
|
||||
|
||||
- `broken-ref` (error) — `{colors.missing}` points at a non-existent token
|
||||
- `duplicate-section` (error) — same `## Heading` appears twice
|
||||
- `invalid-color`, `invalid-dimension`, `invalid-typography` (error)
|
||||
- `wcag-contrast` (warning/info) — component `textColor` vs `backgroundColor`
|
||||
ratio against WCAG AA (4.5:1) and AAA (7:1)
|
||||
- `unknown-component-property` (warning) — outside the whitelist above
|
||||
|
||||
When the user cares about accessibility, call this out explicitly in your
|
||||
summary — WCAG findings are the most load-bearing reason to use the CLI.
|
||||
|
||||
## Pitfalls
|
||||
|
||||
- **Don't nest component variants.** `button-primary.hover` is wrong;
|
||||
`button-primary-hover` as a sibling key is right.
|
||||
- **Hex colors must be quoted strings.** YAML will otherwise choke on `#` or
|
||||
truncate values like `#1A1C1E` oddly.
|
||||
- **Negative dimensions need quotes too.** `letterSpacing: -0.02em` parses as
|
||||
a YAML flow — write `letterSpacing: "-0.02em"`.
|
||||
- **Section order is enforced.** If the user gives you prose in a random order,
|
||||
reorder it to match the canonical list before saving.
|
||||
- **`version: alpha` is the current spec version** (as of Apr 2026). The spec
|
||||
is marked alpha — watch for breaking changes.
|
||||
- **Token references resolve by dotted path.** `{colors.primary}` works;
|
||||
`{primary}` does not.
|
||||
|
||||
## Spec source of truth
|
||||
|
||||
- Repo: https://github.com/google-labs-code/design.md (Apache-2.0)
|
||||
- CLI: `@google/design.md` on npm
|
||||
- License of generated DESIGN.md files: whatever the user's project uses;
|
||||
the spec itself is Apache-2.0.
|
||||
@@ -0,0 +1,99 @@
|
||||
---
|
||||
version: alpha
|
||||
name: MyBrand
|
||||
description: One-sentence description of the visual identity.
|
||||
colors:
|
||||
primary: "#0F172A"
|
||||
secondary: "#64748B"
|
||||
tertiary: "#2563EB"
|
||||
neutral: "#F8FAFC"
|
||||
on-primary: "#FFFFFF"
|
||||
on-tertiary: "#FFFFFF"
|
||||
typography:
|
||||
h1:
|
||||
fontFamily: Inter
|
||||
fontSize: 3rem
|
||||
fontWeight: 700
|
||||
lineHeight: 1.1
|
||||
letterSpacing: "-0.02em"
|
||||
h2:
|
||||
fontFamily: Inter
|
||||
fontSize: 2rem
|
||||
fontWeight: 600
|
||||
lineHeight: 1.2
|
||||
body-md:
|
||||
fontFamily: Inter
|
||||
fontSize: 1rem
|
||||
lineHeight: 1.5
|
||||
label-caps:
|
||||
fontFamily: Inter
|
||||
fontSize: 0.75rem
|
||||
fontWeight: 600
|
||||
letterSpacing: "0.08em"
|
||||
rounded:
|
||||
sm: 4px
|
||||
md: 8px
|
||||
lg: 16px
|
||||
full: 9999px
|
||||
spacing:
|
||||
xs: 4px
|
||||
sm: 8px
|
||||
md: 16px
|
||||
lg: 24px
|
||||
xl: 48px
|
||||
components:
|
||||
button-primary:
|
||||
backgroundColor: "{colors.tertiary}"
|
||||
textColor: "{colors.on-tertiary}"
|
||||
rounded: "{rounded.sm}"
|
||||
padding: 12px
|
||||
button-primary-hover:
|
||||
backgroundColor: "{colors.primary}"
|
||||
textColor: "{colors.on-primary}"
|
||||
card:
|
||||
backgroundColor: "{colors.neutral}"
|
||||
textColor: "{colors.primary}"
|
||||
rounded: "{rounded.md}"
|
||||
padding: 24px
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Describe the voice and feel of the brand in one or two paragraphs. What mood
|
||||
does it evoke? What emotional response should a user have on first impression?
|
||||
|
||||
## Colors
|
||||
|
||||
- **Primary ({colors.primary}):** Core text, headlines, high-emphasis surfaces.
|
||||
- **Secondary ({colors.secondary}):** Supporting text, borders, metadata.
|
||||
- **Tertiary ({colors.tertiary}):** Interaction driver — buttons, links,
|
||||
selected states. Use sparingly to preserve its signal.
|
||||
- **Neutral ({colors.neutral}):** Page background and surface fills.
|
||||
|
||||
## Typography
|
||||
|
||||
Inter for everything. Weight and size carry hierarchy, not font family. Tight
|
||||
letter-spacing on display sizes; default tracking on body.
|
||||
|
||||
## Layout
|
||||
|
||||
Spacing scale is a 4px baseline. Use `md` (16px) for intra-component gaps,
|
||||
`lg` (24px) for inter-component gaps, `xl` (48px) for section breaks.
|
||||
|
||||
## Shapes
|
||||
|
||||
Rounded corners are modest — `sm` on interactive elements, `md` on cards.
|
||||
`full` is reserved for avatars and pill badges.
|
||||
|
||||
## Components
|
||||
|
||||
- `button-primary` is the only high-emphasis action per screen.
|
||||
- `card` is the default surface for grouped content. No shadow by default.
|
||||
|
||||
## Do's and Don'ts
|
||||
|
||||
- **Do** use token references (`{colors.primary}`) instead of literal hex in
|
||||
component definitions.
|
||||
- **Don't** introduce colors outside the palette — extend the palette first.
|
||||
- **Don't** nest component variants. `button-primary-hover` is a sibling,
|
||||
not a child.
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user