v2.0: Zero Dependencies

Corino-Gate

Same GPU. Zero waste. Full power. GPU broker plus local control plane for modes, gates, and workstation fixes. 74 KB of Python, zero dependencies.

Download Linux/WSL Installer | Windows Beta Installer | Quick Start →
$ gate status
GPU 0  RTX 4060  2.1/8.0 GiB free
Leases: 1 active  |  Priority: focus
RAM guard: OK  |  Fragmentation: low
$ gate cofit
llama3.2:7b + mistral:7b = 6.8G ✓

GPU control plus workstation control plane

Single daemon, stdlib-only Python, zero external dependencies. It arbitrates VRAM and also exposes modes, access gates, SSE events, and a 100-problem workstation catalog.

🔒

VRAM Admission Control

Priority-based leases with a TTL. No workload is admitted beyond the GPU's VRAM budget, and idle leases expire automatically.

Anti-Thrash Engine

Adaptive throttling that scales with GPU contention. Prevents model load/unload storms that kill throughput.

🧩

Co-Residency Advisor

The /cofit endpoint reports which models can run in parallel within the available VRAM.
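The idea behind a co-residency check can be sketched as a subset-sum filter: a group of models fits if their combined footprint stays within free VRAM. The function name `cofit_groups`, the model names, and the footprints below are hypothetical; the actual /cofit request and response format is not documented here.

```python
from itertools import combinations


def cofit_groups(models: dict[str, int], free_mib: int) -> list[tuple[str, ...]]:
    """Return every group of two or more models whose summed footprint fits."""
    names = sorted(models)
    fits = []
    for size in range(2, len(names) + 1):
        for group in combinations(names, size):
            if sum(models[n] for n in group) <= free_mib:
                fits.append(group)
    return fits


# Hypothetical footprints in MiB, for illustration only.
footprints = {"model-a": 3400, "model-b": 3400, "model-c": 5000}
groups = cofit_groups(footprints, free_mib=8000)
```

Enumerating all subsets is exponential in the number of models, which is acceptable for the handful of models a single workstation GPU typically hosts.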

📋

Execution Planner

Computes optimal sequential and parallel schedules for multi-model pipelines to maximize GPU utilization.
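One simple way to think about such schedules is bin packing: models in the same batch run in parallel, and batches run one after another. The sketch below uses the first-fit-decreasing heuristic, which is not guaranteed to be optimal and is not necessarily what the planner does; `plan_batches` and the sample footprints are assumptions for illustration.

```python
def plan_batches(models: dict[str, int], budget_mib: int) -> list[list[str]]:
    """Group models into sequential batches; each batch fits in VRAM together."""
    batches: list[list[str]] = []
    loads: list[int] = []
    # Place the largest models first (ties broken by name for determinism).
    for name, need in sorted(models.items(), key=lambda kv: (-kv[1], kv[0])):
        if need > budget_mib:
            raise ValueError(f"{name} needs {need} MiB, budget is {budget_mib} MiB")
        for i, load in enumerate(loads):
            if load + need <= budget_mib:
                batches[i].append(name)  # runs in parallel with batch i
                loads[i] += need
                break
        else:
            batches.append([name])  # no existing batch has room: new batch
            loads.append(need)
    return batches


# Hypothetical pipeline footprints in MiB.
plan = plan_batches({"a": 5000, "b": 3000, "c": 4000, "d": 2000}, budget_mib=8000)
```

Here `plan` holds two batches: `a` and `b` run together, then `c` and `d`.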

🚧

Access Gates

Named gates like browser-auth, meeting, and deploy can be claimed and released by local tabs and tools.

📡

Realtime Tab Sync

/api/events streams mode and gate changes live. /sdk/gates.js lets browser tabs subscribe without inventing their own state store.
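For tools that are not browser tabs, the stream can be consumed by parsing the standard Server-Sent Events line format. The sketch below handles the `event:` and `data:` fields and comment lines; the event names and JSON payloads in the sample are invented, since the actual messages emitted by /api/events are not documented here.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_name, data) pairs from Server-Sent Events text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is stripped
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical sample stream; real event names and payloads may differ.
sample = ["event: gate\n", 'data: {"name": "deploy"}\n', "\n",
          ": keepalive\n", "data: hi\n", "\n"]
events = list(parse_sse(sample))
```

In practice the lines would come from an open HTTP response to /api/events, decoded as UTF-8, rather than from a list.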

🔧

Workstation Fix Catalog

The 100-problem catalog now ships with all 100 diagnostics implemented, covering ports, services, Docker, git hygiene, browser groups, memory pressure, and more.

🖧

Remote Fleet MVP

Register remote Corino-Gate nodes and aggregate health, mode, gates, and fix coverage into one control surface.

🔐

Step-Up Protection

Sensitive Hydra actions can require two fresh proofs: platform passkey plus hardware security key, both bound to method, path, and body hash.
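Binding a proof to a request generally means deriving a digest from the request's method, path, and body so that a proof issued for one request cannot be replayed on another. The sketch below shows one plausible construction using SHA-256; the canonical layout and the function name `request_binding` are assumptions, not the format Corino-Gate actually uses.

```python
import hashlib


def request_binding(method: str, path: str, body: bytes) -> str:
    """Derive a digest that changes if method, path, or body changes."""
    body_hash = hashlib.sha256(body).hexdigest()
    # Newline-separated canonical form (an assumed layout for this sketch).
    canonical = f"{method.upper()}\n{path}\n{body_hash}".encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


binding = request_binding("POST", "/api/fleet/nodes/remove", b'{"node": "lab-1"}')
```

A verifier would recompute the digest from the incoming request and compare it with the value each proof was issued for.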

🛡

RAM Pressure Guard

Three levels: warn, pressure, emergency. Manages OOM scores to keep runaway processes from taking down the whole system.
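A three-level guard can be pictured as thresholds on the fraction of RAM still available. The sketch below reads that fraction from /proc/meminfo-style text and maps it to a level; the threshold values (20%, 10%, 5%) and both function names are assumptions for illustration, not the values ram_guard.py uses.

```python
def mem_available_fraction(meminfo_text: str) -> float:
    """Return MemAvailable / MemTotal from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # first token is the value
    return fields["MemAvailable"] / fields["MemTotal"]


def pressure_level(available: float, warn: float = 0.20,
                   pressure: float = 0.10, emergency: float = 0.05) -> str:
    """Map the available-RAM fraction to a level (thresholds are assumed)."""
    if available < emergency:
        return "emergency"
    if available < pressure:
        return "pressure"
    if available < warn:
        return "warn"
    return "ok"


sample = "MemTotal:       16000000 kB\nMemFree:          500000 kB\nMemAvailable:    1200000 kB\n"
level = pressure_level(mem_available_fraction(sample))
```

On Linux the text would come from reading `/proc/meminfo`; the sample string stands in for it here.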

🔍

Squatter Enforcement

Detects unmanaged GPU processes consuming VRAM outside the broker, giving full visibility into rogue workloads.

Probationary Leases

Unused leases are auto-demoted. No resource hoarding. GPU time goes to workloads that actually need it.

🧊

Fragmentation Detection

Monitors GPU memory fragmentation. Alerts when allocation patterns degrade performance below SLO thresholds.

🔄

Soft Preemption

Drains active requests before evicting. No killed inferences. Graceful handoff between priority levels.

🚨

Boot Safe Mode

After a crash, the daemon starts in conservative mode with reduced allocations until stability is confirmed. Self-healing.

Hot-Reload Policy

Edit policy.json, then POST /reload. No restart, no downtime.

Latency SLOs

Per-priority-class latency targets. The broker enforces response time contracts across competing workloads.

REST API on port 18600

Endpoints accept and return JSON, apart from the SSE stream at /api/events and the browser-facing /sdk/gates.js and /ui. The daemon binds to 127.0.0.1 by default.

Method  Path                        Description
POST    /acquire                    Request a VRAM lease
POST    /release                    Free a lease
POST    /renew                      Extend lease TTL
POST    /cofit                      Co-residency advisor
POST    /priority                   Set daily focus priority
POST    /reload                     Hot-reload policy.json
POST    /api/mode                   Change control-plane mode
GET     /api/gates                  Access-gate snapshot
POST    /api/gates/claim            Claim named gate
POST    /api/gates/release          Release named gate
GET     /api/events                 SSE stream for local tabs
GET     /sdk/gates.js               Browser SDK for tabs
GET     /api/fixes                  100-problem fix catalog
POST    /api/fixes/run              Run implemented diagnostic or fix
GET     /api/fleet                  Fleet snapshot
GET     /api/fleet/agent/status     Sanitized local node status
POST    /api/fleet/nodes            Add or update fleet node
POST    /api/fleet/nodes/remove     Remove fleet node
GET     /ui                         Local dashboard and control plane
GET     /status                     Full broker snapshot
GET     /leases                     Active leases
GET     /ledger                     GPU state ledger
GET     /health                     Health check
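Since the daemon speaks JSON over HTTP on 127.0.0.1:18600, a client needs nothing beyond the standard library. The sketch below builds a POST to /acquire; the payload fields `tag` and `vram_mib` are borrowed from the Python client's keyword arguments and are only an assumption about the REST schema, as are the helper names `build_post` and `call`.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:18600"  # default bind address and port


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the given endpoint path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(req: urllib.request.Request, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON reply (needs a running daemon)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Assumed payload shape; check the daemon's actual schema before relying on it.
req = build_post("/acquire", {"tag": "my_job", "vram_mib": 15000})
```

Calling `call(req)` sends the request; it is kept separate so the request can be inspected without a daemon running.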

How it works

A single daemon mediates all GPU access. Clients acquire leases before touching VRAM.

Clients
Ollama
PyTorch
llama.cpp
ComfyUI
Custom Scripts
↓ acquire / release / renew ↓
gated.py — :18600
policy.json
ram_guard.py
lifecycle.py
↓ nvidia-smi ↓
NVIDIA GPU (RTX 5090 / 23.9 GiB)

Up and running in 60 seconds

Install

Run the installer and the first-run settings screen opens automatically.

bash
# Extract and install
unzip corino-gate-linux.zip
cd corino-gate-linux
./install.sh

Start the daemon

The app can launch the daemon for you, or you can start it manually.

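bash
# Launch the app (it can start the daemon for you)
corino-gate --first-run
# Or start the daemon manually, in the background
corino-gate-daemon &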

Use it

CLI for humans, Python client for scripts, REST API for everything else.

bash
# CLI
gate status
gate cofit

python
# Python client
from gate_client import gate_lease

# Request a 15000 MiB lease; run_inference() stands in for your own workload
with gate_lease(tag="my_job", vram_mib=15000) as lease:
    run_inference()
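The `with` form suggests an acquire-on-enter, release-on-exit pattern. The sketch below shows how such a context manager could be built so that the lease is released even when the workload raises. It is not the source of `gate_lease`: the name `lease_scope`, the injected `post` callable, and the payload fields are all assumptions.

```python
from contextlib import contextmanager


@contextmanager
def lease_scope(post, tag: str, vram_mib: int):
    """Acquire a lease, yield it, and always release it afterwards.

    `post(path, payload)` is any callable that sends a request to the
    daemon and returns the decoded reply (hypothetical transport).
    """
    lease = post("/acquire", {"tag": tag, "vram_mib": vram_mib})
    try:
        yield lease
    finally:
        # Runs on normal exit and on exceptions, so no lease is leaked.
        post("/release", {"tag": tag})


# Stand-in transport that records calls instead of talking to a daemon.
calls = []

def fake_post(path, payload):
    calls.append((path, payload))
    return {"ok": True}

with lease_scope(fake_post, tag="my_job", vram_mib=15000) as lease:
    pass  # the GPU workload would run here
```

Injecting the transport keeps the pattern testable without a running daemon.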

Current external read

DeepSeek V3.2
7/10
"Promising for Linux/WSL power users, incomplete for broad adoption"
Llama 3.2 Local
6/10
"Good GPU broker plus local control plane, but still niche"
Command R7B Local
5/10
"Valuable core, but missing Windows and remote story"

Minimal by design

🐍
Python 3.11+
stdlib only
💻
nvidia-smi
any NVIDIA GPU
🦙
Ollama
optional

Ready to take control of your GPU?

Download Corino-Gate. Zero dependencies. One file. Full control.

Download Linux/WSL installer | Windows beta installer