open-jet Docs

open-jet is a terminal-first agent for Jetson and Linux edge devices. It runs quantized models locally via llama.cpp, supports guarded tool calls, and manages context within RAM/token limits.

Install

Prerequisites:

  • llama-server from llama.cpp available on PATH
  • A local .gguf model file, or ollama for the pull workflow
  • Python 3.10+

```sh
python -m venv .venv
source .venv/bin/activate
python -m pip install -e .
```

Quickstart

  1. Run open-jet.
  2. Complete setup: hardware profile, model source, context size, and GPU layers.
  3. Chat normally, or load file context using @file, @[path with spaces], or /load path.
  4. Approve or deny state-changing tool calls with y / n.
```sh
open-jet          # start the TUI (first run opens the setup wizard)
open-jet --setup  # force the setup wizard
```

CLI Commands

| Command | Description |
| --- | --- |
| `open-jet` | Start the TUI. On first run, the setup wizard writes `config.yaml`. |
| `open-jet --setup` | Force the setup wizard, then restart the runtime with the updated config. |

Slash Commands

| Slash command | Aliases | Description |
| --- | --- | --- |
| `/help` | `commands`, `?` | Show command help |
| `/exit` | `quit` | Quit the app |
| `/clear` | `reset` | Clear chat and restart llama-server (flush KV cache) |
| `/clear-chat` | `clear_messages` | Clear chat only, keep current server/KV state |
| `/status` | `stats` | Show runtime memory/context status |
| `/condense` | - | Manually condense older context |
| `/load <path>` | `add` | Load file into context |
| `/resume` | - | Load previous session state into chat |
| `/setup` | - | Open setup wizard and restart runtime |
| `/util [show\|hide\|toggle\|status]` | `usage` | Show/hide utilization line |

Agent Command Surface

The agent can call these runtime tools directly. This is the full callable command surface exposed to the model.

| Tool | Arguments | Behavior | Approval required |
| --- | --- | --- | --- |
| `shell` | `{ command: string, timeout_seconds?: int }` | Run shell command in subprocess; default timeout 60s; timeout returns exit code 124 | Yes |
| `read_file` | `{ path: string }` | Read file text (up to 2MB safety cap) | No |
| `write_file` | `{ path: string, content: string }` | Create/overwrite file; creates parent dirs if needed | Yes |
| `load_file` | `{ path: string, max_tokens?: int }` | Load text/code file into prompt context with RAM + token budget clamping | No |
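The approval gating above can be illustrated with a minimal dispatcher. This is a hedged sketch, not open-jet's actual implementation: the function names and registry layout are assumptions, but the behaviors (60s default timeout, exit code 124 on timeout, 2MB read cap, parent-dir creation, mutation gating) follow the table.

```python
import os
import subprocess

# Illustrative: tools that mutate state are gated behind explicit approval.
MUTATING_TOOLS = {"shell", "write_file"}

def dispatch(tool: str, args: dict, approve) -> str:
    """Run one tool call; `approve(tool, args)` is consulted before mutations."""
    if tool in MUTATING_TOOLS and not approve(tool, args):
        return "denied by user"
    if tool == "shell":
        try:
            proc = subprocess.run(
                args["command"], shell=True, capture_output=True,
                text=True, timeout=args.get("timeout_seconds", 60),
            )
            return proc.stdout
        except subprocess.TimeoutExpired:
            return "exit code 124 (timeout)"
    if tool == "read_file":
        with open(args["path"], errors="replace") as f:
            return f.read(2 * 1024 * 1024)  # 2MB safety cap
    if tool == "write_file":
        os.makedirs(os.path.dirname(args["path"]) or ".", exist_ok=True)
        with open(args["path"], "w") as f:
            f.write(args["content"])
        return "wrote " + args["path"]
    raise ValueError(f"unknown tool: {tool}")
```

A denied call never reaches the subprocess; read-only tools bypass the approval prompt entirely.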

File Mentions and Tab Autocomplete

Input behavior for file loading and command discovery follows the runtime usage flow documented in the project README.

| Pattern | Behavior |
| --- | --- |
| `@file` | Loads referenced file content into context automatically. |
| `@[path with spaces]` | Loads files whose path contains spaces. |
| Tab after `@` | Autocompletes file paths from the current workspace. |
| `/` then Up/Down + Tab | Shows slash command suggestions, cycles options, and autocompletes the selected command. |
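The two mention forms above could be recognized with a small parser. This sketch is illustrative; open-jet's actual mention grammar may differ.

```python
import re

# @[path with spaces] is tried first, then bare @path (up to whitespace).
MENTION = re.compile(r"@\[([^\]]+)\]|@(\S+)")

def extract_mentions(text: str) -> list[str]:
    """Return file paths referenced with @path or @[path with spaces]."""
    return [bracketed or bare for bracketed, bare in MENTION.findall(text)]

paths = extract_mentions("compare @src/main.py with @[my notes/draft 2.md]")
# paths == ["src/main.py", "my notes/draft 2.md"]
```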

Interaction Controls

| Area | Keys | Result |
| --- | --- | --- |
| Tool approvals | `y` / `n`, Left/Right, Enter | Approve/deny proposed state-changing tool call |
| Setup wizard | Up/Down, Tab/Enter, Shift+Tab | Navigate setup steps, save final selection |
| Prompt completion | Type `/` or `@`, then Up/Down + Tab | Cycle suggestions and apply selected completion |
| Generation stop | Esc | Stop current generation |
| Quit | Ctrl+C or `/exit` | Exit app and persist session state |

Runtime Functionality

| Capability | Details |
| --- | --- |
| File mentions | `@file` and `@[path with spaces]` auto-load file content into context |
| `@` path autocomplete | Tab autocompletes `@` file paths from the current workspace |
| Slash autocomplete | Typing `/` shows command suggestions; Up/Down cycles and Tab applies |
| Token budget display | Live tokens: context + draft, prompt budget, remaining |
| Memory-aware condense | Auto-condenses context when budget or RAM pressure is hit |
| Utilization line | CPU, memory, TPS, power/battery status; toggled via `/util` |
| Session logs | `events.jsonl` + `metrics.jsonl` in `session_logs/` |
| Session resume | Persisted `session_state.json` and `/resume` restore |
| Model source workflows | Local GGUF or Ollama model selection + pull path resolution |
| Safety defaults | Mutation tools gated by explicit approval with preview |

Model Guidance

Setup recommends model tags from curated parameter bands based on detected or manually selected hardware.

| Recommendation band | Suggested Ollama tags |
| --- | --- |
| <= 2B params | `qwen2.5:1.5b`, `deepseek-r1:1.5b`, `gemma2:2b` |
| <= 4B params | `qwen2.5:3b`, `qwen2.5:3b-instruct`, `gemma2:2b` |
| <= 8B params | `qwen2.5:7b`, `mistral:7b`, `deepseek-r1:7b` |
| <= 14B params | `qwen2.5:14b`, `deepseek-r1:14b`, `gemma2:9b` |
| <= 32B params | `qwen2.5:32b`, `qwen2.5-coder:32b`, `gemma2:27b` |

Hardware Profiles

Manual setup includes these Jetson profile overrides:

  • Jetson Nano (4GB RAM)
  • Jetson Xavier NX (8GB RAM)
  • Jetson Orin Nano (8GB RAM)
  • Jetson Orin NX (16GB RAM)
  • Jetson AGX Orin (32GB RAM)
  • Jetson AGX Orin (64GB RAM)

Configuration

Runtime config is stored in config.yaml. Key controls include model selection, context window, memory guard thresholds, logging, and session state behavior.

```yaml
context_window_tokens: 4096
device: cuda
gpu_layers: 20
model_source: local
model: /absolute/path/to/model.gguf
ollama_model: gemma2:2b
memory_guard:
  context_reserved_tokens: null
  min_prompt_tokens: 256
  min_available_mb: null
  max_used_percent: null
  check_interval_chunks: 16
  condense_target_tokens: 900
  keep_last_messages: 6
logging:
  enabled: true
  directory: session_logs
  label: open-jet
  metrics_interval_seconds: 5
state:
  enabled: true
  auto_resume: false
  path: session_state.json
```

Logging and Session State

  • Structured logs are written to session_logs/ by default.
  • *.events.jsonl: user/assistant messages, tool requests/results, approvals, errors.
  • *.metrics.jsonl: timestamped CPU, load, memory, and process samples.
  • Session state persists in session_state.json and can be restored with /resume.
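Both log files are plain JSONL (one JSON object per line), so they can be inspected with a few lines of Python. The reader below is generic; the record fields themselves vary by event type, so inspect your own files for the schema.

```python
import json

def read_jsonl(path: str) -> list[dict]:
    """Parse one JSON record per line, skipping blank lines."""
    records = []
    with open(path) as f:
        for line in f:
            if line.strip():
                records.append(json.loads(line))
    return records

# e.g. count tool-related events in a session's events log (the "type"
# field name here is an assumption about the schema):
# events = read_jsonl("session_logs/open-jet.events.jsonl")
# tool_events = [e for e in events if "tool" in e.get("type", "")]
```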

Edge Constraints and Safety

  • Context loading accepts only text/code files.
  • Large or over-budget file loads are truncated with explicit markers.
  • Runtime automatically condenses context when token or memory pressure exceeds configured thresholds.
  • load_file token budgets are clamped against remaining prompt budget at runtime.
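The truncation-with-marker and budget-clamping behaviors can be sketched together. The marker text and the rough 4-characters-per-token estimate are illustrative assumptions, not open-jet's actual values.

```python
def clamp_load(text: str, max_tokens: int, remaining_budget: int) -> str:
    """Clamp a load_file request against the remaining prompt budget.

    Over-budget content is cut and tagged with an explicit marker so the
    model knows the file was truncated.
    """
    budget = min(max_tokens, remaining_budget)
    approx_chars = budget * 4  # rough chars-per-token estimate
    if len(text) <= approx_chars:
        return text
    return text[:approx_chars] + "\n[... truncated: token budget reached]"
```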