ManageLM Documentation

Manage your Linux and Windows servers with natural language — securely, instantly, at scale.

Overview

ManageLM is a remote server management platform. Instead of SSH-ing into servers and running commands manually, you describe what you want in plain English and ManageLM takes care of the rest.

Depending on your deployment mode, the portal either runs on your own infrastructure via package installer or Docker (self-hosted), or is hosted by ManageLM (SaaS).

The three components:
  • Portal — The control plane (this web app). Manages accounts, agents, skills, and bridges communication.
  • Agent — A lightweight daemon on each managed Linux or Windows server. Receives tasks, uses an LLM to interpret them, executes commands, and reports back.
  • Claude — Connects via MCP (Model Context Protocol) to the portal. You talk to Claude, Claude talks to your servers.

How It Works

Architecture diagram
  1. You ask Claude to do something on a server, e.g. "Restart nginx on web-01".
  2. Claude calls a tool on the portal via MCP. The tool is auto-generated from the agent's assigned skills.
  3. The portal dispatches the task to the target agent over a persistent WebSocket connection.
  4. The configured LLM (local Ollama/LM Studio, or a cloud provider) interprets the task and generates the shell commands.
  5. Commands are validated against the skill's allowlist before execution — only explicitly permitted commands can run.
  6. Results flow back through the agent → portal → Claude → you.
Key security principle: Agents only initiate outbound connections. No inbound ports are needed on your servers. Commands are validated against a strict allowlist defined by each skill.

What you can do

Just describe what you need in plain English. Here are examples across skills:

Category | Example prompt
Services | "Restart nginx on web-01 and show me the last 20 log lines"
Packages | "Update all packages on production servers"
Users | "Add SSH access for Charly on user deploy on pocmail"
Security | "Run a security audit on all servers and email me a summary"
Access | "Who has sudo on production servers?"
Activity | "Run an activity audit on dev and show who logged in today"
Files | "Add a server block for api.example.com to nginx on web-01"
Firewall | "Open port 8080 on staging servers"
Containers | "List all running Docker containers on docker-01 and show which ones use more than 1GB memory"
Certificates | "Check TLS certificate expiry on all web servers"
Backups | "Show me the last backup status for every agent and which ones are failing"
Database | "Show the slow query log for MySQL on db-01"
Monitoring | "Which servers have disk usage above 85%?"
Multi-server | "Check if chrony is running on all servers, install it where it's missing"
LLM | "Pull llama3.2 on the Ollama server and test it with a simple prompt"

These are not templates — you can phrase requests however you want. The agent interprets intent and adapts to each server's OS (Linux or Windows), package manager, and configuration.

Quick Start

From zero to managing a server with natural language — in under 10 minutes.

The Setup at a Glance

  1. Create an account — Register on the portal and verify your email.
  2. Configure the LLM — Go to Settings → Account. Choose Local LLM (install Ollama and run ollama pull qwen3.5:9b) or Cloud LLM (enter a provider API key).
  3. Import Skills — Go to Agent Skills → Catalog and import the skills your agent will need. Start with system, files, services, packages, and users.
  4. Install the agent — Click Add Agent in the dashboard, copy the install command, and run it on your server.
  5. Approve the agent — The portal detects the enrollment automatically. Verify the hostname and click Approve.
  6. Assign skills — Click on the agent, scroll to Assigned Skills, and assign the skills you imported.
  7. Connect Claude — Copy the MCP connector details from Settings → MCP & API into Claude Desktop or Claude Code.
  8. Run your first task — Ask Claude: "Show me the system info on web-01", or use the portal's Run Task button directly.
That's it! Your agent is running, skills are assigned, and you can manage your server with natural language — either through Claude or the portal UI. Read on for detailed instructions on each step.

Create an Account

  1. Navigate to the portal and click Register.
  2. Enter your first name, last name, email, and password.
  3. Check your email for a verification link and click it.
  4. Log in to the portal. You're now the owner of your account.
As the account owner, you have full access to all features. You can invite team members later from the Users & Roles page.

Install an Agent

Agents are installed on any Linux or Windows server you want to manage. The install is a single command.

  1. Log in to the portal and go to My Agents.
  2. Click Add Agent.
  3. Optionally select a server group.
  4. Copy the install command and run it on your server:

Linux

curl -fsSL "https://your-portal/install.sh?token=..." | sh

The Linux installer downloads the agent, enrolls it with the portal, and registers it as a system service.

Linux requirements: Python 3.9+, curl, and root/sudo access. The install script supports both apt-based and dnf-based distributions.

Windows

On Windows, the portal provides a PowerShell install script. Copy it from the Add Agent modal (Windows tab) and run it in an elevated PowerShell session. The Windows installer performs the same enrollment steps and registers the agent as a Windows service.

Windows requirements: Python 3.9+, PowerShell 7+, and Administrator access.

Approve the Agent

After the install script runs, the agent appears in the portal as pending approval.

  1. The portal's Add Agent modal will automatically detect the new enrollment and show an approval prompt.
  2. Verify the hostname and click Approve.
  3. The agent receives its access token and connects via WebSocket.
  4. A green Connected indicator appears in the agent list.

You can also approve agents from the agent list by clicking the Approve button on any pending agent.

Set Up the LLM

Each agent uses an LLM to interpret tasks and generate commands. Configure from Settings → Account.

Option 1: Local LLM (Recommended)

Install Ollama or LM Studio for full data privacy — your commands and data never leave your infrastructure. The LLM can run on the agent server itself or on a dedicated machine accessible by your agents.

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a recommended model
ollama pull qwen3.5:9b

Ollama listens on http://localhost:11434 by default. If Ollama runs on a separate server, set the LLM API URL to its address (e.g. http://llm-server:11434) in Settings → Account.
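If you want to script this check, Ollama's HTTP API exposes GET /api/tags, which lists the installed models as JSON. A minimal sketch (function names are our own; standard library only):

```python
import json
import urllib.request

def parse_models(payload: dict) -> list:
    """Extract model names from Ollama's /api/tags response,
    which has the shape {"models": [{"name": "..."}, ...]}."""
    return [m["name"] for m in payload.get("models", [])]

def list_ollama_models(base_url: str = "http://localhost:11434") -> list:
    """Query a running Ollama server for its installed models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_models(json.load(resp))

# Usage (with Ollama running): list_ollama_models() -> ['qwen3.5:9b', ...]
```

If the call fails, the agent's LLM status in the dashboard will show the endpoint as unreachable.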

Recommended local models

ManageLM agents need an LLM that reliably follows structured output formats (<cmd> tags, <done/> markers). For IT agent workloads — generating shell commands, managing services, parsing logs — models with strong instruction following perform best. We recommend the Qwen 3.5 family for the best balance across all hardware tiers, and the Gemma 4 family (E4B, 26B-A4B MoE, 31B Dense) when you need native multimodal support or the very long 256K-token context window for large log / config analysis.

All VRAM figures below assume 4-bit quantization (Q4), which is the default for Ollama/LM Studio and keeps quality within ~2–5% of full precision while cutting memory by roughly 60%. Add 1–3 GB of overhead for the runtime, KV cache, and typical context — more for long contexts on dense models.
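This rule of thumb is simple arithmetic: at Q4, each parameter costs roughly half a byte. A quick sketch (illustrative only; real usage varies with context length and runtime):

```python
def estimate_vram_gb(params_billion: float, overhead_gb: float = 2.0) -> float:
    """Rough Q4 memory estimate: ~0.5 bytes per parameter, plus
    runtime/KV-cache overhead (the 1-3 GB mentioned above)."""
    weights_gb = params_billion * 0.5  # 4-bit quantization: 0.5 B/param
    return round(weights_gb + overhead_gb, 1)

# Ballpark checks against the tables below:
print(estimate_vram_gb(9))   # 6.5 (qwen3.5:9b, listed as ~7 GB)
print(estimate_vram_gb(27))  # 15.5 (qwen3.5:27b, listed as ~17 GB)
```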

CPU-only servers (8–16 GB RAM)

Model | Size | RAM | Notes
gemma-4-e4b | E4B (4B effective) | ~5 GB | Gemma 4 edge model — native multimodal (text, image, audio, video), 128K context. Runs on modest CPU or 8 GB class GPU.
qwen3.5:9b | 9B | ~7 GB | Best balance of speed and accuracy for CPU-only servers.
qwen3.5:4b | 4B | ~4 GB | Lightweight option for constrained servers or simple skills.
ministral-3:8b | 8B | ~6 GB | Mistral's edge model with strong function calling and 128K context. Good alternative to qwen3.5:9b when Mistral's instruction style fits your skills better.

GPU servers (16–24 GB VRAM)

Model | Size | VRAM | Notes
gemma-4-26b-a4b | 26B MoE (3.8B active) | ~18 GB | Mixture-of-Experts — only 3.8B active parameters at inference, so tokens-per-second are close to a 4B model while quality is close to a 26B. 256K context (memory stays modest: ~18 GB at 4K → ~23 GB at 256K). Fits RTX 3090/4090.
qwen3.5:27b | 27B | ~17 GB | Excellent quality for complex sysadmin tasks. Fits most GPUs (RTX 3090/4090).
qwen3.5:35b | 35B | ~24 GB | Top quality at this tier. Needs RTX 4090 or A5000.
mistral-small3.2 | 24B | ~16 GB | Mistral's small model with strong function calling and instruction following.
ministral-3:14b | 14B | ~10 GB | Mid-tier Mistral model — fast tokens-per-second on consumer GPUs (RTX 3080/4070+) with solid tool-use. Leaves headroom for long contexts or parallel skills.

High-end hardware (48+ GB — Mac Studio, DGX Spark, multi-GPU)

Model | Size | Memory | Notes
gemma-4-31b | 31B Dense | ~20–40 GB | Google's flagship dense Gemma 4 — top-tier open-weights quality with a 256K context window. ~20 GB at 4K context, scaling up to ~40 GB when filling the full 256K. Best on 48 GB cards (A6000, RTX 6000 Ada) for long-context work; fits on a single RTX 4090 at short contexts.
llama3.3:70b | 70B | ~45 GB | Strong tool-use and structured output. Near cloud-LLM quality.
qwen3.5:35b | 35B | ~24 GB | Excellent quality with headroom for large context and throughput.
mistral-small3.2 | 24B | ~16 GB | Strong instruction following. Efficient for multi-model setups.
Tip: You can assign different models per skill using LLM Model Override in the agent detail page — for example, use a larger model for complex skills like containers or kubernetes, and qwen3.5:9b for simple skills like system or users. This optimizes both quality and throughput.

Option 2: Cloud LLM

Use an external cloud provider instead of running a local LLM. Supported providers:

Provider | Example models
Anthropic (Claude) | claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5-20251001
OpenAI (ChatGPT) | gpt-5, gpt-5-mini, gpt-5-nano, gpt-4.1, o3, o4-mini
Google (Gemini) | gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.5-pro, gemini-3-flash-preview, gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview
xAI (Grok) | grok-4, grok-4-fast, grok-3, grok-3-mini
Groq | llama-3.3-70b-versatile, llama-4-scout-17b-16e-instruct, deepseek-r1-distill-llama-70b
Mistral | mistral-large-latest, codestral-latest, ministral-8b-latest
DeepSeek | deepseek-chat, deepseek-reasoner

Select the provider and model from the dropdown, enter your API key, and click Test to verify the connection before saving.

Privacy note: With cloud LLM, task data (commands, parameters, outputs) is sent to the provider for processing. For production workloads with sensitive data, use a local LLM instead.

LLM Access Mode

In self-hosted mode, you can choose how agents access the LLM:

  • Direct (default) — Each agent calls the LLM directly. The API key is sent to the agent.
  • Proxied — Agents route LLM calls through the portal. The API key stays on the portal server and is never sent to agents. This is useful for centralized key management or when agents should not have direct network access to the LLM provider.

Set the access mode from Settings → Account using the Direct / Proxied toggle. Agents with per-agent LLM overrides always use direct access regardless of this setting.
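The resolution logic reduces to one rule, sketched here (the function name is ours, not a ManageLM API):

```python
def effective_llm_access(account_mode: str, agent_has_override: bool) -> str:
    """Resolve an agent's LLM access mode per the rules above.
    account_mode is the Settings -> Account toggle: 'direct' or 'proxied'.
    Agents with a per-agent LLM override always use direct access."""
    if agent_has_override:
        return "direct"
    return account_mode

# A proxied account still yields direct access for overridden agents:
print(effective_llm_access("proxied", agent_has_override=True))   # direct
print(effective_llm_access("proxied", agent_has_override=False))  # proxied
```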

Proxied badge: When an agent uses proxied access, the dashboard shows an orange PROXIED badge next to the LLM status.
LLM detection: When an agent connects, it probes the LLM endpoint and reports the service type and reachability status in the portal dashboard.

Assign Skills

Skills define what an agent is allowed to do. Without skills, an agent can only perform read-only operations.

  1. Go to Agent Skills in the sidebar.
  2. Click Catalog to browse the built-in skills.
  3. Import the skills you need (e.g. "Systemd Service Management", "Package Management").
  4. Navigate to your agent's detail page.
  5. In the Assigned Skills section, click the skill buttons to assign them.

Built-in Skills (31 total)

Skills are available for both Linux and Windows agents. On Linux, skills use shell commands (bash); on Windows, skills use PowerShell-based equivalents. The skill catalog includes platform-appropriate commands for each OS.

Skill | What it can do
base | Core read-only utilities (file reading, search, system info, resource usage, network diagnostics). Auto-assigned to all agents.
system | System info, performance, hostname, timezone, kernel, reboot.
files | Create, read, write, move, copy, delete files. Permissions, compression, upload/download.
services | Start, stop, restart services. Systemd on Linux, Windows Services on Windows. Cron jobs, timers, scheduled tasks, process management.
packages | Install, update, remove packages. Linux: apt, dnf, yum, pacman, zypper, snap. Windows: Chocolatey, winget, MSI.
users | Create/manage system users, groups, SSH keys, and sudo access.
network | Configure interfaces, routes, DNS, diagnose connectivity and ports.
firewall | Manage firewall rules. Linux: UFW, firewalld, iptables, nftables. Windows: Windows Firewall (netsh/PowerShell).
storage | Disks, partitions, filesystems, LVM, RAID, mounts, and swap.
security | Auditing, hardening. Linux: fail2ban, SELinux/AppArmor, SSH config. Windows: Windows Defender, BitLocker, Group Policy, Windows Firewall. Intrusion detection.
certificates | SSL/TLS certificates, Let's Encrypt, CAs, Java keystores, trust stores.
logs | View, search, and analyze system and application logs (read-only).
monitoring | System health, resource usage, disk/network I/O, service checks.
containers | Docker, Podman, Buildah, images, volumes, networks, Compose.
webserver | Nginx, Apache, Caddy, Tomcat — sites, configs, SSL, reverse proxy.
webapps | Node.js, Python, PHP, Ruby, Java apps — PM2, Gunicorn, Supervisor.
database | MySQL, PostgreSQL, SQLite — queries, schemas, users, backups.
nosql | MongoDB, Redis, Elasticsearch — data operations, backups, clusters.
git | Git repositories — clone, pull, push, branches, deployment workflows.
backup | Backup and restore with rsync, tar, cron — files, dirs, databases.
dns | BIND, Unbound, dnsmasq — zones, records, resolver configuration.
email | Postfix, Dovecot, queues, aliases, DKIM, SPF, spam filtering.
vpn | WireGuard, OpenVPN, IPsec — tunnels, peers, keys.
virtualization | KVM/QEMU, libvirt, LXC/LXD, Proxmox, Vagrant.
kubernetes | Pods, deployments, services, Helm, scaling, troubleshooting.
proxy | Squid, Varnish, HAProxy — reverse proxy, caching, load balancing.
messagequeue | RabbitMQ, Kafka, NATS, ActiveMQ — queues, consumers, messages.
filesharing | NFS, Samba/SMB, FTP/SFTP, WebDAV.
ldap | OpenLDAP, FreeIPA, SSSD — directory services, centralized auth.
automation | Ansible, Terraform, cloud-init — infrastructure as code.
llm | Ollama, vLLM, llama.cpp — local LLM server and model management.
Security note: Each skill has a strict command allowlist. For example, the services skill on Linux can only run systemctl and journalctl; on Windows, only Get-Service, Restart-Service, etc. The agent rejects any command not on the list. An agent with no skills can only run read-only commands.

Choose Your Interface

ManageLM is not tied to a single tool. You can manage your servers from Claude, ChatGPT, your terminal, the web portal, VS Code, Slack, or n8n — pick whatever fits your workflow.

ScenarioClaude MCPChatGPTShellPortalVS CodeSlackn8n
Natural language tasksSlash cmdsStructured
Multi-step reasoning✓ BestWorkflows
Scheduled & automated tasksVia portalVia portal✓ Cron✓ Built-inWebhooks✓ Native
Security audits & reports✓ + PDF
Fleet operations✓ Bulk select
CI/CD & scripting✓ Best✓ APIAlerts✓ Best
Team collaborationPer userPer userPer user✓ RBAC + auditPer user✓ Shared channels✓ Shared
Offline / air-gapped✓ Self-hosted✓ Self-hosted
Summary: Use Claude MCP or ChatGPT for complex, conversational tasks. Use the Shell for scripting and cron jobs. Use the Portal for dashboards, RBAC, and PDF reports. Use VS Code to manage servers from your editor. Use Slack for team alerts and approvals. Use n8n for automation pipelines.

Connect Claude

ManageLM integrates with Claude via the Model Context Protocol (MCP). Claude sees your servers as tools it can call.

Claude Pro / Max

  1. Go to Settings → MCP & API in the portal.
  2. Find the Claude MCP Connector section.
  3. In Claude Desktop, go to Settings → Custom Connectors → Add.
  4. Copy the four fields (Name, Remote MCP URL, OAuth Client ID, OAuth Client Secret) from the portal and paste them into Claude.

This uses OAuth 2.0 with PKCE, the standard MCP authentication method. Custom Connectors require a Claude Pro or Max plan.

Claude Team

On Claude Team plans, the organization admin sets up the connector once, then team members connect it individually with their own ManageLM credentials.

Admin setup:

  1. Go to Organization Settings → Connectors → Add.
  2. Select Custom → Web.
  3. Fill in the same four fields (Name, Remote MCP URL, OAuth Client ID, OAuth Client Secret) from the portal's Settings → MCP & API section and save.

Team members:

  1. Go to Settings → Connectors in Claude.
  2. Connect the ManageLM connector — it will authenticate with their own ManageLM credentials.

Claude Free

On Claude's Free plan, Custom Connectors are not available. Instead, you can add the MCP server to your claude_desktop_config.json file using mcp-remote as a bridge with header-based authentication.

  1. Go to Settings → MCP & API in the portal and expand the Claude Desktop Free Plan section to copy the config.
  2. Open your claude_desktop_config.json file:
    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
    • Windows: %APPDATA%\Claude\claude_desktop_config.json
    • Linux: ~/.config/Claude/claude_desktop_config.json
  3. Paste the following (replace the URL, credentials, and npx path with your values):
{
  "mcpServers": {
    "ManageLM": {
      "command": "/full/path/to/npx",
      "args": [
        "mcp-remote",
        "https://your-portal/mcp",
        "--header", "X-MCP-Id:your-client-id",
        "--header", "X-MCP-Secret:your-secret"
      ]
    }
  }
}

Replace /full/path/to/npx with the actual path to npx on your system (run which npx to find it). The credentials are available in Settings → MCP & API (click "Rotate Secret" if the secret is not yet generated). Save the file and restart Claude Desktop.
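If you script your setup, the per-OS paths above can be resolved programmatically. A small helper (illustrative, standard library only):

```python
import json
import os
import platform
from pathlib import Path

def claude_config_path() -> Path:
    """Return the claude_desktop_config.json location for this OS,
    per the paths listed above."""
    system = platform.system()
    if system == "Darwin":
        return Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"
    if system == "Windows":
        return Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json"
    return Path.home() / ".config/Claude/claude_desktop_config.json"

def has_managelm(config_text: str) -> bool:
    """Sanity-check that a config defines the ManageLM MCP server."""
    cfg = json.loads(config_text)
    return "ManageLM" in cfg.get("mcpServers", {})
```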

Claude Code

You can also configure the MCP server in Claude Code's JSON config using the same format:

{
  "mcpServers": {
    "ManageLM": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://your-portal/mcp",
        "--header", "X-MCP-Id:your-client-id",
        "--header", "X-MCP-Secret:your-secret"
      ]
    }
  }
}

What Claude sees

Once connected, Claude gets one tool per skill slug (e.g. system, services, files). Each tool takes two parameters:
  • target — the agent to run the task on (e.g. a hostname like web-01).
  • instruction — the natural-language task to perform.

For example, Claude calls the services tool with target: "web-01" and instruction: "restart nginx".
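Every skill tool call therefore carries the same two fields. A sketch of that payload (the wrapper function is hypothetical; only the target and instruction fields come from the text above):

```python
def build_tool_call(skill: str, target: str, instruction: str) -> dict:
    """Assemble the two-parameter payload every skill tool accepts."""
    return {"tool": skill, "arguments": {"target": target, "instruction": instruction}}

call = build_tool_call("services", "web-01", "restart nginx")
print(call["arguments"]["instruction"])  # restart nginx
```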

Claude also gets a set of built-in meta-tools in addition to the per-skill tools.

Note: The tool list is fetched at connection time. If skills are added or removed during a session, Claude won't see the changes until you reconnect (restart Claude Desktop or re-open Claude Code).

Run Tasks

You can run tasks in two ways:

Via Claude (MCP)

Just describe what you want in natural language, e.g. "Restart nginx on web-01 and show me the last 20 log lines".

Via the Portal UI

  1. Click on an online agent in the dashboard.
  2. Click the Run Task button.
  3. Select a skill from the dropdown.
  4. Type a natural-language instruction describing what you want.
  5. Click Execute.

Task results appear in the Command History section on the agent detail page and in the Request Log page.

Agent CLI Tools

Three on-host commands ship with the agent and talk to the local daemon over a Unix socket — no network, no portal round-trip. They reuse the same skill gate, command validator, and kernel sandbox as portal tasks, and work offline whenever the local LLM is configured. All three are installed in /opt/managelm/bin/ (Linux) or C:\ProgramData\ManageLM\bin\ (Windows) and also exposed on the PATH.

Security model: These tools run as root / LocalSystem on the managed host. They bypass portal user identity (no per-user RBAC) but still go through the skill's allowed-commands validator and the kernel sandbox, and every task is forwarded to the portal's audit log with source shell.

managelm-shell — Interactive terminal

A natural-language REPL on the managed server. Type what you want, the agent auto-routes it to the best skill, runs it through the sandbox, and streams the answer back.

# Interactive REPL
managelm-shell

# One-shot
managelm-shell -c "install htop and verify"

# Force a specific skill
managelm-shell
> @services restart nginx

Shell tasks show up in the audit log, Reporting, and webhooks just like portal tasks.

managelm-fixit — Diagnose & fix one file

Point it at any misbehaving file. The agent classifies the content, picks the right skill, diagnoses the issue, and proposes a full-file replacement as a colored diff. Apply on y, reject on N.

# Diagnose, show diff, y/N to apply
managelm-fixit /etc/nginx/nginx.conf

# Diagnosis only, no diff
managelm-fixit --explain /etc/postfix/main.cf

# Auto-apply without prompting
managelm-fixit --yes /var/www/app/config.yaml

# Force a skill and add a hint
managelm-fixit @webserver -c "502 after upgrade" /etc/nginx/nginx.conf

managelm-review — Read-only review

Where fixit writes, review only reads. Point it at a file or a directory and get a short summary plus a list of findings grouped by severity. Nothing is written to disk.

# Review a single file
managelm-review /etc/ssh/sshd_config

# Review a directory (walks recursively, skips .git, node_modules, …)
managelm-review ./src/

# Only warning + critical
managelm-review --severity warning ./src/

# JSON for CI
managelm-review --format json ./src/
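The JSON output makes it easy to gate a CI pipeline on findings. A sketch of a consumer (the findings schema shown is an assumption for illustration, not the tool's documented format):

```python
import json

# Hypothetical findings layout -- the real managelm-review JSON may differ.
SAMPLE = '''
[
  {"file": "src/app.py", "severity": "critical", "message": "hardcoded secret"},
  {"file": "src/app.py", "severity": "info",     "message": "long function"}
]
'''

def gate(findings_json: str, fail_on: str = "critical") -> int:
    """Return a CI exit code: 1 if any finding has the threshold severity."""
    findings = json.loads(findings_json)
    return 1 if any(f["severity"] == fail_on for f in findings) else 0

print(gate(SAMPLE))  # 1 -> fail the pipeline
```

You would pipe the tool's output into such a script, e.g. `managelm-review --format json ./src/ | python gate.py`.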

Quick Reports

Quick reports are one-click diagnostic commands available on the Agent Assets page. They let you run common checks on any online agent without writing instructions.

How it works

  1. Open the Agent Assets page.
  2. Each agent card shows small icon buttons below the OS info line — one per available report.
  3. Click an icon to run the report. A modal opens showing a spinner while the agent executes.
  4. When complete, the modal displays the LLM summary (a readable interpretation) and the raw terminal output.
  5. Click Copy to copy both summary and output to your clipboard.
Requirements: The agent must be online and you need admin/owner role or the agents permission. Reports only appear for skills that are assigned to the agent (directly or via a group).

Available reports

Nine built-in skills include quick reports. Each report runs a pre-built instruction on the agent:

Skill | Report | What it checks
system | System Summary | Hostname, OS, kernel, uptime, load, memory and disk usage
system | Top Processes | Top 10 processes by CPU and memory
services | Service Inventory | All services with status and enabled state (systemd on Linux, Windows Services on Windows)
services | Failed Services | Services in failed state
packages | Available Updates | Packages with pending updates
users | User Accounts | All users and groups with UID, GID, home, shell
network | Listening Ports | All listening TCP/UDP ports with their process
security | Security Overview | Listening ports, SSH config, fail2ban status
containers | Container Status | All containers with name, image, status, ports
storage | Disk Usage | Disk usage for all mounted filesystems
logs | Recent Errors | Errors and warnings from the system journal (Linux) or Windows Event Log (last 30 min)

Active task indicators

Agent cards on both the My Agents dashboard and the Agent Assets page display a red badge with a spinning icon when the agent has tasks currently running. The badge shows the number of active tasks (pending, sent, or executing).

Portal UI Guide

Page | Purpose
My Agents | Dashboard showing all agents, their status, LLM info, skills, active task indicators, and a 7-day task activity chart. Add, approve, search, and bulk-manage agents.
Agent Detail | Configure an agent: display name, LLM settings, tags, groups, assigned skills, member access. Run tasks and view command history.
Agent Assets | Visual server map with agents organized by collapsible group zones, click-to-expand agent cards with 24h metrics and cloud provider metadata, quick report buttons, security audit, system inventory, SSH & sudo access, activity audit, service dependencies, scheduled PDF reports, and bulk select operations.
Agent Skills | Import from the built-in catalog, create custom skills, or import/export skill JSON files.
Agent Groups | Organize agents into groups (e.g. "production", "staging"). Groups are used in MCP tool names.
Users & Roles | Invite team members, assign roles (admin/member), and configure granular permissions.
Monitors | Service monitors — track availability and response time of 43 service types (HTTP, TCP, DNS, SMTP, databases, message brokers, VPNs, and more). Sparkline charts, status badges, alert toggles, categorized catalog, test-before-create.
Certificates | Certificate management — issue, renew, and revoke TLS certificates via internal CA or Let's Encrypt. Deploy to agents automatically. CRL generation. Daily auto-renewal sweep.
System Backups | End-to-end encrypted filesystem backups to your own S3 storage (OVH, AWS, R2, B2, Wasabi, Scaleway, MinIO). Streaming downloads, restore to any agent, detach-on-delete, optional service quiesce for consistent database snapshots.
Pentests | Automated penetration testing for public-facing agents using nmap, nuclei, testssl.sh, ffuf, subfinder. Credit-based scans with domain verification. Results feed into the Compliance page. Pro/Business plans.
Compliance | Compliance framework mapping — automatically projects security audit and pentest results onto CIS Level 1, CIS Docker, SOC 2, PCI DSS, ISO 27001, NIS2, NIST CSF, and HIPAA. Drift detection with in-app and email alerts. Per-framework evidence PDFs.
Connectors | External integrations split into two kinds: Cloud Hosting (Azure, AWS, Google Cloud, VMware, Proxmox, OpenStack — credentials, test connections, sync resources, browse discovered cloud inventory with agent matching) and SIEM Integration (Splunk HEC, Elasticsearch _bulk, generic JSON webhook — agents forward task-completion events directly to the destination, per-agent or inherited from an agent group).
Request Log | View all MCP task requests across your account with status and output.
Audit Log | View a record of all actions taken in your account.
Reporting | Browse and export task execution history with date filters, search, pagination, and PDF export. Requires the perm_reports permission (admin/owner always have access).
Settings | Profile, Security (passkeys, MFA, SSH public keys, verified domains, sessions), MCP & API (credentials, IP whitelist, API keys, webhooks), PKI & CA (internal CA setup, Let's Encrypt account, DNS-01 providers, certificate defaults), S3 Backups (provider, bucket, credentials, orphan cleanup), Account (plan, LLM defaults, danger zone).

Skills

Skills are the core security and capability model. Each skill defines:
  • a description of its purpose,
  • a set of operations (each a name plus a description of a capability),
  • an allowed_commands list — the execution allowlist,
  • a system_prompt that frames the LLM's role.

Hard security boundary: The allowed_commands list is enforced in code, not just in prompts. Even if the LLM generates a command not on the list, the agent will reject it. Empty allowed_commands = read-only mode (only safe commands like ls, cat, grep).
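A sketch of how such an allowlist gate can be enforced (illustrative, not the agent's actual implementation): the executable of each generated command must appear in allowed_commands.

```python
import shlex

def is_allowed(command: str, allowed_commands: list) -> bool:
    """Reject any command whose executable is not on the skill's allowlist.
    An empty allowlist means read-only mode, illustrated here with the
    safe commands named in the text (ls, cat, grep)."""
    read_only = {"ls", "cat", "grep"}
    executable = shlex.split(command)[0]
    if not allowed_commands:
        return executable in read_only
    return executable in allowed_commands

# With the services skill's Linux allowlist (systemctl, journalctl):
print(is_allowed("systemctl restart nginx", ["systemctl", "journalctl"]))  # True
print(is_allowed("rm -rf /tmp/x", ["systemctl", "journalctl"]))            # False
```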

Management Hints

Each skill assignment (on an agent or a group) supports management hints — free-text contextual instructions injected into the LLM system prompt as an ADMINISTRATOR HINTS block. Use hints to provide server-specific or group-wide context that helps the LLM do its job, such as "PostgreSQL data dir is /data/pg16".

Hints can be set at two levels:

Level | Where to set | Scope
Per-agent | Agent detail → expand skill → Management Hints | This skill on this specific agent
Per-group | Agent Groups → expand skill → Management Hints | This skill on all agents in the group

Direct per-agent skill assignments take priority over group-inherited ones (including their hints).

Skill Definition Example

Below is an example of a Linux skill definition. Windows skills follow the same structure but with PowerShell cmdlets in allowed_commands.

{
  "description": "Manage systemd services",
  "operations": [
    {
      "name": "restart",
      "description": "Restart a systemd service"
    },
    {
      "name": "status",
      "description": "Get status of a service including recent logs"
    }
  ],
  "allowed_commands": ["systemctl", "journalctl"],
  "system_prompt": "You are a Linux sysadmin..."
}

Operations are instruction-based — each operation has only a name and description. They describe capabilities for documentation and AI context, not structured parameter schemas. The agent LLM interprets the user's natural-language instruction to determine what commands to run.

Skill Combinations

Many real-world management tasks span multiple skills. Each skill controls a specific domain — when an operation touches several domains, you need all the relevant skills assigned to the agent.

How it works: When you ask Claude to perform a multi-step task, it will call the appropriate skill tools one after another. If a required skill is missing, that step will fail with a "skill not assigned" error. Plan ahead by assigning all the skills an agent needs for its role.

Foundation skills

These five skills are used by almost every management workflow. Consider assigning them to all agents as a baseline:

Skill | Why it's foundational
system | System info, hostname, timezone — needed to understand what you're working with.
files | Read/write config files, set permissions — almost every change touches a file.
services | Start/stop/restart daemons, manage cron — most installations end with a service reload.
packages | Install software — any new capability starts with installing a package.
users | Create accounts, manage SSH keys, sudo — many services need a dedicated user.

Common multi-skill workflows

Below are typical management tasks and the skills they require. Each example shows what you'd ask Claude and which skills are involved.

Create a new system user with SSH access

"Create user deploy with a home directory, add their SSH key, and set them up with sudo access for systemctl"

Step | Skill needed
Create user account and group | users
Create home directory and set ownership | files
Add SSH authorized key | users
Configure sudoers entry | users

Skills: users + files

Install and configure Nginx with SSL

"Install nginx, create a site for example.com with Let's Encrypt SSL, and open port 443 in the firewall"

Step | Skill needed
Install nginx package | packages
Create site config file | webserver
Obtain SSL certificate via certbot | certificates
Enable the site and reload nginx | webserver
Open ports 80/443 in the firewall | firewall

Skills: packages + webserver + certificates + firewall

Deploy a Node.js application

"Clone the repo from GitHub, install dependencies, set up a PM2 process, and configure nginx as a reverse proxy"

Step | Skill needed
Create app user and directory | users + files
Clone the Git repository | git
Install Node.js and npm dependencies | webapps
Start the app with PM2 | webapps
Create nginx reverse proxy config | webserver
Set up SSL certificate | certificates

Skills: users + files + git + webapps + webserver + certificates

Set up a PostgreSQL database server

"Install PostgreSQL 16, create a database and user for my app, configure backups, and open port 5432 only from 10.0.0.0/24"

Step | Skill needed
Install PostgreSQL packages | packages
Start and enable the service | services
Create database and DB user | database
Edit pg_hba.conf for network access | files
Set up a pg_dump cron job | backup
Open port 5432 for the subnet | firewall

Skills: packages + services + database + files + backup + firewall

Docker Compose deployment

"Create a docker-compose.yml for my app stack, start it, and check the container logs"

Step | Skill needed
Create project directory and compose file | files
Start compose stack | containers
View container logs | containers
Open ports in firewall (if needed) | firewall

Skills: files + containers + firewall

Security hardening

"Harden SSH (disable root login, key-only auth), set up fail2ban, and configure the firewall to allow only SSH and HTTPS"

Step | Skill needed
Edit sshd_config | security
Restart sshd | services
Install and configure fail2ban | security
Set firewall rules (allow 22, 443 only) | firewall
Review auth logs | logs

Skills: security + services + firewall + logs

Set up WireGuard VPN

"Install WireGuard, generate keys, configure a tunnel to 10.0.1.0/24, and open UDP port 51820"

Step | Skill needed
Install WireGuard package | packages
Generate keys and create config | vpn
Enable IP forwarding (sysctl) | network
Open UDP 51820 in firewall | firewall
Start and enable the WireGuard service | services

Skills: packages + vpn + network + firewall + services

Skill assignment strategies

Use server groups to assign skill sets by server role, so you don't have to configure each agent individually:

Server role | Recommended skills
Web server | system, files, services, packages, users, webserver, certificates, firewall, logs, monitoring
App server | system, files, services, packages, users, webapps, git, logs, monitoring
Database server | system, files, services, packages, database, backup, firewall, storage, logs, monitoring
Docker host | system, files, services, packages, containers, network, firewall, storage, logs, monitoring
Minimal / read-only | system, logs, monitoring (no write skills — agent can only read)
Least-privilege: Only assign the skills each server actually needs. A database server doesn't need webserver. A web server doesn't need database. Fewer skills = smaller attack surface.

Policy Rulesets

Rulesets are short markdown policy snippets attached to agents (directly or via groups). Every attached ruleset is concatenated and injected into the LLM system prompt as an unconditional POLICY RULES block — applied to every task, regardless of which skill runs.

Where management hints are advisory context scoped to a single skill ("PostgreSQL data dir is /data/pg16"), rulesets are cross-skill constraints that stay in force for the whole task ("Never restart services between 09:00 and 18:00 UTC", "Never edit files under /etc/pam.d without prior approval"). The prompt includes an explicit refusal rule: if a request would violate a listed policy, the agent refuses instead of executing.

Managing Rulesets

Go to Agent Skills → Rules. Each ruleset has:

Attaching Rulesets

Level | Where | Scope
Per-agent | Agent detail → Rulesets | This agent only
Per-group | Server Groups → Rulesets | All agents in the group

Rulesets accumulate across attachments — an agent gets the union of everything attached directly plus everything inherited from every group it belongs to (deduplicated by ruleset id). Changes push to affected agents immediately over WebSocket; no restart required.
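The accumulation rule can be sketched as follows; the dict shapes and field names here are illustrative, not the portal's actual schema:

```python
def effective_rulesets(direct, inherited_by_group):
    """Union of rulesets attached directly plus those inherited from
    every group, deduplicated by ruleset id (illustrative shape)."""
    seen, merged = set(), []
    for rs in direct + [r for group in inherited_by_group for r in group]:
        if rs["id"] not in seen:
            seen.add(rs["id"])
            merged.append(rs)
    return merged

def policy_block(rulesets):
    # Concatenated into the LLM system prompt as one POLICY RULES block.
    return "POLICY RULES:\n" + "\n".join(rs["text"] for rs in rulesets)
```

A ruleset attached both directly and via a group appears once in the merged list, so the prompt never repeats a policy.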

Permission: Managing rulesets requires the skills permission (same gate as creating custom skills).
Rulesets are guidance, not a hard sandbox. They instruct the LLM to refuse out-of-policy requests, but hard security boundaries (allowed_commands, kernel sandbox, skill operation gates) still live in the agent itself. Use rulesets for organizational policy; use skills and the kernel sandbox for enforcement that cannot be prompted around.

Server Groups

Groups let you organize agents logically (e.g. by environment, role, or location).

Create and manage groups from the Agent Groups page. Assign agents to groups from the agent detail page or the groups page.

Group-level skill configuration

When assigning skills to a group, you can configure per-skill settings that apply to all agents in the group:

Click the chevron next to a skill in edit mode to expand the configuration panel.

Secrets

Each agent has a local secrets.txt file (/opt/managelm/secrets.txt on Linux, C:\ManageLM\secrets.txt on Windows). This file stores sensitive values that commands might need.

# Example secrets.txt
DB_USER=myapp
DB_PASS=s3cret
API_KEY="my-api-key"
How secrets work: Values are injected as environment variables into command subprocesses. The LLM only sees variable names ($DB_PASS), never the actual values. Secrets never leave the server.
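That injection model can be sketched like this; the helper names are hypothetical and not the agent's actual code:

```python
import os
import subprocess

def load_secrets(path="/opt/managelm/secrets.txt"):
    """Parse KEY=VALUE lines from secrets.txt (quotes stripped)."""
    secrets = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            secrets[key.strip()] = value.strip().strip('"')
    return secrets

def run_with_secrets(command, secrets):
    # The command text only ever references variable NAMES ($DB_PASS);
    # the values are injected into the subprocess environment here.
    return subprocess.run(command, shell=True, text=True,
                          capture_output=True,
                          env={**os.environ, **secrets})
```

Because the values enter only through the environment of the command subprocess, the LLM prompt and the task logs never contain them.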

LLM Configuration

The LLM is configured from Settings → Account:

For both Local and Cloud LLM, you can choose the access mode:

  • Direct — Agent calls the LLM directly (default).
  • Proxied — Agent routes LLM calls through the portal. The API key stays on the portal and is never exposed to agents.

Configuration hierarchy

LLM settings can be overridden at multiple levels (highest priority first):

Level | Where to set | Use case
Per-skill override | Agent detail → expand skill → LLM Model Override | Use a specific model for complex skills
Per-agent override | Agent detail → Edit → "Override for this agent" | Agent needs a different LLM (local or cloud)
Account default | Settings → Account | Default for all agents

The per-skill config panel also includes management hints for providing contextual instructions to the LLM.

Per-agent overrides offer Local LLM or Cloud LLM options and always use direct access. Agents inherit from the account default unless explicitly overridden.
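The resolution order can be sketched as a simple fallback chain (illustrative, not the portal's actual code):

```python
def resolve_llm_config(account_default, agent_override=None, skill_override=None):
    """Return the effective LLM config: per-skill override wins,
    then per-agent override, then the account default."""
    for cfg in (skill_override, agent_override, account_default):
        if cfg is not None:
            return cfg
    raise ValueError("no LLM configured at any level")
```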

Default values if nothing is configured:

Users & Roles

ManageLM supports team collaboration with role-based access control.

Roles

Role | Access
Owner | Full access. Cannot be removed. One per account.
Admin | Full access. Can invite users, manage permissions, edit settings.
Member | Limited access based on permissions. Only sees assigned agents and groups.

Member Permissions

Permission | Grants access to
agents | Approve, delete, configure agents and assign skills
groups | Create, rename, delete groups and assign agents
skills | Create, import, edit, and delete skills
logs | View task logs and MCP activity
reports | Access the reporting dashboard and export reports

MCP Visibility

All users (including owners and admins) only see agents via MCP that are:

This ensures MCP access is always explicitly granted, regardless of role.

Skill Restrictions

For members, you can optionally block specific skills from being invoked. On the Users & Roles page, expand the Skill restrictions row under a member's permissions and click Edit to add skills to the member's blocklist.

Skill restrictions are a pure portal-side filter — the agent still ships its full effective skill set (agent + group-inherited), and task execution logic is unchanged. Use permissions to gate management actions (creating agents, editing groups) and skill restrictions to gate operational ones (running sensitive skills on agents).

Enforcement matrix

Where the blocklist is checked, and what happens when the caller is restricted:

Entry point | Explicit skill | Auto-routing | Resumed follow-up / answer
POST /api/tasks | Blocklist check | Rejected for restricted users | n/a
POST /api/tasks/:id/follow-up | Blocklist check | Rejected | Blocklist check
POST /api/tasks/:id/answer | Blocklist check | Rejected | Blocklist check
MCP skill tool invocation | Blocklist check | n/a | n/a
MCP answer_task | Blocklist check | Rejected | Blocklist check
managelm-shell on the managed host | Exempt (local root; no portal user identity) | Exempt | n/a
Portal-initiated scans (security, inventory, SSH & sudo, activity) | Exempt (not skills) | n/a | n/a
Rejected tasks return the same error text as “skill not assigned to agent” by design, so the caller gets no hint that a restriction exists.
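The check itself can be sketched like this; names are hypothetical, but the sketch mirrors the same-error-text behavior described above:

```python
NOT_ASSIGNED = "skill not assigned to agent"

def check_skill_access(skill, agent_skills, user_blocklist):
    """Return an error string, or None if the task may proceed.
    A blocked skill returns the same text as a missing skill, so a
    restricted caller cannot distinguish the two cases (illustrative)."""
    if skill not in agent_skills:
        return NOT_ASSIGNED
    if skill in user_blocklist:
        return NOT_ASSIGNED
    return None
```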

Inviting Users

  1. Go to Users & Roles.
  2. Click Invite User.
  3. Enter their name and email, select a role, and set permissions.
  4. They'll receive an email with an invitation link.

Passkeys & MFA

ManageLM supports WebAuthn passkeys for secure authentication.

You can register multiple passkeys (e.g. fingerprint + security key) and name them for easy management.

API Keys

API keys allow programmatic access to the portal API for automation and integrations.

  1. Go to Settings → MCP & API.
  2. Enter a name, select permissions (Agents, Logs, Skills, Groups, Reports), and optionally set an expiration (30, 90, 180, or 365 days).
  3. Click Create Key and copy the key (starts with mlm_ak_). It's only shown once.

Use the key in the Authorization header:

Authorization: Bearer mlm_ak_...

Each key's effective permissions are the intersection of the key's permissions and the creating user's permissions. If a user is later downgraded, their keys lose access accordingly. Expired keys are automatically cleaned up.
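A minimal sketch of calling the API with a key, plus the permission-intersection rule; the request path and helper names are hypothetical:

```python
import urllib.request

def api_get(base_url, api_key, path):
    """GET a portal endpoint using a bearer API key (mlm_ak_...)."""
    req = urllib.request.Request(
        base_url + path,
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def effective_permissions(key_perms, user_perms):
    # A key never grants more than its creating user currently has.
    return set(key_perms) & set(user_perms)
```

If the user is later downgraded, the intersection shrinks automatically, which is how existing keys lose access without being edited.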

OAuth App Credentials (OpenAI GPT, etc.)

For integrations that require OAuth 2.0 (like OpenAI GPT Actions), set OAUTH_APP_CLIENT_ID and OAUTH_APP_CLIENT_SECRET in your .env file. These identify the application — each user still authenticates individually with their own ManageLM credentials. See the Self-Hosted Docker guide for details.

Security Model

Defense in depth

  • Command allowlist — Skills define exactly which commands an agent can run. Enforced in code, not prompts.
  • Destructive command guard — Even for allowed commands, the agent blocks catastrophically dangerous argument combinations: rm targeting protected root directories (/, /etc, /usr, etc.), dd writing to block devices, mkfs, --no-preserve-root, and find -delete.
  • Kernel sandbox (opt-in, Linux only)Landlock + seccomp-bpf confine command subprocesses at the kernel level. Even if a command passes all Python-level checks, the kernel blocks writes outside allowed paths and dangerous syscalls.
  • Read-only by default — Agents with no skills (or skills with empty allowlists) can only run safe read-only commands.
  • Outbound-only connections — Agents connect to the portal. No inbound ports needed.
  • Ed25519 task signing — Every task dispatched to an agent is cryptographically signed. Agents verify the signature before execution.
  • Secrets isolation — Secrets stay on the server. The LLM only sees variable names, never values.
  • Hash-only storage — Passwords, tokens, and API keys are stored as hashes.
  • Rate limiting — Login, registration, and password reset endpoints are rate-limited.
  • IP whitelist — Optional CIDR-based IP whitelist for MCP connections.
  • Execution limits — Max 10 LLM turns per task, 120s timeout per command, 8000 char output limit.
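The allowlist and destructive-command guard can be sketched like this (heavily simplified; the agent's real checks cover far more cases, and path/flag names here are illustrative):

```python
import shlex

PROTECTED_ROOTS = {"/", "/etc", "/usr", "/boot", "/var"}

def validate(command, allowed_commands):
    """Return a rejection reason, or None if the command may run."""
    argv = shlex.split(command)
    if not argv:
        return "empty command"
    if argv[0] not in allowed_commands:            # binary allowlist
        return f"'{argv[0]}' not in allowed_commands"
    if argv[0] == "rm":                            # destructive guard
        if "--no-preserve-root" in argv:
            return "blocked flag: --no-preserve-root"
        targets = {a.rstrip("/") or "/" for a in argv[1:] if not a.startswith("-")}
        if targets & PROTECTED_ROOTS:
            return "rm targets a protected root directory"
    return None
```

Because the allowlist is enforced in code before execution, a prompt injection that convinces the LLM to emit a forbidden command still fails this gate.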

Always-allowed commands (read-only)

The base skill is auto-assigned to every agent and provides a broad set of read-only commands:

cat head tail less more ls tree grep egrep fgrep find locate wc sort uniq
awk sed cut tr diff comm column paste tac xargs file stat md5sum sha256sum
sha1sum readlink basename dirname realpath uname hostname whoami id uptime
date timedatectl lsb_release arch nproc getconf dmesg last lastlog w who
df du free lsblk lscpu lsmem vmstat iostat top ps pgrep lsof fuser
ip ss netstat dig nslookup host ping traceroute curl wget nc
echo printf which type test true false yes seq sleep cd pwd

Even if the base skill is somehow missing, agents fall back to a minimal safe set: cat head tail ls grep find wc sort echo printf test true false cd pwd which.

Kernel Sandbox

The kernel sandbox is available on Linux agents only. It adds Linux-native confinement to command subprocesses using Landlock and seccomp-bpf. It is opt-in per skill and disabled by default. Each layer can be enabled independently. Windows agents do not use the kernel sandbox.

How it works

When enabled on a skill, every command subprocess runs inside a kernel-enforced sandbox applied via preexec_fn after fork() but before exec(). The agent process itself stays unrestricted — only the command is confined.

Command from LLM
  |
  +- Layer 1: Injection blocking          (Python — blocks $(), eval, etc.)
  +- Layer 2: Binary allowlist            (Python — must be in allowed_commands)
  +- Layer 3: Destructive argument guard  (Python — blocks rm -rf /, dd of=/dev/sda)
  |
  v subprocess.run(preexec_fn=sandbox)
  |
  +- Layer 4a: Landlock                   (Kernel — filesystem path confinement)
  +- Layer 4b: seccomp-bpf                (Kernel — syscall blocklist)
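The hand-off into the kernel layers looks roughly like this. The sandbox body is a placeholder: the real agent installs a Landlock ruleset and a seccomp filter there, which this sketch only describes in comments.

```python
import subprocess

def make_sandbox(write_paths, blocked_syscalls):
    def sandbox():
        # Runs in the child after fork(), before exec(); the parent
        # agent process stays unrestricted.
        # (placeholder) 1. Build a Landlock ruleset: read-only "/",
        #    writable only under write_paths, then restrict self.
        # (placeholder) 2. Load a seccomp-bpf filter that returns
        #    EPERM for every syscall in blocked_syscalls.
        pass
    return sandbox

result = subprocess.run(
    ["ls", "/etc"],
    preexec_fn=make_sandbox(["/etc", "/var", "/tmp"], ["mount", "reboot"]),
    capture_output=True, text=True, timeout=120,  # per-command timeout
)
```

Note that `preexec_fn` is POSIX-only, which is consistent with the sandbox being unavailable on Windows agents.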

Landlock (filesystem confinement)

Restricts which filesystem paths the subprocess can read, write, and execute from. Uses Linux Landlock LSM (requires kernel 5.13+).

Access | Default paths | Purpose
Read | / (everything) | Commands can read system state
Write | /etc, /var, /tmp | Config edits, logs, temp files
Execute | / (everything) | allowed_commands is the binary gate

Everything outside the configured write paths is read-only at the kernel level — no userspace bypass possible. File uploads also enforce write paths via Python-level path validation using the same config.

seccomp-bpf (syscall filtering)

Blocks dangerous syscalls that no legitimate agent task should need. Returns EPERM (not kill) for graceful error handling.

Category | Blocked syscalls
Filesystem root | mount, umount2, pivot_root, chroot, move_mount, fsopen, fsconfig, fsmount, fspick, open_tree
System control | reboot, kexec_load, kexec_file_load
Kernel modules | init_module, finit_module, delete_module
Swap | swapon, swapoff
Exploit primitives | ptrace, bpf, userfaultfd, perf_event_open
System identity | settimeofday, clock_settime, sethostname, setdomainname

Enabling the sandbox

  1. Open the Skills page and edit a skill.
  2. Go to the Sandbox tab.
  3. Toggle Landlock and/or seccomp-bpf on.
  4. Customize write paths or blocked syscalls if needed.
  5. Click Save Changes.

The sandbox is pushed to agents automatically on save. Agents on kernels older than 5.13 (Landlock) or 3.17 (seccomp) gracefully degrade — the sandbox is skipped with a log warning, and commands run unrestricted.

Skill configuration (JSON)

{
  "sandbox_landlock": {
    "read_paths": ["/"],
    "write_paths": ["/etc", "/var", "/tmp", "/opt/myapp"],
    "exec_paths": ["/"]
  },
  "sandbox_seccomp": ["mount", "reboot", "ptrace", "init_module", "..."]
}

Each key is independent — use one or both. Absent key means that layer is off. Catalog skills include recommended sandbox templates that you can use as a starting point.

Requirements

Security Audits

ManageLM includes a built-in security audit and compliance engine that scans your agents for misconfigurations, vulnerabilities, and hardening issues. Audits are fully deterministic — no LLM required. Check commands are defined on the portal and executed on the agent in a read-only sandbox.

How it works

  1. Trigger — From the Agent Assets page (per agent), the Compliance dashboard (fleet-wide), or via MCP.
  2. Scan — The agent runs a set of read-only checks on the host.
  3. Report — Each finding includes a severity, an explanation of the risk, a suggested fix, and a mapping to compliance frameworks (CIS, PCI DSS, HIPAA, ISO 27001, NIS2, NIST CSF, SOC 2). A compliance score (0–100) reflects the overall posture. Installed packages are also matched against known vulnerabilities (see below).
  4. Results — Findings appear in the Agent Assets audit view and the Compliance dashboard. You receive an in-app notification when the audit completes.

Server context

Each compliance rule has separate severity ratings for public and private servers:

What is checked

Check | What it inspects
SSH & RDP config | Root login, password vs. key authentication, retry limits, X11 forwarding, RDP Network Level Authentication.
Listening ports | Open TCP and UDP sockets on all interfaces.
Firewall | Host firewall status and rules (UFW, firewalld, nftables, iptables, or Windows Firewall profiles).
User accounts | Login-enabled users, UID 0 / local administrators, guest account, service accounts.
Password policy | Minimum length, complexity, lockout threshold.
Windows hardening | UAC enabled, cleartext credential storage disabled (WDigest), automatic login disabled.
File permissions | World-writable files, SUID binaries, shadow file readability.
Password hashing | Password hashes flagged if they use weak algorithms (MD5 or older).
Patch posture | Pending security updates, automatic-update service enabled, pending reboot after kernel or library updates.
Installed packages | Full package inventory feeding the vulnerability scan.
Authentication events | Failed login attempts in the last 24 hours.
Audit & event logging | Audit daemon (Linux) or Advanced Audit Policy (Windows); PowerShell script-block logging.
Endpoint protection | Mandatory access control (SELinux / AppArmor) or Windows Defender antivirus including signature freshness.
Time synchronization | System clock synchronized via NTP.
Kernel hardening | IP forwarding, ICMP redirect handling, reverse-path filtering, ASLR, SUID core dumps.
Brute-force protection | Fail2ban status and active jails.
TLS/SSL | Weak protocols (SSLv3, TLSv1.0/1.1) and weak ciphers (RC4, DES, NULL, EXPORT, MD5) rejected on all listening services.
Certificates | TLS certificate expiry with days remaining.
SMB hardening | SMB signing required, legacy SMB1 protocol disabled.
Network exposure | LLMNR (legacy name resolution) disabled on Windows.
Disk encryption | BitLocker protection on OS and fixed data volumes (Windows).
Scheduled tasks | System and per-user cron jobs.
SSH authorized keys | SSH key-based access across all users.
Docker | Privileged containers, socket exposure, containers running as root.
Vulnerability scan | Installed packages matched against known CVEs (see next section).

Vulnerability scanning

As part of every security audit, ManageLM checks each agent's installed packages against a public vulnerability database and reports any known CVEs that apply to the installed versions. Nothing to install, nothing to configure.

Severity levels

Level | Meaning
Critical | Immediate action required — actively exploitable or dangerous misconfiguration.
High | Significant risk — should be addressed promptly.
Medium | Moderate risk — recommended to fix.
Low | Minor issue or informational finding.
Pass | Check passed — no issue found.

Findings

Each finding includes:

Automated remediation

You can select one or more findings and click Remediate to have the agent automatically fix them. This requires:

Remediation creates a task that uses the security skill and the agent's LLM to intelligently apply the recommended fixes. The agent backs up configuration files before making changes and validates them before restarting services.

Review before remediating. Always review the recommended fixes before clicking Remediate. Security changes (e.g. SSH hardening, firewall rules) can lock you out if applied incorrectly.

PDF export

Click the Security button at the top of the Agent Assets page to download a fleet-wide security audit report. The PDF includes a summary bar with issue counts by severity, detailed findings with explanations and remediation steps, and a list of passed checks.

Use the Schedules popover in the Agent Assets toolbar to enable automatic report emails (Daily / Weekly / Monthly). Scheduled reports are generated and emailed as PDF attachments to all admin users who have report_ready notifications enabled. Changing the report schedule also sets the same scan schedule on all agents so data stays fresh.

Scheduled audits

You can configure automatic recurring audits per agent. Open the Security Audit modal and use the schedule selector in the top-right corner to choose a frequency:

The scheduler checks every 15 minutes and triggers audits for agents that are overdue. Agents that have never been scanned are prioritized. A yellow badge (D, W, or M) appears on the agent card to indicate an active schedule.
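The overdue selection can be sketched like this; the agent record shape is illustrative:

```python
from datetime import datetime, timedelta

PERIODS = {"D": timedelta(days=1), "W": timedelta(weeks=1), "M": timedelta(days=30)}

def due_agents(agents, now):
    """Agents whose last audit is older than their schedule period.
    Never-scanned agents (last_audit is None) sort to the front."""
    due = [a for a in agents
           if a["last_audit"] is None
           or now - a["last_audit"] >= PERIODS[a["schedule"]]]
    return sorted(due, key=lambda a: (a["last_audit"] is not None,
                                      a["last_audit"] or now))
```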

Constraints

Service Monitors

Monitor the availability and response time of services running on your agents. Monitors run directly from the agent's network, so they can check internal services (localhost, LAN) as well as public endpoints.

How it works

  1. Create — Open the Monitors page and click Add Monitor. Pick a service type from the catalog (43 types across 9 categories), select an agent, and configure the check parameters.
  2. Check — The agent runs the check locally on the configured schedule (1m, 5m, 15m, 30m, or 1h). Five check types are supported: TCP connect, HTTP request, DNS resolution, ICMP ping, and TLS certificate expiry.
  3. Report — The agent reports results to the portal. Only status transitions (up→down, down→up) and periodic summaries are sent — not every individual check. This keeps DB writes near zero when everything is up.
  4. Alert — When alerts are enabled, an email is sent to all users assigned to the target agent after a configurable number of consecutive failures (default: 3). A recovery email is sent when the service comes back up.
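The transition-plus-threshold logic above can be sketched as follows (the state dict shape is hypothetical):

```python
def process_check(state, is_up, threshold=3):
    """Update monitor state; return an alert event or None.
    Only status transitions produce events, not every check."""
    if is_up:
        event = "recovered" if state["status"] == "down" else None
        state.update(status="up", fails=0)
        return event
    state["fails"] += 1
    if state["status"] == "up" and state["fails"] >= threshold:
        state["status"] = "down"
        return "alert"    # email goes out after N consecutive failures
    return None
```

With the default threshold of 3, two isolated failures produce no event at all, which is what keeps alert noise and DB writes low.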

Service catalog

The monitor catalog (one JSON file per service in monitors/) defines 43 service types organized in 9 categories:

Category | Services
Web | HTTP / HTTPS, REST API, HAProxy, Squid Proxy
Network | TCP Port, Ping (ICMP), DNS, NTP
Email | SMTP, IMAP, POP3
Database | MySQL / MariaDB, PostgreSQL, SQL Server, Redis / Valkey, MongoDB, Elasticsearch, Memcached, ClickHouse, InfluxDB, Cassandra, CouchDB
Messaging | RabbitMQ, Kafka, NATS, MQTT
File Sharing | FTP / SFTP, SMB / CIFS, NFS, AFP, MinIO / S3, WebDAV
Remote Access | SSH, RDP, WinRM, OpenVPN, IPsec / IKEv2
Infrastructure | LDAP / LDAPS, Kerberos, Docker API, Consul, Vault, etcd
Monitoring | Prometheus, Grafana, Zabbix

Each service type maps to one of 5 agent check types (TCP, UDP, HTTP, DNS, Ping). TCP and HTTP checks support an SSL/TLS toggle for TLS handshake validation and optional certificate expiry warnings (works with self-signed certificates). Adding a new service is just a JSON file in the monitors/ directory — no code changes needed.
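A catalog entry might look roughly like this; the field names are illustrative, since the actual schema of the monitors/ JSON files is not documented here:

```json
{
  "id": "redis",
  "name": "Redis / Valkey",
  "category": "Database",
  "check_type": "tcp",
  "default_port": 6379,
  "supports_tls": true
}
```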

Alerts

Each monitor has an alert toggle and a configurable consecutive failure threshold (default: 3).

Test before creating

The Test button in the create/edit modal sends an ad-hoc check to the agent and shows the result immediately (up/down, response time, error) without creating or saving the monitor.

Data & charts

Permissions

MCP integration

Two MCP tools are available for AI assistants:

Per-Plan Limits

The number of monitors per account is limited by your plan (Free: 20, Pro: 100, Business: 200, Enterprise: unlimited). The Monitors page shows your usage against the limit. The Add Monitor button is disabled when the limit is reached.

Certificates & PKI

Manage TLS certificates for your agents directly from the portal. Two certificate sources are supported:

Setup

  1. Configure a CA or LE account — Go to Settings → PKI & CA. Create a new internal CA, import an existing sub-CA, or register a Let's Encrypt account. Optionally add DNS-01 providers for DNS-based certificate validation.
  2. Set defaults — Configure default certificate validity (14–365 days), key type (ECDSA P-256, RSA-2048, RSA-4096), and renewal window (7–90 days before expiry).
  3. Issue certificates — Go to Certificates, click New Certificate, pick a target agent, and fill in the common name, file paths, and optional SANs.

Certificate Lifecycle

CRL & Public Endpoints

The portal serves two public endpoints (no authentication required):

Both URLs are embedded in issued certificates as the CRL Distribution Point and Authority Information Access extensions.

Auto-Renewal Sweep

A daily background task (Redis-locked for HA) handles certificate lifecycle:

Permissions

Per-Plan Limits

The number of certificates per account is limited by your plan (Free: 10, Pro: 50, Business: 100, Enterprise: unlimited). The Certificates page shows your usage against the limit. The New Certificate button is disabled when the limit is reached.

MCP Tools

System Backups

End-to-end encrypted filesystem backups from your agents to your own S3 storage. ManageLM never sees your data — the agent encrypts every archive locally before uploading, and only the ciphertext transits via your S3 bucket. Restore to any online agent at any time.

Providers

The S3 bucket is configured once per account in Settings → S3 Backups. Provider-agnostic — one set of credentials, any S3-compatible storage:

Each provider has a one-click preset that pre-fills the endpoint URL. The Test button validates credentials via HeadBucket before saving. Secret keys are stored AES-256-GCM encrypted at rest.

Encryption

Every backup has its own randomly generated 32-byte master key, stored wrapped server-side. Before each run, the portal sends the key to the agent over the existing mTLS WebSocket channel — never over HTTP, never logged.

Pure-Python implementation on the agent via oscrypto — no cryptography package, no native build dependencies.

Schedule & Retention

Each backup has its own cadence and retention:

Schedule | Configurable fields
Every hour | (none)
Every 6 hours | (none)
Daily | Run time (HH:MM, agent-local)
Weekly | Day of week + run time
Monthly | Day of month (1–31, clamped) + run time

FIFO retention — specify how many snapshots to keep (1–90). Older snapshots are automatically rotated out by the cleanup cron, which best-effort deletes the S3 object then the DB row.

Quiesce services during backup

For a consistent snapshot of databases and stateful apps, list one or more services to stop during the backup (comma-separated, e.g. postgresql, redis). The agent:

  1. Stops each listed service via systemctl stop (Linux) or net stop (Windows) — 30-second timeout per service.
  2. Runs the tar → encrypt → upload pipeline.
  3. Restarts every service that was successfully stopped — in a try/finally so a backup failure (or the agent being killed mid-run) never leaves services down.
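The stop/restart discipline can be sketched as follows; a fake runner stands in for systemctl/net stop, and the function names are hypothetical:

```python
def run_backup_with_quiesce(services, stop, start, do_backup):
    """Stop listed services, run the backup, and restart every service
    that was successfully stopped, even if the backup raises."""
    stopped = []
    for svc in services:
        if stop(svc):           # systemctl stop / net stop, 30s timeout
            stopped.append(svc)
    try:
        return do_backup()      # tar -> encrypt -> upload
    finally:
        for svc in stopped:     # always restart, failure or not
            start(svc)
```

The try/finally guarantees the restart loop runs on the backup path's success, failure, or exception, which matches the "never leaves services down" behavior described above.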

Run flow

  1. Agent requests a presigned PUT URL from the portal; portal pre-inserts a pending snapshot row.
  2. Agent tars the source path (with optional excludes), encrypts the archive, uploads directly to S3 — never through the portal.
  3. Agent reports size, file count, duration, SHA-256 via backup_status.
  4. Portal flips the snapshot to ok / failed; the cleanup cron reaps stuck pending rows after 6 hours.
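The archive-and-report half of the pipeline can be sketched with stdlib pieces; encryption and the presigned PUT are omitted, and the result fields are illustrative:

```python
import hashlib
import os
import tarfile
import tempfile
import time

def make_snapshot(source_dir):
    """Tar the source path, then compute the metadata the agent would
    report via backup_status (encrypt/upload steps omitted here)."""
    start = time.monotonic()
    archive = tempfile.NamedTemporaryFile(suffix=".tar", delete=False).name
    with tarfile.open(archive, "w") as tar:
        tar.add(source_dir, arcname=".")
    file_count = sum(len(files) for _, _, files in os.walk(source_dir))
    with open(archive, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {"size": os.path.getsize(archive), "files": file_count,
            "sha256": digest, "duration_s": time.monotonic() - start}
```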

Download & Restore

Detach on agent delete

When you delete an agent that has backups, the backups are not deleted — their agent_id is cleared instead. The S3 data and snapshot history survive the hardware replacement. A purple Reassign button appears in the backup row; clicking it opens the edit modal with an Agent picker so you can attach the backup to a new agent and continue the schedule. The UI also warns you about the detached count before confirming the agent deletion.

S3 orphan cleanup

The S3 Cleanup button in Settings → S3 Backups scans your bucket under the account prefix and deletes objects that have no matching snapshot row in the portal. Useful when the bucket was deleted externally, credentials were rotated mid-run, or you want to reclaim storage after manually removing backups.

Alerting

Per-backup toggle for alert-on-failure emails. ManageLM also detects stalled backups: if a scheduled backup is missed because its agent is offline, you receive a single consolidated alert per agent rather than one alert per missed run.

Permissions

Per-Plan Limits

The number of backups per account is limited by your plan (Free: 20, Pro: 100, Business: 200, Enterprise: unlimited). Detached backups still occupy a slot — delete them explicitly to free the slot.

Constraints

Pentests

ManageLM includes automated penetration testing for your public-facing agents. Pentests scan your servers from the outside — testing what an attacker would see. Available on Pro and Business plans.

How it works

  1. Select — Open the Pentests page and click New Pentest. Choose one or more public agents, select the tests to run, and optionally add target URLs.
  2. Validate — The portal sends a one-time token to the agent. The agent validates with the pentest service from its public IP, proving it controls the target.
  3. Scan — The pentest service runs tools sequentially: nmap (port discovery), nuclei (vulnerability scanning), testssl.sh (TLS audit), and more depending on selected tests.
  4. Report — An LLM generates a human-readable report with findings, severity ratings, and a security score (0–100). Results appear in the Agent Assets audit modal (Pentest tab) and the Pentests dashboard.

Available tests

Test | What it scans | Credits
Basic Scan | Port discovery (nmap), vulnerability scan (nuclei), TLS quick check (testssl) | 3
Full Port Scan | All 65,535 TCP ports | 3
Vulnerability Scan | Extended nuclei templates (critical/high/medium) | 3
SSL/TLS Audit | Full testssl.sh analysis (per URL) | 1
Web App Scan | Nuclei web templates (per URL) | 3
DNS Audit | SPF, DMARC, DKIM, MX records (per URL) | 1
HTTP Headers | Security headers analysis (per URL) | 1
Directory Scan | Common path discovery with ffuf (per URL) | 2
Subdomain Enum | Subdomain discovery with subfinder (per URL) | 1

URL-based tests run once per target URL. Credit cost is calculated as: IP-based test credits + (URL-based test credits × number of URLs).
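The credit formula writes out directly; the test names in the example are shorthand for entries in the table above:

```python
def credit_cost(ip_tests, url_tests, num_urls):
    """ip_tests / url_tests map test name -> credit cost per run.
    URL-based tests run once per target URL."""
    return sum(ip_tests.values()) + sum(url_tests.values()) * num_urls

# Basic Scan (3) + Full Port Scan (3), plus SSL/TLS Audit (1) and
# HTTP Headers (1) against 2 URLs: 6 + 2 * 2 = 10 credits.
cost = credit_cost({"basic_scan": 3, "full_port_scan": 3},
                   {"ssl_audit": 1, "http_headers": 1}, 2)
```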

Credits

Pentests consume credits. Credits are deducted after a successful scan — failed scans are not charged.

Domain verification

Before scanning URLs, you must verify domain ownership. The pentest service generates a DNS TXT record that you add to your domain. Once verified, the domain stays valid for 24 hours before requiring re-verification.

Compliance integration

Pentest results automatically feed into the Compliance page. Each tool produces a pass/fail rule that maps to framework controls (CIS, PCI-DSS, SOC 2, ISO 27001, NIS2, NIST CSF, HIPAA). Pentest rules appear alongside security audit rules in framework coverage views.

Constraints

Compliance & Frameworks

The Compliance page maps your security audit results to industry compliance frameworks. ManageLM evaluates your fleet against each framework's controls and shows which pass, fail, or are not covered by the current rule set.

Supported frameworks

Framework | Version | Description
CIS Level 1 | v8.0 | Center for Internet Security — essential security hygiene for servers
CIS Docker | v1.6 | CIS Docker Benchmark — container runtime security
SOC 2 | 2017 | Trust Services Criteria — Security principle technical controls
PCI DSS | v4.0 | Payment Card Industry Data Security Standard
ISO 27001 | 2022 | ISO/IEC 27001 Annex A — information security controls
NIS2 Directive | 2022 | EU Directive 2022/2555 — network and information security measures
NIST CSF | v2.0 | NIST Cybersecurity Framework — Protect, Detect, Identify functions
HIPAA Security Rule | 2013 | 45 CFR §164.312 — technical safeguards for protected health information

How controls are evaluated

Each framework control is backed by one or more checks from security audits, pentests, and vulnerability scans. A control passes only when every backing check passes on every agent. If any check fails on any agent, the control fails. Controls with no data yet (no agents scanned) show as not covered.
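The evaluation rule can be sketched like this; the result shape (`results[agent][check]` is True/False, absent when not yet scanned) is illustrative:

```python
def control_status(backing_checks, results):
    """'pass' only if every backing check passes on every agent that
    reported it; 'fail' if any check fails anywhere; 'not_covered'
    when no agent has data for any backing check."""
    seen = False
    for agent_results in results.values():
        for check in backing_checks:
            if check in agent_results:
                seen = True
                if not agent_results[check]:
                    return "fail"
    return "pass" if seen else "not_covered"
```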

Compliance dashboard

The Compliance page has two tabs:

Agents tab

Frameworks tab

Security drift notifications

When a security audit completes and a rule that previously passed now fails, ManageLM detects this as drift. Drift is shown in the Compliance dashboard as an alert. Optionally, admins can enable the Security Drift email notification in Settings > Email Notifications to receive an email with the new issues.

Drift detection only triggers when there is audit history — the first scan for an agent never generates drift alerts.

Evidence PDF export

Each framework has an Evidence PDF button (enabled when compliance is ≥ 50%). The generated PDF is designed for auditors and includes:

The fleet-wide Export PDF button on the Compliance page generates a summary report covering all frameworks.

Adding custom frameworks

Frameworks are defined as JSON files in the frameworks/ directory. Each file specifies an id, name, version, description, url, and an array of controls. Each control maps to existing rule_slugs. No code changes are needed — drop a new JSON file and restart the portal.

System Inventory

ManageLM discovers all running services, installed packages, and system components on your agents. Checks are read-only and no skill assignment is required.

How it works

  1. Trigger — Open an agent's detail panel on the Agent Assets page. Click the clipboard icon to open the System Inventory modal, then click Run Inventory.
  2. Scan — The agent collects information about the system using a read-only set of checks.
  3. Structure — The agent's configured LLM categorizes the results and extracts product names and versions. This is the only built-in report that uses the LLM. Without an LLM, a minimal inventory is returned.
  4. Results — Inventory items appear in the modal, grouped by category.

What is collected

System Info: OS, kernel, uptime, CPU count, memory, disk usage
Running Services: All active services (systemd on Linux, Windows Services on Windows)
Enabled Services: Services enabled at boot
Listening Ports: TCP listening sockets with associated processes
Installed Packages: Package list from rpm or dpkg (Linux), or installed programs list (Windows)
Package Versions: Explicit version extraction for common packages (nginx, PostgreSQL, Redis, Docker, etc.)
Containers: Docker/Podman containers with image, status, and ports
Cron Jobs: System and per-user cron jobs
Network Interfaces: All network interfaces with addresses
Mounted Filesystems: Non-virtual mounted filesystems
Hardware Info: CPU model, memory, disks
Web Servers: Running web servers (nginx, Apache, Caddy, HAProxy)
Databases: Running databases (PostgreSQL, MySQL, MongoDB, Redis, Valkey, Memcached, Elasticsearch)
Login Users: Non-system user accounts with shell and group membership

Categories

Each inventory item is classified into one of these categories:

system: OS version, kernel, CPU, memory, disk
web: Nginx, Apache, Caddy, HAProxy
database: PostgreSQL, MySQL, Redis, Valkey, MongoDB, Elasticsearch
mail: Postfix, Dovecot, OpenDKIM
container: Docker containers, Podman containers
network: Network interfaces, listening ports
storage: Mounted filesystems, disks
security: Fail2ban, SELinux, firewall
monitoring: Monitoring agents, metrics collectors
log: Rsyslog, journald, logrotate
user: Login user accounts
scheduler: Cron jobs, systemd timers

PDF export

Click the Inventory button at the top of the Agent Assets page to download a fleet-wide inventory report covering all agents with completed inventories. The PDF includes categorized service lists with versions and status for each server.

As with security reports, use the Schedules popover to enable automatic email delivery. Changing the schedule also syncs the inventory scan schedule across all agents.

Scheduled inventories

You can configure automatic recurring inventories per agent. Open the System Inventory modal and use the schedule selector in the top-right corner to choose a frequency:

The scheduler checks every 15 minutes and triggers inventories for agents that are overdue. A yellow badge (D, W, or M) appears on the agent card to indicate an active schedule.
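The overdue check on each 15-minute tick can be sketched as follows. This is a simplified illustration with hypothetical names; the real scheduler may align runs to calendar boundaries rather than fixed periods:

```python
from datetime import datetime, timedelta

# Hypothetical period table; "monthly" is approximated as 30 days in this sketch.
PERIODS = {"daily": timedelta(days=1), "weekly": timedelta(weeks=1), "monthly": timedelta(days=30)}

def is_overdue(last_run, frequency, now):
    """True if an agent's inventory should be triggered on this scheduler tick."""
    period = PERIODS.get(frequency)
    if period is None:       # "manual" or no schedule: never auto-triggered
        return False
    if last_run is None:     # scheduled but never run yet
        return True
    return now - last_run >= period
```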

Constraints

SSH & Sudo Access

ManageLM includes a built-in access scanner that discovers SSH authorized keys and sudo privileges across your infrastructure. Check commands are defined on the portal (reports/ssh_keys.json) and executed on the agent in a read-only sandbox. Fully deterministic — no LLM involved. Discovered SSH key fingerprints are matched against ManageLM user profiles for identity resolution.

How it works

  1. Trigger — Open an agent's detail panel on the Agent Assets page. Click the SSH & Sudo button to open the access scan modal, then click Scan Access.
  2. Collect — The portal sends an ssh_keys_scan_request to the agent over WebSocket. The agent reads /etc/passwd, parses ~/.ssh/authorized_keys for each user, computes SHA256 fingerprints, and parses /etc/sudoers + /etc/sudoers.d/* with group membership resolution. No LLM calls.
  3. Results — The combined data is returned to the portal and displayed in the modal. SSH key fingerprints are matched against public keys registered in ManageLM user profiles (Settings → Security → SSH Public Keys) — matched keys show the user's name in a green badge, unmatched keys show as "Unknown".
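The SHA256 fingerprints follow OpenSSH's standard format: the base64-decoded key blob is hashed with SHA-256 and the digest is re-encoded as unpadded base64. A minimal sketch (options-prefixed authorized_keys lines are not handled here):

```python
import base64
import hashlib

def ssh_fingerprint(authorized_keys_line):
    """Compute the OpenSSH-style SHA256 fingerprint of one authorized_keys entry.

    Expects the usual "<type> <base64-blob> [comment]" layout.
    """
    key_type, blob_b64 = authorized_keys_line.split()[:2]
    digest = hashlib.sha256(base64.b64decode(blob_b64)).digest()
    # OpenSSH prints the digest as base64 without trailing '=' padding
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")
```

This is the same format printed by ssh-keygen -lf, so scan results can be cross-checked by hand.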

What is collected

SSH authorized keys (from ~/.ssh/authorized_keys): Key type, SHA256 fingerprint, comment, full public key, line number
Sudo user rules (from /etc/sudoers): Target host, runas user, commands, NOPASSWD flag, source file
Sudo group rules (from /etc/sudoers + /etc/group): Group rules (e.g. %wheel) expanded to individual users via group membership

Identity mapping

ManageLM users can register their SSH public keys in Settings → Security → SSH Public Keys. When the access scan discovers a key on a server, its SHA256 fingerprint is matched against registered keys to identify the owner. This creates a complete map of who has access to what and what they can do (SSH + sudo).

Register your SSH keys. For identity resolution to work, each team member should add their SSH public key(s) in Settings → Security → SSH Public Keys. Without registered keys, all discovered keys will appear as “Unknown” in scan results.

Sudo rules with NOPASSWD are highlighted in red as a security concern.

Key comments are not used for identity. The user@host comment in authorized_keys is unreliable — identity is resolved exclusively via SHA256 fingerprint matching against registered profiles.

MCP integration

The access scan powers natural-language access management via Claude:

PDF export

Click the SSH & Sudo button at the top of the Agent Assets page to download a fleet-wide access report. The PDF includes SSH keys and sudo rules per user per server, with NOPASSWD rules highlighted.

Scheduled scans

Configure automatic recurring scans per agent via the schedule selector in the modal header, or for all agents via the Schedules popover in the Agent Assets toolbar. Frequencies: Manual / Daily / Weekly / Monthly.

Constraints

Activity Audit

ManageLM includes a built-in activity audit that tracks user activity on your servers. Check commands are defined on the portal (reports/activity.json), with time-window parameters resolved per scan, and are executed on the agent in a read-only sandbox. On Linux there is no auditd dependency — the audit works on any distribution. On Windows, the audit uses the Windows Event Log. Fully deterministic, no LLM needed.

How it works

  1. Trigger — Click the Activity tab on an agent card in the Agent Assets page, then click Run Activity Audit.
  2. Scan — The agent collects activity for the configured time window.
  3. Parse — Events are normalized, deduplicated, and system accounts are filtered out.
  4. Identity — Full names (including LDAP/SSSD users) are matched against ManageLM users — matched users appear as green badges.
  5. Results — Displayed in the Activity Audit modal with dashboard cards and detail tables.

What the report shows

Time windows

Each audit collects data for a rolling time window:

PDF export & scheduled reports

Click the Activity button at the top of the Agent Assets page to download a fleet-wide activity audit report as PDF. Use the Schedules popover to configure automatic report emails.

Constraints

Service Dependencies

The Service Dependencies scan discovers cross-server service dependencies across your infrastructure. It shows what each server provides, what it depends on, and highlights connections between managed agents.

How it works

  1. Trigger — Click the Service Dependencies button at the top of the Agent Assets page.
  2. Scan — The portal sends a dependency_scan_request to all online agents simultaneously. A progress modal shows each agent's scan status in real time.
  3. Collect — Each agent runs a fully deterministic scan (no LLM needed):
    • Provides — discovers all listening TCP services via ss.
    • Depends on — discovers outbound connections (established TCP) plus config-file parsing for intermittent dependencies.
    • All hostnames are resolved to IPs locally on the agent before reporting.
  4. Report — The portal matches dependency IPs against known agent IPs to identify managed vs external connections, and displays a per-agent report.

What is scanned

Established connections: All active outbound TCP connections to non-local IPs
Nginx configs: proxy_pass, upstreams, fastcgi_pass, uwsgi_pass, grpc_pass
Apache configs: ProxyPass, ProxyPassReverse, RewriteRule [P]
HAProxy config: Backend server definitions
Caddy config: reverse_proxy targets
.env files: DATABASE_URL, REDIS_URL, DB_HOST, SMTP_HOST, and many more
Docker Compose: Environment variables with connection strings
WordPress: DB_HOST in wp-config.php
Database replication: MySQL master-host, PostgreSQL primary_conninfo, Redis replicaof
Mail configs: Postfix relayhost and lookup tables, Dovecot auth backends
LDAP configs: ldap.conf, sssd.conf, nslcd.conf URI/host directives
NFS/CIFS mounts: Network mounts in /etc/fstab
Systemd units: Environment variables with connection strings in service files
Prometheus: Scrape targets in prometheus.yml
DNS resolvers: /etc/resolv.conf nameservers
NTP servers: ntp.conf, chrony.conf, timesyncd.conf
Syslog targets: Remote syslog destinations in rsyslog configs
SNMP traps: Trap sink destinations in snmpd.conf
Backup clients: Bacula, Bareos, Borg, Restic server addresses
Zabbix agent: Server= directive in zabbix_agentd.conf
Generic /etc sweep: URLs with host:port and raw IP:port patterns across all /etc files

Report format

Each agent's section shows:

Constraints

Connectors

Connectors wire ManageLM up to external systems. They come in two kinds, selectable as tabs in the Add Connector modal:

Both kinds share the same permission (perm_connectors), the same encryption-at-rest (AES-256-GCM, requires ENCRYPTION_KEY), the same storage table, and the same CRUD pages. What differs is the data flow: cloud connectors pull on a schedule, SIEM connectors push as tasks complete.

Cloud Hosting

Sync your cloud resources (VMs, volumes, networks, security groups) and auto-match them to ManageLM agents by IP address and hostname.

Supported providers

How it works

  1. Go to Connectors in the sidebar and click Add Connector.
  2. On the Cloud Hosting tab, select a provider, enter a name, and fill in your credentials.
  3. Click Save — the connector syncs automatically on creation.
  4. Use Edit → Test Connection to verify credentials at any time.
  5. Cloud resources appear in the connector's expanded view and on agent cards in the Agent Assets page.

What is synced

Agent matching

After each sync, ManageLM automatically matches cloud VMs to agents by comparing IP addresses and hostnames. Matched agents show a provider badge (e.g. AWS, Azure) on their card in the Agent Assets page. Expanding an agent card shows the full cloud metadata (instance type, zone, IPs, disks, security groups, tags).

MCP integration

Claude can query your cloud inventory using three built-in tools:

These tools are hidden until at least one cloud connector exists — a SIEM-only tenant will not see them in Claude's tool catalog.

Sync schedule

Each connector syncs on a configurable interval: every 1 hour, 6 hours, 12 hours, or 24 hours. Manual sync is available from the connector list (refresh icon). Syncs are distributed across portal instances using Redis locks to prevent duplicates.

Security

SIEM Integration

Forward task-completion events from your agents directly to an external SIEM. Useful for compliance, centralized security monitoring, and audit trails outside ManageLM's own database. Forwarding is additive — the portal's own task log and audit trail are unchanged.

Agent-direct delivery. Events travel from the agent straight to the SIEM over HTTPS. The portal never sees the event stream in flight — it only distributes the SIEM config (endpoint + credentials) to the agents. This is what lets ManageLM SaaS forward into private / on-prem SIEMs behind NAT: the SIEM only has to be reachable from the managed server, not from our cloud.

Supported destinations

What gets forwarded

One event per completed task — the same rows you see in the Command History panel of an Agent's detail page. Nothing else is forwarded: no heartbeats, no config pushes, no LLM traffic.

{
  "ts": "2026-04-17T14:23:11Z",
  "agent": { "hostname": "prod-web-01" },
  "task": {
    "id": "...",
    "skill": "firewall",
    "instruction": "block 1.2.3.4",
    "status": "completed",
    "output": "...",
    "error": null,
    "files_changed": ["/etc/nftables.conf"]
  }
}

Splunk wraps this in {"event": <envelope>, "sourcetype": "...", "index": "...", "host": "..."}. Elasticsearch sends it as an NDJSON _bulk body (action line + doc line). Webhook sends a JSON array of envelopes per batch.
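As a sketch of those three shapes (the sourcetype and index values are whatever you configure on the connector; function names here are illustrative):

```python
import json

def to_splunk_hec(envelope, sourcetype, index):
    """Wrap one task envelope for Splunk HEC, as described above."""
    return json.dumps({
        "event": envelope,
        "sourcetype": sourcetype,
        "index": index,
        "host": envelope["agent"]["hostname"],
    })

def to_es_bulk(envelopes, index):
    """Render a batch as an Elasticsearch _bulk NDJSON body: action line + doc line per event."""
    lines = []
    for env in envelopes:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(env))
    return "\n".join(lines) + "\n"  # _bulk bodies must end with a newline

def to_webhook(envelopes):
    """Render a batch as a plain JSON array of envelopes."""
    return json.dumps(envelopes)
```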

How it works

  1. Go to Connectors in the sidebar and click Add Connector.
  2. Switch to the SIEM Integration tab, pick a type, enter a name, fill in the endpoint and credentials, and save. A Test Connection runs automatically on create.
  3. Open an Agent detail page — or a Server Group — and pick the new SIEM from the SIEM Forwarding dropdown.
  4. From that point on, every task completed by that agent fires a POST to the SIEM, in parallel with the normal task-result report to the portal.

Assignment and inheritance

Each agent has at most one SIEM destination. It resolves as:

  1. If the agent itself has a direct override → that destination wins.
  2. Else if its group(s) point at a single destination → inherit that one.
  3. Else → no forwarding.

If an agent belongs to several groups whose SIEM settings differ, the portal refuses to guess — the agent gets a red SIEM CONFLICT badge on the Agent Assets list until you set an explicit per-agent override to resolve the conflict.
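The resolution order can be sketched as follows (names are hypothetical, not the portal's actual code):

```python
def resolve_siem(agent_override, group_destinations):
    """Return (destination, conflict) for an agent.

    agent_override: connector id set directly on the agent, or None.
    group_destinations: SIEM connector ids of the agent's groups (None = group has none).
    """
    if agent_override is not None:
        return agent_override, False   # 1. direct override wins
    unique = {d for d in group_destinations if d is not None}
    if len(unique) == 1:
        return unique.pop(), False     # 2. single inherited destination
    if len(unique) > 1:
        return None, True              # groups disagree: SIEM CONFLICT badge
    return None, False                 # 3. no forwarding
```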

Agent groups show their SIEM destination as a small → <connector name> pill on the group card (read-only view).

Transport and reliability

Security

Permissions

Creating, editing, or deleting a SIEM connector requires the Connectors permission (perm_connectors) — the same gate as cloud connectors. Assigning a SIEM destination to an agent also requires the Agents permission; assigning one to a group requires the Groups permission.

Permissions (shared)

The Connectors permission (perm_connectors) covers both kinds. Owners and admins have full access. Members need the permission toggled on in Users & Roles.

Change Tracking

ManageLM automatically tracks file changes made by every mutating task. Each agent maintains a local git repository that snapshots tracked directories before and after task execution, producing a precise record of what changed, when, and by which task.

How it works

  1. Pre-snapshot — Before a task executes, the agent syncs all tracked files into its local git repo and commits a baseline.
  2. Task execution — The task runs normally (LLM-driven commands).
  3. Post-snapshot — After the task completes, the agent syncs again, commits the delta, and computes the list of changed files.
  4. Report — Changeset metadata (files changed, commit hashes, summary) is sent to the portal and stored in the database. The full diff stays in the agent’s local repo.
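The agent does this with real git commits (via dulwich), but the core before/after idea can be illustrated with a stdlib-only sketch that hashes tracked files and diffs the two snapshots:

```python
import hashlib
import os

def snapshot(root):
    """Map relative file path -> SHA-256 of contents for every file under root."""
    state = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                state[os.path.relpath(path, root)] = hashlib.sha256(f.read()).hexdigest()
    return state

def changed_files(before, after):
    """Paths added, removed, or modified between two snapshots."""
    added_or_removed = set(before) ^ set(after)
    modified = {p for p in before.keys() & after.keys() if before[p] != after[p]}
    return sorted(added_or_removed | modified)
```

Using git instead of flat hashes buys the agent full diffs, history, and cheap reverts on top of the same pre/post comparison.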

What is tracked

Tracked directories: /etc/ — covers SSH, nginx, firewall, cron, sudoers, sysctl, network config, and more
Skipped content: Binary files, files > 512 KB, symlinks, and noisy directories (ssl/certs, pki/ca-trust, firmware, kernel, selinux/targeted/policy)
Git implementation: dulwich (pure Python) — no git CLI needed on the host
Repo location: /opt/managelm/git/ on each agent
Retention: 30 days — older commits are automatically pruned daily

Viewing changes

When a task modifies tracked files, a changeset badge appears on the task in the task log (in the Agent Detail page and the MCP Log). The badge shows the number of files changed.

With MCP (Claude), use the built-in get_task_changes tool to inspect what a task modified:

get_task_changes(task_id="...", full_diff=true)

This returns:

Reverting changes

If a task made unwanted changes, you can revert them to restore the previous file state. Use the revert_task MCP tool:

revert_task(task_id="...")

This fetches the diff from the agent’s local git repo and applies a reverse patch, restoring the files to their pre-task state. The revert is tracked as a separate changeset.

Requirements: The agent must be online for full diffs and reverts (the data lives in the agent’s local repo). The changeset must be within the 30-day retention window. Changeset metadata (file list, summary) is always available in the portal database regardless of agent status.

Non-mutating tasks

Tasks classified as read-only (non-mutating) by the LLM skip the snapshot process entirely — no changeset is created. This keeps the git history clean and avoids unnecessary I/O for read-only operations like status checks and log queries.

Audit Log

The Audit Log provides a chronological record of all administrative actions performed in your account. It is accessible from the Audit Log entry in the sidebar.

What is logged

Every significant action is automatically recorded, including:

Authentication: Login, logout
Users: Invite, update role/permissions, delete, transfer ownership
Agents: Approve, delete, update settings, bulk actions
Skills: Create, import, update, delete, document upload/delete
Groups: Create, update, delete, member changes
Webhooks: Create, update, delete
API Keys: Create, delete
MCP: Configuration changes (IP whitelist, etc.)
Account: Settings changes, license activation/removal

Log entry details

Each entry records:

Access control

Features

Reporting

The Reporting page provides a historical view of all task executions across your account. Use it to review what was run, by whom, on which agent, and what the outcome was.

Access

Reporting is visible to admin and owner roles by default. Members need the perm_reports permission enabled (configurable in Users & Roles).

Features

Agent summaries

Each task entry may include a summary — a short description auto-generated by the agent’s LLM after completing the task. These summaries make it easy to scan results without reading raw command output.

Webhooks

Get notified when things happen in your account.

Available events

agent.enrolled: A new agent requests enrollment
agent.approved: An agent is approved
agent.online: An agent connects
agent.offline: An agent disconnects
task.completed: A task finishes successfully
task.failed: A task fails
report.completed: A security audit or inventory scan completes
report.failed: A security audit or inventory scan fails
monitor.down: A service monitor goes down (after consecutive failure threshold)
monitor.up: A service monitor recovers from down
cert.issued: A new certificate is issued and deployed to an agent
cert.revoked: A certificate is revoked (CRL updated, LE notified for LE certs)
cert.renewed: A certificate is automatically renewed by the daily sweep
cert.renewal_failed: Automatic certificate renewal failed
cert.reactivated: A revoked certificate is reactivated (internal CA only)
cert.deleted: A certificate is soft-deleted from the portal

Configure webhooks from Settings → MCP & API. Enter a URL, select events, and optionally provide an HMAC secret. Payloads are signed with HMAC-SHA256 via the X-Webhook-Signature header when a secret is configured.

Delivery retries up to 3 times with exponential backoff. After 10 consecutive failures, the webhook is automatically disabled. Re-enabling it resets the counter. Maximum 25 webhooks per account.
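A receiver can verify signed payloads along these lines. This sketch assumes the X-Webhook-Signature value is a hex-encoded HMAC-SHA256 of the raw request body; verify the exact encoding against your portal before relying on it:

```python
import hashlib
import hmac

def verify_webhook(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant time.

    Hex encoding of the signature header is an assumption of this sketch.
    """
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)
```

Always verify against the raw bytes as received; re-serializing parsed JSON can change whitespace and break the comparison.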

In-App Notifications

The portal includes a real-time notification system accessible from the Notifications bell in the sidebar. Notifications are delivered alongside email alerts for key events.

Notification triggers

How it works

Storage: Notifications are stored in Redis with a 24-hour auto-expiry (max 50 per user). They are ephemeral and do not persist across Redis restarts.

Deployment & .env

The portal is configured via environment variables in a .env file. Below is a reference of all available settings.

Core

DATABASE_URL (required): PostgreSQL connection string
SERVER_PORT (default 3000): HTTP listen port
SERVER_URL (required): Full public URL (e.g. https://portal.example.com)
ACCESS_TOKEN_TTL (default 86400): Access token lifetime in seconds (24h). Tokens are opaque random strings stored in Redis — no signing secret.
REFRESH_TOKEN_TTL (default 2592000): Refresh token lifetime in seconds (30d).
DEFAULT_TIMEZONE (default UTC): Default timezone for new users
TASK_TIMEOUT_SECONDS (default 300): Max duration for synchronous task execution (seconds)
FILE_TRANSFER_MAX_BYTES (default 26214400): Max file transfer size (25 MB)
TOS_URL (optional): URL to Terms of Service page. When set, signup forms require ToS acceptance.
LOG_LEVEL (default info): Log verbosity: trace, debug, info, warn, error, fatal, silent
CLUSTER_WORKERS (default 2): Number of Node.js cluster workers. Set to 1 to disable clustering.
SERVER_MODE (default selfhosted): saas = hosted SaaS (trial LLM available), selfhosted = Docker/on-prem (proxied LLM available).
NOTIFY_EMAIL (optional): Email address for platform operator alerts (account created/deleted notifications).
ENCRYPTION_KEY (optional): AES-256 key for encrypting connector credentials at rest (cloud provider secrets and SIEM tokens). Required to use Connectors. Generate with: openssl rand -hex 32

SMTP & DKIM

SMTP_HOST (optional): SMTP server hostname. When empty, emails are sent directly to recipient MX servers (no mail server required).
SMTP_PORT (default 25): SMTP server port
SMTP_FROM (required): From address for all emails
SMTP_SECURE (default none): none = plain (localhost:25), starttls = upgrade via STARTTLS (587), tls = implicit TLS (465)
SMTP_USER (optional): SMTP auth username (for external relays)
SMTP_PASS (optional): SMTP auth password
DKIM_DOMAIN (optional): Domain for DKIM signing (e.g. example.com)
DKIM_SELECTOR (default "default"): DKIM selector (matches DNS TXT record)
DKIM_PRIVATE_KEY_PATH (optional): Path to PEM private key file
DKIM_PRIVATE_KEY (optional): Inline PEM private key (use \n for newlines)
DKIM setup: When DKIM_DOMAIN and a private key are set, all outgoing emails are signed with DKIM (RSA-SHA256). You also need to publish a DNS TXT record at {selector}._domainkey.{domain} with the matching public key.
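For example, with DKIM_SELECTOR=default and DKIM_DOMAIN=example.com, the record lives at default._domainkey.example.com; the p= value is the base64 body of your RSA public key (truncated here):

```
default._domainkey.example.com. IN TXT "v=DKIM1; k=rsa; p=MIIBIjANBgkq..."
```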

Redis (required)

REDIS_URL (required): Redis connection URL (e.g. redis://localhost:6379). Supports redis://, rediss://, valkey://, valkeys:// schemes.
REDIS_TLS (default auto): auto = TLS if URL uses rediss:// or valkeys://, on = force TLS, off = no TLS
REDIS_DB (default 0): Logical database number (0–15). Useful when sharing a Redis instance.

Redis is a mandatory component used for:

Database

DB_POOL_MAX (default 20): Max PostgreSQL connection pool size
DB_SSL (default none): none = no SSL, require = SSL (skip cert verify), verify = full CA verification, verify-ca = custom CA cert
DB_SSL_CA (optional): Path to CA certificate file (used with DB_SSL=verify-ca)
TASK_LOG_RETENTION_DAYS (default 30): Days to keep task log entries
AUDIT_LOG_RETENTION_DAYS (default 90): Days to keep audit log entries
TASK_LOG_MAX_PER_ACCOUNT (default 5000): Max task log entries per account
AUDIT_LOG_MAX_PER_ACCOUNT (default 10000): Max audit log entries per account
SESSION_RETENTION_DAYS (default 30): Days before inactive login sessions are deleted
PENDING_AGENT_RETENTION_DAYS (default 14): Days before unapproved agent enrollments are deleted
EMAIL_VERIFY_RETENTION_DAYS (default 7): Days before stale email verification tokens are cleared
MONITOR_RETENTION_DAYS (default 90): Days to keep monitor events. Rollups are kept 4× longer for trend charts.

Performance notes

The portal includes several built-in performance optimizations for high-load deployments:

For high-traffic deployments, increase DB_POOL_MAX and configure REDIS_URL for session persistence and horizontal scaling.

Background Maintenance

The portal automatically cleans up stale data using four background tasks. Each runs on a distributed Redis lock, so only one portal instance executes per interval — no external cron is needed.

OAuth cleanup (every 30 min): Deletes expired MCP OAuth tokens and authorization codes
Log purge (every hour): Age-based and count-based pruning of task_log and audit_log
Maintenance (every 6 hours): Cleans all other stale resources (see table below)
Scheduled scans (every 15 min): Triggers security audits and system inventories for agents with a configured schedule (daily/weekly/monthly), and generates scheduled PDF reports for accounts

All tasks also run once on portal startup.

Maintenance targets

Login sessions: No activity in SESSION_RETENTION_DAYS (default 30; configurable)
Expired invitations: Past expires_at and not accepted
Expired API keys: Past optional expires_at
Password reset tokens: Past password_reset_expires_at
Email verification tokens: Unverified accounts older than EMAIL_VERIFY_RETENTION_DAYS (default 7; configurable)
WebAuthn challenges: User inactive > 6 hours (abandoned registration flow)
Pending agent enrollments: Unapproved for PENDING_AGENT_RETENTION_DAYS (default 14; configurable)
Monitor events: Older than MONITOR_RETENTION_DAYS (default 90; configurable)
Monitor rollups: Older than 4× MONITOR_RETENTION_DAYS (default 360 days; configurable)
PKI certificates: Soft-deleted certs after natural expiry, expired certs after 7 days, stale failed/pending after 7 days (configurable)

Configurable retention values can be set via environment variables in .env. See the Deployment & .env section for details.

Reinstalling an Agent

You can reinstall an agent without losing its configuration (skills, groups, members).

  1. Go to the agent's detail page.
  2. Click the Reinstall button.
  3. Copy the install command and run it on the server.
  4. Approve the re-enrollment when prompted.

The agent gets a fresh access token and signing key while keeping all its existing configuration intact.

Custom Skills

You can create your own skills to extend what agents can do.

  1. Go to Agent Skills and click Create Skill.
  2. Define the skill's slug, name, and description.
  3. Add operations (name and description for each capability).
  4. Set the allowed commands.
  5. Write a system prompt that guides the LLM.

Tips for custom skills

Import / Export

Skills can be exported as JSON files and imported into other accounts. Use the export button on any skill, or import from the skills page.

Skill Documents (RAG)

You can upload reference documentation to any skill. When a task is dispatched, relevant sections are automatically retrieved and injected into the LLM prompt — giving the agent knowledge about products, tools, or APIs that the LLM wasn't trained on.

No external dependencies. Document search uses PostgreSQL full-text search (tsvector + GIN index). Works out of the box on both SaaS and self-hosted installations.

How it works

  1. Upload — Drop .txt, .md, .pdf, .html, .doc, or .docx files onto the skill's edit form. Text is extracted automatically and chunked for indexing.
  2. Retrieve — When a task matches, the portal searches document chunks using the task instruction and retrieves the top matching sections.
  3. Inject — Matching chunks are injected into the agent's system prompt as a REFERENCE DOCUMENTATION block, before the task instructions.

Uploading documents

  1. Go to Agent Skills and click the Edit (pencil) icon on a skill.
  2. Below the Detailed Description field, you'll see the Reference Documents section with a drag-and-drop zone.
  3. Drop one or more files (.txt, .md, .pdf, .html, .doc, .docx), or click the zone to browse.
  4. Each uploaded file is shown with its filename, size, chunk count, and upload date.
  5. To remove a document, click the trash icon next to it.

Chunking

Text is first extracted from the uploaded file (PDF → pdf-parse, DOC/DOCX → mammoth, HTML → tag stripping, TXT/MD → as-is), then split into chunks of ~1000–1500 characters for efficient retrieval:
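The splitting step can be approximated with a paragraph-greedy sketch like the one below. The real chunker's boundary rules may differ; the size target comes from the description above:

```python
def chunk_text(text, max_size=1500):
    """Greedily pack paragraphs into chunks of at most max_size characters."""
    chunks, buf = [], ""
    for para in text.split("\n\n"):
        candidate = buf + "\n\n" + para if buf else para
        if len(candidate) <= max_size:
            buf = candidate
            continue
        if buf:
            chunks.append(buf)
        # a single paragraph longer than max_size is hard-split
        while len(para) > max_size:
            chunks.append(para[:max_size])
            para = para[max_size:]
        buf = para
    if buf:
        chunks.append(buf)
    return chunks
```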

Retrieval at task time

When a task is sent to an agent, the portal searches the skill's document chunks using PostgreSQL's websearch_to_tsquery. The top matching chunks (up to 10 chunks / 30,000 characters by default) are injected into the system prompt. If no chunks match the instruction, nothing is injected.
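Conceptually, the retrieval query looks something like this (table and column names are hypothetical; the actual schema is internal):

```sql
-- top-ranked chunks for a skill, matched against the task instruction
SELECT chunk_text
FROM skill_document_chunks
WHERE skill_id = $1
  AND search_vector @@ websearch_to_tsquery('english', $2)
ORDER BY ts_rank(search_vector, websearch_to_tsquery('english', $2)) DESC
LIMIT 10;
```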

Limits

Max file size: 2 MB (SKILL_DOC_MAX_SIZE_BYTES)
Max documents per skill: 10 (SKILL_DOC_MAX_PER_SKILL)
Max total size per skill: 10 MB (SKILL_DOC_MAX_TOTAL_BYTES)
Max chunks per task: 10 (RAG_MAX_CHUNKS)
Max chars per task: 30,000 (RAG_MAX_CHARS)

Use cases

Cleanup: Documents and their chunks are automatically deleted when the parent skill is deleted (CASCADE). No manual cleanup needed.