Hermes Agent v0.5.0 brings Hugging Face support, supply chain audit, and 50+ fixes

Hermes Agent v0.5.0 is out. Nous Research calls it the hardening release — and it shows.


Hermes Agent shipped v0.5.0, a release Nous Research describes as focused on security hardening and reliability. The project has also grown from 13,100 to 17,700 GitHub stars since its initial launch in February.

The headline additions:

  • Hugging Face provider: Full integration with the HF Inference API, including a curated model picker and setup wizard. Hermes now supports a second major inference platform alongside OpenRouter.
  • Nous Portal expansion: The Nous Research inference portal now covers 400+ models accessible through a single endpoint.
  • Telegram Private Chat Topics: Project-based conversations with skill binding per topic, letting you run isolated workflows inside a single Telegram chat.
  • Plugin lifecycle hooks: pre_llm_call, post_llm_call, on_session_start, and on_session_end hooks now fire properly in the agent loop.
  • Native Modal SDK: Replaced the swe-rex dependency with the native Modal SDK, removing the need for tunnels.
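The four hook names above come straight from the release notes, but Hermes's actual plugin API isn't documented here. As a rough mental model, here is a minimal sketch of how lifecycle hooks like these typically wire into an agent loop; the Plugin base class, the registration mechanism, and the payloads are illustrative assumptions, not the real Hermes interface.

```python
class Plugin:
    """Hypothetical plugin base class; hook names match the release notes."""
    def on_session_start(self, session): pass
    def pre_llm_call(self, request): pass
    def post_llm_call(self, response): pass
    def on_session_end(self, session): pass

class AuditPlugin(Plugin):
    """Example plugin that records every hook invocation in order."""
    def __init__(self):
        self.events = []
    def on_session_start(self, session):
        self.events.append(("start", session))
    def pre_llm_call(self, request):
        self.events.append(("pre", request))
    def post_llm_call(self, response):
        self.events.append(("post", response))
    def on_session_end(self, session):
        self.events.append(("end", session))

def run_session(plugins, prompts):
    """Toy agent loop firing each hook at the point its name implies."""
    for p in plugins:
        p.on_session_start("sess-1")
    for prompt in prompts:
        for p in plugins:
            p.pre_llm_call(prompt)
        response = f"echo: {prompt}"  # stand-in for the real LLM call
        for p in plugins:
            p.post_llm_call(response)
    for p in plugins:
        p.on_session_end("sess-1")

audit = AuditPlugin()
run_session([audit], ["hello"])
print([name for name, _ in audit.events])  # ['start', 'pre', 'post', 'end']
```

The fix in v0.5.0 was about ordering: hooks of this kind are only useful if they reliably fire at the right points in the loop, which is what "now fire properly" refers to.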

The supply chain work is the most important part of this release. A compromised litellm dependency was removed entirely. All dependency version ranges are now pinned, the lockfile has been regenerated with hashes, and a new CI workflow scans pull requests for supply chain attack patterns.
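For readers unfamiliar with hash-pinned lockfiles: the release notes don't say which tool Hermes uses, but with pip-tools the equivalent flow looks like this (illustrative, not the project's actual commands).

```shell
# Resolve loose ranges in requirements.in into exact versions,
# recording a cryptographic hash for every wheel/sdist:
pip-compile --generate-hashes requirements.in -o requirements.txt

# Installs then refuse any artifact whose hash does not match the
# lockfile, which is what blunts a tampered-package attack:
pip install --require-hashes -r requirements.txt
```

Pinning versions alone stops silent upgrades; the hashes are what stop a compromised package being served under a pinned version number.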

A few other fixes worth knowing about:

  • Anthropic output limits: Replaced a hardcoded 16K token cap with per-model native limits — 128K for Opus 4.6, 64K for Sonnet 4.6.
  • GPT tool use: Added guidance to stop GPT models from describing intended actions instead of actually calling tools.
  • SQLite freeze fix: Fixed a write-lock contention bug that caused 15 to 20 second TUI freezes during sessions.
  • OpenClaw migration: An updated migration skill covers 17 new modules for users switching from OpenClaw to Hermes.
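The release notes don't detail how the SQLite freeze was resolved, but for context, the standard mitigation for write-lock contention in SQLite is WAL journal mode plus a busy timeout: readers proceed while a write is in flight, and a blocked connection waits briefly instead of erroring or, in a UI thread, appearing frozen. A generic sketch, not the actual Hermes fix:

```python
import os
import sqlite3
import tempfile

# WAL mode requires a file-backed database (it is ignored for :memory:).
path = os.path.join(tempfile.mkdtemp(), "sessions.db")

conn = sqlite3.connect(path, timeout=5.0)
# Switch to write-ahead logging: readers no longer block the writer.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
# Wait up to 5s for a lock instead of failing immediately.
conn.execute("PRAGMA busy_timeout=5000")
print(mode)  # 'wal'
conn.close()
```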

Source: GitHub / Nous Research

RunPod

If you need on-demand GPUs for training, fine-tuning, inference, or running open-source models, give RunPod a try.

  • Available hardware: H100, H200, A100, L40S, RTX 4090, RTX 5090, and 30+ more
  • Cost: significantly cheaper than AWS or GCP, billed per second, no contracts
  • Setup: spins up in under a minute, 30+ regions worldwide
Try RunPod →
Affiliate disclosure: We may earn a commission if you sign up via our link, at no extra cost to you.
Efficienist Newsletter

Get the core business tech news delivered straight to your inbox. We track AI, automation, SaaS, and cybersecurity so you don't have to.

Just read what you want, and be done with it.
