Install the Agent

The K8Cost agent is a Kubernetes CronJob that collects resource metrics from your cluster and pushes them to the K8Cost API for analysis. It is the recommended way to connect clusters.

What the Agent Does

On each run, the agent:

  1. Collects resource data via the Kubernetes API -- pods, nodes, namespaces, deployments, StatefulSets, DaemonSets, CronJobs, HPAs, PDBs, PVCs, services, and resource quotas
  2. Gathers metrics from the Metrics Server (if available) for actual CPU and memory usage
  3. Evaluates 65+ optimization rules locally using cloud-provider-specific pricing
  4. Uploads results to the K8Cost API in compressed batches
  5. Sends heartbeats so the dashboard reflects real-time sync status

The entire process typically completes in 1-3 minutes for clusters with up to 500 pods.
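As an illustration of step 4, the sketch below splits collected records into batches and gzip-compresses each one before upload. It is not the agent's actual code: the record shape, the JSON-plus-gzip wire format, and the helper names (`batches`, `encode_batch`, `decode_batch`) are assumptions made for the example.

```python
import gzip
import json
from typing import Iterator


def batches(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def encode_batch(batch: list) -> bytes:
    """Serialize one batch as JSON and gzip it (assumed wire format)."""
    return gzip.compress(json.dumps(batch).encode("utf-8"))


def decode_batch(payload: bytes) -> list:
    """Inverse of encode_batch, useful for checking the round trip."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))


# Hypothetical pod records; the real payload schema is not documented here.
pods = [{"name": f"pod-{i}", "cpu_request_m": 100} for i in range(1200)]

# 1200 records in batches of 500 give three payloads (500, 500, 200).
payloads = [encode_batch(batch) for batch in batches(pods, 500)]
```

Each payload would then be sent to the K8Cost API over HTTPS. Compressing per batch keeps any single request small and lets a failed upload be retried without resending the whole run.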

What the Agent Collects

| Resource | API Group | Data Points |
|---|---|---|
| Pods | core/v1 | Spec, resource requests/limits, status, labels, annotations |
| Nodes | core/v1 | Capacity, allocatable, conditions, labels |
| Namespaces | core/v1 | Labels, annotations, resource quotas |
| Deployments | apps/v1 | Replicas, strategy, selectors |
| StatefulSets | apps/v1 | Replicas, volume claim templates |
| DaemonSets | apps/v1 | Node selectors, tolerations |
| HPAs | autoscaling/v2 | Min/max replicas, metrics, current status |
| PVCs | core/v1 | Requested storage, access modes, storage class |
| PDBs | policy/v1 | Min available, max unavailable |
| Pod Metrics | metrics.k8s.io | Current CPU and memory usage |

The agent only uses get, list, and watch verbs. It never modifies any resources in your cluster.
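To make the read-only guarantee concrete, the sketch below builds a ClusterRole manifest that grants only get, list, and watch on the resources listed in the table, then checks that no other verb appears. The role name `k8cost-agent-readonly` and the exact rule layout are assumptions. The manifest shipped with the agent may differ, and it may cover further resources such as services or CronJobs.

```python
import json

READ_ONLY_VERBS = ["get", "list", "watch"]

# API group -> resource names, following the table above.
# The empty string is how RBAC rules refer to the core ("v1") group.
RESOURCES_BY_GROUP = {
    "": ["pods", "nodes", "namespaces", "persistentvolumeclaims"],
    "apps": ["deployments", "statefulsets", "daemonsets"],
    "autoscaling": ["horizontalpodautoscalers"],
    "policy": ["poddisruptionbudgets"],
    "metrics.k8s.io": ["pods"],
}


def build_cluster_role(name: str) -> dict:
    """Return a ClusterRole manifest with one read-only rule per API group."""
    rules = [
        {"apiGroups": [group], "resources": resources, "verbs": READ_ONLY_VERBS}
        for group, resources in RESOURCES_BY_GROUP.items()
    ]
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": rules,
    }


def non_read_verbs(role: dict) -> set:
    """Collect any verb in the role that is not get, list, or watch."""
    found = set()
    for rule in role["rules"]:
        found.update(v for v in rule["verbs"] if v not in READ_ONLY_VERBS)
    return found


role = build_cluster_role("k8cost-agent-readonly")  # hypothetical name
manifest_json = json.dumps(role, indent=2)  # kubectl accepts JSON as well as YAML
```

Running `non_read_verbs` against the manifest you actually deploy is a quick way to audit that it contains no create, update, patch, or delete permissions.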

Security Model

  • Read-only RBAC -- The agent's ClusterRole grants only read access. It cannot create, update, or delete any cluster resources.
  • No kubeconfig export -- Credentials stay inside the cluster. The agent uses the in-cluster ServiceAccount token.
  • Outbound-only traffic -- The agent pushes data to the K8Cost API over HTTPS. No inbound connections to your cluster are required.
  • API key authentication -- The agent authenticates with a revocable API token. Rotate or delete it at any time from the dashboard.
  • No persistent storage required -- The agent stores a small state file for incremental sync, but functions fully without it.

Resource Requirements

The agent runs as a CronJob with sensible defaults:

| Resource | Request | Limit |
|---|---|---|
| CPU | 200m | 1 core |
| Memory | 256 Mi | 1 Gi |

For large clusters (1000+ pods), consider increasing the memory limit to 2 Gi. The agent processes pods in configurable batch sizes (default 500) with parallel workers (default 10 concurrent namespace collectors).
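The parallel namespace collectors can be pictured as a bounded worker pool. The sketch below is an illustration only: `collect_namespace` is a placeholder that fabricates records instead of calling the Kubernetes API, and the use of a thread pool is an assumption about how the concurrency limit might be applied.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 10  # default number of concurrent namespace collectors


def collect_namespace(namespace: str) -> list:
    """Placeholder collector; a real one would list pods through the Kubernetes API."""
    return [{"namespace": namespace, "pod": f"{namespace}-pod-{i}"} for i in range(3)]


def collect_all(namespaces: list, max_workers: int = MAX_WORKERS) -> list:
    """Collect every namespace with at most `max_workers` collectors running at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the order of `namespaces`.
        per_namespace = list(pool.map(collect_namespace, namespaces))
    return [pod for pods in per_namespace for pod in pods]


records = collect_all(["default", "kube-system", "payments"])
```

Capping the pool bounds both memory use and the request rate against the API server, which is why raising the worker count on a large cluster usually goes together with a higher memory limit.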

Schedule

The default schedule is 0 * * * * (every hour). You can adjust this based on how frequently your cluster changes:

  • Production clusters with frequent deployments: every 30 minutes (*/30 * * * *)
  • Stable production clusters: every 6 hours (0 */6 * * *)
  • Development/staging clusters: once a day at 08:00 (0 8 * * *)
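To compare these schedules, the sketch below counts how many runs per day a cron expression produces. It is a deliberately minimal parser written for this page, not a full cron implementation: it understands only `*`, `*/n`, plain numbers, and comma lists in the minute and hour fields, and rejects anything that does not run every day.

```python
def expand_field(field: str, low: int, high: int) -> set:
    """Expand one cron field into the set of values it matches."""
    values = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(low, high + 1))
        elif part.startswith("*/"):
            step = int(part[2:])
            values.update(range(low, high + 1, step))
        else:
            values.add(int(part))
    return values


def runs_per_day(schedule: str) -> int:
    """Count daily runs for a five-field schedule that fires every day."""
    minute, hour, day_of_month, month, day_of_week = schedule.split()
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("this sketch only handles schedules that run every day")
    return len(expand_field(minute, 0, 59)) * len(expand_field(hour, 0, 23))


for schedule in ["0 * * * *", "*/30 * * * *", "0 */6 * * *", "0 8 * * *"]:
    print(f"{schedule!r}: {runs_per_day(schedule)} runs/day")
# '0 * * * *': 24 runs/day
# '*/30 * * * *': 48 runs/day
# '0 */6 * * *': 4 runs/day
# '0 8 * * *': 1 runs/day
```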

The CronJob uses concurrencyPolicy: Forbid to prevent overlapping runs and has an activeDeadlineSeconds of 1800 (30 minutes).
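Put together, the settings described on this page map onto a CronJob manifest roughly as sketched below. The object name, ServiceAccount name, and image reference are placeholders. Placing `activeDeadlineSeconds` on the Job spec rather than the pod spec is an assumption, since the text does not say which level the agent uses.

```python
import json


def build_cronjob(schedule: str = "0 * * * *", memory_limit: str = "1Gi") -> dict:
    """Return a CronJob manifest using the defaults documented above."""
    container = {
        "name": "agent",
        "image": "example.invalid/k8cost-agent:latest",  # placeholder image
        "resources": {
            "requests": {"cpu": "200m", "memory": "256Mi"},
            "limits": {"cpu": "1", "memory": memory_limit},
        },
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": "k8cost-agent"},  # hypothetical name
        "spec": {
            "schedule": schedule,
            "concurrencyPolicy": "Forbid",  # skip a run if the previous one is still active
            "jobTemplate": {
                "spec": {
                    "activeDeadlineSeconds": 1800,  # assumed to sit at Job level
                    "template": {
                        "spec": {
                            "serviceAccountName": "k8cost-agent",  # hypothetical
                            "restartPolicy": "Never",
                            "containers": [container],
                        }
                    },
                }
            },
        },
    }


# Example: a large cluster on a 30-minute schedule with the 2 Gi memory limit.
cronjob = build_cronjob(schedule="*/30 * * * *", memory_limit="2Gi")
manifest_json = json.dumps(cronjob, indent=2)  # kubectl accepts JSON as well as YAML
```

With `Forbid`, a run that is still active when the next tick arrives causes that tick to be skipped, so the deadline should stay comfortably below the schedule interval when you shorten it.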

Next Steps