Cost Management Platform
FinOps Integration Into Platform Tooling
FinOps is the practice of bringing financial accountability to cloud spending. The platform team is uniquely positioned to implement this because they control the infrastructure layer where costs originate. The goal is not to cut costs blindly. It's to make cost a first-class engineering metric alongside latency, availability, and developer productivity.
Start with visibility. If teams don't know what they spend, they can't optimize. Enforce a tagging standard: every cloud resource gets tagged with team, service, and environment. Every Kubernetes namespace gets cost allocation labels. This takes months to retrofit but is the foundation everything else depends on.
Cost Visibility Tools
Kubecost runs inside your Kubernetes cluster and allocates costs to namespaces, deployments, and individual pods. It combines actual cloud pricing data with observed resource utilization to calculate real costs, not just requested resources. The open source version handles single-cluster visibility; the enterprise version adds multi-cluster aggregation and longer retention of the Prometheus metrics both editions rely on.
CAST AI goes further by actively optimizing. It analyzes workload requirements and automatically selects the cheapest instance types, moves workloads to spot instances, and right-sizes node pools. Companies report 50-70% savings, though aggressive automation requires trust in the tool.
Infracost shifts cost visibility left into the development workflow. It comments on pull requests with the cost impact of Terraform changes. A PR that provisions an r5.4xlarge instead of an r5.xlarge gets an automated comment showing the roughly $550/month difference (us-east-1 on-demand pricing). Engineers make cost-aware decisions before changes reach production.
Per-Team Cost Attribution
The attribution model matters as much as the tooling. Label every resource with the owning team. For Kubernetes, use namespace-level labels that Kubecost reads. For cloud resources, enforce tags through Terraform modules and OPA policies that reject untagged resources.
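The enforcement check can be sketched as a small validator. This is a hypothetical stand-in for the OPA policy described above (a real deployment would express it in Rego or Terraform validation); `REQUIRED_TAGS` and the resource shape are assumptions:

```python
# Hypothetical stand-in for an OPA admission policy that rejects
# untagged resources. Real enforcement would live in Rego or Terraform.
REQUIRED_TAGS = {"team", "service", "environment"}

def missing_tags(resource: dict) -> set:
    """Return the set of required tags absent from a resource."""
    return REQUIRED_TAGS - set(resource.get("tags", {}))

def admit(resource: dict) -> bool:
    """Admit a resource only if every required tag is present and non-empty."""
    tags = resource.get("tags", {})
    return not missing_tags(resource) and all(tags[t] for t in REQUIRED_TAGS)
```

Rejecting at admission time, rather than reporting after the fact, is what makes the tagging standard stick: an untagged resource never reaches the bill.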
Build a weekly cost report per team showing: total spend, week-over-week change, top 5 cost drivers, and rightsizing recommendations. Distribute this through Slack or email. The simple act of making costs visible reduces waste by 15-20% without any forced optimization.
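The report's arithmetic is simple enough to sketch directly. This assumes per-resource spend has already been exported as name-to-dollar maps; the function name and input shape are hypothetical:

```python
def weekly_report(this_week: dict, last_week: dict, top_n: int = 5) -> dict:
    """Summarize a team's spend: total, week-over-week change, top drivers.

    this_week / last_week map resource name -> dollar spend
    (hypothetical shape; a real pipeline would pull this from
    Kubecost or the cloud billing export).
    """
    total = sum(this_week.values())
    prev = sum(last_week.values())
    wow_pct = (total - prev) / prev * 100 if prev else None
    drivers = sorted(this_week.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    return {"total": total, "wow_pct": wow_pct, "top_drivers": drivers}
```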
For shared resources (databases, message queues, API gateways), attribute costs proportionally based on usage metrics. A team that sends 60% of the messages through Kafka pays for 60% of the Kafka cluster cost. This prevents the "tragedy of the commons" where shared resources grow unchecked because no team owns the cost.
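The proportional split can be sketched as follows (a minimal sketch; the even-split fallback for zero usage is an assumption, not a rule from the text):

```python
def attribute_shared_cost(total_cost: float, usage_by_team: dict) -> dict:
    """Split a shared resource's cost in proportion to each team's usage."""
    total_usage = sum(usage_by_team.values())
    if total_usage == 0:
        # Assumed fallback: no recorded usage, split evenly so the
        # cost is still owned by someone rather than left unattributed.
        share = total_cost / len(usage_by_team)
        return {team: share for team in usage_by_team}
    return {team: total_cost * u / total_usage
            for team, u in usage_by_team.items()}
```

For the Kafka example above: a team producing 60% of the messages is attributed 60% of the cluster cost.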
Resource Rightsizing
Most cloud workloads are over-provisioned by 40-60%. Engineers pick instance sizes based on worst-case estimates and never revisit the decision. Automated rightsizing changes this.
The process: collect 2 weeks of CPU and memory utilization data. Flag any resource where peak utilization is below 40% of allocated capacity. Generate a recommendation to downsize. For Kubernetes, this means adjusting resource requests and limits. For cloud instances, this means changing instance types.
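The flagging step can be sketched as below. The record shape (raw utilization samples plus allocated capacity) is an assumption; in practice the samples would come from two weeks of Prometheus or CloudWatch data:

```python
def rightsizing_candidates(utilization: dict, threshold: float = 0.40) -> list:
    """Flag resources whose peak utilization over the collection window
    stays below `threshold` of allocated capacity.

    utilization maps resource name -> {"samples": [...], "allocated": n}
    (hypothetical shape).
    """
    flagged = []
    for name, rec in utilization.items():
        peak_ratio = max(rec["samples"]) / rec["allocated"]
        if peak_ratio < threshold:
            flagged.append((name, round(peak_ratio, 2)))
    return flagged
```

Using the peak rather than the average keeps the recommendation conservative: a resource is only flagged when even its busiest moment fits comfortably in a smaller allocation.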
Goldilocks (by Fairwinds) runs inside Kubernetes and recommends resource requests based on actual usage observed by the Vertical Pod Autoscaler. It generates per-container recommendations that teams can review and apply.
Show-Back vs Charge-Back
Show-back means reporting costs to teams without actually billing them. Charge-back means deducting infrastructure costs from team budgets. Most organizations should start with show-back. It creates awareness and drives optimization without the organizational overhead of internal billing systems, transfer pricing, and budget negotiations.
The platform team publishes a cost dashboard showing each team's monthly cloud spend, broken down by service and resource type. Teams that spend significantly more than comparable teams are encouraged to optimize. This peer comparison is surprisingly effective. Nobody wants to be the team spending 3x the average per request.
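The normalized peer comparison behind that "3x the average" figure can be sketched as (hypothetical input shapes; real request counts would come from your metrics platform):

```python
from statistics import mean

def peer_comparison(spend: dict, requests: dict) -> dict:
    """Compare each team's cost per request to the fleet average.

    spend maps team -> monthly dollars; requests maps team -> request
    volume over the same period (assumed shapes). Returns each team's
    per-request cost as a multiple of the average, e.g. 1.5 = 50% above.
    """
    per_req = {team: spend[team] / requests[team] for team in spend}
    avg = mean(per_req.values())
    return {team: round(v / avg, 2) for team, v in per_req.items()}
```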
Move to charge-back only if show-back doesn't drive sufficient optimization after 6-12 months, or if your organization's finance team requires it for budget accuracy.
Cost Anomaly Detection
Set up automated alerts for spending anomalies. AWS Cost Anomaly Detection handles AWS resources natively. For multi-cloud or Kubernetes costs, build alerts in Kubecost or your monitoring platform. Alert when daily spend exceeds the trailing 7-day average by more than 30%. Alert when a single service's cost doubles week-over-week.
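The two alert rules can be sketched as follows, assuming daily and weekly totals are already aggregated (function names and input shapes are hypothetical):

```python
from statistics import mean

def daily_anomaly(daily_spend: list, threshold: float = 0.30) -> bool:
    """Alert when today's spend exceeds the trailing 7-day average
    by more than `threshold` (30% by default).

    daily_spend: daily totals, oldest first, with today as the last
    element; only the final 8 days are used (assumed shape).
    """
    *trailing, today = daily_spend[-8:]
    baseline = mean(trailing)
    return today > baseline * (1 + threshold)

def weekly_doubling(this_week: float, last_week: float) -> bool:
    """Alert when a single service's weekly cost has at least doubled."""
    return last_week > 0 and this_week >= 2 * last_week
```

Wired into a daily cron against the billing export, rules this simple are enough to catch the weekend-load-test class of incident within a day rather than at month end.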
The most common anomalies: a load test left running over the weekend ($15k in 48 hours), an autoscaler scaling to maximum and not scaling back down, a misconfigured logging pipeline sending 10x normal volume to Datadog, and forgotten development resources running 24/7 that should stop outside business hours.
Key Points
- Per-team cost attribution through Kubernetes labels and cloud resource tags makes spending visible and creates accountability without blame
- Kubecost provides real-time cost allocation at the pod and namespace level, showing exactly which team and service drives each dollar of spend
- Spot instances save 60-90% on compute but require workloads designed for interruption with proper pod disruption budgets and graceful shutdown
- Show-back models (reporting costs without charging) change behavior almost as effectively as charge-back (actual internal billing) with far less organizational friction
- Cost anomaly detection catches runaway spending within hours instead of discovering a $50k surprise on the monthly bill
Common Mistakes
- ✗ Treating cost optimization as a quarterly project instead of building automated guardrails that continuously right-size resources
- ✗ Buying reserved instances based on current usage without accounting for planned migrations, architecture changes, or workload growth
- ✗ Optimizing compute costs while ignoring data transfer, storage, and managed service costs, which often represent 40-60% of the total bill