Self-Service Database Provisioning
Self-Service Provisioning
Developers should not file tickets to get a database. The platform provides a self-service interface, whether that's a CLI tool, a web portal, or a Kubernetes custom resource. The developer selects a database type (PostgreSQL, MySQL, Redis, MongoDB), picks a size tier, and gets a running instance with credentials delivered to their secrets manager. The whole process takes minutes, not days.
Behind the scenes, the platform uses Terraform, Crossplane, or cloud-native operators to provision the infrastructure. AWS RDS, GCP Cloud SQL, or Azure Database services handle the heavy lifting. The platform team's job is to wrap these with sensible defaults: encryption at rest enabled, automated backups configured, monitoring agents installed, and network policies applied.
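For the Kubernetes custom-resource path, the developer-facing interface might look like the sketch below. The API group, `Database` kind, and field names are hypothetical, invented for illustration; a real platform would define its own schema (or expose a Crossplane composite resource):

```yaml
# Hypothetical platform custom resource -- group, kind, and fields
# are illustrative, not a real Crossplane or operator API.
apiVersion: platform.example.com/v1alpha1
kind: Database
metadata:
  name: orders-db
  namespace: team-checkout
spec:
  engine: postgresql        # postgresql | mysql | redis | mongodb
  tier: small               # platform-defined size tier
  writeCredentialsTo:
    secretName: orders-db-credentials   # delivered to the team's secrets manager
```

The platform controller reconciles this into an RDS/Cloud SQL instance with the encryption, backup, and monitoring defaults already applied.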
Connection Pooling Architecture
PostgreSQL has a process-per-connection model. Each connection consumes about 10MB of memory. A Kubernetes cluster running 200 pods each opening 10 connections means 2,000 connections, which would require 20GB of memory just for connection overhead. Most workloads don't need dedicated connections.
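The arithmetic behind that estimate is worth making explicit; the 10 MB-per-connection figure is a rule of thumb, and real usage varies with `work_mem` and catalog cache size:

```python
# Back-of-envelope PostgreSQL connection overhead.
# 10 MB per backend process is a rough rule of thumb, not a constant.
PODS = 200
CONNS_PER_POD = 10
MB_PER_CONN = 10

total_conns = PODS * CONNS_PER_POD            # 2,000 connections
overhead_gb = total_conns * MB_PER_CONN / 1000

print(f"{total_conns} connections ~ {overhead_gb:.1f} GB of memory")
# prints: 2000 connections ~ 20.0 GB of memory
```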
PgBouncer sits between your application and PostgreSQL. In transaction pooling mode, a connection is assigned to a client only for the duration of a transaction. Between transactions, the connection returns to the pool. This lets 2,000 application connections share 50 actual database connections. Deploy PgBouncer as a sidecar or as a shared service per database cluster.
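A minimal `pgbouncer.ini` matching those numbers might look like this; the host, port, and database names are placeholders:

```ini
[databases]
; placeholder connection target
appdb = host=db-primary.internal port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction      ; connection held only for a transaction
max_client_conn = 2000       ; client-side connections accepted
default_pool_size = 50       ; actual server connections per db/user pair
```

With `pool_mode = transaction`, session-level features like prepared statements and advisory locks need care, since consecutive transactions from one client may land on different server connections.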
ProxySQL does the same for MySQL with additional features like query routing and caching. Redis doesn't need connection pooling in the same way since its single-threaded model handles connections differently, but connection limits still matter.
Schema Migration Tooling
Database schema changes are code. They should be version controlled, reviewed, and applied through an automated pipeline. Flyway uses numbered SQL migration files. Atlas uses declarative schema definitions and generates migration plans automatically. Both integrate with CI/CD.
The workflow: developer writes a migration, opens a PR, CI runs the migration against a test database and validates it. After merge, the deployment pipeline applies the migration to staging, runs integration tests, then applies to production. No human touches the production database directly.
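The CI gate in that workflow might be wired up like the following sketch (GitHub Actions syntax; job names and credentials are illustrative, and it assumes the Flyway CLI is available on the runner):

```yaml
# Illustrative CI job: validate migrations against a throwaway database.
name: migrations
on: pull_request
jobs:
  validate:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: ci-only
        ports: ["5432:5432"]
    steps:
      - uses: actions/checkout@v4
      - name: Apply migrations to the test database
        run: >
          flyway migrate
          -url=jdbc:postgresql://localhost:5432/postgres
          -user=postgres -password=ci-only
```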
Dangerous operations like dropping columns or adding indexes on large tables need special handling. Atlas can detect these and flag them for review. For PostgreSQL, use CREATE INDEX CONCURRENTLY to avoid locking tables. For column drops, first stop reading the column, deploy, then drop it in a follow-up migration.
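Expressed as Flyway-style numbered migration files, the two safe patterns above might look like this (file, table, and column names are illustrative):

```sql
-- V7__add_orders_status_index.sql (illustrative Flyway naming)
-- CONCURRENTLY avoids holding an exclusive lock on a large table.
-- PostgreSQL cannot run it inside a transaction, so the migration
-- tool must be told not to wrap this file in one.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_status
    ON orders (status);

-- V8__drop_orders_legacy_flag.sql
-- Applied only after a deploy has removed every read of the column.
ALTER TABLE orders DROP COLUMN legacy_flag;
```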
Monitoring and Alerting
Every database instance needs baseline monitoring from day one. The critical alert thresholds: CPU utilization above 80%, free disk space below 20%, replication lag above 5 seconds, active connections above 80% of the configured maximum, and queries running longer than 30 seconds. Breaches should fire alerts to the owning team's Slack channel and to PagerDuty.
The platform team sets up the monitoring stack once and applies it automatically to every provisioned database. Prometheus with postgres_exporter for PostgreSQL, mysqld_exporter for MySQL, and redis_exporter for Redis. Grafana dashboards templated per database type give teams instant visibility.
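One of those thresholds, written as a Prometheus alerting rule, might look like the sketch below. The metric name `pg_replication_lag` comes from postgres_exporter but varies by exporter version, so treat it as an assumption:

```yaml
# Prometheus alerting rule sketch; metric name is exporter-dependent.
groups:
  - name: database-baseline
    rules:
      - alert: PostgresReplicationLagHigh
        expr: pg_replication_lag > 5
        for: 2m
        labels:
          severity: page
        annotations:
          summary: "Replication lag above 5s on {{ $labels.instance }}"
```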
Cost Management and Rightsizing
Database instances are the second-largest cloud cost after compute for most organizations. The platform should track per-team database costs and surface rightsizing recommendations. An RDS instance running at 5% CPU for three months should be flagged for downsizing. Development databases should be automatically stopped outside business hours, saving 65% on those instances.
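The off-hours savings figure follows from simple schedule arithmetic, assuming a 12-hour weekday business window (an assumption; adjust for your organization):

```python
# Savings from stopping a dev database outside business hours,
# assuming a 12h x 5-day window.
HOURS_PER_WEEK = 24 * 7    # 168
BUSINESS_HOURS = 12 * 5    # 60

savings = 1 - BUSINESS_HOURS / HOURS_PER_WEEK
print(f"Instance-hours saved: {savings:.0%}")
# prints: Instance-hours saved: 64%
```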
Reserved instances for stable production workloads cut costs by 30-60% compared to on-demand. The platform team should handle RI purchasing centrally based on aggregate usage across all teams.
Key Points
- Self-service database provisioning should take under 5 minutes from request to a running instance with the connection string delivered
- PgBouncer in transaction pooling mode can reduce PostgreSQL connection count from 2,000 application connections to 50 database connections
- Automated daily backups with point-in-time recovery are the minimum; test restores monthly, because untested backups are not backups
- Schema migration tooling like Atlas or Flyway should be integrated into CI/CD so database changes go through the same review process as code
- Multi-tenant shared instances save 60-80% on costs but require strict resource isolation through connection limits, statement timeouts, and schema separation
Common Mistakes
- Letting every team run its own database without centralized backup verification, monitoring, or security patching
- Using shared connection pools across all services without per-service connection limits, allowing one misbehaving service to exhaust the pool
- Skipping schema migration tooling and letting teams apply DDL changes manually in production through direct database access
- Not setting up automated alerting for disk space, replication lag, and long-running queries before they cause outages