Score your codebase across eight dimensions.
A weighted scorecard that turns intuition into a defensible baseline. Score each dimension 1 to 5 against the rubric, see your weighted grade, identify the dimension dragging the score down, and export results for tickets or presentations.
- **Architecture:** Clear bounded contexts, some seam coupling between modules
- **Code Quality:** Reasonable structure, code reviews catch most issues
- **Test Coverage:** 30 to 60% coverage, critical paths tested, some flakiness
- **Dependency Health:** Most dependencies current, monthly review cadence
- **Documentation:** Architecture and onboarding docs, runbooks for top incidents
- **Infrastructure & CI/CD:** Pipeline-based deploys, environments mostly aligned
- **Security Posture:** SAST in CI, secret scanning enabled, auth patterns documented
- **Observability:** Metrics on golden signals, structured logs, some tracing
All scoring runs in your browser. No data leaves the page.
What each dimension covers.
The rubric in the scorecard above gives you the level descriptions. This table shows how the eight dimensions map to your day-to-day operational categories.
| Dimension | Weight | Focus |
|---|---|---|
| Architecture | 20% | Module boundaries, coupling, evolvability |
| Code Quality | 15% | Duplication, complexity, naming, function length |
| Test Coverage | 15% | Unit, integration, e2e, critical path coverage |
| Dependency Health | 10% | Currency, vulnerability count, upgrade readiness |
| Documentation | 10% | API docs, ADRs, runbooks, onboarding speed |
| Infrastructure & CI/CD | 10% | Deploy automation, environment parity, rollback |
| Security Posture | 10% | Scanning, secrets, auth, threat modelling |
| Observability | 10% | Metrics, tracing, alerting, SLOs |
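The weighted total is a straightforward dot product of scores and weights. A minimal sketch, using the weights from the table above and invented example scores (the function name and input shape are illustrative, not the scorecard's actual code):

```python
# Weights from the table above; they sum to 1.0.
WEIGHTS = {
    "Architecture": 0.20,
    "Code Quality": 0.15,
    "Test Coverage": 0.15,
    "Dependency Health": 0.10,
    "Documentation": 0.10,
    "Infrastructure & CI/CD": 0.10,
    "Security Posture": 0.10,
    "Observability": 0.10,
}

def weighted_score(scores):
    """Combine 1-to-5 dimension scores into one weighted 1-to-5 score."""
    return sum(WEIGHTS[dim] * score for dim, score in scores.items())

# Example inputs only -- score your own codebase against the rubric.
scores = {
    "Architecture": 4, "Code Quality": 3, "Test Coverage": 1,
    "Dependency Health": 3, "Documentation": 2,
    "Infrastructure & CI/CD": 3, "Security Posture": 3, "Observability": 2,
}
print(round(weighted_score(scores), 2))  # -> 2.7
```

Note how the example lands at 2.7 overall even though Test Coverage is a 1: the weighted total gives the grade, while the per-dimension breakdown points at what to fix first.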
Reading the grade.
The weighted score maps to a five-band grade. Each grade has a recommended action:

- Maintain. Allocate minimal debt time.
- Boy Scout rule. Quarterly retrospective.
- Allocate 20% sprint capacity to debt.
- Dedicated debt sprints. Set ceiling.
- Pause new features. Triage by risk.
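The band lookup can be sketched as a simple threshold table. The five actions come from the list above; the numeric cut-offs here are illustrative assumptions, not the scorecard's actual thresholds:

```python
# (cut-off, action) pairs, highest band first.
# The cut-off values are assumed for illustration only.
BANDS = [
    (4.5, "Maintain. Allocate minimal debt time."),
    (3.5, "Boy Scout rule. Quarterly retrospective."),
    (2.5, "Allocate 20% sprint capacity to debt."),
    (1.5, "Dedicated debt sprints. Set ceiling."),
    (0.0, "Pause new features. Triage by risk."),
]

def grade_action(score):
    """Return the recommended action for a weighted 1-to-5 score."""
    for cutoff, action in BANDS:
        if score >= cutoff:
            return action
    return BANDS[-1][1]

print(grade_action(2.7))  # -> "Allocate 20% sprint capacity to debt."
```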
Common questions.
Why eight dimensions instead of one number?
A single TDR or SQALE rating tells you the magnitude of the problem but not where it lives. The eight dimensions on this page map to the operational categories that engineering leaders actually budget against. Knowing your codebase scores 2.3 overall is less useful than knowing your test coverage is a 1 and your architecture is a 4. The weighted total gives you the grade. The dimension breakdown tells you what to fix first.
How are the weights set?
Weights reflect typical business impact. Architecture is highest (20%) because architectural debt has the largest blast radius and is the hardest to remediate incrementally. Code quality and test coverage are 15% each because they affect every change. The remaining five dimensions sit at 10% each. Override the weights for your context if security or observability matters more than average for your business.
How does this compare to SQALE or TDR?
SQALE and TDR measure code-level debt, scoped to what static analysis can see. This assessment is broader: it captures architectural, process, and operational debt that static analysis misses. Best practice is to use both: SQALE or TDR for the precise code-quality figure, and this assessment for the full picture you present to leadership.
Should I score honestly or aspirationally?
Honestly. The point of the assessment is to identify your weakest dimension so you can prioritise. Inflating scores to look good defeats the purpose. If you are unsure between two scores, pick the lower one. The conservative version gives you a more defensible baseline to track improvement against.
Can I run this for multiple systems?
Yes, and at portfolio scale this becomes the input to enterprise prioritisation. Run the assessment per system, capture the dimension scores, and feed them into the portfolio matrix on the enterprise page. The combination produces a system level prioritisation list with action recommendations.
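At portfolio scale the per-system results become a ranking problem. A minimal sketch, assuming the weights from the table above; the system names and scores are invented for illustration:

```python
# Weights from the dimension table on this page.
WEIGHTS = {
    "Architecture": 0.20, "Code Quality": 0.15, "Test Coverage": 0.15,
    "Dependency Health": 0.10, "Documentation": 0.10,
    "Infrastructure & CI/CD": 0.10, "Security Posture": 0.10,
    "Observability": 0.10,
}

def weighted_score(scores):
    return sum(WEIGHTS[d] * s for d, s in scores.items())

# Invented example systems -- replace with your own assessment results.
portfolio = {
    "billing": {"Architecture": 2, "Code Quality": 2, "Test Coverage": 1,
                "Dependency Health": 3, "Documentation": 2,
                "Infrastructure & CI/CD": 3, "Security Posture": 3,
                "Observability": 2},
    "search":  {"Architecture": 4, "Code Quality": 4, "Test Coverage": 3,
                "Dependency Health": 4, "Documentation": 3,
                "Infrastructure & CI/CD": 4, "Security Posture": 4,
                "Observability": 3},
}

# Lowest weighted score first = highest remediation priority.
for name, scores in sorted(portfolio.items(),
                           key=lambda kv: weighted_score(kv[1])):
    weakest = min(scores, key=scores.get)
    print(f"{name}: {weighted_score(scores):.2f} (weakest: {weakest})")
```

Sorting by weighted score orders the backlog, and flagging each system's weakest dimension tells you which kind of debt work to schedule for it.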