Building and Scaling High-Performing Teams Using DORA Metrics
As software engineering teams scale, fundamental software development and delivery processes become more complex, and it becomes challenging to collect events and track delivery performance on existing systems. This prevents you from making sound engineering decisions, identifying performance bottlenecks, and effectively measuring developer productivity. CTO.ai's Insights is an easy-to-use dashboard, computed using DORA metrics, that gives you usable information about your organization's performance and usage. Insights aggregates data across your workloads on CTO.ai and helps you visualize and understand how it changes over time. Using Insights, you can:
- Identify your organization's elite performers.
- Understand what makes a high-performing team different from a low-performing team.
- See what low-performing teams should focus on.
This tutorial will teach you how to track your software delivery process using DORA Metrics.
Prerequisites
- CTO.ai Account
- CTO.ai CLI Installed
- A sample GitHub repository containing source code, on which you have admin permissions
Guide
DORA metrics make it easy to measure performance and provide crucial insight into areas for growth. The four metrics are:
- Lead time for changes
- Deployment frequency
- Mean time to recovery
- Change failure rate
1. Sign up or log in to your CTO.ai account.
2. In your CTO.ai dashboard, click on Insights. You will see the Insights overview page.
- Click on Set up Insights. It redirects you to the setup page, where you install the CTO.ai GitHub App.
3. Next, install the GitHub App and select the account you want to configure it on.
- You can install the GitHub App on all repositories or only a select few. The repository can contain any source code, since we will add a YAML file to it to collect events. When you are done, click on Install & Authorize.
4. In your repository on GitHub, add the following GitHub Actions configuration. It automatically collects events whenever you push changes or open a pull request on any branch, and it captures your delivery process wherever you deploy, on production or staging.
- Create a new file in the .github/workflows folder and push this configuration there.
name: Deployment sample
on: [push, pull_request, release]
jobs:
  build-and-deploy:
    name: Build and deploy
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # Add your deployment steps here
      # Runs only if every previous step succeeded
      - name: Report Deployment Success
        if: ${{ success() }}
        uses: cto-ai/action@v1.2
        id: ctoai-deployment-succeeded
        with:
          team_id: ${{ secrets.CTOAI_TEAM }}
          token: ${{ secrets.CTOAI_TOKEN }}
          event_name: "deployment"
          event_action: "succeeded"
      # Runs only if a previous step failed
      - name: Report Deployment Failure
        if: ${{ failure() }}
        uses: cto-ai/action@v1.2
        id: ctoai-deployment-failed
        with:
          team_id: ${{ secrets.CTOAI_TEAM }}
          token: ${{ secrets.CTOAI_TOKEN }}
          event_name: "deployment"
          event_action: "failed"
5. Lead Time for Changes: Once the GitHub App is installed on your repository, push new changes or open a pull request on your GitHub repo. The resulting data then appears as Lead Time for Changes on your CTO.ai Insights dashboard.
Lead Time for Changes measures how long it takes a change to reach production, and it tracks change cycles based on your pull requests. It is measured from the moment you start working on a change to the instant it is deployed to production, which helps you understand each stage of your workflow and how the stages interact. You can break the time into smaller increments, for example:
- The time required for you to work on a change.
- The time your deployment process takes to push a change all the way out to production.
This will help your team know what takes the most time and allow them to work on optimizing it.
Lead Time for Changes matters on the CTO.ai dashboard because it measures how quickly your team can respond to changing conditions, events, or needs. For example, if you are working on a new feature or a small improvement in your application, how quickly can you deliver it?
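To make the breakdown concrete, here is a minimal Python sketch that splits the lead time of a single change into the two increments described above. The class `ChangeTimeline` and its fields `first_commit_at`, `merged_at`, and `deployed_at` are illustrative assumptions, not part of the CTO.ai API, and Insights may compute the metric differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ChangeTimeline:
    """Hypothetical timestamps for one change (not a CTO.ai data model)."""
    first_commit_at: datetime  # when work on the change started
    merged_at: datetime        # when the pull request was merged
    deployed_at: datetime      # when the change reached production

    def work_time(self) -> timedelta:
        """Time spent working on the change before it was merged."""
        return self.merged_at - self.first_commit_at

    def deploy_time(self) -> timedelta:
        """Time the deployment process took after the merge."""
        return self.deployed_at - self.merged_at

    def lead_time(self) -> timedelta:
        """Total lead time: start of work to production."""
        return self.deployed_at - self.first_commit_at


change = ChangeTimeline(
    first_commit_at=datetime(2023, 1, 9, 9, 0),
    merged_at=datetime(2023, 1, 10, 15, 0),
    deployed_at=datetime(2023, 1, 10, 16, 30),
)
print(change.work_time())    # 1 day, 6:00:00
print(change.deploy_time())  # 1:30:00
print(change.lead_time())    # 1 day, 7:30:00
```

In this made-up example, almost all of the lead time is spent before the merge, so that is where optimization effort would pay off first.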
6. Deployment Frequency: Deployment frequency measures how often you get changes to production. To generate data, push more changes toward production: edit your source code, open a pull request for the change, and merge it in your repository. Events related to your deployments are then sent through the GitHub Insights integration. We use the Deployment Succeeded and Deployment Failure statuses to calculate your deployment events.
Back in the CTO.ai dashboard, you can see the Deployment Frequency.
Deployment frequency counts how many times you push to production. The aim is to ship small changes often, and tracking deployment frequency with CTO.ai Insights helps you move toward deploying at regular intervals throughout the day.
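As a rough illustration of what counting pushes to production looks like, the following Python sketch counts deployments per day and averages them over a date range. The function names and the sample dates are assumptions made for this example; they do not come from CTO.ai.

```python
from collections import Counter
from datetime import date


def deployments_per_day(deploy_dates):
    """Count deployments for each calendar day."""
    return Counter(deploy_dates)


def average_daily_frequency(deploy_dates, start, end):
    """Average deployments per day over the inclusive range start..end."""
    days = (end - start).days + 1
    in_range = [d for d in deploy_dates if start <= d <= end]
    return len(in_range) / days


# Hypothetical production deployment dates for one working week.
deploys = [
    date(2023, 1, 9),
    date(2023, 1, 9),
    date(2023, 1, 10),
    date(2023, 1, 12),
]

per_day = deployments_per_day(deploys)
print(per_day[date(2023, 1, 9)])  # 2
print(average_daily_frequency(deploys, date(2023, 1, 9), date(2023, 1, 13)))  # 0.8
```

Days with no deployment still count toward the average, which is why the range is passed explicitly instead of being inferred from the data.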
7. Deployment Failure Rate: The deployment failure rate (change failure rate) is the ratio of failed deployments to total deployments. To see it, send a deployment event with the failed event_action. You can send it manually, or trigger it automatically when your CI/CD build fails or when a defect is detected in your production environment. In our GitHub repo, we ran a new pipeline that reported the failed event_action when the pipeline broke.
Back in the CTO.ai Insights dashboard, you can see the deployment failure rate change.
CTO.ai Insights helps you prevent failures: it shows you the impact of each change and creates a feedback loop, so the same incidents don't happen again.
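The calculation can be sketched in a few lines of Python, taking the rate as failed deployments divided by all deployments. The `"succeeded"` and `"failed"` strings match the event_action values used in the workflow file earlier; the function name and the sample list are assumptions for illustration only.

```python
def change_failure_rate(event_actions):
    """Fraction of deployments whose event_action is "failed".

    Returns 0.0 when there are no deployments, to avoid dividing by zero.
    """
    total = len(event_actions)
    if total == 0:
        return 0.0
    failed = sum(1 for action in event_actions if action == "failed")
    return failed / total


# Hypothetical sequence of reported deployment events.
actions = ["succeeded", "succeeded", "failed", "succeeded"]
print(change_failure_rate(actions))  # 0.25
```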
8. Mean Time to Recovery: As you saw in step 7, our pipeline failed and the event was automatically reported in our CTO.ai dashboard. Once we fix the issue and the pipeline recovers, the new change is reported under Mean Time to Recovery in the CTO.ai dashboard.
Before you can measure mean time to recovery, you must first detect your problems. Once you have detected a problem, how quickly you can ship or deploy a fix for it determines your Mean Time to Recovery.
CTO.ai lets you automate and deploy your changes faster, so you don’t have to worry about incidents and troubleshooting issues.
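One simple way to reason about this metric is to measure, for each outage, the time from the first failed deployment to the next successful one, and then average those durations. The Python sketch below does that under this assumed definition; the function name and sample events are hypothetical, and CTO.ai Insights may define recovery differently.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(events):
    """Average time from the first failure of an outage to the next success.

    `events` is a list of (timestamp, action) pairs, where action is
    "failed" or "succeeded". Returns None if no recovery was observed.
    """
    recoveries = []
    failed_since = None
    for timestamp, action in sorted(events, key=lambda e: e[0]):
        if action == "failed" and failed_since is None:
            failed_since = timestamp  # outage starts at the first failure
        elif action == "succeeded" and failed_since is not None:
            recoveries.append(timestamp - failed_since)
            failed_since = None       # outage is over
    if not recoveries:
        return None
    return sum(recoveries, timedelta()) / len(recoveries)


# Hypothetical deployment events: two outages, lasting 60 and 30 minutes.
events = [
    (datetime(2023, 1, 9, 10, 0), "failed"),
    (datetime(2023, 1, 9, 10, 20), "failed"),
    (datetime(2023, 1, 9, 11, 0), "succeeded"),
    (datetime(2023, 1, 9, 14, 0), "failed"),
    (datetime(2023, 1, 9, 14, 30), "succeeded"),
]
print(mean_time_to_recovery(events))  # 0:45:00
```

Repeated failures within the same outage do not restart the clock, so the second failure at 10:20 is ignored when measuring the first recovery.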
Deployment Status
After the Overview dashboard, you can take a look at your Deployment Status. With it, you can track deployment activity for each project and repo on CTO.ai, compare successful and failed deployments, see your event success rates and how they change over time, and identify your lengthiest workflows.
The successful deployments are marked with green, and the failed deployments are marked with red.
Event Timeline
In the event timeline dashboard, you can see a detailed view of your timeline events that shows both your GitHub Events and Workflow Events together. On the Event timeline, you can easily track every event and get broken builds fixed faster.
Event Activity
In the event activity dashboard, you can see a complete list of every event and how that event was deployed. On the Event Activity dashboard, you can track your daily progress and see which jobs are failing and which workflows have failing tests, which lets you prioritize your pipeline improvements. Click on Details for any individual event to get detailed information about that activity.
The Event Activity dashboard is grouped into six columns:
- TIMESTAMP
- EVENT
- ACTION
- CATEGORY
- SOURCE
- REPOSITORY
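If you think of each row in this table as a record with those six fields, prioritizing work becomes a small grouping problem. The Python sketch below models a row and counts failed events per repository. The `EventRow` class, the sample values (including the category, source, and repository names), and the function name are all assumptions for illustration; they are not exported by CTO.ai.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EventRow:
    """One row of the Event Activity table (hypothetical model)."""
    timestamp: datetime
    event: str
    action: str
    category: str
    source: str
    repository: str


def failures_by_repository(rows):
    """Count events whose action is "failed", grouped by repository."""
    return Counter(row.repository for row in rows if row.action == "failed")


# Made-up rows; the category, source, and repository values are placeholders.
rows = [
    EventRow(datetime(2023, 1, 9, 10, 0), "deployment", "failed",
             "example-category", "github", "org/api"),
    EventRow(datetime(2023, 1, 9, 11, 0), "deployment", "succeeded",
             "example-category", "github", "org/api"),
    EventRow(datetime(2023, 1, 9, 12, 0), "deployment", "failed",
             "example-category", "github", "org/web"),
    EventRow(datetime(2023, 1, 9, 13, 0), "deployment", "failed",
             "example-category", "github", "org/api"),
]

print(failures_by_repository(rows).most_common(1))  # [('org/api', 2)]
```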
There are other ways to enable Insights on the CTO.ai dashboard, such as sending events directly through the API and setting up Commands in your favourite languages.
Why do DevOps teams adopt CTO.ai Insights?
Insights enables your team to understand its goals and how it is progressing toward achieving them.