In late 2020 I took ownership of governing Terraform-based cloud deployment pipelines. After an initial survey of the tooling & architecture landscape, I evaluated several tools and ultimately selected HashiCorp Sentinel as the Policy-as-Code (“PaC”) tool of choice. I then architected and implemented the SDLC used to deliver policies and an SDLC for the policies themselves so as to provide robust testing & validation, distributed collaboration, and auditability.
Tools/Products used
Project Details
This project needed to satisfy the requirements of a highly regulated global bank with a large development team, so the challenges of operating at scale, such as handling exceptions, site reliability engineering, and process automation, would be critical to success.
Cross-Domain Collaboration:
- Working with the pipeline team to ensure mutual success on pipeline functionality & UX
- Continuously monitoring developer experience channels for mentions
- Conducting stakeholder/developer outreach to translate issues/motivations into product improvements
- Influencing other teams' backlogs & roadmaps based on required features & integrations
- Participating in the HashiCorp Financial Services User Group for cross-company idea exchange
SDLC Work
Enabling globally dispersed collaboration on policy sets requires easy onboarding, repeatable processes, and lots of testing. Here’s some of what I implemented as I onboarded new engineers and observed their feedback:
Templatized Policy Creation: Enable rapid & repeatable workflows by templatizing policies
- Uses cookiecutter
- Pre-commit hooks validate naming and branching
- Post-commit hooks create & switch to the feature branch, create the policy and test template files, insert the policy record into sentinel.hcl, and create a slow-roll deployment
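The "insert policy record into sentinel.hcl" step can be sketched roughly as below. This is a minimal illustration, not the hook I actually shipped: the function name add_policy_record is hypothetical, the policy file is assumed to sit next to sentinel.hcl, and defaulting new policies to the advisory enforcement level is only one assumed way of modelling a slow roll.

```python
from pathlib import Path

# Standard sentinel.hcl policy block: a source path plus an enforcement level.
POLICY_BLOCK = '''policy "{name}" {{
  source            = "./{name}.sentinel"
  enforcement_level = "{level}"
}}
'''


def add_policy_record(hcl_path: Path, name: str, level: str = "advisory") -> bool:
    """Append a policy block to sentinel.hcl unless one already exists.

    Returns True when the file was changed. Starting at "advisory" is an
    assumption made for this sketch, standing in for a slow-roll deployment.
    """
    existing = hcl_path.read_text() if hcl_path.exists() else ""
    if f'policy "{name}"' in existing:
        return False  # idempotent: re-running the hook must not duplicate records
    if existing and not existing.endswith("\n"):
        existing += "\n"
    if existing:
        existing += "\n"  # blank line between blocks
    hcl_path.write_text(existing + POLICY_BLOCK.format(name=name, level=level))
    return True
```

Keeping the step idempotent means a re-run of the template hook never produces a duplicate policy record, which is exactly the kind of mistake the later sentinel.hcl linting would otherwise have to catch.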
Git Pre-Commit Hooks: Catch often-missed things before running a pipeline, saving time and improving the developer experience
- Uses Git’s custom hooks feature
- Runs sentinel fmt on all the policies prior to committing
- Runs custom internal linting on sentinel.hcl prior to committing
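A pre-commit hook along these lines might look like the following. It is a minimal sketch under stated assumptions: only staged .sentinel files are formatted (the original hook ran on all the policies), the sentinel binary is on PATH, and the helper names staged_files, sentinel_policies, and main are made up for illustration. The internal sentinel.hcl linter is not shown.

```python
import subprocess


def staged_files(run=subprocess.run) -> list[str]:
    """List files staged for commit (added, copied, or modified)."""
    out = run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()


def sentinel_policies(paths: list[str]) -> list[str]:
    """Keep only Sentinel policy files."""
    return [p for p in paths if p.endswith(".sentinel")]


def main(run=subprocess.run) -> int:
    """Format staged policies; a non-zero return value blocks the commit."""
    policies = sentinel_policies(staged_files(run))
    if not policies:
        return 0
    result = run(["sentinel", "fmt", *policies])
    if result.returncode != 0:
        return result.returncode  # e.g. a policy that does not parse
    run(["git", "add", *policies], check=True)  # re-stage the formatted files
    return 0
```

To use it as a hook, an executable .git/hooks/pre-commit script would end with sys.exit(main()). The run parameter exists only so the logic can be exercised without Git or Sentinel installed.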
CI Testing: Ensure policy sets are thoroughly validated and tested prior to deployment. Of these stages, only sentinel test is built into the product; the rest were custom development.
- Runs a check to ensure directory & file structure is as intended
- Ensures at least one passing and one failing unit test are declared for each policy
- Parses and lints the sentinel.hcl file to ensure policies will properly load and run
- Runs sentinel test on each of the policies to ensure that all unit tests are passing
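The "at least one passing and one failing unit test" check could be approximated as follows. This is a sketch, not the internal tool: it relies on Sentinel's convention of keeping test cases under test/<policy name>/, and it assumes a hypothetical naming rule in which test case files start with "pass" or "fail". The function name missing_test_cases is likewise made up.

```python
from pathlib import Path


def missing_test_cases(repo: Path) -> dict[str, list[str]]:
    """Map each policy name to the kinds of test case it lacks.

    Assumes policies are *.sentinel files at the repository root and that
    test case files are named pass*.hcl / fail*.hcl (an assumed convention).
    """
    problems: dict[str, list[str]] = {}
    for policy in sorted(repo.glob("*.sentinel")):
        test_dir = repo / "test" / policy.stem
        names = [f.name for f in test_dir.glob("*.hcl")] if test_dir.is_dir() else []
        missing = [
            kind for kind in ("pass", "fail")
            if not any(name.startswith(kind) for name in names)
        ]
        if missing:
            problems[policy.stem] = missing
    return problems
```

A CI stage would fail the build whenever this returns a non-empty mapping, before sentinel test is ever invoked.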
Continuous Delivery: Allow tailored policy deployment 24x7 and incremental changes. None of these are built into the product; all of it is custom development that I either did personally or designed and managed through delivery:
- Checks for valid exceptions which are still in their authorized date range
- Creates custom policy sets with exceptions for each Terraform Org/Workspace
- Builds/uploads the artifact to the artifact repository for fast rollback and auditability
- Deploys policy sets to all the deployment targets
- Nightly job runs to re-evaluate validity of each exception and re-deploys as needed
- No exception is accidentally left in place!
- In Progress - Download & store Sentinel mocks from all workspaces for integration testing
- In Progress - Integration testing via Sentinel Mocks to avoid running thousands of plans
- Offers representative coverage at a fraction of the cost & time
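The exception handling in the steps above can be sketched as below. The record shape (PolicyException with policy, workspace, starts, expires) and the function names are hypothetical, and treating the expiry date as inclusive is an assumption. The idea is that both the deploy job and the nightly job recompute the active set from the current date, so an expired exception drops out on the next run.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyException:
    policy: str      # policy the workspace is exempt from
    workspace: str   # Terraform workspace the exception applies to
    starts: date     # first authorized day
    expires: date    # last authorized day (inclusive, by assumption)


def active_exceptions(exceptions, today: date):
    """Keep only exceptions whose authorized date range includes today."""
    return [e for e in exceptions if e.starts <= today <= e.expires]


def exempt_policies_by_workspace(exceptions, today: date) -> dict[str, set[str]]:
    """Group currently valid exemptions per workspace.

    Each workspace's custom policy set would then be the full policy set
    minus the policies listed here.
    """
    result: dict[str, set[str]] = {}
    for e in active_exceptions(exceptions, today):
        result.setdefault(e.workspace, set()).add(e.policy)
    return result
```

Because nothing is stored about which exceptions were applied yesterday, the nightly run needs no cleanup logic: comparing today's mapping with what is deployed is enough to decide whether a workspace must be re-deployed.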
Continuous Monitoring: In Progress when I left the firm - Ensure policies are applied consistently, effectively & correctly, detect outages, and provide insights for data-driven decision making.
- Poll for policy set hits, individual policy hits, and the results of the evaluations
- Track data for rollup metrics, trend analysis, and data-driven policy promotions
- Use insights for training, identifying areas to strengthen, and threat modeling