Security as Code
In the last 10 years, perhaps more, infrastructure as code has been rising in popularity, allowing development teams to create solid, repeatable and scalable infrastructures with a few configuration files. Most companies will inevitably adopt these technologies into their life cycles because, otherwise, they will lose a lot of efficiency.
From Chef to Ansible and, more recently, from Terraform to AWS CDK, more and more providers are jumping on the bandwagon and allowing developers to create these environments with this process.
One clear benefit of this growing technology is the opportunity to redefine Security Engineering. More and more, automation and self-remediation are becoming part of the Cyber Security reality, and so is an approach I like to call Security as Code.
In this post, I want to reflect on this idea and maybe share some concepts worth exploring. It will be very high level and abstract, as implementations differ depending on the technologies adopted.
Infrastructure as Code - What is it
For those unaware, as per Microsoft’s Azure documentation, infrastructure as code (IaC) is the process where we manage infrastructure (networks, virtual machines, load balancers) using code, either with tailored formats (like HashiCorp’s Terraform), or using common formats (Ansible’s YAML playbooks) and even popular programming languages (AWS CDK works with TypeScript, JavaScript, Python, Java, or Go). Without these, teams need to maintain numerous configuration files for all applications and all environments. IaC reduces inconsistencies, the risk of post-deployment surprises, and much more. It makes products and companies more efficient and cost-effective.
Security-wise, IaC sheds light on applying and enforcing security standards early in the development cycle, like automating patching and software updates, ensuring data is encrypted at rest and in transit, ensuring the principle of least privilege is applied to user access, and many, many more things.
The same technologies that are now empowering your infrastructure are composed of mere text files. Regardless of their format, they can likely be parsed and used as input for other computational logic to create automated validations and enforce policies.
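To make the "mere text files" point concrete, here is a minimal sketch in Python. It assumes a made-up JSON template: the `resources`, `type` and `encrypted` keys are hypothetical and do not follow any real tool's schema. The idea is only that once the file is parsed, a policy becomes an ordinary function over the data.

```python
from __future__ import annotations

import json

# Hypothetical template: the "resources"/"type"/"encrypted" keys are made up
# for illustration and do not follow any real tool's schema.
TEMPLATE = """
{
  "resources": {
    "logs_bucket": {"type": "bucket", "encrypted": true},
    "scratch_disk": {"type": "disk", "encrypted": false},
    "web_server": {"type": "vm"}
  }
}
"""

# Assumption: these are the resource types our policy treats as storage.
STORAGE_TYPES = {"bucket", "disk"}


def find_unencrypted(template_text: str) -> list[str]:
    """Return the names of storage resources that do not enable encryption."""
    resources = json.loads(template_text)["resources"]
    return sorted(
        name
        for name, spec in resources.items()
        if spec.get("type") in STORAGE_TYPES and not spec.get("encrypted", False)
    )


if __name__ == "__main__":
    for name in find_unencrypted(TEMPLATE):
        print(f"non-compliant: {name} is not encrypted")
```

A missing `encrypted` key is treated as "not encrypted" here, so storage resources have to opt in explicitly. Run in a pipeline, a non-empty result is the point where you would fail the build.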
I know some of these technologies have built-in functionality that really facilitates Security as Code, namely AWS’ CDK, but we are not at a stage where this comes easy for most of us. Ansible and Chef, for instance, do not give users anything remotely similar. But developing these validations will definitely make you more cost-effective and safer, and it will keep you on top of your security game.
Introducing Security as Code
AWS CDK lets users perform actions over the resources your code is generating through a concept called Aspects. Their examples include how to automatically apply tags to all resources, or ensure all new S3 buckets are encrypted by default but, from where I see it, we can definitely do a lot more automation on top of that.
Aspects use the visitor pattern and, therefore, we can point our validations at all resources in AWS and focus on them one at a time. Furthermore, findings can range from mere warnings to errors, and errors let you stop the deployment of unsafe resource groups.
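The visitor idea can be illustrated without CDK at all. The sketch below is plain Python and does not use the CDK API: `Resource`, `Finding`, `BucketEncryptionCheck`, `walk` and `deployment_allowed` are all hypothetical names invented for this example. It shows a check being handed every node of a resource tree, recording warnings and errors, with only errors blocking the deployment.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for a node in the infrastructure tree."""
    name: str
    kind: str
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


@dataclass
class Finding:
    level: str  # "error" or "warning"
    resource: str
    message: str


class BucketEncryptionCheck:
    """Visitor: called once per resource, records findings for buckets."""

    def __init__(self):
        self.findings = []

    def visit(self, resource):
        if resource.kind != "bucket":
            return
        if not resource.props.get("encrypted", False):
            self.findings.append(
                Finding("error", resource.name, "bucket is not encrypted"))
        if "owner" not in resource.props.get("tags", {}):
            self.findings.append(
                Finding("warning", resource.name, "bucket has no owner tag"))


def walk(resource, visitor):
    """Depth-first traversal that hands every node to the visitor."""
    visitor.visit(resource)
    for child in resource.children:
        walk(child, visitor)


def deployment_allowed(findings):
    """Warnings are only reported; any error blocks the deployment."""
    return not any(f.level == "error" for f in findings)


stack = Resource("stack", "stack", children=[
    Resource("logs", "bucket", {"encrypted": True}),
    Resource("uploads", "bucket", {"encrypted": False, "tags": {"owner": "web"}}),
])

check = BucketEncryptionCheck()
walk(stack, check)

if __name__ == "__main__":
    for f in check.findings:
        print(f"{f.level}: {f.resource}: {f.message}")
    print("deploy" if deployment_allowed(check.findings) else "blocked")
```

Keeping the traversal (`walk`) separate from the rule (`visit`) is what makes it cheap to add more checks later: each new policy is one more small class.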
If you happen to use any other technology, I would not give up just yet. Despite CDK being a totally different game, I believe it is possible to design similar systems for all other technologies.
Designing a new System
The main purpose of Security as Code is to ensure your infrastructure respects company-wide policies. That said, I believe the first step is actually to create those, even if your company is in a heavily regulated sector.
After figuring out what you want to protect and what users cannot do by default, we can start implementing software that parses our infrastructure code and raises errors once it finds anything non-compliant.
I do not want to bother you with implementations because it also means I would not be fair to other technologies. However, these are some security areas I have been working on automating:
- User access (User ACLs): Integrating our AD provider, we can easily map users and root access to machines. The default should be that no user has access at all, and through configuration files you can specify which users have access and what their permission levels are. Big emphasis on the principle of least privilege.
- Network ACLs/SDN: By default, all newly spawned machines should be completely blocked. No egress/ingress access, period. Your developers should specify which connections are supposed to be supported by said machine/service. This is particularly relevant for micro-service oriented architectures. You should also validate that ACLs are not too wide, too permissive, or simply insecure (like open for all ports, or gigantic port ranges).
- Data Encryption: At rest or in transit, you should enforce encryption at all times. There is literally no excuse not to do it. For any service/resource that stores data, always validate this data is encrypted. For any service/resource that sends data to the internet, make sure it uses encrypted protocols, and up-to-date ones too!
- Backups: For storage services, or services which log data, it is perhaps wise to ensure backups exist and are done in a proper fashion.
- Logging/Monitoring: Are these services saving data? Are you legally allowed? Are you forced to throw away logs older than x days/months? These are just some validations you can throw here as well.
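As one illustration of the areas above, the network rules could be checked along these lines. This is a sketch under assumptions: the rule format (`cidr`, `from_port`, `to_port`) is hypothetical, and `MAX_PORT_SPAN` is an arbitrary threshold that your own policy would define.

```python
import ipaddress

MAX_PORT_SPAN = 10  # assumption: largest port range allowed by policy


def check_ingress_rule(rule):
    """Return a list of policy violations for one hypothetical ingress rule."""
    problems = []
    network = ipaddress.ip_network(rule["cidr"])
    if network.prefixlen == 0:
        # A /0 network matches every address, i.e. the rule is open to all.
        problems.append("open to the whole internet")
    span = rule["to_port"] - rule["from_port"] + 1
    if span > MAX_PORT_SPAN:
        problems.append(f"port range too wide ({span} ports)")
    return problems


def check_machine(machine):
    """Map rule index to violations. No rules means fully blocked, which passes."""
    report = {}
    for index, rule in enumerate(machine.get("ingress", [])):
        problems = check_ingress_rule(rule)
        if problems:
            report[index] = problems
    return report


machine = {
    "name": "orders-api",
    "ingress": [
        {"cidr": "10.0.1.0/24", "from_port": 443, "to_port": 443},
        {"cidr": "0.0.0.0/0", "from_port": 0, "to_port": 65535},
    ],
}

if __name__ == "__main__":
    for index, problems in check_machine(machine).items():
        print(f"{machine['name']} rule {index}: {', '.join(problems)}")
```

A machine that declares no ingress rules passes by design, which matches the "blocked by default" stance: developers only ever add narrowly scoped openings, and each one is validated.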
There are other security topics that are not covered in IaC, like patching/updates, age policies, and so on, because they do not belong to this domain. Sure, if you are using such bleeding-edge technologies, your team should ensure there are policies in place to be compliant in other topics as well, including the two previously mentioned, but others too.
The focus of this system is to apply auto-remediation and notify when new resources are not compliant or insecure.
Blocking and Allowing Exceptions
Naturally, with such strict rule sets, we are bound to face edge cases. Working in Security for a while has made me realize that applying strict defaults with no room for exceptions is just a bad idea. Projects have their own caveats, whether because they are legacy or for some other reason, so it is better if we come prepared to consider exceptions.
For all validations above, or any other you create, it is a must that you somehow implement logic to handle exceptions.
This is, in my perspective, a must for whatever your setup is, but particularly relevant for micro-service oriented architectures, where projects vary so much among themselves.
Keep in mind when designing your system that these exceptions should be validated by a team of engineers. Developers can often push for exceptions that are too wide and might not be needed. Furthermore, Security engineers are typically more aware of the impact certain exceptions can have, and whether or not they need to be revised. The key takeaway is to create a way of handling exceptions that is governed by your security team.
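One possible shape for such governed exceptions is sketched below. Everything here is hypothetical: the `PolicyException` record, the `SECURITY_TEAM` set and the rule names are invented for illustration, and the expiry date is my own assumption about how to force an exception to be revised.

```python
from dataclasses import dataclass
from datetime import date

SECURITY_TEAM = {"alice", "bob"}  # assumption: reviewers allowed to approve


@dataclass(frozen=True)
class PolicyException:
    rule: str          # which validation is being waived
    resource: str      # the single resource the waiver applies to
    approved_by: str   # who signed it off
    expires: date      # assumption: waivers expire so they get revised


def is_waived(rule, resource, exceptions, today):
    """True only if a security-approved, unexpired exception matches exactly."""
    return any(
        e.rule == rule
        and e.resource == resource
        and e.approved_by in SECURITY_TEAM
        and e.expires >= today
        for e in exceptions
    )


exceptions = [
    PolicyException("wide-port-range", "legacy-ftp", "alice", date(2030, 1, 1)),
    PolicyException("no-encryption", "scratch-disk", "some-developer", date(2030, 1, 1)),
]
today = date(2024, 6, 1)

if __name__ == "__main__":
    print(is_waived("wide-port-range", "legacy-ftp", exceptions, today))
    print(is_waived("no-encryption", "scratch-disk", exceptions, today))
```

Matching on an exact rule and an exact resource keeps waivers narrow, and an entry approved by someone outside the security team is simply ignored rather than honoured.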
Secret scanning
This may sound like a shameless self-plug, but I recommend parsing your IaC files for secrets and other sensitive tokens, just like you would for any other type of code. In fact, if you’re using AWS CDK, which makes use of popular programming languages like JavaScript, Python, etc., you should actually have your infrastructure code undergo SAST validation.
For the purpose of scanning your infrastructure code for secrets and other sensitive tokens, and this is the plug, I naturally recommend shhbt for the job.
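To give a feel for what secret scanning involves, here is a deliberately tiny regex-based sketch. It is not how shhbt works and does not use its rules; the three patterns and the sample file are illustrative assumptions, and a real scanner ships a far larger rule set.

```python
import re

# Illustrative patterns only; real scanners ship far larger rule sets.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded-password": re.compile(
        r"password\s*[:=]\s*[\"'][^\"']+[\"']", re.IGNORECASE),
}


def scan(text):
    """Return (line_number, rule_name) for every line matching a pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, rule))
    return hits


# Made-up infrastructure snippet with obviously fake credentials.
SAMPLE = '''resource "db" {
  username = "app"
  password = "hunter2"
  access_key = "AKIAABCDEFGHIJKLMNOP"
}
'''

if __name__ == "__main__":
    for number, rule in scan(SAMPLE):
        print(f"line {number}: possible secret ({rule})")
```

Pattern matching like this produces false positives and misses anything it has no rule for, which is a good reason to rely on a maintained tool rather than a home-grown list.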
Conclusions and Takeaways
In the past few months, I have been fortunate to be involved in many projects that involve bleeding-edge principles and concepts, including writing security assertions on top of IaC. For a long time, this is something we have been taking to heart and, in this post, I wanted to share some ideas for the future as these technologies become more and more common.
I hope this article has inspired you to take a look at your development life cycle and find opportunities there to apply automation and improve the security posture of your project(s).
Thank you for taking the time to read this. I’ll see you next time.
gsilvapt