Shift-left Infrastructure Security

Bridging the gap between security and engineering brings significant value in compliance and operational protection, and its impact extends to knowledge transfer and relationship building. In traditional, siloed shops, security owns the rules, runs time-based audits, and files tickets for engineering teams to work. As teams cross the DevSecOps divide, security becomes less a separate task and more a verification step built into every release. For cloud workloads in particular, several automated steps can plug right into existing workflows and give engineers the instant feedback they want. Call the movement continuous security.

Infrastructure Pipelines

Cloud infrastructure has many moving parts, regardless of provider, whether it is a data store, a server, a security group, or any other resource configuration. Several compliance institutions publish benchmarks to compare against, and these are what security teams use in audits. With shift left, the question becomes: how can we integrate with the workflows that already exist rather than devising new solutions or performing out-of-band scanning? Several open-source tools can provide this protection.

Disclaimer: This piece is not meant to focus on specific tools, but rather on how tools can simplify the process. Nothing suggested here is a catch-all; treat the tools as examples rather than implementation specifications.

Static Analysis of IaC

Depending on your stack, several solutions can provide coverage. Static analysis of infrastructure should be seen as different from static analysis of application code: IaC is finite and declarative, where application code is variable. The tools in this market differ by the frameworks they support. tfsec, for example, focuses solely on Terraform, whereas Checkov covers multiple frameworks. In determining what works best, the most critical factors to look for are extensibility and coverage. Take this sample Terraform code, which creates an S3 bucket:


resource "aws_s3_bucket" "b" {
bucket = "tf-test-bucket"
acl = "private"
}

Utilizing a framework like Checkov, we can run a series of checks against the resource definition and see which pass and which fail:

~ docker run -t -v $PWD:/tf bridgecrew/checkov -d /tf

       _               _
   ___| |__   ___  ___| | _______   __
  / __| '_ \ / _ \/ __| |/ / _ \ \ / /
 | (__| | | |  __/ (__|   < (_) \ V /
  \___|_| |_|\___|\___|_|\_\___/ \_/

By bridgecrew.io | version: 1.0.684

terraform scan results:

Passed checks: 4, Failed checks: 4, Skipped checks: 0

Check: CKV_AWS_20: "S3 Bucket has an ACL defined which allows public READ access."
    PASSED for resource: aws_s3_bucket.b
    File: /bucket.tf:1-4
    Guide: https://docs.bridgecrew.io/docs/s3_1-acl-read-permissions-everyone

Check: CKV_AWS_57: "S3 Bucket has an ACL defined which allows public WRITE access."
    PASSED for resource: aws_s3_bucket.b
    File: /bucket.tf:1-4
    Guide: https://docs.bridgecrew.io/docs/s3_2-acl-write-permissions-everyone

Check: CKV_AWS_70: "Ensure S3 bucket does not allow an action with any Principal"
    PASSED for resource: aws_s3_bucket.b
    File: /bucket.tf:1-4
    Guide: https://docs.bridgecrew.io/docs/bc_aws_s3_23

Check: CKV_AWS_93: "Ensure S3 bucket policy does not lockout all but root user. (Prevent lockouts needing root account fixes)"
    PASSED for resource: aws_s3_bucket.b
    File: /bucket.tf:1-4

Check: CKV_AWS_21: "Ensure all data stored in the S3 bucket have versioning enabled"
    FAILED for resource: aws_s3_bucket.b
    File: /bucket.tf:1-4
    Guide: https://docs.bridgecrew.io/docs/s3_16-enable-versioning

        1 | resource "aws_s3_bucket" "b" {
        2 |   bucket = "tf-test-bucket"
        3 |   acl    = "private"
        4 | }

Check: CKV_AWS_18: "Ensure the S3 bucket has access logging enabled"
    FAILED for resource: aws_s3_bucket.b
    File: /bucket.tf:1-4
    Guide: https://docs.bridgecrew.io/docs/s3_13-enable-logging

        1 | resource "aws_s3_bucket" "b" {
        2 |   bucket = "tf-test-bucket"
        3 |   acl    = "private"
        4 | }

Check: CKV_AWS_52: "Ensure S3 bucket has MFA delete enabled"
    FAILED for resource: aws_s3_bucket.b
    File: /bucket.tf:1-4

        1 | resource "aws_s3_bucket" "b" {
        2 |   bucket = "tf-test-bucket"
        3 |   acl    = "private"
        4 | }

Check: CKV_AWS_19: "Ensure all data stored in the S3 bucket is securely encrypted at rest"
    FAILED for resource: aws_s3_bucket.b
    File: /bucket.tf:1-4
    Guide: https://docs.bridgecrew.io/docs/s3_14-data-encrypted-at-rest

        1 | resource "aws_s3_bucket" "b" {
        2 |   bucket = "tf-test-bucket"
        3 |   acl    = "private"
        4 | }
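
The failed checks point directly at the missing blocks. As a rough sketch of a remediated bucket, using the inline block syntax of the AWS provider of this era (the log target bucket name is hypothetical and would need to exist already):

resource "aws_s3_bucket" "b" {
  bucket = "tf-test-bucket"
  acl    = "private"

  # CKV_AWS_21; adding mfa_delete = true would cover CKV_AWS_52, but
  # applying it requires the root account's MFA device.
  versioning {
    enabled = true
  }

  # CKV_AWS_18; "tf-test-log-bucket" is a hypothetical pre-existing bucket.
  logging {
    target_bucket = "tf-test-log-bucket"
    target_prefix = "log/"
  }

  # CKV_AWS_19
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "aws:kms"
      }
    }
  }
}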

With that said, Checkov is not the only static analysis option. For a more comprehensive breakdown of static analysis tools, I recommend this blog post by Christophe Tafani-Dereeper.
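
Whichever tool you pick, exercise its extensibility early. As a minimal sketch, a custom Checkov check is a small Python class; the check name, ID, and the "team" tag requirement below are invented for illustration:

from checkov.common.models.enums import CheckCategories, CheckResult
from checkov.terraform.checks.resource.base_resource_check import BaseResourceCheck

class S3BucketTeamTag(BaseResourceCheck):
    def __init__(self):
        # Custom IDs live outside the built-in CKV_AWS_* namespace.
        super().__init__(
            name="Ensure S3 buckets carry a 'team' tag",
            id="CKV_CUSTOM_1",
            categories=[CheckCategories.GENERAL_SECURITY],
            supported_resources=["aws_s3_bucket"],
        )

    def scan_resource_conf(self, conf):
        # conf maps each resource argument to a list of parsed values.
        tags = conf.get("tags", [{}])[0]
        return CheckResult.PASSED if "team" in tags else CheckResult.FAILED

check = S3BucketTeamTag()

Pointing Checkov at the directory holding this file with --external-checks-dir runs the check on every scan alongside the built-in ones.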

Config Testing

To quote Jim Brikman from HashiConf 2018: "Infrastructure code without automated tests is broken." InSpec is widely adopted to ensure that configuration is set appropriately, but what about using it for security? Several resource checks can validate encryption settings, access policies, and more. Take the same S3 bucket from before: after deployment, we can validate that it is configured as expected.

describe aws_s3_bucket(bucket_name: 'tf-test-bucket') do
  it { should exist }
  it { should_not be_public }
  it { should have_default_encryption_enabled }
  it { should have_secure_transport_enabled }
end

~ inspec exec -t aws:// .

Profile: Bucket Testing profile (bucket-test)
Version: 0.1.0
Target:  aws://

  S3 Bucket tf-test-bucket
     ✔  is expected to exist
     ✔  is expected not to be public
     ×  is expected to have default encryption enabled
     expected #has_default_encryption_enabled? to return true, got false
     ✔  is expected to have secure transport enabled

As you write more tests, cascading dependencies become an issue. For example, if you create an access policy, you need a way to exercise that policy to validate that it does what it should, and does not do what it should not. This is where integration tests, built with Terratest or self-curated scripts (e.g., pytest + kubetest), come in. An integration test ensures that the written access policy continues to work regardless of underlying changes.

What if we had a role that we wanted to ensure could list objects in our S3 bucket? Can we deploy the role, then assume it and test its permissions against the bucket?

data "aws_caller_identity" "current" {}
resource "aws_s3_bucket" "b" {
bucket = "tf-test-bucket"
acl = "private"
}
resource "aws_iam_role" "test_role" {
name = "test_role"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"AWS": "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"
},
"Effect": "Allow"
}
]
}
EOF
}
data "aws_iam_policy_document" "test_policy" {
statement {
actions = [
"s3:ListBucket",
]
resources = [
aws_s3_bucket.b.arn,
]
}
}
resource "aws_iam_policy" "test_policy" {
name = "test_policy"
path = "/"
policy = data.aws_iam_policy_document.test_policy.json
}
resource "aws_iam_role_policy_attachment" "test-attach" {
role = aws_iam_role.test_role.name
policy_arn = aws_iam_policy.test_policy.arn
}
output "role_arn" {
value = aws_iam_role.test_role.arn
}
output "bucket" {
value = aws_s3_bucket.b.id
}

package test

import (
	"testing"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials/stscreds"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestTerraformTestS3(t *testing.T) {
	// Retry on known transient errors during terraform apply.
	terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
		TerraformDir: "./tftest",
	})
	defer terraform.Destroy(t, terraformOptions)
	terraform.InitAndApply(t, terraformOptions)

	roleArn := terraform.Output(t, terraformOptions, "role_arn")
	bucket := terraform.Output(t, terraformOptions, "bucket")

	// Assume the deployed role and build an S3 client with its credentials.
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))
	creds := stscreds.NewCredentials(sess, roleArn)
	svc := s3.New(sess, &aws.Config{Credentials: creds})

	// The role should be allowed to list objects in the bucket.
	_, err := svc.ListObjectsV2(&s3.ListObjectsV2Input{Bucket: aws.String(bucket)})
	assert.Nil(t, err)
}

~ go test
Running command terraform with args [init -upgrade=false]
...
Running command terraform with args [apply -input=false -auto-approve -lock=false]
...
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
...
PASS

Runtime Protection

In this domain, we typically start to see tight integrations. In the Kubernetes world, an example is OPA Gatekeeper enforcing policy through the admission controller; in the server space, the equivalent might be a service such as AWS Config, which scans existing assets against known configuration checks.
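
On the AWS Config side, a managed rule can keep evaluating assets after deployment. A minimal sketch in Terraform, assuming a configuration recorder is already enabled in the account:

resource "aws_config_config_rule" "s3_sse" {
  # AWS-managed rule that flags S3 buckets without default encryption,
  # catching drift on assets that already exist.
  name = "s3-bucket-server-side-encryption-enabled"

  source {
    owner             = "AWS"
    source_identifier = "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED"
  }
}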

Recently, a CVE was released against Kubernetes: CVE-2020-8554, in which Services with external IPs can be abused for man-in-the-middle attacks. Using OPA Gatekeeper, you can create a constraint template that restricts which external IPs a Service may use, enforced through the admission controller:

---
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8sexternalips
spec:
  crd:
    spec:
      names:
        kind: K8sExternalIPs
      validation:
        openAPIV3Schema:
          properties:
            allowedIPs:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sexternalips

        violation[{"msg": msg}] {
          input.review.kind.kind == "Service"
          input.review.kind.group == ""
          allowedIPs := {ip | ip := input.parameters.allowedIPs[_]}
          externalIPs := {ip | ip := input.review.object.spec.externalIPs[_]}
          forbiddenIPs := externalIPs - allowedIPs
          count(forbiddenIPs) > 0
          msg := sprintf("service has forbidden external IPs: %v", [forbiddenIPs])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sExternalIPs
metadata:
  name: k8sexternalip
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Service"]

When we try to exercise the vulnerability, Gatekeeper blocks the request at admission.
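
For illustration, a Service along these lines would trip the policy; the name and selector are hypothetical, and the external IP matches the denial message below:

apiVersion: v1
kind: Service
metadata:
  name: mitm-svc            # hypothetical name
spec:
  selector:
    app: demo               # hypothetical selector
  ports:
    - port: 80
  externalIPs:
    - 23.185.0.3            # not in allowedIPs, so admission is denied

Submitting this manifest yields: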

Error from server ([denied by k8sexternalip] service has forbidden external IPs: {"23.185.0.3"}): error when creating "STDIN": admission webhook "validation.gatekeeper.sh" denied the request: [denied by k8sexternalip] service has forbidden external IPs: {"23.185.0.3"}

Application Pipelines

Many applications depend on system resources or external dependencies, a risk the OWASP Top 10 calls out as “Using Components with Known Vulnerabilities.” The nice thing about being in the OWASP Top 10 is that engineers tend to be more aware of the risk. When talking about shifting left, there are several measures that keep you protected.

Utilize SCA to Discover Vulnerable Dependencies in Code

Application dependencies are defined in a static config file, and most languages have a typical pattern for analysis. GitHub and GitLab, two leading source code management solutions, have SCA built into their platforms. Alternatively, security tools such as Snyk or Contrast can integrate directly into your existing application security stack. In the end, it’s not about which software you choose; what matters is keeping your dependencies up to date, and an automated means of doing so that runs in your existing CI pipeline makes it easy for developers to adopt.
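
As one concrete sketch, GitHub’s dependency updates are driven by a small file checked into the repository; the Go ecosystem and weekly cadence here are only examples:

# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "gomod"   # npm, pip, bundler, etc. follow the same shape
    directory: "/"               # where the dependency manifest lives
    schedule:
      interval: "weekly"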

Image Scanning

New CVEs are discovered daily. Build-time and proactive scans against your images help you identify OS-level patches that need to be applied. Scanning can take place on the host side, the container side, or both. Clair is a popular engine integrated by many vendors on the container side, while Nessus leads the way on the host side.
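
As a sketch of what a build-time gate can look like, here is the open-source Trivy scanner (a stand-in for whichever scanner you adopt; the image name is a placeholder) failing a CI job when high-severity CVEs are found:

~ trivy image --severity HIGH,CRITICAL --exit-code 1 myorg/myapp:latest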

Summary

Engineers gravitate toward automation, and utilizing existing pipelines and frameworks bridges the technical and social gap between security and engineering. Using tickets as the means of patching leads to slow turnarounds, and time-based audits leave long windows of exposure. By integrating these checks into infrastructure and application pipelines, the shift-left approach leads to an overall better security posture.