3 posts tagged with "cloudautomation"

Automating Snowflake Role Based Storage Integration for AWS

December 18, 2021 · 5 min read

Technologist and Cloud Consultant

I have used the instructions here to configure Snowpipe for several projects.

Although they are accurate, they are entirely click-ops oriented. I like to automate (and script) everything, so I have created a fully automated implementation using PowerShell and the aws and snowsql CLIs.

The challenge is that you need to go back and forth between AWS and Snowflake, exchanging information from each platform with the other.

Overview

A Role Based Storage Integration in Snowflake allows a user (an AWS user arn) in your Snowflake account to use a role in your AWS account, which in turn enables access to S3 and KMS resources used by Snowflake for an external stage.

The following diagram explains this (along with the PlantUML code used to create the diagram):

PlantUML

@startuml

skinparam rectangle<<boundary>> {
    Shadowing false
    StereotypeFontSize 0
    FontColor #444444
    BorderColor #444444
    BorderStyle dashed
}

skinparam defaultTextAlignment center

!$imgroot = "https://github.com/avensolutions/plantuml-cloud-image-library/raw/main/images"

!unquoted procedure $AwsIam($alias, $label, $techn, $descr="", $stereo="AWS IAM")
    rectangle "==$label\n\n<img:$imgroot/aws/SecurityIdentityCompliance/Iam.png>\n//<size:12>[$techn]</size>//" <<$stereo>> as $alias #white
!endprocedure

!unquoted procedure $AwsS3($alias, $label, $techn, $descr="", $stereo="AWS S3")
    rectangle "==$label\n\n<img:$imgroot/aws/Storage/S3.png>\n//<size:12>[$techn]</size>//" <<$stereo>> as $alias #white
!endprocedure

!unquoted procedure $Snowflake($alias, $label, $techn, $descr="", $stereo="Snowflake")
    rectangle "==$label\n\n<img:$imgroot/snowflake/snowflakeDB.png{scale=0.70}>\n//<size:12>[$techn]</size>//" <<$stereo>> as $alias #white
!endprocedure

rectangle "Snowflake" <<boundary>> {
    $AwsIam(user, Snowflake IAM User, AWS IAM User)
    $Snowflake(int, Storage Integration, Storage Integration)
    $Snowflake(stage, External Stage, Stage)
}

rectangle "AWS" <<boundary>> {
    $AwsS3(bucket, Stage Bucket, AWS S3 Bucket)
    $AwsIam(role, Snowflake Access Role, IAM Role)
    $AwsIam(policy, Snowflake Access Policy, IAM Policy)
}

stage -UP-> int : uses
int -RIGHT-> user : uses
user -RIGHT-> role : uses
policy -UP-> role : attached to
role -RIGHT-> bucket : allows access to

@enduml

Setup

Some prerequisites (removed for brevity):

Set the following variables in your script:

$accountid – your AWS account ID
$bucketname – the bucket you are letting Snowflake use as an External Stage
$bucketarn – used in policy statements (you could easily derive this from the bucket name)
$kmskeyarn – assuming you are using customer managed encryption keys, your Snowflake storage integration will need to use these to decrypt data in the stage
$prefix – if you want to set up granular access (on a key/path basis)

Configure Snowflake access credentials using environment variables or using the ~/.snowsql/config file (you should definitely use the SNOWSQL_PWD env var for your password however)
Configure access to AWS using aws configure

note

The actions performed in both AWS and Snowflake require privileged access on both platforms.

The Code

I have broken this into steps; the complete code is included at the end of the article.

Create Policy Documents

You will need to create the policy documents that allow the role you will create to access objects in the target S3 bucket. You will also need an initial “Assume Role” policy document, which is used to create the role and is later updated with information you get from Snowflake.
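A minimal sketch of what these two documents could contain, written in Python for illustration (the script itself uses PowerShell). It assumes read-only access to the stage, a `$prefix` without leading or trailing slashes, and a placeholder trust on your own account with an external ID of "0000" until the real values come back from Snowflake; the function names are hypothetical.

```python
import json


def s3_access_policy(bucket_arn: str, prefix: str, kms_key_arn: str) -> dict:
    """Read-only access to staged objects, plus decryption with the KMS key."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # read objects under the prefix
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": f"{bucket_arn}/{prefix}/*",
            },
            {   # list only the keys under the prefix
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation"],
                "Resource": bucket_arn,
            },
            {   # assumption: objects are encrypted with a customer managed key
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": kms_key_arn,
            },
        ],
    }


def initial_assume_role_policy(account_id: str) -> dict:
    """Placeholder trust policy, replaced once Snowflake supplies its values."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": "0000"}},
            }
        ],
    }
```

Each dictionary can be written out with `json.dump` to snowflake_policy_doc.json and assume_role_policy_doc.json respectively.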

Create Snowflake Access Policy

Use the snowflake_policy_doc.json policy document created in the previous step to create a managed policy. You will need the ARN returned by this command in a subsequent statement.
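As a rough sketch (in Python rather than PowerShell), the call could be wrapped as below. The `--query` expression makes the aws CLI print only the policy ARN; the helper names are hypothetical.

```python
import subprocess


def create_policy_command(policy_name: str, policy_doc_path: str) -> list[str]:
    """Build the aws CLI call that creates a managed policy and prints its ARN."""
    return [
        "aws", "iam", "create-policy",
        "--policy-name", policy_name,
        "--policy-document", f"file://{policy_doc_path}",
        "--query", "Policy.Arn",
        "--output", "text",
    ]


def run(command: list[str]) -> str:
    """Run a CLI command and return its trimmed standard output."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

Calling `run(create_policy_command("snowflake-access-policy", "snowflake_policy_doc.json"))` would then give you the value to keep as the policy ARN.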

Create Snowflake IAM Role

Use the initial assume_role_policy_doc.json created earlier to create a new Snowflake access role. You will need the ARN of this role when you configure the Storage Integration in Snowflake.
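One way to sketch this step in Python is to build the create-role call and then read the ARN out of the JSON the aws CLI returns by default. The function names are hypothetical.

```python
import json


def create_role_command(role_name: str, trust_doc_path: str) -> list[str]:
    """Build the aws CLI call that creates the role with the initial trust policy."""
    return [
        "aws", "iam", "create-role",
        "--role-name", role_name,
        "--assume-role-policy-document", f"file://{trust_doc_path}",
        "--output", "json",
    ]


def role_arn_from_output(create_role_output: str) -> str:
    """Extract the role ARN from the JSON printed by `aws iam create-role`."""
    return json.loads(create_role_output)["Role"]["Arn"]
```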

Attach S3 Access Policy to the Role

Now you will attach the snowflake-access-policy to the snowflake-access-role using the $policyarn captured from the policy creation statement.
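The attachment is a single aws CLI call, which could be built like this in Python (the function name is hypothetical). The command prints nothing when it succeeds.

```python
def attach_role_policy_command(role_name: str, policy_arn: str) -> list[str]:
    """Build the aws CLI call that attaches a managed policy to a role."""
    return [
        "aws", "iam", "attach-role-policy",
        "--role-name", role_name,
        "--policy-arn", policy_arn,
    ]
```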

Create Storage Integration in Snowflake

Use the snowsql CLI to create a Storage Integration in Snowflake supplying the $rolearn captured from the role creation statement.
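A sketch of the statement and the snowsql call, in Python for illustration. The integration name you pass in is your own choice, the allowed location is assumed to be the bucket and prefix set earlier, and the function names are hypothetical.

```python
def create_storage_integration_sql(
    name: str, role_arn: str, bucket_name: str, prefix: str
) -> str:
    """Build the CREATE STORAGE INTEGRATION statement for an S3 external stage."""
    return (
        f"CREATE STORAGE INTEGRATION {name}\n"
        "  TYPE = EXTERNAL_STAGE\n"
        "  STORAGE_PROVIDER = 'S3'\n"
        "  ENABLED = TRUE\n"
        f"  STORAGE_AWS_ROLE_ARN = '{role_arn}'\n"
        f"  STORAGE_ALLOWED_LOCATIONS = ('s3://{bucket_name}/{prefix}/');"
    )


def snowsql_command(sql: str) -> list[str]:
    """Build a snowsql call that runs one statement and exits."""
    return ["snowsql", "-q", sql]
```

snowsql picks up the account, user and password from the environment variables or config file described in the setup.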

Get `STORAGE_AWS_IAM_USER_ARN` and `STORAGE_AWS_EXTERNAL_ID`

You will need the STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID values for the storage integration you created in the previous statement; these will be used to update the assume role policy in your snowflake-access-role.
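Both values are rows in the output of DESC INTEGRATION. The sketch below assumes you run that statement through snowsql with `-o output_format=json -o friendly=false -o timing=false`, and that the result is a JSON array of rows with `property` and `property_value` keys; check that shape against your own output. The function name is hypothetical.

```python
import json


def integration_properties(desc_output: str) -> dict:
    """Map each property name in DESC INTEGRATION output to its value."""
    rows = json.loads(desc_output)
    return {row["property"]: row["property_value"] for row in rows}
```

The two values are then available as `props["STORAGE_AWS_IAM_USER_ARN"]` and `props["STORAGE_AWS_EXTERNAL_ID"]`.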

Update Snowflake Access Policy

Using the STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID values retrieved in the previous statements, you will update the assume-role-policy for the snowflake-access-role.
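The updated trust policy simply swaps the placeholder principal and external ID for the Snowflake values. A sketch in Python, with hypothetical function names:

```python
def snowflake_trust_policy(iam_user_arn: str, external_id: str) -> dict:
    """Trust policy letting the Snowflake IAM user assume the access role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": iam_user_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def update_assume_role_policy_command(role_name: str, doc_path: str) -> list[str]:
    """Build the aws CLI call that replaces the role's trust policy."""
    return [
        "aws", "iam", "update-assume-role-policy",
        "--role-name", role_name,
        "--policy-document", f"file://{doc_path}",
    ]
```

Write the dictionary back to assume_role_policy_doc.json before running the command against snowflake-access-role.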

Test the Storage Integration

To test the connectivity between your Snowflake account and your AWS external stage using the Storage Integration just created, create a stage as shown here:
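A minimal sketch of such a statement, generated in Python so the bucket and prefix can be reused from the earlier steps. The function name is hypothetical, and the integration name is whatever you chose when creating the storage integration.

```python
def create_stage_sql(
    stage_name: str, integration_name: str, bucket_name: str, prefix: str
) -> str:
    """Build a CREATE STAGE statement that reads S3 via the storage integration."""
    return (
        f"CREATE STAGE {stage_name}\n"
        f"  URL = 's3://{bucket_name}/{prefix}/'\n"
        f"  STORAGE_INTEGRATION = {integration_name};"
    )
```

The URL has to fall within the locations allowed by the storage integration, otherwise Snowflake rejects the stage.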

Now list objects in the stage (assuming there are any).

list @my_stage;

This should just work! You can use your storage integration to create different stages for different paths in your External Stage bucket and use both of these objects to create Snowpipes for automated ingestion. Enjoy!

Complete Code

The complete code for this example is shown here:

if you have enjoyed this post, please consider buying me a coffee ☕ to help me keep writing!

Simplifying Large CloudFormation Templates using Jsonnet

November 21, 2021 · 3 min read

Jeffrey Aven

Technologist and Cloud Consultant

CloudFormation templates in large environments can grow beyond a manageable point. This article provides one approach to breaking up CloudFormation templates into modules which can be imported and used to create a larger template to deploy a complex AWS stack – using Jsonnet.

Jsonnet is a JSON pre-processing and templating library with features such as user-defined and built-in functions, objects, and inheritance, amongst others. If you are not familiar with Jsonnet, here are some good resources to start with:

Advantages

Using Jsonnet, you can use imports to break up large stacks into smaller files scoped to each resource. This approach makes CloudFormation templates easier to read and write and allows you to apply the DRY (Don't Repeat Yourself) coding principle (not possible with native CloudFormation templates).

Additionally, because the template fragments are in Jsonnet format, you can add annotations or comments to your code as you would in YAML (not possible with a JSON template alone), while the rendered template is still legal CloudFormation JSON.

Process Overview

The process is summarised here:

Code

This example will deploy a stack with a VPC and an S3 bucket with logging. The project directory structure would look like this:

templates/
├─ includes/
│  ├─ vpc.libsonnet
│  ├─ s3landingbucket.libsonnet
│  ├─ s3loggingbucket.libsonnet
│  ├─ tags.libsonnet
├─ template.jsonnet

Let's look at all of the constituent files:

`template.jsonnet`

This is the root document which will be processed by Jsonnet to render a legal CloudFormation JSON template. It will import the other files in the includes directory.

`includes/tags.libsonnet`

This code module is used to generate re-usable tags for other resources (DRY).

`includes/vpc.libsonnet`

This code module defines a VPC resource to be created with CloudFormation.

`includes/s3loggingbucket.libsonnet`

This code module defines an S3 bucket resource to be created in the stack which will be used for logging for other buckets.

`includes/s3landingbucket.libsonnet`

This code module defines an S3 landing bucket resource to be created in the stack.

Testing

To test the pre-processing, you will need a Jsonnet binary/executable for your environment. You can find Docker images which include this for you, or you could build it yourself.

Once you have a compiled binary, you can run the following to generate a rendered CloudFormation template.

jsonnet template.jsonnet -o template.json

You can validate this template using the AWS CLI as shown here:

aws cloudformation validate-template --template-body file://template.json
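Before calling validate-template, a quick local check can catch an import that rendered to nothing. The sketch below is in Python and only checks two structural rules: the template needs a non-empty Resources section, and every resource needs a Type. The function name is hypothetical.

```python
import json


def check_template(template: dict) -> list[str]:
    """Return a list of structural problems found in a rendered template."""
    resources = template.get("Resources")
    if not isinstance(resources, dict) or not resources:
        return ["template has no Resources section"]
    problems = []
    for logical_id, resource in resources.items():
        if not isinstance(resource, dict) or "Type" not in resource:
            problems.append(f"resource {logical_id} has no Type")
    return problems
```

Load the rendered file with `json.load(open("template.json"))` and pass the result in; an empty list means both checks passed.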

Deployment

In a previous article, Simplified AWS Deployments with CloudFormation and GitLab CI, I demonstrated an end-to-end deployment pipeline using GitLab CI. Jsonnet pre-processing can be added to this pipeline as an initial ‘preprocess’ stage and job. A snippet from the .gitlab-ci.yml file is included here:

Enjoy!


Simplified AWS Deployments with CloudFormation and GitLab CI

November 11, 2021 · 3 min read

Jeffrey Aven

Technologist and Cloud Consultant

Managing cloud deployments and IaC pipelines can be challenging. I’ve put together a simple pattern for deploying stacks in AWS with CloudFormation templates and GitLab CI.

This deployment framework enables you to target different environments based upon refs (branches or tags). For instance, deploy to a dev environment on a push or merge into develop and deploy to prod on a push or merge into main; otherwise just lint/validate (e.g., for a push to a non-protected feature branch). Templates are uploaded to a designated S3 bucket and staged for use in the pipeline, and can be retained as an additional audit trail (in addition to the GitLab project history).
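The ref-to-environment rule can be sketched as a small Python function. In the pipeline itself this decision would normally be expressed as rules in .gitlab-ci.yml against the predefined CI_COMMIT_REF_NAME variable; the function name and return shape here are hypothetical.

```python
from typing import Optional


def target_environment(ref: str) -> Optional[str]:
    """Return the environment a ref deploys to, or None for validate-only refs."""
    environments = {"develop": "dev", "main": "prod"}
    return environments.get(ref)
```

A ref that maps to None (a feature branch, for example) only goes through the lint/validate stage.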

Furthermore, you can review changes (by inspecting change set contents) before deploying, saving you from fat finger deployments 😊.

How it works

The logic is described here:

PlantUML

@startuml

partition prepare {
  (*) --> === S1 ===
  === S1 === --> "Validate Template"
  --> === S2 ===
  === S1 === --> "Check Stack State"
  --> === S2 ===
}

partition publish {
  --> "Publish Template to S3"
}

partition plan {
  --> "Stack Exists?"
  --> === S3 ===
  === S3 === --> [Yes] "Create Change Set"
  === S3 === --> [No] === S4 ===
  "Create Change Set" --> === S4 ===
}

partition deploy {
  --> "MANUAL: Review Changes"
  --> "Deploy Change Set"
}

-->(*)

@enduml

The pipeline looks like this in GitLab:

Prerequisites

You will need to set up GitLab CI variables for AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and optionally AWS_DEFAULT_REGION. You can do this via Settings -> CI/CD -> Variables in your GitLab project. As AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are secrets, they should be configured as protected (as they are only required for protected branches) and masked so they are not printed in job logs.

`.gitlab-ci.yml` code

The GitLab CI code is shown here:

Reviewing change sets (plans) and applying

Once a pipeline is triggered for an existing stack, it will run hands-off until a change set (plan) is created. You can inspect the plan by clicking on the Plan GitLab CI job where you would see output like this:

If you are OK with the changes proposed, you can simply hit the play button on the last stage of the pipeline (Deploy). Voilà, stack deployed, enjoy!
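If you want a shorter summary of a plan than the raw job output, the JSON returned by `aws cloudformation describe-change-set` can be condensed to one line per resource. A sketch in Python, with a hypothetical function name:

```python
import json


def summarise_changes(describe_output: str) -> list[str]:
    """Condense describe-change-set JSON into one line per resource change."""
    changes = json.loads(describe_output).get("Changes", [])
    lines = []
    for change in changes:
        resource = change.get("ResourceChange", {})
        lines.append(
            f"{resource.get('Action')} {resource.get('LogicalResourceId')} "
            f"({resource.get('ResourceType')})"
        )
    return lines
```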

