s3tk - A Security Toolkit for Amazon S3


A security toolkit for Amazon S3
Another day, another leaky Amazon S3 bucket
— The Register, 12 Jul 2017
Don’t be the... next... big... data... leak

Battle-tested at Instacart

Installation
Run:
pip install s3tk
You can use the AWS CLI to set up your AWS credentials:
pip install awscli
aws configure
See IAM policies needed for each command.

Commands

Scan
Scan your buckets for:
  • ACL open to public
  • policy open to public
  • logging enabled
  • versioning enabled
  • default encryption enabled
s3tk scan
Only run on specific buckets
s3tk scan my-bucket my-bucket-2
Also works with wildcards
s3tk scan "my-bucket*"
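To make the wildcard behavior concrete, here is a minimal Python sketch of shell-style matching on bucket names. It assumes the patterns behave like ordinary glob patterns; the helper name `select_buckets` is hypothetical and this is not necessarily how s3tk implements it.

```python
from fnmatch import fnmatchcase

def select_buckets(all_buckets, patterns):
    """Return the bucket names matching any shell-style pattern, in order.

    An empty pattern list means "all buckets", mirroring a bare `s3tk scan`.
    """
    if not patterns:
        return list(all_buckets)
    return [b for b in all_buckets if any(fnmatchcase(b, p) for p in patterns)]

# Made-up bucket names for illustration
buckets = ["my-bucket", "my-bucket-2", "other-bucket"]
print(select_buckets(buckets, ["my-bucket*"]))
```

Quoting the pattern on the command line matters for the same reason: it stops the shell from expanding `*` against local filenames before s3tk sees it.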
Confirm correct log bucket(s) and prefix
s3tk scan --log-bucket my-s3-logs --log-bucket other-region-logs --log-prefix "{bucket}/"
Skip logging, versioning, or default encryption
s3tk scan --skip-logging --skip-versioning --skip-default-encryption
Get email notifications of failures (via SNS)
s3tk scan --sns-topic arn:aws:sns:...
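As a rough illustration of what a "policy open to public" check might look for, the sketch below flags `Allow` statements whose principal is the wildcard `*`. The function names are hypothetical, and the sketch deliberately ignores `Condition` blocks, so it may flag statements that are in fact restricted. It is not a description of s3tk's internal logic.

```python
import json

def is_public_principal(principal):
    """True if the principal is "*" or {"AWS": "*"} (or a list containing "*")."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False

def public_statements(policy_json):
    """Return Allow statements that grant access to everyone.

    Conditions are not evaluated: this is a conservative sketch.
    """
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow" and is_public_principal(s.get("Principal"))
    ]

# Made-up policy for illustration
policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Principal": "*", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::my-bucket/*"},
        {"Effect": "Deny", "Principal": "*", "Action": "s3:PutObjectAcl",
         "Resource": "arn:aws:s3:::my-bucket/*"},
    ],
})
print([s["Action"] for s in public_statements(policy)])
```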

List Policy
List bucket policies
s3tk list-policy
Only run on specific buckets
s3tk list-policy my-bucket my-bucket-2
Show named statements
s3tk list-policy --named

Set Policy
Note: This replaces the previous policy
Only private uploads
s3tk set-policy my-bucket --no-object-acl

Delete Policy
Delete policy
s3tk delete-policy my-bucket

Enable Logging
Enable logging on all buckets
s3tk enable-logging --log-bucket my-s3-logs
Only on specific buckets
s3tk enable-logging my-bucket my-bucket-2 --log-bucket my-s3-logs
Set log prefix ({bucket}/ by default)
s3tk enable-logging --log-bucket my-s3-logs --log-prefix "logs/{bucket}/"
Use the --dry-run flag to test
A few notes about logging:
  • buckets with logging already enabled are not updated at all
  • the log bucket must be in the same region as the source bucket - run this command multiple times for different regions
  • it can take over an hour for logs to show up
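Because the log bucket has to live in the same region as the source bucket, one way to plan the per-region runs is to group buckets by region first. The sketch below assumes you already have a mapping of bucket to region and of region to log bucket (both made up here), and that `{bucket}` in the prefix is a plain substitution of the source bucket name. It only builds command strings, with `--dry-run` appended so nothing changes until you remove it.

```python
import shlex
from collections import defaultdict

def expand_log_prefix(template, bucket):
    """Substitute the source bucket name into a prefix template (assumed behavior)."""
    return template.replace("{bucket}", bucket)

def logging_commands(bucket_regions, log_buckets, log_prefix="{bucket}/"):
    """Build one `s3tk enable-logging` command per region."""
    by_region = defaultdict(list)
    for bucket, region in bucket_regions.items():
        by_region[region].append(bucket)

    commands = []
    for region in sorted(by_region):
        args = ["s3tk", "enable-logging", *sorted(by_region[region]),
                "--log-bucket", log_buckets[region],
                "--log-prefix", log_prefix, "--dry-run"]
        # shlex.quote keeps "{bucket}/" intact when pasted into a shell
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands

# Hypothetical inventory
bucket_regions = {"my-bucket": "us-east-1", "my-bucket-2": "eu-west-1"}
log_buckets = {"us-east-1": "my-s3-logs", "eu-west-1": "other-region-logs"}

for command in logging_commands(bucket_regions, log_buckets):
    print(command)
print(expand_log_prefix("logs/{bucket}/", "my-bucket"))
```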

Enable Versioning
Enable versioning on all buckets
s3tk enable-versioning
Only on specific buckets
s3tk enable-versioning my-bucket my-bucket-2
Use the --dry-run flag to test

Enable Default Encryption
Enable default encryption on all buckets
s3tk enable-default-encryption
Only on specific buckets
s3tk enable-default-encryption my-bucket my-bucket-2
This does not encrypt existing objects - use the encrypt command for this
Use the --dry-run flag to test

Scan Object ACL
Scan ACL on all objects in a bucket
s3tk scan-object-acl my-bucket
Only certain objects
s3tk scan-object-acl my-bucket --only "*.pdf"
Except certain objects
s3tk scan-object-acl my-bucket --except "*.jpg"
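For context on what an open object ACL looks like, the sketch below inspects an ACL in the shape returned by the S3 `GetObjectAcl` API (a `Grants` list with a `Grantee` and a `Permission`) and reports grants to the two predefined public groups. The sample ACL is made up and the helper name is hypothetical; s3tk's own checks may differ.

```python
# Predefined S3 groups that make a grant effectively public
PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}

def public_grants(acl):
    """Return the permissions granted to public groups in an ACL dict."""
    return [
        grant["Permission"]
        for grant in acl.get("Grants", [])
        if grant.get("Grantee", {}).get("Type") == "Group"
        and grant["Grantee"].get("URI") in PUBLIC_GROUPS
    ]

# Made-up ACL: the owner has full control, and everyone can read
acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group",
                     "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
         "Permission": "READ"},
    ]
}
print(public_grants(acl))
```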

Reset Object ACL
Reset ACL on all objects in a bucket
s3tk reset-object-acl my-bucket
This makes all objects private. See bucket policies for how to enforce going forward.
Use the --dry-run flag to test
Specify certain objects the same way as scan-object-acl

Encrypt
Encrypt all objects in a bucket with server-side encryption
s3tk encrypt my-bucket
Uses S3-managed keys by default. For KMS-managed keys, use:
s3tk encrypt my-bucket --kms-key-id arn:aws:kms:...
For customer-provided keys, use:
s3tk encrypt my-bucket --customer-key secret-key
Use the --dry-run flag to test
Specify certain objects the same way as scan-object-acl
Note: Objects will lose any custom ACL

Delete Unencrypted Versions
Delete all unencrypted versions of objects in a bucket
s3tk delete-unencrypted-versions my-bucket
For safety, this will not delete any current versions of objects
Use the --dry-run flag to test
Specify certain objects the same way as scan-object-acl

Scan DNS
Scan Route 53 for buckets to make sure you own them
s3tk scan-dns
Otherwise, you may be susceptible to subdomain takeover
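The idea behind this check can be sketched as follows. When a DNS record points at an S3 website endpoint, S3 serves the bucket whose name matches the record name, so if you do not own that bucket, someone else can create it and serve content on your domain. The record tuples, the endpoint pattern, and the function name below are illustrative assumptions, not s3tk's actual implementation.

```python
import re

# Matches S3 website endpoints such as s3-website-us-east-1.amazonaws.com
# or s3-website.eu-central-1.amazonaws.com, with an optional bucket prefix
S3_WEBSITE_TARGET = re.compile(r"(^|\.)s3-website[.-][a-z0-9-]+\.amazonaws\.com\.?$")

def takeover_candidates(records, owned_buckets):
    """Return record names that point at S3 but whose bucket you do not own.

    `records` is a list of (record_name, target) pairs.
    """
    owned = set(owned_buckets)
    risky = []
    for name, target in records:
        if S3_WEBSITE_TARGET.search(target.lower()):
            bucket = name.rstrip(".").lower()  # bucket name must equal the domain
            if bucket not in owned:
                risky.append(bucket)
    return risky

# Made-up DNS records
records = [
    ("www.example.com.", "s3-website-us-east-1.amazonaws.com."),
    ("assets.example.com.", "assets.example.com.s3-website-us-east-1.amazonaws.com"),
    ("api.example.com.", "lb.example.net."),
]
print(takeover_candidates(records, owned_buckets=["www.example.com"]))
```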

Credentials
Credentials can be specified in ~/.aws/credentials or with environment variables. See this guide for an explanation of environment variables.
You can specify a profile to use with:
AWS_PROFILE=your-profile s3tk

IAM Policies
Here are the permissions needed for each command. Only include statements you need.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Scan",
"Effect": "Allow",
"Action": [
"s3:ListAllMyBuckets",
"s3:GetBucketAcl",
"s3:GetBucketPolicy",
"s3:GetBucketLogging",
"s3:GetBucketVersioning",
"s3:GetEncryptionConfiguration"
],
"Resource": "*"
},
{
"Sid": "ScanDNS",
"Effect": "Allow",
"Action": [
"s3:ListAllMyBuckets",
"route53:ListHostedZones",
"route53:ListResourceRecordSets"
],
"Resource": "*"
},
{
"Sid": "ListPolicy",
"Effect": "Allow",
"Action": [
"s3:ListAllMyBuckets",
"s3:GetBucketPolicy"
],
"Resource": "*"
},
{
"Sid": "SetPolicy",
"Effect": "Allow",
"Action": [
"s3:PutBucketPolicy"
],
"Resource": "*"
},
{
"Sid": "DeletePolicy",
"Effect": "Allow",
"Action": [
"s3:DeleteBucketPolicy"
],
"Resource": "*"
},
{
"Sid": "EnableLogging",
"Effect": "Allow",
"Action": [
"s3:ListAllMyBuckets",
"s3:PutBucketLogging"
],
"Resource": "*"
},
{
"Sid": "EnableVersioning",
"Effect": "Allow",
"Action": [
"s3:ListAllMyBuckets",
"s3:PutBucketVersioning"
],
"Resource": "*"
},
{
"Sid": "EnableDefaultEncryption",
"Effect": "Allow",
"Action": [
"s3:ListAllMyBuckets",
"s3:PutEncryptionConfiguration"
],
"Resource": "*"
},
{
"Sid": "ResetObjectAcl",
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:GetObjectAcl",
"s3:PutObjectAcl"
],
"Resource": [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
]
},
{
"Sid": "Encrypt",
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:GetObject",
"s3:PutObject"
],
"Resource": [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
]
},
{
"Sid": "DeleteUnencryptedVersions",
"Effect": "Allow",
"Action": [
"s3:ListBucketVersions",
"s3:GetObjectVersion",
"s3:DeleteObjectVersion"
],
"Resource": [
"arn:aws:s3:::my-bucket",
"arn:aws:s3:::my-bucket/*"
]
}
]
}
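Since only the statements you need should be included, one option is to keep the full policy in a file and filter it by `Sid` before attaching it. The sketch below shows that filtering on a shortened copy of the policy above; the function name `select_statements` is hypothetical.

```python
import json

def select_statements(policy, sids):
    """Return a copy of the policy containing only the named statements."""
    wanted = set(sids)
    statements = [s for s in policy["Statement"] if s.get("Sid") in wanted]
    missing = wanted - {s["Sid"] for s in statements}
    if missing:
        raise KeyError(f"unknown Sid(s): {sorted(missing)}")
    return {"Version": policy["Version"], "Statement": statements}

# Shortened version of the full policy, for illustration
full_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "ListPolicy", "Effect": "Allow",
         "Action": ["s3:ListAllMyBuckets", "s3:GetBucketPolicy"], "Resource": "*"},
        {"Sid": "SetPolicy", "Effect": "Allow",
         "Action": ["s3:PutBucketPolicy"], "Resource": "*"},
        {"Sid": "DeletePolicy", "Effect": "Allow",
         "Action": ["s3:DeleteBucketPolicy"], "Resource": "*"},
    ],
}

# A read-only user only needs the ListPolicy statement
print(json.dumps(select_statements(full_policy, ["ListPolicy"]), indent=2))
```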

Access Logs
Amazon Athena is great for querying S3 logs. Create a table (thanks to this post for the table structure) with:
CREATE EXTERNAL TABLE my_bucket (
bucket_owner string,
bucket string,
time string,
remote_ip string,
requester string,
request_id string,
operation string,
key string,
request_verb string,
request_url string,
request_proto string,
status_code string,
error_code string,
bytes_sent string,
object_size string,
total_time string,
turn_around_time string,
referrer string,
user_agent string,
version_id string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1',
'input.regex' = '([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" (-|[0-9]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\\") ([^ ]*)$'
) LOCATION 's3://my-s3-logs/my-bucket/';
Change the last line to point to your log bucket (and prefix) and query away
SELECT
date_parse(time, '%d/%b/%Y:%H:%i:%S +0000') AS time,
request_url,
remote_ip,
user_agent
FROM
my_bucket
WHERE
requester = '-'
AND status_code LIKE '2%'
AND request_url LIKE '/some-keys%'
ORDER BY 1
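For a quick look at a handful of log files without Athena, the same fields can be pulled out locally. The sketch below is a simplified Python counterpart to the table's regex and the query above, run against a made-up log line. It assumes the request field has the usual `"VERB URL PROTOCOL"` shape, so lines where that field is just `"-"` will not match.

```python
import re
from datetime import datetime

FIELDS = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester", "request_id",
    "operation", "key", "request_verb", "request_url", "request_proto",
    "status_code", "error_code", "bytes_sent", "object_size", "total_time",
    "turn_around_time", "referrer", "user_agent", "version_id",
]

# One capture group per field, in the same order as the table columns
LOG_LINE = re.compile(
    r'(\S+) (\S+) \[(.*?)\] (\S+) (\S+) (\S+) (\S+) (\S+) '
    r'"(\S+) (\S+) (\S+)" (\S+) (\S+) (\S+) (\S+) (\S+) (\S+) '
    r'"([^"]*)" "([^"]*)" (\S+)'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_LINE.match(line)
    return dict(zip(FIELDS, match.groups())) if match else None

def is_anonymous_hit(row, url_prefix):
    """Same filter as the WHERE clause: unauthenticated, successful, under a prefix."""
    return (row["requester"] == "-"
            and row["status_code"].startswith("2")
            and row["request_url"].startswith(url_prefix))

# Made-up log line for illustration
sample = ('abc123 my-bucket [12/Jul/2017:10:15:30 +0000] 203.0.113.7 - REQ1EXAMPLE '
          'REST.GET.OBJECT some-keys/report.pdf "GET /some-keys/report.pdf HTTP/1.1" '
          '200 - 1024 1024 15 14 "-" "curl/7.54.0" -')

row = parse_line(sample)
when = datetime.strptime(row["time"], "%d/%b/%Y:%H:%M:%S %z")
print(when.isoformat(), row["request_url"], row["remote_ip"], row["user_agent"])
print(is_anonymous_hit(row, "/some-keys"))
```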

CloudTrail Logs
Amazon Athena is also great for querying CloudTrail logs. Create a table (thanks to this post for the table structure) with:
CREATE EXTERNAL TABLE cloudtrail_logs (
eventversion STRING,
userIdentity STRUCT<
type:STRING,
principalid:STRING,
arn:STRING,
accountid:STRING,
invokedby:STRING,
accesskeyid:STRING,
userName:String,
sessioncontext:STRUCT<
attributes:STRUCT<
mfaauthenticated:STRING,
creationdate:STRING>,
sessionIssuer:STRUCT<
type:STRING,
principalId:STRING,
arn:STRING,
accountId:STRING,
userName:STRING>>>,
eventTime STRING,
eventSource STRING,
eventName STRING,
awsRegion STRING,
sourceIpAddress STRING,
userAgent STRING,
errorCode STRING,
errorMessage STRING,
requestId STRING,
eventId STRING,
resources ARRAY<STRUCT<
ARN:STRING,
accountId:STRING,
type:STRING>>,
eventType STRING,
apiVersion STRING,
readOnly BOOLEAN,
recipientAccountId STRING,
sharedEventID STRING,
vpcEndpointId STRING,
requestParameters STRING,
responseElements STRING,
additionalEventData STRING,
serviceEventDetails STRING
)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://my-cloudtrail-logs/'
Change the last line to point to your CloudTrail log bucket and query away
SELECT
eventTime,
eventName,
userIdentity.userName,
requestParameters
FROM
cloudtrail_logs
WHERE
eventName LIKE '%Bucket%'
ORDER BY 1
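The same filter can be applied to a single CloudTrail log file locally. CloudTrail delivers log files as JSON documents with a top-level `Records` array; the sketch below runs the equivalent of the query above on a made-up, already-decompressed document. The function name is hypothetical.

```python
import json

def bucket_events(log_json):
    """Return (eventTime, eventName, userName, requestParameters) for bucket events."""
    records = json.loads(log_json).get("Records", [])
    rows = [
        (r["eventTime"], r["eventName"],
         r.get("userIdentity", {}).get("userName"), r.get("requestParameters"))
        for r in records
        if "Bucket" in r.get("eventName", "")
    ]
    # eventTime is an ISO 8601 UTC string, so string order is chronological
    return sorted(rows, key=lambda row: row[0])

# Made-up records for illustration
log_json = json.dumps({"Records": [
    {"eventTime": "2017-07-12T10:00:00Z", "eventName": "PutBucketPolicy",
     "userIdentity": {"userName": "alice"},
     "requestParameters": {"bucketName": "my-bucket"}},
    {"eventTime": "2017-07-12T09:00:00Z", "eventName": "ConsoleLogin",
     "userIdentity": {"userName": "alice"}, "requestParameters": None},
    {"eventTime": "2017-07-12T08:00:00Z", "eventName": "DeleteBucket",
     "userIdentity": {"userName": "bob"},
     "requestParameters": {"bucketName": "old-bucket"}},
]})

for event in bucket_events(log_json):
    print(event)
```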

Best Practices
Keep things simple and follow the principle of least privilege to reduce the chance of mistakes.
  • Strictly limit who can perform bucket-related operations
  • Avoid mixing objects with different permissions in the same bucket (use a bucket policy to enforce this)
  • Don’t specify public read permissions on a bucket level (no GetObject in bucket policy)
  • Monitor configuration frequently for changes

Bucket Policies
Only private uploads
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Deny",
"Principal": "*",
"Action": "s3:PutObjectAcl",
"Resource": "arn:aws:s3:::my-bucket/*"
}
]
}
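If you manage several buckets, the policy above can be generated per bucket rather than edited by hand. A minimal sketch, with a hypothetical helper name, that only fills the bucket name into the resource ARN:

```python
import json

def private_uploads_policy(bucket):
    """Build the "only private uploads" policy for one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObjectAcl",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }

print(json.dumps(private_uploads_policy("my-bucket"), indent=2))
```

Keep in mind that setting a bucket policy replaces the previous one, so merge in any existing statements before applying the result.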

Notes
The set-policy, enable-logging, enable-versioning, and enable-default-encryption commands are provided for convenience. We recommend Terraform for managing your buckets.
resource "aws_s3_bucket" "my_bucket" {
bucket = "my-bucket"
acl = "private"

logging {
target_bucket = "my-s3-logs"
target_prefix = "my-bucket/"
}

versioning {
enabled = true
}
}

Upgrading
Run:
pip install s3tk --upgrade
To use master, run:
pip install git+https://github.com/ankane/s3tk.git --upgrade

Docker
Run:
docker run -it ankane/s3tk aws configure
Commit your credentials:
docker commit $(docker ps -l -q) my-s3tk
And run:
docker run -it my-s3tk s3tk scan

History
View the changelog

