Say you're building an application and you need to let your users upload files to S3. How would you go about it?
Particularly, imagine our clients are uploading a lot of files of different sizes, and are sensitive to latency.
Let's walk through a journey of AWS APIs, and explore a little known feature of S3 called POST Policies.
Presigned URLs
Your immediate instinct may be to use S3 presigned URLs. Presigned URLs let you create a URL that you can share to allow a user to download from or upload to an S3 bucket. You create the presigned URL server side using IAM credentials that have the required S3 permissions and then share the URL to allow user actions. Clients simply use an HTTP client to connect to the URL. You set an expiry time to keep the access short lived (with SigV4, at most seven days), and the URL can never do more than the signing credentials allow, so you can limit the client's permissions by signing with narrowly scoped credentials.
All in all, presigned URLs are pretty powerful and sound like a great choice for allowing credential-less S3 actions. They have one major limitation, however. S3 presigned upload URLs require you to know the content length beforehand. Because of the way presigned URLs work, using AWS4 signatures, the Content-Length is a required component when generating the presigned URL. This means we can't return one presigned URL and let our clients upload objects of variable size until it expires.
As a workaround, we can have the client request a presigned URL every time they want to perform an upload. This is not necessarily a big deal, and may be perfectly suitable for many scenarios.
However, this results in an additional call for every upload and may not be desirable. For example, say we want our clients to have a short lived session where they can upload a large number of objects, and latency is important.
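To make the per-upload workaround concrete, here is a minimal server-side sketch. It assumes `s3_client` is a boto3 S3 client (for example `boto3.client("s3")`); the function name, the key layout, and the five minute default are illustrative choices, not something the approach prescribes.

```python
from uuid import uuid4


def presigned_upload_url(s3_client, bucket, client_id, expires_in=300):
    """Return (key, url) for a single upload; called once per object."""
    # Hypothetical key layout: one random object name under the client's prefix.
    key = f"{client_id}/{uuid4()}"
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL, in seconds
    )
    return key, url
```

The client then sends the object body to the returned URL with a plain HTTP PUT. Every object needs its own call to this function, which is the extra round trip we'd like to avoid.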
A new API using temporary credentials
Presigned URLs don't seem to work well for our use case, so let's try a new approach. AWS offers APIs, through the Security Token Service (STS), to request temporary security credentials. There are a few different APIs, but let's see if we can use AssumeRole to return temporary credentials to our client.
As an approach, we can call STS AssumeRole server side and return temporary IAM credentials to our client. We can use IAM session policies to limit the S3 permissions the client has access to. We can also use the DurationSeconds parameter to limit the validity of the credentials, but it can't go below the minimum of 15 minutes.
Our clients would then use the credentials and upload files using the AWS SDK. If we're offering this as a part of our API, we'd likely want to write a language native client that wraps the AWS SDK and takes care of refreshing the credentials.
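As a rough sketch of what that server-side call could look like, assuming `sts_client` is a boto3 STS client (for example `boto3.client("sts")`): the function name, the session name format, the key prefix layout, and the shape of the returned dictionary are all hypothetical.

```python
import json


def issue_upload_credentials(sts_client, role_arn, bucket, client_id):
    """Assume a role with a session policy scoped to one client's prefix."""
    # The session policy can only narrow what the role itself allows.
    session_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/{client_id}/*",
            }
        ],
    }
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=f"upload-{client_id}",  # assumes client_id is a short, simple string
        Policy=json.dumps(session_policy),
        DurationSeconds=900,  # 15 minutes, the shortest duration AssumeRole accepts
    )
    creds = response["Credentials"]
    return {
        "accessKeyId": creds["AccessKeyId"],
        "secretAccessKey": creds["SecretAccessKey"],
        "sessionToken": creds["SessionToken"],
        "expiration": creds["Expiration"].isoformat(),
    }
```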
Security-wise, you may feel icky having to return credentials using your API, but the approach is generally sound as long as the credentials are short lived and you are giving access to authenticated callers with narrow permissions.
This approach works, but besides having to implement this API, comes with several downsides:
- We're handing raw credentials to our clients, rather than something signed and scoped like a presigned URL.
- Our client needs to depend on heavyweight AWS SDKs, rather than a simple HTTP Client.
- We can't control the size of the object our client uploads. This can be a concern with untrusted or semi-trusted clients, who could upload very large files to our S3 bucket.
Enter POST Policy
We're dissatisfied with our previous approach. Let's explore a lesser-known Amazon S3 feature: POST Policy. You might be thinking, "POST? Isn't it S3 PUT Object?", and you're right, but Amazon actually introduced a POST API for uploading S3 objects to enable browser-based S3 uploads.
As a note, you'll generally hear of POST Policy in the context of browser based uploads, but there's nothing inherent preventing us from using it in any environment.
A POST policy is essentially a JSON document that you create, sign, and return to your clients to specify what conditions are required for a successful POST object upload.
What does a POST policy look like?
POST policies are fairly powerful: you can specify the exact date time the policy expires and include conditions on properties like the ACL, bucket, key prefix, and content length range (minimum and maximum). You can also use operators like "starts-with" in addition to exact matches to add dynamic logic to your policy.
Let's take a look at an example POST policy.
{
  "expiration": "2022-02-14T13:08:46.864Z",
  "conditions": [
    { "acl": "bucket-owner-full-control" },
    { "bucket": "my-bucket" },
    ["starts-with", "$key", "stuff/clientId"],
    ["content-length-range", 1048576, 10485760]
  ]
}
The policy results in the following conditions:
- The policy expires on Mon Feb 14 2022 13:08:46 UTC. After this time, any client using the policy to perform a POST upload will get a 403 error.
- The ACL on the object must be "bucket-owner-full-control", which ensures we, as the bucket owner, have full control of uploaded objects.
- We name a specific bucket (my-bucket) to allow uploads to.
- The S3 key of the uploaded object must have a specific prefix. Here, we use it to ensure our client only has permission to upload under their prefix by specifying their client ID as the key prefix.
- The uploaded object must be between 1MB and 10MB in size.
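To build some intuition for how these conditions behave, here is a toy checker that applies the example policy to an upload locally. It is only an illustration and not how S3 implements the check: `check_upload` and `EXAMPLE_POLICY` are made-up names, the bucket is passed in as a plain field, and the expiration is not evaluated.

```python
EXAMPLE_POLICY = {
    "expiration": "2022-02-14T13:08:46.864Z",
    "conditions": [
        {"acl": "bucket-owner-full-control"},
        {"bucket": "my-bucket"},
        ["starts-with", "$key", "stuff/clientId"],
        ["content-length-range", 1048576, 10485760],
    ],
}


def check_upload(policy, fields, content_length):
    """Return a list of violated conditions (an empty list means the upload passes)."""
    violations = []
    for condition in policy["conditions"]:
        if isinstance(condition, dict):
            # {"field": "value"} is shorthand for an exact match.
            for name, expected in condition.items():
                if fields.get(name) != expected:
                    violations.append(f"{name} must equal {expected!r}")
            continue
        operator, first, second = condition
        if operator == "content-length-range":
            if not first <= content_length <= second:
                violations.append(f"size must be between {first} and {second} bytes")
            continue
        actual = fields.get(first.lstrip("$"), "")
        if operator == "eq" and actual != second:
            violations.append(f"{first} must equal {second!r}")
        elif operator == "starts-with" and not actual.startswith(second):
            violations.append(f"{first} must start with {second!r}")
    return violations


upload = {"acl": "bucket-owner-full-control", "bucket": "my-bucket",
          "key": "stuff/clientId/report.json"}
print(check_upload(EXAMPLE_POLICY, upload, 2 * 1024 * 1024))  # no violations
print(check_upload(EXAMPLE_POLICY, upload, 512))              # too small
```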
Signing the policy
After forming a POST policy, we have to sign the policy using valid IAM credentials (with the requisite permissions), similar to how we sign presigned URLs. The complete procedure for calculating the signature is described in the AWS documentation, but it essentially involves Base64 encoding the policy and signing it using AWS SigV4. Unfortunately, unlike presigned URLs, not every AWS SDK provides a helper method to create the POST policy. boto3, for example, offers generate_presigned_post, but at the time of writing the Java SDK has no equivalent. Where a helper is missing, you can write it yourself or consult the few examples and community libraries out there. If you're using Java/JVM, check out Minio's implementation as a well maintained reference.
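For illustration, the signing steps can be sketched with nothing but the Python standard library. `sign_post_policy` and its parameters are hypothetical names. The sketch also appends conditions for the `x-amz-*` fields, on the understanding that S3 expects every form field other than the file, the policy, and the signature to be covered by a condition. Treat it as a starting point, not a vetted implementation.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def sign_post_policy(conditions, access_key, secret_key, region,
                     expires_in=900, session_token=None, now=None):
    """Return the signing-related form fields for a SigV4 POST upload."""
    now = now or datetime.now(timezone.utc)
    date_stamp = now.strftime("%Y%m%d")
    fields = {
        "x-amz-algorithm": "AWS4-HMAC-SHA256",
        "x-amz-credential": f"{access_key}/{date_stamp}/{region}/s3/aws4_request",
        "x-amz-date": now.strftime("%Y%m%dT%H%M%SZ"),
    }
    if session_token:  # only needed when signing with temporary credentials
        fields["x-amz-security-token"] = session_token

    expiration = now + timedelta(seconds=expires_in)
    policy = {
        "expiration": expiration.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Caller-supplied conditions plus one exact-match condition per signing field.
        "conditions": list(conditions) + [{k: v} for k, v in fields.items()],
    }
    encoded = base64.b64encode(json.dumps(policy).encode("utf-8")).decode("ascii")

    # Derive the SigV4 signing key: date -> region -> service -> "aws4_request".
    key = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    key = _hmac_sha256(key, region)
    key = _hmac_sha256(key, "s3")
    key = _hmac_sha256(key, "aws4_request")
    # The string to sign is the Base64-encoded policy itself.
    signature = hmac.new(key, encoded.encode("utf-8"), hashlib.sha256).hexdigest()

    return {**fields, "policy": encoded, "x-amz-signature": signature}
```

The returned dictionary holds only the signing-related fields. Values such as `acl`, `key`, and `Content-Type` still have to be sent by the client and must satisfy the conditions passed in.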
Letting our clients perform POST uploads
Once we've created the POST policy, our clients can use it to perform POST S3 uploads. S3 POST uploads are multipart form data requests to the S3 bucket URL (e.g. https://examplebucket.s3-us-west-2.amazonaws.com/) containing the fields specified in the POST policy. As a code example in Python:
import requests
from uuid import uuid4

# The client first requests the POST policy form data from our API, e.g.:
#   post_policy_form_data = requests.get("<your API>/postPolicyFormData").json()
# The response looks like this:
post_policy_form_data = {
    "x-amz-date": "20220213T233352Z",
    "x-amz-signature": "efa9bbc<...>",
    "acl": "bucket-owner-full-control",
    "x-amz-security-token": "<...>",
    "x-amz-algorithm": "AWS4-HMAC-SHA256",
    "x-amz-credential": "ASIA<..>.",
    "policy": "eyJleHBpcmF...<base64 encoded policy>",
    "Content-Type": "application/json",
}

filename = str(uuid4()) + ".json"
# The key must start with the prefix the policy allows.
key = "stuff/clientId/" + filename
content = "file_content"

# Plain form fields go in `data`; requests writes them before the file part,
# which matters because S3 ignores form fields that come after `file`.
form_fields = {**post_policy_form_data, "key": key}
upload_url = "https://examplebucket.s3-us-west-2.amazonaws.com"
res = requests.post(upload_url, data=form_fields, files={"file": (filename, content)})
res.raise_for_status()
POST policies satisfy all of our requirements.
- We are using signed policies without raw credentials.
- Our clients can make HTTP requests without the AWS SDK.
- We can granularly control the expiration, permissions, and object properties.
- Our clients can upload objects as long as the POST policy is not expired without needing to make additional requests.
Conclusion
We took a short look at the S3 object upload landscape, and discovered a powerful feature called POST Policies.
Use cases
Let's take a look at some use cases that POST policies unlock:
Browser based uploads
Particularly for large files, POST policies provide a convenient way to let your clients upload files client side, without needing to go through a server proxy. This can be convenient if you're using API Gateway or Lambda, which have content size limits. Additionally, this can give your clients better upload speed.
Short lived low-latency upload sessions
As we discussed, we can use POST policies to let our clients maintain a short lived, controlled session with specific permissions where they can upload many objects of variable size, without any additional latency besides S3 latency.
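On the client side, such a session could be as simple as caching the form fields and only going back to our API when the policy is about to expire. The sketch below assumes a hypothetical `fetch_fields` callable that calls our API and returns the fields together with the policy's expiry as epoch seconds; neither that contract nor the `UploadSession` class comes from S3 itself.

```python
import time


class UploadSession:
    """Caches POST policy form fields and refetches them only near expiry."""

    def __init__(self, fetch_fields, refresh_margin=30.0, clock=time.time):
        # fetch_fields() is a hypothetical callable returning
        # (fields_dict, expires_at_epoch_seconds) from our own API.
        self._fetch_fields = fetch_fields
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._fields = None
        self._expires_at = 0.0

    def fields(self):
        """Return form fields for the next upload, refreshing them if needed."""
        about_to_expire = self._clock() >= self._expires_at - self._refresh_margin
        if self._fields is None or about_to_expire:
            self._fields, self._expires_at = self._fetch_fields()
        # Return a copy so callers can add a per-upload "key" safely.
        return dict(self._fields)
```

Each upload calls `session.fields()`, adds its own `key`, and posts to S3. Our API is only contacted again once the cached policy nears its expiry.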
Guidance
As a takeaway, if you are looking to incorporate S3 object uploads from your clients in your application, follow this general guidance:
- Prefer presigned URLs. They're simpler, are already supported in the AWS SDKs and are well documented.
- Use POST policies otherwise, for example if latency is important or you want form based browser uploads.
- Don't use the second approach we discussed (using AssumeRole). You can generally achieve the equivalent using a POST policy.
Are you using POST policies, or do you know of an interesting use case they enable? Feel free to reply.