Storing files in cloud storage is now a standard, and allowing users to upload files is a common feature for a web application. Specifically, files are generally uploaded to a server and then to a cloud storage service, or directly to the cloud storage service. When dealing with small files, this is an easy task to accomplish. On the other hand, when it comes to uploading large files, it can become challenging.
This is why Amazon S3 and other S3-compatible cloud storage services support multipart upload. This technique allows you to split a file into several small chunks and upload them sequentially or in parallel, which makes large files much easier to handle.
First, let’s learn what multipart upload is, how it works, and why the best approach to it involves S3 pre-signed URLs. Then, let’s see how to implement everything required to get started with multipart uploads, both backend and frontend, through a demo application built in Node.js and React.
You need these prerequisites to replicate the article’s examples:
aws-sdk is the AWS SDK for JavaScript, and you can install it with the following npm command:
npm install --save aws-sdk
As explained in the Amazon S3 documentation, multipart upload allows you to upload a single object as a set of parts, and it is typically used with large files. This technique allows you to split a file into many parts and upload them in parallel, pause and resume the operation whenever you need, and even start uploading an object before knowing its total size.
In other words, multipart upload enables faster, more controllable, and more flexible uploads into any Amazon S3-like cloud storage.
After uploading all parts of your object, you have to tell your cloud storage that the operation has been completed. Then, the data uploaded will be presented as a single object, equal to the uploaded file.
So, these are the steps involved in a multipart upload request:
This approach to uploading is generally used in the frontend to let users upload large files. In detail, the official documentation recommends considering the multipart approach once a file reaches 100 MB. But notice that you can use multipart upload with any file, regardless of its size.
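To make the splitting step more concrete, here is a minimal sketch that computes the byte range of every part for a given file size. The planParts() name is hypothetical and nothing here talks to S3; the only S3 fact it relies on is that every part except the last one must be at least 5 MB.

```javascript
// Minimum part size accepted by S3 for every part except the last one
const MIN_PART_SIZE = 5 * 1024 * 1024

// Hypothetical helper: returns the byte range of each part of a file.
// `end` is exclusive, so it can be passed directly to Blob.slice(start, end).
function planParts(fileSize, chunkSize = MIN_PART_SIZE) {
  if (chunkSize < MIN_PART_SIZE) {
    throw new Error("chunkSize must be at least 5 MB")
  }

  const parts = []
  for (let start = 0, partNumber = 1; start < fileSize; start += chunkSize, partNumber++) {
    parts.push({
      PartNumber: partNumber, // part numbers start from 1
      start,
      end: Math.min(start + chunkSize, fileSize),
    })
  }
  return parts
}
```

For a 12 MB file with the default chunk size, this yields three parts: two of 5 MB and a final one of 2 MB.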
The best frontend approach to multipart upload involves pre-signed URLs. An S3 pre-signed URL is a URL signed with an AWS access key that temporarily grants you restricted access to a particular S3 object. With an S3 pre-signed URL, you can perform a GET or PUT operation for a predefined time limit. By default, an S3 pre-signed URL expires in 15 minutes.
S3 pre-signed URLs are particularly useful because they allow you to keep your S3 credentials and buckets private, granting access to a resource for a limited time only. All you have to do is generate them in your backend and then provide them to the frontend side.
Therefore, pre-signed URLs are a secure way to upload any kind of file to your S3 bucket. Plus, they allow you to avoid creating and managing roles, as well as changing the bucket ACL or providing users with special accounts to upload files.
As you can imagine, they are especially useful when it comes to multipart uploads. This is because you can generate a pre-signed URL for each part your original object was split into, and then upload the part with its respective URL.
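As a rough sketch of what "one pre-signed URL per part" means in practice, the helper below builds the parameters for a single part, including an explicit lifetime. The buildUploadPartParams() name and its argument names are hypothetical; Expires is the aws-sdk option that sets the URL lifetime in seconds, and 900 matches the 15-minute default mentioned above.

```javascript
// Hypothetical helper: builds the params for one "uploadPart" pre-signed URL
function buildUploadPartParams({ bucket, key, uploadId, partNumber, expiresInSeconds = 900 }) {
  return {
    Bucket: bucket,
    Key: key,
    UploadId: uploadId,
    PartNumber: partNumber, // starts from 1
    Expires: expiresInSeconds, // 900 seconds = the 15-minute default
  }
}

// With an initialized AWS.S3 instance `s3`, the URL would then be generated with:
// const url = await s3.getSignedUrlPromise("uploadPart", buildUploadPartParams({ ... }))
```

For large parts on slow connections, a longer lifetime may be worth considering, since the URL has to remain valid when the upload of that part starts.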
If you want to follow an approach to multipart upload involving pre-signed URLs, you have to consider a new step. This is the list of all steps required:
Now, let’s delve into the pros and cons of the pre-signed URL approach to multipart upload.
You can clone the GitHub repository supporting this article and immediately try the demo application by launching the following commands:
git clone https://github.com/Tonel/multipart-upload-js-demo
cd multipart-upload-js-demo
npm i
npm run backend:start
npm run frontend:start
To make the application work, you have to address the “TODO” left in the /backend/controllers/upload.js file asking you to add your S3 credentials in the code.
Otherwise, keep following this step-by-step tutorial to learn how to build it and how it works.
To implement multipart upload with pre-signed URLs in S3, you need three APIs. But first, you need an AWS.S3 instance.
You can initialize it using your S3 credentials and the aws-sdk library as follows:
const AWS = require("aws-sdk")

const s3Endpoint = new AWS.Endpoint("<YOUR_ENDPOINT>")
const s3Credentials = new AWS.Credentials({
  accessKeyId: "<YOUR_ACCESS_KEY>",
  secretAccessKey: "<YOUR_SECRET_KEY>",
})

const s3 = new AWS.S3({
  endpoint: s3Endpoint,
  credentials: s3Credentials,
})
Now, let’s delve deeper into how to build the three APIs to implement multipart upload:
/uploads/initializeMultipartUpload
async function initializeMultipartUpload(req, res) {
  const { name } = req.body

  const multipartParams = {
    // BUCKET_NAME is the name of your bucket, as in the other two APIs
    Bucket: BUCKET_NAME,
    Key: `${name}`,
    ACL: "public-read",
  }

  const multipartUpload = await s3.createMultipartUpload(multipartParams).promise()

  res.send({
    fileId: multipartUpload.UploadId,
    fileKey: multipartUpload.Key,
  })
}
The name parameter passed in the body of the request represents the name of the file that will be created in the cloud storage at the end of the multipart upload operation. This API takes care of initializing a multipart upload request by calling the createMultipartUpload() function available from the previously created s3 object.
This API is necessary because to perform a multipart upload request, you need the UploadId value. This is used by the cloud storage service to associate all the parts involved in the multipart upload at the end of the process, while the Key parameter represents the full name of the file.
/uploads/getMultipartPreSignedUrls
async function getMultipartPreSignedUrls(req, res) {
  const { fileKey, fileId, parts } = req.body

  const multipartParams = {
    Bucket: BUCKET_NAME,
    Key: fileKey,
    UploadId: fileId,
  }

  const promises = []

  for (let index = 0; index < parts; index++) {
    promises.push(
      s3.getSignedUrlPromise("uploadPart", {
        ...multipartParams,
        PartNumber: index + 1,
      }),
    )
  }

  const signedUrls = await Promise.all(promises)

  // assign to each URL the index of the part to which it corresponds
  const partSignedUrlList = signedUrls.map((signedUrl, index) => {
    return {
      signedUrl: signedUrl,
      PartNumber: index + 1,
    }
  })

  res.send({
    parts: partSignedUrlList,
  })
}
This API is responsible for returning the pre-signed URLs associated with the parts involved in the multipart request. It requires the fileKey and fileId parameters retrievable from the previous API, and the number of parts the original file was split into.
Then, it uses this info to generate the S3 pre-signed URLs by calling the getSignedUrlPromise() function on the s3 object. As you can see, the PartNumber parameter telling which part the URL is associated with is an index starting from 1. Using an index starting from 0 would lead to an error.
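Since the number of parts comes from the client, it may be worth validating it before generating any URL. The sketch below is one way to do that; the buildPartNumbers() name is hypothetical, and the upper bound reflects the S3 limit of 10,000 parts per multipart upload.

```javascript
// S3 allows at most 10,000 parts per multipart upload
const MAX_PARTS = 10000

// Hypothetical helper: returns the list of valid part numbers [1, 2, ..., parts]
function buildPartNumbers(parts) {
  if (!Number.isInteger(parts) || parts < 1 || parts > MAX_PARTS) {
    throw new RangeError(`parts must be an integer between 1 and ${MAX_PARTS}`)
  }
  // index starts from 0, while part numbers start from 1
  return Array.from({ length: parts }, (_, index) => index + 1)
}
```

In the API above, the returned numbers could replace the manual index + 1 computation inside the loop.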
/uploads/finalizeMultipartUpload
// Lodash is required for orderBy()
const _ = require("lodash")

async function finalizeMultipartUpload(req, res) {
  const { fileId, fileKey, parts } = req.body

  const multipartParams = {
    Bucket: BUCKET_NAME,
    Key: fileKey,
    UploadId: fileId,
    MultipartUpload: {
      // ordering the parts to make sure they are in the right order
      Parts: _.orderBy(parts, ["PartNumber"], ["asc"]),
    },
  }

  const completeMultipartUploadOutput = await s3.completeMultipartUpload(multipartParams).promise()

  // completeMultipartUploadOutput.Location represents the
  // URL to the resource just uploaded to the cloud storage

  res.send()
}
This last API finalizes a multipart upload request. Again, it requires the fileId and fileKey coming from the first API. Also, it requires the parts parameter, which is a list of objects having the following type:
{
  PartNumber: number
  ETag: string
}
As defined in the official documentation, an ETag is an ID that identifies a newly created object’s data. As you will see soon, this can be retrieved in the response header of a successful upload request executed with a pre-signed URL.
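One practical caveat: a browser only lets JavaScript read the ETag response header of a cross-origin request if the bucket's CORS configuration exposes it. The rules below are a sketch of such a configuration, written as the object aws-sdk expects; the allowed origin is an assumption based on the demo running on localhost:3000, so adapt it to your setup.

```javascript
// Example CORS rules for the bucket; the origin is an assumption based on the demo setup
const corsRules = [
  {
    AllowedOrigins: ["http://localhost:3000"],
    AllowedMethods: ["PUT"], // parts are uploaded with PUT requests
    AllowedHeaders: ["*"],
    ExposeHeaders: ["ETag"], // lets the browser read the ETag response header
  },
]

// With an initialized AWS.S3 instance `s3`, the rules could be applied with:
// await s3.putBucketCors({
//   Bucket: BUCKET_NAME,
//   CORSConfiguration: { CORSRules: corsRules },
// }).promise()
```

The same rules can also be pasted, in JSON form, into the CORS settings of the bucket in your provider's console.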
Then, this data is used to call the completeMultipartUpload() function, which finalizes the multipart upload request and makes the uploaded object available in the cloud storage. Notice that the Parts field of MultipartUpload must have an ordered list, and the Lodash orderBy() function is used to ensure that.
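If you prefer not to depend on Lodash for this single call, the same ordering can be obtained with the built-in Array.prototype.sort(). This is only an alternative sketch, and the sortParts() name is hypothetical:

```javascript
// Returns a new array sorted by ascending PartNumber, leaving the input untouched
function sortParts(parts) {
  return [...parts].sort((a, b) => a.PartNumber - b.PartNumber)
}
```

The numeric comparator matters here: without it, sort() compares values as strings, so part 10 would end up before part 2.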
Now, you have everything required to start performing multipart upload requests on your frontend application. Let’s learn how.
Dealing with multipart upload on the frontend is a bit tricky. This is particularly true if you want to upload many parts in parallel and plan to provide users with the ability to abort the operation. So, instead of reinventing the wheel, you should adapt the utility class coming from this repository to your needs.
To be more specific, you can implement a utility class to perform multipart upload:
import axios from "axios"

// initializing axios
const api = axios.create({
  baseURL: "http://localhost:3000",
})

// original source: https://github.com/pilovm/multithreaded-uploader/blob/master/frontend/uploader.js
export class Uploader {
  constructor(options) {
    // this must be bigger than or equal to 5MB,
    // otherwise AWS will respond with:
    // "Your proposed upload is smaller than the minimum allowed size"
    this.chunkSize = options.chunkSize || 1024 * 1024 * 5
    // number of parallel uploads
    this.threadsQuantity = Math.min(options.threadsQuantity || 5, 15)
    this.file = options.file
    this.fileName = options.fileName
    this.aborted = false
    this.uploadedSize = 0
    this.progressCache = {}
    this.activeConnections = {}
    this.parts = []
    this.uploadedParts = []
    this.fileId = null
    this.fileKey = null
    this.onProgressFn = () => {}
    this.onErrorFn = () => {}
  }

  // starting the multipart upload request
  start() {
    this.initialize()
  }

  async initialize() {
    try {
      // adding the file extension (if present) to fileName
      let fileName = this.fileName
      const ext = this.file.name.split(".").pop()
      if (ext) {
        fileName += `.${ext}`
      }

      // initializing the multipart request
      const videoInitializationUploadInput = {
        name: fileName,
      }
      const initializeReponse = await api.request({
        url: "/uploads/initializeMultipartUpload",
        method: "POST",
        data: videoInitializationUploadInput,
      })

      const AWSFileDataOutput = initializeReponse.data

      this.fileId = AWSFileDataOutput.fileId
      this.fileKey = AWSFileDataOutput.fileKey

      // retrieving the pre-signed URLs
      const numberOfparts = Math.ceil(this.file.size / this.chunkSize)

      const AWSMultipartFileDataInput = {
        fileId: this.fileId,
        fileKey: this.fileKey,
        parts: numberOfparts,
      }

      const urlsResponse = await api.request({
        url: "/uploads/getMultipartPreSignedUrls",
        method: "POST",
        data: AWSMultipartFileDataInput,
      })

      const newParts = urlsResponse.data.parts
      this.parts.push(...newParts)

      this.sendNext()
    } catch (error) {
      await this.complete(error)
    }
  }

  sendNext() {
    const activeConnections = Object.keys(this.activeConnections).length

    if (activeConnections >= this.threadsQuantity) {
      return
    }

    if (!this.parts.length) {
      if (!activeConnections) {
        this.complete()
      }

      return
    }

    const part = this.parts.pop()
    if (this.file && part) {
      const sentSize = (part.PartNumber - 1) * this.chunkSize
      const chunk = this.file.slice(sentSize, sentSize + this.chunkSize)

      const sendChunkStarted = () => {
        this.sendNext()
      }

      this.sendChunk(chunk, part, sendChunkStarted)
        .then(() => {
          this.sendNext()
        })
        .catch((error) => {
          this.parts.push(part)

          this.complete(error)
        })
    }
  }

  // terminating the multipart upload request on success or failure
  async complete(error) {
    // on failure or abort, reporting the error and stopping here
    if (error) {
      this.onErrorFn(error)
      return
    }

    try {
      await this.sendCompleteRequest()
    } catch (error) {
      this.onErrorFn(error)
    }
  }

  // finalizing the multipart upload request on success by calling
  // the finalization API
  async sendCompleteRequest() {
    if (this.fileId && this.fileKey) {
      const videoFinalizationMultiPartInput = {
        fileId: this.fileId,
        fileKey: this.fileKey,
        parts: this.uploadedParts,
      }

      await api.request({
        url: "/uploads/finalizeMultipartUpload",
        method: "POST",
        data: videoFinalizationMultiPartInput,
      })
    }
  }

  sendChunk(chunk, part, sendChunkStarted) {
    return new Promise((resolve, reject) => {
      this.upload(chunk, part, sendChunkStarted)
        .then((status) => {
          if (status !== 200) {
            reject(new Error("Failed chunk upload"))
            return
          }

          resolve()
        })
        .catch((error) => {
          reject(error)
        })
    })
  }

  // calculating the current progress of the multipart upload request
  handleProgress(part, event) {
    if (this.file) {
      if (event.type === "progress" || event.type === "error" || event.type === "abort") {
        this.progressCache[part] = event.loaded
      }

      if (event.type === "uploaded") {
        this.uploadedSize += this.progressCache[part] || 0
        delete this.progressCache[part]
      }

      const inProgress = Object.keys(this.progressCache)
        .map(Number)
        .reduce((memo, id) => (memo += this.progressCache[id]), 0)

      const sent = Math.min(this.uploadedSize + inProgress, this.file.size)

      const total = this.file.size

      const percentage = Math.round((sent / total) * 100)

      this.onProgressFn({
        sent: sent,
        total: total,
        percentage: percentage,
      })
    }
  }

  // uploading a part through its pre-signed URL
  upload(file, part, sendChunkStarted) {
    // uploading each part with its pre-signed URL
    return new Promise((resolve, reject) => {
      if (this.fileId && this.fileKey) {
        // - 1 because PartNumber is an index starting from 1 and not 0
        const xhr = (this.activeConnections[part.PartNumber - 1] = new XMLHttpRequest())

        sendChunkStarted()

        const progressListener = this.handleProgress.bind(this, part.PartNumber - 1)

        xhr.upload.addEventListener("progress", progressListener)

        xhr.addEventListener("error", progressListener)
        xhr.addEventListener("abort", progressListener)
        xhr.addEventListener("loadend", progressListener)

        xhr.open("PUT", part.signedUrl)

        xhr.onreadystatechange = () => {
          if (xhr.readyState === 4 && xhr.status === 200) {
            // retrieving the ETag parameter from the HTTP headers
            const ETag = xhr.getResponseHeader("ETag")

            if (ETag) {
              const uploadedPart = {
                PartNumber: part.PartNumber,
                // removing the " enclosing characters from
                // the raw ETag
                ETag: ETag.replaceAll('"', ""),
              }

              this.uploadedParts.push(uploadedPart)

              resolve(xhr.status)
              delete this.activeConnections[part.PartNumber - 1]
            }
          }
        }

        xhr.onerror = (error) => {
          reject(error)
          delete this.activeConnections[part.PartNumber - 1]
        }

        xhr.onabort = () => {
          reject(new Error("Upload canceled by user"))
          delete this.activeConnections[part.PartNumber - 1]
        }

        xhr.send(file)
      }
    })
  }

  onProgress(onProgress) {
    this.onProgressFn = onProgress
    return this
  }

  onError(onError) {
    this.onErrorFn = onError
    return this
  }

  abort() {
    Object.keys(this.activeConnections)
      .map(Number)
      .forEach((id) => {
        this.activeConnections[id].abort()
      })

    this.aborted = true
  }
}
As you are about to see, this utility class allows you to perform a multipart request in a few lines of code. The Uploader utility class uses the axios API client, but any other Promise-based HTTP client will do.
This utility class takes care of splitting the file received in the constructor, which represents the object to upload, into smaller parts of 5 MB each by default. Then, it initializes the multipart upload request, uploads the parts (5 in parallel by default, and never more than 15), and finally calls the finalization API defined in the previous section to complete the request.
In case of an error, the sendNext() function takes care of putting the part whose upload failed back into the queue. In case of fatal errors or deliberate interruption, the upload process is stopped.
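Note that in the class above, a failed part is put back into the queue, but the error is then reported and the upload stops. If you would rather retry a part a few times before giving up, one option is to count the failures of each part. The RetryTracker class and the limit of three attempts below are hypothetical, and this is only a sketch of the idea, not part of the original utility class.

```javascript
// Hypothetical helper: tracks how many times each part has failed
class RetryTracker {
  constructor(maxAttempts = 3) {
    this.maxAttempts = maxAttempts
    this.failures = new Map() // PartNumber -> number of failed attempts
  }

  // Records a failure and tells whether the part should go back into the queue
  shouldRetry(partNumber) {
    const failures = (this.failures.get(partNumber) || 0) + 1
    this.failures.set(partNumber, failures)
    return failures < this.maxAttempts
  }
}

// Possible use inside the catch handler of sendNext():
// if (this.retryTracker.shouldRetry(part.PartNumber)) {
//   this.parts.push(part)
//   this.sendNext()
// } else {
//   this.complete(error)
// }
```

With this change, a temporary network failure no longer ends the whole upload, while a part that keeps failing still surfaces an error to the caller.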
The most relevant part of the utility class is represented by the upload() function. This is where each part is uploaded through a pre-signed URL and its corresponding ETag value is retrieved.
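The ETag header comes back wrapped in double quotes, which the class strips before storing the value. Isolated into a small function, that step could look like the sketch below; the normalizeETag() name is hypothetical, and a regular expression is used instead of replaceAll() only so that it also runs in older browsers.

```javascript
// Strips the double quotes S3 wraps around the ETag header value
function normalizeETag(rawETag) {
  return rawETag.replace(/"/g, "")
}
```

For example, a raw header value of "\"9b2cf535f27731c974343645a3985328\"" becomes 9b2cf535f27731c974343645a3985328.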
Now, let’s see how you can employ the Uploader class:
import "./App.css"
import { Uploader } from "./utils/upload"
import { useEffect, useState } from "react"

export default function App() {
  const [file, setFile] = useState(undefined)
  const [uploader, setUploader] = useState(undefined)

  useEffect(() => {
    if (file) {
      let percentage = undefined

      const videoUploaderOptions = {
        fileName: "foo",
        file: file,
      }

      const uploader = new Uploader(videoUploaderOptions)
      setUploader(uploader)

      uploader
        .onProgress(({ percentage: newPercentage }) => {
          // to avoid the same percentage to be logged twice
          if (newPercentage !== percentage) {
            percentage = newPercentage
            console.log(`${percentage}%`)
          }
        })
        .onError((error) => {
          setFile(undefined)
          console.error(error)
        })

      uploader.start()
    }
  }, [file])

  const onCancel = () => {
    if (uploader) {
      uploader.abort()
      setFile(undefined)
    }
  }

  return (
    <div className="App">
      <h1>Upload your file</h1>
      <div>
        <input
          type="file"
          onChange={(e) => {
            setFile(e.target?.files?.[0])
          }}
        />
      </div>
      <div>
        <button onClick={onCancel}>Cancel</button>
      </div>
    </div>
  )
}
As soon as a file is selected through the <input type="file"> HTML element, the useEffect() hook runs. There, the Uploader utility class is used to manage the multipart upload request automatically. While the upload is in progress, you can press the Cancel button to abort the operation.
Et voilà! Your demo application to upload large files through multipart upload is ready!
In this article, we looked at what S3 multipart upload is, how it works, why you should implement it by using S3 pre-signed URLs, and how to do it in Node.js and React.
As explained, multipart upload is an efficient, officially recommended, controllable way to deal with uploads of large files. This is particularly true when using S3 pre-signed URLs, which allow you to perform multipart upload in a secure way without exposing any info about your buckets.
Putting in place a server in Node.js to implement multipart upload with pre-signed URLs involves only three APIs, and it cannot be considered a difficult task. On the other hand, dealing with file splitting and parallel upload on the frontend side is a bit more tricky, but it can definitely be implemented. As we saw, you only have to define a general-purpose and reusable utility class to deal with any multipart uploads, and here we learned how to do that.
Thanks for reading! I hope that you found this article helpful. Feel free to reach out to me with any questions, comments, or suggestions.
22 Replies to "Multipart uploads with S3 in Node.js and React"
When I try to access the etag header on the upload response, I get “Refused to get unsafe header “ETag”. Seems the destination has cors configured and will not let me get the etag. Not sure how you got around this problem?
Hi, I am the author of the article!
If you get the error when trying to upload a portion of the file with a pre-signed URL, it means that it must be due to a wrong configuration on S3.
i cloned your repo to run locally. It doesn’t seem to work for me when I try uploading a file > 5MB; it gets stuck in the middle(~35%, depending on the file size). Am I missing anything?
Hey! It’s the author of the article here.
Could you please expand on what error you get?
Also, make sure your S3-like provider supports multipart upload and your Internet connection doesn’t fail.
Thank you for your response. I was able to fix the issue by adding “ExposeHeaders”: [ “ETag”] in the bucket CORS policy. Adding it here for someone who might come across the same problem.
Thanks for pointing that out!
What possible solution could be to make an API call from useEffect after the file is uploaded 100% and the finalizeMultipartUpload API is executed successfully? My API require file to be available in the bucket, and i don’t want to call the API inside the class. Any input would be appreciated.
You could extend Uploader to accept a callback in videoUploaderOptions. This function will be called after
await api.request({
url: “/uploads/finalizeMultipartUpload”,
method: “POST”,
data: videoFinalizationMultiPartInput,
})
Also, wondering how to abort the specific file upload, especially if multiple files are getting uploaded?
There is an abort() function for that.
Nice article! Here presigned URLs were created by the service so we offloaded the server from upload work hence making it scalable solution. What do you think about the download part (multiple files as zip), should that also be done via presigned URLs?
Thanks! I’m the author here!
When it comes to downloading, presigned URLs are particularly useful to share files to users that don’t have access to the bucket. But, you could also use them to download a very large file, one chunk at a time.
Hi, great article thank you. Quick questions, why are the endpoints POST and not GET?
Hey! Thanks!
It’s the author here.
I used POST and not GET because you are sending data, not retrieving it.
Hey I am getting this error. Proxy error: Could not proxy request /uploads/initializeMultipartUpload from localhost:3000 to http://localhost:3001.
I have below config.
frontend/uti/upload.js
const api = axios.create({
baseURL: “http://localhost:3000”,
})
backend/controller/upload.js
const s3Endpoint = new AWS.Endpoint(“https://”)
package.json
“proxy”: “http://localhost:3001/”
Did you start the server with `npm run backend:start`?
Hi Yes I started both npm run backend:start and npm run frontend:start.
At browser its giving error as “XML Parsing Error: syntax error
Location: http://localhost:3000/uploads/initializeMultipartUpload
Line Number 1, Column 1:”
At backend terminal its giving error: [nodemon] app crashed – waiting for file changes before starting…
errno: -3008,
code: ‘NetworkingError’,
syscall: ‘getaddrinfo’,
At frontend terminal its giving error as: Proxy error: Could not proxy request /uploads/initializeMultipartUpload from localhost:3000 to http://127.0.0.1:3001/.
See https://nodejs.org/api/errors.html#errors_common_system_errors for more information (ECONNRESET).
Hi! I having the same issue. Was you able to resolve it?
will it continue uploading if internet is disconnected for few mins
Right now, it would fail, but you can easily change that by updating the logic below:
From
this.sendChunk(chunk, part, sendChunkStarted)
.then(() => {
this.sendNext()
})
.catch((error) => {
this.parts.push(part)
this.complete(error)
})
To
this.sendChunk(chunk, part, sendChunkStarted)
.then(() => {
this.sendNext()
})
.catch((error) => {
this.parts.push(part)
})
Access to XMLHttpRequest at ” from origin ‘http://localhost:3000’ has been blocked by CORS policy: Response to preflight request doesn’t pass access control check: No ‘Access-Control-Allow-Origin’ header is present on the requested resource.
Great article. I tested it and I’m getting a BAD REQUEST on the OPTIONS which is the CORS preflight done by the browser on the S3 bucket.
As I debug it can you let me know if the OPTIONS request is supposed to be sent or maybe is some kind of configuration on my side which triggers it?