A signed URL is a web link that provides temporary, secure access to a single, specific file in your cloud storage bucket. Think of it as a time-sensitive guest pass to your data. Each URL is “signed” with your own secret cryptographic keys. This signature validates the request and is only valid for a limited time, which you define (e.g., 7 days). After the URL expires, the link no longer works, and access to the file is revoked. This method allows you to grant Labelbox access to your data without configuring a permanent IAM trust relationship. However, it shifts the responsibility for managing access credentials entirely to you.

How signed URLs work

  1. You: Generate URLs — You write and run a script on your own infrastructure that generates a unique signed URL for each data asset you want to label. You must set an expiration date for these URLs.
  2. You: Format and upload — You create a JSON file that maps each data asset to its corresponding signed URL. You then import this JSON file when creating a new dataset in Labelbox.
  3. Labelbox: Renders data — When a user opens a labeling task, Labelbox retrieves the corresponding signed URL from your imported JSON file. The Labelbox application uses that URL to fetch the data directly from your bucket and render it in the editor.
  4. Your cloud provider: Validates — Your cloud provider inspects the signature on the URL. If the signature is valid and the URL has not expired, it serves the file to the Labelbox application. If not, it returns an error.
Signed URLs permit access to resources without requiring updates to bucket policies or other configuration settings. They are usually temporary and valid only for a limited period. The cloud provider computes a hash value (the signature) for each signed URL and uses it to authenticate access requests. If the hash value is missing or invalid, or the URL has expired, access to the resource is denied. Here is an example of a signed URL in which the hash query parameter carries an access token generated by the server hosting the file:
http://example.com/filename?hash=DMF1ucDxtqgxwYQ==
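
To make the validation step concrete, here is a minimal sketch of how a server could sign a URL and later check it. It is an illustration only and is not how any particular cloud provider implements signing: the key, the `expires` and `signature` parameter names, and both function names are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical secret for illustration; real providers derive keys from your credentials.
SECRET_KEY = b"demo-secret"


def sign_url(base_url: str, expires_at: int, key: bytes = SECRET_KEY) -> str:
    """Append an expiry timestamp and an HMAC-SHA256 signature to base_url."""
    path = urlsplit(base_url).path
    message = f"{path}\n{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{base_url}?{urlencode({'expires': expires_at, 'signature': signature})}"


def is_valid(signed_url: str, now: int, key: bytes = SECRET_KEY) -> bool:
    """Return True only if the signature matches and the URL has not expired."""
    parts = urlsplit(signed_url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    message = f"{parts.path}\n{expires_at}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(signature.encode(), expected.encode()) and now < expires_at
```

Because the path and the expiry are both covered by the signature, changing either one in the URL invalidates it, which is why a signed URL grants access to one file for one time window.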

Key advantages

  • Fast to start: This method is the quickest way to begin a project, as it bypasses the need for initial IAM configuration in your cloud environment. It’s excellent for short-term projects, pilots, or proof-of-concepts.
  • Granular, explicit control: You have explicit, file-level control over exactly which assets are accessible and for precisely how long.

Step-by-step instructions

Here is a step-by-step guide to using signed URLs with Labelbox.

Prerequisites

Before you start, make sure you have the following:
  • Your data is stored in a cloud storage bucket (AWS S3, Google Cloud Storage, or Azure Storage).
  • You have the necessary permissions in your cloud environment to generate signed URLs.
  • You have a local environment set up to run scripts (e.g., Python, Node.js) for generating the signed URLs.

Step 1: Generate your signed URLs

For each file you want to import into Labelbox, you need to generate a signed URL. This is typically done by writing a script that uses your cloud provider’s SDK. Refer to the official documentation for your cloud provider (AWS S3, Google Cloud Storage, or Azure Storage) for instructions and code samples.
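
As one possible starting point for AWS S3, the sketch below wraps the boto3 client method `generate_presigned_url`. The helper name `sign_s3_keys`, the bucket name, and the object keys are hypothetical, and the 7-day default simply mirrors the example lifetime used in this guide. Google Cloud Storage and Azure Storage have their own SDK calls, which are not shown here.

```python
from typing import Dict, Iterable

SEVEN_DAYS = 7 * 24 * 60 * 60  # lifetime in seconds


def sign_s3_keys(
    s3_client,
    bucket: str,
    keys: Iterable[str],
    expires_in: int = SEVEN_DAYS,
) -> Dict[str, str]:
    """Return {object key: signed GET URL} for each key in the bucket."""
    return {
        key: s3_client.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": key},
            ExpiresIn=expires_in,
        )
        for key in keys
    }


# Usage (assumes `pip install boto3` and configured AWS credentials;
# the bucket and key names are placeholders):
#
#   import boto3
#   urls = sign_s3_keys(boto3.client("s3"), "my-bucket", ["images/cat.jpg"])
```

The client is passed in as an argument so the same function can be exercised with a stub during testing.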

Step 2: Create your JSON file

Once you have generated your signed URLs, create a JSON file to upload to Labelbox. This file maps each of your data assets to its corresponding signed URL. The JSON format varies by data type, so refer to the Labelbox import documentation for the type you are working with:

  • Images
  • Text
  • Geospatial
  • Conversational text
  • Videos
  • Audio
  • Documents
  • HTML
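
As a rough sketch of this step, the snippet below turns a mapping of asset names to signed URLs into a JSON file. The field names `row_data` and `global_key` are assumptions about the import format and should be checked against the documentation for your data type. The function names and the output path are hypothetical.

```python
import json
from typing import Dict, List


def build_import_rows(signed_urls: Dict[str, str]) -> List[dict]:
    """Turn {asset name: signed URL} into one import entry per asset."""
    return [
        # Assumed field names: confirm them for your data type before importing.
        {"row_data": url, "global_key": name}
        for name, url in sorted(signed_urls.items())
    ]


def write_import_file(signed_urls: Dict[str, str], path: str) -> None:
    """Write the import entries to path as a JSON array."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_import_rows(signed_urls), handle, indent=2)
```

Sorting by asset name keeps the file stable between runs, which makes it easier to compare against a previous import.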

Step 3: Import your JSON file into Labelbox

With your JSON file created, you can now import it into Labelbox to create your dataset.
  1. Go to Catalog.
  2. Navigate to the Import data section.
  3. Select the option to upload a JSON file.
  4. Upload the JSON file you created in the previous step.
  5. Labelbox will process the file and create a new dataset with your data.

Security best practices

  • Use short-lived expiration times: Set the expiration time for your signed URLs to the shortest period that still covers your labeling work. This minimizes the risk of unauthorized access if a URL is accidentally exposed. For most labeling projects, an expiration of 7 days is sufficient.
  • Apply the principle of least privilege: Generate signed URLs only for the specific files that need to be labeled. Avoid generating URLs for entire buckets or directories.
  • Audit regularly: Keep logs of the signed URLs you generate, and review them periodically for unusual activity.
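
One way to support such an audit is to check when each logged URL expires. The sketch below assumes AWS Signature Version 4 style URLs, which carry `X-Amz-Date` and `X-Amz-Expires` query parameters. Other providers and older signature versions use different parameters, so the function reports those URLs as unknown. Both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional
from urllib.parse import parse_qs, urlsplit


def sigv4_expiry(signed_url: str) -> Optional[datetime]:
    """Return the UTC expiry of a SigV4-style URL, or None if it cannot be read."""
    query = parse_qs(urlsplit(signed_url).query)
    try:
        issued = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
        lifetime = int(query["X-Amz-Expires"][0])
    except (KeyError, ValueError):
        return None  # parameters absent or malformed
    return issued.replace(tzinfo=timezone.utc) + timedelta(seconds=lifetime)


def expiring_before(urls: Iterable[str], cutoff: datetime) -> List[str]:
    """List URLs that expire before cutoff or whose expiry cannot be determined."""
    flagged = []
    for url in urls:
        expiry = sigv4_expiry(url)
        if expiry is None or expiry < cutoff:
            flagged.append(url)
    return flagged
```

Running `expiring_before` over your log with a cutoff a day or two ahead shows which assets will stop loading in the editor unless you regenerate their URLs.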