Integrating Browse AI with Amazon S3

Securely export your scraped data to your own AWS S3 bucket (no coding required) with CloudFormation templates.

Written by Kimber
Updated this week

Setting up Browse AI to export data to your S3 bucket is straightforward with our CloudFormation template approach. This guide walks you through creating the necessary permissions while maintaining security best practices.

Why connect Browse AI to S3?

This integration allows you to:

  • Automatically export data extracted by your robots to your own storage

  • Maintain control over your extracted datasets

  • Integrate Browse AI data with your existing AWS workflows

  • Keep a secure backup of all your extracted data

What you'll need

  • An Amazon Web Services (AWS) account with permissions to create CloudFormation stacks and IAM roles.

  • The name of your S3 bucket where you want Browse AI to export data.

  • Optional: A specific folder path within your bucket, if you want to restrict Browse AI's access to that folder. For example, if your bucket is named my-data and the folder is named browse-ai-exports, the folder path is browse-ai-exports.

How to set up the Amazon S3 integration

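Step 1: Access the AWS Management Console

  1. Sign in to the AWS Management Console.

  2. Search for CloudFormation in the top search bar and select it.

What is CloudFormation? CloudFormation is the AWS service that creates and configures AWS resources from a template, which automates the setup process.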

Step 2: Create a new stack

  1. Click "Create stack".

  2. If prompted, select "With new resources (standard)".

  3. Under "Prerequisite - Prepare template," choose "Choose an existing template."

  4. For "Template source," select "Amazon S3 URL."

  5. Paste the URL below into the text field and click "Next".

https://browse-ai-integration.s3.us-east-1.amazonaws.com/s3-integration.yaml

Step 3: Configure stack details

  1. Stack name: Enter a descriptive name (e.g., "BrowseAI-S3-Access").

  2. BucketName: Enter your exact S3 bucket name.

  3. ExternalId: Generate a random 20-character string using letters and numbers. This serves as an additional security verification. Keep a copy, because you will enter the same value in Step 6.

  4. BucketPathName (optional): Enter a folder name if you want to restrict Browse AI's access to a specific folder. Leave blank for full bucket access.

  5. Click "Next."
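If you would rather not invent the ExternalId by hand, a short script can generate one. The following is a minimal sketch using Python's standard secrets module; the function name generate_external_id is illustrative and not part of Browse AI or AWS.

```python
import secrets
import string

# Letters and digits only, matching the ExternalId guidance in Step 3.
ALPHABET = string.ascii_letters + string.digits


def generate_external_id(length: int = 20) -> str:
    """Return a random string of letters and digits of the given length."""
    # secrets.choice draws from a cryptographically secure random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_external_id())
```

Run it once, paste the output into the ExternalId field, and store it somewhere safe for Step 6.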

Step 4: Configure stack options

  1. The default options are typically sufficient, but you can add tags if needed.

  2. Click "Next."

  3. Review your settings carefully.

  4. Check the box acknowledging that CloudFormation might create IAM resources.

  5. Click "Submit" to deploy the stack.
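The console steps above are all you need. For teams that script their AWS setup, Steps 2 to 4 could probably be reproduced with the AWS SDK for Python (boto3). The sketch below is an assumption-laden illustration, not an official Browse AI procedure: it assumes the template accepts an empty BucketPathName for full bucket access, that the CAPABILITY_IAM acknowledgement is sufficient for this template, and that creating the stack in us-east-1 is acceptable. The helper names are hypothetical.

```python
# Template URL from Step 2 of this guide.
TEMPLATE_URL = (
    "https://browse-ai-integration.s3.us-east-1.amazonaws.com/s3-integration.yaml"
)


def build_stack_request(stack_name, bucket_name, external_id, bucket_path=""):
    """Build keyword arguments for CloudFormation's create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateURL": TEMPLATE_URL,
        # Parameter names are the ones shown in Step 3.
        "Parameters": [
            {"ParameterKey": "BucketName", "ParameterValue": bucket_name},
            {"ParameterKey": "ExternalId", "ParameterValue": external_id},
            # Assumption: an empty string is equivalent to leaving the field blank.
            {"ParameterKey": "BucketPathName", "ParameterValue": bucket_path},
        ],
        # Programmatic equivalent of the IAM acknowledgement checkbox in Step 4.
        # Assumption: the template does not need CAPABILITY_NAMED_IAM.
        "Capabilities": ["CAPABILITY_IAM"],
    }


def create_stack(request, region="us-east-1"):
    """Submit the request and return the new stack's ID (requires boto3)."""
    import boto3  # third-party; imported here so the helper above runs without it

    client = boto3.client("cloudformation", region_name=region)
    return client.create_stack(**request)["StackId"]


if __name__ == "__main__":
    request = build_stack_request(
        "BrowseAI-S3-Access", "my-data", "REPLACE_WITH_EXTERNAL_ID", "browse-ai-exports"
    )
    print(request)  # inspect before calling create_stack(request)
```

Building the request separately from sending it lets you review the parameters before anything is created in your account.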

Step 5: Retrieve the IAM Role ARN

  1. After stack creation completes, go to the "Resources" tab.

  2. Find the "S3AccessRole" in the list and click its Physical ID link.

  3. In the IAM console that opens, find and copy the Role ARN from the Summary section. It will look similar to:

arn:aws:iam::123456789012:role/BrowseAI-S3-Access-S3AccessRole-ABCDEFGH1234
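A copied ARN is easy to truncate or pad with stray whitespace. As a quick sanity check before pasting it into Browse AI, something like the following sketch could be used. The pattern reflects the general shape of IAM role ARNs (partition, 12-digit account ID, role name); the function name parse_role_arn is hypothetical.

```python
import re

# General shape of an IAM role ARN: arn:<partition>:iam::<account-id>:role/<name>
ROLE_ARN_PATTERN = re.compile(
    r"arn:(?P<partition>aws|aws-cn|aws-us-gov):iam::"
    r"(?P<account_id>\d{12}):role/(?P<role>[\w+=,.@/-]+)"
)


def parse_role_arn(arn: str) -> dict:
    """Return the ARN's parts, or raise ValueError if it is not a role ARN."""
    match = ROLE_ARN_PATTERN.fullmatch(arn.strip())
    if match is None:
        raise ValueError(f"Not a valid IAM role ARN: {arn!r}")
    return match.groupdict()


if __name__ == "__main__":
    example = "arn:aws:iam::123456789012:role/BrowseAI-S3-Access-S3AccessRole-ABCDEFGH1234"
    print(parse_role_arn(example))
```

This only checks the format. It does not confirm that the role exists or that Browse AI can assume it.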

Step 6: Connect to Browse AI

  1. In your Browse AI dashboard, select your robot and go to the "Integrate" tab.

  2. Select "AWS" from the integration options.

  3. Choose "+ Add new S3 bucket" from the dropdown menu.

  4. Complete the form with these details:

    1. AWS Region: Your S3 bucket's region (e.g., us-east-1)

    2. Role ARN: Paste the ARN you copied

    3. External ID: Enter the same 20-character string you created earlier

    4. Bucket Name: Your S3 bucket name

    5. Bucket Path: The folder path you specified (if any)

  5. Click "Add" to complete the connection.

Once these steps are complete, each time you export Tables you will have the option to send the exported files directly to your AWS S3 bucket.

File naming convention

This section describes where exported files are stored in your bucket and how they are named.

Every Table export has a unique ID. Each export creates a directory in your bucket, inside the folder path you chose during setup if you set one. This directory is named export_{ISO 8601 timestamp}_{unique export ID}.

Within this directory, there is at least one file per exported tab. Smaller files (for example, under 100 MB) are named {tab name}.{format; csv or json}. Larger files are split into multiple chunks named {tab name}.part{n}.{format; csv or json}.

  • Example 1 (combined file with small size): export_20250415T175912Z_32ffd473-4476-42bc-96bb-1857e4fa4add/Main.json

  • Example 2 (combined file with large size):
    export_20250415T175912Z_32ffd473-4476-42bc-96bb-1857e4fa4add/Main.part1.json
    export_20250415T175912Z_32ffd473-4476-42bc-96bb-1857e4fa4add/Main.part2.json
    ...

Alternatively, if you choose to export a separate file per record, the files will be named {unique record ID}.json.

  • Example 3 (separate files per record): export_20250415T175912Z_32ffd473-4476-42bc-96bb-1857e4fa4add/0005541b-d61d-42d7-ba24-17600e0068e5.json
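If you process exports in your own AWS workflows, you will likely need to pull the timestamp, export ID, and part number out of each object key. The sketch below shows one way to do that. It assumes that export IDs are always UUID-shaped and that timestamps always use the compact form seen in the examples above; the names ExportFile and parse_export_key are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumption: export IDs are 36-character UUIDs, as in the examples above.
KEY_PATTERN = re.compile(
    r"^(?:(?P<prefix>.+)/)?"
    r"export_(?P<timestamp>\d{8}T\d{6}Z)_(?P<export_id>[0-9a-fA-F-]{36})/"
    r"(?P<name>.+?)(?:\.part(?P<part>\d+))?\.(?P<format>csv|json)$"
)


@dataclass
class ExportFile:
    prefix: Optional[str]   # folder path chosen during setup, if any
    exported_at: datetime   # export timestamp (UTC)
    export_id: str
    name: str               # tab name, or record ID for per-record exports
    part: Optional[int]     # chunk number, or None for unchunked files
    format: str             # "csv" or "json"


def parse_export_key(key: str) -> Optional[ExportFile]:
    """Parse an S3 object key; return None if it does not look like an export."""
    match = KEY_PATTERN.match(key)
    if match is None:
        return None
    exported_at = datetime.strptime(
        match["timestamp"], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    part = int(match["part"]) if match["part"] else None
    return ExportFile(
        match["prefix"], exported_at, match["export_id"],
        match["name"], part, match["format"],
    )


def sort_parts(keys):
    """Return the chunked keys ordered by part number (part10 after part2)."""
    parsed = [(parse_export_key(k), k) for k in keys]
    return [k for f, k in sorted(
        (p for p in parsed if p[0] and p[0].part is not None),
        key=lambda p: p[0].part,
    )]
```

Sorting by the parsed part number rather than by key string matters once an export has ten or more chunks. Note that a per-record file cannot be told apart from a tab file by name alone, so the record ID is returned in the name field.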

Security details

This setup uses AWS best practices to keep your data secure:

  • Browse AI receives only the minimum permissions needed to export data to your bucket.

  • The External ID provides an additional security layer: the role can only be assumed by a request that supplies the External ID you chose.

  • You can restrict access to a specific folder within your bucket.

  • All permissions can be revoked at any time by deleting the CloudFormation stack.
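To make these points concrete, the sketch below builds the kind of IAM policy documents that an External ID setup with folder-level restriction typically involves. It is illustrative only and is not the content of the Browse AI template: the trusted principal ARN is a placeholder, the s3:PutObject action is an assumption about what "minimum permissions" might mean, and the function names are hypothetical.

```python
import json

# Placeholder only; the real trusted principal is defined in the template.
PLACEHOLDER_PRINCIPAL = "arn:aws:iam::111122223333:root"


def trust_policy(trusted_principal_arn: str, external_id: str) -> dict:
    """Trust policy that lets a principal assume the role only with the External ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": trusted_principal_arn},
            "Action": "sts:AssumeRole",
            # The request is rejected unless it supplies this exact value.
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }


def export_resource_arn(bucket_name: str, bucket_path: str = "") -> str:
    """ARN pattern covering the whole bucket, or only one folder if a path is set."""
    path = bucket_path.strip("/")
    if path:
        return f"arn:aws:s3:::{bucket_name}/{path}/*"
    return f"arn:aws:s3:::{bucket_name}/*"


def write_policy(bucket_name: str, bucket_path: str = "") -> dict:
    """Permissions policy scoped to the export location (action list is assumed)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:PutObject"],  # assumption, not taken from the template
            "Resource": export_resource_arn(bucket_name, bucket_path),
        }],
    }


if __name__ == "__main__":
    print(json.dumps(trust_policy(PLACEHOLDER_PRINCIPAL, "REPLACE_ME"), indent=2))
    print(json.dumps(write_policy("my-data", "browse-ai-exports"), indent=2))
```

To see what the stack actually created in your account, open the role in the IAM console and review its trust relationship and attached policies.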

Troubleshooting

Stack creation fails:

  • Verify you have sufficient IAM permissions in your AWS account.

  • Check that the bucket name is exactly correct.

  • Ensure the bucket exists in your AWS account.

Can't find the IAM Role ARN:

  • Check the "Resources" tab in your CloudFormation stack.

  • If the stack shows "CREATE_COMPLETE" but no resources appear, try refreshing the page.

Browse AI connection fails:

  • Double-check that the External ID matches exactly.

  • Verify the bucket name and region are correct.

  • Ensure the IAM role hasn't been modified manually after creation.

Next steps

After connecting your S3 bucket, you can:

  • Configure your robot to export its extracted data to your bucket.

  • Set up automated workflows in AWS to process the exported data.

  • Create backups or data transformations using AWS services.

Need more help? Check out AWS documentation or reach out to our support team.
