
How to scrape and monitor data behind a login

Your robot can log into websites to extract data using either session cookies or credentials for direct login.

Written by Melissa Shires
Updated over 2 weeks ago

You can train a robot to extract data behind a login using two methods:

  • Using your session cookies (recommended for most cases)

  • Using your user credentials (username and password)

Quick decision guide

If your site has... | Use this method | Why
Standard login (no 2FA) | Session cookies | Fewer steps, higher success rate
Two-factor authentication | Session cookies | Bypasses 2FA complexity
IP-restricted access | Username/password | Cookies may be rejected from different IPs
Frequently changing login page | Session cookies | Avoids retraining for UI changes
High-security requirements | Username/password | Some sites block external cookies

Method 1: Login with session cookies (recommended)

Session cookies store your authenticated state. When you're logged into a website, the robot can use these cookies to access protected content without going through the login process.
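To make the idea concrete, here is a minimal Python sketch of what "using session cookies" means at the HTTP level: cookies copied from a logged-in browser are attached to a request so the server treats it as already authenticated. This is an illustration only and not a description of how Browse AI works internally. The function name, the cookie names and values, and the URL are all hypothetical.

```python
from http.cookies import SimpleCookie
from urllib.request import Request


def build_authenticated_request(url: str, cookie_header: str) -> Request:
    """Attach cookies copied from a logged-in browser session to a request."""
    # Parse the raw "name=value; name2=value2" string into individual cookies.
    jar = SimpleCookie()
    jar.load(cookie_header)

    # A Cookie request header carries only name=value pairs.
    header_value = "; ".join(
        f"{name}={morsel.value}" for name, morsel in jar.items()
    )

    request = Request(url)
    request.add_header("Cookie", header_value)
    return request


# Hypothetical values for illustration; nothing is sent over the network here.
req = build_authenticated_request(
    "https://example.com/account/orders",
    "sessionid=abc123; theme=dark",
)
print(req.get_header("Cookie"))
```

Because the request already carries the authenticated state, no login form has to be filled in, which is why this method involves fewer steps.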

How to set up cookie-based login

✅ You must be logged into the website in your browser before starting.

  1. Start training your robot in Robot Studio.

  2. Select "This website needs logging in", and then select "Login with my session cookies".

Your cookies are then automatically encrypted and stored securely. From here, continue training your robot to extract and monitor the data you need.

Benefits of session cookies

  • No login steps during extraction (faster, more reliable)

  • Works with most 2FA/MFA systems

  • No need to record login interactions

  • Higher success rates overall

Limitations of cookies

  • Cookies expire (need periodic updates)

  • Some sites reject cookies from different IPs

  • Not guaranteed with all 2FA implementations
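Since cookies expire, it can help to know when a refresh is due before a robot starts failing. The sketch below is a hypothetical helper, not a Browse AI feature: it assumes you have noted a cookie's expiry date yourself (for example, from your browser's developer tools), and the three-day safety margin is an arbitrary assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def cookie_needs_refresh(
    expires_at: Optional[datetime],
    now: datetime,
    safety_margin: timedelta = timedelta(days=3),  # assumed buffer, tune as needed
) -> bool:
    """Return True if the cookie has expired or expires within the safety margin."""
    if expires_at is None:
        # No explicit expiry: the site ends the session on its own schedule,
        # so a date check cannot predict it.
        return False
    return expires_at - now <= safety_margin


# Hypothetical dates for illustration.
now = datetime(2025, 1, 10, tzinfo=timezone.utc)
expiring_soon = datetime(2025, 1, 12, tzinfo=timezone.utc)
valid_for_weeks = datetime(2025, 2, 20, tzinfo=timezone.utc)

print(cookie_needs_refresh(expiring_soon, now))
print(cookie_needs_refresh(valid_for_weeks, now))
```

A date check like this only covers explicit expiry. A site can still end a session early, for instance after a logout or a password change.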

Maintaining cookie-based login

When to update cookies:

  • Robot starts failing with authentication errors

  • After you've logged out and back in

  • Following password changes

  • Periodic refresh (monthly recommended)

How to update:

✅ You must be logged into the website in your browser first.

  1. Go to your robot → Settings tab

  2. Find the Authentication section

  3. Click Update Session Cookies

Method 2: Login with username and password

Train your robot to enter your username and password and submit them, mimicking human login behavior.

How to set up credential-based login

  1. Start training your robot in Robot Studio.

  2. Select "This website needs logging in", and then select "Login with my password".

  3. Log in with your username and password during training, then click to log in.

💡 Your credentials are encrypted and stored on Browse AI's secure AWS infrastructure. We are SOC 2 Type II certified. You can learn more about our security measures here.

Benefits of using credentials

  • Works on IP-restricted sites

  • Doesn't expire like cookies

  • Consistent across different locations

  • No manual updates needed

Limitations of credentials

  • Additional interactions (typing, clicking) introduce more potential failure points

  • May not work well with websites that frequently change their login interface

  • Can be problematic with sites that use 2FA/MFA

  • May trigger security alerts if the login occurs from different IP addresses

Editing or updating login credentials

If your password or login details have changed, or if you're experiencing issues with credential login, you'll need to retrain your robot.

  1. Go to your robot → Settings tab

  2. Find the Danger Zone section

  3. Click Re-train robot

💡 When training your robot with login credentials, it's best to keep the login process as simple as possible.

Choosing the right method

Use session cookies when:

  • Site has 2FA/MFA authentication

  • You want maximum reliability

  • Login page changes frequently

  • You're already logged in regularly

  • Speed is important

Use credentials when:

  • Site restricts cookie usage by IP

  • You need a fully automated solution

  • Cookies consistently fail

  • Site has simple login (no 2FA)

  • Running from multiple locations
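The guidance in this article can be condensed into a small decision function. The Python sketch below mirrors the rows of the quick decision guide. The `SiteProfile` type, its field names, and the order in which conditions are checked are assumptions made for illustration. In particular, the article does not say which method wins when a site is both IP-restricted and protected by 2FA, so this sketch arbitrarily checks the credential cases first.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SiteProfile:
    """Hypothetical description of a site's login characteristics."""
    has_2fa: bool = False
    ip_restricted: bool = False
    login_page_changes_often: bool = False
    blocks_external_cookies: bool = False


def recommend_login_method(site: SiteProfile) -> Tuple[str, str]:
    """Return (method, reason) following the quick decision guide."""
    # Assumption: cases where cookies are rejected outright take priority.
    if site.ip_restricted:
        return ("username/password", "Cookies may be rejected from different IPs")
    if site.blocks_external_cookies:
        return ("username/password", "Some sites block external cookies")
    if site.has_2fa:
        return ("session cookies", "Bypasses 2FA complexity")
    if site.login_page_changes_often:
        return ("session cookies", "Avoids retraining for UI changes")
    # Default for a standard login with no special constraints.
    return ("session cookies", "Fewer steps, higher success rate")


method, reason = recommend_login_method(SiteProfile(has_2fa=True))
print(f"{method}: {reason}")
```

Treat the result as a starting point only: if the recommended method keeps failing on a particular site, try the other one.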

Common authentication types

Auth type | Cookie method | Credential method | Notes
Basic login | ✅ Works well | ✅ Works well | Either method suitable
2FA (SMS/App) | ✅ Best choice | ❌ Usually fails | Cookies bypass 2FA
SSO (Google/Microsoft) | ✅ Recommended | ⚠️ Complex | Multiple redirects problematic
CAPTCHA on login | ✅ Bypasses | ⚠️ May work |
Biometric | ✅ After auth | ❌ Not possible | Use cookies post-authentication

Security considerations

How Browse AI protects your data

All login information (both cookies and credentials) is:

  • Encrypted during transmission and storage

  • Stored on secure AWS infrastructure

  • Only used for the purpose of running your robots

Website detection risks

While Browse AI employs techniques to mimic human browsing, some websites with strict security may detect automated logins, especially when:

  • Logins occur from multiple IP addresses

  • Multiple requests are made in short succession

  • Interaction patterns differ from typical human behavior

If your use case carries this risk, consider local automation, which uses your own IP address and is therefore more discreet. For public data extraction, our robots are a reliable and secure solution.

Troubleshooting login issues

Diagnostic checklist

Issue | Cookie method fix | Credential method fix
"Not logged in" error | Update session cookies | Verify credentials are still valid
Partial data extraction | Check cookie permissions | Login may be incomplete
Intermittent failures | Cookies may be expiring | Site may have rate limiting
Suddenly stops working | Site may have logged you out | Login page likely changed
2FA challenges appearing | Refresh cookies after 2FA | Not fixable; switch to cookies

Common solutions

  1. Verify manual login works:

    • Test credentials manually

    • Check for new security features

    • Note any new steps required

  2. For cookie failures:

    • Clear browser cache

    • Log out and back in

    • Update cookies immediately

    • Try incognito mode

  3. For credential failures:

    • Retrain robot with current flow

    • Simplify login steps

    • Check for CAPTCHA

    • Consider switching to cookies
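When diagnosing a "Not logged in" problem, it helps to recognize what a logged-out response typically looks like. The Python sketch below is a rough heuristic and not part of Browse AI: it assumes a site signals a missing session either with an HTTP 401 or 403 status or by redirecting to a login page. The path hints are guesses and will differ from site to site.

```python
from urllib.parse import urlparse

# Assumed URL fragments that often indicate a login page; adjust per site.
LOGIN_PATH_HINTS = ("/login", "/signin", "/sign-in")


def looks_logged_out(status_code: int, final_url: str) -> bool:
    """Guess whether a response indicates the session is no longer valid."""
    # 401 Unauthorized and 403 Forbidden commonly mean missing or rejected auth.
    if status_code in (401, 403):
        return True
    # Many sites answer with 200 after redirecting to their login page instead.
    path = urlparse(final_url).path.lower()
    return any(hint in path for hint in LOGIN_PATH_HINTS)


# Hypothetical responses for illustration.
print(looks_logged_out(200, "https://example.com/login?next=/orders"))
print(looks_logged_out(200, "https://example.com/account/orders"))
```

If a check like this flags a logged-out state while manual login in your browser still works, the fix is usually to update the cookies or retrain the login steps as listed above.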

When authentication won't work

Some sites cannot be accessed with either method, typically because of:

  • Hardware token requirements

  • Biometric-only authentication

  • IP whitelist restrictions

  • Advanced bot detection

Alternatives:

  • Check for API access

  • Export features on the site

  • Manual extraction

  • Browse AI managed services

Best practices

Do's

✅ Test login method before scaling
✅ Monitor success rates regularly
✅ Update cookies proactively
✅ Use the simplest method that works
✅ Document which method you chose


Don'ts

❌ Share login credentials
❌ Use compromised accounts
❌ Violate terms of service
❌ Ignore security warnings
❌ Extract sensitive personal data
