Browser Fingerprints 101: Automation Detection

Browser Fingerprinting

Browser fingerprinting is a technique for identifying and tracking users. It builds a unique "fingerprint" from information collected from the browser and device, such as the User-Agent, screen resolution, and installed fonts. The same technique can also help determine whether a browser is controlled by a human or by an automated tool.

For example, a browser driven by Selenium may expose a property named _selenium on the window object. A Chrome instance automated by Puppeteer reports window.navigator.webdriver as the boolean value true. By checking properties like these, a website can tell whether a visitor is using an automation tool.

In addition, automation tools tend to act quickly and at regular intervals, which differs noticeably from the behavior of normal users. For example, a human filling out a form usually pauses briefly between keystrokes, while an automated tool may complete the entire form almost instantly. By analyzing this behavior, a website can infer whether a visitor is automated.
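
The timing difference described above can be turned into a simple heuristic. The following is a minimal Python sketch, not how any particular website does it: it assumes the site has already recorded key press timestamps in milliseconds, and the function name and both thresholds are illustrative assumptions rather than measured values.

```python
from statistics import mean, pstdev


def looks_automated(key_times_ms, min_mean_gap_ms=30.0, min_jitter_ms=5.0):
    """Return True when keystroke timing looks too fast or too regular.

    key_times_ms: timestamps (in milliseconds) of successive key presses.
    Both thresholds are illustrative assumptions, not measured values.
    """
    if len(key_times_ms) < 3:
        return False  # too little data to judge either way

    # Gaps between consecutive key presses.
    gaps = [later - earlier for earlier, later in zip(key_times_ms, key_times_ms[1:])]

    too_fast = mean(gaps) < min_mean_gap_ms       # whole form typed in an instant
    too_regular = pstdev(gaps) < min_jitter_ms    # almost no variation between keys
    return too_fast or too_regular
```

A real detector would likely combine many more signals (mouse movement, scrolling, focus changes), so a single check like this should be read only as an illustration of the idea.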

In summary, websites combine various fingerprinting techniques to identify visitors. Once a visitor is classified as a "robot", the account may be banned or the IP address may be blacklisted.

How Automation Tools Affect Browser Fingerprints

Automation tools such as Selenium and Puppeteer can change a browser's fingerprint.

Selenium is a commonly used automation tool that supports multiple browsers (Chrome, Firefox, Safari) and communicates with the browser through WebDriver. It can be driven from various programming languages (such as Java, Python, and C#) and can simulate human user behavior such as clicking, scrolling, and filling in forms.

Puppeteer communicates with the Chrome or Chromium browser via the DevTools protocol. This protocol gives Puppeteer access to many internal features of the browser, including network request interception, PDF generation, and screenshots.

Operating a browser with these tools affects its fingerprint in the following ways:

User-Agent

Both Selenium and Puppeteer let you set the browser's User-Agent. If your custom User-Agent does not follow the conventional format or is implausible, the website may treat the visitor as a robot.
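
As a rough illustration of what "unreasonable" can mean here, the Python sketch below compares a User-Agent string with the platform value the same browser reports (for example through navigator.platform). The function name, the token table, and the specific rules are assumptions made for this example; real checks are usually far more extensive.

```python
# Hypothetical mapping from a User-Agent token to the platform prefix
# that a matching browser would typically report.
UA_PLATFORM_HINTS = {
    "Windows NT": "Win",
    "Macintosh": "Mac",
    "X11; Linux": "Linux",
}


def user_agent_issues(user_agent, platform):
    """Return a list of reasons why a User-Agent looks implausible."""
    issues = []
    if not user_agent.strip():
        return ["empty User-Agent"]

    # Mainstream browsers conventionally start their User-Agent this way.
    if not user_agent.startswith("Mozilla/5.0"):
        issues.append("missing conventional Mozilla/5.0 prefix")

    # The operating system named in the User-Agent should agree with
    # the platform value reported by the browser itself.
    for token, expected_prefix in UA_PLATFORM_HINTS.items():
        if token in user_agent and not platform.startswith(expected_prefix):
            issues.append(
                f"User-Agent claims {token!r} but platform is {platform!r}"
            )
    return issues
```

An empty result does not prove the browser is genuine. It only means that none of these few consistency rules fired.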

JavaScript Environment

When Selenium or Puppeteer controls the browser, window.navigator.webdriver is set to true. This property is defined by the W3C WebDriver specification as the standard way to tell websites that the browser is under automated control. As a result, any website that can execute JavaScript can read it and detect the automation.

Feature Code

A browser controlled by Selenium can expose many identifiers that contain strings such as "selenium" or "webdriver". If a website finds any of the following, it can be fairly confident that the browser is controlled by Selenium or a similar tool:

webdriver  
__driver_evaluate  
__webdriver_evaluate  
__selenium_evaluate  
__fxdriver_evaluate  
__driver_unwrapped  
__webdriver_unwrapped  
__selenium_unwrapped  
__fxdriver_unwrapped  
_Selenium_IDE_Recorder  
_selenium  
calledSelenium  
_WEBDRIVER_ELEM_CACHE  
ChromeDriverw  
driver-evaluate  
webdriver-evaluate  
selenium-evaluate  
webdriverCommand  
webdriver-evaluate-response  
__webdriverFunc  
__webdriver_script_fn  
__$webdriverAsyncExecutor  
__lastWatirAlert  
__lastWatirConfirm  
__lastWatirPrompt
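
To illustrate how a list like this might be used, here is a minimal Python sketch. It assumes the property and attribute names found on the page's window and document objects have already been collected and sent to the server as a list of strings. That collection step, the function name, and the exact-match rule are assumptions for this example. In practice such checks usually run as JavaScript inside the page.

```python
# Marker names copied from the list above.
AUTOMATION_MARKERS = (
    "webdriver", "__driver_evaluate", "__webdriver_evaluate",
    "__selenium_evaluate", "__fxdriver_evaluate", "__driver_unwrapped",
    "__webdriver_unwrapped", "__selenium_unwrapped", "__fxdriver_unwrapped",
    "_Selenium_IDE_Recorder", "_selenium", "calledSelenium",
    "_WEBDRIVER_ELEM_CACHE", "ChromeDriverw", "driver-evaluate",
    "webdriver-evaluate", "selenium-evaluate", "webdriverCommand",
    "webdriver-evaluate-response", "__webdriverFunc", "__webdriver_script_fn",
    "__$webdriverAsyncExecutor", "__lastWatirAlert", "__lastWatirConfirm",
    "__lastWatirPrompt",
)


def find_automation_markers(property_names):
    """Return the known markers that appear among the collected names.

    property_names: names assumed to come from window and document.
    Matching is exact, and results follow the order of AUTOMATION_MARKERS.
    """
    seen = set(property_names)
    return [marker for marker in AUTOMATION_MARKERS if marker in seen]
```

One caveat: names taken from the navigator object should not be fed into this check, because navigator.webdriver can exist with the value false in an ordinary browser. For that property the value matters, not its presence.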

Can Automation Detection Be Bypassed?

Yes, to a degree. Remove code or strings that carry robot signatures from your scripts, and make the script's control of the browser resemble human operation as closely as possible, for example by adding delays between actions.
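
As one possible way to add delays, the Python sketch below generates an irregular pause for each character to be typed. The function name and all default values are assumptions chosen for illustration, not recommended settings, and the sketch does not call any automation library itself.

```python
import random


def typing_delays(text, base_s=0.12, jitter_s=0.08,
                  pause_chance=0.1, pause_s=0.6, rng=None):
    """Return one delay in seconds for each character of text.

    Each delay is base_s plus or minus up to jitter_s, with an occasional
    longer pause to imitate hesitation. All defaults are illustrative.
    """
    rng = rng or random.Random()
    delays = []
    for _ in text:
        delay = base_s + rng.uniform(-jitter_s, jitter_s)
        if rng.random() < pause_chance:
            delay += pause_s  # occasional longer "thinking" pause
        delays.append(max(delay, 0.01))  # never drop below 10 ms
    return delays
```

A script could then send the characters one at a time and sleep for the matching delay after each one, instead of filling the whole field in a single call.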

Before running an automation script against the target website, you can use the BrowserScan Webdriver tool to check whether your browser looks like that of a real human user.
