What is Selenium WebDriver & its Architecture

Selenium WebDriver is a powerful open-source tool for automating web applications. In this article, we will look at what Selenium WebDriver is, what its components are, and how its architecture works in detail.

What is Selenium?

Selenium is an open-source functional test automation tool that automates web-based applications only. If you also need to automate Windows components, such as the Windows authentication dialog box or the file upload dialog box, alongside the web page, you can use AutoIt or Sikuli together with Selenium.

Selenium lets you write test automation scripts in several programming languages, including Java, C#, Python, Ruby, and JavaScript, with community-maintained bindings for others such as Perl and PHP. It can run on multiple browsers across multiple operating systems. Selenium WebDriver supports multiple locators to identify web elements. If you want to know more about locators, you can refer to my post on Locators in Selenium Webdriver with Examples.

Selenium Suite of Tools

The major components of Selenium are:

  • Selenium Grid
  • Selenium IDE
  • Selenium RC
  • Selenium WebDriver

Selenium Grid

Selenium Grid lets you run Selenium tests in parallel on different machines, against different browsers and operating systems.

Selenium IDE

Selenium IDE (Integrated Development Environment) is a tool for developing Selenium test cases by recording user actions in the browser. You can install Selenium IDE as an extension for Chrome and Firefox. With Selenium IDE you can quickly record the steps and then enhance them to build test cases efficiently.

Selenium RC

Selenium RC, also known as Selenium 1, is now officially deprecated. It was used to write dynamic scripts that could run on multiple browsers without a browser-specific driver, which WebDriver requires.

What is Selenium WebDriver?

Selenium WebDriver is a collection of open-source APIs and one of the most important components of the Selenium suite of tools. It supports many browsers, such as Chrome, Firefox, Microsoft Edge, IE, and Safari, through their respective browser drivers. Selenium WebDriver is not a newer version of Selenium RC; it is an altogether different tool.

Selenium WebDriver uses the browser automation APIs provided by browser vendors to control the browser and run tests. It supports various modern programming languages such as Java, C#, Python, PHP, Perl, and Ruby.

Operating Systems Supported by Selenium WebDriver

  • Windows
  • Mac OS
  • Linux
  • Solaris

Browsers Supported by Selenium WebDriver

  • Internet Explorer
  • Edge
  • Mozilla Firefox
  • Google Chrome
  • Safari
  • Opera 
  • HtmlUnit (a headless browser)

Selenium WebDriver Architecture

The Selenium WebDriver API acts as an interpreter between the languages and the browsers that Selenium supports. Selenium WebDriver comprises four main components:

  • Selenium Client library (Language Binding)
  • JSON Wire Protocol
  • Browser drivers 
  • Browser

Selenium Client library

The Selenium developers have created client libraries, also called language bindings, so that Selenium can support multiple programming languages. Bindings are available for Java, Python, C#, and more.

JSON Wire Protocol

JSON stands for JavaScript Object Notation. It is used to transfer data between a client and a server on the web. The JSON Wire Protocol defines REST APIs that work over HTTP. For every command in Selenium, there is a corresponding REST API in the JSON Wire Protocol. Note that Selenium 4 replaced the JSON Wire Protocol with the W3C WebDriver protocol, which follows the same model of JSON messages sent over HTTP.

You can see the list of APIs by visiting the Selenium JsonWireProtocol page. You will find a REST API for nearly every Selenium command. Using the JSON Wire Protocol, the REST request for each Selenium command is sent to the browser driver.

[Figure: Selenium WebDriver JSON Wire Protocol APIs]
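To make the idea of "one REST API per command" more concrete, here is a minimal sketch in Java of a lookup table for a handful of commands. The HTTP methods and paths follow the published protocol, but the class name, the command labels on the left, and the `resolve` helper are assumptions made for this illustration and are not part of Selenium itself.

```java
import java.util.Map;

public class WireEndpoints {
    /** One REST endpoint: an HTTP method plus a path template. */
    static final class Endpoint {
        final String method;
        final String pathTemplate;

        Endpoint(String method, String pathTemplate) {
            this.method = method;
            this.pathTemplate = pathTemplate;
        }
    }

    // A few commands and their endpoints. The keys are labels chosen for
    // this sketch, not identifiers taken from the Selenium source code.
    static final Map<String, Endpoint> ENDPOINTS = Map.of(
        "newSession", new Endpoint("POST", "/session"),
        "get", new Endpoint("POST", "/session/{sessionId}/url"),
        "getCurrentUrl", new Endpoint("GET", "/session/{sessionId}/url"),
        "getTitle", new Endpoint("GET", "/session/{sessionId}/title"),
        "quit", new Endpoint("DELETE", "/session/{sessionId}"));

    /** Returns "METHOD /path" for a command, with the session id filled in. */
    static String resolve(String command, String sessionId) {
        Endpoint endpoint = ENDPOINTS.get(command);
        if (endpoint == null) {
            throw new IllegalArgumentException("Unknown command: " + command);
        }
        return endpoint.method + " "
            + endpoint.pathTemplate.replace("{sessionId}", sessionId);
    }

    public static void main(String[] args) {
        // "abc123" is a made-up session id used only for this demonstration.
        System.out.println(resolve("get", "abc123"));      // POST /session/abc123/url
        System.out.println(resolve("getTitle", "abc123")); // GET /session/abc123/title
    }
}
```

A real client library keeps a much larger table of this kind, so that each call in your script can be turned into the right HTTP request for the browser driver.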

Browser Drivers

Each browser has its own browser driver. For example, to run your scripts on the Chrome browser you have to use ChromeDriver, and likewise for other browsers. Every browser driver runs an HTTP server. The browser driver communicates with its browser without revealing the internal logic of how the browser works.

You might wonder why we need browser drivers and why we can't interact with real browsers directly. The rationale is that browsers don't expose their internal functionality to third parties, and to a real browser Selenium is a third-party application. We therefore need a component that can interact with the browser without exposing those internals, and that is the role of the browser driver.

Selenium may someday become advanced enough that these browser drivers are no longer needed, but for the time being we need them.

Browser

The browser is the real browser, such as Chrome, Firefox, Edge, or Safari, which executes the commands sent by the corresponding browser driver.

The following image shows various components of Selenium WebDriver Architecture.

[Figure: Selenium WebDriver architecture]

Selenium WebDriver Architecture Working Mechanism

Consider a short Java script that creates a ChromeDriver instance and then calls driver.get() with Google's URL.

When the Selenium script is executed the Chrome browser will be launched and it will navigate to Google. Now let’s try to understand how it happens.

Whenever we write Selenium code in an IDE such as Eclipse, PyCharm, or Visual Studio and run the program, every Selenium command is converted into JSON format.
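As a rough illustration of that conversion, the sketch below turns the parameters of a navigation command into a JSON body by hand. The real client libraries use a full JSON library for this; the class name `CommandJson` and its helper methods are hypothetical and exist only for this example. The "url" parameter name matches what the protocol expects for a navigation command.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CommandJson {
    /** Wraps a string in quotes and escapes characters JSON does not allow raw. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Serializes flat string parameters into a JSON object, in map order. */
    static String toJson(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            joiner.add(quote(entry.getKey()) + ":" + quote(entry.getValue()));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Parameters of a "navigate to URL" command.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("url", "https://www.google.com");
        System.out.println(toJson(params)); // {"url":"https://www.google.com"}
    }
}
```

This handles only flat string parameters, which is enough for a navigation command. Commands with nested parameters need a proper JSON encoder.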

Using the JSON Wire Protocol, the REST request for each Selenium command is sent to the browser driver. Every browser driver has an HTTP server that receives the request. The browser driver passes the Selenium command to the real browser, where the corresponding action takes place. This communication is bidirectional: after the command is executed in the real browser, the response is sent back to the browser driver.

The browser driver sends the response back over the JSON Wire Protocol, and from there it returns to our automation script. This is how Selenium WebDriver works.
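The round trip described above can be imitated with nothing but the JDK. In the following sketch a tiny local HTTP server stands in for the browser driver, and an HTTP client plays the part of the Selenium client library. This is an assumption-heavy illustration and not how ChromeDriver is implemented: the stub never talks to a browser, the session id "abc123" is made up, and the reply body is a placeholder modelled on the simple value wrapper used in WebDriver responses.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class RoundTripSketch {
    /** Starts a stand-in "browser driver" on a free local port. */
    static HttpServer startStubDriver() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/session", exchange -> {
            // A real driver would pass the command on to the browser here.
            exchange.getRequestBody().readAllBytes(); // read the JSON command
            byte[] reply = "{\"value\":null}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        });
        server.start();
        return server;
    }

    /** Client-library role: sends a navigate command, returns status and body. */
    static String sendNavigate(int port, String sessionId, String url)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://127.0.0.1:" + port + "/session/" + sessionId + "/url"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString("{\"url\":\"" + url + "\"}"))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    /** Runs one full request/response cycle against the stub driver. */
    static String demo() throws IOException, InterruptedException {
        HttpServer driver = startStubDriver();
        try {
            return sendNavigate(driver.getAddress().getPort(),
                "abc123", "https://www.google.com");
        } finally {
            driver.stop(0); // shut the stub down so the JVM can exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // 200 {"value":null}
    }
}
```

The point of the sketch is the shape of the exchange: the script's command leaves as an HTTP request with a JSON body, and the driver's answer comes back as an HTTP response that the client library hands to your code.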

To start learning Selenium WebDriver from the beginning, please refer to my post on Java Installation & Environment Setup for Selenium WebDriver.
