Appium Architecture

Nitin Bhardwaj
Published in The Startup, Feb 18, 2021

Mobile application testing has become a vital part of the software testing industry over the past few years. With shorter release cycles, complex features, cut-throat competition and the pressure to ship the best-quality application on the market, manual testing alone cannot keep up after a certain number of releases. This has led to the rise of mobile automation testing. A stable mobile automation framework considerably reduces testing time while keeping the quality of testing high. A wide range of mobile automation tools and frameworks is available today, and teams can pick one based on their requirements. Among these, Appium has become one of the most favoured tools for mobile automation testing.

In this article, we are going to cover the following topics:

  • What is Appium?
  • Why do we use Appium?
  • Appium architecture | How does Appium work?
  • How does Appium work on Android?
  • How does Appium work on iOS?
  • What is the JSON Wire Protocol?
  • What is bootstrap.jar?
  • What is WebDriverAgent.app?

What is Appium?

Appium is an open-source test automation framework for mobile apps on platforms such as iOS and Android. It also supports mobile web browser automation and offers client libraries for a wide range of programming languages. It is backed by Sauce Labs, and its first official release was in May 2014. In 2016, Sauce Labs donated Appium to the JS Foundation to ensure it always stays open source.

Why do we use Appium?

Appium offers most of the capabilities that users look for in a mobile automation tool. Some of them are:

  • It is open source and has vast community support.
  • It covers the Android, iOS and Windows platforms, so you don’t need multiple frameworks to automate your app across platforms.
  • It supports native applications, hybrid applications and web browsers on mobile.
  • It supports major programming languages such as Java, Python, C#, Ruby, JavaScript and PHP. Appium provides client libraries tailor-made for these languages.
  • It integrates seamlessly with Continuous Integration tools such as Jenkins.
  • Your test code can run on emulators, simulators, real devices and cloud-based mobile automation services such as AWS Device Farm and BrowserStack.
  • Under the hood, it uses Selenium WebDriver commands. If you have previously worked with Selenium, you have a head start in learning Appium.
  • Unlike native mobile automation frameworks such as XCUITest and Espresso, it runs against the built application file, so you don’t need to recompile the application code to run tests.
  • Your automation language does not depend on the language your developers use, which makes Appium very flexible. For example, if your developers wrote the Android application in Kotlin and you are comfortable with Java, you can still write your tests in Java.

Native App: An app written specifically for one platform, such as Android or iOS. An Android native app cannot run on an iPhone, and vice versa.

Hybrid App: A web application written in HTML5 and JavaScript and wrapped inside a native container. In practice it behaves like a web page in a browser, but it is packaged and installed as an application.

Web Browsers: Safari on iOS, and Chrome or the built-in browser on Android.

Appium Architecture | How does Appium work?

At its core, Appium is an HTTP server written in Node.js that exposes a REST API. The client communicates with the Appium server through this REST API, and the exchange is governed by the Mobile JSON Wire Protocol.

The first step in this communication flow is creating a session. The client initiates it by sending the server a request that carries session-related information as key-value pairs, called Desired Capabilities. Based on the Desired Capabilities, Appium can tell whether the target is iOS or Android and launch a session on the target device, simulator or emulator. Session initiation is essentially a POST /wd/hub/session request from the client, and Appium responds with a session id.

Once the session is established, the client and the Appium server use the session id as the reference for every subsequent interaction.
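As a rough illustration of the session handshake described above, the sketch below builds the body of a new-session request and pulls the session id out of a reply. It uses only Python's standard library and sends nothing over the network. The capability values (device name, app path) and the sample reply are made-up placeholders, and the helper names are hypothetical rather than part of any Appium client library.

```python
import json

# Endpoint named in the article for session creation.
NEW_SESSION_PATH = "/wd/hub/session"


def build_new_session_request(capabilities):
    """Return (method, path, body) for a new-session request."""
    body = json.dumps({"desiredCapabilities": capabilities})
    return "POST", NEW_SESSION_PATH, body


def extract_session_id(response_text):
    """Pull the session id out of a JSON reply from the server."""
    return json.loads(response_text)["sessionId"]


# Placeholder capabilities; real values depend on your device and app.
caps = {
    "platformName": "Android",
    "deviceName": "emulator-5554",
    "app": "/path/to/app.apk",
}
method, path, body = build_new_session_request(caps)

# A made-up reply shaped like a JSON Wire Protocol response.
sample_reply = '{"sessionId": "abc-123", "status": 0, "value": {}}'
session_id = extract_session_id(sample_reply)
print(method, path)
print(session_id)
```

Every later request would then carry this session id in its URL, which is how the server knows which device session a command belongs to.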

How does Appium work on Android?

For Android devices, Appium uses the UIAutomator APIs to interact with the UI components of the application under test.

  • The client library converts the user-written commands into REST API requests.
  • These requests are sent to the Appium server using the Mobile JSON Wire Protocol.
  • The Appium server forwards these requests to the target Android device or emulator.
  • bootstrap.jar interprets the commands and converts them into UIAutomator commands that the device understands.
  • The UIAutomator commands are then performed on the device or emulator.
  • The device or emulator returns the outcome of the command to the Appium server via bootstrap.jar.
  • The Appium server forwards this response to the client.
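The hops listed above can be pictured as a chain of function calls. The following is a toy, in-process model only: every function name is hypothetical, no real device or Appium server is involved, and the UIAutomator step is faked by returning a string.

```python
import json


def client_library(action, element_id):
    """Turn a user-level command into a JSON request body."""
    return json.dumps({"action": action, "element": element_id})


def fake_uiautomator(action, element_id):
    """Stand-in for UIAutomator acting on the device."""
    return f"{action} performed on {element_id}"


def bootstrap(request_body):
    """Interpret the request, run it on the 'device', report the outcome."""
    command = json.loads(request_body)
    outcome = fake_uiautomator(command["action"], command["element"])
    return json.dumps({"status": 0, "value": outcome})


def appium_server(request_body):
    """Forward the request to the device side and relay the response."""
    return bootstrap(request_body)


request = client_library("click", "login_button")
response = json.loads(appium_server(request))
print(response["value"])
```

The point of the model is the direction of travel: the request goes client, server, bootstrap, UIAutomator, and the result retraces the same path in reverse.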

How does Appium work on iOS?

For iOS devices, Appium uses Apple’s native XCUITest API to interact with the UI components of the application under test. XCUITest is a UI testing framework built on top of Apple’s unit testing framework, XCTest.

  • The client library converts the user-written commands into REST API requests.
  • These requests are sent to the Appium server using the Mobile JSON Wire Protocol.
  • The Appium server forwards these requests to the target iOS device or simulator.
  • WebDriverAgent.app interprets the commands and carries them out by calling Apple’s XCUITest API.
  • The commands are then performed on the device or simulator.
  • The device or simulator returns the outcome of the command to the Appium server via WebDriverAgent.app.
  • The Appium server forwards this response to the client.

Note: WebDriverAgent.app is used specifically with the XCUITest framework. Earlier versions of Appium used the UIAutomation framework for iOS, with bootstrap.js acting as the middleware between the device and the Appium server.
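The Android and iOS flows share the same front half; what differs is the on-device component that receives the forwarded command. Here is a minimal sketch of that choice, assuming a simple lookup on the platformName capability. The function and table are hypothetical, and Appium's real driver selection takes more inputs than this.

```python
# Hypothetical lookup: platform -> (automation framework, on-device agent),
# using only the pairs named in the article.
BACKENDS = {
    "android": ("UIAutomator", "bootstrap.jar"),
    "ios": ("XCUITest", "WebDriverAgent.app"),
}


def select_backend(capabilities):
    """Pick the framework and agent for a session from its capabilities."""
    platform = capabilities.get("platformName", "").lower()
    if platform not in BACKENDS:
        raise ValueError(f"unsupported platformName: {platform!r}")
    return BACKENDS[platform]


print(select_backend({"platformName": "iOS"}))
print(select_backend({"platformName": "Android"}))
```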

What is the JSON Wire Protocol?

Communication between the client and the server over the REST API takes place as an exchange of JSON (JavaScript Object Notation).

JSON is a lightweight, language-independent data interchange format.

Example of basic JSON:

{
  "Student": {
    "FirstName": "Appium",
    "LastName": "Selenium",
    "IdNumber": "12345",
    "City": "New Delhi",
    "EmailID": "email@gmail.com"
  }
}

The JSON Wire Protocol is a predefined specification that maps actions such as click, type and scroll to HTTP requests and responses. In simple terms, it is a set of rules that defines what data is sent between client and server, in what order, and in what format.

Appium uses the Mobile JSON Wire Protocol, which extends the JSON Wire Protocol with mobile-specific commands. It enables the Appium server to manage communication with mobile devices.
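To make the idea of mapping actions to HTTP concrete, here is a small sketch of a route table for a handful of commands. The routes follow the JSON Wire Protocol as we understand it, with the /wd/hub prefix taken from the session URL mentioned earlier; the command names on the left and the to_http helper are our own labels, not part of the protocol.

```python
# (HTTP method, path template) for a few JSON Wire Protocol commands.
ROUTES = {
    "new_session": ("POST", "/wd/hub/session"),
    "find_element": ("POST", "/wd/hub/session/{session_id}/element"),
    "click": ("POST", "/wd/hub/session/{session_id}/element/{element_id}/click"),
    "send_keys": ("POST", "/wd/hub/session/{session_id}/element/{element_id}/value"),
    "get_text": ("GET", "/wd/hub/session/{session_id}/element/{element_id}/text"),
    "quit": ("DELETE", "/wd/hub/session/{session_id}"),
}


def to_http(command, **ids):
    """Resolve a command name to a concrete HTTP method and URL path."""
    method, template = ROUTES[command]
    return method, template.format(**ids)


print(to_http("click", session_id="abc-123", element_id="7"))
print(to_http("quit", session_id="abc-123"))
```

A client library is, in large part, a friendlier wrapper around a table like this: it exposes methods such as click and fills in the session and element ids for you.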

Flow of communication between Client and Server

Suppose a client wants to perform an action on the device. It serializes the action, held as an object in the client's programming language, into a JSON object and sends it to the server. The server parses the JSON back into an object of its own, processes it, then serializes the response object into JSON and sends it back. Finally, the client parses the JSON response into an object it can work with.
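The round trip described above can be sketched with Python's json module and two small dataclasses. The Command and Result shapes are invented for illustration and are much simpler than what Appium actually exchanges.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Command:
    action: str
    target: str


@dataclass
class Result:
    status: int
    value: str


def client_send(command):
    """Client side: object -> JSON text."""
    return json.dumps(asdict(command))


def server_handle(request_text):
    """Server side: JSON text -> object, process it, object -> JSON text."""
    command = Command(**json.loads(request_text))
    result = Result(status=0, value=f"{command.action} ok on {command.target}")
    return json.dumps(asdict(result))


def client_receive(response_text):
    """Client side: JSON text -> object."""
    return Result(**json.loads(response_text))


result = client_receive(server_handle(client_send(Command("tap", "submit"))))
print(result)
```

Because only JSON text crosses the wire, the client and the server are free to use different programming languages, which is what lets one Appium server serve Java, Python and Ruby clients alike.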

What is bootstrap.jar?

The Appium server interacts with Android devices through bootstrap.jar. When the server starts an Android driver session, it pushes the bootstrap.jar file to the device, and the device executes it using its built-in uiautomator command. Once running, bootstrap.jar starts a server on the device that listens on port 4724 for requests coming from the Appium server.

On receiving a command, it converts the command into UIAutomator calls, which are supported on Android API level 17 or higher. UIAutomator then performs the desired action on the device.
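The listen-and-translate role described above can be sketched as a tiny TCP server. This is a toy written in Python purely to show the pattern; the line-delimited JSON framing and the message fields are invented for the demo and are not bootstrap.jar's actual wire format, and to_uiautomator only returns a string instead of driving a device.

```python
import json
import socket
import socketserver
import threading

BOOTSTRAP_PORT = 4724  # port named in the article


def to_uiautomator(command):
    """Stand-in for translating a request into a UIAutomator call."""
    return f"UIAutomator: {command['action']} {command['params']}"


class CommandHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Read one JSON command per line and reply with one JSON line.
        command = json.loads(self.rfile.readline())
        reply = {"status": 0, "value": to_uiautomator(command)}
        self.wfile.write((json.dumps(reply) + "\n").encode())


def send_command(port, command):
    """Play the Appium server's role: send one command, read the reply."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall((json.dumps(command) + "\n").encode())
        return json.loads(conn.makefile().readline())


# Port 0 asks the OS for any free port so the demo never collides;
# the component described in the article would bind BOOTSTRAP_PORT instead.
with socketserver.TCPServer(("127.0.0.1", 0), CommandHandler) as server:
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    reply = send_command(port, {"action": "click", "params": {"id": "login"}})
    server.shutdown()

print(reply["value"])
```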

What is WebDriverAgent.app?

WebDriverAgent is a WebDriver server implementation for iOS that can be used to control iOS devices remotely. It lets you perform actions such as launching and killing applications, tapping, and scrolling views. It works by linking XCTest.framework and calling Apple’s API to execute commands directly on a device or simulator. WebDriverAgent was developed and used at Facebook for end-to-end testing and has since been adopted by Appium as the backend for its XCUITest-based iOS support.

When Appium first interacts with an iOS device or simulator, it checks for WebDriverAgent.app. If the app is not present on the device, Appium installs it before doing anything else.

Summary

  • Appium is an open-source, cross-platform automation framework that supports native and hybrid applications on iOS and Android.
  • It also supports mobile web browser automation.
  • It is backed by Sauce Labs and has vast community support.
  • It supports a wide variety of popular programming languages.
  • It works as a black-box testing framework and does not require the application code to be recompiled to run tests.

I hope this article has been helpful to the enthusiasts looking to explore Mobile automation testing using Appium.

Take care. Keep Learning. Stay SAFE.
