
3 minute read - Test Automation

API Overview - Categorising WebDriver – Navigation, Interrogation, Manipulation

Aug 21, 2012

Hands up who uses generalisation in their modelling strategies? Me too! Here are 5 classifications I use for thinking about the WebDriver API.

I’ve worked through the WebDriver API in some detail, and as part of figuring out a way to replay the API back to people learning WebDriver, I have adopted 3 main classifications:

  • Navigation
  • Interrogation
  • Manipulation

My 4th categorisation has resulted in fewer methods:

  • Synchronisation

I initially expected to find more in the Synchronisation category, since the strategies around synchronisation often consume a lot of automation time. But in retrospect it makes sense since WebDriver abstracts the Driver and DOM interaction, not the application level semantics around elegant progression.

Initially I used Navigation, Inspection, Interaction and Synchronisation but I didn’t want so many beginning with the same letter. I find that bad for my mnemonics. So feel free to chop and change the words as most appropriate for you.

In this post I will present an initial categorisation of portions of the API. I will not present the full API.

At a high level we have the 5 main areas. What? First we had 3, then 4, now 5.

Some areas of the API relate so closely to specific domain elements that I have grouped them in an easy to find categorisation first.

Navigation has the smallest set of API elements for a high level category:

Again I probably should not have experienced surprise at finding a small set of API functionality, since so much of navigation in automation involves manipulating the application under test.


Initially I had the Interrogation category as Inspect, influenced I suspect by Firefox and Chrome. But I moved to Interrogation because it sounds more macho.


WebDriver manages to squeeze a lot of bang for the buck out of its API with click and sendKeys doing the vast bulk of the donkey work on HTML elements for us. We can drop down into the Advanced User Interactions subset for more detailed work, and some support classes exist to help with Select elements.

Pretty concise and easy to remember.


Our tests become flaky and intermittent when we do not get our Synchronisation strategies correct. We have many options open to us in terms of the semantics of Synchronisation but we tend to rely on waiting for conditions to implement them.


WebDriver provides help in the support classes for this.
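
To make "waiting for conditions" concrete, here is a minimal plain-Java sketch of the polling idea that support classes such as WebDriverWait and ExpectedConditions are built around. It is not WebDriver's own code: the class name `PollingWait`, its constructor and its `until` method are hypothetical names chosen for illustration, and the rule for when a condition counts as met (any result other than null or false) is an assumption of this sketch.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Illustrative stand-in for a condition-based wait; not part of Selenium. */
final class PollingWait {

    private final Duration timeout;
    private final Duration pollInterval;

    PollingWait(Duration timeout, Duration pollInterval) {
        this.timeout = timeout;
        this.pollInterval = pollInterval;
    }

    /**
     * Re-evaluates the condition until it returns something other than
     * null or false, or throws once the timeout has elapsed.
     */
    <T> T until(Supplier<T> condition) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T result = condition.get();
            if (result != null && !Boolean.FALSE.equals(result)) {
                return result; // condition met, hand the value back to the test
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("Condition not met within " + timeout);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }
}
```

A test would then wrap whatever "ready" means for the application in the supplier, for example `new PollingWait(Duration.ofSeconds(10), Duration.ofMillis(250)).until(() -> page.isReady())`, where `page.isReady()` is a made-up application-level check.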

Built in, we have the implicit wait associated with some of the other API calls and with the Page Objects, and while these can help us build tests quickly, we don’t want to rely on those long term.

Much remains for me to write on the topic of Synchronisation.


Hey look, a category that he doesn’t know how to categorise. Yup, that happens. I value flexibility in my modelling. When we expand this category you will find that I have subcategorised into the above categories.

But some parts of the API, while not located together in the API itself, I found value in grouping in terms of the domain.


This acts as a cheat sheet reminder of where to find stuff. I haven’t put all the params and return types, because I can get all that with code completion.
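
The same cheat-sheet idea can be sketched in code. The snippet below is only an illustration: the class `WebDriverCheatSheet`, the `Category` enum and the placement of each call are my own reading of the four categories named in this post, not a copy of the mind map, and it lists a handful of well-known WebDriver and WebElement calls as plain strings rather than the full API.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cheat sheet: category -> example WebDriver calls, as plain text. */
final class WebDriverCheatSheet {

    enum Category { NAVIGATION, INTERROGATION, MANIPULATION, SYNCHRONISATION }

    static Map<Category, List<String>> examples() {
        Map<Category, List<String>> sheet = new EnumMap<>(Category.class);
        // Moving the browser between pages.
        sheet.put(Category.NAVIGATION, List.of(
                "driver.get(url)",
                "driver.navigate().to(url)",
                "driver.navigate().back()",
                "driver.navigate().forward()",
                "driver.navigate().refresh()"));
        // Reading state from the browser or an element.
        sheet.put(Category.INTERROGATION, List.of(
                "driver.getTitle()",
                "driver.getCurrentUrl()",
                "driver.getPageSource()",
                "driver.findElement(by)",
                "element.getText()",
                "element.isDisplayed()"));
        // Changing state on the page.
        sheet.put(Category.MANIPULATION, List.of(
                "element.click()",
                "element.sendKeys(keys)",
                "element.clear()",
                "element.submit()"));
        // Waiting for the application to catch up.
        sheet.put(Category.SYNCHRONISATION, List.of(
                "WebDriverWait ... until(condition)",
                "ExpectedConditions",
                "driver.manage().timeouts().implicitlyWait(...)"));
        return sheet;
    }
}
```

Calling `WebDriverCheatSheet.examples().get(WebDriverCheatSheet.Category.MANIPULATION)` returns the short manipulation list, which reflects how much of the work `click` and `sendKeys` do.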

The WebDriver API used to have a lot more complexity to it, but over the years the team have distilled it into a pretty tight and fairly small set of methods.

I want it!

You can download the full mind map – should you want it by doing a ‘save as’ on

I used FreePlane as the mind mapping tool.

You can also find the mind map and images over at GitHub. I even converted it to PDF for you.

I’ve uploaded it to MindMeister so I can make it public and share it here.