For the first time in a long time, I’ve had to bypass WebDriver.
In this post I’ll explain why, and look at the pros and cons of doing so.
I encountered this situation when automating for a client and have not created a sample web app to reproduce it, so I’ll explain the scenario in general terms.
The app I was automating:
- displayed a modal information dialog with a cancel button, which remained on screen until cancelled
- at the same time, displayed a notification status progress dialog at the bottom, which was rendered for about 4 seconds
- changed the DOM in the background at various points
- animated both the modal dialog and the notification dialog with CSS
The user can click the notification progress dialog to go to a new page showing the details of the notification.
With WebDriver I need to click the cancel button on the dialog before I can click the notification. I found it very hard to consistently click the cancel button and then click the notification dialog, while it was still on screen, to go to the new page.
To get everything happening in a timely fashion I issued a ‘.click’ directly on the element in the page, rather than using the normal WebDriver click.
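The two kinds of click can be sketched as follows. This is a minimal sketch rather than the code used for the client: it assumes Selenium's Java bindings, and it assumes the bypass is done by running JavaScript in the page through ‘JavascriptExecutor’. The class and method names are hypothetical.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

// Hypothetical helper class, for illustration only.
public class DialogClicks {

    // Normal click: goes through the WebDriver abstraction, so the driver
    // decides how to perform the click for the browser in use.
    public static void normalClick(WebDriver driver, By locator) {
        driver.findElement(locator).click();
    }

    // Bypass: find the element, then ask the browser to run element.click()
    // in the page. The driver's own click handling is skipped.
    public static void javascriptClick(WebDriver driver, By locator) {
        WebElement element = driver.findElement(locator);
        ((JavascriptExecutor) driver).executeScript("arguments[0].click();", element);
    }
}
```

A call might look like ‘DialogClicks.javascriptClick(driver, By.id("cancel-button"))’, where the id is made up for the example.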
Pros & Cons
As soon as the element is available we can find it, then issue the click event.
In the app under test you barely even notice that the dialog is present: it appears in the DOM, the CSS animation slowly starts to reveal it, the click event is issued and the dialog goes away.
Because of its speed, it could tempt many people into using this in preference to the normal WebDriver click.
I used this in a single place, for a single dialog.
Click doesn’t always work
‘click’ doesn’t always work for every app. Sometimes we might need to issue other events, and might need to trigger them in different ways.
We bypassed the Driver abstraction
WebDriver provides a set of abstracted implementations, which is a key benefit of WebDriver. We don’t really know how a ‘.click’ is implemented across Firefox, Chrome, Safari, Edge, etc. The driver handles that for us.
If we now issue a ‘click()’ in the page instead, we may have reduced our ability to run on different browsers.
WebDriver also waits for certain conditions to be true before clicking e.g. has the element been rendered, is the z-order correct, will it receive the event, etc.
If we bypass all of this, then our execution is faster but might have unintended consequences or potential synchronisation issues that could be hard to resolve.
It might work now, but later, when browsers and drivers change, this approach might not work. If we stick to the abstraction as intended then the development teams who maintain the drivers ensure that the correct semantics are enforced for the various versions of the browser. They concentrate on making the physical interaction between Driver and Browser robust for the various operating systems and versions. We concentrate on making the synchronisation of our @Test abstractions with the application state robust.
Apps might expect more
When we issue a mouse click we do more than ‘click()’. We mouse over, mouse move, hover, mouse down, mouse up, click, etc.
If the application has listeners on these other events then we might not trigger all the side-effects that occur with a normal human mouse click.
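If a bare ‘click()’ is not enough, one option is to dispatch a fuller sequence of events from JavaScript. The sketch below only builds the script text in plain Java. The class name, method name and choice of events are hypothetical, and which events an application actually listens for has to be checked against the app itself. Synthetic events still differ from a real mouse interaction, so this remains a workaround.

```java
import java.util.List;

// Hypothetical helper class, for illustration only.
public class MouseEventScript {

    // Build JavaScript that dispatches each named mouse event, in order,
    // on the element passed to the script as arguments[0].
    public static String buildDispatchScript(List<String> eventNames) {
        StringBuilder script = new StringBuilder("var el = arguments[0];\n");
        for (String name : eventNames) {
            script.append("el.dispatchEvent(new MouseEvent('")
                  .append(name)
                  .append("', {bubbles: true, cancelable: true, view: window}));\n");
        }
        return script.toString();
    }

    public static void main(String[] args) {
        // The event list is an assumption, not a complete model of a real click.
        String script = buildDispatchScript(
                List.of("mouseover", "mousemove", "mousedown", "mouseup", "click"));
        System.out.println(script);

        // With Selenium on the classpath the script could then be run as:
        // ((JavascriptExecutor) driver).executeScript(script, element);
    }
}
```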
Why do it then?
With every workaround we have to consider the risks.
Because I’m doing this in a single isolated place, and have no intention of doing this in any other parts of the app, I view this as low risk.
I’m also prototyping the automating to see what it is possible to automate and what is hard.
This was hard.
It may lead to design changes in the application to make the application easier to automate, e.g. change the CSS animation to be faster, increase the time the notification is on screen, change the z-order to bring the notification to the top, add a link to the new page in the ‘cancel’ dialog so we don’t need to click the notification, etc. The more we make the application support automating: the fewer workarounds we need, the more robust the execution becomes, the less tempted we are to bypass WebDriver abstractions.
Because it is hard, it is not something we want to repeat in many places in the automated coverage, and we are aware that there are risks when expanding to multiple browsers etc.
Because it was hard and risky, I would also look at alternative approaches for validating some of the conditions:
- does the notification allow clicking to go to a new page?
  - this could be done by interacting with the application on each release, but could be time consuming.
- does the notification have the correct URL linking to the new page?
  - scrape the ‘href’ from the DOM and ‘.get’ that page separately and in our own time, rather than trying to click quickly
- amend the app to support automating (as mentioned previously)
- don’t automate this scenario
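The ‘href’ alternative in the list above can be sketched in plain Java. This is a minimal sketch under assumptions: the URLs are made up, and the class and method names are hypothetical. In a real test the page URL and the ‘href’ would be read from the browser, for example with Selenium's ‘getCurrentUrl()’ and ‘getAttribute("href")’, and the resolved address could then be passed to ‘driver.get(...)’.

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
public class NotificationLinkCheck {

    // Resolve an href scraped from the DOM against the URL of the page it came from.
    // An absolute href is returned unchanged; a relative one is resolved.
    public static URI resolveHref(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href);
    }

    // True when the scraped href points at the page we expect the notification to open.
    public static boolean linksTo(String pageUrl, String href, String expectedUrl) {
        return resolveHref(pageUrl, href).equals(URI.create(expectedUrl));
    }

    public static void main(String[] args) {
        // Made-up values standing in for data read from the browser.
        String pageUrl = "https://example.com/app/dashboard";
        String href = "/app/details/42";

        URI target = resolveHref(pageUrl, href);
        System.out.println("Notification links to: " + target);
        System.out.println("Matches expected page: "
                + linksTo(pageUrl, href, "https://example.com/app/details/42"));

        // With a WebDriver session the target could then be visited in our own time:
        // driver.get(target.toString());
    }
}
```

This checks the link without racing the 4 second notification, at the cost of not proving that the click itself works.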
I prefer not to bypass WebDriver. When prototyping I do create workarounds to demonstrate feasibility, but I also try and convince people not to use the workaround.
Sometimes we do need to do things that, in general, we wouldn’t want to rely on long term, and we need to understand the tools we use to make those options open to us.
The full source for this is in my Webdriver Java FAQs project.