Automating Browsers with Selenium WebDriver [Software Development Engineer in Test Article Series Part 14]
So far I have written 13 articles, mostly about the structure of software testing. From the 8th article onward, the articles are technical. You can read the previous articles from here.
This article is about a well-known, and maybe the most popular, browser automation tool: Selenium WebDriver. You will get a brief explanation of Selenium WebDriver and some sample use cases.
I assume that the reader can code in Java and is familiar with the IntelliJ IDEA IDE.
Technical prerequisites:
- IntelliJ IDEA must be installed.
- JDK must be installed.
- Google Chrome Browser must be installed.
Note: You can access the GitHub repo of the project used in this article from here.
Q: Since we are creating automated tests, it is not too hard to predict that we will give commands to web pages, is it?
A: Right, that is the main idea. We have to interact with web pages like an end user does. To perform this interaction, we have to add the Selenium WebDriver library to our project.
Q: What is it? Does it drive the web?
A: Just so. It takes our commands (for example: read text from the page, click this button, ...) and performs the actions directly on the web page that we are testing.
Q: It must be a powerful testing tool, right?
A: Actually, Selenium is not a testing tool. It only automates browsers. The testing part is a separate subject. In this session we will only practice automating browsers with Selenium WebDriver.
Q: I am so excited. Let’s create a project and start automating. How can we add the Selenium WebDriver dependency to our project?
A: Easy boy, easy!
1. First, create a Maven project in IntelliJ. You can follow the steps written here. Name your project as you wish. Mine is “ExploringSeleniumWebDriver”.
2. Then, add the Selenium WebDriver dependency. You can follow the steps written here. Search for “selenium webdriver” and add it to your project.
3. Add our second dependency: WebDriverManager. You can follow the steps written here. Search for “bonigarcia” and add it to your project.
This is my pom.xml file after adding the dependencies:
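As a rough sketch (not the exact file from the repository), the dependencies section of such a pom.xml could look like the fragment below. The group and artifact ids are the usual coordinates of the two libraries. The version numbers are only values I assume for illustration, so use whatever versions your search returns.

```xml
<dependencies>
    <!-- Selenium WebDriver (Java bindings) -->
    <dependency>
        <groupId>org.seleniumhq.selenium</groupId>
        <artifactId>selenium-java</artifactId>
        <version>4.1.0</version> <!-- assumed version, pick a current one -->
    </dependency>
    <!-- WebDriverManager by Boni Garcia: resolves the browser driver binary -->
    <dependency>
        <groupId>io.github.bonigarcia</groupId>
        <artifactId>webdrivermanager</artifactId>
        <version>5.0.3</version> <!-- assumed version, pick a current one -->
    </dependency>
</dependencies>
```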
4. Add a class named “SeleniumPractice” under your “src/main/java” directory.
Add a main method and define a static void method named “practice1_accessWebPage()”:
- WebDriver object definition. This webDriver object will perform all our browser related commands. WebDriver is an interface and it belongs to “selenium library” as indicated with (a).
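- WebDriverManager is an abstract class that takes care of driver setup: it resolves the driver binary (chromedriver in our case) that Selenium needs in order to control the browser. It belongs to the “bonigarcia library” as indicated with (b).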
- We are initializing the webDriver object with a new instance of ChromeDriver class. ChromeDriver class belongs to “selenium library” as indicated with (c).
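The three points above correspond to three lines of code. Here is a minimal sketch, assuming both dependencies are on the classpath. The class in the repository may differ in detail, and the static webDriver field is my assumption about how the object is shared between the practice methods.

```java
import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

class SeleniumPractice {

    // (a) WebDriver is the Selenium interface through which all browser commands go.
    static WebDriver webDriver;

    public static void main(String[] args) {
        practice1_accessWebPage();
    }

    static void practice1_accessWebPage() {
        // (b) WebDriverManager resolves a chromedriver binary for the installed Chrome.
        WebDriverManager.chromedriver().setup();
        // (c) Creating a ChromeDriver instance starts a fresh Chrome session.
        webDriver = new ChromeDriver();
    }
}
```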
If you run the project, you will see that just initializing the webDriver object opens a clean Chrome browser (without cookies or any predefined settings):
Let’s do some practice.
Practice1 — Access a web page
With the get(“url”) method we can open the specified web page.
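A minimal sketch of this practice follows. The class name Practice1Sketch and the choice to pass the driver as a parameter are mine, made only to keep the example self-contained. The URL is simply the practice page used later in this article.

```java
import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

class Practice1Sketch {

    // Practice page referenced later in the article; any reachable URL would do.
    static final String PRACTICE_URL = "http://practice.kicchi.net/AutomizationPortal.html";

    static void practice1_accessWebPage(WebDriver webDriver) {
        // get() loads the page in the current browser window.
        webDriver.get(PRACTICE_URL);
    }

    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        WebDriver webDriver = new ChromeDriver();
        practice1_accessWebPage(webDriver);
        // The browser is left open on purpose so the result can be inspected.
    }
}
```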
Run and see the result:
Practice2 — Get web page details
Run the practice2 method by commenting out the other calls in the main method.
1. gets the current url as shown by (a) in the console.
2. gets the title of the web page as shown by (b) in the console.
3. gets the page’s source as shown by (c) in the console.
4. gets the window’s handle value as shown by (d) in the console. A window handle is used to pick a specific tab or window among several.
5. quits the driver, which closes every Chrome window opened by this driver object and ends the session. As shown by (e), the execution finished without a problem.
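The five steps above can be sketched as follows. The class name, the driver parameter and the console labels are my own choices for illustration. I also assume the method first opens the practice page, since the steps need some page to report on.

```java
import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

class Practice2Sketch {

    static final String PRACTICE_URL = "http://practice.kicchi.net/AutomizationPortal.html";

    static void practice2_getPageDetails(WebDriver webDriver) {
        webDriver.get(PRACTICE_URL); // assumption: start from the practice page

        System.out.println("Current url  : " + webDriver.getCurrentUrl());
        System.out.println("Title        : " + webDriver.getTitle());
        System.out.println("Page source  : " + webDriver.getPageSource());
        System.out.println("Window handle: " + webDriver.getWindowHandle());

        // quit() closes every window opened by this driver and ends the session.
        webDriver.quit();
    }

    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        practice2_getPageDetails(new ChromeDriver());
    }
}
```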
Practice3 — Navigation
Run the practice3 method by commenting out the other calls in the main method.
1. navigates to the specified web page.
# We pause the current thread for a while after each command in order to see its effect.
2. navigates back, as we do with a regular internet browser.
3. navigates forward.
4. refreshes the page.
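A sketch of these navigation steps, under a few assumptions of mine: the pause helper and its pauseMillis parameter are hypothetical names, the driver is passed in as a parameter, and the practice page from this article is used as the target. In a fresh session, going back should land on the blank start page, and going forward returns to the practice page.

```java
import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

class Practice3Sketch {

    static final String PRACTICE_URL = "http://practice.kicchi.net/AutomizationPortal.html";

    // Pauses the current thread so the effect of each command can be observed.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        }
    }

    static void practice3_navigation(WebDriver webDriver, long pauseMillis) {
        webDriver.navigate().to(PRACTICE_URL); // open the specified page
        pause(pauseMillis);
        webDriver.navigate().back();           // like the browser's back button
        pause(pauseMillis);
        webDriver.navigate().forward();        // like the browser's forward button
        pause(pauseMillis);
        webDriver.navigate().refresh();        // reload the current page
    }

    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        WebDriver webDriver = new ChromeDriver();
        practice3_navigation(webDriver, 2000);
        pause(2000);
        webDriver.quit();
    }
}
```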
Practice4 — Locating elements, reading values
Q: How do we read a text from a web page?
A: In order to access the controls/elements of a web page, we have to locate them.
This is the page we are practicing: http://practice.kicchi.net/AutomizationPortal.html
This is the overall page view:
Let’s focus on the “Random” section. There are two buttons. To see their html source, right-click the “Verify number” button and select the “Inspect” option from the context menu:
Look at the html source of the “Verify number” button:
This button has the id btnVerifyNumber.
We will use the id attribute to locate this button.
Run the practice4 method by commenting out the other calls in the main method.
- Locating the button with its id. The findElement() method returns the first matching web element. The returned web element is assigned to the “btnVerify” WebElement object.
- getAttribute() method returns the specified attribute’s value. In this case, it is the text of the button as shown by (a).
- getTagName() method gets the tag name as shown by (b).
- getAttribute() method returns the specified attribute’s value. In this case, it is the class of the button as shown by (c).
- isDisplayed() method returns the display status of the element as shown by (d).
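The calls listed above can be sketched like this. Note one assumption that the article text does not confirm: I read the button's text from its value attribute, which is where an input button keeps its caption. The class name and the driver parameter are also my own choices.

```java
import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

class Practice4Sketch {

    static final String PRACTICE_URL = "http://practice.kicchi.net/AutomizationPortal.html";

    static void practice4_readValues(WebDriver webDriver) {
        webDriver.get(PRACTICE_URL);

        // findElement() returns the first element matching the locator.
        WebElement btnVerify = webDriver.findElement(By.id("btnVerifyNumber"));

        // Assumption: the caption of the button lives in its "value" attribute.
        System.out.println("Text     : " + btnVerify.getAttribute("value"));
        System.out.println("Tag name : " + btnVerify.getTagName());
        System.out.println("Class    : " + btnVerify.getAttribute("class"));
        System.out.println("Displayed: " + btnVerify.isDisplayed());
    }

    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        WebDriver webDriver = new ChromeDriver();
        practice4_readValues(webDriver);
        webDriver.quit();
    }
}
```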
Practice5 — Different selectors
Q: How do we locate elements which do not have an id?
A: Selenium supports 8 different selectors. Selectors are used to locate elements.
- with the “id selector” the elements with id attribute can be located,
- “CSS Selector” combines an element selector and a selector value that can identify particular elements on a web page.
- “XPath” stands for XML Path. It’s a query language that helps identify elements from an XML document.
- with the “name selector” the elements with name attribute can be located,
- with “linkText selector”, only the html links are located by their exact texts,
- “tagName selector” locates the elements by their tag names.
- with the “className selector” the elements with the specified class attribute are located,
- with “partialLinkText selector”, only the html links are located by their partial texts.
This is the html source of “Generate random number” button:
Let’s locate this button with different selectors.
We can use the tagName selector if this input is the first input in the page, because findElement() returns the first matching web element:
WebElement buttonByTagName = webDriver.findElement(By.tagName("input"));
We can use css selector:
WebElement buttonByCss = webDriver.findElement(By.cssSelector(".w3-button.w3-blue"));
This css selector finds the button by its class names: w3-button and w3-blue. The “.w3-button.w3-blue” syntax is specific to css selectors. You can find examples about css selector here.
We can use xpath selector:
WebElement buttonByXPath = webDriver.findElement(By.xpath("(//h4)[1]/following-sibling::input[1]"));
This xpath selector finds the first h4 element (as shown in the image above) in the whole page, then finds its first following input sibling. You can find examples about xpath selector here.
We cannot use the name selector, because this button doesn’t have a name attribute.
We cannot use the linkText and partialLinkText selectors, because these two selectors apply only to <a> elements (html links).
We can use className:
WebElement buttonByClassName = webDriver.findElement(By.className("w3-button"));
Run the practice5 method by commenting out the other calls in the main method.
As you can see, we can locate the same button with different selectors.
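To put the four selectors side by side, here is a small sketch that collects them in a map and prints the class attribute of whatever each one finds. On the practice page all four are expected to resolve to the same button. The class name, the generateButtonLocators() helper and the driver parameter are hypothetical names I introduce for the example.

```java
import io.github.bonigarcia.wdm.WebDriverManager;
import java.util.LinkedHashMap;
import java.util.Map;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

class Practice5Sketch {

    static final String PRACTICE_URL = "http://practice.kicchi.net/AutomizationPortal.html";

    // The four locators discussed above, keyed by selector name (insertion order is kept).
    static Map<String, By> generateButtonLocators() {
        Map<String, By> locators = new LinkedHashMap<>();
        locators.put("tagName", By.tagName("input"));
        locators.put("cssSelector", By.cssSelector(".w3-button.w3-blue"));
        locators.put("xpath", By.xpath("(//h4)[1]/following-sibling::input[1]"));
        locators.put("className", By.className("w3-button"));
        return locators;
    }

    static void practice5_differentSelectors(WebDriver webDriver) {
        webDriver.get(PRACTICE_URL);
        for (Map.Entry<String, By> entry : generateButtonLocators().entrySet()) {
            WebElement button = webDriver.findElement(entry.getValue());
            // Printing the class attribute lets us compare what each selector found.
            System.out.println(entry.getKey() + " -> class: " + button.getAttribute("class"));
        }
    }

    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        WebDriver webDriver = new ChromeDriver();
        practice5_differentSelectors(webDriver);
        webDriver.quit();
    }
}
```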
Practice6 — Interacting with elements
Q: So far we can successfully access web elements, but we haven’t interacted with them. For example, how do we write in a text field or click a button?
A: Thanks to Selenium WebDriver, interacting with elements is as easy as just calling a method from our WebElement object.
Since we are simulating a human interaction with a web page, let’s create a scenario.
In our practice page there is a Random section:
This section generates a random number and shows it; we then enter that exact number in the text field and verify it. The steps are:
- Click “Generate random number” button,
- Read the generated number from the web element just below this button,
- Write this exact number in the text web element just above the “Verify number” button,
- Click the “Verify number” button,
- A green check image will be displayed after verifying the number.
A — locating each element that we will use
The “Generate random number” button is located with its tag name: input. If there were an earlier element with the same tag name, we wouldn’t get this button. Since we know it is the first input in the html source, we can use the tagName selector.
The p element which holds the generated number is located with its tag name: p. Again, it is the first p element in the html source.
The text field in which we will enter the generated number is located with its name; verifyText.
The verification button is located with its id; btnVerifyNumber.
B — performing the steps
# Before each step, the Thread.sleep() method is used to wait a little so that the changes on the UI can be noticed.
- The “Generate random number” button is clicked by the click() method.
- The text of the “Generated number label” is read by the getText() method.
- The generated number is written to the “text element” by the sendKeys() method.
- The “Verify number” button is clicked by the click() method.
At the end, the check icon becomes visible if the entered number matches the generated number; otherwise a failure icon becomes visible.
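Parts A and B together can be sketched as below. This is a sketch under stated assumptions rather than the repository's code: the class name, the pause helper, the pauseMillis parameter and the returned value are my own additions, and I assume the p element contains nothing but the generated number.

```java
import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

class Practice6Sketch {

    static final String PRACTICE_URL = "http://practice.kicchi.net/AutomizationPortal.html";

    // Waits a little so the change on the UI can be noticed.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        }
    }

    // Returns the number that was generated and typed in, for convenience.
    static String practice6_interact(WebDriver webDriver, long pauseMillis) {
        webDriver.get(PRACTICE_URL);

        // A: locate each element we will use.
        WebElement btnGenerate = webDriver.findElement(By.tagName("input")); // first input on the page
        WebElement lblNumber = webDriver.findElement(By.tagName("p"));       // first p on the page
        WebElement txtVerify = webDriver.findElement(By.name("verifyText"));
        WebElement btnVerify = webDriver.findElement(By.id("btnVerifyNumber"));

        // B: perform the steps, pausing before each one.
        pause(pauseMillis);
        btnGenerate.click();
        pause(pauseMillis);
        String generatedNumber = lblNumber.getText(); // assumption: only the number is shown
        pause(pauseMillis);
        txtVerify.sendKeys(generatedNumber);
        pause(pauseMillis);
        btnVerify.click();
        return generatedNumber;
    }

    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        WebDriver webDriver = new ChromeDriver();
        System.out.println("Generated number: " + practice6_interact(webDriver, 2000));
        pause(2000);
        webDriver.quit();
    }
}
```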
Q: So, basically we navigate to a web page, access its controls, read their values, and finally interact with them like a regular user. But where is the test part? Did I miss it?
A: No. We have not said a word about testing yet. The steps we performed in each practice are the user interface automation part, and they will be the core of our UI tests.
We will build our “UI Test with Selenium and TestNG” session upon this session. Wait for the next article, buddy. 👍