Sikuli IDE: A Basic Walkthrough

This is a brief Windows-oriented tutorial that walks through a Sikuli IDE test script (there are tutorials provided on the Project Sikuli site, too).  This script demonstrates how to create and manipulate the Sikuli App object and perform some basic UI actions using the Notepad application.

Here's the starting point for your script:

NPPath = 'C:\\Windows\\notepad.exe'
#Open Notepad using the App object
NPApp = App.open(NPPath)
#Wait until the menu bar is visible, then display message
#that the window is open.
wait("Your Notepad Menu Bar",5)
popup("Window is open!")
menuBar = getLastMatch()  # Takes advantage of wait function side effect
#Highlight the menu bar region for 3 seconds then click on text
menuBar.highlight(3)
menuBar.click('Help')
#Define a new region using 'Help' text as starting reference
dropDownRegion = Region(menuBar.getLastMatch().getX(),menuBar.getLastMatch().getY(),300,100)
dropDownRegion.click('About')
wait(3)
click("Your OK Button")
type("This is a test message")
wait(3)
NPApp.close()

Open Sikuli IDE and paste this into the IDE window.  You'll be unable to run it as is, though-- you need to supply at least two graphic elements: the Notepad menu bar and the Notepad About dialog OK button.  Highlight the text "Your Notepad Menu Bar" (including the quotes) in line 6 and, with Sikuli IDE still open, launch Notepad.  Press CTRL + SHIFT + 2-- you should see your screen "fade" slightly as Sikuli IDE enters its capture mode.  Click and drag the cross-hair across the menu bar items-- it should look something like this:


Capturing the menu bar

When you release the mouse button, the screenshot of the menu bar should automatically get inserted into the script:


Script with menu bar screenshot pasted

In Notepad, click Help in the menu bar, then "About Notepad" and repeat the steps above, this time replacing the "Your OK Button" text near the bottom of the script with a screenshot of the OK button in the About Notepad dialog.


OK button pasted in

Now try running the script (note that when the "Window is open!" message is displayed, the script is paused until you click the OK button).  Also make sure the path to Notepad is valid for your particular version of Windows (the script was written on a PC running Windows 7).

Now let's step through the lines of the script.  Along the way I'll try to highlight some "gotchas" that might be the source of any problems you're experiencing.

In the first few lines of the script we're opening Notepad by creating a Sikuli App object-- the App.open() function does this using the path to your target application.
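
If the hard-coded path doesn't match your machine, one option is to build it from the SystemRoot environment variable.  Here's a minimal sketch of that idea.  The helper name notepad_path is my own invention (it's not part of Sikuli), and it assumes Notepad sits directly in the Windows directory, as it does in the script above.

```python
import ntpath  # Windows path rules, regardless of where this runs
import os


def notepad_path(env=None):
    """Build the path to notepad.exe from the SystemRoot variable.

    notepad_path is a hypothetical helper, not a Sikuli function.
    Falls back to C:\\Windows when SystemRoot is not set.
    """
    if env is None:
        env = os.environ
    root = env.get('SystemRoot', 'C:\\Windows')
    return ntpath.join(root, 'notepad.exe')


# In the script, this would replace the hard-coded NPPath:
# NPApp = App.open(notepad_path())
```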

Sikuli's wait function (line 6) scans the screen for the indicated (text or graphic) element; the second parameter is how long, in seconds, Sikuli should wait for the element to appear before failing (you can use the constant FOREVER to make Sikuli wait indefinitely).  Here we're giving Notepad five seconds to launch.  If Notepad takes longer than that on a particularly slow machine, try increasing this timeout.  There's a corresponding function called waitVanish(), which behaves similarly but waits for a given element to disappear from the screen.  Other similar functions: find(), which (no surprise here) simply finds the given element onscreen without acting on it; click(), which we'll see below and which finds and clicks on the specified element; and exists(), which checks for the presence of an element onscreen and returns None instead of failing when the element isn't there.
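
If you'd rather not guess a single timeout, you could try a short one first and fall back to a longer one.  The sketch below does that with exists(), which returns None on a miss.  The helper name wait_with_retries and the (5, 15) timeouts are assumptions of mine; the region is passed in as a parameter so the function can also be tried outside Sikuli IDE.

```python
def wait_with_retries(region, target, timeouts=(5, 15)):
    """Look for target using each timeout in turn.

    Returns the match from exists(), or None if every attempt fails.
    wait_with_retries is a hypothetical helper; region can be SCREEN
    or any other Sikuli Region.
    """
    for seconds in timeouts:
        match = region.exists(target, seconds)
        if match is not None:
            return match
    return None


# In the script:
# menuBar = wait_with_retries(SCREEN, "Your Notepad Menu Bar")
```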

The popup() function (line 7) simply displays a message box with the indicated text-- as I mentioned previously, Sikuli will pause after the popup is displayed and wait for the user to click the OK button before continuing.  This can be useful for troubleshooting scripts or pausing the script to allow for a manual task.

In line 8, the menuBar region is retrieved using the getLastMatch() function.  This illustrates a useful side effect of many of Sikuli's basic functions, including wait(): when the function succeeds (e.g., the element that wait() is waiting for is detected), the location of the element is stored as a match that can be retrieved later via getLastMatch().  So in our script, once the wait() call in line 6 succeeds, there's no need to re-find or otherwise identify the menu bar pattern (which can be a costly operation in terms of processing resources).
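
It's also worth knowing that wait() returns the match directly, so you can capture it on the spot instead of asking for it afterwards.  A small sketch, with find_menu_bar as an illustrative name of my own and the region passed in as a parameter:

```python
def find_menu_bar(region, target, timeout=5):
    """Return the match that wait() found.

    find_menu_bar is a hypothetical helper. wait() hands back the
    match it found, and the region keeps the same match for a later
    getLastMatch() call.
    """
    match = region.wait(target, timeout)
    return match


# In the script:
# menuBar = find_menu_bar(SCREEN, "Your Notepad Menu Bar")
# popup("Window is open!")
```

Capturing the return value is a little more robust, since getLastMatch() only remembers the most recent successful search and any later search on the same region overwrites it.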

In line 10, the highlight() function is called for the menuBar region we just defined, which simply highlights the region for the given amount of time (in seconds).  This can be useful for troubleshooting scripts in instances where you're dynamically defining a region that's supposed to contain a specific element-- you can use the highlight to see what's actually included in the region.  A word of warning about the highlight() function, though:  I've seen cases where it may result in a redraw that "hides" some child windows (like drop-downs, for example).
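
Given that side effect, it can be handy to switch highlighting off once a script is working.  Here's one way that might look; the DEBUG flag and the debug_highlight helper are names I made up for this sketch.

```python
DEBUG = True  # hypothetical flag; set to False once the script is stable


def debug_highlight(region, seconds=3, enabled=None):
    """Highlight region only while debugging is switched on.

    Returns the region so the call can sit inline in the script.
    """
    if enabled is None:
        enabled = DEBUG
    if enabled:
        region.highlight(seconds)
    return region


# In the script, in place of menuBar.highlight(3):
# debug_highlight(menuBar)
```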

In the next line, text recognition is used to identify the Help button in the menu bar and the identified area is clicked.  As I mentioned in my previous post, Sikuli IDE's text recognition can be a little finicky sometimes.  If you seem to be having problems with this particular action in the script, try using a graphic element (an actual screenshot of the Help button, captured as described above) instead of text.  Text recognition seems to work best for cases like this: a short string in a simple, well-contrasted setting (e.g., black text on white).
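
If you want the script to cope with both cases, you could try the text first and fall back to a screenshot.  This is only a sketch: click_text_or_image is a made-up helper, the 2-second timeout is arbitrary, and "Your Help Button" stands for a screenshot you'd capture yourself.

```python
def click_text_or_image(region, text, image, timeout=2):
    """Click text if it is found, otherwise click image.

    Hypothetical helper: exists() returns None on a miss, so the
    fallback needs no exception handling.
    Returns the name of the approach that was used.
    """
    match = region.exists(text, timeout)
    if match is not None:
        region.click(match)  # click the match that was just found
        return 'text'
    region.click(image)
    return 'image'


# In the script, in place of menuBar.click('Help'):
# click_text_or_image(menuBar, 'Help', "Your Help Button")
```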

One more important thing to point out in line 11 is the use of click() as a method of the menuBar region.  When functions are called as methods of a region, Sikuli scans only that region, not the entire screen.  In fact, when we called wait() in line 6 without specifying a region, it was implicitly called as a method of the default SCREEN region, a region encompassing the entire screen.  Generally it's good practice to call these functions on regions, since scanning a subsection of the screen is less costly than scanning all of it.

Line 13 defines a new region dynamically.  A Sikuli Region object's area is defined by four parameters: x and y coordinates that define its upper left corner, plus a width and a height.  In this line of the script we're using the getLastMatch() function again, this time for the menuBar region.  The click() function in line 11, also called for the menuBar region, recorded the location of the "Help" text as a successful match; getLastMatch() returns it.  Using the match as our starting point, we define a new region by getting the X and Y coordinates of the "Help" text's region (via getX() and getY()), then making our new region 300 pixels wide by 100 pixels tall.  This line could cause problems depending on your computer's resolution.  If you find that the next line is failing (even after ruling out a text recognition failure), try making the region larger in this step; it could be that the 300 by 100 pixel region is too small to encompass the About Notepad menu item.
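
To make the size easier to experiment with, the arithmetic can be pulled out into a small function.  The sketch below is my own: drop_down_bounds is a hypothetical name, the 1920 by 1080 screen size is an assumption, and trimming the region at the screen edge is a precaution I added rather than something Sikuli requires.

```python
def drop_down_bounds(x, y, width=300, height=100,
                     screen_w=1920, screen_h=1080):
    """Return (x, y, w, h) for a region anchored at (x, y).

    The width and height are trimmed so the region never extends
    past the right or bottom screen edge. The 1920x1080 default
    is an assumption; pass your own resolution if it differs.
    """
    w = min(width, screen_w - x)
    h = min(height, screen_h - y)
    return (x, y, w, h)


# In the script, trying a larger 400 by 200 region:
# m = menuBar.getLastMatch()
# x, y, w, h = drop_down_bounds(m.getX(), m.getY(), 400, 200)
# dropDownRegion = Region(x, y, w, h)
```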

The last few lines are self-explanatory or repeat functions we've already seen: Sikuli finds and clicks on the text "About" in the newly defined dropDownRegion, then waits for 3 seconds while the About dialog is displayed.  After those 3 seconds, the OK button is identified (via a screenshot this time, not text) and clicked.  The text "This is a test message" is typed into Notepad's main window and finally, after another 3-second wait, the app is closed using the App object created in line 3.
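
One thing the script doesn't handle is a failure partway through: if a step fails, the final close() is never reached and Notepad stays open.  A possible way around that, sketched here with a made-up helper called run_and_close, is to wrap the steps in try/finally.

```python
def run_and_close(app, steps):
    """Run steps(), then close app even if a step raised an error.

    run_and_close is a hypothetical helper; app is the object
    returned by App.open() and steps is any function of no arguments.
    """
    try:
        steps()
    finally:
        app.close()


# In the script, with the UI actions moved into a function:
# def notepad_steps():
#     menuBar.click('Help')
#     ...
# run_and_close(NPApp, notepad_steps)
```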

Of course, this is a very simple script that shows only the tip of the iceberg of Sikuli's power.  If you're interested in a more detailed look, I recommend checking out Sikuli's documentation, which I think is pretty well written.