Sikuli IDE: An Overview


Switching gears quite a bit from my earlier posts, I'm going to give a quick overview of Sikuli IDE, a free UI automation tool.

On the face of it, UI automation testing seems like a pretty straightforward concept: mimic the actions that a user would perform while interacting with an application and confirm the application responds to those actions appropriately.  When you roll up your sleeves and start trying to do it, though, you quickly realize that it's often much easier said than done, and there are lots of pitfalls and hurdles to overcome.  If you try to access UI controls programmatically (as objects), you may come up against custom controls that your tool of choice can't recognize, controls that aren't uniquely named, or seemingly infinite levels of nesting that make identifying and isolating individual controls difficult.  Tools that rely on raw UI manipulation (e.g., move the cursor to coordinates x,y and click, tab to button 1, press the down arrow twice, etc.) can result in scripts that are obscure and/or difficult to maintain.

Sikuli IDE (along with similar tools such as Citratest and DeviceAnywhere) uses a technique that tries to smooth over some of the issues encountered with these other strategies.  Basically, you capture GUI elements (buttons, links, etc.) as small image files; during script execution, Sikuli uses these captures to identify those elements onscreen and interact with them.  The strategy maintains some of the modularity of an object-oriented approach without any dependencies on the application's underlying architecture.
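To make the capture-and-find idea a little more concrete, here's a minimal sketch in plain Python. It is not Sikuli's actual implementation (Sikuli compares real screen pixels); it just searches a tiny text "screen" for a captured "button" and works out where a click would land. The names find_template, click_target, SCREEN, and OK_BUTTON are all made up for this illustration.

```python
# Illustration only: an exact-match search over a tiny character grid.
# A real image-based tool compares pixels, not characters.

def find_template(screen, template):
    """Return (row, col) of the first top-left position where the
    template appears in the screen, or None if it is not found."""
    s_rows, s_cols = len(screen), len(screen[0])
    t_rows, t_cols = len(template), len(template[0])
    for r in range(s_rows - t_rows + 1):
        for c in range(s_cols - t_cols + 1):
            if all(screen[r + i][c:c + t_cols] == template[i]
                   for i in range(t_rows)):
                return (r, c)
    return None


def click_target(screen, template):
    """Return the (row, col) at the center of the matched region,
    i.e. where a click would be aimed, or None if there is no match."""
    hit = find_template(screen, template)
    if hit is None:
        return None
    r, c = hit
    return (r + len(template) // 2, c + len(template[0]) // 2)


# Hypothetical "screenshot" and "captured button".
SCREEN = [
    "..........",
    "..[OK]....",
    "..........",
]
OK_BUTTON = ["[OK]"]
```

Here find_template(SCREEN, OK_BUTTON) returns (1, 2). The nice part is that if the button moves somewhere else on the screen, the same capture still finds it, which is exactly what a hard-coded "click at x,y" script can't do.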

The documentation provided on the Project Sikuli site is decent, organized in a logical, straightforward manner by key objects.  As its name implies, Sikuli IDE comes with its own editor, where graphical "objects" are represented as images, which makes working with scripts very intuitive.  Scripts are written in Python (Jython).


Sikuli IDE

Of course, as you might expect from a tool that keys off graphics, Sikuli is less than ideal in situations where the UI is still unstable, although the easy-to-use capture interface (basically, pressing CTRL + SHIFT + 2, then selecting the region you want to capture) makes it relatively easy to adjust to UI changes.  While Sikuli does claim to have some text recognition functionality, it seems to be in the very early stages and is very unreliable at this point.


Sikuli's text recognition (sort of)
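Going back to the point about unstable UIs: image-matching tools generally accept a match that is "close enough" rather than identical, which softens the impact of small visual changes. The sketch below is my own toy illustration of that idea on a character grid, not Sikuli's algorithm; similarity, find_similar, OLD_CAPTURE, NEW_SCREEN, and the 0.7 threshold are all assumptions made up for the example.

```python
# Illustration only: tolerant matching with a minimum similarity score.

def similarity(region, template):
    """Fraction of cells that are identical in two equally sized grids."""
    total = sum(len(row) for row in template)
    same = sum(
        1
        for reg_row, tpl_row in zip(region, template)
        for a, b in zip(reg_row, tpl_row)
        if a == b
    )
    return same / float(total)


def find_similar(screen, template, min_score=0.7):
    """Return (row, col, score) for the best-scoring position whose score
    is at least min_score, or None if nothing is close enough."""
    t_rows, t_cols = len(template), len(template[0])
    best = None
    for r in range(len(screen) - t_rows + 1):
        for c in range(len(screen[0]) - t_cols + 1):
            region = [row[c:c + t_cols] for row in screen[r:r + t_rows]]
            score = similarity(region, template)
            if score >= min_score and (best is None or score > best[2]):
                best = (r, c, score)
    return best


# Hypothetical capture taken before a small UI change, and the screen after it.
OLD_CAPTURE = ["[Save]"]
NEW_SCREEN = ["....[Sav3]...."]
```

With min_score=0.7 the old capture still locates the slightly changed button at row 0, column 4; with min_score=1.0 (exact match only) it finds nothing. Once a change is big enough to fall under the threshold, though, you're back to recapturing the element.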
