expecting expect
Expect’s a wonderful scripting language. Oh, it has its oddities, but I’ve been using it for over ten years now, and it’s proven useful in a variety of ways.
What is it? It’s a Unix-based, fully featured scripting language built on Tcl that can take over the user’s side of a program’s input and output and decide what to do based on the results it gets. In other words, it takes programs that are meant to be run interactively by humans and puts them under automated control.
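To make the "wait for a prompt, then answer it" idea concrete, here is a rough sketch of that loop. It is written in Python with only the standard library rather than in Expect itself, and it talks to the child over plain pipes instead of the pseudo-terminal that Expect uses, so it only works for programs that don't insist on a real terminal. The child program, the prompt text, and the names `read_until` and `greet` are all made up for illustration.

```python
import subprocess
import sys

# Hypothetical interactive program: prompts for a name, then greets.
CHILD = """
import sys
sys.stdout.write("Name: ")
sys.stdout.flush()
name = sys.stdin.readline().strip()
print("Hello, " + name)
"""


def read_until(stream, pattern):
    """Read one character at a time until the output ends with pattern."""
    seen = ""
    while not seen.endswith(pattern):
        ch = stream.read(1)
        if ch == "":  # child closed its output before the pattern appeared
            raise EOFError(f"never saw {pattern!r}; got {seen!r}")
        seen += ch
    return seen


def greet(name):
    """Drive the child: wait for its prompt, answer it, return its reply."""
    proc = subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    read_until(proc.stdout, "Name: ")  # the "expect" half
    proc.stdin.write(name + "\n")      # the "send" half
    proc.stdin.flush()
    reply = proc.stdout.readline().strip()
    proc.stdin.close()
    proc.wait()
    return reply
```

Calling `greet("world")` starts the child, waits for the prompt, types the answer, and hands back the greeting line. An Expect script does the same dance, with pattern matching and timeouts layered on top.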
Its uses aren’t unlimited: it’s definitely a command-line scripting method, so you’re not going to be controlling GUI windows with this. In checking around, I found there are actually a few Windows ports, but none of them really control windowed applications (one of them merely runs on top of Cygwin!).
Where it shines is in automating tasks (especially of the system administration kind) and in testing (provided, of course, that no GUI is involved). I used DejaGnu, a testing framework built on Expect, for years when I tested network devices (routers, bridges, print servers, webcams, etc.).
I can put together a script that checks for a new version of software, ftps it over, unpacks it, compiles it and then notifies me. I can then switch the current symlink to the latest version (or not) as I choose, since all the grunt work is already done. I can do the same with any kind of repetitive task on Unix.
I currently use it to automate the process of updating our services at work. Basically, we offer a searchable database of texts, and we periodically upload new texts. The process involves quite a bit of preprocessing: we go through the texts beforehand, find the words, and build several databases based on those words to help speed up queries. We offer both a straight text search (slow, but it can find substrings, etc.) and a much faster word-based search. The latter is what the preprocessing makes possible.
So first, we take the texts and extract all the words. From this we can create indices based on unique instances of the words (678 occurrences of the word DOG in the database, say), or Btrees containing the complete location (filename and byte location) of every individual word. There are also databases recording the citations, and others that help us create outlines for the material.
You get the idea: lots of individual texts, many separate little programs to extract useful information, and all of it eventually welded together into a format usable by our website. Some of these tools are written in C, others in Perl, and so on. We invoke the database command line to load things up. Files get copied over. Things get checked out, updated, and committed in CVS.
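For a feel of what those word indices hold, here is a toy sketch in Python. It is not our actual tooling (that is the C and Perl mentioned above), and it uses an in-memory dictionary where the real thing uses Btrees on disk. The function name `index_texts`, the letters-only definition of a "word", and the sample filenames are assumptions made for the example.

```python
import re
from collections import Counter, defaultdict


def index_texts(texts):
    """Build a word-count index and a word-location index.

    texts maps a filename to that file's raw bytes. A "word" here is
    just a run of ASCII letters, which is an assumption of this sketch.
    """
    counts = Counter()             # word -> number of occurrences
    locations = defaultdict(list)  # word -> [(filename, byte offset), ...]
    for filename, data in texts.items():
        for match in re.finditer(rb"[A-Za-z]+", data):
            word = match.group().decode("ascii").upper()
            counts[word] += 1
            locations[word].append((filename, match.start()))
    return counts, dict(locations)


# Hypothetical sample data standing in for the real text files.
sample = {
    "a.txt": b"The dog saw a dog.",
    "b.txt": b"Dog days",
}
counts, locations = index_texts(sample)
print(counts["DOG"])     # 3
print(locations["DOG"])  # [('a.txt', 4), ('a.txt', 14), ('b.txt', 0)]
```

The count index answers "how often does DOG appear", while the location index lets a word search jump straight to the right file and byte offset instead of scanning every text.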
Expect comes to the rescue here. It checks that we have the new set of texts and starts up each of the utility scripts in the correct order, checking that each completed without problems before advancing to the next. It also gives the user a number of options on how to do this: where the text files are located, whether we are on production or development machines, and whether to run only part of the process (the preparatory work versus the actual loading of the data once preprocessing is complete). It handles safety stuff, such as dumping database schemas and committing them to CVS before dropping databases and reconstructing them, or copying/tarballing files off before replacing them. All of the tools are under CVS control, and the script knows enough to check for any updates and recompile as needed before using the tools. If you completely forget to set the tools up, it knows how to check the entire thing out of CVS from scratch. Depending on which machine we run this on, it can take up to 24 hours to complete.
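The core of that "run each tool in order, and only move on if the last one worked" logic is small enough to sketch. What follows is Python rather than the Expect script I actually use, and the step names and commands are placeholders, not our real tools. The function name `run_steps` is likewise invented for the example.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, argv) pairs in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    done = []
    for name, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            print(
                f"step {name!r} failed with exit code {result.returncode}",
                file=sys.stderr,
            )
            break  # later steps depend on this one, so don't continue
        done.append(name)
    return done


if __name__ == "__main__":
    # Placeholder commands standing in for the real utilities.
    steps = [
        ("extract-words", [sys.executable, "-c", "pass"]),
        ("build-index", [sys.executable, "-c", "pass"]),
        ("load-database", [sys.executable, "-c", "pass"]),
    ]
    print(run_steps(steps))
```

Stopping at the first non-zero exit code is the important part: on a run that can take many hours, you don't want the load step chewing on the output of a step that quietly failed.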
I can start it up and walk away from it. It’s wonderful. My coworker still gets to do his stuff by hand cos he hasn’t taken the trouble to put together a script for his stuff…
Some useful links for more info:
http://en.wikipedia.org/wiki/Expect
http://expect.nist.gov/
Also, dejagnu: http://www.gnu.org/software/dejagnu/


