Supercharge your AMLs with Embedded Perl Code
Published in ArcUser Magazine, Fall 1999
AML is the workhorse of ArcInfo. It can be used to simplify and automate a wide variety of tasks, or to build user interfaces that give users access to the full range of ArcInfo's functionality without mastering its complete syntax. AML is a flexible language, but it can also be slow, especially when looping repeatedly over blocks of code. Fortunately, AML's flexibility gives programmers ways to work around these limitations. One technique is to combine AML with external programs written to perform specific tasks where AML would be slow. The drawbacks of using external programs are that they are often platform-specific and can raise file management issues. Instead of a single program file to document, back up, and distribute, you now have two: the AML and the external program. However, by using Perl, it is possible to embed the external program inside the AML itself, thus overcoming these drawbacks.
In this article, I will focus on how Perl can be used to dramatically enhance AML's performance when iterating through a dataset with cursors, where some of the greatest performance gains can be realized. However, the basic techniques for interfacing Perl with AML can be used to solve a variety of other problems, and I encourage you to experiment.
Perl is a powerful but easy-to-learn text and data manipulation language that was originally developed for UNIX, but is now available for a variety of platforms. Perl's file and text manipulation facilities make it particularly well-suited for manipulating ArcInfo outputs, and, as a compiled language, its speed and performance far outpace those of AML. It is (largely) platform independent, and it is freely available.
Though this article will be easier to follow for those who are already familiar with Perl, it is possible to use these techniques with only a rudimentary knowledge of the language. Perl is particularly easy to learn if you have experience programming with C, but even with other programming experience the following examples should make sense.
If you run ArcInfo on a UNIX system, then you may already have Perl. Type "perl" at your system prompt to find out. If you are running ArcInfo on a Windows NT machine, you will probably need to install Perl yourself. Luckily, Perl is simple to install under both UNIX and NT, and is available for download free of charge (see the Perl Resource page). The code used in this article will run equally well on both platforms.
Once you have Perl, you can try the following example. Either copy and paste the program code into your text editor or download the AML directly. Note that this AML file contains both AML and Perl code.
So how does this work?
The example works by first creating a text file that contains the lengths of all the arcs in the coverage, using the ArcPlot command listoutput to redirect screen output to a file:
    ...
    apc listoutput %tmpfile1%
    list length
    apc listoutput screen
    ...
Alternatively, you could create a similar list file using the Tables unload command, or, if you were interested in coordinate information, with Arc's ungenerate command.
Second, a simple Perl program is run which reads the text file, and extracts and sums the length field from the listing. It then writes the result to a new text file. The names that the Perl program uses for the input and output files are generated with the [scratchname] function to guarantee uniqueness, and are passed to the Perl program as arguments:
    ...
    &s tmpfile1 [scratchname -prefix xxtmp1 -file]
    &s tmpfile2 [scratchname -prefix xxtmp2 -file]
    ...
    &sys perl -x %aml$fullfile% %tmpfile1% %tmpfile2%
    ...
Finally, the main AML program runs the one-line AML created by the Perl program, which looks like:
&s .totlength nnnn
This sets the global variable .totlength to the total calculated by the Perl program. We need a global variable because a local variable's value would be lost when the main program resumed execution after our one-liner terminated. (Recall that global variables always start with a ".") The main AML can now carry on and use this new variable in the same way it would use any other.
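The Perl half of the steps above can be sketched roughly as follows. This is a minimal illustration, not the original listing: it assumes each record line in the list file holds a record number followed by a plain decimal length, the subroutine name sum_lengths is hypothetical, and an in-memory sample stands in for the file named by %tmpfile1%.

```perl
use strict;
use warnings;

# Hypothetical helper: add up the length column of a listing.
# Assumed layout: "<record number> <length>" on each data line;
# header lines and anything else that does not match are skipped.
sub sum_lengths {
    my ($fh) = @_;
    my $total = 0;
    while (my $line = <$fh>) {
        next unless $line =~ /^\s*\d+\s+(-?\d+(?:\.\d+)?)\s*$/;
        $total += $1;
    }
    return $total;
}

# Sample listing used in place of the real list file.
my $listing = "Record  LENGTH\n     1  10.5\n     2  20.25\n     3  5\n";
open my $in, '<', \$listing or die "cannot open sample listing: $!";
my $total = sum_lengths($in);
close $in;

# The one-line AML that would be written to the output file.
my $aml_line = "&s .totlength $total";
print "$aml_line\n";    # prints: &s .totlength 35.75
```

In a real run, the two filehandles would be opened on the file names passed in as arguments, and the pattern would need adjusting to match the actual listing format.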
How can we store two programs in one file?
You have no doubt noticed that the above program contains both Perl and AML code. This is very convenient from a packaging point of view: a single file is much easier to work with, document, back up, and share with colleagues than two separate ones. However, AML and Perl are not directly compatible; that is, if ArcInfo encounters a Perl command, it will return an error, and vice versa. Luckily, due to the natures of the two languages, it is possible to keep them isolated despite being stored in the same file.
AML is an interpreted language. This means that the AML processor executes a program line-by-line, and does not look at any lines the thread of execution does not encounter. Therefore, if we keep the AML interpreter away from the Perl code by using a &return or &stop statement, then it will never know that there is alien code embedded elsewhere in the file.
Perl, on the other hand, is a compiled language, which ordinarily means that the entire file is checked for syntax errors before being compiled, and is then stored in its compiled binary state before being run by the user. Both of these factors would seem to work against us.
Perl, however, has several interesting features which allow us to overcome these apparent problems. First, Perl is compiled at run time, which means that unlike other compiled languages such as C or Pascal, the user handles only the source code. Perl code, like AML, is simply text. Second, the Perl compiler has the ability to extract its source code from another file by using the -x option. This feature was designed to make it easy to run Perl programs sent by email, and directs the Perl compiler to ignore anything before the first instance of #!perl. This allows us to put our AML code at the top of the file, where the AML interpreter will find it, with Perl code at the bottom, where the Perl compiler will extract it.
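The effect of the -x option can be tried without ArcInfo at all. The following sketch writes a small two-language file to a temporary location, with a few stand-in AML lines at the top and a one-line Perl program after #!perl, and then runs it with perl -x. The file contents and the argument names in.txt and out.txt are made up for the demonstration; only the -x behavior itself is what the article relies on.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build a mixed file: stand-in AML first, then Perl after "#!perl".
my ($fh, $mixed) = tempfile(SUFFIX => '.aml', UNLINK => 1);
print $fh <<'END_MIXED';
/* Stand-in AML section; Perl never parses these lines
&type this line would be run by the AML processor
&return
#!perl
print "args: @ARGV\n";
__END__
END_MIXED
close $fh;

# $^X is the path of the perl binary currently running.
# -x makes it skip everything before the "#!perl" line.
my $output = `"$^X" -x "$mixed" in.txt out.txt`;
print $output;    # prints: args: in.txt out.txt
```

The __END__ marker is only needed when more non-Perl text follows the Perl section; it is included here to show where the Perl program stops.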
The magic happens when the AML interpreter encounters the following line:
&sys perl -x %aml$fullfile% %tmpfile1% %tmpfile2%
The &sys perl bit tells ArcInfo to fire up a system shell and run the Perl compiler, passing the remainder of the line as arguments. Perl compiles and runs the program in the file named by the special AML variable %aml$fullfile%, in which ArcInfo stores the name of the currently executing AML (thus relieving us of the need to code the name and physical location of the AML into itself). The -x option tells Perl to ignore everything until it encounters an instance of #!perl, and the names of the two temporary files set up earlier in the AML are passed as program parameters. These are picked up and converted to regular Perl variables at the start of the Perl program.
The special Perl array @ARGV contains any arguments passed to the program. In our example there are only two: the input file with the lengths of all the arcs, and a yet-to-be-created output file that exists in name only. These two names are stored in the Perl variables $in and $out.
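The argument handling described above might look something like the sketch below. The variable names $in and $out come from the text; the subroutine get_file_args, the usage message, and the demonstration file names are hypothetical and only serve to make the example runnable on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: expect exactly two file names and return them.
sub get_file_args {
    my @args = @_;
    die "usage: perl -x <aml file> <input file> <output file>\n"
        unless @args == 2;
    return @args;
}

# In the embedded program these would come from @ARGV;
# the names below are stand-ins for %tmpfile1% and %tmpfile2%.
my @demo_args = ('xxtmp1', 'xxtmp2');
my ($in, $out) = get_file_args(@demo_args);
print "input: $in, output: $out\n";    # prints: input: xxtmp1, output: xxtmp2
```

Checking the argument count up front means a mistyped &sys line fails with a clear message instead of silently reading or writing the wrong file.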
Just how much faster is this?
Compared to using cursors, the above approach is much faster. Performance gains from using Perl were significant even on modestly sized coverages, often improving by a factor of 10 or more. On larger coverages, the time savings were even greater. On the largest coverage I tested, the Perl method clocked in at over 160 times faster than the AML cursor method!
The following table summarizes results of tests I conducted on coverages of various sizes on a 233MHz Pentium II machine running NT and ArcInfo 7.1. (You can try this yourself by downloading the cursor-method version of the calclength program.)
Real life applications
I have used embedded Perl programs to improve the performance of my AMLs in a number of instances. Most recently, I developed a module of my Alignment Sheet generation program that draws an elevation profile of an arbitrary section of pipeline (or other linear object). Each arc (or part of one) on the plot must be visited in sequential order, its starting and ending points on the page calculated, and a short segment of line drawn between them. I first coded this in pure AML, and the chart, requiring hundreds of selections and thousands of calculations, slowly inched its way across my screen. I then recoded the guts of the program in Perl. The program didn't simply calculate a value and return it to the calling AML as in the example above, but instead produced an entire chart-drawing AML, hardcoded for multiple sections of pipeline. Consisting largely of ArcPlot drawing commands, with all page coordinates pre-calculated, the chart practically blinked onto my screen.
Another way that Perl can expand the utility of AML is to reformat an INFO file listing into a comma-separated values (CSV) file suitable for import into other programs, such as Excel. I use this application so often that I have created a generalized program called List2CSV, which is part of my collection of GIS Tools.
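As a rough illustration of the reformatting idea (and not the List2CSV program itself), the sketch below turns whitespace-separated listing lines into CSV lines. It assumes that no field contains embedded spaces; the subroutine names and the sample listing are hypothetical.

```perl
use strict;
use warnings;

# Quote a field only when it contains a comma, quote, or line break.
sub csv_field {
    my ($value) = @_;
    if ($value =~ /[",\r\n]/) {
        $value =~ s/"/""/g;
        return qq{"$value"};
    }
    return $value;
}

# Split one listing line on whitespace and join the fields with commas.
sub listing_line_to_csv {
    my ($line) = @_;
    my @fields = split ' ', $line;
    return join ',', map { csv_field($_) } @fields;
}

# Made-up sample listing; real INFO listings may be laid out differently.
my @listing = (
    "Record  LENGTH  NAME",
    "     1  10.5    Main",
    "     2  20.25   A,B",
);
print listing_line_to_csv($_), "\n" for @listing;
```

Fields that contain spaces would need a fixed-width or delimiter-aware parser rather than a simple whitespace split.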
We have seen how we can use Perl to spice up the performance and extend the capabilities of AML. The Perl program can return values to the parent AML, create stand-alone AML code, or just reformat an output file. By packaging the AML and Perl program in the same file, we retain the convenience and tidiness of a single program, but can access the strengths of two different languages.
After reading this, I hope that you will be able to use the techniques I have described to go beyond the examples provided. If you have any ideas about how to take this technique further, I would enjoy hearing about them. If you have any comments, questions, or feedback, please drop me a line.
Entire site © 1996-2004 by Christopher Eykamp