---------------------------------------------------- Introduction to AI WIN 95 COMMON LISP STYLE - Description Collected by: David Carmel, carmel@cs.technion.ac.il Last Update: 3.10.1995 ---------------------------------------------------- ---------------------------------------------------------------- How can I improve my Lisp programming style and coding efficiency? Here are some general suggestions/notes about improving Lisp programming style, readability, correctness and efficiency: General Programming Style Rules: - Write short functions, where each function provides a single, well-defined operation. Small functions are easier to read, write, test, debug, and understand. - Use descriptive variable and function names. If it isn't clear from the name of a function or variable what its purpose is, document it with a documentation string and a comment. In fact, even if the purpose is evident from the name, it is still worth documenting your code. - Don't write Pascal (or C) code in Lisp. Use the appropriate predefined functions -- look in the index to CLtL2, or use the APROPOS and DESCRIBE functions. Don't put a close parenthesis on a line by itself -- this can really aggravate programmers who grew up on Lisp. Lisp-oriented text editors include tools for ensuring balanced parentheses and for moving across pairs of balanced parentheses. - Use proper indentation -- you should be able to understand the structure of your definitions without noticing the parentheses. The following functions often abused or misunderstood by novices. Think twice before using any of these functions. - EVAL. Novices almost always misuse EVAL. When experts use EVAL, they often would be better off using APPLY, FUNCALL, or SYMBOL-VALUE. Use of EVAL when defining a macro should set off a warning bell -- macro definitions are already evaluated during expansion. See also the answer to question 3-12. - Destructive operations, such as NCONC, SORT, DELETE, RPLACA, and RPLACD, should be used carefully and sparingly. In general, trust the garbage collector: allocate new data structures when you need them. To improve the readability of your code, - Don't use any C{A,D}R functions with more than two letters between the C and the R. When nested, they become hard to read. If you have complex data structures, you are often better off describing them with a DEFSTRUCT, even if the type is LIST. The data abstraction afforded by DEFSTRUCT makes the code much more readable and its purpose clearer. If you must use C{A,D}R, try to use DESTRUCTURING-BIND instead, or at least SECOND, THIRD, NTH, NTHCDR, etc. - Use COND instead of IF and PROGN. In general, don't use PROGN if there is a way to write the code within an implicit PROGN. For example, (IF (FOO X) (PROGN (PRINT "hi there") 23) 34) should be written using COND instead. - Never use a 2-argument IF or a 3-argument IF with a second argument of NIL unless you want to emphasize the return value; use WHEN and UNLESS instead. You will want to emphasize the return value when the IF clause is embedded within a SETQ, such as (SETQ X (IF (EQ Y Z) 2 NIL)). If the second argument to IF is the same as the first, use OR instead: (OR P Q) rather than (IF P P Q). Use UNLESS instead of (WHEN (NOT ..) ..) but not instead of (WHEN (NULL ..) ..). - Use COND instead of nested IF statements. Be sure to check for unreachable cases, and eliminate those cond-clauses. - Use backquote, rather than explicit calls to LIST, CONS, and APPEND, whenever writing a form which produces a Lisp form, but not as a general substitute for LIST, CONS and APPEND. LIST, CONS and APPEND usually allocate new storage, but lists produced by backquote may involve destructive modification (e.g., ,.). - Make the names of special (global) variables begin and end with an asterisk (*): (DEFVAR *GLOBAL-VARIABLE*) Some programmers will mark the beginning and end of an internal global variable with a percent (%) or a period (.). Make the names of constants begin and end with a plus (+): (DEFCONSTANT +E+ 2.7182818) This helps distinguish them from lexical variables. Some people prefer to use macros to define constants, since this avoids the problem of accidentally trying to bind a symbol declared with defconstant. - If your program is built upon an underlying substrate which is implementation-dependent, consider naming those functions and macros in a way that visually identifies them, either by placing them in their own package, or prepending a character like a %, ., or ! to the function name. Note that many programmers use the $ as a macro character for slot access, so it should be avoided unless you're using it for that purpose. - Don't use property lists. Instead, use an explicit hash table. This helps avoid problems caused by the symbol being in the wrong package, accidental reuse of property keys from other programs, and allows you to customize the structure of the table. - Use the most specific construct that does the job. This lets readers of the code see what you intended when writing the code. For example, don't use SETF if SETQ will do (e.g., for lexical variables). Use the most specific predicate to test your conditions. If you intend for a function to be a predicate, have it return T for true, not just non-NIL. - When NIL is used as an empty list, use () in your code. When NIL is used as a boolean, use NIL. Similarly, use NULL to test for an empty list, NOT to test a logical value. Use ENDP to test for the end of a list, not NULL. - Don't use the &AUX lambda-list keyword. It is always clearer to define local variables using LET or LET*. - When using RETURN and RETURN-FROM to exit from a block, don't use (VALUES ..) when returning only one value, except if you are using it to suppress extra multiple values from the first argument. - Type the name of external symbols, functions, and variables from the COMMON-LISP package in uppercase. This will allow your code to work properly in a case-sensitive version of Common Lisp, since the print-names of symbols in the COMMON-LISP package are uppercase internally. (However, not everybody feels that being nice to case-sensitive Lisps is a requirement, so this isn't an absolute style rule, just a suggestion.) Lisp Idioms: - MAPCAN is used with a function to return a variable number of items to be included in an output list. When the function returns zero or one items, the function serves as a filter. For example, (mapcan #'(lambda (x) (when (and (numberp x) (evenp x)) (list x))) '(1 2 3 4 x 5 y 6 z 7)) Documentation: - Comment your code. Use three semicolons in the left margin before the definition for major explanations. Use two semicolons that float with the code to explain the routine that follows. Two semicolons may also be used to explain the following line when the comment is too long for the single semicolon treatment. Use a single semicolon to the right of the code to explain a particular line with a short comment. The number of semicolons used roughly corresponds with the length of the comment. Put at least one blank line before and after top-level expressions. - Include documentation strings in your code. This lets users get help while running your program without having to resort to the source code or printed documentation. Issues related to macros: - Never use a macro instead of a function for efficiency reasons. Declaim the function as inline -- for example, (DECLAIM (INLINE ..)) This is *not* a magic bullet -- be forewarned that inline expansions can often increase the code size dramatically. INLINE should be used only for short functions where the tradeoff is likely to be worthwhile: inner loops, types that the compiler might do something smart with, and so on. Stylistic preferences: - Use (SETF (CAR ..) ..) and (SETF (CDR ..) ..) in preference to RPLACA and RPLACD. Likewise (SETF (GET ..) ..) instead of PUT. - Use INCF, DECF, PUSH and POP instead instead of the corresponding SETF forms. - Don't use LET* where LET will do. Don't use LABELS where FLET will do. Don't use DO* where DO will do. - Don't use DO where DOTIMES or DOLIST will do. - If you like using MAPCAR instead of DO/DOLIST, use MAPC when no result is needed -- it's more efficient, since it doesn't cons up a list. If a single cumulative value is required, use REDUCE. If you are seeking a particular element, use FIND, POSITION, or MEMBER. - If using REMOVE and DELETE to filter a sequence, don't use the :test-not keyword or the REMOVE-IF-NOT or DELETE-IF-NOT functions. Use COMPLEMENT to complement the predicate and the REMOVE-IF or DELETE-IF functions instead. - Use complex numbers to represent points in a plane. - Don't use lists where vectors are more appropriate. Accessing the nth element of a vector is faster than finding the nth element of a list, since the latter requires pointer chasing while the former requires simple addition. Vectors also take up less space than lists. Use adjustable vectors with fill-pointers to implement a stack, instead of a list -- using a list continually conses and then throws away the conses. - When adding an entry to an association list, use ACONS, not two calls to CONS. This makes it clear that you're using an alist. - If your association list has more than about 10 entries in it, consider using a hash table. Hash tables are often more efficient. - When you don't need the full power of CLOS, consider using structures instead. They are often faster, take up less space, and easier to use. - Use WITH-OPEN-FILE instead of OPEN and CLOSE. - Use local variables in preference to global variables whenever possible. Do not use global variables in lieu of parameter passing. Global variables can be used in the following circumstances: * When one function needs to affect the operation of another, but the second function isn't called by the first. (For example, *load-pathname* and *break-on-warnings*.) * When a called function needs to affect the current or future operation of the caller, but it doesn't make sense to accomplish this by returning multiple values. * To provide hooks into the mechanisms of the program. (For example, *evalhook*, *, /, and +.) * Parameters which, when their value is changed, represent a major change to the program. (For example, *print-level* and *print-readably*.) * For state that persists between invocations of the program. Also, for state which is used by more than one major program. (For example, *package*, *readtable*, *gensym-counter*.) * To provide convenient information to the user. (For example, *version* and *features*.) * To provide customizable defaults. (For example, *default-pathname-defaults*.) * When a value affects major portions of a program, and passing this value around would be extremely awkward. (The example here is output and input streams for a program. Even when the program passes the stream around as an argument, if you want to redirect all output from the program to a different stream, it is much easier to just rebind the global variable.) - NTH must cdr down the list to reach the elements you are interested in. If you don't need the structural flexibility of lists, try using vectors and the ELT function instead. - Don't use quoted constants where you might later destructively modify them. For example, instead of writing '(c d) in (defun foo () (let ((var '(c d))) ..)) write (list 'c 'd) instead. Using a quote here can lead to unexpected results later. If you later destructively modify the value of var, this is self-modifying code! Some Lisp compilers will complain about this, since they like to make constants read-only. Modifying constants has undefined results in ANSI CL. See also the answer to question [3-13]. - Some programmers feel that you shouldn't add declarations to code until it is fully debugged, because incorrect declarations can be an annoying source of errors. They recommend using CHECK-TYPE liberally instead while you are developing the code. On the other hand, if you add declarations to tell the compiler what you think your code is doing, the compiler can then tell you when your assumptions are incorrect. Declarations also make it easier for another programmer to read your code. - Declaring the type of variables to be FIXNUM does not necessarily mean that the results of arithmetic involving the fixnums will be a fixnum; it could be a BIGNUM. For example, (declare (type fixnum x y)) (setq z (+ (* x x) (* y y))) could result in z being a BIGNUM. If you know the limits of your numbers, use a declaration like (declare (type (integer 0 100) x y)) instead, since most compilers can then do the appropriate type inference, leading to much faster code. - Depending on the optimization level of the compiler, type declarations are interpreted either as (1) a guarantee from you that the variable is always bound to values of that type, or (2) a desire that the compiler check that the variable is always bound to values of that type. Use CHECK-TYPE if (2) is your intention. - If you get warnings about unused variables, add IGNORE declarations if appropriate or fix the problem. Letting such warnings stand is a sloppy coding practice. To produce efficient code, - choose the right algorithm. For example, consider seven possible implementations of COPY-LIST: (defun copy-list (list) (let ((result nil)) (dolist (item list result) (setf result (append result (list item)))))) (defun copy-list (list) (let ((result nil)) (dolist (item list (nreverse result)) (push item result)))) (defun copy-list (list) (mapcar #'identity list)) (defun copy-list (list) (let ((result (make-list (length list)))) (do ((original list (cdr original)) (new result (cdr new))) ((null original) result) (setf (car new) (car original))))) (defun copy-list (list) (when list (let* ((result (list (car list))) (tail-ptr result)) (dolist (item (cdr list) result) (setf (cdr tail-ptr) (list item)) (setf tail-ptr (cdr tail-ptr)))))) (defun copy-list (list) (loop for item in list collect item)) (defun copy-list (list) (if (consp list) (cons (car list) (copy-list (cdr list))) list)) The first uses APPEND to tack the elements onto the end of the list. Since APPEND must traverse the entire partial list at each step, this yields a quadratic running time for the algorithm. The second implementation improves on this by iterating down the list twice; once to build up the list in reverse order, and the second time to reverse it. The efficiency of the third depends on the Lisp implementation, but it is usually similar to the second, as is the fourth. The fifth algorithm, however, iterates down the list only once. It avoids the extra work by keeping a pointer (reference) to the last cons of the list and RPLACDing onto the end of that. Use of the fifth algorithm may yield a speedup. Note that this contradicts the earlier dictum to avoid destructive functions. To make more efficient code one might selectively introduce destructive operations in critical sections of code. Nevertheless, the fifth implementation may be less efficient in Lisps with cdr-coding, since it is more expensive to RPLACD cdr-coded lists. Depending on the implementation of nreverse, however, the fifth and second implementations may be doing the same amount of work. The sixth example uses the Loop macro, which usually expands into code similar to the third. The seventh example copies dotted lists, and runs in linear time. It's equivalent to the other linear-time examples in a lisp that is properly tail-recursive. - use type declarations liberally in time-critical code, but only if you are a seasoned Lisp programmer. Appropriate type declarations help the compiler generate more specific and optimized code. It also lets the reader know what assumptions were made. For example, if you only use fixnum arithmetic, adding declarations can lead to a significant speedup. If you are a novice Lisp programmer, you should use type declarations sparingly, as there may be no checking to see if the declarations are correct. Wrong declarations can lead to errors in otherwise correct code, and can limit the reuse of code in other contexts. Depending on the Lisp compiler, it may also be necessary to declare the type of results using THE, since some compilers don't deduce the result type from the inputs. - check the code produced by the compiler by using the disassemble function ---------------------------------------------------------------- Glossary of acronyms: CAR Originally meant "Contents of Address portion of Register", which is what CAR actually did on the IBM 704. CDR Originally meant "Contents of Decrement portion of Register", which is what CDR actually did on the IBM 704. Pronounced "Cudder". LISP Originally from "LISt Processing" GUI Graphical User Interface CLOS Common Lisp Object System. The object oriented programming standard for Common Lisp. Based on Symbolics FLAVORS and Xerox LOOPS, among others. Pronounced either as "See-Loss" or "Closs". See also PCL. PCL Portable Common Loops. A portable CLOS implementation. Available by anonymous ftp from parcftp.xerox.com:pcl/. LOOPS Lisp Object Oriented Programming System. A predecessor to CLOS on Xerox Lisp machines. X3J13 Subcommittee of the ANSI committee X3 which is working on the ANSI Standardization of Common Lisp. ANSI American National Standards Institute CL Common Lisp SC22/WG16 The full name is ISO/IEC JTC 1/SC 22/WG 16. It stands for International Organization for Standardization/International Electronics(?) Congress(?) Joint Technical Committee 1, Subcommittee 22, Working Group 16. This long-winded name is the ISO working group working on an international Lisp standard, (i.e., the ISO analogue to X3J13). CLtL1 First edition of Guy Steele's book, "Common Lisp the Language". CLtL2 Second edition of Guy Steele's book, "Common Lisp the Language". ---------------------------------------------------------------- ;;; *EOF*