In last post, it was shown that in Lisp practice, the term "symbolic expression", shorter "s-expression", "sepxr", "sexp" is used on few different, although similar meanings.
It is not unusual. Other terms, even mathematical ones, like "lines" or "numbers" are not uniquely defined as well. Ambiguity usually motivates the search for the essence of the discussed entities; the result of the search is axiomatic theory. For example, the axioms of natural numbers are developed in late 1880's by R. Dedekind and G. Peano.
Search for axioms of symbolic expressions might be equally justified. I designed following axioms (version in which lists are only shorter way of writing "dotted pairs") to emphasize the similarities to axioms of natural numbers.
Symbolic expressions in all three meanings satisfy the axioms; cons structures satisfy axiom (3) only if cyclic structures are not allowed, like in original, McCarthy's Lisp. I'm not aware of unintended, perverse models that satisfy given axioms; but I am not sure that such models do not exist.
There are only two differences between these axiom systems.
It is not obvious that symbolic expressions require infinitely many atoms; it could be only convenience. Perhaps S-expressions like (A . (B . (C . D))) can be used instead of symbols like ABCD, eliminating need for infinitely many atoms.
It remains unclear why these two systems of axioms are so similar.
Three meanings of the term 'S-expression'
"Symbolic expressions" or "S-expressions" are the basic data type in Lisp. Particularly, Lisp programs are S-expressions as well. The notion has immense importance in Lisp community.
Surprisingly, designers of recent Lisp dialects avoid the term "symbolic expression" or "S-expression". It is used once in Picolisp documentation; only twice, almost accidentally, in CLtL2, and it isn't used in CL Hyperspec or recent Scheme standards. Clojure, according to its web site "extends the code-as-data system beyond parenthesized lists (s-expressions) to vectors and maps." Only Newlisp documentation appear to regularly uses the term.
However, the term S-expression is still extensively used in daily communication and literature. Unfortunately, there is no universal, unique meaning. Inconsistent use is sometimes noted; for instance, in P. Siebel's "Practical Common Lisp." More frequently, it is ignored.
For J. McCarthy, S-expression is finite sequence consisting of dots and parentheses and symbols. The symbols, truly atomic, cannot be analysed on characters. For instance, S-expression (left.right) is sequence of five elements: (, left, ., right and ).
More often1, S-expression is seen as sequence of characters. In that meaning, S-expression (left . right) is sequence consisting of 14 elements.
The most usually2, S-expression is data structure, perhaps tree consisting of cons cells and symbols. In that meaning S-expression (left . right) is cons cell, containing adresses of symbols left and right in memory.
1 For instance, in E. Shaphiro, "Common Lisp - an interactive approach"; F. Turbak and D. Gifford, "Design concept of programming languages."
2 For instance, in J. and G. Sussman and H. Abelson, "SICP"; D. Touretsky, "Common Lisp - A gentle introduction to symbolic computation"; R. Finkel, "Advanced programming languages design".
For instance, matrices 2 x 2 can be implemented with symbols as a11 instead of a[1,1] in some other language, or (a 1 1) in Lisp and names could be created as needed.
The code will be more complicated and less efficient than if matrices are directly supported by language, but neither one is essential problem - complexity of the code can be "abstracted away" with functions, and computational complexity is increased only for factor of constant * log n where n is the number of used symbols (if symbol reaching in Lisp is implemented efficiently.) Newlisp code follows.
(define (matrix-element m i j) (sym (append (string m) (string i) (string j)))) (define (set-matrix m m11 m12 m21 m22) (for(i 1 2) (for(j 1 2) (set (matrix-element m i j) (nth (+ (* 2 (- i 1)) (- j 1)) (list m11 m12 m21 m22)))))) (define (println-matrix m) (for(i 1 2) (for(j 1 2) (println m "[" i "," j "]=" (eval (matrix-element m i j)))))) (define (multiply-matrix a b c) (for(i 1 2) (for(j 1 2) (set (matrix-element c i j) 0) (for(k 1 2) (set (matrix-element c i j) (+ (eval (matrix-element c i j)) (* (eval (matrix-element a i k)) (eval (matrix-element b k j))))))))) (set-matrix (quote X) 1 2 3 4) (set-matrix (quote Y) 2 3 4 5) (multiply-matrix (quote X) (quote Y) (quote Z)) (println-matrix (quote Z)) --------------------------- Program output Z[1,1]=10 Z[1,2]=13 Z[2,1]=22 Z[2,2]=29
More on Another Encoding of S-expressions in SymbolsIn previous post, I have shown another interesting method for encoding of S-expressions into symbols. In this post, I'll show how it can be implemented in Lisp dialects that print symbols on different way from S-expressions, i.e. encapsulated in vertical bars or other symbols (Common Lisp, ISLisp, Picolisp, some implementations of Scheme). This is, normally, good feature. However, repeated encoding of the symbols results in exponential growth. So, we must get rid of those vertical bars inside our symbols.
The code, here in Common Lisp, is sort of trickier than it is in Newlisp or it would be in Clojure.
(defun decode-deep (r) (cond ((symbolp r) (let((result (read-from-string (string r)))) (if (symbolp result) result (decode-deep result)))) ((listp r) (mapcar (quote decode-deep) r)))) (defun sexpr->symbol (r) (make-symbol (write-to-string (list 'q (decode-deep r))))) (defun encode-deep (r) (cond ((symbolp r) r) ((listp r)(if (eq (first r) (quote q)) (make-symbol (write-to-string r)) (mapcar (quote encode-deep) r))))) (defun strip-q (r) (first (rest r))) (defun symbol->sexpr (r) (encode-deep (strip-q (read-from-string (string r))))) ------------- > (setq q0 (sexpr->symbol (list (quote b) (quote c)))) #:|(Q (B C))| > (setq s (sexpr->symbol (list (quote a) q0))) #:|(Q (A (Q (B C))))| > (setq s0 (symbol->sexpr s)) (A #:|(Q (B C))|)
Another Encoding of S-expressions in Symbols.Unlike encoding from the last post, this one maintains form of S-expression. If e is S-expression, then its code is symbol (Q e).
Encoding is relatively simple, because built in functions for conversion of symbols and strings into symbols can be used; especially if symbols converted to strings are not encapsulated within vertical bars or quotation marks. Decoding is slightly more complicated. For instance, symbol
(Q (a (Q b)))
should be decoded into list (a (Q b)) where (Q b) isn't list, but symbol. Hence, built in conversion functions are not enough, but one must define his own "code walker." The code in Newlisp follows.
(define (sexpr->symbol r) (sym (string (list (quote Q) r)))) (define (qlist? r) (and (list? r) (not (empty? r)) (= (first r) (quote Q)))) (define (symbol->sexpr r) (let((codewalker (lambda(r) (cond ((symbol? r) r) ((qlist? r) (sym (string r))) ((not (qlist? r)) (map codewalker r)))))) (codewalker (last (read-expr (string r)))))) -------- > (setq s (sexpr->symbol (quote (a b)))) (Q (a b)) > (symbol? s) true > (setq q0 (sexpr->symbol (quote b))) (Q b) > (setq s (sexpr->symbol (list (quote a) q0))) (Q (a (Q b))) > (symbol? s) true > (setq s0 (symbol->sexpr s)) (a (Q b)) > (symbol? s0) nil > (symbol? (last s0)) true
Behaviour of Q strongly reminds on QUOTE.
More Sophisticated Encoding of S-exprs into SymbolsEncoding of S-expressions into symbols is frequent topic of discussion in this blog. It is motivated by the idea that names should be equally flexible as S-expressions. (Actually, it might be the best if not only every symbol is S-expression, but also every S-expression is the symbol. OK, it is almost mystical statement.)
Trivial encoding was demonstrated in last post: S-expression is encoded in symbol which has exactly same characters as printed representation of S-expression. For instance, S-expression (+ a b) is encoded into symbol (+ a b). This trivial encoding has some limitations. For instance, if we look at symbol
((+ a b) c)
and try to decode it, we'll find that it cannot be determined whether (+ a b) is S-expression, or it is symbol. In some Lisp dialects (Common Lisp, Picolisp, ISLisp) there is no such problem. Symbol (+ a b) is written as |(+ a b)| so difference between S-expression and symbol is obvious. However, in these dialects, repeated encoding results in the symbols that grow exponentially.
In this post more sophisticated encoding, without that problem, is presented. Let us define encoding k, such that for every S-expression e, k(e) is defined as follows.
k(a) = a;
k((a b)) = [k(a) k(b)]; = [a;b;];
If only codes of the S-expressions contain characters [, ] and ;, then encoding k is injection, i.e. for two S-expressions e and f,
k( e ) = k( f ) => e = f.
Furthermore, there is no exponential explosion in case of multiple encoding. For instance,
k((a b)) = [a;b;];
k(k((a b))) = [a;b;];;
k(k(k((a b)))) = [a;b];;;
Here is implementation of encoding and decoding in R5RS Scheme.
(define (sexpr->string L) (string-append (if (symbol? L) (symbol->string L) (string-append "[" (apply string-append (map sexpr->string L)) "]")) ";")) (define (sexpr->symbol L) (string->symbol (sexpr->string L))) (define (string->sexpr S) (let((S1 (substring S 0 (- (string-length S) 1)))) (if (equal? (string-ref S1 (- (string-length S1) 1)) #\]) (let((substring-begin 1) (level 0) (result (list))) (do ((i 1 (+ i 1))) ((= i (string-length S1)) result) (if (and (= level 0) (equal? (string-ref S1 i) #\;) (not (equal? (string-ref S1 (+ i 1)) #\;))) (begin (set! result (append result (list (string->sexpr (substring S1 substring-begin (+ i 1)))))) (set! substring-begin (+ i 1)))) (cond ((equal? (string-ref S1 i) #\[) (set! level (+ level 1))) ((equal? (string-ref S1 i) #\]) (set! level (- level 1)))))) (string->symbol (substring S 0 (- (string-length S) 1)))))) (define (symbol->sexpr s) (string->sexpr (symbol->string s))) ------------------------- > (sexpr->symbol (quote (a b))) |[a;b;];| > (define s2 (sexpr->symbol (quote (|[a;b;];| c)))) > s2 |[[a;b;];;c;];| > (symbol->sexpr s2) (|[a;b;];| c)