Implementing Data Structures with Symbols.


Implementing Data Structures with Symbols

Original Lisp had only symbols as atoms. Symbol names, if used on the same way variables are used in mathematics, are flexible enough to support wide variety (all?) data structures.

For instance, matrices 2 x 2 can be implemented with symbols as a11 instead of a[1,1] in some other language, or (a 1 1) in Lisp and names could be created as needed.

The code will be more complicated and less efficient than if matrices are directly supported by language, but neither one is essential problem - complexity of the code can be "abstracted away" with functions, and computational complexity is increased only for factor of constant * log n where n is the number of used symbols (if symbol reaching in Lisp is implemented efficiently.) Newlisp code follows.

(define (matrix-element m i j)
  (sym (append (string m) (string i) (string j))))

(define (set-matrix m m11 m12 m21 m22)
  (for(i 1 2)
    (for(j 1 2)
      (set (matrix-element m i j)
           (nth (+ (* 2 (- i 1)) (- j 1))
                (list m11 m12 m21 m22))))))
(define (println-matrix m)
  (for(i 1 2)
    (for(j 1 2)
       (println m "[" i "," j "]=" 
          (eval (matrix-element m i j))))))
(define (multiply-matrix a b c)
  (for(i 1 2)
    (for(j 1 2)
      (set (matrix-element c i j) 0)
      (for(k 1 2)
         (set (matrix-element c i j)
              (+ (eval (matrix-element c i j))
                 (* (eval (matrix-element a i k))
                    (eval (matrix-element b k j)))))))))

(set-matrix (quote X) 1 2 3 4)
(set-matrix (quote Y) 2 3 4 5)
(multiply-matrix (quote X) (quote Y) (quote Z))
(println-matrix (quote Z))

Program output


More on Another Encoding of S-expressions in Symbols.


More on Another Encoding of S-expressions in Symbols

In previous post, I have shown another interesting method for encoding of S-expressions into symbols. In this post, I'll show how it can be implemented in Lisp dialects that print symbols on different way from S-expressions, i.e. encapsulated in vertical bars or other symbols (Common Lisp, ISLisp, Picolisp, some implementations of Scheme). This is, normally, good feature. However, repeated encoding of the symbols results in exponential growth. So, we must get rid of those vertical bars inside our symbols.

The code, here in Common Lisp, is sort of trickier than it is in Newlisp or it would be in Clojure.
(defun decode-deep (r)
  (cond ((symbolp r) (let((result (read-from-string (string r))))
                        (if (symbolp result)
                            (decode-deep result))))
        ((listp r) (mapcar (quote decode-deep) r))))

(defun sexpr->symbol (r) 
  (make-symbol (write-to-string (list 'q (decode-deep r)))))

(defun encode-deep (r)
  (cond ((symbolp r) r)
        ((listp r)(if (eq (first r) (quote q))
                      (make-symbol (write-to-string r))
                      (mapcar (quote encode-deep) r)))))

(defun strip-q (r) (first (rest r)))

(defun symbol->sexpr (r) 
  (encode-deep (strip-q (read-from-string (string r)))))


[43]> (setq q0 (sexpr->symbol (list (quote b) (quote c))))
#:|(Q (B C))|
[44]> (setq s (sexpr->symbol (list (quote a) q0)))
#:|(Q (A (Q (B C))))|
[45]> (setq s0 (symbol->sexpr s))
(A #:|(Q (B C))|)

Another Encoding of S-expressions in Symbols.


Another Encoding of S-expressions in Symbols.
Unlike encoding from the last post, this one maintains form of S-expression. If e is S-expression, then its code is symbol (Q e).

Encoding is relatively simple, because built in functions for conversion of symbols and strings into symbols can be used; especially if symbols converted to strings are not encapsulated within vertical bars or quotation marks. Decoding is slightly more complicated. For instance, symbol

(Q (a (Q b)))

should be decoded into list (a (Q b)) where (Q b) isn't list, but symbol. Hence, built in conversion functions are not enough, but one must define his own "code walker." The code in Newlisp follows.

(define (sexpr->symbol r) (sym (string (list (quote Q) r))))

(define (qlist? r)
    (and (list? r)
         (not (empty? r))
         (= (first r) (quote Q))))
(define (symbol->sexpr r)
           (cond ((symbol? r) r)
                 ((qlist? r) (sym (string r)))
                 ((not (qlist? r)) (map codewalker r))))))
     (codewalker (last (read-expr (string r))))))

> (setq s (sexpr->symbol (quote (a b))))
(Q (a b))
> (symbol? s)
> (setq q0 (sexpr->symbol (quote b)))
(Q b)
> (setq s (sexpr->symbol (list (quote a) q0)))
(Q (a (Q b)))
> (symbol? s)
> (setq s0 (symbol->sexpr s))
(a (Q b))
> (symbol? s0)
> (symbol? (last s0))

Behaviour of Q strongly reminds on QUOTE.

More Sophisticated Encoding of S-exprs into Symbols.

More Sophisticated Encoding of S-exprs into Symbols
Encoding of S-expressions into symbols is frequent topic of discussion in this blog. It is motivated by the idea that names should be equally flexible as S-expressions. (Actually, it might be the best if not only every symbol is S-expression, but also every S-expression is the symbol. OK, it is almost mystical statement.)

Trivial encoding was demonstrated in last post: S-expression is encoded in symbol which has exactly same characters as printed representation of S-expression. For instance, S-expression (+ a b) is encoded into symbol (+ a b). This trivial encoding has some limitations. For instance, if we look at symbol

((+ a b) c)

and try to decode it, we'll find that it cannot be determined whether (+ a b) is S-expression, or it is symbol. In some Lisp dialects (Common Lisp, Picolisp, ISLisp) there is no such problem. Symbol (+ a b) is written as |(+ a b)| so difference between S-expression and symbol is obvious. However, in these dialects, repeated encoding results in the symbols that grow exponentially.

In this post more sophisticated encoding, without that problem, is presented. Let us define encoding k, such that for every S-expression e, k(e) is defined as follows.
  1. If e is symbol, then k(e) = e;
  2. if e = (e1 ... en) then k(e) = [k(e1) k(e2) ... k(en)];
Semi-colon is part of the symbol. For instance

k(a) = a;
k((a b)) = [k(a) k(b)]; = [a;b;];

If only codes of the S-expressions contain characters [, ] and ;, then encoding k is injection, i.e. for two S-expressions e and f,

k( e ) = k( f ) => e = f.

Furthermore, there is no exponential explosion in case of multiple encoding. For instance,

k((a b)) = [a;b;];
k(k((a b))) = [a;b;];;
k(k(k((a b)))) = [a;b];;;

Here is implementation of encoding and decoding in R5RS Scheme.
(define (sexpr->string L)
     (if (symbol? L) (symbol->string L)
                        (apply string-append
                               (map sexpr->string L))
(define (sexpr->symbol L)
  (string->symbol (sexpr->string L)))
(define (string->sexpr S)
  (let((S1 (substring S 0 (- (string-length S) 1))))
    (if (equal? (string-ref  S1 (- (string-length S1) 1)) #\])
      (let((substring-begin 1)
           (level 0)
           (result (list)))
        (do ((i 1 (+ i 1)))
            ((= i (string-length S1)) result)
          (if (and (= level 0)
                   (equal? (string-ref S1 i) #\;)
                   (not (equal? (string-ref S1 (+ i 1)) #\;)))
              (set! result
                    (append result
                            (list (string->sexpr
                                    (substring S1
                                               (+ i 1))))))
              (set! substring-begin (+ i 1))))
          (cond ((equal? (string-ref S1 i) #\[)
                 (set! level (+ level 1)))
                ((equal? (string-ref S1 i) #\])
                 (set! level (- level 1))))))
      (string->symbol (substring S 0 (- (string-length S) 1))))))
(define (symbol->sexpr s)
  (string->sexpr (symbol->string s)))


> (sexpr->symbol (quote (a b)))
> (define s2 (sexpr->symbol (quote (|[a;b;];| c))))
> s2
> (symbol->sexpr s2)
(|[a;b;];| c)