The Probability That Random Propositional Formula is Tautology



; In last few posts we have shown that radnom propositional formula
; using classical connectives or, and and ->, and without variables
; is more probably true than false. In this post, we'll show that 
; probability that propositional formula with variables is tautology
; is surprisingly high. 

(load "http://www.instprog.com/Instprog.default-library.lsp")

; The function rnd-pf returns random propositional formula. For
; example, (rnd-pf 10 '(true nil x y z) '(not) '(or and ->)) will
; return radnom formula of the size 10, with leafs true, nil, x, y
; and z; unary connective not and binary connectives or, and and ->.

(set 'element find)

(set 'rnd-pf
  (lambda(len leafs unary binary)
    (let ((connectives (append unary binary)))
      (cond ((= len  1) 
               (apply amb leafs))
            ((= len  2) 
               (list (apply amb unary) 
                 (rnd-pf 1 leafs unary binary)))

            ((> len  2) 
               (let ((connective (apply amb connectives)))
                 (cons connective
                       (if (element connective unary)
                           (list (rnd-pf (- len  1) leafs unary binary))
                           (let ((r (apply amb (sequence 1 (- len  2)))))
                                 (list (rnd-pf r leafs unary binary)
                                       (rnd-pf (- len 1 r) leafs unary binary)))))))))))

; In following part of the program, 1000 random propositional formulas
; of the size 1000 (that means, 1000 nodes in graph) are constructed,
; and probability that random formula is tautology is calculated. It
; is done for various number of variables, from no variables at all
; to 17 variables. 

(set 'vars '(a b c d e f g h i j k l m n o p q r s t u v w x y z))

(dotimes(j 20)
  (set 'tautologies 0)
   (dotimes(i 1000)
     (set 'rf (rnd-pf 1000 (append '(true nil) (slice vars 0 j)) '(not) '(and or ->)))
  (when (tautology? rf)
        (inc tautologies)))
  
(println "Formulas with " j " variables: " (/ tautologies 10) "% tautologies."))

; The results are:
; 
; Formulas with 0 variables: 58% tautologies.
; Formulas with 1 variables: 51% tautologies.
; Formulas with 2 variables: 42% tautologies.
; Formulas with 3 variables: 42% tautologies.
; Formulas with 4 variables: 34% tautologies.
; Formulas with 5 variables: 31% tautologies.
; Formulas with 6 variables: 31% tautologies.
; Formulas with 7 variables: 30% tautologies.
; Formulas with 8 variables: 27% tautologies.
; Formulas with 9 variables: 22% tautologies.
; Formulas with 10 variables: 22% tautologies.
; Formulas with 11 variables: 21% tautologies.
; Formulas with 12 variables: 22% tautologies.
; Formulas with 13 variables: 16% tautologies.
; Formulas with 14 variables: 18% tautologies.
; Formulas with 15 variables: 19% tautologies.
; Formulas with 16 variables: 16% tautologies.
; Formulas with 17 variables: 16% tautologies.
;

On Expected Truth Value of a Random Propositional Formula (2)


; Random propositional formula - is it true or false?

; Another function for definition of the random propositional formula
; is implemented. This time not the probability that graph edge is 
; leaf (true or nil) is given, instead, the size of the formula is
; exactly controlled. For example, if (random-formula 5) is called,
; then, first, logical connective is randomly chosen, and then,
; if that logical connective is not then
;
;   (list 'not (random-formula 4)) 
;
; is returned. If binary logical connective is chosen, the pair 
; (n1, n2) is also randomly chosen so n1+n2=4, and
;
;   (list connective (random-formula n1) (random-formula n2))
;
; is returned. There are also few simple special cases. 
          
(set 'random-formula2
  (lambda(x)
    (cond ((= x 1) (amb true nil))
          ((= x 2) (list 'not (random-formula2 1)))
          
          (true (let ((connective (amb 'not 'or 'and '->)))
                     
                     (cons connective
                     
                           (if (= connective 'not)
                               (list (random-formula2 (- x 1)))
                                       
                               (let ((r (apply amb (sequence 1 (- x 2)))))
                                     (list (random-formula2 r)
                                           (random-formula2 (- x 1 r)))))))))))
                                           
; I'll use debug-wrap from my library to demonstrate how function
; calls itself:

(load "http://www.instprog.com/Instprog.default-library.lsp") 

(debug-wrap random-formula2)
(random-formula2 7)

; 
; (random-formula2 (in 7)
;                  (random-formula2 (in 5)
;                                   (random-formula2 (in 4)
;                                                    (random-formula2 (in 1)
;                                                                     (out true))
;                                                    (random-formula2 (in 2)
;                                                                     (random-formula2 (in 1)
;                                                                                      (out true))
;                                                                     (out (not true))); m=2
;                                                    (out (or true (not true)))); t=31; m=5
;                                   (out (not (or true (not true))))); t=31; m=7
;                  (random-formula2 (in 1)
;                                   (out true))
;                  (out (-> (not (or true (not true))) true))); t=46; m=10
;

(debug-unwrap random-formula2)

; Let us look once again whether propositional formulas are really
; more probably true than false ... 


(for (x 0 10)
  (set 'j (pow 10 x))
  (set 'sum 0)
  (dotimes(i 1000)
    (if (eval (random-formula2 j)) ; power of eval
        (inc sum)))
  (println "Random formula of a length " j " is true in " 
           (div (mul 100 sum) 1000) "% of cases"))

; Random formula of a length 1 is true in 49.2% of cases
; Random formula of a length 10 is true in 55.2% of cases
; Random formula of a length 100 is true in 57.9% of cases
; Random formula of a length 1000 is true in 56.6% of cases
; Random formula of a length 10000 is true in 60% of cases
; Random formula of a length 100000 is true in 59.5% of cases

;




---

On Expected Truth Value of a Random Propositional Formula



Is it more probable that random propositional formula, as defined
in previous blogpost using true, nil and traditional connectives or,
and, not and -> is true or false? One would expect that it is equally
probable. Let us test this hypothesis.

(set 'random-formula
      (lambda(x)
        (if (= x 1)
            (amb true nil)
            (letn ((a (div (add (mul x 3) 4) (mul 2 (sub x 1))))
                  (r (random 0 (add 4 (mul 2 a)))))
                 
                 (cond
                    ((<= 0 r 1) (list 'not (random-formula x)))
                    ((<= 1 r 2) (list 'and (random-formula x) (random-formula x)))
                    ((<= 2 r 3) (list 'or (random-formula x) (random-formula x)))
                    ((<= 3 r 4) (list '-> (random-formula x) (random-formula x)))
                    ((<= 4 r (add 4 a)) true)           ; if a=2 then 
                    ((<= (add 4 a) r (add 4 a a)) nil))))))
                 
(set '-> (lambda(x y)(or (not x) y)))

(for (j 1 50)
  (set 'sum 0)
  (dotimes(i 100000)
    (if (eval (random-formula j)) ; power of eval
        (inc sum)))
  (println "Random formula of a length " j ". is true in " 
           (div (mul 100 sum) 100000) "% of cases"))
  
Results:  
  
Random formula of a length 1. is true in 49.855% of cases
Random formula of a length 2. is true in 52.099% of cases
Random formula of a length 3. is true in 52.636% of cases
Random formula of a length 4. is true in 53.104% of cases
Random formula of a length 5. is true in 53.224% of cases
Random formula of a length 6. is true in 53.577% of cases
Random formula of a length 7. is true in 53.339% of cases
Random formula of a length 8. is true in 53.484% of cases
Random formula of a length 9. is true in 53.513% of cases
Random formula of a length 10. is true in 53.87% of cases
Random formula of a length 11. is true in 53.889% of cases
Random formula of a length 12. is true in 53.571% of cases
Random formula of a length 13. is true in 53.917% of cases
Random formula of a length 14. is true in 53.837% of cases
Random formula of a length 15. is true in 54.14% of cases
Random formula of a length 16. is true in 53.973% of cases
Random formula of a length 17. is true in 53.811% of cases
Random formula of a length 18. is true in 54.059% of cases
Random formula of a length 19. is true in 53.929% of cases
Random formula of a length 20. is true in 53.98% of cases
Random formula of a length 21. is true in 53.818% of cases
Random formula of a length 22. is true in 54.109% of cases
Random formula of a length 23. is true in 53.999% of cases
Random formula of a length 24. is true in 54.001% of cases
Random formula of a length 25. is true in 54.095% of cases
Random formula of a length 26. is true in 54.03% of cases
Random formula of a length 27. is true in 54.215% of cases
Random formula of a length 28. is true in 54.048% of cases
Random formula of a length 29. is true in 54.12% of cases
Random formula of a length 30. is true in 53.972% of cases
Random formula of a length 31. is true in 53.752% of cases
Random formula of a length 32. is true in 54.154% of cases
Random formula of a length 33. is true in 54.229% of cases
Random formula of a length 34. is true in 54.159% of cases
Random formula of a length 35. is true in 53.972% of cases
Random formula of a length 36. is true in 54.063% of cases
Random formula of a length 37. is true in 53.926% of cases
Random formula of a length 38. is true in 53.899% of cases
Random formula of a length 39. is true in 54.078% of cases
Random formula of a length 40. is true in 54.247% of cases
Random formula of a length 41. is true in 54.14% of cases
Random formula of a length 42. is true in 54.036% of cases
Random formula of a length 43. is true in 54.05% of cases
Random formula of a length 44. is true in 54.16% of cases
Random formula of a length 45. is true in 53.964% of cases
Random formula of a length 46. is true in 53.988% of cases
Random formula of a length 47. is true in 54.075% of cases
Random formula of a length 48. is true in 54% of cases
Random formula of a length 49. is true in 53.975% of cases
Random formula of a length 50. is true in 53.954% of cases

Why probability that random formula is true is consistently greater
than 50%? Because of asymmetry of traditionally basic logical connectives
and, or and ->. These are the truth tables (0 = nil, 1 = true):

 x y (or x y)        x y (and x y)       x y (-> x y)
 ------------        -------------       ------------
 0 0   0             0 0   0             0 0  1
 0 1   1             0 1   0             0 1  1
 1 0   1             1 0   0             1 0  0
 1 1   1             1 1   1             1 1  1 

Look at the last column in each of these definitions - there are  
total seven 1's and five 0's.


If formula of a given length, say l has connective or, and or
-> on a top level, the probability that it is true should be greater
than probability that it is false.


On the other side, if formula of the same length l has connective not
on the top level, then it is more probable that formula is false -
because not is applied on a random formula of a length l-1 which
is, supposedly more probable to be true. However, there is less of
formulas of length l-1 than formulas of a length l, so formulas
starting with not cannot compensate for formulas starting with
and, or and ->.


We proved that random formula of the length l is probably true -
if formula of the length l-1 is probably true. To complete the proof,
we must prove that random formula is probably true for some base
case:


1: true, false => 1 true, 1 false
2: (not true), (not false) => 1 true, 1 false
3: (or false false), (or false true), (or true false), (or true true),
   (and false false), (and false true), (and true false), (and true true),
   (-> false false), (-> false true), (-> true false), (-> true true)
   (not (not true)), (not (not false)) => 8 true, 6 false.


If formula of a length 3 is probably true, then formula of the
length 4 is probably true and so on ...


Note that our random formulas have no fixed length - but they have
fixed average length, and it is enough that this asymmetry breaks
through.





---

Random Propositional Formulas of a Given Average Length




; My naive attempt to generate random propositional formula
; recursively, giving equal probability to standard logical
; constants true and nil and connectives not, and, and ->
; failed:

(set 'random-formula 
      (lambda()
         (amb true
              nil
              (list 'not (random-formula))
              (list 'and (random-formula) (random-formula))
              (list 'or (random-formula) (random-formula))
              (list '-> (random-formula) (random-formula)))))

(dotimes(i 30)(println i ". " (random-formula)))







; The problem is in infinite recursion. The expectation is that
; random-formula will call seven times another instance of random-formula
; in 6 calls, so it is unlikely that random formula construction
; will be completed. This time, solution is very simple:

(seed (date-value))

(set 'random-formula
      (lambda()
         (amb true true
              nil nil
              (list 'not (random-formula))
              (list 'and (random-formula) (random-formula))
              (list 'or (random-formula) (random-formula))
              (list '-> (random-formula) (random-formula)))))

(dotimes(i 20)(println i ". " (random-formula)))


; The results:


0. (not (and (not (and true nil)) (-> (-> (or (and nil nil) nil) true) nil)))
1. true
2. true
3. nil
4. nil
5. (not (not (-> (not nil) true)))
6. (and true (-> true (or (or (or (and true nil) nil) nil) (not (or (not (-> true 
(not (or (not (not (or true (and (not true) true)))) (-> nil nil))))) (-> nil 
(-> (and nil true) (and (-> (or (or (not (-> true true)) (and (-> nil nil) nil))
(or nil (not true))) (or nil (or true (or true (not nil))))) true))))))))
7. nil
8. nil
9. (or (and nil (not true)) true)
10. (or true (and true (or (or (or nil (not true)) (-> (-> nil (and (or nil true
) nil)) true)) (and (-> nil (or nil nil)) true))))
11. true
12. true
13. (not nil)
14. nil
15. true
16. (and nil (and (-> (or true (or true (-> true (not (-> (or (and (and nil nil)
(not nil))(not nil)) nil)))))(and (and true (or true (-> nil (or true nil)))) 
(not (and nil nil)))) (or true (and (-> nil (and true (or (not (not (and 
(or true nil) (or (and nil (not (or (not (and nil (not true))) true)))(not (not 
nil)))))) nil))) true))))
17. (or nil (-> (and nil true) true))
18. nil
19. (and (not (not nil)) (or (or nil (and (not (-> (or (-> nil nil) true) (not (
not (and (or (or (and (not nil) (and true true)) true) true) true))))) (not true))) 
nil))


; Now, it is possible, but unlikely that random-formula will
; generate random-formula calls infinitely. In eight function calls,
; new function call will be generated seven times.
;
; Interesting question that occurs immidietely is: but, how big
; is average formula generated on this way? By size of the formula,
; we'll simply count number of connectives and constants in formula,
; i.e. number of nodes in imagined graph. Size of (or true (not nil))
; is, for example, four.

; We'll estimate of average length of formula using the fact that
; Random formula is defined recursively. Let us denote *expected length*
; of average formula as x. Then
;
;          1 + 1 + 1 + 1 + (1+x) + (1+2x) + (1+2x) + (1+2x) 
;     x = --------------------------------------------------
;                               8
;
; where 1, 1, 1, 1 are sizes of the trivial formulas: true, true,
; nil and nil; (1+x) is expected size of the formula (list 'not (random-formula))
; and (1+2x) is expected size of other three formulas.
;
; If we solve that equation, we'll get x=8. Let us test that:


(set 'size-of
     (lambda(f)
       (cond ((or (= f true) (= f nil)) 1)
             ((= (first f) 'not) (+ (size-of (last f)) 1))
             (true (+ (size-of (nth 1 f)) 
                      (size-of (nth 2 f)) 
                      1)))))

(set 'sum 0)
(dotimes(i 100000)
  (inc sum (size-of (random-formula))))
(println (div sum 100000)) ; 8.01414 - OK



; Now, when we are here, we'll try to answer the next question:
; How to define the function that generates random formulas of
; expected length, say, 20. For that purpose, we need slightly
; redefined, more general, random-formula. Instead of repeating
; true and nil two times, we'll repeat it a times, where a is
; real number.


(set 'random-formula
      (lambda(a)
         (let ((r (random 0 (add 4 (mul 2 a)))))
              (cond
                 ((<= 0 r 1) (list 'not (random-formula a)))
                 ((<= 1 r 2) (list 'and (random-formula a) (random-formula a)))
                 ((<= 2 r 3) (list 'or (random-formula a) (random-formula a)))
                 ((<= 3 r 4) (list '-> (random-formula a) (random-formula a)))
                 ((<= 4 r (add 4 a)) true)           ; if a=2 then 
                 ((<= (add 4 a) r (add 4 a a)) nil)))))

(set 'sum 0)
(dotimes(i 100000)
  (inc sum (size-of (random-formula 2))))
(println (div sum 100000)) ; 7.93368 - OK


; Now, my formula for estimation of expected size will get
; this form:
;
;         (1+x) + (1+2x) + (1+2x) + (1+2x) + 2a
;    x = ----------------------------------------
;                         6+2a
;
; If we express a explicitly,
;
;                         3x + 4
;                     a = ------
;                         2(x-1) 
;
; particularly, for x=20, a=1.684210.


(set 'sum 0)
(dotimes(i 100000)
  (inc sum (size-of (random-formula 1.6842105))))
(println (div sum 100000)) ; 19.74605 - not bad either.      


; Finally, for convenience, we'll redefine random-formula so it
; accepts x as variable, not a.


(set 'random-formula
      (lambda(x)
         (letn ((a (div (add (mul x 3) 4) (mul 2 (sub x 1))))
               (r (random 0 (add 4 (mul 2 a)))))
              (cond
                 ((<= 0 r 1) (list 'not (random-formula x)))
                 ((<= 1 r 2) (list 'and (random-formula x) (random-formula x)))
                 ((<= 2 r 3) (list 'or (random-formula x) (random-formula x)))
                 ((<= 3 r 4) (list '-> (random-formula x) (random-formula x)))
                 ((<= 4 r (add 4 a)) true)           ; if a=2 then 
                 ((<= (add 4 a) r (add 4 a a)) nil)))))

(set 'sum 0)
(dotimes(i 100000)
  (inc sum (size-of (random-formula 100))))
(println (div sum 100000)) ; 98.71964

(exit)


---

Lispers on AI, 2009 poll


The results of the poll:

AI is priority. Unrelated features are good, but never on expense of AI.       5  (17%)
AI is one of the priorities, but it should be balanced with other priorities.  8  (28%)
Association with AI harmed Lisp in past, and it should be reduced.             7  (25%)
I have few favorite dialects, and they should develop in different directions. 5  (17%)
Lisp is bad and irreparable.                                                   3  (10%)
I don't know (please answer that rather than nothing.)                         6  (21%)
-------------------------------------------------------------------------------------- 
Votes total:                                                                  28 (100%)





---

EVAL-STRING OR EVAL?


; The problem that could be solved using metaprogramming
; techniques was discussed today on Newlisp forum. Programmer
; newdep proposed integration of unary predicates (the
; functions accepting only one argument and returning true or
; false) and "if" primitive. Instead of, for example, printing
;
;     (if (integer? x) (print "integer!") (print "isn't integer"))
;
; newdep thought about following:
;
;     (if-integer x (print "integer!) (print "isn't integer"))
;
; In further discussion it was noted that such macro would be
; smarter, but less general. Nevertheless, programmer cormullion
; suggested simple macro to accomplish that:

(define-macro (if-integer) 
                (if (integer? (eval (args 0)))
                  (eval (args 1))
                  (eval (args 2))))
                                                       
(if-integer 34.5 (println "yes") (println "no")) ;--> "no"

; As this is routine operation, it can be used as good example
; of the techniqe that allows one to define lot of macros, related
; to all already defined predicates similar to if.
;
; It is already adressed in several posts, but this time, we
; have example of the code that could be processed easier
; in the form of a string, than in the form of s-expression.
; Usually, eval is better than eval-string, and McCarthy actually
; attributed relative success of Lisp to s-expressions, being simpler
; for metaprogramming, compared with normal strings as in POP
; for example.

; We'll first define helper function - almost the metapredicate:
; predicate predicate? Accepting symbol as argument, predicate?
; will check whether symbol ends with "?". Also, the predicate
; must be different than symbol '? itself which serves other
; purposes.

(define (predicate? i)
        (and (ends-with (string i) "?")
             (!= i '?)))

; Now we should loop through all predicates and do what cormullion
; done for integer. First, using strings and eval-string:

(dolist(i (symbols))
  (when (predicate? i)

     (eval-string 
        (replace "integer"

                 (copy "(define-macro (if-integer) 
                                      (if (integer? (eval (args 0)))
                                          (eval (args 1))  
                                          (eval (args 2))))")

                 (append (chop (string i)))))))

(if-zero 0 (println "yes") (println "no")) ;--> "yes"

; Then using s-expressions and eval

(dolist(i (symbols))
  (when (predicate? i) ; for example, i=positive?

    (set (sym (append "if-" (chop (string i)))) ; positive?->if-positive
         (expand (lambda-macro()
                       (if (i (eval (args 0)))
                           (eval (args 1))
                           (eval (args 2))))
                 'i))))

(if-zero 34 (println "yes") (println "no")) ;--> "no"



---

Reader Macros in Newlisp?


Ted WaltherOn Newlisp forum, Ted Walther recently proposed introduction of user defined reader macros in Newlisp. User defined reader macros already exist in Common Lisp and some dialects of Scheme. Built-in, but not user defined reader macros exist in Clojure, and to lesser extent in Scheme and even Pico Lisp. Newlisp has no reader macros at all.

Ted Walther's main reason appears to be ability to use other languages in combination with Newlisp.

Example he posted was:


(add-reader-macro 'c-lang)
(add-reader-macro 'sql-lang)
(add-reader-macro 'html-lang)

(println "hi")

(C 
   printf("foo%d",1);
   i++;
)

(sql
     SELECT * FROM ,(amb "table1" "table2")
)

(html
       <h1>foo</h1>
       ,(generate-some-tables))
) 


My answer on Newlisp forum is republished here.



I think reader macros make writing code (programming) more friendly, but writing code that processes code (metaprogramming) more difficult.

I do not think that "one who does not want to use reader macros shouldn't use them" is enough. What would happen if someone writes useful library that uses reader macros? Programmers will start to use it. Theoretically, we can remember both new syntax and how it translates into normal Lisp, but in reality, we can only remember and understand limited amount of information; especially if we use Newlisp irregularly. So we'd start using that library, and later more and more of Lisp as any other language, without understanding how to process code.

This practice would, in turn, influence future development of the language. Although Lutz Muellermakes decisions, these are made on the base of the perceived problems and discussions by other programmers.

I could admit that finally, it is not known whether metaprogramming is good idea and that I only believe it is good idea. The failure of Lisp dialects to establish itself as a mainstream language is not a good sign, but recently it appears that Ruby users practice metaprogramming a lot. Any case, metaprogramming is differentia specifica of Lisp; if it is bad idea, Lisp itself is failure. And if it is good idea, it is comparative advantage of Lisp that should be promoted and not discouraged.

From my point of view, following syntax looks OK:

(load "c-lang")
(load "sql-lang")
(load "html-lang")

(println "hi")

(C "printf(\"foo%d\",1);
   i++;")

(sql (append "SELECT * FROM" (amb "\"table1\""       "\"table2\"")))

(html (append "<h1>foo<\h1>"
              (generate-some-tables)))

It is understandable why Ted likes his syntax better. Back while I used PLT Scheme, I used °s-expr instead of (println 's-expr "=" s-expr) equivalent. It was good for debugging, it was enough to insert or delete one ° before suspicious expression. Reader macros can be fun. However, Ted can implement his own, private preprocessor - on that way, whole thing is interesting and challenging private experiment, and not a language policy.

(preprocess 
(add-reader-macro 'c-lang)
(add-reader-macro 'sql-lang)
(add-reader-macro 'html-lang)

(println "hi")

(C "printf(\"foo%d\",1);"
   "i++;")) 

Of course, Ted was aware of that, his opinion is that reader macros are generally useful. (I wrote about preprocessing in this blog previously, in posts EVALUATION IN TWO STAGES and MACROCALL.)



---