CMSC 471
Artificial Intelligence -- Fall 2009
HOMEWORK FOUR
out 10/15/09; due 10/29/09
Note: I will be generous about granting extensions
on the Lisp part of this homework if you ask in advance,
since I know you are likely to be studying for the midterm.
To make the grading easier on Denise and me, please submit
the assignment as follows:
- Submit Parts I and II (the written assignments) as
hardcopy in class. No "generous extensions" will be
granted for these sections; the usual rules apply.
- Submit the Lisp code for Part III as "hw4" on the gl machines.
- Submit the commentary for Part III as hardcopy. You do NOT
need to print and submit your Lisp code.
I. Logic (20 points)
(a) Russell & Norvig Exercise 7.11 (a). (5 pts)
(b) Russell & Norvig Exercise 8.6 (a,b,c,i,j). (15 pts)
II. Resolution Theorem Proving (30 points)
(a) (8 points) Represent the following knowledge base in first-order logic.
Use the predicates
- attend(x)
- fail(x,t)
- fair(t)
- pass(x,t)
- prepared(x)
- smart(x)
- study(x)
- umbc-student(x)
where arguments x have the domain of all people, and arguments t
have the domain of all tests.
- Everyone who is smart, studies, and attends class will be prepared.
- Everyone who is prepared will pass any test that is fair.
- A student passes a test if and only if they don't fail it.
- Every UMBC student is smart.
- If a test isn't fair, everyone will fail the test.
- Aidan is a UMBC student.
- Sandy passed the 471 exam.
- Aidan attends class.
(b) (8 points) Convert the KB to conjunctive normal form.
(c) (2 points) We wish to prove that
study(Aidan) -> pass(Aidan, 471-exam)
Express the negation of this goal in conjunctive normal form.
(c) (12 points) Add the negated goal to the KB, and use resolution refutation
to prove that it is true. You may show your proof as a series of sentences
to be added to the KB or as a proof tree. In either case, you must
clearly show which sentences are resolved to produce each new sentence, and
what the unifier is for each resolution step.
III. WalkSAT (50 points)
(a) Lisp Implementation (25 points). Implement the WalkSAT algorithm (shown as pseudocode
in Figure 7.17 in Russell and Norvig) in Lisp. You may
use the partial implementation that I have provided
(see walksat-stubs.lisp
and walksat-utils.lisp);
or you may write your own implementation from scratch;
or you may use part of my code but not all of it.
As explained above, the Lisp code does not need to be
printed out; you just need to submit it on gl.
Also, for this assignment, you are hereby let off the
hook for documentation and good design: I will give
full credit for a working implementation. (And I expect that
most of you will have a working implementation, since otherwise it
will be basically impossible to do the second part.
I implemented everything -- including the utility functions
I'm giving you -- from scratch in less than an hour.)
NOTE: You may, if you wish, work in pairs on
the Lisp implementation. However, the commentary should
be done independently. If you work in pairs, only one
of you needs to submit the code on gl, but you must clearly
indicate on your commentary who you worked with, and which
account the code is submitted under.
(b) Commentary (25 points). This report should be printed out as hardcopy
and submitted in class. It should include two parts:
- Log file showing your code running on the test cases. The file
benchmarks.lisp contains
three simple test cases (*clauses0*, *clauses1*, and
*clauses2*), along with two benchmark problems from
the SATLIB
repository. You should use the "script" command to generate
a log file on the gl machines. Once you've started the script,
you should do the following things:
- Start clisp.
- Load your code (along with the benchmark file and any
other files you need for your code to work).
- Run your walksat algorithm once (or more if needed) on *clauses0*,
*clauses1*, and *clauses2* to show that the code works.
Set the parameters however you wish so that the algorithm
will converge on a solution; if it does not converge the
first time, it's fine to run it more than once.
- Run your walksat algorithm at least three times on each
of the two benchmark problems (*b20* and *b50*) to show that
your implementation works, and to demonstrate the stochastic
nature of the algorithm (i.e., that it generates different
results when rerun).
- Exit clisp.
- Type ^D or "exit" to stop the script.
- Print and turn in the log file (default name: "typescript")
as the first part of your commentary. No remarks are needed
unless you want to clarify anything.
- Analysis of parameters. Pick one (or both)
of the benchmark problems, or any set of one or more benchmark
problems that you'd like to download from the SATLIB
repository. Generate data that will let you answer
experimentally at least one of the following questions:
- How does the value of the probability p affect
the performance (number of steps taken and/or CPU time)
of your implementation, for a fixed-size problem?
(For example, you might set p to 0.0, 0.1, 0.2, ..., 0.9, 1.0;
set max-flips to a large value to ensure convergence,
and collect data on how many steps are needed to solve
*b20* at each probability value.)
- What is an appropriate value of the max-flips parameter
for a particular problem(s) and a particular value of p?
(For example, you might try different max-flips values
(e.g., 50, 100, 150, ...; or a log scale: 16, 32, 64, ...)
with *b50*
to determine the minimum value that gives a 90% probability
of finding a solution. For this question, you would
probably want to do more than 5 runs for each setting so
that you can get a reasonably fine-grained estimate of the probability
of finding a solution. Obviously, if you do only 5 runs,
you won't have a granularity of more than 20%.)
- How does the number of steps vary as the problem size
increases? (For this question, you may want to
look through the benchmarks to get a wider range of problem
sizes than I've included. Another possibility is to take *b50* and "subset"
the clauses to generate a series of successively smaller
problems, with the number of clauses = 50, 100, 150, ....)
Each data point should be the average of at least five
runs of your algorithm. (For example, if you're going
to test your algorithm on *b20* with the parameters
p=0.5 and max-flips=100, you should run this five
times and then compute the average number of steps
across the five runs.)
You should generate at least one graph to show your
results. For example, for question 1, you might have
a scatterplot with p values along the x axis and
number of steps along the y axis. For question
2, you might have a bar plot with max-flips values
on the x axis and probability of solution on the y axis.
You should have at least one paragraph describing the
results and what they tell you about the algorithm and
its parameters.
Please note that this analysis is really just meant to be
a simple, informal study of how the program works. It's
not meant to be a systematic or time-consuming investigation.
Of course, if you get interested in these questions and
want to explore more benchmarks, or multiple questions,
I'm delighted to see that -- but you can get full credit
if you just put together a basic set of results that meets the
guidelines.