Home Assignments Lecture Blog Resources Project Discussion Group

Exam 1 is Mon Oct 20
Exam 2 is Mon Nov 17
Final is Mon Dec 15 at 8a

Echo360 lecture recordings are at

Whiteboard photos are at

Final Exam results
73 points available

Homework results
this includes the UCB programming assignments and all the written assignments,
but not final project work
median was 1012; mean was 895 (max possible 1100)

updated Exam 2 results
25 pts available

updated Exam 1 results
32 pts available

Final exam study notes: fall2014-ai-bok.txt

Meeting 40, Wed Dec 10

  • Project demos in 3rd Floor Elevator Lobby

Meeting 39, Mon Dec 8

  • project demos canceled; informal conversations

Meeting 38, Fri Dec 5

  • please have your Project stubbed in by Sunday night
    • all project web pages are due 8a Monday morning
    • project demonstrations will be held in CS Dept 3rd Floor Elevator Lobby
    • bring your laptop running your code to your presentation (if neither partner has a laptop—arrange to share one with a classmate)
    • bring one printed copy of your web page to your demo
    • prepare to present your project in about 3 minutes
    • attendance at your presentation is mandatory
    • attendance at demonstrations on the day you're not presenting is optional
  • introduction to particle filtering
(:youtube H0G1yslM5rc:)
(:youtube lwg_KI3UewY:)
(:youtube eWsEyJCVXoo:)
  • Final exam review
    • you may bring 3 sheets of double-sided handwritten notes to the final
    • there will be check-boxes to replace Exam 1 and/or Exam 2 scores with score on Final
    • all Exam 1 and Exam 2 material, plus:
    • basic Bayes net probability calculations; e.g. Q1 of UCB Fall 2012 midterm
    • being able to calculate probability distributions given emission and movement probability models (the calculations done in PS4a). See slides 18–32 of UCB CS188 Lecture 14 Notes
  • course evaluation

Meeting 37, Wed Dec 3

  • Exam 2 handed back. Mean was 8.8.
  • final exam policy: you can elect to use final exam scores in lieu of your Exam 1 and/or Exam 2 scores
  • Q-learning update equation (from UCB Lecture 11 slides, as posted):

Meeting 36, Mon Dec 1

Meeting 35, Mon Nov 24

Meeting 34, Fri Nov 21

  • HMMs and robot localization (including robot movement)
(:youtube 8mi8z-EnYq8:)

Meeting 33, Wed Nov 19

Meeting 32, Mon Nov 17

  • Exam 2 administered

Meeting 31, Fri Nov 14

  • For exam 2, make sure you can perform value iteration and Q-learning algorithms on various MDPs.
  • You are allowed one 8.5x11" page of handwritten notes (double-sided OK)
  • Final project presentations

Meeting 30, Wed Nov 12

Meeting 29, Mon Nov 10

Meeting 28, Fri Nov 7

  • Exam 1 handed back and discussed
  • brief introduction to ghostbusters problem set

Meeting 27, Wed Nov 5

Meeting 26, Mon Nov 3

“Bayes' Theorem MMB 01” by mattbuck (category) - Own work by mattbuck..
Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Bayes%27_Theorem_MMB_01.jpg#mediaviewer/File:Bayes%27_Theorem_MMB_01.jpg

Meeting 25, Fri Oct 31

Meeting 24, Wed Oct 29

Meeting 23, Mon Oct 27

  • PS3 Q6: bridge crossing revisited—are there values of ε (exploration rate) and α (learning rate) that will cause it to learn the bridge 99% of the time? Remember that
    • ε = 1 means always explore; ε = 0 is always exploit
    • α = 1 means always learn (and forget the past); α = 0 never learn
  • PS3 Q7: approximate Q-learning I (direct states)
  • PS3 Q8: approximate Q-learning II (learning feature weights)
  • see Lecture 11: Reinforcement Learning II Berkeley slides

Meeting 22, Fri Oct 24

  • go over value iteration solution (Jeremy)
  • Q-learning: you don't know T or R, but must discover by exploration
  • demo of working Q-learning on gridworld
  • demo of working Q-learning on crawler
  • see Lecture 11: Reinforcement Learning II Berkeley slides

Meeting 21, Wed Oct 22

Meeting 20, Mon Oct 20

  • Exam 1 administered

Meeting 19, Fri Oct 17

  • you may bring one page of 8.5" x 11" paper to the exam, which may have handwritten notes double-sided.
  • exam will cover agents and environments, un-informed and informed search, and adversarial search, specifically:
    • Ch 2.1 2.3 2.4
    • Ch 3.1 3.2 3.3 3.4 3.5 3.6
    • Ch 5 - minimax, alpha/beta pruning, expectimax
  • went over problems from Berkeley Fall 2013 midterm 1.

Meeting 18, Wed Oct 15

  • Analyzing an Agent assignment returned. Some feedback:
    • A few people didn't realize that the assignment was to pick a specific agent and its environment and analyze it. I did not want a generic discussion of the material in the chapter.
    • More people did discuss observability, sequential vs. episodic, etc. but did not describe the internal type of their agent (e.g. reflex agent, goal-based agent, etc.)
    • Some people did not make a clear distinction between the agent and its environment.
Other notes:
  • On observability: in general, the real world is only partially observable. If you have sensors trying to ascertain the agent's state in the world, the sensors might have errors or simply might not be sensing something that's actually important. Typically, only things like games or computer simulations can be fully observable (e.g., chess)
  • Full discussion of this

Meeting 17, Fri Oct 10

Meeting 16, Wed Oct 8

  • Questions 1 through 3 of http://ai.berkeley.edu/reinforcement.html will be due next Wednesday.
  • We're learning to be a reflex agent by interacting with an MDP (which is a model of the world). Note: we're not actually interacting with the world itself, we're only interacting with the given model!
  • This will result in a policy—a choice of action for a given state.
  • Then we play in the world.
  • Some real-world applications of MDPs:
    • population harvesting
    • agriculture
    • water resources
    • inspection, maintenance, and repair
    • purchasing, inventory, and production
    • finance and investment; queues
    • sales promotion
    • search
    • motor insurance claims
    • overbooking
    • epidemics
    • credit
    • sports
    • patient admissions
    • location
    • design of experiments
    • others.
(1993) “A survey of applications of markov decision processes,” D. J. White, Journal of the Operational Research Society, 44(11), p. 1073–1096. Attach:MDPApplications3.pdf

Meeting 15, Mon Oct 6

  • Markov Decision Processes (MDPs), aka “sequential decision problems” (Ch 17)
    • a set of states (with initial state s0);
    • a set ACTIONS(s) of actions in each state;
    • a transition model P(s'|s,a);
    • a reward function R(s).
  • task is to find a policy π which recommends the action for each state; i.e. π(s)
  • the optimal policy (one which produces the largest reward) is π*
  • “A policy represents the agent function explicitly and is therefore a description of a simple reflex agent.”

Meeting 14, Fri Oct 3

  • let's carry out the Fig 5.3 minimax algorithm on the Fig 5.2 game tree (it's a mutually-recursive algorithm)
  • O(b^m) time complexity, O(bm) space complexity (not very practical directly in time)
  • Alpha-beta pruning: similar to minimax, but “prunes away branches that cannot possibly influence the final decision.”
    • α — value along best path (highest) choice for MAX
    • β — value along best path for MIN
“Alpha–beta search updates the values of α and β as it goes along and prunes the remaining branches at a node (i.e., terminates the recursive call) as soon as the value of the current node is known to be worse than the current α or β value for MAX or MIN, respectively.”
  • Live alpha-beta simulation (similar to Fig 5.5)
  • Let’s go over odd (but explainable) behavior of Pacman with fixed search depths
    • Why doesn't it eat the food? python pacman.py -p MinimaxAgent -l minimaxClassic -a depth=4
    • Why does it rush the ghost and die quickly if depth=3, but not if depth=1? python pacman.py -p MinimaxAgent -l trappedClassic -a depth=3

Meeting 13, Wed Oct 1

let's look at evaluation functions for reflex agent
  • formal def'n of minimax (Fig. 5.3), with
V(s) = max V(s') where s' is-element-of successor(s)
  • with PS, you have multiple adversaries (each ghost), each picking the thing that's worst for you
  • alpha-beta pruning

Meeting 12, Mon Sep 29

  • this is how we do graph-search for UCB PS:

Meeting 11, Fri Sep 26

Meeting 10, Wed Sep 24

  • h(n) must be admissible — never overestimates. (Never lets you believe that a good path is bad.)
  • admissible heuristics are optimistic — but that's OK.
  • consistent — admissible heuristic that follows triangle-inequality:
h(n) <= c(n, a, n') + h(n')
  • A* uses same algorithm as UCS, except g + h instead of g
  • let's look at heuristics for queens-problem, sliding tile problem
  • a dominating heuristic — strictly better
“It is generally better to use a heuristic function with higher values, provided it is consistent and that the computation time for the heuristic is not too long.”
  • relaxing a problem — a good way to generate a heuristic
“A problem with fewer restrictions on the actions is called a relaxed problem. The state-space graph of the relaxed problem is a supergraph of the original state space because the removal of restrictions creates added edges in the graph.”
  • now, heuristics for PS

Meeting 9, Mon Sep 22

  • https://grader.cs.uml.edu/assignments/384 extended to Friday
  • Manhattan distance (for Pac-Mac) and Straight Line Distance are good ways of knowing if you're closer to the goal for Pac-Man and city travel problems
  • discussion of sliding tile, 8-queen, and Knuth problems: how do you know if you're closer to the goal?
  • these are heuristics
  • informed search: greedy best-first search, f(n) = h(n)
  • showing how GBFS operates on Romanian map — finds an answer, but not the best one
  • A* search, f(n) = g(n) + h(n)
  • problem set: looked at different cost functions and how UCS can accept "stay east" or "stay west" cost functions

Meeting 8, Fri Sep 19

  • recap of BFS and DFS
  • other search types: DLS, IDS, UCS
  • time/space complexity of UCS based on worst-case optional solution C* and action cost ϵ
  • UCS example with Fig 3.15
  • what about if you know something about where is the solution?

Meeting 7, Wed Sep 17

  • DFS for tree search -- space complexity -- answer is that it should check new states against those already on the current path! this will avoid infinite loops in finite spaces
  • DFS -- optimality
  • what do you need in node data structure?
  • other search types
  • Berkeley full search problem set assigned https://grader.cs.uml.edu/assignments/384 due next Wednesday

Meeting 6, Mon Sep 15

  • We talked about BFS, but assignment is DFS
  • recap of properties of BFS
  • let's calculate time and space
  • what's different if we're doing DFS?
  • calculate/talk about time, space, optimality, completeness
  • let's look at how the code works
  • how does DFS deal with graph-search from a memory standpoint — does it need to have the explored-set (which would kill memory efficiency)?

Meeting 5, Fri Sep 12

  • interest in psychological / cognitive AI?
  • 4 properties of search: optimality, completeness, time performance, space performance
    • BTW infinite state spaces exist, e.g., in “recursively defined objects”
  • let's use Romania map (Fig 3.2) to understand tree (and then graph) algorithm (Fig 3.7)
  • what are properties of breadth-first search (BFS)?
  • pacman code

Meeting 4, Wed Sep 10

  • Turn in AI environment analysis HW and search problem formulation HW
  • Exam 1 is Mon Oct 20; Exam 2 is Mon Nov 17
  • Poll & discussion on AI agent
  • Fig 3.1: pseudocode for problem-solving agent
  • Poll & discussion on game / search problem formulation
  • demo of Berkeley Pac-Man search code
  • Fig 3.7: tree search
  • Fig 3.10: node data structure
  • PS1 Depth First Search assigned, due in one week.

Meeting 3, Mon Sep 8

  • download one of these two games: Traffic Hour Free (Android) or Rush Hour Free (iOS)
  • Foundations of AI assignment handed back
  • Poll on your answers to this
  • Let's play Rush Hour
  • What sort of environment is amendable to search?
    • static
    • observable
    • discrete
    • deterministic
  • Search problem formulation and discussion:
    1. initial state
    2. actions
    3. transition model
    4. goal test
    5. path cost
  • Formulating a Board Game into a Search Problem due Wed Sep 8

Meeting 2, Fri Sep 5

  • written homework due in class
  • please register now for the Google group! http://groups.google.com/group/uml-ai-fall14
  • don't email me; email the group with problems (I will answer them if no one else does)
  • if you do email me, tell me which class you are in!
  • questions about class policies?
  • What is rationality?
  • PEAS: performance, environment, actuators, sensors
  • Properties of task environments: fully observable or partially observable; single agent or multi-agent (and the relationship among agents, if multi-agent); deterministic or stochastic; episodic or sequential; static or dynamic; discrete or continuous; known or unknown.
  • Types of agents: simple reflex, model-based, goal-based, utility-based.
  • Analyzing an Agent assignment https://grader.cs.uml.edu/assignments/375 due Wed Sep 8
  • Before next class meeting, read Ch 3.1 “Problem-Solving Agents” and Ch 3.1.1 “Well-defined problems and solutions”

Meeting 1, Wed Sep 3