
Lectures are recorded:

Meeting 36, Fri Dec 2

Meeting 35, Wed Nov 30

Meeting 34, Mon Nov 28

  • Exam 2 grading histogram
  • Final project schedule of deliverables
  • FPPaper
  • team project work time
  • course evaluation

Meeting 33, Wed Nov 23

  • Prof. Heidi Furey on the ethics of AI in autonomous cars

Meeting 32, Mon Nov 21

  • Prof. Heidi Furey on the ethics of AI in autonomous cars

Meeting 31, Fri Nov 18

Meeting 30, Wed Nov 16

  • Exam 2

Meeting 29, Mon Nov 14

Meeting 28, Wed Nov 9

  • student final project proposal presentations

Meeting 27, Mon Nov 7

  • student final project proposal presentations

Meeting 26, Fri Nov 4

  • if you don't have a team, fill out the single-partner form by 3p today (or form a team and fill out the team form)
  • review of formalization of search problem and MDP problem
    • including goal test or evaluation function
  • FPP has been updated with new requirements for the literature review and references, and explicit detail of what is expected in the problem analysis.
  • in-class exercise: review of MDP paper and of ideas for state-space representation for specific domains.
  • group work time: formulate the state space for your project, and report it here.
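
As a concrete companion to the state-space exercise, here is a minimal sketch of a formalized search problem for a hypothetical 4x4 grid domain (the grid size, action names, and goal cell are all invented for illustration):

```python
# Sketch: formalizing a state space for a toy 4x4 grid domain.
# States are (row, col) tuples; actions move the agent one cell.
GRID = 4
ACTIONS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}

def states():
    """The full state set: every cell of the grid."""
    return [(r, c) for r in range(GRID) for c in range(GRID)]

def successor(state, action):
    """Deterministic transition model: move if in bounds, else stay."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < GRID and 0 <= nc < GRID:
        return (nr, nc)
    return state

def goal_test(state):
    """Goal test, per the formalization reviewed in class."""
    return state == (GRID - 1, GRID - 1)
```

For an MDP version of the same domain, `successor` would return a distribution over next states and a reward would be attached to each transition.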

Meeting 25, Wed Nov 2

Meeting 24, Mon Oct 31

Meeting 23, Fri Oct 28

Meeting 22, Wed Oct 26

  • recap of inference and HMMs
(:youtube OdQoRGRPmj8:)
from 28:20

Meeting 21, Mon Oct 24

  • written Bayes assignment renamed to PS4b
  • PS4c on approximate inference and particle filters assigned
  • in-class exercise: sensor model
  • introduction to particle filtering
(:youtube H0G1yslM5rc:)
(:youtube lwg_KI3UewY:)
(:youtube eWsEyJCVXoo:)
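
A minimal sketch of one particle-filter cycle (predict, weight, resample) on a hypothetical 1-D corridor; the motion noise and sensor weights below are invented for illustration:

```python
import random

def particle_filter_step(particles, move, sensor_reading, walls):
    """One predict/weight/resample cycle on a 1-D corridor.
    particles: list of integer positions; walls: set of positions where
    the (simplified, boolean) sensor reads True."""
    # Predict: apply the motion model with a little random slip.
    moved = [p + move + random.choice([-1, 0, 0, 1]) for p in particles]
    # Weight: particles consistent with the sensor get high weight.
    weights = [0.9 if ((p in walls) == sensor_reading) else 0.1 for p in moved]
    # Resample: draw a new particle set proportional to weight.
    return random.choices(moved, weights=weights, k=len(particles))
```

Repeating this step concentrates the particle cloud on positions consistent with the whole observation history, which is the approximate-inference idea behind PS4c.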

Meeting 20, Fri Oct 21

  • Bayes’ Theorem
“Bayes' Theorem MMB 01” by mattbuck, own work.
Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Bayes%27_Theorem_MMB_01.jpg#mediaviewer/File:Bayes%27_Theorem_MMB_01.jpg
  • concept of HMM and its applicability to robot localization;
from Thrun, Burgard, and Fox, Probabilistic Robotics (2006):
  • HMMs and robot localization (including robot movement)
(:youtube 8mi8z-EnYq8:)
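
A minimal sketch of one step of exact HMM filtering (predict through the transition model, then weight by the sensor model and normalize), in the spirit of the robot-localization discussion; the state names and probabilities below are invented:

```python
def hmm_filter(belief, transition, evidence_likelihood):
    """One step of exact HMM filtering over a finite state set.
    belief: {state: P(state)}; transition: {state: {next: P(next|state)}};
    evidence_likelihood: {state: P(evidence|state)}."""
    # Predict: push the current belief through the transition model.
    predicted = {s: 0.0 for s in belief}
    for s, p in belief.items():
        for s2, t in transition[s].items():
            predicted[s2] += p * t
    # Update: weight by the sensor model and normalize.
    unnorm = {s: predicted[s] * evidence_likelihood[s] for s in predicted}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}
```

For robot localization the states are map cells, the transition model encodes robot movement, and the evidence likelihood comes from the range sensor.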

Meeting 19, Wed Oct 19

  • symbol for conditional independence ⊥ is called “up tack”
  • PS4b Bayes Theorem assigned (due Sunday) – this is a written assignment
  • in-class exercise: Bayes' theorem
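
For reference, a minimal sketch of Bayes' theorem as used in exercises of this style (the helper name and example numbers are invented):

```python
def bayes(prior, likelihood, false_positive_rate):
    """P(H|e) = P(e|H)P(H) / [P(e|H)P(H) + P(e|~H)P(~H)]."""
    num = likelihood * prior
    return num / (num + false_positive_rate * (1 - prior))

# Classic rare-hypothesis example: a 1% prior with a 99%-sensitive,
# 5%-false-positive test still yields only ~17% posterior belief.
posterior = bayes(0.01, 0.99, 0.05)
```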

Meeting 18, Mon Oct 17

Meeting 17, Fri Oct 14

  • epsilon-greedy random exploration
    • change ϵ over time?
    • other ideas for exploration?
    • impact of exploring on final computed utilities?
    • python crawler.py
  • Approximate Q-learning (from slide 20 of Berkeley Lecture 11 slides)
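
A minimal sketch of epsilon-greedy selection, plus one possible answer to "change ϵ over time?" (exponential decay); the function names and constants are invented:

```python
import random

def epsilon_greedy(q_values, actions, state, epsilon):
    """With probability epsilon explore uniformly; otherwise exploit
    the action with the highest current Q-value."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))

def decayed_epsilon(t, start=1.0, floor=0.05, rate=0.99):
    """Decay exploration over time, but keep a small floor so the
    agent never stops exploring entirely."""
    return max(floor, start * rate ** t)
```

Because exploration keeps injecting suboptimal actions, the utilities computed while exploring differ from those of the greedy policy; decaying ϵ is one way to shrink that gap over time.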

Meeting 16, Wed Oct 12

Meeting 15, Tue Oct 11

  • exams and solutions handed back
  • discussion of PS3a problems Q2 and Q3
  • PS3b assigned; due Sun Oct 16
  • what if you don't know T and R?
answer: interact directly with the world and learn them.
  • value iteration used the Bellman equation
  • now let's adapt it for “temporal difference learning”
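
A minimal sketch of the TD(0) update that adapts the Bellman idea to samples, V(s) ← V(s) + α[r + γ·V(s') − V(s)]; the function name and default parameters are invented:

```python
def td_update(V, s, reward, s_next, alpha=0.1, gamma=0.9):
    """Temporal-difference learning: nudge V(s) toward the sampled
    Bellman target r + gamma * V(s'), without knowing T or R."""
    v = V.get(s, 0.0)
    sample = reward + gamma * V.get(s_next, 0.0)
    V[s] = v + alpha * (sample - v)
```

Unlike value iteration, this uses only observed transitions, so it works when the transition and reward models are unknown.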

Exam 1 results

maximum possible 46
high 44; median 33

Meeting 14, Fri Oct 7

Meeting 13, Wed Oct 5

  • Echo360 lecture cap usage poll http://pollev.com/fredm
  • things that aren't MDPs: people; relationships; anything where history has value (the Markov property fails)
  • an MDP is analogous to control theory in engineering:
the entire state of the system is represented as a state vector,
which may include rate variables (like velocity or acceleration)
  • are MDPs observable?
the ones we are working with now are;
in two weeks we will look at MDPs with hidden variables: partially observable MDPs, or POMDPs
  • Some real-world applications of MDPs:
    • population harvesting
    • agriculture
    • water resources
    • inspection, maintenance, and repair
    • purchasing, inventory, and production
    • finance and investment; queues
    • sales promotion
    • search
    • motor insurance claims
    • overbooking
    • epidemics
    • credit
    • sports
    • patient admissions
    • location
    • design of experiments
    • others.
D. J. White (1993), “A survey of applications of Markov decision processes,” Journal of the Operational Research Society, 44(11), pp. 1073–1096. Attach:MDPApplications3.pdf
  • additive vs. multiplicative discounting
  • in-class exercises discount.pdf
  • demo of MDP game: python gridworld.py -m
  • Q-values are values of state-action pairs
  • value iteration formula
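
A minimal sketch of the value iteration formula, V(s) ← max_a Σ_s' P(s'|s,a)[R(s,a,s') + γ·V(s')]; the data-structure conventions below are invented for illustration:

```python
def value_iteration(states, actions, T, R, gamma=0.9, iters=100):
    """Batch value iteration.
    actions(s) -> available actions; T[(s, a)] -> list of (s2, prob);
    R[(s, a, s2)] -> reward for that transition.
    The inner sum, for a fixed (s, a), is exactly the Q-value Q(s, a)."""
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V = {
            s: max(
                sum(p * (R[(s, a, s2)] + gamma * V[s2]) for s2, p in T[(s, a)])
                for a in actions(s)
            )
            for s in states
        }
    return V
```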

Meeting 12, Mon Oct 3

  • PS3a assigned; due Mon Oct 10 (Mon is a holiday)
  • next week, we meet Tue, Wed, and Fri
  • Markov Decision Processes (MDPs), aka “sequential decision problems” (Ch 17)
    • a set of states (with initial state s0);
    • a set ACTIONS(s) of actions in each state;
    • a transition model P(s'|s,a);
    • a reward function R(s)—sometimes, reward-for-state-transition: R(s, a, s')
  • the task is to find a policy π that recommends an action for each state, i.e., π(s)
  • the optimal policy (the one that produces the highest expected reward) is π*
  • “A policy represents the agent function explicitly and is therefore a description of a simple reflex agent.”
  • in-class exercises: gridworld.pdf
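
The components above can be written down directly; a minimal sketch (the names are invented), including a policy as a plain lookup table, which is exactly why following one makes a simple reflex agent:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class MDP:
    states: List[str]                                   # set of states (s0 first)
    actions: Dict[str, List[str]]                       # ACTIONS(s)
    P: Dict[Tuple[str, str], List[Tuple[str, float]]]   # P(s'|s,a) as (s', prob) pairs
    R: Dict[str, float]                                 # R(s)

# A policy is a lookup table state -> action, so an agent that follows
# it is a simple reflex agent: percept in, table lookup, action out.
def reflex_agent(policy: Dict[str, str], state: str) -> str:
    return policy[state]
```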

Meeting 11: Wed Sep 28

  • No office hours tomorrow; Friday
  • policy: no posting of solution code
  • Material that Exam 1 may cover
  • consistency (and an example of an inconsistent heuristic)
  • continuing min-max.pdf: do you need to expand all nodes?
  • Alpha-beta simulation (similar to Fig 5.5)
  • Alpha-beta pruning: similar to minimax, but “prunes away branches that cannot possibly influence the final decision.”
    • α — the value of the best (highest-value) choice found so far along the path for MAX
    • β — the value of the best (lowest-value) choice found so far along the path for MIN
“Alpha–beta search updates the values of α and β as it goes along and prunes the remaining branches at a node (i.e., terminates the recursive call) as soon as the value of the current node is known to be worse than the current α or β value for MAX or MIN, respectively.”
  • Let’s go over odd (but explainable) behavior of Pacman with fixed search depths
    • Why doesn't it eat the food? python pacman.py -p MinimaxAgent -l minimaxClassic -a depth=4
    • Why does it rush the ghost and die quickly if depth=3, but not if depth=1? python pacman.py -p MinimaxAgent -l trappedClassic -a depth=3
  • expectimax
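
A minimal sketch of alpha-beta pruning on an abstract game tree, following the quoted description; the tree interface (children/value callbacks) is invented for illustration:

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, value):
    """Minimax with alpha-beta pruning.
    children(node) -> list of successors; value(node) -> leaf utility."""
    kids = children(node)
    if depth == 0 or not kids:
        return value(node)
    if maximizing:
        best = -math.inf
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, children, value))
            alpha = max(alpha, best)
            if alpha >= beta:   # remaining branches cannot influence the decision
                break
        return best
    best = math.inf
    for child in kids:
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, children, value))
        beta = min(beta, best)
        if alpha >= beta:       # prune
            break
    return best
```

On the standard two-ply example (leaves 3, 12 / 2, 14), pruning skips the 14 leaf: once MIN's right node is known to be ≤ 2, it cannot beat MAX's 3.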

Meeting 10: Mon Sep 26

  • reflections on PS1c
  • Exam 1 this Friday; may include everything through Fri Sep 23
  • PS2 on multi-agent search assigned; due next Sun
  • zero-sum games—do they represent the world?
  • min-max.pdf
  • Fig 5.3 from book (p. 166) on min-max algorithm
  • evaluation functions—in general; for PS2
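
For PS2-style agents, an evaluation function is commonly a weighted sum of hand-chosen features; a minimal sketch with invented features and weights:

```python
def evaluate(features, weights):
    """Linear evaluation: a weighted sum of hand-chosen features."""
    return sum(weights[f] * v for f, v in features.items())

# Hypothetical Pacman-ish features: score is good, being far from food
# is bad, being far from a ghost is good.
state_features = {"score": 10.0, "dist_to_food": 2.0, "dist_to_ghost": 1.0}
state_weights = {"score": 1.0, "dist_to_food": -0.5, "dist_to_ghost": 2.0}
```

The art is entirely in choosing features and weights; the evaluation machinery itself stays this simple.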

Meeting 9: Fri Sep 23

  • more about admissibility—NEVER OVERESTIMATE
  • astar-questions.pdf
  • creating heuristics: use a relaxed problem
let's look at sliding-tile puzzle
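
A minimal sketch of the relaxed-problem idea for the sliding-tile puzzle: if tiles could slide through each other, the exact solution cost of that relaxed problem is the summed Manhattan distance, which is therefore admissible for the real puzzle:

```python
def manhattan(state, goal, size=3):
    """Sum of Manhattan distances of each tile from its goal cell,
    ignoring the blank (0). state/goal: tuples in row-major order."""
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        j = goal.index(tile)
        total += abs(i // size - j // size) + abs(i % size - j % size)
    return total
```

A coarser relaxation (tiles teleport to their goal cell) gives the misplaced-tiles count, which is also admissible but weaker.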

Meeting 8: Wed Sep 21

  • Welcome our TA, Pengfei Zhang.
    His office hours are on Thursdays from 2:30p to 4:30p in Olsen 307
  • PS1c due date is extended to Sun Sep 25
    • let's look at it
  • today: A* (pronounced “a star”)
    • a best-first search
    • like UCS, except that instead of f(n) = g(n),
    • it uses f(n) = g(n) + h(n)
    • heuristic: an estimate, but not a guess. If “admissible”:
      a mathematically provable lower bound on the actual path cost.
    • a guarantee that a solution with cost lower than the reported value is not possible
    • an admissible heuristic may underestimate cost (report a value lower than what's achievable),
      but it cannot overestimate (make a node on the frontier seem worse than it is)
    • A* search proceeds like a contour map (Fig. 3.25)
    • A* search proceeds like a contour map (Fig. 3.25)
  • In util.py, use PriorityQueueWithFunction
    • make sure you are pushing Node objects into the queue
    • provide a function that implements the f(n) = g(n) + h(n) operation
  • in-class exercises: astar-cost-search.pdf astar-questions.pdf
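
A self-contained sketch of the priority-queue-with-function idea (not the actual util.py code) together with an f(n) = g(n) + h(n) priority function; the node representation and heuristic values below are invented:

```python
import heapq

class PriorityQueueWithFunction:
    """Sketch of the util.py idea: a queue ordered by priority_fn(item)."""
    def __init__(self, priority_fn):
        self.fn, self.heap, self.n = priority_fn, [], 0
    def push(self, item):
        self.n += 1  # insertion counter breaks priority ties
        heapq.heappush(self.heap, (self.fn(item), self.n, item))
    def pop(self):
        return heapq.heappop(self.heap)[2]

# For A*, the priority is f(n) = g(n) + h(n). Here a "node" is just a
# (state, path_cost) pair and h is a hypothetical heuristic table.
h = {"A": 3, "B": 1, "goal": 0}
frontier = PriorityQueueWithFunction(lambda node: node[1] + h[node[0]])
```

With f(n) = g(n) alone this queue would give UCS; adding h(n) is the entire difference.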

Meeting 7: Mon Sep 19

If you did it your own way, kudos to you.
And you should now do the graph-search method with node.py and util.py, because you will need it for the rest of PS1c.

Meeting 6: Fri Sep 16

  • tree search

Meeting 5: Wed Sep 14

  • example of very strong PS0c
  • new pair programming policy for PS1b
  • trees vs. graphs
  • states vs. nodes
    • state is a configuration of the world
    • node is an internal data structure of the search process, which includes:
      • present state
      • prior (parent) node
      • action taken to get from parent to self
      • accumulated path cost
  • graph search algorithm
  • in-class exercises: graph-search.pdf
  • util.py
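
A minimal sketch of the node structure described above (class and field names invented; PS1's node.py may differ):

```python
class Node:
    """Search node: a state plus the search bookkeeping around it."""
    def __init__(self, state, parent=None, action=None, path_cost=0):
        self.state = state          # configuration of the world
        self.parent = parent        # prior (parent) node
        self.action = action        # action taken from parent to self
        self.path_cost = path_cost  # accumulated path cost, g(n)

    def path(self):
        """Recover the action sequence by walking parent links."""
        node, actions = self, []
        while node.parent is not None:
            actions.append(node.action)
            node = node.parent
        return list(reversed(actions))
```

This separation is the trees-vs-graphs point in practice: many nodes can share one state, but each node records its own path to it.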

Meeting 4: Mon Sep 12

  • deterministic vs. stochastic
    • in backgammon, you roll dice first and then move. The moves always succeed. Hence, deterministic.
    • in Risk, you attack, then roll dice. The attack succeeds or fails partly based on the dice roll. Hence, stochastic.
    • after-class conversation: worlds with the backgammon-like randomness can be characterized as deterministic and dynamic. The dice roll is an example of the world changing between your turns.
  • fully vs. partially observable
    • Rush Hour is fully observable—all moves available from any state are disclosed; no information is hidden
    • Battleship is partially observable—you learn/discover where your opponent’s battleships are as you attack them
  • in-class exercises: search.pdf
  • search trees: paths from state to state ending at goal
  • search proceeds by expanding a state, which generates a set of successor states, and then expanding them until goal is reached
  • DFS vs. BFS:
    • how do they work
    • what are their properties (space, time, optimality, completeness)
  • Pac-Man search problem: find shortest path to the food (demo)
  • diving into the code and what you have to do
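
A minimal sketch showing that BFS and DFS differ only in the frontier discipline (FIFO queue vs. LIFO stack); the graph encoding is invented:

```python
from collections import deque

def graph_search(start, goal_test, successors, frontier_is_queue=True):
    """BFS when the frontier is a FIFO queue, DFS when it is a LIFO stack."""
    frontier = deque([[start]])   # frontier holds whole paths
    explored = set()
    while frontier:
        path = frontier.popleft() if frontier_is_queue else frontier.pop()
        state = path[-1]
        if goal_test(state):
            return path
        if state in explored:
            continue
        explored.add(state)
        for s2 in successors(state):
            frontier.append(path + [s2])
    return None
```

With the FIFO frontier this returns a shallowest solution (BFS is complete, and optimal for unit step costs); the LIFO version commits to deep paths first, using less frontier memory but with no optimality guarantee.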

Meeting 3: Fri Sep 9

  • play Rush Hour online Android iOS
  • worksheet on traffic game rushhour.pdf
  • discussion of uninformed search problem configuration:
    • set of states
    • initial state
    • set of actions
    • transition model
    • step cost
    • goal test
  • what are the properties of an environment that is best solved by a search agent?
    • fully observable
    • deterministic
    • discrete
    • static
    • sequential

Meeting 2: Wed Sep 7

  • Poll: Eight Perspectives on AI
  • Course grading rubric adjusted (please review Home).
  • You are responsible for everything in the syllabus, even if I haven't discussed it in class yet
  • Attendance policy / lecture recordings
  • Final exam is scheduled: Fri Dec 16 from 3p to 6p
  • Email list is set to immediate delivery (you can change later)
  • R&N agent model: environment properties Attach:environments.pdf

Meeting 1: Fri Sep 2

  • What do you want to do with AI? What opportunities will you create / what problems will you solve?
  • Sketch of the semester—problem sets then term project (done in a team)
  • Grading categories
  • 5-agent model of AIMA in-class exercises
  • Demo of Bottlenose