Exam 1 is Mon Oct 20; Exam 2 is Mon Nov 17; Final is Mon Dec 15 at 8a
Echo360 lecture recordings are at http://echo360.uml.edu/martin201415/artintellfall.html
Whiteboard photos are at https://plus.google.com/u/0/114049637113235097633/posts/J5yftmyoW6t
Final Exam results: 73 points available
Homework results: includes the UCB programming assignments and all the written assignments, but not final project work; median was 1012, mean was 895 (max possible 1100)
updated Exam 2 results: 25 pts available
updated Exam 1 results: 32 pts available
Final exam study notes: fall2014aibok.txt
Meeting 40, Wed Dec 10
 Project demos in 3rd Floor Elevator Lobby
Meeting 39, Mon Dec 8
 project demos canceled; informal conversations
Meeting 38, Fri Dec 5
 please have your Project stubbed in by Sunday night
 all project web pages are due 8a Monday morning
 project demonstrations will be held in CS Dept 3rd Floor Elevator Lobby
 bring your laptop running your code to your presentation (if neither partner has a laptop, arrange to share one with a classmate)
 bring one printed copy of your web page to your demo
 prepare to present your project in about 3 minutes
 attendance at your presentation is mandatory
 attendance at demonstrations on the day you're not presenting is optional
 introduction to particle filtering
 Final exam review
 you may bring 3 sheets of double-sided handwritten notes to the final
 there will be checkboxes to replace Exam 1 and/or Exam 2 scores with score on Final
 all Exam 1 and Exam 2 material, plus:
 basic Bayes net probability calculations; e.g. Q1 of UCB Fall 2012 midterm
 being able to calculate probability distributions given emission and movement probability models (the calculations done in PS4a). See slides 18–32 of UCB CS188 Lecture 14 Notes
 course evaluation
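The PS4a-style calculation above (elapse time with the movement model, then weight by the emission model and renormalize) can be roughly sketched like this; the two-cell world and all names here are illustrative, not the actual problem-set code:

```python
# Rough sketch of the PS4a-style HMM calculation (illustrative, not the actual
# problem-set code): update a belief over positions with the movement model,
# then weight by the emission model and renormalize.

def elapse_time(belief, move_prob):
    """move_prob[s][s2] = P(next state = s2 | current state = s)."""
    new_belief = {s2: 0.0 for s2 in belief}
    for s, p in belief.items():
        for s2, t in move_prob[s].items():
            new_belief[s2] += p * t
    return new_belief

def observe(belief, emission_prob, reading):
    """emission_prob[s][reading] = P(reading | state = s); then renormalize."""
    weighted = {s: p * emission_prob[s][reading] for s, p in belief.items()}
    total = sum(weighted.values())
    return {s: w / total for s, w in weighted.items()}

# Toy two-cell world: the robot is in cell A or cell B.
belief = {"A": 0.5, "B": 0.5}
move = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.3, "B": 0.7}}
emit = {"A": {"ping": 0.9, "none": 0.1}, "B": {"ping": 0.2, "none": 0.8}}

belief = elapse_time(belief, move)      # symmetric model, so still 50/50
belief = observe(belief, emit, "ping")  # "ping" is more likely in A
```

After the observation, the belief shifts toward A (0.45/0.55 ≈ 0.82), which is exactly the kind of calculation in slides 18–32 of the CS188 Lecture 14 notes.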
Meeting 37, Wed Dec 3
 Exam 2 handed back. Mean was 8.8.
 final exam policy: you can elect to use final exam scores in lieu of your Exam 1 and/or Exam 2 scores
 Q-learning update equation (from UCB Lecture 11 slides, as posted): Q(s,a) ← (1 − α)·Q(s,a) + α·(R(s,a,s′) + γ·max over a′ of Q(s′,a′))
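That update is the standard sample-based one; a rough sketch in Python (the dict-based Q-table and the state/action names here are illustrative, not the project code):

```python
# Rough sketch of the standard sample-based Q-learning update (illustrative):
#   Q(s,a) <- (1 - alpha) * Q(s,a) + alpha * (r + gamma * max_a' Q(s',a'))

def q_update(Q, s, a, r, s_next, actions, alpha, gamma):
    sample = r + gamma * max(Q.get((s_next, a2), 0.0) for a2 in actions)
    Q[(s, a)] = (1 - alpha) * Q.get((s, a), 0.0) + alpha * sample
    return Q

Q = {}
q_update(Q, s="A", a="right", r=1.0, s_next="B",
         actions=["left", "right"], alpha=0.5, gamma=0.9)
# one update from zero: Q[("A", "right")] == 0.5 * 0.0 + 0.5 * 1.0 == 0.5
```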
Meeting 36, Mon Dec 1
Meeting 35, Mon Nov 24
Meeting 34, Fri Nov 21
 HMMs and robot localization (including robot movement)
Meeting 33, Wed Nov 19
Meeting 32, Mon Nov 17
Meeting 31, Fri Nov 14
 For exam 2, make sure you can perform the value iteration and Q-learning algorithms on various MDPs.
 You are allowed one 8.5x11" page of handwritten notes (double-sided OK)
 Final project presentations
Meeting 30, Wed Nov 12
 FPP assignment now requires a slide presentation (due next class)
 possible partner reshuffling after midnight tonight
 Let's review MDPs and Berkeley exam problems:
Meeting 29, Mon Nov 10
Meeting 28, Fri Nov 7
 Exam 1 handed back and discussed
 brief introduction to ghostbusters problem set
Meeting 27, Wed Nov 5
Meeting 26, Mon Nov 3
Meeting 25, Fri Oct 31
Meeting 24, Wed Oct 29
Meeting 23, Mon Oct 27
 PS3 Q6: bridge crossing revisited—are there values of ε (exploration rate) and α (learning rate) that will cause it to learn the bridge 99% of the time? Remember that
 ε = 1 means always explore; ε = 0 is always exploit
 α = 1 means always learn (and forget the past); α = 0 never learn
 PS3 Q7: approximate Q-learning I (direct states)
 PS3 Q8: approximate Q-learning II (learning feature weights)
 see Lecture 11: Reinforcement Learning II Berkeley slides
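The ε knob above can be sketched as epsilon-greedy action selection (α then controls how much each observed sample overwrites the old Q-value in the update itself); all names here are illustrative, not the PS3 code:

```python
import random

# Sketch of epsilon-greedy action selection (illustrative, not the PS3 code):
# with probability epsilon, explore (random action); otherwise exploit the best
# known Q-value. epsilon = 1 means always explore; epsilon = 0, always exploit.

def epsilon_greedy(Q, s, actions, epsilon, rng=random):
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: Q.get((s, a), 0.0))

Q = {("start", "east"): 1.0, ("start", "west"): 0.0}
a = epsilon_greedy(Q, "start", ["east", "west"], epsilon=0.0)  # always exploits
```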
Meeting 22, Fri Oct 24
 go over value iteration solution (Jeremy)
 Q-learning: you don't know T or R, but must discover them by exploration
 demo of working Q-learning on gridworld
 demo of working Q-learning on crawler
 see Lecture 11: Reinforcement Learning II Berkeley slides
Meeting 21, Wed Oct 22
Meeting 20, Mon Oct 20
Meeting 19, Fri Oct 17
 you may bring one 8.5" x 11" page of handwritten notes to the exam (double-sided OK).
 exam will cover agents and environments, uninformed and informed search, and adversarial search, specifically:
 Ch 2.1, 2.3, 2.4
 Ch 3.1, 3.2, 3.3, 3.4, 3.5, 3.6
 Ch 5: minimax, alpha/beta pruning, expectimax
 went over problems from Berkeley Fall 2013 midterm 1.
Meeting 18, Wed Oct 15
 Analyzing an Agent assignment returned. Some feedback:
 A few people didn't realize that the assignment was to pick a specific agent and its environment and analyze it. I did not want a generic discussion of the material in the chapter.
 Most people did discuss observability, sequential vs. episodic, etc., but did not describe the internal type of their agent (e.g. reflex agent, goal-based agent, etc.)
 Some people did not make a clear distinction between the agent and its environment.
Other notes:
 On observability: in general, the real world is only partially observable. If you have sensors trying to ascertain the agent's state in the world, the sensors might have errors or simply might not be sensing something that's actually important. Typically, only things like games or computer simulations can be fully observable (e.g., chess)
Meeting 17, Fri Oct 10
Meeting 16, Wed Oct 8
 Questions 1 through 3 of http://ai.berkeley.edu/reinforcement.html will be due next Wednesday.
 We're learning to be a reflex agent by interacting with an MDP (which is a model of the world). Note: we're not actually interacting with the world itself, we're only interacting with the given model!
 This will result in a policy—a choice of action for a given state.
 Then we play in the world.
 Some real-world applications of MDPs:
 population harvesting
 agriculture
 water resources
 inspection, maintenance, and repair
 purchasing, inventory, and production
 finance and investment; queues
 sales promotion
 search
 motor insurance claims
 overbooking
 epidemics
 credit
 sports
 patient admissions
 location
 design of experiments
 others.
D. J. White (1993), “A survey of applications of Markov decision processes,” Journal of the Operational Research Society, 44(11), pp. 1073–1096. Attach:MDPApplications3.pdf
Meeting 15, Mon Oct 6
 Markov Decision Processes (MDPs), aka “sequential decision problems” (Ch 17)
 a set of states (with initial state s_{0});
 a set ACTIONS(s) of actions in each state;
 a transition model P(s′ | s, a);
 a reward function R(s).
 task is to find a policy π which recommends the action for each state; i.e. π(s)
 the optimal policy (one which produces the largest reward) is π*
 “A policy represents the agent function explicitly and is therefore a description of a simple reflex agent.”
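The ingredients above fit together in a tiny value-iteration loop that finds π* for a toy two-state MDP; this uses the book's R(s)-on-state rewards, and all names here are illustrative:

```python
# Sketch of value iteration for the MDP ingredients above (illustrative
# two-state example, using the book's reward-on-state R(s) formulation):
#   V(s) <- R(s) + gamma * max_a sum_s' P(s' | s, a) * V(s')

def value_iteration(states, actions, P, R, gamma=0.9, iters=100):
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V = {s: R[s] + gamma * max(sum(p * V[s2] for s2, p in P[(s, a)].items())
                                   for a in actions(s))
             for s in states}
    return V

def extract_policy(states, actions, P, V):
    """pi(s) = argmax_a sum_s' P(s' | s, a) * V(s')."""
    return {s: max(actions(s),
                   key=lambda a: sum(p * V[s2] for s2, p in P[(s, a)].items()))
            for s in states}

states = ["cool", "hot"]
actions = lambda s: ["slow", "fast"]
P = {("cool", "slow"): {"cool": 1.0},
     ("cool", "fast"): {"cool": 0.5, "hot": 0.5},
     ("hot", "slow"): {"cool": 0.5, "hot": 0.5},
     ("hot", "fast"): {"hot": 1.0}}
R = {"cool": 1.0, "hot": -1.0}
V = value_iteration(states, actions, P, R)
pi = extract_policy(states, actions, P, V)   # going slow is optimal everywhere
```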
Meeting 14, Fri Oct 3
 let's carry out the Fig 5.3 minimax algorithm on the Fig 5.2 game tree (it's a mutually recursive algorithm)
 O(b^m) time complexity, O(bm) space complexity (not very practical directly in time)
 Alpha-beta pruning: similar to minimax, but “prunes away branches that cannot possibly influence the final decision.”
 α — value along best path (highest) choice for MAX
 β — value along best path for MIN
“Alpha–beta search updates the values of α and β as it goes along and prunes the remaining branches at a node (i.e., terminates the recursive call) as soon as the value of the current node is known to be worse than the current α or β value for MAX or MIN, respectively.”
 Live alpha-beta simulation (similar to Fig 5.5)
 Let’s go over odd (but explainable) behavior of Pacman with fixed search depths
 Why doesn't it eat the food?
python pacman.py -p MinimaxAgent -l minimaxClassic -a depth=4
 Why does it rush the ghost and die quickly if depth=3, but not if depth=1? python pacman.py -p MinimaxAgent -l trappedClassic -a depth=3
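Minimax with the alpha-beta pruning described in the quote above can be sketched on an explicit tree; the leaf values below are the Fig 5.2 utilities, and the list-of-lists tree representation is illustrative:

```python
import math

# Sketch of minimax with alpha-beta pruning on an explicit game tree
# (illustrative representation: leaves are utilities, internal nodes are lists
# of children; levels alternate between MAX and MIN).

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    if not isinstance(node, list):        # leaf: return its utility
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:             # prune: MIN will never allow this
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:                 # prune: MAX will never allow this
            break
    return value

# The Fig 5.2 tree: MAX root over three MIN nodes; minimax value is 3.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
```

On this tree the second MIN node is pruned after seeing the leaf 2, since MAX already has 3 guaranteed.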
Meeting 13, Wed Oct 1
 let's look at evaluation functions for the reflex agent
 formal def'n of minimax (Fig. 5.3), with V(s) = max V(s′) over s′ ∈ successors(s)
 with the problem set, you have multiple adversaries (each ghost), each picking the thing that's worst for you
 alphabeta pruning
Meeting 12, Mon Sep 29
 this is how we do graph search for the UCB PS:
Meeting 11, Fri Sep 26
Meeting 10, Wed Sep 24
 h(n) must be admissible — never overestimates. (Never lets you believe that a good path is bad.)
 admissible heuristics are optimistic — but that's OK.
 consistent — admissible heuristic that follows the triangle inequality:
h(n) <= c(n, a, n') + h(n')
 A* uses same algorithm as UCS, except g + h instead of g
 let's look at heuristics for the queens problem and the sliding-tile problem
 a dominating heuristic — strictly better
“It is generally better to use a heuristic function with higher values, provided it is consistent and that the computation time for the heuristic is not too long.”
 relaxing a problem — a good way to generate a heuristic
“A problem with fewer restrictions on the actions is called a relaxed problem. The state-space graph of the relaxed problem is a supergraph of the original state space because the removal of restrictions creates added edges in the graph.”
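Relaxing the Pacman grid (ignore walls, every move costs 1) gives Manhattan distance, and consistency can be spot-checked directly; the grid and names below are illustrative:

```python
# Sketch (grid and names illustrative): Manhattan distance as the heuristic you
# get by relaxing the problem -- ignore walls, each move costs 1. The relaxed
# solution can't cost more than the real one (admissible), and it satisfies the
# triangle inequality h(n) <= c(n, a, n') + h(n') (consistent).

def manhattan(pos, goal):
    return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])

def is_consistent_step(pos, succ, goal, step_cost=1):
    """Check h(n) <= c(n, a, n') + h(n') for a single transition."""
    return manhattan(pos, goal) <= step_cost + manhattan(succ, goal)

goal = (4, 4)
# spot-check consistency over all eastward moves on a 5x5 grid
ok = all(is_consistent_step((x, y), (x + 1, y), goal)
         for x in range(4) for y in range(5))
```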
Meeting 9, Mon Sep 22
 https://grader.cs.uml.edu/assignments/384 extended to Friday
 Manhattan distance (for PacMan) and straight-line distance (for city travel) are good ways of knowing if you're closer to the goal
 discussion of the sliding-tile, 8-queens, and Knuth problems: how do you know if you're closer to the goal?
 these are heuristics
 informed search: greedy best-first search, f(n) = h(n)
 showing how GBFS operates on Romanian map — finds an answer, but not the best one
 A* search, f(n) = g(n) + h(n)
 problem set: looked at different cost functions and how UCS can accept "stay east" or "stay west" cost functions
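UCS with a pluggable cost function can be sketched like this; the toy corridor and the "stay east"-style cost below are illustrative, not the actual problem-set code:

```python
import heapq

# Sketch of uniform-cost search with a pluggable step-cost function
# (illustrative toy graph, not the actual problem-set code).

def ucs(start, goal, successors, cost_fn):
    frontier = [(0, start, [start])]          # (g, state, path)
    explored = set()
    while frontier:
        g, state, path = heapq.heappop(frontier)
        if state == goal:
            return g, path
        if state in explored:
            continue
        explored.add(state)
        for nxt in successors(state):
            heapq.heappush(frontier, (g + cost_fn(state, nxt), nxt, path + [nxt]))
    return None

# Toy corridor 0..4: a "stay east"-style cost makes westward steps expensive.
succ = lambda x: [n for n in (x - 1, x + 1) if 0 <= n <= 4]
stay_east_cost = lambda s, n: 1 if n > s else 4
g, path = ucs(2, 4, succ, stay_east_cost)     # heads east: 2 -> 3 -> 4
```

Swapping in a different `cost_fn` changes which paths UCS prefers without touching the search itself, which is the point of the exercise.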
Meeting 8, Fri Sep 19
 recap of BFS and DFS
 other search types: DLS, IDS, UCS
 time/space complexity of UCS in terms of the worst-case optimal solution cost C* and the minimum action cost ε
 UCS example with Fig 3.15
 what about if you know something about where is the solution?
Meeting 7, Wed Sep 17
 DFS for tree search: space complexity. Answer: it should check new states against those already on the current path! This avoids infinite loops in finite spaces
 DFS: optimality
 what do you need in node data structure?
 other search types
 Berkeley full search problem set assigned https://grader.cs.uml.edu/assignments/384 due next Wednesday
Meeting 6, Mon Sep 15
 We talked about BFS, but the assignment is DFS
 recap of properties of BFS
 let's calculate time and space
 what's different if we're doing DFS?
 calculate/talk about time, space, optimality, completeness
 let's look at how the code works
 how does DFS deal with graph search from a memory standpoint — does it need to have the explored set (which would kill memory efficiency)?
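One answer, sketched in code: check new states only against the current path. That avoids infinite loops in a finite space and keeps memory at O(depth), at the cost of possibly re-expanding states reached along different paths (the graph and names here are illustrative, not the project code):

```python
# Sketch (illustrative, not the project code): DFS that checks new states only
# against the current path, so no full explored set is needed.

def dfs(state, goal, successors, path=()):
    if state == goal:
        return list(path) + [state]
    for nxt in successors(state):
        if nxt != state and nxt not in path:   # cycle check: current path only
            found = dfs(nxt, goal, successors, path + (state,))
            if found:
                return found
    return None

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
route = dfs("A", "D", lambda s: graph[s])      # finds A -> B -> D
```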
Meeting 5, Fri Sep 12
 interest in psychological / cognitive AI?
 4 properties of search: optimality, completeness, time performance, space performance
 BTW infinite state spaces exist, e.g., in “recursively defined objects”
 let's use Romania map (Fig 3.2) to understand tree (and then graph) algorithm (Fig 3.7)
 what are properties of breadthfirst search (BFS)?
 pacman code
Meeting 4, Wed Sep 10
 Turn in AI environment analysis HW and search problem formulation HW
 Exam 1 is Mon Oct 20; Exam 2 is Mon Nov 17
 Poll & discussion on AI agent
 Fig 3.1: pseudocode for problemsolving agent
 Poll & discussion on game / search problem formulation
 demo of Berkeley PacMan search code
 Fig 3.7: tree search
 Fig 3.10: node data structure
 PS1 Depth First Search assigned, due in one week.
Meeting 3, Mon Sep 8
 download one of these two games: Traffic Hour Free (Android) or Rush Hour Free (iOS)
 Foundations of AI assignment handed back
 Poll on your answers to this
 Let's play Rush Hour
 What sort of environment is amenable to search?
 static
 observable
 discrete
 deterministic
 Search problem formulation and discussion:
 initial state
 actions
 transition model
 goal test
 path cost
 Formulating a Board Game into a Search Problem due Wed Sep 10
Meeting 2, Fri Sep 5
 written homework due in class
 please register now for the Google group! http://groups.google.com/group/umlaifall14
 don't email me; email the group with problems (I will answer them if no one else does)
 if you do email me, tell me which class you are in!
 questions about class policies?
 What is rationality?
 PEAS: performance, environment, actuators, sensors
 Properties of task environments: fully observable or partially observable; single agent or multiagent (and the relationship among agents, if multiagent); deterministic or stochastic; episodic or sequential; static or dynamic; discrete or continuous; known or unknown.
 Types of agents: simple reflex, model-based, goal-based, utility-based.
 Analyzing an Agent assignment https://grader.cs.uml.edu/assignments/375 due Wed Sep 10
 Before next class meeting, read Ch 3.1 “ProblemSolving Agents” and Ch 3.1.1 “Welldefined problems and solutions”
Meeting 1, Wed Sep 3
