Lectures are recorded: http://echo360.uml.edu/martin201617/artificialintelligence.html
Meeting 36, Fri Dec 2
Meeting 35, Wed Nov 30
Meeting 34, Mon Nov 28
 Final project schedule of deliverables
 FPPaper
 team project work time
 course evaluation
Meeting 33, Wed Nov 23
 Prof. Heidi Furey on the ethics of AI in autonomous cars
Meeting 32, Mon Nov 21
 Prof. Heidi Furey on the ethics of AI in autonomous cars (continued)
Meeting 31, Fri Nov 18
Meeting 30, Wed Nov 16
Meeting 29, Mon Nov 14
Meeting 28, Wed Nov 9
 student final project proposal presentations
Meeting 27, Mon Nov 7
 student final project proposal presentations
Meeting 26, Fri Nov 4
 if you don't have a team, fill out the single-partner form by 3 p.m. today (or make a team and fill out the team form)
 review of formalization of search problem and MDP problem
 including goal test or evaluation function
 FPP has been updated with new requirements for literature review and references, and explicit detail of what is expected in problem analysis.
 in-class exercise: review of the MDP paper and of ideas for state-space representation for specific domains.
 group work time: formulate the state space for your project, and report it here.
Meeting 25, Wed Nov 2
Meeting 24, Mon Oct 31
Meeting 23, Fri Oct 28
Meeting 22, Wed Oct 26
 recap of inference and HMMs
lecture recording from 28:20
Meeting 21, Mon Oct 24
 written Bayes assignment renamed to PS4b
 PS4c on approximate inference and particle filters assigned
 in-class exercise: sensor model
 introduction to particle filtering
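The particle-filtering idea in a few lines (a sketch only: hypothetical 1-D ring world; the motion noise and sensor weights are made up for illustration and are not from the PS4c code):

```python
import random

def particle_filter_step(particles, move, sense, world_size=10):
    """One elapse-time + observe + resample step of a particle filter."""
    # Elapse time: move each particle, with a little random motion noise
    moved = [(p + move + random.choice([-1, 0, 1])) % world_size for p in particles]
    # Observe: weight each particle by how well it matches the sensor reading
    weights = [1.0 if p == sense else 0.1 for p in moved]
    # Resample: draw a new particle set in proportion to the weights
    return random.choices(moved, weights=weights, k=len(moved))
```

The belief about the robot's position is represented by the particle cloud itself; states that match the evidence get sampled more often.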
Meeting 20, Fri Oct 21
 concept of HMM and its applicability to robot localization;
from Thrun, Burgard, and Fox, Probabilistic Robotics (2006):
 HMMs and robot localization (including robot movement)
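One HMM filtering step—predict with the motion model, then weight by the evidence and normalize—as a sketch (hypothetical dictionary-based belief state; illustrative, not the Probabilistic Robotics code):

```python
def forward_step(belief, transition, sensor_likelihood):
    """Elapse time + observe for an HMM over a finite state set."""
    # Predict: push the belief through the transition model
    predicted = {s2: sum(transition(s, s2) * p for s, p in belief.items())
                 for s2 in belief}
    # Update: multiply in the sensor likelihood, then normalize
    unnorm = {s: sensor_likelihood(s) * p for s, p in predicted.items()}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}
```

For robot localization, the states are map locations, the transition model encodes robot movement, and the sensor likelihood comes from the range readings.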
Meeting 19, Wed Oct 19
 symbol for conditional independence ⊥ is called “up tack”
 PS4b Bayes Theorem assigned (due Sunday) – this is a written assignment
 in-class exercise: Bayes' theorem
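Bayes' theorem on a classic numeric example (the disease/test numbers below are illustrative, not from the exercise handout):

```python
def bayes(prior, likelihood, false_positive):
    """P(H|e) = P(e|H) P(H) / P(e), with P(e) by total probability."""
    evidence = likelihood * prior + false_positive * (1 - prior)
    return likelihood * prior / evidence

# Rare condition (1%), sensitive test (90%), 5% false-positive rate:
posterior = bayes(prior=0.01, likelihood=0.9, false_positive=0.05)
# posterior is about 0.154 -- most positive results are false alarms
```

The surprise here is the denominator: with a low prior, the false positives from the healthy 99% dominate the evidence.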
Meeting 18, Mon Oct 17
Meeting 17, Fri Oct 14
 epsilon-greedy random exploration
 change ϵ over time?
 other ideas for exploration?
 impact of exploring on final computed utilities?
python crawler.py
 Approximate Q-learning (from slide 20 of Berkeley Lecture 11 slides)
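A sketch of epsilon-greedy action selection, plus one common way to change ε over time (both illustrative; not the crawler.py internals):

```python
import random

def epsilon_greedy(q_values, actions, epsilon):
    """With probability epsilon, explore (random legal action);
    otherwise exploit (action with the highest current q-value)."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: q_values.get(a, 0.0))

def decayed_epsilon(t, start=1.0):
    """One answer to "change epsilon over time?": explore a lot early,
    then less as the q-values become trustworthy."""
    return start / (1 + t)
```

Exploring costs reward during learning, but as ε decays the computed utilities converge toward what pure exploitation would report.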
Meeting 16, Wed Oct 12
Meeting 15, Tue Oct 11
 exams and solutions handed back
 discussion of PS3a problems Q2 and Q3
 PS3b assigned; due Sun Oct 16
 what if you don't know the T and R?
answer: directly interact with the world and learn.
 value iteration used the Bellman equation
 now let's adapt it for “temporal-difference learning”
Exam 1 results
n=67; maximum possible 46; high 44; median 33
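The temporal-difference idea—when T and R are unknown, nudge the estimate toward each observed sample instead of computing an expectation—can be sketched as (illustrative, not the problem-set code):

```python
def td_update(V, s, reward, s_next, alpha=0.1, gamma=0.9):
    """TD(0) update: V(s) <- (1 - alpha) V(s) + alpha * sample,
    where sample = reward + gamma * V(s')."""
    sample = reward + gamma * V.get(s_next, 0.0)
    V[s] = (1 - alpha) * V.get(s, 0.0) + alpha * sample
    return V[s]
```

Each interaction with the world supplies one (s, r, s') sample; averaging many samples stands in for the transition probabilities the Bellman equation would need.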
Meeting 14, Fri Oct 7
Meeting 13, Wed Oct 5
 Echo360 lecture cap usage poll http://pollev.com/fredm
 things that aren't MDPs: people; relationships; value of history
 MDP is analogous to control theory in engineering:
entire state of system is represented as a state vector
may include rate variables (like velocity or acceleration)
the ones we are working with now are fully observable
in two weeks we will look at MDPs with hidden variables: partially observable MDPs or POMDPs
 Some real-world applications of MDPs:
 population harvesting
 agriculture
 water resources
 inspection, maintenance, and repair
 purchasing, inventory, and production
 finance and investment; queues
 sales promotion
 search
 motor insurance claims
 overbooking
 epidemics
 credit
 sports
 patient admissions
 location
 design of experiments
 others.
D. J. White (1993), “A survey of applications of Markov decision processes,” Journal of the Operational Research Society, 44(11), pp. 1073–1096. Attach:MDPApplications3.pdf
 additive vs. multiplicative discounting
 in-class exercises: discount.pdf
 demo of MDP game:
python gridworld.py -m
 q-values are values of state-action pairs
 value iteration formula
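The value iteration formula, as a direct sketch (hypothetical `actions`/`T`/`R` callables, purely illustrative): V_{k+1}(s) = max_a Σ_{s'} T(s,a,s') [R(s,a,s') + γ V_k(s')].

```python
def value_iteration(states, actions, T, R, gamma=0.9, iterations=100):
    """Repeatedly apply the Bellman update until the values settle.
    T(s, a) returns {s': probability}; R(s, a, s') is the reward."""
    V = {s: 0.0 for s in states}
    for _ in range(iterations):
        V = {s: max(sum(p * (R(s, a, s2) + gamma * V[s2])
                        for s2, p in T(s, a).items())
                    for a in actions(s))
             for s in V}
    return V
```

The max over actions is what makes each V(s) an optimal-policy value rather than the value of some fixed policy.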
Meeting 12, Mon Oct 3
 PS3a assigned; due Mon Oct 10 (Mon is a holiday)
 next week, we meet Tue Wed and Fri
 Markov Decision Processes (MDPs), aka “sequential decision problems” (Ch 17)
 a set of states (with initial state s_{0});
 a set ACTIONS(s) of actions in each state;
 a transition model P(s′ | s, a);
 a reward function R(s)—sometimes, reward-for-state-transition: R(s, a, s')
 task is to find a policy π which recommends the action for each state; i.e. π(s)
 the optimal policy (one which produces the largest reward) is π*
 “A policy represents the agent function explicitly and is therefore a description of a simple reflex agent.”
 in-class exercises: gridworld.pdf
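Given utilities V, the policy π(s) recommends the action with the best expected outcome; a sketch (hypothetical `actions`/`T`/`R` callables, illustrative only):

```python
def extract_policy(V, states, actions, T, R, gamma=0.9):
    """pi(s) = argmax_a sum_{s'} T(s,a,s') [R(s,a,s') + gamma V(s')]."""
    def q(s, a):
        # Expected value of taking action a in state s, then following V
        return sum(p * (R(s, a, s2) + gamma * V[s2])
                   for s2, p in T(s, a).items())
    return {s: max(actions(s), key=lambda a: q(s, a)) for s in states}
```

This is the sense in which "a policy represents the agent function explicitly": once π is tabulated, acting is a pure table lookup, like a simple reflex agent.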
Meeting 11: Wed Sep 28
 No office hours tomorrow; office hours Friday instead
 Policy: solution code will not be posted
 Material that Exam 1 may cover
 consistency (and an example of an inconsistent heuristic)
 continuing minmax.pdf: do you need to expand all nodes?
 Alpha-beta simulation (similar to Fig 5.5)
 Alpha-beta pruning: similar to minimax, but “prunes away branches that cannot possibly influence the final decision.”
 α — value of the best (highest-value) choice found so far along the path for MAX
 β — value of the best (lowest-value) choice found so far along the path for MIN
“Alpha–beta search updates the values of α and β as it goes along and prunes the remaining branches at a node (i.e., terminates the recursive call) as soon as the value of the current node is known to be worse than the current α or β value for MAX or MIN, respectively.”
 Let’s go over odd (but explainable) behavior of Pacman with fixed search depths
 Why doesn't it eat the food?
python pacman.py -p MinimaxAgent -l minimaxClassic -a depth=4
 Why does it rush the ghost and die quickly if depth=3, but not if depth=1?
python pacman.py -p MinimaxAgent -l trappedClassic -a depth=3
 expectimax
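The quoted description of alpha-beta, as a runnable sketch (illustrative tree interface via `children`/`value` callables; not the book's figure verbatim):

```python
def alphabeta(node, depth, alpha, beta, maximizing, children, value):
    """Minimax with alpha-beta pruning: stop expanding a node's remaining
    children once its value is known to be worse than alpha or beta."""
    kids = children(node)
    if depth == 0 or not kids:
        return value(node)
    if maximizing:
        best = float('-inf')
        for c in kids:
            best = max(best, alphabeta(c, depth - 1, alpha, beta, False,
                                       children, value))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # prune: MIN above will never let play reach here
        return best
    best = float('inf')
    for c in kids:
        best = min(best, alphabeta(c, depth - 1, alpha, beta, True,
                                   children, value))
        beta = min(beta, best)
        if alpha >= beta:
            break  # prune: MAX above already has something better
    return best
```

Called with the full window (−∞, +∞) at the root, it returns the same value as plain minimax, just with fewer node expansions.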
Meeting 10: Mon Sep 26
 reflections on PS1c
 Exam 1 this Friday; may include everything through Fri Sep 23
 PS2 on multiagent search assigned; due next Sun
 zero-sum games—do they represent the world?
 minmax.pdf
 Fig 5.3 from book (p. 166) on minmax algorithm
 evaluation functions—in general; for PS2
Meeting 9: Fri Sep 23
 more about admissibility—NEVER OVERESTIMATE
 astarquestions.pdf
 creating heuristics: use a relaxed problem
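The relaxed-problem recipe on the standard grid example (illustrative: Manhattan distance is the exact cost of a maze with the walls deleted):

```python
def manhattan(p, q):
    """Heuristic from a relaxed problem: pretend there are no walls and
    count grid steps. The relaxed solution can't cost more than the real
    one, so this never overestimates -- i.e., it is admissible."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])
```

NEVER OVERESTIMATE: whatever the walls do, the true maze distance is always ≥ manhattan(p, q).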
Meeting 8: Wed Sep 21
 Welcome our TA, Pengfei Zhang.
His office hours are on Thursdays from 2:30 p.m. to 4:30 p.m. in Olsen 307
 PS1c due date is extended to Sun Sep 25
 today: A* (pronounced “A-star”)
 a best-first search
 like UCS, except instead of f(n) = g(n), it uses
 f(n) = g(n) + h(n)
 heuristic: an estimate, but not a guess. If “admissible”:
A mathematically provable lower bound for actual path cost.
 a guarantee that a solution with cost lower than the reported value is not possible
 An admissible heuristic may underestimate cost (report a value lower than the true cost), but it cannot overestimate (make a point along the frontier seem worse than it is)
 A* search proceeds like a contour map (Fig. 3.25)
 In util.py, use PriorityQueueWithFunction
 make sure you are pushing Node objects into the queue
 provide a function that implements the f(n) = g(n) + h(n) operation
 in-class exercises: astarcostsearch.pdf astarquestions.pdf
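The whole A* loop as a compact sketch (hypothetical `neighbors`/`h` callables, not the PS1 scaffolding; with h ≡ 0 it reduces to UCS):

```python
import heapq

def astar(start, goal, neighbors, h):
    """A*: always expand the frontier entry with lowest f(n) = g(n) + h(n).
    neighbors(s) yields (next_state, step_cost) pairs."""
    frontier = [(h(start), 0, start, [start])]   # (f, g, state, path)
    best_g = {}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        if state in best_g and best_g[state] <= g:
            continue                             # already reached more cheaply
        best_g[state] = g
        for nxt, cost in neighbors(state):
            g2 = g + cost
            heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None
```

With an admissible h, the first time the goal is popped its path is optimal, which is why the contour picture (Fig. 3.25) works.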
Meeting 7: Mon Sep 19
If you did it your own way, kudos to you.
And you should now do the graph-search method with node.py and util.py, because you will need that for the rest of PS1c.
 MOSS is working; identifying people who collaborated; few false positives
 search problems are ones which can be stated in the form presented in PS1a
 examples of strong PS1a's:
 PS1c assigned; due Friday
 time and space analysis of BFS (p. 82–83) and DFS (p. 86–87)
 UCS
 intro to heuristic search
 map-based travel
 15-tile puzzle
 some PS1c problems
Meeting 6: Fri Sep 16
Meeting 5: Wed Sep 14
 example of very strong PS0c
 new pair programming policy for PS1b
 trees vs. graphs
 states vs. nodes
 state is a configuration of the world
 node is an internal data structure of the search process, which includes:
 present state
 prior (parent) node
 action taken to get from parent to self
 accumulated path cost
 graph search algorithm
 in-class exercises: graphsearch.pdf
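The node fields listed above, as a minimal class (a sketch, not the node.py used in the problem sets):

```python
class Node:
    """Search-tree node: a state plus the bookkeeping needed to recover
    the plan -- parent link, action taken, and accumulated path cost."""
    def __init__(self, state, parent=None, action=None, step_cost=0):
        self.state = state            # configuration of the world
        self.parent = parent          # prior node (None at the root)
        self.action = action          # action taken to get from parent to self
        self.path_cost = step_cost + (parent.path_cost if parent else 0)

    def path(self):
        """Actions from the root to this node, by walking parent links."""
        node, actions = self, []
        while node.parent is not None:
            actions.append(node.action)
            node = node.parent
        return list(reversed(actions))
```

This is exactly the trees-vs.-graphs distinction: many nodes may wrap the same state, reached by different paths.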
Meeting 4: Mon Sep 12
 deterministic vs. stochastic
 in backgammon, you roll dice first and then move. The moves always succeed. Hence, deterministic.
 in Risk, you attack, then roll dice. The attack succeeds or fails partly based on the dice roll. Hence, stochastic.
 after-class conversation: worlds with backgammon-like randomness can be characterized as deterministic and dynamic. The dice roll is an example of the world changing between your turns.
 fully vs. partially observable
 Rush Hour is fully observable—all moves available from any state are disclosed; no information is hidden
 Battleship is partially observable—you learn/discover where your opponent’s battleships are as you attack them
 in-class exercises: search.pdf
 search trees: paths from state to state ending at goal
 search proceeds by expanding a state, which generates a set of successor states, and then expanding them until goal is reached
 DFS vs. BFS:
 how do they work
 what are their properties (space, time, optimality, completeness)
 PacMan search problem: find shortest path to the food (demo)
 diving into the code and what you have to do
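One way to see how DFS and BFS work and why their properties differ (an illustrative sketch, not the PS1 code): they are the same loop with different frontier disciplines.

```python
from collections import deque

def graph_search(start, goal_test, successors, strategy='bfs'):
    """BFS (FIFO frontier) vs. DFS (LIFO frontier): the only difference
    is which end of the frontier we pop from."""
    frontier = deque([[start]])
    explored = set()
    while frontier:
        path = frontier.popleft() if strategy == 'bfs' else frontier.pop()
        state = path[-1]
        if goal_test(state):
            return path
        if state in explored:
            continue
        explored.add(state)
        for nxt in successors(state):
            if nxt not in explored:
                frontier.append(path + [nxt])
    return None
```

BFS pops the oldest path, so it finds a shallowest solution (complete, optimal for unit costs, but memory-hungry); DFS pops the newest, so it uses little memory but can return a deeper solution.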
Meeting 3: Fri Sep 9
 play Rush Hour online Android iOS
 worksheet on traffic game rushhour.pdf
 discussion of uninformed search problem configuration:
 set of states
 initial state
 set of actions
 transition model
 step cost
 goal test
 what are the properties of an environment that is best solved by a search agent?
 fully observable
 deterministic
 discrete
 static
 sequential
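The search-problem components above (states, initial state, actions, transition model, step cost, goal test) as a tiny class, on a hypothetical "+1 / ×2" counting domain chosen purely for illustration:

```python
class CountingProblem:
    """Reach a target number from a start number. States are integers."""
    def __init__(self, initial, goal):
        self.initial = initial                 # initial state

    # set of actions available in a state
        self.goal = goal

    def actions(self, state):
        return ['+1', '*2']

    def result(self, state, action):           # transition model
        return state + 1 if action == '+1' else state * 2

    def step_cost(self, state, action):        # step cost (uniform here)
        return 1

    def goal_test(self, state):                # goal test
        return state == self.goal
```

Any algorithm from class (BFS, DFS, UCS, A*) can run against this interface without knowing the domain.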
Meeting 2: Wed Sep 7
 Poll: Eight Perspectives on AI
 Course grading rubric adjusted (please review Home).
 You are responsible for everything in the syllabus, even if I haven't discussed it in class yet
 Attendance policy / lecture recordings
 Final exam is scheduled: Fri Dec 16 from 3 p.m. to 6 p.m.
 Email list is set to immediate delivery (you can change later)
 R&N agent model: environment properties Attach:environments.pdf Δ
Meeting 1: Fri Sep 2
 What do you want to do with AI? What opportunities will you create / what problems will you solve?
 Sketch of the semester—problem sets then term project (done in a team)
 Grading categories
 5-agent model of AIMA; in-class exercises
