# LectureBlog


**Exam 1 is Mon Oct 20** **Exam 2 is Mon Nov 17** **Final is Mon Dec 15 at 8a**

Echo360 lecture recordings are at

http://echo360.uml.edu/martin201415/artintellfall.html

Whiteboard photos are at

https://plus.google.com/u/0/114049637113235097633/posts/J5yftmyoW6t

**Final Exam results**

73 points available

**Homework results**

This includes the UCB programming assignments and all the written assignments, but not final project work.

median was 1012; mean was 895 (max possible 1100)

**updated Exam 2 results**

25 pts available

**updated Exam 1 results**

32 pts available

**Final exam study notes:** fall2014-ai-bok.txt

**Meeting 40, Wed Dec 10**

- Project demos in 3rd Floor Elevator Lobby

**Meeting 39, Mon Dec 8**

- project demos canceled; informal conversations

**Meeting 38, Fri Dec 5**

- please have your Project stubbed in by Sunday night
- all project web pages are due 8a Monday morning
- project demonstrations will be held in CS Dept 3rd Floor Elevator Lobby
- bring your laptop running your code to your presentation (if neither partner has a laptop, arrange to share one with a classmate)
- bring one printed copy of your web page to your demo
- prepare to present your project in about 3 minutes
- **attendance at your presentation is mandatory**
- attendance at demonstrations on the day you're not presenting is optional

- introduction to particle filtering

- Final exam review
- you may bring 3 sheets of double-sided handwritten notes to the final
- there will be check-boxes to replace Exam 1 and/or Exam 2 scores with score on Final
- all Exam 1 and Exam 2 material, plus:
- basic Bayes net probability calculations; e.g. Q1 of UCB Fall 2012 midterm
- being able to calculate probability distributions given emission and movement probability models (the calculations done in PS4a). See slides 18–32 of UCB CS188 Lecture 14 Notes

- course evaluation

**Meeting 37, Wed Dec 3**

- Exam 2 handed back. Mean was 8.8.

- final exam policy: you can elect to use final exam scores in lieu of your Exam 1 and/or Exam 2 scores
- Q-learning update equation (from UCB Lecture 11 slides, as posted):
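
As a reminder, the standard form of the Q-learning update is:

```latex
Q(s,a) \leftarrow (1-\alpha)\,Q(s,a) + \alpha\left[r + \gamma \max_{a'} Q(s',a')\right]
```

where α is the learning rate, γ the discount factor, and (s, a, r, s′) the observed transition sample.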

**Meeting 36, Mon Dec 1**

- **Project Web Page assigned; due Mon Dec 8 at 8 am** https://grader.cs.uml.edu/assignments/461
- **Project Writeup assigned; due Sun Dec 14** https://grader.cs.uml.edu/assignments/462
- **Final Code Turn-In assigned; due Sun Dec 14** https://grader.cs.uml.edu/assignments/463

**Meeting 35, Mon Nov 24**

**Meeting 34, Fri Nov 21**

- Final project first deliverable is due Monday -- submit at https://grader.cs.uml.edu/assignments/451
- From Thrun, Burgard, and Fox,
*Probabilistic Robotics*(2006).

- HMMs and robot localization (including robot movement)

- Problems 2 and 3 from Tracking problem set
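
A minimal sketch of the belief-update calculations (elapse time with a movement model, then observe with an emission model), assuming a hypothetical circular 1-D corridor; the models below are made-up illustrations, not the Ghostbusters ones:

```python
# Hypothetical 1-D circular corridor; movement/emission models are
# illustrative, not from the problem set.

def predict(belief, move_model):
    """Elapse time: new_b(x') = sum_x P(x'|x) * b(x)."""
    n = len(belief)
    new_belief = [0.0] * n
    for x, p in enumerate(belief):
        for dx, prob in move_model:            # e.g. [(0, 0.2), (1, 0.8)]
            new_belief[(x + dx) % n] += prob * p
    return new_belief

def update(belief, emission_prob):
    """Observe: b(x) proportional to P(e|x) * b(x), then normalize."""
    weighted = [emission_prob[x] * p for x, p in enumerate(belief)]
    total = sum(weighted)
    return [w / total for w in weighted]

belief = [0.25, 0.25, 0.25, 0.25]              # uniform prior over 4 cells
belief = predict(belief, [(0, 0.2), (1, 0.8)])  # stay 20%, move right 80%
belief = update(belief, [0.9, 0.1, 0.1, 0.1])   # sensor strongly favors cell 0
```
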

**Meeting 33, Wed Nov 19**

**PS4a Hidden Markov Models (Ghostbusters) assigned; due Wed Nov 26** https://grader.cs.uml.edu/assignments/449

- Final project presentations (part 2)

**Meeting 32, Mon Nov 17**

- Exam 2 administered

**Meeting 31, Fri Nov 14**

- For exam 2, make sure you can perform value iteration and Q-learning algorithms on various MDPs.
- You are allowed one 8.5x11" page of handwritten notes (double-sided OK)
- Final project presentations

**Meeting 30, Wed Nov 12**

- FPP assignment now has requirement of slide presentation (due next class)
- possible partner reshuffling after midnight tonight
- Let's review MDPs and Berkeley exam problems:

**Meeting 29, Mon Nov 10**

**Exam 2 is next Mon, Nov 17**

**Final Project Proposal assigned, due this Fri Nov 14** https://grader.cs.uml.edu/assignments/445

- discussion of FPP1
- more discussion of Ghostbusters

**Meeting 28, Fri Nov 7**

- Exam 1 handed back and discussed
- brief introduction to ghostbusters problem set

**Meeting 27, Wed Nov 5**

- Exam 1 is graded; scores are in Bottlenose (please let me know if yours is missing)
- continue with Thrun's YouTube lectures on probability and Bayes, beginning with item 15.

**Meeting 26, Mon Nov 3**

Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Bayes%27_Theorem_MMB_01.jpg#mediaviewer/File:Bayes%27_Theorem_MMB_01.jpg

- two-dimensional way of thinking about Bayes
- review Thrun's YouTube lectures on probability and Bayes, through item 15.

**Meeting 25, Fri Oct 31**

**Final Project Team or Interest Declaration assigned; due Wed Nov 5** https://grader.cs.uml.edu/assignments/427

**Final Project Pitch #1 assigned; due Fri Nov 7** https://grader.cs.uml.edu/assignments/428

- introduction to Bayes' Law
- continue learning about Bayes networks, probabilities, and various calculations by reviewing the lectures here: https://www.youtube.com/playlist?list=PLEED2415FEB9DC53B
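
Bayes' Law can be made concrete with a quick numerical example (the numbers below are hypothetical, in the spirit of Thrun's lectures):

```python
# Bayes' Law worked numerically: P(H|E) = P(E|H) P(H) / P(E).
# All probabilities here are made up for illustration.

p_h = 0.01              # prior: P(hypothesis)
p_e_given_h = 0.9       # P(evidence | hypothesis)
p_e_given_not_h = 0.05  # P(evidence | not hypothesis)

# Total probability: P(E) = P(E|H)P(H) + P(E|~H)P(~H)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

posterior = p_e_given_h * p_h / p_e
print(round(posterior, 3))   # 0.154
```

Even with a 90%-reliable sensor, a rare hypothesis stays fairly unlikely after one positive observation; this is the point of thinking about Bayes two-dimensionally.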

**Meeting 24, Wed Oct 29**

- discussion of RL problem set
**Conference Paper Review assigned; due Wed Nov 5** https://grader.cs.uml.edu/assignments/426

- more FP brainstorming and discussion of parameters

**Meeting 23, Mon Oct 27**

- PS3 Q6: bridge crossing revisited—are there values of ε (exploration rate) and α (learning rate) that will cause it to learn the bridge 99% of the time? Remember that:
  - ε = 1 means always explore; ε = 0 means always exploit
  - α = 1 means always learn (and forget the past); α = 0 means never learn

- PS3 Q7: approximate Q-learning I (direct states)
- PS3 Q8: approximate Q-learning II (learning feature weights)
- see Lecture 11: Reinforcement Learning II Berkeley slides

**Meeting 22, Fri Oct 24**

- go over value iteration solution (Jeremy)
- Q-learning: you don't know T or R, but must discover by exploration
- demo of working Q-learning on gridworld
- demo of working Q-learning on crawler
- see Lecture 11: Reinforcement Learning II Berkeley slides
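
A minimal tabular Q-learning sketch in the spirit of the gridworld demo; the environment below is a made-up 1-D chain, not the Berkeley gridworld, and the agent never sees T or R directly, only samples:

```python
# Toy 1-D chain: states 0..4, reward 1.0 on reaching state 4.
import random

random.seed(0)
n_states = 5
actions = [-1, +1]
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}

for episode in range(500):
    s = 0
    while s != n_states - 1:
        # epsilon-greedy: explore with prob epsilon, else exploit
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda a: Q[(s, a)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # Q-learning update: blend old estimate with the observed sample
        sample = r + gamma * max(Q[(s2, b)] for b in actions)
        Q[(s, a)] = (1 - alpha) * Q[(s, a)] + alpha * sample
        s = s2

# greedy policy extracted from the learned Q-values
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in range(n_states)}
```

After enough episodes the greedy policy moves right everywhere, without ever having been told the transition model or reward function.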

**Meeting 21, Wed Oct 22**

**PS3b assigned; due Wed Oct 29** https://grader.cs.uml.edu/assignments/418

- final project brainstorming

**Meeting 20, Mon Oct 20**

- Exam 1 administered

**Meeting 19, Fri Oct 17**

- you may bring one page of 8.5" x 11" paper to the exam, which may have handwritten notes double-sided.
- exam will cover agents and environments, un-informed and informed search, and adversarial search, specifically:
- Ch 2.1, 2.3, 2.4
- Ch 3.1, 3.2, 3.3, 3.4, 3.5, 3.6
- Ch 5 - minimax, alpha/beta pruning, expectimax

- went over problems from Berkeley Fall 2013 midterm 1.

**Meeting 18, Wed Oct 15**

- Analyzing an Agent assignment returned. Some feedback:
  - A few people didn't realize that the assignment was to *pick a specific agent and its environment* and analyze it. I did *not* want a generic discussion of the material in the chapter.
  - More people did discuss observability, sequential vs. episodic, etc. but did *not* describe the internal type of their agent (e.g. reflex agent, goal-based agent, etc.)
  - Some people did not make a clear distinction between the *agent* and its *environment*.

- On observability: in general, the real world is only partially observable. If you have sensors trying to ascertain the agent's state in the world, the sensors might have errors or simply might not be sensing something that's actually important. Typically, only things like games or computer simulations can be fully observable (e.g., chess)

- Full discussion of this

**Meeting 17, Fri Oct 10**

**PS3a assigned; due Wed Oct 15** https://grader.cs.uml.edu/assignments/408

- continuing to go over value iteration
- q1 q2 q3 of the PS

**Meeting 16, Wed Oct 8**

**Questions 1 through 3 of http://ai.berkeley.edu/reinforcement.html will be due next Wednesday.**

- We're learning to be a reflex agent by interacting with an MDP (which is a model of the world). Note: we're not actually interacting with the world itself, we're only interacting with the given model!
- This will result in a policy—a choice of action for a given state.
- Then we play in the world.

- Some real-world applications of MDPs:
- population harvesting
- agriculture
- water resources
- inspection, maintenance, and repair
- purchasing, inventory, and production
- finance and investment; queues
- sales promotion
- search
- motor insurance claims
- overbooking
- epidemics
- credit
- sports
- patient admissions
- location
- design of experiments
- others

From the *Journal of the Operational Research Society*, **44**(11), pp. 1073–1096. Attach:MDPApplications3.pdf

**Meeting 15, Mon Oct 6**

- Markov Decision Processes (MDPs), *aka* “sequential decision problems” (Ch 17), consist of:
  - a set of states (with initial state *s₀*);
  - a set ACTIONS(*s*) of actions in each state;
  - a transition model *P(s′|s,a)*;
  - a reward function *R(s)*.
- the task is to find a *policy* π which recommends the action for each state, i.e. π(*s*)
- the optimal policy (one which produces the largest reward) is π*
- “A policy represents the agent function explicitly and is therefore a description of a simple reflex agent.”
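
The ingredients above (states, ACTIONS(s), transition model, reward) can be sketched as a tiny value-iteration example; the MDP below is made up for illustration, not taken from the text:

```python
# Tiny made-up MDP: a -> b -> goal, with stochastic "go" actions.
gamma = 0.9
states = ["a", "b", "goal"]
actions = {"a": ["go"], "b": ["go"], "goal": []}     # ACTIONS(s)
# transition model: T[(s, a)] = list of (s', P(s'|s,a))
T = {("a", "go"): [("b", 0.8), ("a", 0.2)],
     ("b", "go"): [("goal", 0.8), ("b", 0.2)]}
R = {"a": 0.0, "b": 0.0, "goal": 1.0}                # reward function R(s)

# value iteration: V(s) = R(s) + gamma * max_a sum_s' P(s'|s,a) V(s')
V = {s: 0.0 for s in states}
for _ in range(100):
    V = {s: R[s] + gamma * max(
            (sum(p * V[s2] for s2, p in T[(s, a)]) for a in actions[s]),
            default=0.0)
         for s in states}

# extract the policy: pi(s) = argmax_a sum_s' P(s'|s,a) V(s')
pi = {s: max(actions[s],
             key=lambda a: sum(p * V[s2] for s2, p in T[(s, a)]))
      for s in states if actions[s]}
```

The resulting π is exactly the "choice of action for a given state" described above.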

**Meeting 14, Fri Oct 3**

- let's carry out the Fig 5.3 minimax algorithm on the Fig 5.2 game tree (it's a mutually-recursive algorithm)
- O(b^m) time complexity, O(bm) space complexity (not very practical directly in time)
- Alpha-beta pruning: similar to minimax, but “prunes away branches that cannot possibly influence the final decision.”
- α — the best (highest) value found so far along the path for MAX
- β — the best (lowest) value found so far along the path for MIN

- Live alpha-beta simulation (similar to Fig 5.5)
- Let’s go over odd (but explainable) behavior of Pacman with fixed search depths
  - Why doesn't it eat the food? `python pacman.py -p MinimaxAgent -l minimaxClassic -a depth=4`
  - Why does it rush the ghost and die quickly if `depth=3`, but not if `depth=1`? `python pacman.py -p MinimaxAgent -l trappedClassic -a depth=3`
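
Minimax with alpha-beta pruning can be sketched on an explicit game tree; the leaf values below follow the commonly used Fig 5.2 example, but treat them as illustrative:

```python
# Alpha-beta on a depth-2 tree: MAX root, three MIN children, numeric leaves.
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(node, (int, float)):         # terminal: static value
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, v)
            if alpha >= beta:                  # prune: MIN won't allow this
                break
        return v
    else:
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, True, alpha, beta))
            beta = min(beta, v)
            if alpha >= beta:                  # prune: MAX won't allow this
                break
        return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True))                   # 3
```

The second and third MIN nodes are cut off as soon as a leaf at or below α = 3 appears, which is exactly the "branches that cannot possibly influence the final decision."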

**Meeting 13, Wed Oct 1**

**multiagent search PS is posted—due Wed Oct 8** https://grader.cs.uml.edu/assignments/402

- first problem is to design a reflex agent
- formal def'n of minimax (Fig. 5.3), with *V(s) = max V(s′)* where *s′ ∈ successor(s)*
- with the PS, you have multiple adversaries (each ghost), each picking the thing that's worst for you
- alpha-beta pruning

**Meeting 12, Mon Sep 29**

- this is how we do graph-search for the UCB PS

- please review http://ai.berkeley.edu/slides/Lecture%202%20--%20Uninformed%20Search/SP14%20CS188%20Lecture%202%20--%20Uninformed%20Search.pptx and http://ai.berkeley.edu/slides/Lecture%203%20--%20Informed%20Search/SP14%20CS188%20Lecture%203%20--%20Informed%20Search.pptx
- types of games; deterministic games; zero-sum games
- adversarial search—what would be involved?
- tic-tac-toe board
- search tree with terminal states -- what should player do?
- minimax algorithm
- in general, can't get to leaves. need evaluation function for value of a state of the world.
- Multi-agent (adversarial) Pac-man: http://ai.berkeley.edu/multiagent.html

**Meeting 11, Fri Sep 26**

- Pacman search debrief
- Simulated exams:
- UCB Fall 2011, question 1 (enacting DFS/BFS/UCS/greedy/A*)
- UCB Spring 2011, question 1 (analyzing Pacman mod)

- next material: Multi-agent search

**Meeting 10, Wed Sep 24**

- *h(n)* must be admissible — never overestimates. (Never lets you believe that a good path is bad.)
- admissible heuristics are optimistic — but that's OK
- consistent — an admissible heuristic that follows the triangle inequality: *h(n) ≤ c(n, a, n′) + h(n′)*
- A* uses the same algorithm as UCS, except *g + h* instead of *g*
- let's look at heuristics for the queens problem and the sliding-tile problem
- a **dominating** heuristic — strictly better
- **relaxing** a problem — a good way to generate a heuristic; the result is a **relaxed problem**. “The state-space graph of the relaxed problem is a *supergraph* of the original state space because the removal of restrictions creates added edges in the graph.”
- now, heuristics for the PS
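
A quick sketch of admissibility and consistency using Manhattan distance on a unit-cost grid (the positions below are arbitrary examples):

```python
# Manhattan distance as an admissible heuristic on a unit-cost grid:
# the true path cost can never be less than |dx| + |dy|, so h never
# overestimates.
def manhattan(pos, goal):
    return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])

# Consistency check h(n) <= c(n, a, n') + h(n') for one unit-cost step:
n, n2, goal = (2, 3), (2, 4), (5, 5)
print(manhattan(n, goal) <= 1 + manhattan(n2, goal))   # True
```
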

**Meeting 9, Mon Sep 22**

- https://grader.cs.uml.edu/assignments/384 extended to Friday
- Manhattan distance (for Pac-Man) and straight-line distance are good ways of knowing if you're closer to the goal for the Pac-Man and city-travel problems
- discussion of sliding tile, 8-queen, and Knuth problems: how do you know if you're closer to the goal?
- these are heuristics
- informed search: greedy best-first search, *f(n) = h(n)*
- showing how GBFS operates on the Romanian map — finds an answer, but not the best one
- A* search, *f(n) = g(n) + h(n)*
- problem set: looked at different cost functions and how UCS can accept "stay east" or "stay west" cost functions

**Meeting 8, Fri Sep 19**

- recap of BFS and DFS
- other search types: DLS, IDS, UCS
- time/space complexity of UCS based on worst-case optimal solution cost *C\** and minimum action cost *ϵ*
- UCS example with Fig 3.15
- what if you know something about where the solution is?
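
A minimal UCS sketch with a priority queue; the toy graph below is made up for illustration, not the Fig 3.15 example:

```python
# Uniform-cost search: always expand the frontier node with least path cost g.
import heapq

def ucs(graph, start, goal):
    frontier = [(0, start, [start])]           # (path cost g, node, path)
    explored = set()
    while frontier:
        g, node, path = heapq.heappop(frontier)
        if node == goal:                       # goal test on expansion
            return g, path
        if node in explored:
            continue
        explored.add(node)
        for nxt, cost in graph.get(node, []):
            if nxt not in explored:
                heapq.heappush(frontier, (g + cost, nxt, path + [nxt]))
    return None

graph = {"S": [("A", 1), ("B", 5)], "A": [("B", 1), ("G", 10)],
         "B": [("G", 3)]}
print(ucs(graph, "S", "G"))                    # (5, ['S', 'A', 'B', 'G'])
```

Note that UCS finds the cheap S→A→B→G route even though the direct edges look tempting; testing the goal only at expansion time is what guarantees optimality.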

**Meeting 7, Wed Sep 17**

- DFS for tree search -- space complexity -- the answer is that it should check new states against those already on the current path! This avoids infinite loops in finite spaces
- DFS -- optimality
- what do you need in node data structure?
- other search types
**Berkeley full search problem set assigned** https://grader.cs.uml.edu/assignments/384 — due next Wednesday

**Meeting 6, Mon Sep 15**

- We talked about BFS, but assignment is DFS
- recap of properties of BFS
- let's calculate time and space
- what's different if we're doing DFS?
- calculate/talk about time, space, optimality, completeness
- let's look at how the code works
- how does DFS deal with graph-search from a memory standpoint — does it need to have the explored-set (which would kill memory efficiency)?
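
A DFS graph-search sketch showing the explored-set trade-off discussed above (the graph below is a toy example, illustrative only):

```python
# Iterative DFS graph-search: the explored set prevents revisiting states,
# at the cost of memory proportional to the number of states seen.
def dfs(graph, start, goal):
    stack = [(start, [start])]
    explored = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in explored:
            continue
        explored.add(node)
        for nxt in graph.get(node, []):
            if nxt not in explored:
                stack.append((nxt, path + [nxt]))
    return None

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["D"], "D": []}
print(dfs(graph, "A", "D"))
```

Checking only against the current path (instead of a full explored set) keeps memory at O(bm) but can re-expand states; the full explored set is what the Berkeley PS expects.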

**Meeting 5, Fri Sep 12**

- interest in psychological / cognitive AI?
- 4 properties of search: optimality, completeness, time performance, space performance
- BTW infinite state spaces exist, e.g., in “recursively defined objects”

- let's use Romania map (Fig 3.2) to understand tree (and then graph) algorithm (Fig 3.7)
- what are properties of breadth-first search (BFS)?
- pacman code

**Meeting 4, Wed Sep 10**

- Turn in AI environment analysis HW and search problem formulation HW
**Exam 1 is Mon Oct 20; Exam 2 is Mon Nov 17**

- Poll & discussion on AI agent
- Fig 3.1: pseudocode for problem-solving agent
- Poll & discussion on game / search problem formulation
- demo of Berkeley Pac-Man search code
- Fig 3.7: tree search
- Fig 3.10: node data structure
**PS1 Depth First Search assigned**, due in one week.

**Meeting 3, Mon Sep 8**

- download one of these two games: Traffic Hour Free (Android) or Rush Hour Free (iOS)
- Foundations of AI assignment handed back
- Poll on your answers to this
- Let's play Rush Hour
- What sort of environment is amenable to search?
- static
- observable
- discrete
- deterministic

- Search problem formulation and discussion:
- initial state
- actions
- transition model
- goal test
- path cost
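
The five-part formulation above can be sketched as a small Python interface; the names here are hypothetical, loosely mirroring the structure of the Berkeley code:

```python
# Abstract five-part search problem, plus a toy concrete instance.
class SearchProblem:
    def __init__(self, initial):
        self.initial = initial                 # initial state

    def actions(self, state):                  # legal actions in a state
        raise NotImplementedError

    def result(self, state, action):           # transition model
        raise NotImplementedError

    def goal_test(self, state):                # goal test
        raise NotImplementedError

    def step_cost(self, state, action):        # additive path cost
        return 1

class NumberLine(SearchProblem):
    """Toy instance: walk from 0 to 5 in +1/-1 steps."""
    def actions(self, state):
        return [+1, -1]

    def result(self, state, action):
        return state + action

    def goal_test(self, state):
        return state == 5
```

Any generic search algorithm (BFS, DFS, UCS, A*) can then be written once against this interface and reused across problems.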

**Formulating a Board Game into a Search Problem due Wed Sep 10**

**Meeting 2, Fri Sep 5**

- written homework due in class
**please register now for the Google group!** http://groups.google.com/group/uml-ai-fall14

- don't email me; email the group with problems (I will answer them if no one else does)
- if you do email me, tell me which class you are in!
- questions about class policies?
- What is rationality?
- PEAS: performance, environment, actuators, sensors
- Properties of task environments: fully observable or partially observable; single agent or multi-agent (and the relationship among agents, if multi-agent); deterministic or stochastic; episodic or sequential; static or dynamic; discrete or continuous; known or unknown.
- Types of agents: simple reflex, model-based, goal-based, utility-based.
**Analyzing an Agent assignment https://grader.cs.uml.edu/assignments/375 due Wed Sep 10**

- Before next class meeting, read Ch 3.1 “Problem-Solving Agents” and Ch 3.1.1 “Well-defined problems and solutions”

**Meeting 1, Wed Sep 3**

- what is AI (discussion)?
- “thinking humanly” (strong AI) vs. “acting rationally” (weak AI)
- the Pac-Man projects http://ai.berkeley.edu/project_overview.html
- introduction to Bottlenose
- Foundations of AI assignment https://grader.cs.uml.edu/assignments/371
- Introduction to Python assignment https://grader.cs.uml.edu/assignments/372