Final MDP

MSM 549 Markov Decision Processes
Final Exam, Spring 2013
Instructions: You have 8 hours to return the answers to me by email or in my office. You are not allowed to communicate with others about your solutions, approach, ideas, etc. If such unauthorized sharing is detected, you will receive a “0” on the final and I will take disciplinary action according to the Simon Academic Honesty Policy. By returning your solutions to the exam, you agree that you will follow the Simon Academic Honesty Policy. Please bear this in mind before you violate the Honesty code. Also, I will not try to decipher your handwriting. If I cannot read your handwriting, I will assume your answer is wrong.
The exam consists of 3 questions. Some may be more difficult than others, but each has equal weight. Feel free to email me if you need clarification on a question. Good luck.

Question 1:
Consider the infinite horizon discounted problem with $n$ states. Let $A_i$ denote the available actions in state $i$. The cost per stage is $g(i, u)$, the discount factor is $\alpha$, and the transition probabilities are $p_{ij}(u)$. For each $j = 1, \dots, n$, let

$$ m_j = \min_{i=1,\dots,n} \; \min_{u \in A_i} p_{ij}(u). \tag{1} $$

For all states $i$ and $j$ and possible actions $u \in A_i$, let

$$ \tilde{p}_{ij}(u) = \frac{p_{ij}(u) - m_j}{1 - \sum_{k=1}^{n} m_k}. \tag{2} $$
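The definitions in (1) and (2) can be tried out numerically. The following is only a minimal sketch, assuming every state has the same number of actions so that the transition probabilities fit in one array `P[i, u, j]`. The function name `transform_transitions` and the random test data are illustrative and not part of the exam.

```python
import numpy as np

def transform_transitions(P):
    """Apply (1) and (2) to an array P with P[i, u, j] = p_ij(u).

    Assumes (hypothetically) that all states share the same action set,
    so P has shape (n_states, n_actions, n_states).
    """
    # m_j: smallest probability of entering state j over all (i, u) pairs.
    m = P.min(axis=(0, 1))
    scale = 1.0 - m.sum()
    if scale <= 0.0:
        # Happens only when every row of P is identical; (2) is then undefined.
        raise ValueError("sum of m_j equals 1, transformation undefined")
    # Subtract m_j from every entry in column j, then renormalize.
    P_tilde = (P - m) / scale
    return P_tilde, m

# Illustrative data: 4 states, 3 actions, strictly positive probabilities.
rng = np.random.default_rng(0)
P = rng.random((4, 3, 4)) + 0.1
P /= P.sum(axis=2, keepdims=True)

P_tilde, m = transform_transitions(P)
row_sums = P_tilde.sum(axis=2)   # each entry should be 1 up to rounding
```

Under these assumptions, every row of `P_tilde` sums to one and no entry is negative, which is the claim of part a. Each column of `P_tilde` also contains at least one zero, because the minimum over $i$ and $u$ has been subtracted out.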

a. Show that $\tilde{p}_{ij}(u)$ are transition probabilities.
b. Consider the discounted problem with cost per stage $g(i, u)$, discount factor $\alpha\left(1 - \sum_{j=1}^{n} m_j\right)$, and transition probabilities $\tilde{p}_{ij}(u)$. Show that this problem has the same optimal policies as the original and that its optimal cost vector $\tilde{J}$ satisfies

$$ J^* = \tilde{J} + \frac{\alpha \sum_{j=1}^{n} m_j \tilde{J}(j)}{1 - \alpha}\, e, \tag{3} $$

where $J^*$ is the optimal cost vector of the original problem and $e$ is the unit vector.
c. What is the advantage of using the transformed MDP?
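Part b can be sanity-checked numerically before attempting a proof. The sketch below runs plain value iteration on a small random problem and on its transformed version, then compares the two sides of relation (3), read here as $J^* = \tilde{J} + \frac{\alpha \sum_j m_j \tilde{J}(j)}{1-\alpha} e$. It assumes that $e$ is the vector of all ones and that every state has the same number of actions. The helper `value_iteration`, the problem sizes, and the random data are illustrative and not taken from the exam.

```python
import numpy as np

def value_iteration(P, g, alpha, tol=1e-10, max_iter=100_000):
    """Value iteration for a discounted cost-minimization MDP.

    P[i, u, j] = p_ij(u), g[i, u] = cost per stage.
    Returns the approximate optimal cost vector and a greedy policy.
    """
    J = np.zeros(P.shape[0])
    for _ in range(max_iter):
        Q = g + alpha * (P @ J)        # Q[i, u] = g(i,u) + alpha * sum_j p_ij(u) J(j)
        J_new = Q.min(axis=1)
        converged = np.max(np.abs(J_new - J)) < tol
        J = J_new
        if converged:
            break
    Q = g + alpha * (P @ J)
    return J, Q.argmin(axis=1)

# Illustrative random problem (assumed sizes, not from the exam).
rng = np.random.default_rng(1)
n, n_actions, alpha = 5, 3, 0.9
P = rng.random((n, n_actions, n)) + 0.1
P /= P.sum(axis=2, keepdims=True)
g = rng.random((n, n_actions))

# Transformed problem: definitions (1), (2) and the reduced discount factor.
m = P.min(axis=(0, 1))
P_tilde = (P - m) / (1.0 - m.sum())
alpha_tilde = alpha * (1.0 - m.sum())

J_star, policy_star = value_iteration(P, g, alpha)
J_tilde, policy_tilde = value_iteration(P_tilde, g, alpha_tilde)

# Right-hand side of (3): J_tilde shifted by the same constant in every state.
shift = alpha * (m @ J_tilde) / (1.0 - alpha)
rhs = J_tilde + shift
```

On this example `J_star` and `rhs` should agree up to the iteration tolerance, and the two greedy policies should coincide. A numerical match on one instance is not a proof, only a check that the relation has been read correctly.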
Solutions:
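Fix $u$ and $i$. We have

$$ \sum_{j=1}^{n} \tilde{p}_{ij}(u) = \sum_{j=1}^{n} \frac{p_{ij}(u) - m_j}{1 - \sum_{k=1}^{n} m_k} = \frac{1}{1 - \sum_{k=1}^{n} m_k} \left(1 - \sum_{j=1}^{n} m_j\right) = 1. \tag{4} $$

Since $\tilde{p}_{ij}(u) \ge 0$, for all $i$,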
