Questions tagged [mdp]

Spring provides a JMS abstraction framework that simplifies the use of the JMS API and shields the user from differences between the JMS 1.0.2 and 1.1 APIs. Spring also offers a way to create message-driven POJOs (MDPs) that does not tie the user to an EJB container.

Ref: http://docs.spring.io/spring/docs/2.5.6/reference/jms.html

28 questions
3
votes
2 answers

State value and state action values with policy - Bellman equation with policy

I am just getting started with deep reinforcement learning and I am trying to grasp this concept. I have this deterministic Bellman equation. When I implement stochasticity from the MDP, I get 2.6a. My equation is this. Is this assumption correct? I saw…
2
votes
1 answer

What is the difference between model and policy w.r.t reinforcement learning

Both definitions seem to state that they are mappings from states to actions, so what is the difference, or am I wrong?
vaibhav
  • 158
  • 1
  • 12
2
votes
1 answer

PyBrain Q-Learning maze example. State values and the global policy

I am trying out the PyBrain maze example. My setup is: envmatrix = [[...]] env = Maze(envmatrix, (1, 8)) task = MDPMazeTask(env) table = ActionValueTable(states_nr, actions_nr) table.initialize(0.) learner = Q() agent = LearningAgent(table,…
Boris Mocialov
  • 3,439
  • 2
  • 28
  • 55
2
votes
3 answers

Spring message listener / MANUAL acknowledge

I know this sounds like something heard 1000 times, but I don't think it is, and I could not really find a solution: with a common EJB I can use acknowledge mode to acknowledge() a message manually. If I don't, it is redelivered. I did this in the past and it…
user5101998
  • 209
  • 3
  • 9
2
votes
2 answers

Message Scheduling/Consumption in JMS based on Defined Time

We are using IBM WebSphere MQ as the JMS provider with Spring MDPs (Message Driven POJOs). Is there any way in JMS to configure time-related properties on a message so that it can be consumed only at a particular defined time? For example, if I…
Narendra Verma
  • 2,372
  • 6
  • 34
  • 61
1
vote
2 answers

Python returning two identical matrices

I am trying to write a small program for a Markov Decision Process (an inventory problem) using Python. I cannot figure out why the program outputs two identical matrices (for the profit and decision matrices). The program itself has some problems too…
Chris
  • 95
  • 5
1
vote
0 answers

Is I-POMDP (Interactive POMDP) NEXP-complete?

I know that the Dec-POMDP (Decentralized-POMDP) is NEXP-complete for finite time steps, but I wanted to know whether the I-POMDP is also NEXP-complete or not! If not, then what's the complexity of I-POMDP? I did some research about it, but…
1
vote
1 answer

MDP & Reinforcement Learning - Convergence Comparison of VI, PI and QLearning Algorithms

I have implemented the VI (Value Iteration), PI (Policy Iteration), and QLearning algorithms in Python. After comparing the results, I noticed something. The VI and PI algorithms converge to the same utilities and policies. With the same parameters, QLearning…
1
vote
1 answer

What is the meaning of Values row in POMDP?

I am studying the POMDP file format, following this and many other links. I have understood everything except what the Value in the second row of the file stands for. Its values are Reward or Cost. I can't find the answer elsewhere. Getting…
Oskars
  • 407
  • 4
  • 24
1
vote
0 answers

Java process with Spring Message Driven POJOs required a restart after a while to consume messages from MQ

I have a Java (1.7) process which uses Spring MDPs (Spring 4.2.3 JMS framework) to read and process messages from WebSphere MQ 8.1 queues. It worked well in production without issues for several weeks, but recently stopped consuming messages…
1
vote
1 answer

When to use Policy Iteration instead of Value Iteration

I'm currently studying dynamic programming solutions to Markov Decision Processes. I feel like I've got a decent grip on VI and PI and the motivation for PI is pretty clear to me (converging on the correct state utilities seems like unnecessary work…
kylejmcintyre
  • 1,898
  • 2
  • 17
  • 19
1
vote
1 answer

How do I configure a Spring message listener (MDP) to have one instance across a cluster

I have a spring message listener configured with
Chip
  • 1,439
  • 3
  • 15
  • 29
1
vote
0 answers

Spring MDP not Consuming Message

I am implementing a Spring MDP plus JmsTemplate to send and receive messages. The send mechanism is working fine; however, the MDP is not getting invoked. I tried testing it via a plain receiver and was able to retrieve the message, but not via…
Amol Aranke
  • 51
  • 1
  • 5
1
vote
2 answers

Reinforcement Learning without Successor State

I'm attempting to pose a problem as a reinforcement learning problem. My difficulty is that the state which an agent is in changes randomly. They must simply choose an action within the state they are in. I want to learn appropriate actions for all…
Michael Anslow
  • 397
  • 3
  • 12