Training a specific AgentFlow SubTaskOption

Is there an example showing how to train a specific option policy? For example, in the AgentFlow tutorial, how can we set up training for ExamplePolicy? The problem is that the output of step in main_loop may have a different observation_spec and action_spec than the policy.

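import dm_env
import numpy as np

# All other names (the Example* classes, ProxyEnvironment, print_logger,
# subtask_logger, subtask, loop_ops, sequence) are assumed to be defined or
# imported as in the AgentFlow tutorial.

# Stubs for pulling observation and sending action to some external system.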
observation_cb = ExampleObservationUpdater()
action_cb = ExampleActionSender()

# Create an environment that forwards the observation and action calls.
env = ProxyEnvironment(observation_cb, action_cb)

# Stub policy that runs the desired agent.
policy = ExamplePolicy(action_cb.action_spec(), "agent")

# Wrap policy into an agent that logs to the terminal.
task = ExampleSubTask(env.observation_spec(), action_cb.action_spec(), 10)
logger = print_logger.PrintLogger()
aggregator = subtask_logger.EpisodeReturnAggregator()
logging_observer = subtask_logger.SubTaskLogger(logger, aggregator)
agent = subtask.SubTaskOption(task, policy, [logging_observer])

reset_op = ExampleScriptedOption(action_cb.action_spec(), "reset", 3)
main_loop = loop_ops.Repeat(5, sequence.Sequence([reset_op, agent]))

# Run the episode.
timestep = env.reset()
while True:
  action = main_loop.step(timestep)
  timestep = env.step(action)

  # Terminate if the environment or main_loop requests it.
  if timestep.last() or (main_loop.pterm(timestep) > np.random.rand()):
    if not timestep.last():
      termination_timestep = timestep._replace(step_type=dm_env.StepType.LAST)
      main_loop.step(termination_timestep)
    break

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 9

Top GitHub Comments

jangirrishabh commented, Oct 6, 2022

Thank you @araju! That answers my question for a single learnable agent. For thread reference: I used af.SubTaskObserver to stack observations and passed the observer object to the main loop, popping observations from it whenever new observations are added for the learnable option.
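
For readers landing on this thread, the buffering pattern described above can be sketched roughly as follows. This is a minimal, standalone illustration and not code from the issue: BufferingObserver, Transition, FakeTimeStep and pop_all are hypothetical names, and the step() argument list (parent timestep and action, then agent timestep and action) is an assumption about the SubTaskObserver interface that should be checked against your dm_robotics version. In a real setup the class would subclass SubTaskObserver and be passed in the observer list of subtask.SubTaskOption, next to logging_observer.

```python
import collections
from typing import Any, Deque, List, NamedTuple, Optional


class Transition(NamedTuple):
  """One agent-side step, expressed in the learnable policy's own specs."""
  observation: Any
  action: Any
  reward: Optional[float]
  is_last: bool


class FakeTimeStep(NamedTuple):
  """Stand-in for dm_env.TimeStep so the sketch runs without dm_env."""
  observation: Any
  reward: Optional[float]
  is_last: bool

  def last(self) -> bool:
    return self.is_last


class BufferingObserver:
  """Hypothetical observer that records agent-side transitions.

  Assumption: the option calls step() once per step with both the
  environment-side (parent) and policy-side (agent) views.
  """

  def __init__(self) -> None:
    self._buffer: Deque[Transition] = collections.deque()

  def step(self, parent_timestep, parent_action, agent_timestep,
           agent_action) -> None:
    # Store only the agent-side view, which matches the policy's specs.
    self._buffer.append(
        Transition(agent_timestep.observation, agent_action,
                   agent_timestep.reward, agent_timestep.last()))

  def pop_all(self) -> List[Transition]:
    # Hand everything collected so far to the learner and empty the buffer.
    items = list(self._buffer)
    self._buffer.clear()
    return items


# Simulate two steps of the learnable option with made-up values.
observer = BufferingObserver()
observer.step(FakeTimeStep({"full": [0, 1, 2]}, None, False), [0.0, 0.0],
              FakeTimeStep({"agent": [0]}, None, False), [0.1])
observer.step(FakeTimeStep({"full": [3, 4, 5]}, 1.0, True), [0.5, 0.5],
              FakeTimeStep({"agent": [1]}, 1.0, True), [0.2])
batch = observer.pop_all()
```

The main loop would call pop_all() after each main_loop.step(timestep). The list is empty while another option, such as the scripted reset, is running, and non-empty only when the learnable option has stepped, so the learner never sees timesteps with the outer environment's specs.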

araju commented, Oct 7, 2022

Glad that worked! Closing the issue.
