Skill chaining

Skill chaining is a general method for skill discovery and acquisition in continuous reinforcement learning domains. It incrementally builds a skill tree that allows an agent to reach a solution state from any of its start states by executing a sequence (or chain) of acquired skills. Skill chaining can adaptively break a task that is too complex to be solved monolithically into a sequence of subtasks that can each be solved efficiently; the agent then learns to sequence the resulting skills to solve the original problem.
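The chaining loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `learn_option`, the 1-D toy environment, and the fixed reachability window are hypothetical stand-ins for the option learning and initiation-set classification steps of the actual algorithm.

```python
# Sketch of skill chaining: build skills backwards from the goal,
# making each new skill's target the previous skill's initiation set,
# until every start state is covered by some skill.

def chain_skills(start_states, goal_test, learn_option, max_skills=10):
    """Build a chain of skills from the goal back toward the start states.

    learn_option(target_test) is assumed to return (policy, initiation_test):
    a policy that drives the agent into states satisfying target_test, and a
    predicate for the states from which that policy reliably succeeds.
    """
    chain = []
    target_test = goal_test
    for _ in range(max_skills):
        policy, initiation_test = learn_option(target_test)
        chain.append((policy, initiation_test, target_test))
        # The next skill must terminate where this one can begin.
        target_test = initiation_test
        if all(target_test(s) for s in start_states):
            break  # every start state lies in some skill's initiation set
    return list(reversed(chain))  # ordered for execution: start -> goal


def make_learn_option(window=3):
    """Toy option learner on the integer line 0..10 (a hypothetical stand-in)."""
    def learn_option(target_test):
        # Leftmost state satisfying the target; assume the option can reach
        # it from at most `window` states away (its "initiation set").
        boundary = min(s for s in range(11) if target_test(s))
        policy = lambda s: s + 1  # always step right toward the target
        initiation = lambda s, b=boundary: b - window <= s < b
        return policy, initiation
    return learn_option


chain = chain_skills(
    start_states=[0],
    goal_test=lambda s: s >= 10,
    learn_option=make_learn_option(),
)
# Four skills are needed to cover states 0..10 in windows of width 3.
```

In this toy run the agent cannot reach the goal (state 10) from its start state (state 0) with a single option, so the chain grows backwards until the first skill's initiation set contains the start state.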
References

Konidaris, George; Barto, Andrew (2009). "Skill discovery in continuous reinforcement learning domains using skill chaining". Advances in Neural Information Processing Systems 22.
RELATED ARTICLES

Machine Learning Methods & Algorithms: Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data.

Reinforcement learning: Reinforcement learning is an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Because of its generality, the problem is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, statistics, and genetic algorithms.

Constructing skill trees (CST): Constructing skill trees is a hierarchical reinforcement learning algorithm that builds skill trees from a set of sample solution trajectories obtained from demonstration. CST uses an incremental MAP (maximum a posteriori) change-point detection algorithm to segment each demonstration trajectory into skills and integrate the results into a skill tree.

Error-driven learning: Error-driven learning is a sub-area of machine learning concerned with how an agent ought to take actions in an environment so as to minimize some error feedback. It is a type of reinforcement learning.

Evolutionary multimodal optimization: In applied mathematics, multimodal optimization deals with optimization tasks that involve finding all or most of the multiple solutions, as opposed to a single best solution.

HEXQ: An open problem in reinforcement learning is discovering hierarchical structure. HEXQ is an algorithm that automatically attempts to decompose and solve a model-free factored MDP hierarchically. By searching for aliased Markov sub-space regions based on the state variables, the algorithm uses temporal and state abstraction to construct a hierarchy of interlinked smaller MDPs.

Learning automata: A branch of the theory of adaptive control is devoted to learning automata, surveyed by Narendra and Thathachar (1974), which were originally described explicitly as finite state automata. Learning automata select their current action based on past experiences from the environment.

Monte Carlo method: Monte Carlo methods (or Monte Carlo experiments) are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results, running simulations many times over to estimate probabilities heuristically.

Q-learning: Q-learning is a model-free reinforcement learning technique. Specifically, Q-learning can be used to find an optimal action-selection policy for any given (finite) Markov decision process (MDP). It works by learning an action-value function that gives the expected utility of taking a given action in a given state and following the optimal policy thereafter.

SARSA (State-Action-Reward-State-Action): SARSA is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning. It was introduced in a technical note where the alternative name SARSA was only mentioned as a footnote.

Temporal difference (TD) learning: Temporal difference learning is a prediction method, mostly used for solving the reinforcement learning problem. TD learning is a combination of Monte Carlo ideas and dynamic programming (DP) ideas; it resembles a Monte Carlo method because it learns by sampling the environment according to some policy.
About

Entered by: Roger Yau
NodeID: #306209
Node type: Category
Entry date (GMT): 12/11/2013 5:11:00 PM
Last edit date (GMT): 12/11/2013 5:29:00 PM