How to write a pseudo-algorithm with algorithm2e package inside a tcolorbox environment?
I've recently seen, in the second edition of Reinforcement Learning: An Introduction by Sutton and Barto, an appealing way to display pseudo-algorithms. Here is an example image.



[image: algo]



I think the environment is made with the tcolorbox package, and I should be able to produce something similar. However, I'd like to put my pseudo-algorithm in an algorithm environment from the algorithm2e package. Is there a way to mix the two? Ideally, the background color, the box, and the caption would look like the image, while the internal structure would come from algorithm2e's algorithm environment.



This is the code of what I've done so far:



\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[ruled,longend]{algorithm2e}
\usepackage{textgreek}
\usepackage{amssymb}

\begin{document}
\begin{algorithm}
Algorithm parameters: step size $\alpha \in (0, 1]$, small $\epsilon > 0$\;
Initialize $Q(s, a)$, for all $s \in \mathcal{S}^+, a \in \mathcal{A}(s)$, arbitrarily except that $Q(\mathrm{terminal}, \cdot) = 0$\;

\ForEach{episode}{
  Initialize $S$\;
  \ForEach{step of episode}{
    Choose $A$ from $S$ using policy derived from $Q$ (e.g., \textepsilon-greedy)\;
    Take action $A$, observe $R$, $S'$\;
    $Q(S, A) \leftarrow Q(S, A) + \alpha [R + \gamma \max_a Q(S', a) - Q(S, A)]$\;
    $S \leftarrow S'$\;
  }
}
\caption{Q-learning (off-policy TD control) for estimating $\pi \approx \pi_*$}
\end{algorithm}
\end{document}


This is the result:



[image: algo2]



Basically, it's only the algorithm part. I don't know where to start in changing its appearance: the documentation offers very few commands for the visual style, and none of them seems helpful.
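For reference, the restyling commands I did find only switch among a few fixed frame styles, none of which adds a background color. A minimal sketch (boxruled is one of the styles listed in the manual):

\documentclass{article}
\usepackage[longend]{algorithm2e}

\begin{document}
% boxruled draws a frame and a ruled caption bar, but algorithm2e has no
% key for a background color, which is what the book's layout needs
\RestyleAlgo{boxruled}
\begin{algorithm}
Initialize $S$\;
\caption{Styled only with algorithm2e's built-in boxruled style}
\end{algorithm}
\end{document}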










tcolorbox algorithm2e

edited Nov 23 at 19:56
asked Nov 23 at 19:13
gvgramazio

  • Obviously, it is done using tcolorbox.
    – Bernard
    Nov 23 at 19:14










  • @Bernard. I don't have the source; that's why I only said that I think it's done with tcolorbox.
    – gvgramazio
    Nov 23 at 19:17

2 Answers

I missed the part of the algorithm2e manual where it is stated that the option H makes the environment non-floating, so it can be put inside a tcolorbox environment.



This is the updated code:



\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[longend]{algorithm2e}
\usepackage{textgreek}
\usepackage{amssymb}
\usepackage{tcolorbox}

\begin{document}
\begin{tcolorbox}[fonttitle=\bfseries, title=Q-learning (off-policy TD control) for estimating $\pi \approx \pi_*$]
\begin{algorithm}[H]
Algorithm parameters: step size $\alpha \in (0, 1]$, small $\epsilon > 0$\;
Initialize $Q(s, a)$, for all $s \in \mathcal{S}^+, a \in \mathcal{A}(s)$, arbitrarily except that $Q(\mathrm{terminal}, \cdot) = 0$\;

\ForEach{episode}{
  Initialize $S$\;
  \ForEach{step of episode}{
    Choose $A$ from $S$ using policy derived from $Q$ (e.g., \textepsilon-greedy)\;
    Take action $A$, observe $R$, $S'$\;
    $Q(S, A) \leftarrow Q(S, A) + \alpha [R + \gamma \max_a Q(S', a) - Q(S, A)]$\;
    $S \leftarrow S'$\;
  }
}
\end{algorithm}
\end{tcolorbox}
\end{document}


This is the visual result:



[image: algo3]
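
To get closer to the gray look of the book, the box can be recolored with standard tcolorbox keys. A sketch using the same preamble as above (the specific values are guesses at the book's colors, not taken from its source):

\begin{tcolorbox}[colback=black!5,   % light gray background (guessed shade)
                  colframe=black,    % thin black frame
                  boxrule=0.4pt,
                  sharp corners,
                  fonttitle=\bfseries,
                  title=Q-learning (off-policy TD control) for estimating $\pi \approx \pi_*$]
\begin{algorithm}[H]
  % same algorithm body as in the example above
  Initialize $S$\;
\end{algorithm}
\end{tcolorbox}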





Usually I don't post self-answered questions. This was a genuine question, but I found the answer five minutes after I posted it. Since it could be helpful to others, I will leave it here.






answered Nov 23 at 19:32
gvgramazio





















  • You may prefer to change the title to reflect the integration of tcolorbox into algorithm2e.
    – Diaa
    Nov 23 at 19:41






  • @Diaa. Done. Feel free to change it again if you have a better title.
    – gvgramazio
    Nov 23 at 19:57

This is more a long comment than an answer; the answer was already provided by gvgramazio.



I just propose to integrate the tcolorbox and algorithm environments into a new myalgorithm environment. Using the before upper and after upper options, the algorithm environment can easily be placed inside a tcolorbox. The new environment takes the title as its second parameter; the first parameter is optional and can be used to customize the box:



\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[longend]{algorithm2e}
\usepackage{textgreek}
\usepackage{amssymb}
\usepackage{tcolorbox}

\newtcolorbox{myalgorithm}[2][]{%
  fonttitle=\bfseries,
  title=#2,
  #1,
  before upper={\begin{algorithm}[H]},
  after upper={\end{algorithm}}
}

\begin{document}
\begin{myalgorithm}{Q-learning (off-policy TD control) for estimating $\pi \approx \pi_*$}
Algorithm parameters: step size $\alpha \in (0, 1]$, small $\epsilon > 0$\;
Initialize $Q(s, a)$, for all $s \in \mathcal{S}^+, a \in \mathcal{A}(s)$, arbitrarily except that $Q(\mathrm{terminal}, \cdot) = 0$\;

\ForEach{episode}{
  Initialize $S$\;
  \ForEach{step of episode}{
    Choose $A$ from $S$ using policy derived from $Q$ (e.g., \textepsilon-greedy)\;
    Take action $A$, observe $R$, $S'$\;
    $Q(S, A) \leftarrow Q(S, A) + \alpha [R + \gamma \max_a Q(S', a) - Q(S, A)]$\;
    $S \leftarrow S'$\;
  }
}
\end{myalgorithm}

\begin{myalgorithm}[sharp corners, colback=cyan!15]{Q-learning (off-policy TD control) for estimating $\pi \approx \pi_*$}
Algorithm parameters: step size $\alpha \in (0, 1]$, small $\epsilon > 0$\;
Initialize $Q(s, a)$, for all $s \in \mathcal{S}^+, a \in \mathcal{A}(s)$, arbitrarily except that $Q(\mathrm{terminal}, \cdot) = 0$\;

\ForEach{episode}{
  Initialize $S$\;
  \ForEach{step of episode}{
    Choose $A$ from $S$ using policy derived from $Q$ (e.g., \textepsilon-greedy)\;
    Take action $A$, observe $R$, $S'$\;
    $Q(S, A) \leftarrow Q(S, A) + \alpha [R + \gamma \max_a Q(S', a) - Q(S, A)]$\;
    $S \leftarrow S'$\;
  }
}
\end{myalgorithm}
\end{document}


[image: the default and the customized myalgorithm boxes]
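
One thing this wrapper loses compared to algorithm2e's ruled style is the numbered caption. If numbering is wanted, tcolorbox's auto counter initialization option can restore it. A sketch (this numbered variant, including the name mynumberedalgorithm, is an addition for illustration, not part of the answer above):

% tcolorbox allocates a counter for the environment and exposes its
% value as \thetcbcounter, which can be used inside the title
\newtcolorbox[auto counter]{mynumberedalgorithm}[2][]{%
  fonttitle=\bfseries,
  title={Algorithm \thetcbcounter: #2},
  #1,
  before upper={\begin{algorithm}[H]},
  after upper={\end{algorithm}}
}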






answered Nov 26 at 16:26
Ignasi