Intrinsically Motivated Multi-Goal Reinforcement Learning Using Robotics Environment Integrated with OpenAI Gym

Intrinsically Motivated Multi-Goal Reinforcement Learning Using Robotics Environment Integrated with OpenAI Gym

Authors

  • Sivasubramanian Balasubramanian SASTRA Deemed University; Ex- Product Manager, Revalize, Real World Analytics

DOI:

https://doi.org/10.55662/JST.2023.4502

Downloads

Additional Files

Keywords:

Artificial Intelligence, Product Management, Data Science, Reinforcement Learning, Deep Learning, Multi-GoalReinforcement Learning, Robotics

Abstract

Sparse reward is one of the most challenging problems in reinforcement learning (RL). Hindsight Experience Replay (HER) attempts to address this issue by converting a failed experience to a successful one by relabelling the goals. In open-ended and changing environments, agents face a wide range of potential tasks that might not come with associated reward functions. Such autonomous learning agents must set their own tasks and build their own curriculum through an intrinsically motivated exploration. Because some tasks might prove easy and some impossible, agents must actively select which task to practice at any given moment, to maximize their overall mastery on the set of learnable tasks. The purpose of this technical report is two-fold. First, it introduces a suite of challenging continuous control tasks (integrated with OpenAI Gym) based on currently existing robotics hardware. The tasks include pushing, sliding and pick & place with a Fetch robotic arm as well as in-hand object manipulation with a Shadow Dexterous Hand. All tasks have sparse binary rewards and follow a Multi-Goal Reinforcement Learning (RL) framework in which an agent is told what to do using an additional input.

The second part of the paper presents a set of concrete research ideas for improving RL algorithms, most of which are related to Multi-Goal RL and Hindsight Experience Replay. The Fetch environments are based on the 7-DoF Fetch robotics arm,2 which has a two-fingered parallel gripper. Agents focus on achievable tasks first and focus back on tasks that are being forgotten. Experiments conducted in a new multi-task multi-goal robotic environment show that our algorithm benefits from these two ideas and demonstrate properties of robustness to distracting tasks, forgetting and changes in body properties

Downloads

Download data is not yet available.

References

Andrychowicz, M., Wolski, F., Ray, A., Schneider, J., Fong, R., Welinder, P., McGrew, B., Tobin, J., Abbeel, O. P., and Zaremba, W. (2017). Hindsight experience replay. In Advances in Neural Information Processing Systems, pages 5055-5065.

Bellemare, M. G., Dabney, W., and Munos, R. (2017). A distributional perspective on reinforcement learning. arXiv preprint arXiv:1707.06887.

Brockman,G.,Cheung,V.,Pettersson,L.,Schneider,J.,Schulman,J.,Tang,J.,andZaremba,W.(2016). Openai gym. arXiv preprint arXiv:1606.01540.

Dhariwal, P., Hesse, C., Klimov, O., Nichol, A., Plappert, M., Radford, A., Schulman, J., Sidor, S., and Wu, Y. (2017). OpenAI Baselines.

Florensa, C., Held, D., Wulfmeier, M., and Abbeel, P. (2017). Reverse curriculum generation for reinforcement learning. arXiv preprint arXiv:1707.05300.

Gu, S., Lillicrap, T., Ghahramani, Z., Turner, R. E., Schölkopf, B., and Levine, S. (2017). Interpolated policy gradient: Merging on-policy and off-policy gradient estimation for deep reinforcement learning. arXiv preprint arXiv:1706.00387.

He, F. S., Liu, Y., Schwing, A. G., and Peng, J. (2016). Learning to play in a day: Faster deep reinforcement learning by optimality tightening. arXiv preprint arXiv:1611.01606.

Kingma, D. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. A Novel Approach for Color Image, Steganography using NUBASI and Randomized, Secret Sharing Algorithm. Indian Journal Science and Technology, 8(S7), 228-235.

https://doi.org/10.17485/ijst/2015/v8iS7/64275

Downloads

Additional Files

Published

17-11-2023
Citation Metrics
DOI: 10.55662/JST.2023.4502
Published: 17-11-2023

How to Cite

Balasubramanian, S. “Intrinsically Motivated Multi-Goal Reinforcement Learning Using Robotics Environment Integrated With OpenAI Gym”. Journal of Science & Technology, vol. 4, no. 5, Nov. 2023, pp. 46-60, doi:10.55662/JST.2023.4502.
PlumX Metrics

Plaudit

License Terms

Ownership and Licensing:

Authors of this research paper submitted to the Journal of Science & Technology retain the copyright of their work while granting the journal certain rights. Authors maintain ownership of the copyright and have granted the journal a right of first publication. Simultaneously, authors agreed to license their research papers under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.

License Permissions:

Under the CC BY-NC-SA 4.0 License, others are permitted to share and adapt the work, as long as proper attribution is given to the authors and acknowledgement is made of the initial publication in the Journal of Science & Technology. This license allows for the broad dissemination and utilization of research papers.

Additional Distribution Arrangements:

Authors are free to enter into separate contractual arrangements for the non-exclusive distribution of the journal's published version of the work. This may include posting the work to institutional repositories, publishing it in journals or books, or other forms of dissemination. In such cases, authors are requested to acknowledge the initial publication of the work in the Journal of Science & Technology.

Online Posting:

Authors are encouraged to share their work online, including in institutional repositories, disciplinary repositories, or on their personal websites. This permission applies both prior to and during the submission process to the Journal of Science & Technology. Online sharing enhances the visibility and accessibility of the research papers.

Responsibility and Liability:

Authors are responsible for ensuring that their research papers do not infringe upon the copyright, privacy, or other rights of any third party. The Journal of Science & Technology and The Science Brigade Publishers disclaim any liability or responsibility for any copyright infringement or violation of third-party rights in the research papers.

Loading...