Faculty  Science and Engineering
Year  2021/22
Course code  WBAI01505
Course name  Reinforcement Learning Practical
Level(s)  bachelor
Language of instruction  English
Period  semester I b
ECTS  5
Schedule  rooster.rug.nl
Full course name  Reinforcement Learning Practical
Learning objectives  This course introduces the student to most of the concepts underlying the field of Reinforcement Learning, with a particular focus on model-free Reinforcement Learning. At the end of the course, the student will be familiar with:
  - The mathematical framework used in Reinforcement Learning (lecture 1)
  - The exploration-exploitation dilemma (lecture 2)
  - The development of Dynamic Programming algorithms (lecture 3)
  - The creation of tabular model-free algorithms (lecture 4)
  - The use of function approximators (lecture 5)
  - The main challenges of Reinforcement Learning (lecture 6)

Description  Reinforcement Learning (RL) is the branch of machine learning that aims to teach agents how to interact with an environment through trial and error. This interaction is usually modeled as a Markov Decision Process in which the goal of the agent, sometimes called the learner, is to maximize a reward signal. RL is widely considered more challenging than other machine learning approaches, such as the arguably more popular supervised learning, because the agent receives no external supervision and can rely only on its own experience while learning. In this course, we will see how to train such agents by characterizing RL algorithms from both a theoretical and a practical perspective. To this end, six theoretical lectures will be given, and part of their content will have to be put into practice in two assignments and a final project. The lectures take place every Monday from 9:00 to 11:00 and are followed by a computer practical from 15:00 to 17:00.
Hours per week  
Teaching method  Lecture (LC), Practical work (PRC)
Assessment
Assignment (AST), Report (R)
(The final grade is based on three components: i) Assignment 1 (25%), a coding assignment related to lecture 2; ii) Assignment 2 (25%), in which students solve simple mathematical problems related to lectures 3 and 4; iii) Final Project (50%), an RL project of the student's choice. Students may work alone or in groups of at most two.)
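As an illustration of the weighting above, the following minimal sketch combines the three component grades into a final grade. Only the weights (25%, 25%, 50%) come from the course description; the function name, the dictionary keys, and the example grades are hypothetical, and any rounding or pass/fail rules the course may apply are not stated here and are not modeled.

```python
# Weights taken from the assessment description above.
# The key names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "assignment_1": 0.25,
    "assignment_2": 0.25,
    "final_project": 0.50,
}


def final_grade(grades):
    """Return the weighted average of the three component grades.

    `grades` maps each key in WEIGHTS to a numeric grade.
    No rounding is applied, since the source does not specify any.
    """
    missing = set(WEIGHTS) - set(grades)
    if missing:
        raise ValueError(f"missing grades: {sorted(missing)}")
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


# Example with made-up grades: 0.25 * 7.0 + 0.25 * 8.0 + 0.50 * 6.5
print(final_grade({"assignment_1": 7.0, "assignment_2": 8.0, "final_project": 6.5}))
```

Because the final project carries half of the weight, a one-point change in the project grade moves the final grade by 0.5, twice the effect of the same change in either assignment.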

Course type  bachelor
Coordinator  M. Sabatelli, MSc.
Lecturer(s)  N. Orzan, MSc., M. Sabatelli, MSc.
Required literature


Prerequisites  Mandatory: Autonomous Systems (WBAI00205), Imperative Programming (WBAI00305). If the mandatory requirements are not met, only the Board of Examiners of the AI BSc may grant an exemption. Exchange students are assumed to have arranged this through their Learning Agreement, and premaster's students through the Board of Admissions. Other external students are assessed case by case.

Remarks  This course unit has a capacity limit. More information about capacity-limit courses can be found here. This course has an intended limit of 80 participants. Artificial Intelligence (BSc) is a Fixed Quota (Numerus Fixus) programme. As a consequence, its courses (course code WBAI) are closed to students who are not registered in the AI BSc programme, unless the course is part of the mandatory curriculum of their own programme. If you wish to take this course in your minor, or as part of a so-called ‘unofficial’ premaster’s, please use the official procedure through the Board of Examiners form.

Included in
