Reinforcement Learning Preview in Azure ML

November 24, 2021

One of the basic tenets of behavioral psychology is learning through consequences. When a behavior is followed by a reward, it is reinforced and becomes more likely to be repeated. When a behavior is followed by punishment, it is discouraged and becomes less likely to be repeated. Humans have learned this way since childhood, throughout history.

Imagine if machines or equipment could decide whether to repeat an action based on its consequences. In fact, there is no need to imagine: reinforcement learning, a branch of machine learning, lets machines learn in exactly this way.

Reinforcement Learning (RL) is now available in Microsoft Azure, a world leader in hosting and cloud services. Azure brings Cognitive Services and Artificial Intelligence within the reach of every developer and data scientist, who can embed the ability to see, hear, speak, search, understand, and accelerate advanced decision-making into any app.

As stated above, RL employs a system of rewards and penalties to compel the computer to solve a problem by itself. While other machine learning techniques learn from labeled or unlabeled data, RL learns through interaction with an environment, whether real or simulated. The learner, called an agent, works out an optimal strategy from experience. Agents adapt to change through exploration and can be trained to actively make decisions and learn from the outcomes.
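The interaction loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not Azure ML code: the five-cell corridor environment, the reward values, and the function names (`step`, `train`, `greedy_policy`) are all assumptions made up for this example, and tabular Q-learning is just one of many RL algorithms.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells.
# The agent starts in cell 0 and is rewarded for reaching cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01  # small penalty for every wasted move
    return next_state, reward, done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore occasionally, otherwise take the best known action.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action for each non-terminal cell according to the Q-table."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Nobody tells the agent that "right" is the correct direction. It only sees rewards and penalties, and the policy of always moving right emerges from experience.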

As the computer seeks to maximize reward, it often finds unexpected ways of doing so. It can learn how best to accomplish its goal, usually without human supervision, which makes it extremely powerful and very flexible. RL pushes the agent to be creative, especially when there is no “proper way” to perform a task, while the agent still learns that there are rules it must follow to perform its duties correctly.

Reinforcement Learning has already made a huge impact in the fields of robotics, self-driving cars, and gaming. It is also starting to transform the work of many enterprise customers, who are applying RL to optimization problems in manufacturing processes, logistics, business strategy planning, and other human-driven decision making.

Many industries employ robots for industrial automation. Robotic manipulation is more efficient, and robots can also perform tasks that would be dangerous for people. Examples include aircraft control and robot motion control.

In Natural Language Processing (NLP), RL can be used in text summarization, question answering, and machine translation, as well as in dialogue generation. Deep RL can be used to model future rewards in a chatbot dialogue: the machine is rewarded for sequences that show important conversational attributes such as coherence, informativity, and ease of answering.

In healthcare, patients can receive treatment based on policies learned by RL systems. RL can find optimal policies from previous experience without requiring a prior mathematical model of the biological system, which makes it more widely applicable than other control-based approaches in healthcare.

The most popular use of RL in business is in recommender systems. Netflix has publicly announced that it uses RL, among other machine learning algorithms, to recommend shows and films to its users. Spotify has also acknowledged using multi-armed bandits, a type of RL algorithm, to manage the trade-off between exploitation and exploration in its track and artist recommendations.
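To make the exploration and exploitation trade-off concrete, here is a minimal epsilon-greedy multi-armed bandit in Python. It is a sketch under stated assumptions, not a description of how Spotify or Netflix actually build their systems: the function name `run_bandit`, the three "arms", and their click-through rates are hypothetical values chosen for illustration.

```python
import random


def run_bandit(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit.

    true_rates: hypothetical probability that each arm (for example, a
    recommended track) earns a reward of 1. The agent never sees these
    directly; it only observes the rewards it receives.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms    # how often each arm was chosen
    values = [0.0] * n_arms  # running average reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random arm
        else:
            arm = max(range(n_arms), key=lambda i: values[i])  # exploit
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the average reward for the chosen arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


if __name__ == "__main__":
    counts, values = run_bandit([0.1, 0.5, 0.9])
    print("pulls per arm:", counts)
    print("estimated rates:", [round(v, 2) for v in values])
```

The `epsilon` parameter controls the balance: a higher value spends more pulls trying uncertain options, while a lower value sticks more firmly with whatever currently looks best.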

RL has also been applied to traffic signal control. In a study of a reinforcement learning-based multi-agent system for network traffic signal control, researchers designed a traffic light controller to address congestion. Although tested only in a simulated environment, their methods outperformed traditional methods and shed light on the potential uses of multi-agent RL in designing traffic systems.

RL can also be applied in the chemical industry to optimize chemical reactions. A study titled Optimizing Chemical Reactions with Deep Reinforcement Learning showed that its model outperformed a state-of-the-art algorithm and generalized to dissimilar underlying mechanisms. This application demonstrates how RL can reduce time-consuming trial-and-error work in a relatively stable environment.

RL can also be used for business strategy planning, finance, and trading. There are many machine learning implementations for stock trading, and RL algorithms in particular. An RL agent can decide whether to hold, buy, or sell, and the model is evaluated against market benchmarks to check how well it is performing. In April 2019, J.P. Morgan announced that it had started using Deep Neural Network for Algo Execution (DNA) to boost its FX trading algorithms.

RL also has applications in bidding, marketing, and advertising. Researchers from Alibaba Group worked on “Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising”. To handle a large number of advertisers, they cluster the advertisers and assign each cluster a strategic bidding agent. To balance the trade-off between competition and cooperation among advertisers, they propose Distributed Coordinated Multi-Agent Bidding (DCMAB).

Reinforcement learning is a trailblazing technology with the potential to transform our world. It is one of the most creative and innovative areas of machine learning, and it may well be the next step in AI development. Keiji Kanazawa, Microsoft principal program manager, believes that Azure is most valuable for customers who are doing large-scale trial and error.

Reinforcement learning studies how an agent can learn to achieve goals in a complex, uncertain environment. RL is very general, encompassing many problems that involve making a sequence of decisions. It can even be applied to supervised learning problems with sequential or structured outputs. RL algorithms have started to achieve good results in many difficult environments and can therefore be applied in a wide variety of settings.

What processes and structures in your company can benefit from reinforcement learning? What optimization problems in manufacturing processes, logistical optimization, or business strategy planning in your company can be addressed through reinforcement learning?

If you're ready to explore machine learning and AI with reinforcement learning, reach out today. We’re here to help you move your technology forward.
Cameron Vetter
My name is Cameron Vetter and I'm the Principal Architect at the Octavian Technology Group.
Microsoft MVP. Cloud. Enterprise Architecture. Machine Learning. Mixed Reality.