New R&D department at Code Poets - origins & first project
A place to learn, develop, and escape burnout. New challenges guaranteed
Having people ready to code is one of our goals. The second is to ensure that these people are prepared to work on demanding tech projects. This is especially true for blockchain and AI technologies.
Honing the tech skills
It's not easy to find experienced blockchain specialists, but there are already quite a few skilled developers who are seriously interested in the field. After several months of coding smart contracts as part of R&D tasks, such people are ready to work on, and take leadership of, future blockchain projects for clients. And when a project is over, they can go back to soaking up more technology and doing cool things instead of waiting out the break on a bench.
Future-proof R&D - a place to grow
Our R&D department was built from scratch. First, we established the rules and thought about what projects we would like to run there. We originally wanted to start with just me and one new person in a 1-on-1 squad, but decided it was worth practicing teamwork on a larger scale right away.
We invited both new people and those already working at the company. Developers from Code Poets could declare that they wanted to spend some of their contracted time on R&D.
The ultimate goals for starters were:
- having teams ready to take action and leadership roles in projects,
- creating a place where devs can learn technologies used now and in the future,
- providing interesting challenges for people waiting on the bench for their next project.
As it turned out, the idea of creating an R&D department hit the jackpot. The first completed project is already behind us.
We started with a bang by entering the world of RL (Reinforcement Learning).
#1 R&D project: Paper Soccer - working on RL
The R&D department started by forming a team of three, and we began with an RL project - a fascinating branch of AI and ML. Our first challenge was to create a simple two-player game: Paper Soccer. Simple, yet complicated enough that it was impossible to build a good AI out of conditional "if" statements alone. The space of possibilities was large enough to justify using RL in conjunction with neural networks. But let's start with some definitions.
What is RL (Reinforcement Learning)
RL is an area of machine learning concerned with maximizing cumulative reward. Learning happens through interaction with an environment: the agent learns from the consequences of its actions, by trial and error. It judges whether an action is worth taking based on past experience, balancing exploitation of what it already knows against exploration of new actions. The goal is the chain of actions that brings the maximum cumulative reward over time. These mechanics place RL somewhere between supervised and unsupervised learning, though it is usually treated as a paradigm of its own.
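The trial-and-error loop described above can be sketched with classic tabular Q-learning, the simplest member of the family DQN belongs to. Everything here is illustrative, not our project code: a toy agent walks a 1-D corridor of five cells and is rewarded for reaching the rightmost one.

```python
import random

N_STATES = 5          # corridor cells 0..4; the reward sits at cell 4
ACTIONS = [1, -1]     # step right or step left
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

# Q-table: the agent's running estimate of each (state, action) value
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Environment dynamics: move, clamp to the corridor, reward at the end."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy: explore occasionally, otherwise exploit the Q-table
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        nxt, reward, done = step(state, action)
        # Q-learning update: nudge the estimate toward
        # (immediate reward + discounted best value of the next state)
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = nxt

# After training, the greedy policy walks right from every non-terminal cell
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

DQN replaces the Q-table with a neural network, which is what makes the approach scale to state spaces like a Paper Soccer board, where a table would be astronomically large.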
RL and… What else did we use in the project?
In DeepMind's publication in Nature, the authors proposed modifications to the DQN algorithm. DQN stands for Deep Q-Network, a method that applies deep neural networks to the classic Q-learning algorithm. With these modifications, they were able to teach an AI to play a series of Atari games with outcomes better than any other algorithm, and better than skilled human players. We decided to apply this approach to our project.
The combination of RL with Deep Learning seemed to be the key, most thrilling element of our version of the game-playing algorithm.
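One of the modifications that makes DQN trainable in practice is experience replay: transitions are stored in a fixed-size buffer and sampled uniformly for training, which breaks the correlation between consecutive game states. A minimal sketch, with illustrative sizes and field names rather than our actual project code:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # deque with maxlen: the oldest transitions fall off automatically
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling decorrelates consecutive transitions,
        # which otherwise destabilizes gradient updates.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

# Fill the buffer with a dummy 50-step episode, then draw a minibatch
buf = ReplayBuffer(capacity=1000)
for t in range(50):
    buf.push(state=t, action=t % 2, reward=float(t == 49),
             next_state=t + 1, done=t == 49)

batch = buf.sample(8)
print(len(buf), len(batch))  # 50 8
```

The training loop then fits the network on such minibatches instead of on the game as it unfolds.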
#1 R&D project: Paper Soccer - challenges and achievements
You can see the effect of our work here: Paper Soccer.
At this stage of project development:
- We implemented the CPU opponents (random and brute-force) needed for simulation and learning.
- We trained our DQN network against a random opponent, stopping once the network beat such an opponent more than 90% of the time. Training takes about 30 minutes on an NVIDIA GeForce RTX 2070 Super GPU.
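A win rate like the one above can be measured with an evaluation loop of roughly this shape. `play_game` here is a stand-in stub, since the real Paper Soccer match logic lives in the project; only the surrounding loop is the point:

```python
import random

def play_game(net_policy, rng):
    # Stub for demonstration: in the real evaluation, this pits the
    # trained network against a random opponent and returns True on a win.
    return rng.random() < 0.9

def win_rate(net_policy, n_games=1000, seed=0):
    """Fraction of games won over a fixed number of evaluation matches."""
    rng = random.Random(seed)  # fixed seed makes evaluations repeatable
    wins = sum(play_game(net_policy, rng) for _ in range(n_games))
    return wins / n_games

rate = win_rate(net_policy=None)
print(rate)
```

Running enough games matters: with only a handful of matches, a 90% estimate carries a wide error bar.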
Even more valuable than the end result are the challenges we identified along the way. And there were quite a few of them.
- The cleverness of our algorithm
Our algorithm learns to outplay a random player easily, but we have not yet managed to teach it to beat a brute-force player in a reasonable time, even at a search depth of 1.
- Model selection
We based our solution on the model from the DeepMind publication, which used convolutional networks, but in our case fully connected layers proved to work better. It is still hard, though, to judge which model will really be best for a relatively small amount of input data.
- Representation of the state of the game
The game board consists of 9x13 nodes; the ball may or may not stand on a given node, and edges may extend from each node in eight directions. The third dimension of the board representation can therefore be designed in many ways, scalar or vector.
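To make this concrete, here is one possible encoding, an illustrative choice rather than necessarily the one we settled on: a binary feature vector per node, with eight channels for drawn edges and a ninth for the ball position.

```python
ROWS, COLS = 13, 9
N_DIRECTIONS = 8                 # N, NE, E, SE, S, SW, W, NW
N_CHANNELS = N_DIRECTIONS + 1    # + one plane marking the ball position

def empty_board():
    """ROWS x COLS grid, each node holding an N_CHANNELS binary vector."""
    return [[[0] * N_CHANNELS for _ in range(COLS)] for _ in range(ROWS)]

def set_edge(board, row, col, direction):
    board[row][col][direction] = 1

def set_ball(board, row, col):
    # Clear the ball plane everywhere, then mark the new position.
    for r in range(ROWS):
        for c in range(COLS):
            board[r][c][N_DIRECTIONS] = 0
    board[row][col][N_DIRECTIONS] = 1

board = empty_board()
set_ball(board, 6, 4)      # ball starts near the center
set_edge(board, 6, 4, 2)   # say, an edge drawn eastwards

# Flattened, this feeds a fully connected network:
flat = [x for row in board for node in row for x in node]
print(len(flat))  # 13 * 9 * 9 = 1053 inputs
```

Other designs are possible, e.g. separate planes per player's moves, or a scalar edge count per node; part of the challenge is that each choice changes what patterns the network can easily pick up.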
- Optimizing the game for two
The publication we relied on is about teaching AI to play single-player games, while we are dealing with a two-player game. We are currently experimenting with making the network play against itself, but so far with little success.
The fourth challenge - adapting the solution to two players - seems to be the toughest one. Most likely, we will look into solutions based on MCTS (Monte Carlo Tree Search), which DeepMind also used in a subsequent Nature publication. That is the one in which they showed how, with a combination of MCTS, supervised learning, and RL, they created an AI playing Go that managed to defeat the world champion. And Go is no joke! Regarded as one of the most complex games in the world, it required a sophisticated, carefully nurtured algorithm to beat the champion.
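At the heart of MCTS sits a selection rule, commonly UCT (UCB1 applied to trees), that decides which move to simulate next by balancing moves that have won often against moves that have barely been tried. A small sketch with made-up statistics:

```python
import math

def uct_score(child_wins, child_visits, parent_visits, c=1.41):
    """UCB1 score: exploitation (win rate) + exploration bonus."""
    if child_visits == 0:
        return float("inf")  # always try unvisited moves first
    exploit = child_wins / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

# Three candidate moves with (wins, visits) tallies after 100 simulations.
# Move "a" has the best win rate, but "c" is promising AND under-explored,
# so the exploration bonus can put it ahead.
children = {"a": (60, 79), "b": (2, 10), "c": (6, 11)}
parent_visits = 100

best = max(children, key=lambda m: uct_score(*children[m], parent_visits))
print(best)
```

Because this rule only needs game simulations, not a reward signal per move, it sidesteps some of the two-player difficulties we hit with plain DQN self-play.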
Have we succeeded? And what's next? Conclusions and afterthoughts
The first project showed that RL is a broad field with great room for achievement. At the same time, we still have a lot to learn. We're excited by how much we've already discovered through a seemingly simple task, and we dream of tackling more and more complicated cases in the future.
The R&D department will keep growing. We began with blockchain and machine learning, and soon we also plan to branch out into cheminformatics. We have people on board with PhDs in computer and life sciences, and they are a valuable resource.
More new technologies mean more opportunities to work on exciting projects. We are convinced that involvement in R&D will result not only in the development of the team but also in acquiring new partnerships and clients.
We try to combine scientific, business, and strictly technological competencies. Thanks to the R&D department, we will be able to develop even faster and fuel our people's imagination. And they have a lot of it.
If you think you may fit in with our super-creative team of coding scientists, contact us for career information.