1. DESCRIPTION OF TECHNOLOGY AREA
Computing infrastructures have undergone major changes in the last 20 years. On the one hand, this revolution has been driven by advances in technology and, on the other, by the ever-increasing demand for services from users, whether individuals or companies.
Less than 20 years ago, only large companies needed large data centres or storage. These centres were mainly private, dedicated exclusively to running the company’s own tasks. Resource use was also inefficient: tasks were generally assigned to individual machines, which remained reserved whether or not the task used all their resources. This was a problem even for organisations that did not need a full data centre but still required a certain amount of computing power, since the infrastructure was a complicated but unavoidable expense if a given service was to be offered. Then came virtualisation technologies.
Virtualisation reduced the investment and expenses in infrastructure and computing resources. First, it increased resource-use efficiency, since the virtual machine, not the physical one, became the unit of computation. This made it possible to execute many more tasks with the same physical resources. Secondly, it reduced costs for all companies requiring infrastructure, both in capital expenditure, since fewer physical machines were needed, and in operations, since the maintenance and energy consumption costs of these machines were lower. And then the cloud arrived.
Cloud computing was a turning point in everything related to computing infrastructure and, above all, it led to its democratisation. Large cloud infrastructure providers began to appear, allowing individuals and small, medium and large companies to use computing resources without having to purchase them, renting them instead and adjusting them to their demand at any given moment. This made it possible for these providers to amortise the infrastructure they had acquired, and for their customers to avoid the large outlay previously required to acquire computing resources, which were also often sized to cover peak consumption rather than typical load.
In addition, cloud computing has led to the emergence of different sub-models, according to what the end customer rents. The main derived models are Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). These have in turn led to other approaches, such as BigData as a Service (BDaaS) or Function as a Service (FaaS), among others.
In parallel, another revolution was taking place—that of data. The explosion in the computing capacity to which companies suddenly had access made it easier for them to start storing more data and analysing it. This was later complemented by the arrival of sensor networks that increased the amount of data available and fed back into the need for computing capacity to be able to analyse said data.
Successive technological advances, such as the arrival of the Internet of Things (IoT), the boom in Artificial Intelligence and data analysis or, more recently, Distributed Ledger Technologies (DLTs) and Blockchain, led by the Bitcoin network, have only continued to feed back into this cycle, either indirectly, by providing more data, or directly, by increasing the need for computing. This has contributed to the creation of ever larger and more complex computing infrastructures. Without them, it would be impossible to conceive or manage the technological revolution we are living in today.
The technology area is divided into four main branches:
- Cloud Computing.
- Edge Computing.
- High Performance Computing (HPC).
- Distributed Ledger Technologies.
Below, we will explore some of the most relevant aspects and applications of each of them.
CLOUD COMPUTING
Cloud computing has made it easy for any user, whether a company or an individual, to access a virtually unlimited amount of computing resources. However, democratising these resources takes more than access alone. Not every user has the ability to manage, configure or manipulate virtual machines in an online environment, and the tools the user wants to run on this infrastructure still have to be installed and configured. This is why cloud computing includes multiple paradigms, depending on the degree of control or depth sought. The three most common paradigms are:
- Infrastructure as a Service (IaaS): The user rents the infrastructure directly: virtual machines or empty containers over which the user has full control. This also implies greater complexity and difficulty of use, as it involves selecting and maintaining operating systems, security, scaling and even network configuration.
- Platform as a Service (PaaS): This level removes some of the previous complexity, offering a platform on which the user can install certain applications and configure services, without having to deal with architecture management issues such as scaling or operating system maintenance.
- Software as a Service (SaaS): At this level, the user interacts directly with the software being used. There are no added complexities or configurations beyond possible customisations. Everyday examples of SaaS include email services such as Gmail and file management services such as Dropbox.
However, cloud technologies have gone further. In addition to offering different granularities at the service level, they offer different granularities at the level of computing capacity and execution. This means not only virtual machines of different sizes, but also lighter elements such as containers (Docker, for example), which are designed to run individual services or applications; or even lighter elements geared towards serving individual requests, which has come to be known as Function as a Service (FaaS). In the latter case, the service dispatches requests as they come in and executes them in containers that contain just enough to serve them, usually statelessly. These containers are created expressly to execute these requests, and their life span is usually only as long as it takes to serve the request.
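The FaaS model described above can be illustrated with a minimal sketch. It does not reproduce any real provider's API: the `HANDLERS` registry, the `function` decorator, the `resize` handler and the `dispatch` function are all hypothetical names chosen for illustration. The point is only that each request carries everything the function needs and that nothing survives between calls.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry: maps a function name to a stateless handler.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {}


def function(name: str):
    """Register a handler under a name (illustrative, not a real platform API)."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register


@function("resize")
def resize(event: Dict[str, Any]) -> Dict[str, Any]:
    # All inputs arrive in the event; the handler keeps no state between calls.
    scale = event["scale"]
    return {
        "width": int(event["width"] * scale),
        "height": int(event["height"] * scale),
    }


def dispatch(name: str, payload: str) -> str:
    """Simulate the platform: decode the request, run the handler, return the result.

    In a real FaaS platform this step would also create a short-lived
    container and discard it once the request has been served.
    """
    event = json.loads(payload)
    result = HANDLERS[name](event)
    return json.dumps(result)
```

A call such as `dispatch("resize", '{"width": 800, "height": 600, "scale": 0.5}')` returns a JSON string with the scaled dimensions, and a second call has no way of seeing anything left behind by the first.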
Cloud computing applications are very varied. Beyond the most obvious applications exemplified by SaaS services, the ability to deploy additional resources in the cloud allows many companies to meet peak demand for their services flexibly and at a low cost. One example is a small business that, at times of high demand such as Christmas, can rent extra resources to serve its online store or, if the store is already hosted in the cloud, scale it automatically according to demand. A better-known example is Netflix, which decided to move the bulk of its infrastructure to the cloud to obtain additional benefits such as auto-scaling, multi-region redundancy and, of course, cost reduction.
However, the cloud also offers complex services that make life easier for professionals in many sectors. One example of this is BigData as a Service platforms, such as Radiatus, developed at ITI. Radiatus offers a wide variety of services related to data analysis, machine learning and Artificial Intelligence in general. The users of this service are mainly data analysts and data scientists who, in this way, avoid the complexity of deploying complete stacks of analysis tools, working out their dependencies and dealing with other complications of the installation process. With platforms like Radiatus, users only have to choose which tools they need, wait for them to be deployed and start working.
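As a rough illustration of scaling according to demand, the sketch below computes how many service replicas to run for a given request rate. The function name, the per-replica capacity and the replica limits are assumptions made for this example, not values from any particular cloud provider.

```python
import math


def desired_replicas(requests_per_second: float,
                     capacity_per_replica: float,
                     min_replicas: int = 1,
                     max_replicas: int = 20) -> int:
    """Return the number of replicas needed to absorb the current load.

    capacity_per_replica is the (assumed) number of requests per second
    a single replica can serve. The result is clamped to the allowed range.
    """
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    needed = math.ceil(requests_per_second / capacity_per_replica)
    return max(min_replicas, min(max_replicas, needed))


# An ordinary day versus a seasonal peak, with 50 req/s per replica assumed.
quiet_day = desired_replicas(40, 50)
christmas_peak = desired_replicas(420, 50)
```

With these assumed figures a quiet day needs a single replica, while the peak needs nine, and the extra eight are released again once demand falls.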
EDGE COMPUTING
Edge computing has emerged in response to a need in many sectors of industry, as well as a result of the exponential increase in the amount of data, multimedia or otherwise, available today. A common architecture in many systems was to have sensor nodes or data capture devices that simply forwarded their data to large computing centres for processing or storage. The data was received in the corresponding centre, processed and, if necessary, a response was sent. This was feasible because the volume of data to be sent was low, because the data could generally be processed quickly, or because the latency associated with sending the data to a data centre was not relevant.
However, the explosion of multimedia data in the last few years has invalidated at least two of these conditions. In many cases, the volume of data to be sent is no longer trivial, especially where video streams are involved. Moreover, processing this type of data is often neither trivial nor fast. What is more, with the increasingly frequent outsourcing of services to the cloud and the implementation of complementary and additional services, there is a growing need to obtain near-real-time responses, either because of their criticality or because users are waiting for them.
In response to this problem, the concept of Edge Computing emerged. Edge computing consists of either adding intermediate nodes close to where the data is generated, or providing computing resources to the devices themselves. The goal is to pre-process, filter or aggregate the data, for example, in order to reduce the bandwidth used and speed up the task executed in the remote data centre; or to execute certain more critical tasks directly at the edge, reducing response latency and improving availability. Two example use cases are autonomous vehicles and clinical analysis devices.
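The pre-processing idea can be sketched as follows: an edge node collapses a window of raw sensor readings into a small summary and forwards only that summary, plus any readings that exceed an alert threshold. The function name, the summary fields and the threshold are hypothetical choices for illustration.

```python
from statistics import mean
from typing import Dict, List


def summarise_window(readings: List[float], alert_threshold: float) -> Dict[str, object]:
    """Aggregate a window of raw readings into one compact message.

    Instead of sending every reading to the remote data centre, the edge
    node sends a summary and keeps only the values above alert_threshold.
    """
    if not readings:
        raise ValueError("readings must not be empty")
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
        "alerts": [r for r in readings if r > alert_threshold],
    }


# Four raw temperature readings become a single message to upload.
summary = summarise_window([20.0, 21.0, 22.0, 95.0], alert_threshold=90.0)
```

In a real deployment the window would typically hold hundreds or thousands of readings, so the saving in bandwidth grows with the sampling rate while the data centre still receives the anomalous values in full.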
In the first case, the autonomous vehicle, we have a device, the vehicle itself, that can generate huge amounts of data daily. Consider the number and variety of sensors such a vehicle can carry: cameras, accelerometers, sensors for each of the car’s circuits, distance sensors, etc. The data generated in a day could be in the order of terabytes, mainly due to the video cameras. This amount of data cannot be continuously transmitted to remote data centres, for three reasons. First, it would not be possible to provide the bandwidth needed to cover the communication needs of the whole vehicle fleet once the technology becomes established. Secondly, sending this information and processing it remotely would add unacceptable latency to decision-making. Imagine, for example, a vehicle that needs to decide, in a fraction of a second, what to do to avoid hitting a pedestrian or to avoid a collision. Thirdly, and as a more extreme version of the second reason, the number of vehicles sending data could saturate the data centre, leading to task queuing (unacceptable for critical tasks) or even to service outages. For this reason, autonomous vehicles will be mobile computing centres, where any critical operation is resolved in real time in the vehicle itself, while secondary tasks, such as consumption models and history-based calculations, can be sent to the cloud.
The case of medical devices is similar. Many of these devices, such as a CT scanner, are machines with a limited amount of resources and functions. The reason is that adding more functionality to the device increases the possibility of errors. Because this machinery may perform critical or life-affecting tasks, such errors would not be tolerable, so the systems chosen were austere but reliable in their functionality. However, the possibility of running additional functionality not on the machine itself but on a node deployed in the same medical centre (if there is some kind of latency constraint), or even in the cloud (if there are no timing or criticality requirements), is now being studied or adopted. In this way, the machine remains just as safe, while added value and functionality can be extended almost without limit.
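The bandwidth argument can be made concrete with a short calculation. The figure of 1 TB per vehicle per day and the fleet size of 100,000 vehicles are assumptions chosen only to illustrate the order of magnitude.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_mbps(bytes_per_day: float) -> float:
    """Average uplink rate, in Mbit/s, needed to upload bytes_per_day continuously."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


def fleet_gbps(bytes_per_day: float, vehicles: int) -> float:
    """Aggregate rate, in Gbit/s, for a whole fleet uploading at the same time."""
    return sustained_mbps(bytes_per_day) * vehicles / 1e3


# Assumed figures: 1 TB (1e12 bytes) per vehicle per day, 100,000 vehicles.
per_vehicle = sustained_mbps(1e12)       # roughly 92.6 Mbit/s, around the clock
whole_fleet = fleet_gbps(1e12, 100_000)  # roughly 9,259 Gbit/s in aggregate
```

Under these assumptions every vehicle would need a permanent uplink of over 90 Mbit/s, and the receiving side would have to absorb several terabits per second, which is why the raw data has to be processed in the vehicle.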
HIGH PERFORMANCE COMPUTING (HPC)
HPC systems, and interest in them, are experiencing a second awakening thanks, in part, to the boom in data analysis and BigData. As mentioned, one of the great challenges today is to efficiently analyse the immense amount of data available and to generate models based on this data to solve particular needs or use cases. The complexity of training these models, and of the specific operations needed to evaluate them, makes access to clusters of high-performance processors, such as GPUs (Graphics Processing Units), desirable for data analysts because of the increased efficiency and processing power of these devices.
The main use case for these systems continues to be the development of Artificial Intelligence techniques. A common benchmark for this progress continues to be a machine’s ability to play highly complex games and beat the corresponding human champions. Numerous examples have been published, from the well-known case of chess to the more recent achievements of Google’s DeepMind team, which defeated champions of Go, a traditional Asian board game, and then moved into computer games such as StarCraft. In 2019, AlphaStar, DeepMind’s system for StarCraft 2, comfortably beat professional players of this game. To achieve this, the latest evolution of the system, based on deep neural networks, was trained intensively by playing against itself for a week, analysing data and results and accumulating the learning equivalent of 200 years of gameplay. Surprisingly, the games against human players were not won by exploiting a higher rate of actions per minute, which might be expected to be its main weapon. In fact, the human players achieved higher rates of actions per minute. The victories were simply due to better decisions and strategy. AlphaStar had learned to play better than humans.
Another example of using GPUs for data analysis is the study of dense tissue segmentation in high-resolution medical imaging, carried out as part of the breast cancer study conducted here at ITI. Previously, this type of study was performed by radiologists examining these medical image sets. The main problem is that this process was slow, arduous and limited by the number of specialists available. For this reason, the introduction of systems that automate the analysis of dense breast tissue using convolutional neural networks has been a great advance, thanks to the simplicity with which these systems can be trained, the time they save and the possibility of reaching levels of effectiveness similar to those of a human specialist.
DISTRIBUTED LEDGER TECHNOLOGIES
Distributed Ledger Technologies (DLTs) have made a strong breakthrough in the technology landscape in the last decade, led by Bitcoin, the most popular of the cryptocurrencies, and its Blockchain network. Many things have been said about Blockchain, a particular case of DLT: that it will ruin the world economy, that it will replace banks, or that it is just a vehicle for speculation. However, beyond all speculation, the truth is that more and more fully valid use cases arise every day, both in the private and corporate world and among the general public.
Bitcoin is the best-known case for the general public, but there are also other technologies such as Ethereum, which has allowed a wide variety of applications to be deployed on its network thanks to the progress made in the development of Smart Contracts. These applications range from trivial collectible games, such as CryptoKitties, where users can buy their own completely unique virtual kitten, to contracts between individuals or companies with the guarantee that neither party can violate the contract or cheat. In fact, this is the main virtue of DLTs: transactions are immutable and non-repudiable, and the system is secure and allows parties who do not trust each other to make transactions without the need for a third party, such as a bank. Smart Contracts add the possibility of executing chains of actions safely and deterministically, so that the same input always produces the same result.
From a private point of view, DLT network deployments in different business environments are becoming more and more frequent thanks mainly to these characteristics, immutability and non-repudiation, and the added value that these represent in a commercial relationship between parties that do not necessarily trust each other. Interest in the technology is growing, from relatively open initiatives like Quorum to others like Hyperledger Fabric, also open but driven by companies like IBM.
One of the most important problems that this technology had to overcome was how to ensure consensus and trust between the parties involved. This sounds feasible in private systems, where the parties have to know each other and even grant each other access, but what about an open system like Bitcoin or other blockchain networks? These networks rely on distributed systems and their computing power to provide their features. Taking Bitcoin as an example, the first security measure is that the account book, ledger or transaction log is not held by a single user but by all the users in the network, so security is provided by scale. However, the information has to be consistent, and that is where consensus comes in. The Proof of Work algorithm and cryptography provide a system in which, once a transaction is stored in the blockchain, modifying it is practically impossible: changing a single transaction would require recomputing the block that contains it and every block after it, and the computational effort would be infeasible.
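The following minimal sketch shows why tampering with a stored transaction breaks the chain. It is a simplified illustration, not Bitcoin's actual block format or difficulty rule: the block fields, the `DIFFICULTY` constant (leading zero hex digits) and all function names are assumptions made for this example.

```python
import hashlib
import json
from typing import Dict, List

DIFFICULTY = 3  # assumed: number of leading zero hex digits a block hash must have


def block_hash(block: Dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def mine(index: int, data: str, prev_hash: str) -> Dict:
    """Proof of work: search for a nonce whose block hash meets the difficulty."""
    nonce = 0
    while True:
        block = {"index": index, "data": data, "prev_hash": prev_hash, "nonce": nonce}
        if block_hash(block).startswith("0" * DIFFICULTY):
            return block
        nonce += 1


def build_chain(entries: List[str]) -> List[Dict]:
    """Mine one block per entry, each linked to the hash of the previous block."""
    chain: List[Dict] = []
    prev = "0" * 64  # placeholder hash for the first block
    for i, data in enumerate(entries):
        block = mine(i, data, prev)
        chain.append(block)
        prev = block_hash(block)
    return chain


def is_valid(chain: List[Dict]) -> bool:
    """Check every block's proof of work and its link to the previous block."""
    for i, block in enumerate(chain):
        if not block_hash(block).startswith("0" * DIFFICULTY):
            return False
        if i > 0 and block["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


chain = build_chain(["a pays b 5", "b pays c 2", "c pays a 1"])
tampered = [dict(b) for b in chain]
tampered[0]["data"] = "a pays b 500"  # rewrite history in the first block
```

Here `is_valid(chain)` holds, while `is_valid(tampered)` fails: the altered block no longer matches the hash recorded by its successor, so an attacker would have to re-mine that block and every later one, and on a real network do so faster than the honest participants extend the chain.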