VIFIB DECENTRALIZED CLOUD COMPUTING

SlapOS is a decentralized Cloud Computing technology that can automate the deployment and configuration of applications in a heterogeneous environment.

Introduction

${WebPage_insertTableOfReferences}

Goal and scope

The goal of this document is to introduce the architecture of OSTV NMS so that an engineer can understand its purpose, its different components, and the reference documents that explain its installation and operation. OSTV NMS is designed to operate 4G/5G networks based on the Amarisoft stack. It derives from SlapOS [RD], a general purpose framework designed as an overlay for POSIX systems that automates the management of any type of service (VRAN, CDN, etc.) on a distributed infrastructure. This document is not a manual.

Executive Summary

The purpose of a Network Management System (NMS) is to automate all aspects of the management of a telecommunication network, from the initial deployment of base stations with eNodeB VRAN software to the management of user subscriptions. A Network Management System (NMS) should thus automate the deployment of telecommunication software, the provisioning of telecommunication services, the provisioning of user accounts, the monitoring of telecommunication services, accounting, billing, etc. A Network Management System should be able to detect and, ideally, predict that a telecommunication service is failing or that users are experiencing degraded service quality.

A Network Management System (NMS) is a fairly large and complex piece of application software.

It is the central tool for a telecommunication operator to enforce common service level policies across all services of the telecommunication network. It is the central tool to manage the quality of service delivered to users, by knowing what happens, where and when, before users even report it.

OSTV NMS

The pictures below show OSTV NMS in action.

Base Station Geographic Heat Map

The main screen displays a geographical map which shows in green the base stations that are running fine, in red those that are down, and in orange those that are at risk of going down. It differentiates two cases: the base station hardware ("Computer") and the base station services ("Partitions", e.g. Amarisoft enb) running on that hardware. Whenever a base station is down, the left side of its rectangle becomes red. Whenever a service running on the base station encounters an exception, the right side of its rectangle becomes red. In both cases, a ticket is generated to notify network administrators of an issue that they should solve.

List of Base Stations running eNodeB service

The main screen can also be displayed as a list of servers, each of them running various services (enb, CDN, etc.) that can be filtered or sorted. Each base station displays its monitoring health using the same color conventions as the heat map and provides direct access to the monitoring tool.

VRAN Service Monitoring

List of Running Process for a VRAN service

In order to fix an issue, network administrators can access logs on the base station through a graphical user interface. This interface displays the usage of resources (memory, CPU, etc.) as well as the state of so-called "promises" that are configured to enforce the network service quality policy. A simple promise defines, for example, that "the LTE network should be available". A more complex promise could be that "the LTE network should provide at least 5 Mbps to each SIM card more than 95% of the time". The notion of "promise" is essential to managing service quality and must be configured or adapted for each telecommunication company.

By configuring promises step by step in response to the resolution of incidents managed by issue tracking tickets, a telecommunication company can fulfill the goal of providing the level of service that its users expect, and of being informed of incidents before anyone reports them.
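To make the notion of promise concrete, here is a minimal sketch of the simplest form a promise can take: an executable check whose exit code tells SlapOS Node whether the promise is kept (0) or broken (non-zero). The host, port and timeout below are illustrative values, not part of any actual OSTV NMS configuration.

    #!/usr/bin/env python3
    # Minimal sketch of a promise: an executable check that exits with 0
    # when the promise is kept and non-zero otherwise. Host, port and
    # timeout are illustrative values.
    import socket
    import sys

    ENB_HOST = "127.0.0.1"   # where the enb service is expected to listen
    ENB_PORT = 9001          # illustrative port
    TIMEOUT = 5              # seconds

    def main():
        try:
            # The promise is kept if the service accepts a TCP connection.
            with socket.create_connection((ENB_HOST, ENB_PORT), timeout=TIMEOUT):
                return 0
        except OSError as error:
            print("promise failed: %s" % error, file=sys.stderr)
            return 1

    if __name__ == "__main__":
        sys.exit(main())

More elaborate promises, such as the bandwidth promise quoted above, follow the same pattern with a more involved measurement.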

NMS = CCMS

The purpose of a Network Management System (NMS) is very similar to the purpose of a Cloud Computing Management System (CCMS). The equivalence table below compares the two:

Network vs. Cloud Computing Management
  Aspect | Network Management System | Cloud Management System
  Geography | Which base station is located where? | Which server is located where?
  Ownership | Which base station belongs to whom? | Which server belongs to whom?
  Hardware profile | Which RRH and device driver is available on which base station? | Which disk storage and device driver is available on which server?
  Software deployment | Which version of which software (enb, epc) is available on which base station? | Which version of which software (qemu, mariadb, apache) is available on which server?
  Service provisioning | How many enb services are running on each base station and what is their configuration? | How many VMs are running on each server and what is their configuration?
  Token provisioning | Which SIM cards can access the enb service on which base station? | Which X509 certificates are authorized by the apache service on which server?
  User management | Who can request and configure a new enb service on which base station? | Who can request and configure a new VM on which server?
  Project management | Which project can request and configure a new enb service on which base station? | Which project can request and configure a new VM on which server?
  Organisation management | Which company can request and configure a new enb service on which base station? | Which company can request and configure a new VM on which server?
  Monitoring | Which enb service or base station is down? | Which VM or server is down?
  Accounting | How much data has been consumed by which user or token on which enb during one month? | How much CPU has been consumed by which user or token on which VM during one month?
  Billing | How much does which user or token owe to which organisation? | How much does which user or token owe to which organisation?

This table demonstrates how it is possible to rely on an existing CCMS to build an NMS by replacing typical cloud computing software (qemu, mariadb, apache) with telecommunication software (Amarisoft), and by replacing the notion of server with the notion of base station. This equivalence makes complete sense for the deployment of VRAN on standard PCs, as is the case with the Amarisoft stack. It can also make sense with traditional telecommunication hardware, as long as a proxy service running on a standard PC provides access to and remote control of the traditional telecommunication equipment.

Moreover, relying on standard PCs is also the best approach to deploy the kind of Edge Computing architecture that is increasingly adopted by the telecommunication industry. It is becoming more and more frequent to deploy, next to each eNodeB, various types of network acceleration services: HTTPS CDN, IoT buffering gateway, media conversion service, etc. The reality of modern telecommunication networks and of decentralized cloud computing such as Edge Computing actually requires a solution that can unify both Network Management and Cloud Computing Management.

This is the approach we have chosen with OSTV NMS.

SlapOS

OSTV NMS is based on an existing Open Source / Free Software framework: SlapOS. Although the name "SlapOS" sounds like an operating system, it is not an operating system in the same sense as Linux, Windows, BSD, etc. SlapOS is a general purpose overlay for distributed infrastructures based on POSIX operating systems (Linux, xBSD, etc.), with a strong focus on service management.

In other words, SlapOS provides the kind of service management features that are missing in Unix (but that used to exist in old mainframe operating systems or in Plan9). Those service management features can be leveraged to create a Network Management System that meets all requirements.

SlapOS has been deployed successfully since 2009 in many companies (Airbus, SANEF, Mitsubishi, etc.) and to operate two public clouds: VIFIB and Teralab. VIFIB is one of the inventors of Edge Computing [RD]. Teralab has been awarded a "Silver Label" [RD] by Europe's Big Data Value Association (BDVA). It is interesting to note that Teralab chose SlapOS over other solutions (OpenStack, Proxmox, etc.) because it provided, at a much lower cost, more management features, more control over hardware and more resiliency, while consuming fewer resources.

Those characteristics (management, cost, resiliency, control of hardware, low footprint) are even more important in the case of telecommunication networks.

SlapOS Architecture for VRAN

SlapOS architecture is very simple. It is based on only two software components that can be installed on any GNU/Linux operating system: SlapOS Master and SlapOS Node. SlapOS Master is used to register each base station and define their target configuration. SlapOS Node is installed on each base station and ensures that its configuration matches the one defined in SlapOS Master. All other network management features derive from this simple architecture. 

SlapOS Master: Registry

SlapOS Master acts as a registry of the network management system. It keeps track of the:

  • list of base stations with metadata (geographical location, customer, project, network, etc.);
  • list of software and versions installed on each base station;
  • list of services running on each base station with configuration parameters;
  • list of issues open for each service or base station;
  • list of invoices for each customer;
  • list of users of SlapOS Master itself.

In a traditional NMS setup, SlapOS Master users are the system administrators of the network, with very simple access control rules.

In a less traditional setup, SlapOS Master is used by both the system administrators and the customers of the network. Access control rules are defined to limit the access of users to the services that they requested.

SlapOS Node: Base Station Conductor

SlapOS Node acts as the autonomous conductor of the base station: it orchestrates the different services running inside the base station based on the target configuration defined in SlapOS Master.

SlapOS Node will thus make sure that:

  • software and versions installed on the base station are the same as those defined on SlapOS Master for the base station;
  • services running in the base station and their configuration parameters are the same as those defined on SlapOS Master for the base station.

SlapOS Node will also inform SlapOS Master of:

  • connection parameters required to connect to services running on the base station;
  • health or status of running services.

Everything is a service

The key motto of SlapOS is that "everything is a service". Everything in SlapOS is thus achieved by defining and instantiating services. In the case of a Network Management System (NMS), the essential services are based on the Amarisoft LTE stack and implement the Virtual Radio Access Network (VRAN). But other services may also be required: SIM card management service, HTTPS acceleration service, etc.

Here are a few examples of services that are usually deployed on a SlapOS Node to implement a complete VRAN architecture:

  • LTE, which includes mme and ims;
  • standalone LTE mme;
  • standalone LTE ims;
  • SIM database;
  • HTTPS CDN.

Services in SlapOS are defined using the buildout [RD] build language and the Jinja2 [RD] templating language. By using buildout and Jinja2, services in SlapOS are defined in a way that is independent of any GNU/Linux distribution. It is thus possible to deploy SlapOS Node services on virtually any version of Debian, Ubuntu, CentOS, Fedora, Red Hat, SuSE, Arch, Yocto, Gentoo, etc. without any change to the service definition in buildout. It does not matter whether the GNU/Linux distribution is based on SysV init or systemd. It also does not matter whether the distribution is based on RPM, DEB, portage or pacman packages.
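As an illustration of the templating side, the sketch below renders a service configuration file from a Jinja2 template. The template text and parameter names are invented for this example and do not come from an actual Software Release.

    # Illustrative use of Jinja2 to generate a service configuration file.
    # The template content and parameters are invented for this sketch.
    from jinja2 import Template

    template = Template(
        "enb {\n"
        "  cell_id = {{ cell_id }};\n"
        "  bandwidth = {{ bandwidth }};\n"
        "}\n"
    )

    # In SlapOS, such parameters come from the Software Instance record
    # in SlapOS Master rather than being hard-coded.
    print(template.render(cell_id="0x01", bandwidth=10))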

Services in SlapOS actually consist of two different aspects:

  • a buildout file, also called Software Release, which defines how to build the software required to run a service;
  • a service record in SlapOS Master registry, also known as Software Instance, which defines all parameters needed to run an instance of a service.

An example of a Software Release for the Amarisoft LTE stack can be found here: https://lab.nexedi.com/nexedi/amarisoft/blob/master/slapos/software/lte/software.cfg [RD].

An example of a Software Release for an HTTPS accelerator can be found here: https://lab.nexedi.com/nexedi/slapos/blob/master/software/apache-frontend/software.cfg [RD].

If we compare the two aspects of a service to an object-oriented programming language, the Software Release plays the role of a class and the Software Instance plays the role of an object instance. Just as with object-oriented languages, a single class of service (Software Release) can be used to create multiple service instances (Software Instance). A single base station can thus run multiple mme service instances with different parameters. It is even possible to run mme service instances based on multiple versions of the Amarisoft MME software on the same base station.

A base station can thus be shared between different projects, different customers or different networks with different service goals. Yet, it is managed in a simple and consistent way by SlapOS Master registry which keeps a shared state of the network.
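As a sketch of how this class/instance relationship is used in practice, the snippet below requests two Software Instances of the same Software Release through the slapos.slap client library. The Master URL, certificate paths and instance parameters are illustrative, and the exact signatures may vary across SlapOS versions.

    # Sketch: two Software Instances ("objects") requested from a single
    # Software Release ("class") with different parameters. Master URL,
    # certificate paths and parameters are illustrative.
    import slapos.slap

    slap = slapos.slap.slap()
    slap.initializeConnection(
        "https://slapos.example.com",      # SlapOS Master (illustrative)
        key_file="/etc/opt/user.key",
        cert_file="/etc/opt/user.crt",
    )
    order = slap.registerOpenOrder()

    # Software Release URL, e.g. the Amarisoft LTE one referenced above.
    SR_URL = "https://lab.nexedi.com/nexedi/amarisoft/blob/master/slapos/software/lte/software.cfg"

    for name, parameter in (("mme-a", "value-a"), ("mme-b", "value-b")):
        order.request(
            SR_URL,
            partition_reference=name,                       # instance name
            partition_parameter_kw={"example": parameter},  # illustrative
        )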

SLAP: Polling from Node to Master

The Simple Language for Accounting and Provisioning (SLAP) communication protocol between SlapOS Node and SlapOS Master is a polling protocol with an adaptive polling frequency. It is SlapOS Node which queries SlapOS Master to collect its target state, and it is SlapOS Node which notifies SlapOS Master of its current state.

All communication is carried over HTTPS with X509 certificates allocated by SlapOS Master. One X509 certificate is allocated for each base station and for each service. Communication between SlapOS Node and SlapOS Master is thus encrypted and authenticated at a fine level of granularity. Also, by using HTTPS rather than a specific protocol, it is easy to deploy SlapOS Node in corporate networks which restrict most ports and protocols other than HTTP and HTTPS.

An interesting consequence of polling is that SlapOS Node can be deployed behind a NAT or a firewall. This means that an LTE base station could be deployed behind a standard xDSL Internet access with NAT, yet be completely controlled remotely. It is also possible to deploy an LTE base station on a secure network with virtually no access from the outside world, as long as the base station is allowed to query the SlapOS Master IP address through HTTPS.

SlapOS polling architecture is thus very convenient for all kinds of applications (edge, defense, industrial, etc.) where base stations are used in a private context.

Base Station Autonomous Operation

SlapOS Node runs in an autonomous way. It contains all the software and algorithms needed to ensure that its state converges autonomously towards the state defined in SlapOS Master. This autonomous approach is the opposite of centralized configuration systems such as Puppet, Chef or even Ansible, which use a central server to run all the algorithms that automate the infrastructure. SlapOS Master only defines the desired target state; SlapOS Node runs the algorithms that make its current state converge towards it.
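The loop below sketches this division of labour in schematic Python: the Master only stores the target state, while the node polls, converges and reports. Every function here is a stub standing in for the real slapgrid implementation.

    # Schematic view of the SlapOS Node loop. All functions are stubs
    # standing in for the real node (slapgrid) logic.
    import time

    def fetch_target_state():
        # In reality: an authenticated HTTPS (SLAP) request to SlapOS Master.
        return {"software": ["lte"], "services": [{"name": "mme", "state": "started"}]}

    def observe_local_state():
        # In reality: inspect installed software and running partitions.
        return {"software": [], "services": []}

    def converge(local, target):
        # In reality: build software, instantiate or reconfigure services,
        # run promises; the node does all of this autonomously.
        print("converging", local, "->", target)

    def report_state(state):
        # In reality: notify SlapOS Master of the current state over HTTPS.
        print("reporting", state)

    while True:
        target = fetch_target_state()
        local = observe_local_state()
        if local != target:
            converge(local, target)
        report_state(observe_local_state())
        time.sleep(60)  # polling period; real nodes adapt this frequency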

As a consequence, the SlapOS Master / SlapOS Node architecture is much more scalable than centralized configuration systems. It also means that a SlapOS Node can still operate and benefit from Network Management algorithms even when network access to SlapOS Master has been cut. It can thus provide the kind of resiliency that is expected in applications such as edge computing, defense, industrial IoT, etc.

Zero Knowledge

SlapOS Master only defines the configuration of SlapOS Nodes. It does not need to store:

  • passwords to access SlapOS nodes;
  • passwords to access services running on SlapOS nodes;
  • logs of services running on SlapOS nodes.

Thanks to polling, SlapOS nodes do not need to run ssh or any remote control software.

It is thus possible with the SlapOS architecture to create a "Zero Knowledge" system: even by attacking SlapOS Master, it is impossible to retrieve information that would help an intrusion into SlapOS nodes.

Log analysis

Another principle of SlapOS architecture is that "everything that can be decentralized should be decentralized".

Logs of services running on SlapOS Node are thus by default stored on the node rather than uploaded somewhere. This saves network bandwidth and improves scalability.

A utility called "SlapOS Monitor" is provided to access logs and visualize node state. Network administrators can thus get a real-time overview of a collection of SlapOS Nodes (memory, CPU, health, etc.), just like in a centralized system. It is even possible to visualize logs offline, thanks to the modern HTML5 offline technology used by SlapOS Monitor.

IPv6

SlapOS Node to SlapOS Master communication is based on IPv4 or IPv6. SlapOS Node to SlapOS Node communication is based on IPv6 only.

This choice was made so that it is possible to allocate a different IPv6 address to each service running on a SlapOS Node and to benefit from the various improvements that IPv6 brings to TCP/IP networks. We also made this choice because it simplifies the orchestration of services on Edge networks by removing all the problems posed by NAT in IPv4. It is also much easier and cheaper nowadays to purchase an IPv6 AS range than an IPv4 AS range in most regions of the world. Last, we observe that major network operators (NTT, Google) have already migrated their administration networks to IPv6.

In terms of IPv6 implementation, we strongly recommend deploying the re6st [RD] resilient network on each SlapOS Node and on SlapOS Master. Although this is not a requirement, it is the only way we are aware of today to implement a low latency resilient network based on the Babel routing protocol (a protocol that is being standardized by the IETF as an improved alternative to OSPF).

For a simple trial or proof of concept, using a private IPv6 address range can be enough.

Requirement Coverage

We introduce in this chapter some common requirements for a Network Management System and explain how they are covered by OSTV NMS. Each requirement is defined as a paragraph title, followed by an explanation.

Hardware

NMS software should run on any x86 PC

NMS Node software can run on any x86 PC. The number of cores required depends on how many VRAN services are going to be deployed on a base station; a strict minimum of 2 cores is required. Detailed information can be found in the "NMS SYSTEM REQUIREMENTS" chapter of the learning track.

NMS Node software should not consume more than one CPU core

NMS Node software can be configured to use no more than a single process at a time so that it does not consume more than one CPU core. This is achieved by enabling plugins that manage cgroups.

Architecture

Architecture should be simple enough to setup a complete system in less than one day

A complete OSTV NMS system based on SlapOS can be installed in less than one day. This is possible because the architecture is based on only two components: SlapOS Master and SlapOS Node. In addition, thanks to the concept of "everything is a service", all actions on the system are done in the same way, which greatly simplifies the documentation and accelerates learning. The NMS Learning Track contains all the details needed to set up the complete system.

Architecture should cover both base station operating system and server side management system

OSTV NMS consists of SlapOS Node software that is deployed on base stations and SlapOS Master software that is deployed on a central server to manage all base stations. More information can be found in Lecture 1, as well as in Lectures 4 and 5.

Functions

NMS should provide a geographic view of base stations with their current state (heat map)

The first page of OSTV NMS displays a geographic map of base stations, with colors that represent their current operation state and health. More information: Lecture 5, Managing an NMS Deployment.

NMS should provide a list of open issues with base stations or users

The first page of OSTV NMS displays a list of open issues. More information: Lecture 5, Managing an NMS Deployment.

NMS should provide a history of issues with base stations or users

All present and past issues can be accessed in the "Ticket" module of OSTV NMS. More information: Lecture 5, Managing an NMS Deployment.

NMS should provide a view to add a new base station

A base station can be registered by adding a "Server" in the "Server" module of OSTV NMS. This is explained in Lecture 4.

NMS should provide a view to deploy VRAN software on a specified base station

A VRAN service (e.g. enb) can be created by adding an "Amarisoft LTE" instance in the "Service" module of OSTV NMS. This is explained in Lecture 4.

NMS should provide a view to change VRAN configuration on specified base station

The "Service" module of OSTV NMS provides a list of all running services. By clicking on one service, it is possible to change its configuration, stop it or destroy it. This is explained on Lecture 4.

NMS should provide a view to add a new SIM card

A SIM card database can be created by adding a "SIM Card" instance in the "Service" module of OSTV NMS.

A SIM card can then be created by adding a "SIM Card" so-called "slave" instance in the "Service" module of OSTV NMS.

This is explained at the end of Lecture 4.
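As a sketch, and under the same assumptions as the earlier slapos.slap example (illustrative Master URL, certificates, release URL and parameters), requesting a slave instance differs from a normal request only by the shared flag:

    # Sketch: a SIM card requested as a "slave" instance attached to a
    # SIM database service. URLs and parameters are illustrative.
    import slapos.slap

    slap = slapos.slap.slap()
    slap.initializeConnection("https://slapos.example.com",
                              key_file="/etc/opt/user.key",
                              cert_file="/etc/opt/user.crt")
    order = slap.registerOpenOrder()

    SIM_DB_SR = "https://slapos.example.com/sim-database/software.cfg"  # illustrative

    # 1. The SIM card database itself: a normal instance.
    order.request(SIM_DB_SR, partition_reference="sim-database")

    # 2. One SIM card: a slave instance of the same service.
    order.request(SIM_DB_SR,
                  partition_reference="sim-card-001",
                  shared=True,  # request a slave instance
                  partition_parameter_kw={"imsi": "001010000000001"})  # illustrative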

NMS should provide a detailed view of processes and resources used by specified VRAN service on specified base station

The OSTV NMS Monitoring tool provides a visual representation of all logs and health information that can be gathered from the base station. This is explained in Lecture 5.

NMS should provide a detailed view of load and availability of specified VRAN service on specified base station

The OSTV NMS Monitoring tool provides a visual representation of all per-process load information that can be gathered from the base station.

The OSTV NMS Monitoring tool also provides a visual representation of per-service availability and its history, gathered from the base station.

This is explained in Lecture 5, on how to manage the NMS.

NMS should provide a list of projects

The "Project" module provides the list of projects to which a base station can be attached. This is explained on Lecture 5, on the classification of Computers.

NMS should provide a list of sites

The "Site" module provides the list of sites to which a base station can be attached.This is explained on Lecture 5, on the classification of Computers.

NMS should provide a list of customers

The "Organisation" module provides the list of customers or organisations to which a base station can be attached. This is explained on Lecture 2 and Lecture 3.

NMS should provide a list of users

The list of users can be seen by the Administrator user in the underlying ERP5 UI, by accessing the "Person" module. This is explained in Lecture 4.

NMS should provide a view to add a new user

A subscription system is available on the login form. Registered user credentials can be approved automatically or can require administrator approval. This is explained in Lecture 2.

NMS should provide a report of base station availability

Availability can be measured using the list of "Tickets" on a Computer or using the Monitoring Tool. This is explained in Lecture 4; reports can be extended to cover multiple criteria.

NMS should provide text input of VRAN service configuration parameters

The "Service" module of OSTV NMS provides a list of all running services. By clicking on one service, it is possible to input the configuration in text format (XML or JSON). This is explained on Lecture 4.

NMS should provide text output of VRAN service configuration parameters

The "Service" module of OSTV NMS provides a list of all running services. By clicking on one service, it is possible to copy the configuration in text format (XML or JSON).This is explained on Lecture 4.

NMS should provide a command line interface and API to automate operations

All actions available in the OSTV NMS user interface can also be achieved using the "slapos" command line, and all actions can be triggered using the HATEOAS REST API of SlapOS Master. This is explained in Lecture 1, using Python, REST or the "slapos console".
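As a sketch of scripted automation, assuming the standard "slapos request" subcommand of the client (the instance name below is illustrative, and the Software Release URL is the Amarisoft one referenced earlier):

    # Sketch: driving OSTV NMS operations from a script by wrapping the
    # "slapos" command line. The instance name is illustrative.
    import subprocess

    subprocess.run(
        ["slapos", "request", "my-mme",
         "https://lab.nexedi.com/nexedi/amarisoft/blob/master/slapos/software/lte/software.cfg"],
        check=True,  # raise if the command fails
    )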

NMS should provide a notification system for all or selected issues

All issues on the home page of OSTV NMS can be accessed through an RSS feed, which can serve as a notification channel. This is explained in Lecture 5.
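A minimal sketch of such a notifier, assuming an illustrative feed URL and the third-party feedparser library:

    # Sketch: poll the OSTV NMS ticket RSS feed and print new entries,
    # e.g. to forward them to e-mail or chat. The feed URL is illustrative;
    # feedparser is a third-party library (pip install feedparser).
    import time
    import feedparser

    FEED_URL = "https://slapos.example.com/tickets.rss"  # illustrative
    seen = set()

    while True:
        for entry in feedparser.parse(FEED_URL).entries:
            key = entry.get("id", entry.get("link"))
            if key not in seen:
                seen.add(key)
                print("new issue:", entry.get("title"), entry.get("link"))
        time.sleep(300)  # check every five minutes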

NMS should provide a way to define pro-active maintenance actions 

All issues in the monitoring tool of OSTV NMS can also be accessed through an RSS feed, which can serve as a trigger for pro-active maintenance actions. This is explained in Lecture 5.

Distribution

All dependencies of NMS software should be specified

All OSTV NMS software is defined using a buildout file which provides explicit dependencies for all libraries, down to the glibc. This is explained in Lectures 1, 2 and 3.

NMS should run on any GNU/Linux distribution

OSTV NMS software can be installed on any GNU/Linux distribution. This is explained in Lectures 1, 2 and 3.

NMS should run on an embedded GNU/Linux, read-only system image with secure boot

A Debian-based read-only system image with secure boot is being developed using ELBE technology. SlapOS software and services will be isolated in a dedicated read-write mount point.

System

NMS should provide a procedure for backup

All data of OSTV NMS is stored in a single file that can be copied with rsync or ssh.

The SlapOS Master can be backed up by saving /srv/slapgrid/*/srv/backup, /opt/slapos/proxy.db and /etc/opt/slapos/slapos.cfg.
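A minimal sketch of such a backup, assuming an illustrative destination host and rsync available on the Master:

    # Sketch: copy SlapOS Master state to a backup host with rsync.
    # The paths are those listed above; the destination is illustrative.
    import glob
    import subprocess

    DEST = "backup@backup.example.com:/srv/nms-backup/"  # illustrative

    paths = (glob.glob("/srv/slapgrid/*/srv/backup")
             + ["/opt/slapos/proxy.db", "/etc/opt/slapos/slapos.cfg"])

    # -a preserves permissions and timestamps; -R keeps full source paths
    # on the destination side.
    subprocess.run(["rsync", "-aR"] + paths + [DEST], check=True)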

NMS should be resilient to disaster and provide automated recovery 

The NMS can be deployed using the "resilient WebRunner", which provides automated disaster recovery and daily disaster recovery testing.

Lecture 6 explains how to deploy the Webrunner IDE; the Webrunner can be used to obtain a resilient deployment of the NMS Master.

NMS should provide fault tolerance

As an option, the NMS can be deployed using MariaDB Galera and the NEO database so that it provides complete real-time fault tolerance for all components (except the load balancer).

System Requirements

A SlapOS system requires at least two computers. One computer will host the SlapOS Master (COMP-ROOT). The other computer will run the first node (COMP-0). Additional nodes (COMP-1,2,3...) will be required to build up the network and create software instances accessible to users.

Requirements COMP-ROOT

The COMP-ROOT computer must have a public IPv4 address or a private IPv4 address reachable from COMP-0 (and from users). For running a network of 80-160 actual computers, it should have at least:

  • 4-8 CPU cores (8 vCores), preferably i7, XEON or AMD
  • 8-16 GB RAM
  • 200 GB SSD hard disk (480 GB preferred)

The recommended Linux distributions for this computer are Debian 8 or Debian 9, as these distributions are continuously tested on a daily basis.

Requirements COMP-0

The COMP-0 computer requires a valid SSL wildcard certificate as well as IPv6 connectivity. At a minimum it should have:

  • 2-4 CPU cores
  • 4-8 GB RAM
  • 60-100 GB SSD hard disk (mostly needed for logs)

The recommended Linux distributions for this computer are Debian 8 or Debian 9, as these distributions are continuously tested on a daily basis.

Dependencies

SlapOS is self-contained. Dependencies for Debian [RD] and RPM [RD] can be found on GitLab (SlapOS requires bridge-utils, python, gcc-c++, make, patch, wget and python-xml). When using the single line installer, Ansible is another dependency, but only of the installer itself. In general, the SlapOS installer, packages and buildout handle dependencies automatically.

For a SlapOS node itself, the list of included software can be found in the slapos buildout configuration file [RD].

Detailed information on this topic can be found in "SlapOS System Requirements - Minimal Requirements to setup a SlapOS network" in Lecture 1.

Further Readings

We provide in this chapter a commented list of documents that will give a further understanding of OSTV NMS design and operation.

  • SlapOS Architecture [RD] contains a brief introduction of SlapOS and key concepts;
  • SlapOS Production Deployment Overview [RD] provides a complete overview of deploying a production-grade SlapOS system such as the one that needs to be deployed for OSTV NMS;
  • SlapOS Learning Track [RD] contains a sequence of tutorials and howtos describing the necessary steps for setting up a SlapOS-based system.