Why is Buildout the most disruptive technology in Cloud Computing?

This page is an excerpt of the SlapOS Lecture.

What is Buildout

Quoting the Buildout website, "Buildout is a Python-based build system for creating, assembling and deploying applications from multiple parts, some of which may be non-Python-based. It lets you create a buildout configuration and reproduce the same software later." Buildout originated in the Zope/Plone community to automate the deployment of customized instances of their software. Led by Jim Fulton, CTO of Zope Corporation, Buildout became a stable and mature product over the years.

Buildout is used in SlapOS to define which software must be executed on a Slave Node. It plays a key role in the industrial successes of SlapOS. Without it, SlapOS could not exist. However, buildout is also often misunderstood, sometimes purposely, by observers who criticize its use in SlapOS.

Many people still do not realize that there is no possible software standard on the Cloud and that buildout is the solution to this impossibility. Experts know, for example, that any large scale production system operated on the Cloud (e.g. a social network system) or privately (e.g. banking software) uses patched software. Relational databases are patched to meet the performance requirements of a given application as soon as data grows. If a Cloud operating system does not make it possible to patch just about any of its software components, it is simply unusable for large scale production applications. SlapOS is usable because its definition of what a software is rests on the possibility of patching any dependent software component.

Where is my patch?

Still, people who name a software such as "kvm" or "MySQL" believe that this is enough (and for them, SlapOS provides aliases for the words "kvm" and "mysql" which link to an explicit buildout definition). However, the reality is not that straightforward.
For example, some releases of kvm support the NBD protocol over IPv6 and some do not. Some releases of kvm support sheepdog distributed block storage and some do not. Some releases of kvm support CEPH distributed block storage and some do not. Most users who run kvm to try out a software do not care about IPv6, sheepdog or CEPH. But users who run kvm on SlapOS need IPv6 support to access NBD, and this is for now only available as a patch. Those who want resilient storage may want sheepdog support, which is only available from version 0.13. And those who want CEPH support also need a patch. Those who want the IPv6 patch may prefer not to use the CEPH patch, which is not yet officially stable. And those who want the CEPH patch may distrust the IPv6 patch.

All in all, there is no way to agree on a single version of kvm. All the different releases of kvm may have to be installed on SlapOS Slave Nodes in order to meet market requirements. Since the range of possible patches is so wide, the easiest way to know which kvm is actually installed on a SlapOS node is simply to list where its original source code was obtained from and which patches were applied. This is exactly what buildout does, in just a few lines of configuration. Buildout also eliminates any complex or time-consuming process for distributing binary packages across a wide range of hardware architectures, thanks to a trusted, distributed caching mechanism which does not even centralize signatures.

The problem we are discussing here with kvm is even more complex with MySQL. There are now multiple sources of MySQL: the official one (MySQL), the one by the original author of MySQL, Michael Widenius (MariaDB), the one by the InnoDB experts at Percona, and the one by Cubrid, which is not MySQL but claims to be 90% compatible with it. Within each of these sources, there are different versions. Default compilation options may also differ.
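
As a rough illustration of what "listing the source and the patches" can look like, here is a minimal sketch of a buildout part for the kvm case. It assumes the slapos.recipe.cmmi recipe (a "configure, make, make install" recipe); the part name, the URLs, the version and the patch file name are placeholders invented for the example, not the actual SlapOS definition.

```ini
# Hypothetical buildout configuration: every URL, version number and patch
# name below is a placeholder, not a real release.
[buildout]
parts = kvm

[kvm]
# Assumption: the slapos.recipe.cmmi recipe is used to build from source.
recipe = slapos.recipe.cmmi
# Where the original source code is obtained from.
url = https://example.org/downloads/qemu-kvm-X.Y.Z.tar.gz
# Which patches are applied to it, one per line.
patches =
    https://example.org/patches/nbd-ipv6.patch
patch-options = -p1
# Compilation options, stated explicitly instead of relying on defaults.
configure-options =
    --target-list=x86_64-softmmu
```

Two nodes that build from the same file know that they run the same kvm, and changing the source, the version or the list of patches is a one-line edit.
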
Authors of large, scalable applications know very well that the performance of their application can be dramatically impacted by subtle changes to the SQL optimizer. Changing the version or the source of MySQL may simply lead to a performance collapse. We still remember one application for which we had to change a default parameter in a MySQL header file so that 32 rows instead of 8 were scanned for query optimization. Therefore, if we had not been able to choose which source of MySQL to use and which patch to apply to it, we simply could not have run enterprise applications with SlapOS, nor shown industrial success stories.

Arguments and counter-arguments against Buildout

The use of buildout by SlapOS is disruptive compared to traditional approaches to software distribution. It has enabled industrial success faster. But it has also led to slower adoption of SlapOS by certain communities, often for incorrect reasons. We discuss these reasons below.

What about disk images?

Some people consider that buildout is irrelevant since the Cloud should be based on disk images and virtual machines. What those people do not realize is that not only can SlapOS run just about any disk image format, but buildout can also be used to automate the production of disk images, probably much better than many other tools. And it is open source.

What about distributions' packaging systems?

Some people consider that buildout is irrelevant since it is possible to achieve the same with the packaging systems of GNU/Linux distributions. What they do not realize is that not only can buildout rely on existing GNU/Linux distribution packages (at the expense of portability), but buildout can also be used to automate the production of packages for multiple GNU/Linux distributions with little effort. Also, the buildout format is much more concise when it comes to patching or adding dependencies to existing software, thanks to the "extends" mechanism.
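
To give an idea of how concise this can be, the sketch below adds one patch to an existing definition through "extends". It assumes a hypothetical base file named kvm.cfg that already defines a [kvm] part with a "patches" option; the file names and the patch URL are placeholders.

```ini
# kvm-ceph.cfg: hypothetical file that specializes a hypothetical kvm.cfg.
[buildout]
extends = kvm.cfg

[kvm]
# "+=" appends to the value inherited from kvm.cfg instead of replacing it.
patches +=
    https://example.org/patches/ceph-block-storage.patch
```

Running buildout with "-c kvm-ceph.cfg" would then build the base software plus the extra patch, while kvm.cfg itself stays untouched for those who do not want that patch.
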
Last, buildout provides a kind of packaging format which can reuse language-based packaging formats (eggs, gems, CPAN, etc.) in a way which is specific neither to a given GNU/Linux distribution nor to GNU/Linux itself. In a sense, buildout integrates much better with native language distribution systems than GNU/Linux packaging systems do. And native language distribution systems are currently becoming the de facto standard for developers.

What about separation between software and instance?

Some people consider that buildout prevents sharing the same executable among multiple instances of the same application. This is a common misconception. SlapOS is a typical example of how to deploy a single software once, made of shared libraries and executable binaries, and then create a hundred instances of it without any duplication of binary code and without wasting resident RAM.

I need something that is language agnostic

Some people consider that buildout is designed for Python only. What they do not realize is that buildout is already used to build software based on C, C++, Java, Perl, Ruby, etc. It would also not be an issue to extend SlapOS to support any equivalent of buildout. But we are not aware of any other build system which can support as many different architectures and languages in such a flexible way.

Come on, I'm on Windows

Some people consider that buildout is not for Windows, or that it does not support proprietary software distributed in binary form without source code. Again, this is a misconception. Buildout is just an automation tool. Whenever source code is not available, buildout can take a binary file as input. This is often done, for example, to build Java applications based on .war distribution archives, or to deploy OpenOffice binaries which would otherwise take 24 hours to compile. Buildout is also compatible with Windows. Automating the installation or the replication of Windows-based software with buildout is possible.
Buildout would even be an excellent candidate to automate the conversion of Windows disk images from one host environment to another. Generally speaking, running SlapOS natively on Windows could be very useful both for SlapOS and... for Windows.

It destroys the work made by GNU/Linux distributions

Overall, what makes buildout so debated by some observers is that it shows a different path for software distribution, especially for open source software distribution. GNU/Linux distributions focus on providing a consistent set of just about every possible open source application, with perfectly resolved dependencies and maximized sharing of libraries. Buildout instead focuses on building a single application and its dependencies, in a way which maximizes portability between different GNU/Linux distributions and POSIX-compliant operating systems. Application developers only need to care about their own application and to stabilize its distribution. Unlike what happens with most GNU/Linux distributions, they do not need to care about the possible consequences that changing one shared library has on other applications hosted on the same operating system.

Buildout is, after all, an approach to software distribution in which the most complex software has about 100 dependencies to resolve, compared to 10,000+ interdependent packages in a traditional GNU/Linux distribution. Buildout puts the burden of maintenance on each application packager and removes the burden of managing global dependencies, thus allowing parallel and faster release cycles for every application. All this with a very concise approach.

Not convinced yet?

If this discussion has not yet convinced you that buildout is an efficient solution to specify a software executable and deploy it on the Cloud, please consider the following problem: automate the packaging of the ERP5 open source ERP and all its dependencies (OpenOffice, patched Zope, patched MariaDB, etc.) on all major GNU/Linux distributions, in such a way that it is possible to provide the same behavior on every GNU/Linux distribution and to run 100 instances of ERP5 on the same server, each of which can have its own MariaDB daemon and Zope daemon. Obviously, if you find a better solution, please let us know.