When your IT systems are nearing capacity, you need to make decisions to expand provision, and many of those decisions will revolve around the choice between scaling up and scaling out. For many, the decision is intrinsically linked to their choice of platform and whether they are pursuing a cloud-based, hybrid, or on-premises-led strategy. For example, with the current uncertainty around VMware vSphere / ESXi hypervisor pricing, many are evaluating cloud options in AWS or Azure and HCI (Hyper-converged infrastructure) options from modern vendors such as Nutanix. Many are also evaluating a move from traditional VM-based server virtualization to containerized architectures. These alternatives present different models of scaling, so to fully evaluate their long-term costs and suitability you need to understand how they scale up and/or out.

Scale Up vs Scale Out – The Simple Explanation

Scaling up involves enhancing the capacity of a single server or system by adding more resources, such as CPU, RAM, or storage. It’s akin to upgrading a computer’s components for better performance – on-premises, scaling up could involve adding a storage array to a rack in your data center. Scaling out, on the other hand, involves adding more servers or systems to distribute the workload across multiple machines. This approach potentially improves performance and redundancy by leveraging a network of systems. While scaling up enhances a single unit, scaling out expands the entire infrastructure. However, scaling out has many challenges too – more on that later.

A diagram showing the difference between scale up and scale out. Scale up is shown as three increasingly big single machines; scale out shows three clusters, each consisting of an increasing number of identical servers.

Scale Up – What is Scaling Up?

Scaling Up – a definition and how it works

Scaling up is often called “Vertical Scaling”. You add extra resources to an existing system or component to increase its capacity.

Scale up – a bit of history

In the 1960s and 1970s, mainframe data centers became common and centralized computing was the de facto standard for enterprise computing. Companies such as IBM offered families of compatible computers, such as the IBM System/360, allowing businesses to scale up by upgrading to more powerful models within the same family. Often the capabilities of these mainframes were limited by the licensing of the microcode they were supplied with, and scaling up simply involved paying the vendor to unlock extra resources or the ability to add extra components such as memory.

Scale Up – Modern Technologies and Applications Driving Demand

Scale-up methodologies are particularly beneficial in scenarios where latency, consistency, and performance are critical, and where the latency and overhead of managing distributed systems may be inappropriate or may compromise data integrity.

Certain types of workloads and / or datasets favor very powerful centralized resources. New hardware technologies have also emerged that allow servers to be scaled up with resource capabilities previously unavailable – notably GPUs, TPUs and high-end multi-core processors. Graphically or numerically intensive workloads are now frequently run on upgraded servers equipped with GPUs and / or TPUs rather than CPU alone. Remote working, gaming and streaming have all driven demand for this higher-end hardware.

Scaling up is common in environments with strict consistency and transactional integrity requirements, such as relational databases. For databases, developments in “Vertically Scaled RDBMS” technologies such as Oracle Exadata enable vertical scaling (scaling up) by allowing additional OCPU cores to be added as needed. Newer database technologies have also emerged, such as Snowflake, a cloud-based data warehousing platform designed for high performance and scalability, which leverages an architecture that separates storage and compute, allowing each to scale independently.

Scaling up commonly occurs when a system has become unable to meet peak workload demands.

Scale Out – What is Scaling Out?

Scaling Out – a definition and how it works

Scaling out is often called “Horizontal Scaling”. Scaling out usually involves adding more machines or nodes to a system to increase its capacity and / or performance.

Scaling Out – a bit of history

The concept of scaling out gained significant traction in the early 2000s with the rise of large-scale internet services. Google was a pioneer in this field, developing a highly scalable and distributed infrastructure to support its search engine. Google’s use of commodity hardware and its development of technologies like MapReduce and the Google File System (GFS) showcased the effectiveness of horizontal scaling.

Similarly, Rackspace, one of the early cloud service providers, contributed to the popularization of horizontal scaling through its cloud infrastructure services. By offering scalable cloud solutions, Rackspace enabled businesses to easily expand their computing resources by adding more instances, rather than upgrading individual servers.

These innovations have laid the groundwork for modern cloud computing, where horizontal scaling is a fundamental principle for building resilient and flexible systems.

Whilst Google pioneered the technologies, they did not commoditize them for external consumption. It has been HCI vendors such as Nutanix who have brought this type of technology into mainstream use by other vendors for use cases such as EUC DaaS.

Scaling Out – Technologies Well-Suited to Scaling Out

As the example of Google shows, scaling out has been driven by modern architectural strategies, and in turn the requirement and wish to scale out has led to the development of new frameworks and software to facilitate it. Technologies particularly associated with scale-out techniques, and which drive innovation or adoption, include:

Cloud Computing: Platforms such as Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform (GCP) offer elastic scaling capabilities, allowing resources to be added or removed dynamically based on demand. These cloud providers typically provide services like load balancing, auto-scaling, and distributed storage. Serverless computing options such as AWS Lambda, Google Cloud Functions, and Azure Functions allow the execution of code without the need to manage servers, automatically scaling in response to demand.
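
As an illustration of the idea behind elastic auto-scaling, the core decision is just a threshold rule: add instances when a load metric is high, remove them when it is low. This is a minimal sketch with hypothetical thresholds, not any vendor’s actual API; real auto-scalers (e.g. AWS Auto Scaling) layer cooldown periods, health checks and step policies on top.

```python
# Minimal sketch of a threshold-based auto-scaling rule.
# Thresholds and limits here are hypothetical illustration values.

def desired_instances(current: int, cpu_utilization: float,
                      scale_out_at: float = 70.0, scale_in_at: float = 30.0,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Return the instance count a simple auto-scaler would target."""
    if cpu_utilization > scale_out_at:
        current += 1          # scale out: add a node
    elif cpu_utilization < scale_in_at:
        current -= 1          # scale in: remove a node
    # Never go below the floor or above the configured ceiling
    return max(min_instances, min(max_instances, current))

print(desired_instances(3, 85.0))  # high load -> 4
print(desired_instances(3, 20.0))  # low load  -> 2
```

Note the scale-in branch: the ability to shed capacity automatically when demand drops is a key cost advantage of scale-out cloud deployments over fixed scaled-up hardware.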

Microservices Architecture: Container technologies such as Docker, or orchestration frameworks such as Docker Swarm, Kubernetes and Red Hat OpenShift facilitate the development and deployment of applications as a collection of loosely coupled services. These enable scaling individual services independently, improving resource utilization and fault isolation. Orchestration frameworks (e.g. K8S) ensure efficient resource allocation and scaling of applications across clusters.
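
To make the orchestration point concrete, the replica calculation that the Kubernetes HorizontalPodAutoscaler documents can be sketched in a couple of lines. This is a simplification: the real controller also applies tolerances, stabilization windows and pod-readiness handling.

```python
import math

def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    """Core HPA formula: desired = ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# 4 pods averaging 90% CPU against a 60% target -> scale out to 6 pods
print(hpa_desired_replicas(4, 90, 60))  # 6
# 4 pods averaging 30% against a 60% target -> scale in to 2 pods
print(hpa_desired_replicas(4, 30, 60))  # 2
```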

Distributed Databases: Database technologies such as Apache Cassandra, MongoDB and Snowflake are designed to scale horizontally by distributing data across multiple nodes and provide high availability and fault tolerance through replication and partitioning. Many modern databases support a technique known as “Database Sharding” to enable horizontal scaling.
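
The essence of sharding is a deterministic mapping from a record’s key to the node that owns it. A minimal hash-based sketch (generic illustration, not how any particular database implements it) looks like this:

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard by hashing.
    SHA-256 is used for a stable result across processes,
    unlike Python's built-in hash(), which is salted per run."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Each key always lands on the same shard, so reads and writes
# for a given key can be routed without a central lookup.
for user in ["alice", "bob", "carol"]:
    print(user, "-> shard", shard_for(user, 4))
```

A known weakness of plain modulo sharding is that changing `num_shards` remaps most keys; this is why systems such as Cassandra use consistent hashing, which moves only a small fraction of keys when nodes are added or removed.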

Big Data Technologies: Technologies such as Apache Hadoop and Apache Spark are designed with large data sets in mind and use cases where large-scale data processing occurs by distributing workloads across clusters of commodity hardware.

Content Delivery Networks (CDNs): CDNs are particularly suited to distributing resources nearer to users, versus serving everything from a large, centralized resource. Modern applications and media leverage technologies such as Akamai, Cloudflare and Amazon CloudFront. Distributing content geographically reduces latency and improves access speed for users worldwide. These technologies scale out to handle high traffic volumes and provide redundancy.

Load Balancers: Technologies such as NetScaler, NGINX, HAProxy, F5 and AWS Elastic Load Balancing have played an important part in facilitating scale-out methodologies by distributing incoming network traffic across multiple servers to ensure no single server is overwhelmed. Overall, they enhance the availability and reliability of applications by managing traffic distribution.
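
The simplest distribution strategy these products all support is round-robin: send each request to the next backend in turn. A minimal sketch (an illustration only; real load balancers add health checks, weights and session affinity):

```python
import itertools

class RoundRobinBalancer:
    """Minimal round-robin load balancer sketch:
    each request is routed to the next backend in turn,
    so no single server receives all the traffic."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def next_backend(self) -> str:
        return next(self._cycle)

lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([lb.next_backend() for _ in range(4)])
# -> ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```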

Distributed File Systems: Technologies such as GlusterFS, Hadoop Distributed File System (HDFS), and AWS S3 allow the storage and retrieval of data across multiple servers, and provide scalability and high availability through data replication and distribution.

Messaging and Streaming Platforms: These facilitate the handling of large volumes of messages and real-time data streams. They are designed for horizontal scalability by distributing the message load across multiple nodes. Popular choices include Apache Kafka, RabbitMQ, and Mosquitto MQTT.

Key Differences Between Scale Up vs. Scale Out

Architecture

Scaling up (Vertical Scaling) generally involves enhancing a single system’s hardware to increase capacity. This approach typically relies on a monolithic architecture, where all components are tightly integrated into a single, powerful machine. The architecture is straightforward, with a centralized system managing all processes and data, which simplifies management but creates a single point of failure. The system’s performance is limited by the maximum capacity of the hardware, often making it less flexible and scalable in the long run.

Scaling out (Horizontal Scaling), on the other hand, involves adding more machines or nodes to a system to distribute the load. This approach utilizes a distributed architecture, where components are decoupled and run across multiple systems. Each node operates independently but works together as part of a cohesive system. This architecture enhances flexibility, allowing for nearly unlimited growth by simply adding more nodes. It also improves fault tolerance and resilience, as the failure of one node does not cripple the entire system. However, managing a distributed architecture is more complex, requiring sophisticated orchestration and coordination to ensure seamless operation. Scaling out often requires an investment in software, frameworks and technologies specifically designed to handle distributed architectures and workloads.

Performance

Scaling up focuses on enhancing a single machine’s performance by adding more powerful hardware components like CPUs, RAM, and storage. Some applications and datasets will always perform best when latency is minimized, whether that be network latencies between machines or latencies between components within a single system. Scaling up can be beneficial for CPU-intensive tasks and applications requiring high-speed, low-latency operations and handling larger datasets in-memory. However, it faces limitations due to the physical constraints of a single machine, leading to a ceiling on performance improvements and the size of datasets that can be processed.

Scaling Out (Horizontal Scaling), on the other hand, involves adding more machines or nodes to distribute the load. This approach leverages parallel processing, where tasks are divided among multiple nodes, significantly enhancing performance for distributed applications such as web services, big data processing, and cloud-native applications. It offers superior fault tolerance and flexibility, as the system can continue functioning even if some nodes fail. However, it requires sophisticated management and coordination, and network latency can impact performance.
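
The limit on scale-out performance gains can be quantified with Amdahl’s law: if any fraction of the work is serial (coordination, locking, network round-trips), adding nodes yields diminishing returns. A short worked example, using an assumed 5% serial fraction purely for illustration:

```python
def amdahl_speedup(serial_fraction: float, nodes: int) -> float:
    """Amdahl's law: overall speedup from n nodes when a fixed
    fraction of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)

# Even a 5% serial fraction caps 100 nodes well below 100x speedup
for n in (2, 10, 100):
    print(f"{n} nodes -> {amdahl_speedup(0.05, n):.1f}x speedup")
```

With 5% serial work, 100 nodes deliver only around a 17x speedup, which is why reducing coordination overhead matters as much as adding nodes.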

Cost

Comparing scale up vs scale out on a cost basis is usually very complex and highly dependent on the timescales considered for pay-back and how wide you consider the remit of the project.

On the one hand, you can argue that scaling up involves a significant upfront cost for high-end hardware upgrades, whilst a scale-out approach means costs can be more incremental, as additional (often standard) servers are added as needed.

On the other hand, there’s the argument that server capacity is often limited by a certain resource demand from the workload – you can have loads of spare CPU cycles, memory, and network capacity but the server storage is slow and upgrading the storage to faster SSDs is a relatively cheap way to access the unused capacity of all those other resources (that you’ve already paid for).

See: IT Performance Monitoring Is Not Just About Diagnosis | eG Innovations for a collection of factors, including cost, that you might want to consider.

Flexibility

Flexibility is a major differentiator between scaling up and scaling out. Scaling out broadly offers superior flexibility due to its ability to adapt to varying workloads dynamically and cost-effectively. This flexibility is critical in modern IT environments where demand can be unpredictable, and the ability to quickly respond to changes is essential for maintaining performance and availability. By leveraging distributed architectures and cloud technologies, scaling out allows organizations to build systems that are more resilient and adaptable.

Particularly with on-prem datacenter hardware, scaling up is generally a one-way process with little potential to scale back down, so if demand drops there is no opportunity to save running costs in the way a scaled-out deployment can, by shutting some machines down.

In some scenarios where continual growth and increased capacity need is predicted and physical hardware needs to be installed, it can make complete sense to over-provision and provide sometimes years’ worth of capacity in one shot. This is a strategy adopted in local internet exchanges and CDN technologies (a topic I wrote about during COVID for TechTarget, see: A deep dive into the internet, bandwidth, and how users in EUC and elsewhere affect capacity | TechTarget).

Maintenance

The scaling up approach generally simplifies management since updates, hardware upgrades, and troubleshooting are confined to one system. However, it often requires significant downtime during upgrades and poses a risk of a single point of failure, which can impact the entire system.

A screenshot of the eG Enterprise admin's console being used to set scheduled maintenance policies to suppress alarms during planned maintenance (e.g. to upgrade a server's disk to a bigger one to scale up)

Multiple maintenance policies can be applied to components, zones, segments or individual tests. Here we have set up a regular maintenance schedule for a server that consists of an hour every Tuesday and Thursday at 3am plus 6 hours on the third Sunday of every month.

Scaling out distributes maintenance tasks across multiple nodes or systems. This method usually minimizes downtime, as nodes can be added or maintained incrementally without affecting the entire system. Maintenance complexity increases, however, due to the need to manage and synchronize updates across many nodes, often requiring advanced orchestration tools. On the other hand, this approach enhances redundancy and fault tolerance, as the failure of one node has minimal impact on overall system performance.

Scaling smartly can lead to huge cost savings. If it is easy to manage the maintenance operations so that scaling up / out and scaling down / in are done in a timely fashion, costs can be minimized. eG Enterprise is one of the few observability and monitoring platforms designed to make maintenance operations and scaling changes easy, see: Managing Monitoring and Alerting during IT Maintenance | (eginnovations.com).

Hybrid Approaches – It Doesn’t Have to Be Scale Up vs. Scale Out: Combining Scale Up and Scale Out

Most modern application delivery platforms now offer a combination of both horizontal and vertical scaling options, recognizing that this offers the best of both worlds. This is particularly true with platforms like:

  • HCI (Hyper-Converged Infrastructure) platforms such as Nutanix
  • Public cloud platforms such as AWS, Azure and GCP

A very complicated diagram of the Nutanix architecture

eG Enterprise includes domain-aware custom integrations using secure and supported vendor APIs / integration points to provide monitoring for modern platforms such as Nutanix in the same single-pane-of-glass GUI as other on-prem and in-cloud platforms or infrastructure.

Leveraging Scale Up and Scale Out on Modern Cloud and HCI Platforms

Here are some examples of modern cloud and hyper-converged infrastructure (HCI) platforms that leverage both scale-up and scale-out strategies to optimize performance and meet diverse workload requirements:

Amazon Web Services (AWS)

  • Scale Up: AWS allows users to select instances with varying levels of CPU, memory, and storage to suit specific needs. For example, EC2 instance families offer different configurations that can be easily upgraded to more powerful instances.
  • Scale Out: AWS provides auto-scaling groups that automatically add or remove instances based on demand. Services like Amazon RDS and DynamoDB scale horizontally by distributing the load across multiple servers.

Microsoft Azure

  • Scale Up: Azure offers virtual machines (VMs) that can be resized for more CPU, memory, and storage. Users can move to more powerful VM series as their requirements grow.
  • Scale Out: Azure supports horizontal scaling through features like Azure VM Scale Sets and Azure Kubernetes Service (AKS), enabling applications to scale out by adding more VMs or containers.

Google Cloud Platform (GCP)

  • Scale Up: GCP allows users to customize VM instances with the required resources and offers machine types that can be upgraded to higher specs.
  • Scale Out: GCP’s managed services like Google Kubernetes Engine (GKE) and Cloud Spanner automatically scale out to handle increased load, distributing tasks.

Nutanix – A Hyper-Converged Infrastructure (HCI) Platform

  • Scale Up: Nutanix enables organizations to enhance the capacity of individual nodes by adding more powerful hardware components, such as CPUs and RAM.
  • Scale Out: Nutanix clusters can be expanded by adding more nodes, allowing seamless horizontal scaling. The Nutanix Acropolis Operating System (AOS) balances workloads across the expanded infrastructure akin to the Google “web scale” paradigm.

So! Why Do You Need to Understand Scale Up vs Scale Out?

Many organizations are re-evaluating their platform choices, for reasons including:

  • The VMware / Broadcom / Omnissa changes and possible licensing cost increases.
  • The themes of cloud repatriation and cloud exit – as some find cloud is proving to be a lot more expensive than expected.
  • Maturity in management tooling, cloud services and software that is making adopting Kubernetes, microservices and containerized architectures far more turnkey and hence feasible for a wider range of organizations.

A lot of EUC-type workloads, especially where VMs are involved, have always involved a lot of waste due to t-shirt sizing (S, M, L, XL), as cloud-native instances are sold in families. See: Choosing Azure Instances for Microsoft AVD (eginnovations.com) as an example.

Moving to a platform such as Nutanix, from on-prem ESXi / vSphere with its SAN dependencies, or from Cloud VM instances, changes the balance of how you can scale up and scale out. Waste and capacity planning strategies will need to be adjusted. This is where eG Enterprise fits in to help our customers make intelligent platform decisions based on data. Leveraging the same tool, they can understand where the new bottlenecks and challenges are and whether scaling up or out is the best option.

eG Enterprise is an Observability solution for Modern IT. Monitor digital workspaces,
web applications, SaaS services, cloud and containers from a single pane of glass.

Related Information

About the Author

Babu is Head of Product Engineering at eG Innovations. Having joined the company back in 2001 as one of our first software developers, following undergraduate and master’s degrees in Computer Science, he knows the product inside and out. Based within our Singapore R&D Management team, Babu has undertaken various roles in engineering and product management, becoming a certified PMP along the way.