Overview
- Distribution
- Three-Layer “Monolithic” Architecture
- Service Oriented Architecture
- Microservices
- Replication
- Elasticity, Automation and Load Balancing
- Scaling Techniques
- Synchronous and Asynchronous Communication
- Session State
- Caching
Distribution
- “An application is distributed when it is provided by multiple nodes each capable of handling a subset of the requests to the application”
- split the application into smaller services
- the smaller services provide different parts of the application
- the data is split into smaller parts
- possible to place each service at a different node
- better to add a distribution function that decides which node should handle each request (a minimal sketch follows this list)
- though the mapping may simply be fixed at deployment in configuration files
- spreads the load across multiple nodes
- does not directly address availability issues, as the functionality is only split up, not duplicated
- aids service reuse, as a service may be common across applications
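A minimal sketch of such a distribution function in Python; the service-to-node mapping stands in for what would normally come from a deployment configuration file, and all names are illustrative:

```python
import hashlib

# Hypothetical deployment configuration: which nodes host which service.
NODES = {
    "catalogue": ["http://node1:8080", "http://node2:8080"],
    "orders":    ["http://node3:8080"],
}

def route(service: str, request_key: str) -> str:
    """Pick the node that should handle this request.

    Hashing the request key spreads the load across the nodes
    that provide the service."""
    nodes = NODES[service]
    digest = hashlib.md5(request_key.encode()).digest()
    return nodes[digest[0] % len(nodes)]

print(route("catalogue", "customer-42"))  # e.g. http://node2:8080
```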
Designing for Distribution
How should we split up the functionality of applications to achieve effective distribution?
- Industry has been through several iterations of architectural approaches:-
- Three-Layer “Monolithic”
- Service Oriented Architecture (SOA)
- Microservices
Today, for cloud applications, we sit somewhere between SOA approaches and the adoption of microservices.
Three-Layer “Monolithic” Architecture
Web application - receives the HTTP request and returns the HTML response (or some associated technology).
Business logic - handles data validation, business rules and task-specific behaviour.
Data - a relational database executing SQL queries called from an API exposed to the business logic.
Classic way to architect such systems from 20 years ago.
Hardware infrastructure largely static (except web layer) i.e. mostly vertical scalability.
different layers on different servers
Complex software layers with long development cycles.
tightly coupled layers
A change to any application service, even a small one, requires its entire layer to be retested and redeployed with application outage as a result.
Dependence on statically-assigned resources and highly-available hardware makes applications susceptible to variations in load and hardware performance.
Intermediate caches mitigate the performance inefficiencies caused by separating compute and data, but this adds cost and complexity.
Not generally scalable and only suitable in the cloud for simple applications.
Service Oriented Architecture
A revised approach (set of principles and patterns) to architect complex distributed systems from 10 years ago.
It predates the cloud, but because this approach was prevalent when the cloud appeared, it has been leveraged there.
- Service Oriented Architecture = SOA
- originally for enterprise distributed systems but ideas applicable to the cloud
- SOA in the cloud a key area for NIST
- Break monolithic applications into coarse-grained, loosely coupled, reusable services, each exposing a contract as a well-defined interface.
- i.e. coarse grained but finer than previous monolithic layers
- in SOA using a contract to expose a discrete service located through an endpoint is called the edge component pattern
SOA in the Cloud
- A service is a function that is well-defined, self-contained and does not depend on the context or state of other services.
- a bit like an object but at a much coarser granularity
- think of a service as a process which offers a well-defined interface to consumers
- Services should be wrapped in an infrastructure managing their lifecycle.
- e.g. Azure cloud services
- or a WCF service (see later lectures)
- in SOA this is called the service host pattern
- Separation of interface and implementation
- service interface is a formal contract and should change infrequently
- termed loosely coupled as its consumers are minimally impacted by changes to that service or its environment
- if the interface needs to change, version it so that consumers are isolated from changes (see the contract sketch after this list)
- the implementation may change – that’s ok
- typically communicate by SOAP web services or queues
- queues are better in the cloud – asynchronous
- in SOA this is called the decoupled invocation pattern
- Policies
- updated at runtime and externalized from business logic
- security
- encryption, authentication, authorization
- auditing, service level agreements (SLAs)
- Platform independence
- technology agnostic
- can change infrastructure without affecting consumers
- Location transparency
- consumers do not know location of service
- can move service as platform reconfigured
- hide service instances behind load balancer
- better if the service location is resolved at run-time
- in SOA this is called the virtual endpoint pattern
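To make the separation of interface and implementation concrete, here is a minimal, illustrative Python sketch of a versioned service contract; the names are hypothetical, and a real SOA contract would typically be a WSDL/SOAP or similar interface description:

```python
from abc import ABC, abstractmethod

class OrderServiceV1(ABC):
    """The formal contract: consumers depend only on this interface.

    It should change infrequently; a breaking change becomes a new
    versioned interface (e.g. OrderServiceV2), so existing consumers
    are isolated."""

    @abstractmethod
    def place_order(self, customer_id: str, item_id: str) -> str:
        """Place an order and return an order id."""

class InMemoryOrderService(OrderServiceV1):
    """One implementation; it can be rewritten freely because
    consumers only ever see OrderServiceV1."""

    def __init__(self):
        self._orders = {}

    def place_order(self, customer_id: str, item_id: str) -> str:
        order_id = f"ord-{len(self._orders) + 1}"
        self._orders[order_id] = (customer_id, item_id)
        return order_id
```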
Microservices
Microservices are independent components that work in concert to deliver the application’s overall functionality.
Each is small enough to truly reflect an independent concern, implementing a single function.
- Microservices are the current approach, now being adopted by cloud application architects. The approach has emerged in concert with:-
- DevOps culture (integrating software development and IT to achieve rapid building, testing and reliable releases)
- container technologies and their associated clustering and orchestration tooling
- Microservices are functionally smaller than SOA services but many of the principles remain.
- i.e. a single application is developed as a suite of small autonomous and independently deployable services, each running in its own process and communicating using lightweight mechanisms
- often use RESTful APIs to communicate and are loosely coupled
- in fact many people consider microservices just to be an evolution of SOA but “done right”!
Deployed in containers, enabling fast and efficient deployment to VMs.
Small, independent services that are versioned, deployed, upgraded and easily scaled separately.
Rolling updates mean that only a subset of the instances of a single microservice updates at any given time; updates can also be easily rolled back should problems occur.
Microservices - Application Platforms
- Some additional infrastructure needs to be added to containers (and Docker) to provide a microservices runtime environment:-
- packaging: definition of applications composed of multiple microservices
- service discovery (naming): a repository allowing microservices to find each other at runtime (sketched after this list)
- cluster management (orchestration): automatic deployment, update operations and scaling etc.
- health checking and metrics collection
- hot migration: transparently moves microservice instances to healthy VMs or servers when the software or hardware on which they are running fails or must be restarted for upgrades
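A minimal sketch of the service-discovery idea; real platforms (e.g. Consul, etcd or Kubernetes DNS) add leases, health checks and replication on top of a registry like this:

```python
import random

class ServiceRegistry:
    """Instances register at startup; consumers look them up at runtime."""

    def __init__(self):
        self._instances: dict[str, list[str]] = {}

    def register(self, service: str, address: str) -> None:
        self._instances.setdefault(service, []).append(address)

    def deregister(self, service: str, address: str) -> None:
        # Called on shutdown, or by health checking on failure.
        self._instances.get(service, []).remove(address)

    def lookup(self, service: str) -> str:
        """Return one registered instance, chosen at random."""
        return random.choice(self._instances[service])

registry = ServiceRegistry()
registry.register("basket", "10.0.0.5:80")
registry.register("basket", "10.0.0.6:80")
print(registry.lookup("basket"))
```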
Microservices in the Cloud
- We can split a modern cloud application into three tiers, the first two implemented with a set of microservices.
- Presentation Tier
- Business Logic Tier
- Data Tier
Functionality to process the data is implemented in the Business Logic Tier - it is too CPU and memory intensive to embed this as stored procedures directly in the data tier.
Each microservice instance does not have access to any shared data and must communicate via HTTP or queues.
Any data an instance does have to persist is stored locally on the same server to avoid network I/O.
- It is best that microservices in the cloud are stateless, i.e. all calls are self-contained and thus devoid of context (see later). However this may not be possible:-
- Presentation Tier: depends on web application design
- Business Logic Tier: agreement may be required to update data from 2 or more microservice instances
- note that this introduces distributed transactions which are bad for cloud scalability (see later lecture).
- for one thing it needs locks which introduces state
- and this is ironically exacerbated by making the code fine-grained...
Replication
- “A service is replicated when it has logically identical instances appearing on different nodes of the system”
- copy the service to multiple places
- requests can be addressed at any replica
- whenever the service state is updated, you must maintain consistency (more later but this can be hard – see CAP)
- in systems where any replica can receive updates, this is even harder
- replication can improve performance
- less client-to-server distance, less load on the service
- it is important to place replicas close to the users
- replication improves availability
- should spread replicas across different fault domains so they do not all become unavailable together
- servers can still be close, in network terms, to clients
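A minimal sketch of replication, under the simplifying assumption that all updates go through a single primary, which forwards them synchronously to its replicas; the consistency question only gets harder when any replica can accept updates:

```python
class Replica:
    def __init__(self, name: str):
        self.name = name
        self.data: dict[str, str] = {}

    def apply(self, key: str, value: str) -> None:
        self.data[key] = value

class Primary(Replica):
    def __init__(self, name: str, replicas: list[Replica]):
        super().__init__(name)
        self.replicas = replicas

    def write(self, key: str, value: str) -> None:
        self.apply(key, value)
        for r in self.replicas:      # maintain consistency on update
            r.apply(key, value)

eu, us = Replica("eu"), Replica("us")
primary = Primary("primary", [eu, us])
primary.write("greeting", "hello")
# Reads can go to whichever replica is closest to the user.
print(us.data["greeting"])
```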
Elasticity and Automation
- Two aspects:-
- load monitoring: set up some criteria to determine under what conditions the number of service instances is increased or decreased (or number of queues etc.)
- load balancing: share the load fairly between running service instances thereby maximising throughput
- Load monitoring to set instance numbers:-
- based on daily time or date
- after manually watching system
- automatically according to some performance metric of the current instances (a sketch of such a rule follows this list)
- CPU utilisation
- memory utilisation
- latency
- queue message processing rate
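A minimal sketch of such an automatic rule; the thresholds, bounds and metric are illustrative rather than taken from any particular platform:

```python
def desired_instances(current: int, avg_cpu: float,
                      low: float = 0.30, high: float = 0.70,
                      minimum: int = 2, maximum: int = 20) -> int:
    """Evaluate a CPU-utilisation scaling rule, as a load monitor
    would do periodically."""
    if avg_cpu > high:            # instances overloaded: scale out
        return min(current + 1, maximum)
    if avg_cpu < low:             # instances idle: scale in
        return max(current - 1, minimum)
    return current                # within band: leave alone

print(desired_instances(current=4, avg_cpu=0.85))  # -> 5
```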
Load Balancing
- Various approaches are used based on:-
- performance – directs the client to the best load-balanced node based on its latency or current connection count
- failover – directs the client to the primary node unless the primary node is down, in which case it redirects to a backup node
- round robin – directs the client to the nodes in a cyclic fashion (sketched below, together with failover)
- hash function – uses a hash of the protocol characteristics (e.g. the client TCP connection) to determine the node
The load balancer includes some functionality to probe the nodes to discover the appropriate metric.
TCP/HTTP/DNS latency
liveness
Azure and AWS incorporate load balancers
more sophisticated functionality available from 3rd parties
Kemp, Barracuda, Citrix…
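A minimal sketch of the round-robin and failover strategies; a real load balancer would combine these with the health probes described above:

```python
import itertools

class RoundRobinBalancer:
    """Directs clients to the nodes in a cyclic fashion."""

    def __init__(self, nodes: list[str]):
        self._cycle = itertools.cycle(nodes)

    def pick(self) -> str:
        return next(self._cycle)

class FailoverBalancer:
    """Directs clients to the primary unless a probe marked it down."""

    def __init__(self, primary: str, backup: str):
        self.primary, self.backup = primary, backup
        self.primary_up = True   # updated by liveness probes

    def pick(self) -> str:
        return self.primary if self.primary_up else self.backup

rr = RoundRobinBalancer(["node1", "node2", "node3"])
print([rr.pick() for _ in range(4)])  # node1, node2, node3, node1
```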
Load Balancing - Azure
There are a few options, but the basic one is an Internet-facing, transport-layer load balancer based on hash-based distribution.
- Uses a 5-field (source IP, source port, destination IP, destination port, protocol type) hash to map traffic to the available servers.
- all traffic from same client in a TCP connection goes to same server
- sticky sessions
- but a new connection from the same client (a new source port, hence a new hash) may go to another server
- so in a sense it incorporates some round-robin behaviour (see the sketch below)
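A minimal sketch of 5-tuple hash distribution; the hashing details are illustrative, not Azure's actual algorithm:

```python
import hashlib

def pick_server(src_ip, src_port, dst_ip, dst_port, protocol, servers):
    five_tuple = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{protocol}"
    digest = hashlib.sha256(five_tuple.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

servers = ["web1", "web2", "web3"]
# All packets of one TCP connection share the 5-tuple, so they stick
# to one server ("sticky")...
print(pick_server("1.2.3.4", 50000, "10.0.0.1", 80, "tcp", servers))
# ...but a new connection gets a new source port, hence (possibly) a
# different server - the quasi-round-robin behaviour noted above.
print(pick_server("1.2.3.4", 50001, "10.0.0.1", 80, "tcp", servers))
```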
More is at: 🔗 What is Azure Load Balancer?
A discussion on AWS Elastic Load Balancing is at: 🔗 Best Practices in Evaluating Elastic Load Balancing
Issues related to Scalability
- Other issues which affect scalability:-
- Synchronous/asynchronous communication
- State Management
- Caching
All three are in fact interrelated.
Synchronous Communication
- If we work synchronously, Client C calls Service S and waits for the response before continuing.
- this is the call semantics we are all familiar with
- these are roles: C may itself be a service
If we scale C (by adding more instances) then we must scale S.
otherwise S cannot cope and performance degrades
If S fails, C will fail as well.
they have a single availability characteristic
Asynchronous Communication
If we can decouple C and S somehow we can work asynchronously and not wait for the result.
- A standard way is to introduce a queue between C and S.
- C and S can scale independently
- web role/worker role pattern
- C can continue if S is down
- have independent availability characteristics
- queue can buffer messages at peak times allowing S to catch up
- may have to scale the queue...
- see lecture on "Data in the Cloud"
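A minimal sketch of the decoupled pattern, with Python's in-process queue.Queue standing in for a cloud queue service (e.g. Azure Storage queues or SQS):

```python
import queue
import threading

work = queue.Queue()               # buffers messages at peak times

def client(n: int) -> None:        # "C": enqueue and continue
    for i in range(n):
        work.put(f"job-{i}")       # does not wait for the result

def service() -> None:             # "S": drains at its own pace
    while True:
        job = work.get()
        print("processed", job)
        work.task_done()

# C and S run (and could scale, or fail) independently.
threading.Thread(target=service, daemon=True).start()
client(3)
work.join()                        # only the demo waits for the drain
```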
State Management
Stateless interaction means that the server does not maintain session state: even if the client needs a sequence of calls to the server to achieve a result, the server does not keep any interim state for them.
This does not mean that the state of server-side resources – the resource state (or application state), e.g. table data – cannot change when executing a request.
Consider the session state associated with stateful interaction to be data that could vary by client and by individual request.
Resource state is constant across every client which requests it.
State often discussed in terms of the web but it could be any server software.
The HTTP example
- HTTP by itself is stateless.
- it uses client-side session state management
- e.g. cookies which carry the session state in each HTTP call
- HTTP calls are therefore independent and self-contained
So if you have several web servers you can send any request to any web server, and scale easily behind a simple (round-robin) load balancer (sketched below).
also much more resilient to faults – requests can simply be sent to another instance
This is the basis of the SOA service instance pattern.
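A minimal sketch of client-side session state: every call carries its own context (here a cookie-like dict), so any instance behind the load balancer can serve it; the request shape is purely illustrative:

```python
def handle_request(request: dict) -> dict:
    # All context arrives with the call - nothing is read from
    # server-side session storage.
    basket = request.get("cookie", {}).get("basket", [])
    basket = basket + [request["item"]]
    # The updated state is returned for the client to carry on its
    # next call, which may land on a different server instance.
    return {"status": 200, "set_cookie": {"basket": basket}}

resp = handle_request({"item": "book", "cookie": {}})
resp = handle_request({"item": "pen", "cookie": resp["set_cookie"]})
print(resp["set_cookie"])  # {'basket': ['book', 'pen']}
```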
HTTP Session State Management
- However several technologies have introduced server-side state management.
- ASP.NET sessions, Java servlets...
- not so easy to load balance and scale...
Need a distributed cache to share session state between instances.
the worst solution in the cloud, from a performance and scalability standpoint, is to use a relational database backed session state provider
- Stateful interactions are best avoided in cloud applications.
- if possible – e.g. it may be a web app requirement
- and as we saw previously microservices in the Business Logic Tier often use it
The Stateless Cloud
AWS goes stateless...🔗 Three-Tier Cloud Application
Caching
- “The results of a query are cached by saving them at the requesting or intermediate node so that they may be reused instead of repeating the query”.
- this will reduce the number of queries addressed to the actual service/data store
- you must maintain the consistency of cached queries
- usually: do not update them, simply delete them when they become invalid
- how do you decide when to delete?
- expiration – timeout set
- validation – cache management checks whether the cached entry is still valid
- but … studies show over 50% of all HTTP requests are uncacheable
- security, web services, dynamic data…
- however can have a big impact overall on cloud scalability…
Caching in the Cloud
Aim to provide high throughput, low-latency access to commonly accessed application data (and especially static content), by storing the data in memory.
In the cloud the most useful type of cache is a distributed cache, which means the data is not stored in an individual VM server's memory but in another cloud resource.
i.e. in a set of separate specialised VM instances – “cache cluster”
- Advantages:-
- avoids high-latency access to a persistent data store, thereby dramatically improving application responsiveness
- benefits increase the more an application scales, as the throughput limits and latency delays of the persistent data store become more of a limit on overall application performance
- can share session state between web server instances
- the cached data remains accessible through faults and upgrades to the main instances
- acts as a circuit breaker
- can facilitate replicated soft-state in BASE (see later)
- Best used when:-
- do more reads than writes
- a lot of data is shared that does not frequently change
- e.g. for product data rather than for data unique to a user
Cache Population Strategies
On Demand / Cache Aside: The application tries to retrieve data from cache, and when the cache doesn't have the data (a "miss"), the application stores the data in the cache so that it will be available the next time. The next time the application tries to get the same data, it finds what it's looking for in the cache (a "hit"). To prevent fetching cached data that has changed, you invalidate the cache when making changes to the data store.
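A minimal sketch of cache-aside, folding in the expiration policy discussed earlier; the module-level dict stands in for a distributed cache and db_query for the persistent store:

```python
import time

CACHE: dict[str, tuple[float, object]] = {}
TTL = 60.0                                   # expiration timeout (s)

def db_query(key: str) -> str:
    return f"value-of-{key}"                 # stand-in for a slow store

def get(key: str) -> object:
    entry = CACHE.get(key)
    if entry and time.time() < entry[0]:     # hit, still valid
        return entry[1]
    value = db_query(key)                    # miss (or expired)
    CACHE[key] = (time.time() + TTL, value)  # populate for next time
    return value

def update(key: str, value: str) -> None:
    # ...write to the data store here, then invalidate rather than
    # update the cached entry.
    CACHE.pop(key, None)
```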
Background Data Push: Background services push data into the cache on a regular schedule, and the application always pulls from the cache. Works well with high latency data sources that are not required to always return the latest data.
Cloud Caching for real
Distributed in-memory caching has been around for a while.
Memcached, NCache...
- Redis now more popular in cloud architectures
- see: 🔗 Introduction to Redis
- a persistent key/value store (see "Data in the Cloud" lecture)
- there is a simple API
- basically you provide a key and a value, and later the key can be used to retrieve the value (see the sketch at the end of this section)
- asynchronous replication between multiple cache servers
- see: 🔗 Replication
- resynchronisation mechanism after recovered network partitions
Redis now both in Azure and AWS ElastiCache.
open source – Linux, with a MS port to Windows Server
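A minimal usage sketch with the redis-py client (pip install redis); the host and key names are illustrative:

```python
import redis

r = redis.Redis(host="my-cache.example.net", port=6379)

# Provide a key and a value; ex=60 sets a 60 s expiration timeout.
r.set("product:42", '{"name": "kettle", "price": 19.99}', ex=60)

# Later, the key retrieves the value.
data = r.get("product:42")        # returns bytes, or None on a miss
print(data)
```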