Abiquo 2.6



Introduction

Our customers’ Abiquo clouds are growing every month, with more concurrent users, more VMs deployed, and so on, so the Abiquo team must ensure that the platform does not limit our customers’ business.
 
Scalability is a key requirement and the foundation of scalability is load balancing across multiple APIs.
 
This feature enables customers to add a load balancing component in front of multiple servers running API.war to distribute the load among the API nodes, thus supporting more concurrent requests and providing failover capability.

Recommended Setup

To configure multiple API nodes, first set up a 'datanode'. The datanode contains all the basic infrastructure that Abiquo APIs and remote services need to communicate and to store the data required for normal Abiquo API functionality. On the datanode, you must install:

  • MariaDB or MySQL 5.5 Database 
  • RabbitMQ
  • Redis
  • Apache Zookeeper

These products may be installed with their default configuration because all the necessary settings are defined in the Abiquo API or Abiquo Remote Services configuration. Note that it is not necessary to install all of the datanode services on the same machine; for example, you could have a separate database server. In addition, you can configure fault tolerance as required; for example, you could configure MySQL with primary-secondary replication.
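
As a very rough sketch, on a single RHEL/CentOS-style datanode the four services could be installed and started as follows. Package names, repositories, and service names vary by distribution and version, so treat this as an outline only, not an exact procedure:

# Install the datanode services (package names are distribution-dependent)
yum install mysql-server rabbitmq-server redis zookeeper

# Start them and enable them at boot
for svc in mysqld rabbitmq-server redis zookeeper; do
    service $svc start
    chkconfig $svc on
done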

User-generated API requests will need to be distributed by installing a load balancer (e.g. Apache or HAProxy) in front of this configuration (see our Load balancing client and API using Apache v2.4 and v2.6 or Load balancing client and API with HAProxy v2.4 and v2.6 guides).
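
For illustration only, a minimal HAProxy backend for two hypothetical API nodes could look like the sketch below. The addresses, port, and health checks are assumptions; follow the guides above for the supported configuration:

# Hypothetical HAProxy configuration balancing two API nodes
frontend abiquo_api
    bind *:80
    mode http
    default_backend abiquo_api_nodes

backend abiquo_api_nodes
    mode http
    balance roundrobin
    server api1 10.10.1.10:8080 check
    server api2 10.10.1.11:8080 check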

When you have the datanode ready, deploy as many API nodes and/or Remote Services nodes as desired to improve performance and fault tolerance. Abiquo will internally distribute all event processing through the API nodes.

The following diagram shows how the load-balanced APIs will require access to the same abiquo.properties configuration and the same shared 3rd-party services.  

 

To provide fault tolerance we have defined a Leader Election recipe. The recipe ensures that at all times one of the API nodes is the Leader, which deals with the events sent from any module in the platform. User requests are balanced and distributed to any of the API nodes (including the Leader itself). We use Zookeeper to keep track of all live API nodes and to always select one, and only one, Leader, so that all events are processed.

Property configuration
API Nodes

All API nodes must have a common abiquo.properties configuration to ensure that they all access the same information. These are the common properties that all the API nodes must be configured to use:

# RabbitMQ
abiquo.rabbitmq.username 
abiquo.rabbitmq.password 
abiquo.rabbitmq.host 
abiquo.rabbitmq.port 
# Redis
abiquo.redis.port 
abiquo.redis.host
# Zookeeper
abiquo.api.zk.serverConnection
# MySQL
abiquo.database.user
abiquo.database.password
abiquo.database.host

These properties are marked in green in the Abiquo Configuration Properties documentation. 
Note that the datanode Redis server is for the API nodes ONLY.
Remember that you must also set the JDBC configuration in the api.xml file to point to the MySQL database. See the documentation on How to set up a remote MySQL database server.

 

abiquo.rabbitmq.username=foo
abiquo.rabbitmq.password=bar
abiquo.rabbitmq.host=10.10.1.5
abiquo.rabbitmq.port=5672
abiquo.redis.port=6379
abiquo.redis.host=10.10.1.5
abiquo.api.zk.serverConnection=10.10.1.5:2181
abiquo.database.user=abiquo
abiquo.database.password=mypa55word
abiquo.database.host=10.10.1.5
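
For the api.xml JDBC setting mentioned above, a Tomcat JNDI datasource pointing at the datanode database would look roughly like the following sketch. Keep the resource name and other attributes from your existing api.xml; kinton is assumed here as the default Abiquo database name:

<!-- Point the existing datasource at the datanode MySQL server -->
<Resource name="jdbc/abiquoDB" auth="Container" type="javax.sql.DataSource"
          driverClassName="com.mysql.jdbc.Driver"
          url="jdbc:mysql://10.10.1.5:3306/kinton?autoReconnect=true"
          username="abiquo" password="mypa55word"
          maxActive="100" maxIdle="30" maxWait="10000" />
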
Remote Services (including V2V Services)

All remote services servers in datacenters with API load balancing must be configured to use the same RabbitMQ instance.

# RabbitMQ
abiquo.rabbitmq.username 
abiquo.rabbitmq.password 
abiquo.rabbitmq.host 
abiquo.rabbitmq.port

 
abiquo.rabbitmq.username=foo
abiquo.rabbitmq.password=bar
abiquo.rabbitmq.host=10.10.1.5 
abiquo.rabbitmq.port=5672

The Remote Services servers only communicate with the datacenter notifications queue in the datanode RabbitMQ instance. DO NOT change the Redis properties for the datacenters.

Difference between multiple datacenters and multiple APIs

A single Abiquo installation can handle multiple datacenters, as shown in the following diagram. Divisions between datacenters are questions of partitioning and geographical proximity. For example, you might add a new datacenter to provide a service that is geographically closer to users, or to limit the number of users affected by an outage if a datacenter service fails.

To scale up the cloud service, add another API node, which will distribute the load across one more node. The following simplified diagram shows the same multiple-datacenter environment with multiple API nodes.

 

Note on API Leader concept

The asynchronous tasks between API nodes and remote services instances are coordinated with RabbitMQ. All APIs are able to process requests from clients and queue asynchronous tasks to remote services and the API itself. The API leader node is the only one that consumes from the scheduler queue (because the requests are to be processed one by one) and the remote services response queues (because all of the messages must be consumed and processed in order).
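
You can observe this from the datanode by listing the RabbitMQ queues with their consumer counts. The exact queue names depend on the Abiquo version, but the scheduler and datacenter response queues should each show a single consumer (the leader):

# Run on the datanode; shows each queue with its consumer and message counts
rabbitmqctl list_queues name consumers messages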

To guarantee that there will always be one leader, we use the Apache Curator framework. This is a well-known and widely adopted solution that guarantees there will be exactly one leader, or none at all (if no API is up and running).

In the worst-case scenario, when the leader fails while processing a message, another leader will be elected and will continue with the job. Asynchronous jobs in the leader API will take care of any message left behind.

Abiquo Scheduler

Checking the current leader

When a node takes the leadership, it will print the following message in api.log:

INFO c.a.a.w.l.LeadElectionContextListener - Current API is the /api/leader-election leader
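
For example, you can check each node's log with a command such as the following, adjusting the path to wherever your Tomcat instance writes api.log:

grep "LeadElectionContextListener" /path/to/tomcat/logs/api.log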

 

All the API participants are registered under the znode path /api/leader-election:

$ zktreeutil -z 10.60.1.5:2181 -p /api -D
...

/
|   
|--[api]
|   |   
|   |--[leader-election]
|       |   
|       |--[_c_151aa53a-1e97-4c96-b4f6-ac7e70e36bef-lock-0000000009 => eruiz/10.60.1.228]
|       |   
|       |--[_c_f6af1685-0da8-4a36-88b2-1ceba2ff15f6-lock-0000000008 => apuig/10.60.1.223]
|   
|--[zookeeper]
    |   
    |--[quota]


The current leader is the registered node with the lowest lock value.

0000000008 < 0000000009 -->  ''apuig/10.60.1.223''

The znode content is the node hostname, so it is important to configure the hostname on each API node to avoid an unhelpful localhost/127.0.0.1 entry.
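
As an example only, on a RHEL/CentOS 6 style API node a meaningful hostname could be set as follows (the name and address below are placeholders):

# Set the hostname for the running system
hostname api1.example.com

# Persist it (edit any existing HOSTNAME line instead of duplicating it)
echo "HOSTNAME=api1.example.com" >> /etc/sysconfig/network

# Make it resolve to the node's own address rather than 127.0.0.1
echo "10.60.1.223 api1.example.com api1" >> /etc/hosts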

Basic API Leader Example

For example, for the asynchronous deployment task of a virtual machine, there are three jobs:

  1. Schedule: Selecting the physical machine and reserving resources.
  2. Configure: Create the virtual machine in the hypervisor.
  3. Power on: Start the virtual machine on the hypervisor.

In a multiple-API environment, any API can queue the scheduling task. The leader will process the scheduling task and then queue the deploy task (configure and power-on jobs) in the virtual factory. When the virtual factory completes each job, the result is put in the datacenter notifications queue.

Addition of an API node to a running cluster

No specific configuration is required to add a node to a running cluster. Simply replicate the abiquo.properties configuration on the new node and register the node in the load balancer used in the environment.
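
For example, with an HAProxy configuration like the sketch earlier, adding a third hypothetical API node is one more server line in the backend:

backend abiquo_api_nodes
    mode http
    balance roundrobin
    server api1 10.10.1.10:8080 check
    server api2 10.10.1.11:8080 check
    server api3 10.10.1.12:8080 check

Then reload the load balancer (for example, service haproxy reload) so it starts sending requests to the new node.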

Limitations or clarifications

  • Load balancing is focused on the central API node. No load balancing is applied to Remote Services, which are not a bottleneck because they have a RabbitMQ instance to manage requests.
  • All API instances must be configured with the same values in abiquo.properties.
  • Multiple node installation: since the technologies used in our datanode (MariaDB, Redis, RabbitMQ, Zookeeper) are widely known and properly documented on their own project homepages, all issues related to balancing, sharding, replication, etc. affecting them are up to system administrators. We currently do not provide any support on how to install or configure these systems' replication or balancing features.