High Availability (HA) for node clusters
Siren Alert supports High Availability (HA) for reporting on node clusters. This keeps the alerting system running when a cluster's master node fails by switching the master's responsibilities to another node. Reporting is muted on every node except the master, which prevents duplicate reports from being sent for each alert.
Functional overview
In a cluster, there is only one master node; all other nodes are slaves. If the master goes down, the first slave that detects the failure is elected the new master.
Time is used to define master and slave statuses and to identify dead nodes; it is represented as the number of seconds since the Unix epoch. The master has a priority (priority_for_master), and all slaves poll the current master at a fixed interval (loop_delay). If the master does not update its time within the specified period (absent_time), it is considered offline and a new master is elected. All nodes that are absent for a specified period (absent_time_for_delete) are considered dead and are deleted from memory.
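The timing rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not Siren Alert's actual implementation: the `Node` class, the `sweep` function, and the node name `rabbit` are hypothetical, and the sketch assumes that the polling node promotes itself as soon as it sees a stale master.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    node_id: str
    priority: int     # 0 = master, 1+ = slave (mirrors host.priority)
    last_seen: float  # seconds since the Unix epoch


def sweep(nodes: Dict[str, Node], observer_id: str, now: float,
          priority_for_master: int = 0,
          absent_time: float = 15,
          absent_time_for_delete: float = 86400) -> Optional[str]:
    """One polling pass by `observer_id`; returns the master's id afterwards."""
    # The polling node records its own heartbeat first.
    nodes[observer_id].last_seen = now

    # Forget nodes that have been silent longer than absent_time_for_delete.
    dead = [n.node_id for n in nodes.values()
            if now - n.last_seen > absent_time_for_delete]
    for node_id in dead:
        del nodes[node_id]

    # Find the node that currently holds the master priority, if any.
    master = next((n for n in nodes.values()
                   if n.priority == priority_for_master), None)
    if master is not None and now - master.last_seen <= absent_time:
        return master.node_id  # master updated its time recently enough

    # Master is missing or stale: demote it (assumption) and promote the
    # first node that noticed, which is the observer.
    if master is not None:
        master.priority = priority_for_master + 1
    nodes[observer_id].priority = priority_for_master
    return observer_id


if __name__ == "__main__":
    cluster = {
        "trex": Node("trex", priority=0, last_seen=1000.0),
        "rabbit": Node("rabbit", priority=1, last_seen=1000.0),
    }
    print(sweep(cluster, "rabbit", now=1010.0))  # trex silent for 10 s
    print(sweep(cluster, "rabbit", now=1020.0))  # trex silent for 20 s
```

Each slave would call something like `sweep` once every loop_delay seconds. With the default absent_time of 15, the first call in the demo keeps `trex` as master and the second one elects `rabbit`.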
Siren Alert cluster setup
Cluster configuration with High Availability
sentinl:
  settings:
    cluster: # configuration for the cluster
      enabled: boolean # (optional, default: false) enable / disable the cluster configuration
      debug: boolean # (optional, default: false) debug output in the Investigate console
      name: string # (optional, default: 'sentinl') name of the cluster configuration
      priority_for_master: number # (optional, default: 0) master's node priority, see below host.priority
      absent_time_for_delete: number # (optional, default: 86400) how long, in seconds, before a node is removed from the cluster
      absent_time: number # (optional, default: 15) how long, in seconds, the slaves wait for a response from the master before electing a new master
      loop_delay: number # (optional, default: 15) how long, in seconds, between polls from slave to master
      cert: # configuration for security's certificate
        selfsigned: boolean # (optional, default: true) if certificate is self-signed
        valid: number # (optional, default: 10) validation of certificate
        key: string | null # (optional, default: undefined) path to key
        cert: string | null # (optional, default: undefined) path to certificate
      gun: # configuration for each gun db host
        peers: string[] # (required) contains URLs to all gun db instances, including this one
        host: string # (optional, default: localhost) gun db host
        port: number # (optional, default: 9000) unique for each gun db host
        # Note: Must be set to a unique value when you run more than one Investigate process on the same machine
        cache: string # (optional, default: 'optimize/gun-server-data.json') path to gun server db cache file
        # Note: Must be set to a unique value when you run more than one Investigate process from the same folder
      host: # host's configuration
        id: string # (required) must be a unique ID
        priority: number # (required) priority 0 = master, priority 1+ = slave
        name: string # (optional, default: 'investigate-gun-host') name of the node
        node: string # (optional, default: 'investigate-gun-hosts') name of node within gun DB
        # Note: all gun db instance configurations in an HA cluster must share the same node name
        cache: string # (optional, default: 'optimize/gun-host-data.json') path to gun host db cache file
        # Note: Must be set to a unique value when you run more than one Investigate process from the same folder
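Several of the notes above are cross-instance constraints: host.id must be unique, host.node must be shared, gun.peers is required, and gun.port must be unique for processes on the same machine. They are easy to get wrong when editing several files by hand. The following Python sketch shows one way such a check could look. It is a hypothetical helper, not part of Siren Alert, and it assumes each configuration has already been parsed into a dictionary holding the contents of the `cluster` key.

```python
from collections import Counter
from typing import Dict, List


def check_cluster(configs: List[Dict], same_machine: bool = False) -> List[str]:
    """Return human-readable problems; an empty list means none were found.

    `configs` holds one parsed `cluster` mapping per Investigate instance.
    Set `same_machine=True` when all instances run on one machine, so that
    gun.port uniqueness is checked as well.
    """
    problems: List[str] = []

    # host.id is required and must be unique.
    ids = Counter(c["host"]["id"] for c in configs)
    problems += [f"duplicate host.id {i!r}" for i, n in ids.items() if n > 1]

    # All instances must share the same host.node (documented default applied).
    node_names = {c["host"].get("node", "investigate-gun-hosts") for c in configs}
    if len(node_names) > 1:
        problems.append(f"host.node differs across instances: {sorted(node_names)}")

    # gun.peers is required and must list the gun db instances.
    for c in configs:
        if not c["gun"].get("peers"):
            problems.append(f"host {c['host']['id']!r}: gun.peers is required")

    # gun.port must be unique among processes on the same machine.
    if same_machine:
        ports = Counter(c["gun"].get("port", 9000) for c in configs)
        problems += [f"gun.port {p} used by {n} instances"
                     for p, n in ports.items() if n > 1]

    return problems
```

An empty result only means that none of these particular rules were violated; the cache-path rules, for example, are not checked here.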
Example of configuration
The following cluster topology is an example of HA configuration:
In the following configuration examples, the ellipsis (…) indicates that the options here are identical to the options specified in the example above.
elasticsearch.yml
Host Trex
cluster.name: siren-distribution
network.host: [_local_, _enp2s0_]
discovery.zen.minimum_master_nodes: 2
node.name: trex
discovery.zen.ping.unicast.hosts: ["172.126.0.5", "192.168.0.12"]
investigate.yml
Host Trex
sentinl:
  settings:
    cluster:
      enabled: true
      name: 'sentinl'
      priority_for_master: 0
      absent_time_for_delete: 86400
      absent_time: 15
      loop_delay: 5
      cert:
        selfsigned: true
        valid: 10
      gun:
        port: 9000
        host: '0.0.0.0'
        cache: 'data.json'
        peers: ['https://localhost:9000/gun', 'https://172.16.0.5:9000/gun', 'https://192.168.0.12:9000/gun']
      host:
        id: '123'
        name: 'trex'
        priority: 0
        node: 'hosts'