#Slony-I Failover / Switchover

Perl script to assist with performing switchover and failover of replication sets
in PostgreSQL databases replicated using Slony-I.

The script can be run in interactive mode to suggest switchover or failover and
will create a slonik script to perform the suggested action.

It's hard to put together a script for all situations, as different Slony
configurations can have different complexities (hence the existence of slonik),
but this script is intended to be used for building and running slonik scripts
to move all sets from one node to another.

There is also an autofailover mode which will sit and poll each node and perform
a failover of failed nodes.  This mode should be considered experimental, as
there can be quite a few decisions to be made when failing over different setups.

##Example usage

Launch interactive mode with command line parameters:

```bash
$ ./slony_failover.pl -h localhost -db TEST -cl test_replication
```

Launch with a configuration file. A minimal configuration file would be:

    slony_database_host = localhost
    slony_database_name = TEST
    slony_cluster_name = test_replication

```bash
$ ./slony_failover.pl -f slony_failover.conf
```
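
For the experimental autofailover mode, the same file would also need `enable_autofailover` switched on. The following is a sketch only: the parameter names come from the configuration table in this README, but writing booleans as a bare `true` is an assumption, and the timing values are simply the documented defaults.

```text
slony_database_host = localhost
slony_database_name = TEST
slony_cluster_name = test_replication
enable_autofailover = true
autofailover_poll_interval = 500
autofailover_node_retry = 2
autofailover_sleep_time = 1000
```
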

Run as a daemon on Debian:

```bash
$ sudo cp init.debian /etc/init.d/slony_failover
$ cp slony_failover.conf /var/slony/slony_failover/slony_failover.conf
$ sudo chmod +x /etc/init.d/slony_failover
$ sudo update-rc.d slony_failover start 99 2 3 4 5 . stop 24 0 1 6
$ sudo invoke-rc.d slony_failover start
```

##Command line parameters

```bash
$ ./slony_failover.pl [options]
```

|Switch    | Description
|----------|------------------------------------------
|-f        |Read all configuration from config file
|-h        |Host running PostgreSQL instance to read state of Slony-I cluster from
|-p        |Port of above PostgreSQL database instance
|-db       |Name of above PostgreSQL database instance
|-cl       |Name of Slony-I cluster
|-u        |User to connect to above PostgreSQL database instance
|-P        |Password for above user (Use .pgpass instead where possible)
|-i        |Print information about slony cluster and exit

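As the `-P` entry suggests, a `.pgpass` file keeps the password off the command line. PostgreSQL reads one entry per line from `~/.pgpass` in the form `hostname:port:database:username:password`, and ignores the file unless its permissions are `0600` or stricter. A sample entry matching the examples in this README is shown here; `secret` is a placeholder password and `slony` is the default user listed in the configuration table:

```text
localhost:5432:TEST:slony:secret
```
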
##Configuration file parameters

| Section     | Parameter                                    | Type                          | Default                         | Comment
|:------------|:-------------------------------------------- |:------------------------------|:--------------------------------|:-----------------------------------
| General     |**lang**                                      | en/fr                         | *'en'*                          | The language to print messages in; currently only English and French
| General     |**prefix_directory**                          | /full/path/to/directory       | *'/tmp/slony_failovers'*        | Working directory for script to generate slonik scripts and log files
| General     |**separate_working_directory**                | boolean                       | *'true'*                        | Append a separate working directory to the prefix_directory for each run
| General     |**slonik_path**                               | /full/path/to/bin/directory   | *null*                          | Directory containing the slonik binary, if not in the current path
| General     |**pid_filename**                              | /path/to/pidfile              | *'/var/run/slony_failover.pid'* | Pid file to use when running in autofailover mode
| General     |**enable_try_blocks**                         | boolean                       | *false*                         | Write slonik script with try blocks where possible to aid error handling
| General     |**lockset_method**                            | single/multiple               | *'multiple'*                    | Write slonik script that locks all sets
| General     |**pull_aliases_from_comments**                | boolean                       | *false*                         | If true, script will pull text from comment fields and use it to generate
|             |                                              |                               |                                 | possibly meaningful aliases for nodes and sets.
|             |                                              |                               |                                 | For sl_set this uses the entire comment, and for sl_node the text in parentheses.
| General     |**log_line_prefix**                           | text                          | *null*                          | Prefix to add to log lines, special values:
|             |                                              |                               |                                 |     %p = process ID
|             |                                              |                               |                                 |     %t = timestamp without milliseconds
|             |                                              |                               |                                 |     %m = timestamp with milliseconds
| General     |**failover_offline_subscriber_only**          | boolean                       | *false*                         | If set to true, any subscriber-only nodes that are unavailable at the time
|             |                                              |                               |                                 | of failover will also be failed over.  If false, any such nodes will be
|             |                                              |                               |                                 | excluded from the preamble and not failed over; however, this may be problematic,
|             |                                              |                               |                                 | especially in the case where the most up-to-date node is the unavailable one.
| General     |**drop_failed_nodes**                         | boolean                       | *false*                         | After failover automatically drop the failed nodes.
| Slon Config |**slony_database_host**                       | IP Address/Hostname           | *null*                          | PostgreSQL Hostname of database to read Slony configuration from
| Slon Config |**slony_database_port**                       | integer                       | *5432*                          | PostgreSQL Port of database to read Slony configuration from
| Slon Config |**slony_database_name**                       | name                          | *null*                          | PostgreSQL database name to read Slony configuration from
| Slon Config |**slony_database_user**                       | username                      | *'slony'*                       | Username to use to connect when reading Slony configuration
| Slon Config |**slony_database_password**                   | password                      | *''*                            | Recommended to leave blank and use .pgpass file
| Slon Config |**slony_cluster_name**                        | name                          | *null*                          | Name of Slony-I cluster to read configuration for
| Logging     |**enable_debugging**                          | boolean                       | *false*                         | Enable printing of debug messages to stdout
| Logging     |**log_filename**                              | base file name                | *'failover.log'*                | File name to use for script process logging, special values as per strftime spec
| Logging     |**log_to_postgresql**                         | boolean                       | *false*                         | Store details of failover script runs in a PostgreSQL database
| Logging     |**log_database_host**                         | IP Address/Hostname           | *null*                          | PostgreSQL Hostname of logging database
| Logging     |**log_database_port**                         | integer                       | *null*                          | PostgreSQL Port of logging database
| Logging     |**log_database_name**                         | name                          | *null*                          | PostgreSQL database name of logging database
| Logging     |**log_database_user**                         | username                      | *null*                          | Username to use to connect when logging to database
| Logging     |**log_database_password**                     | password                      | *''*                            | Recommended to leave blank and use .pgpass file
| Autofailover|**enable_autofailover**                       | boolean                       | *'false'*                       | Rather than running in interactive mode, sit and watch the cluster state for failed
|             |                                              |                               |                                 | origin/forwarding nodes; upon detection trigger automated failover.
|             |                                              |                               |                                 |
| Autofailover|**autofailover_forwarding_providers**         | boolean                       | *'false'*                       | If true a failure of a pure forwarding provider will also trigger failover
| Autofailover|**autofailover_config_any_node**              | boolean                       | *'true'*                        | After reading the initial cluster configuration, subsequent reads of the configuration
|             |                                              |                               |                                 | will use conninfo read from sl_subscribe to read from any node.
| Autofailover|**autofailover_poll_interval**                | integer                       | 500                             | How often to check for failure of nodes (milliseconds)
| Autofailover|**autofailover_node_retry**                   | integer                       | 2                               | When failure is detected, retry this many times before initiating failover
| Autofailover|**autofailover_sleep_time**                   | integer                       | 1000                            | Interval between retries (milliseconds)

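The three autofailover timing parameters describe a poll-then-retry pattern, which can be sketched roughly as below. This is an illustration under assumptions and not the script's actual code: `probe` and `node_has_failed` are hypothetical names, and the health check is whatever command is passed in. With a real check, a call such as `node_has_failed pg_isready -h localhost -p 5432` would succeed only once the first check and every retry had failed, and an outer loop would repeat it every `autofailover_poll_interval` milliseconds.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; function names are hypothetical and this is
# not taken from slony_failover.pl.

node_retry=2        # mirrors autofailover_node_retry
sleep_time_ms=1000  # mirrors autofailover_sleep_time (milliseconds)

# Run a health-check command, discarding its output.
probe() {
  "$@" >/dev/null 2>&1
}

# Exit status 0 means "treat the node as failed": the first check and
# every retry failed. Exit status 1 means some check succeeded.
node_has_failed() {
  local attempt secs
  if probe "$@"; then return 1; fi
  for (( attempt = 1; attempt <= node_retry; attempt++ )); do
    # Convert milliseconds to fractional seconds, e.g. 1000 -> 1.000
    secs="$(( sleep_time_ms / 1000 )).$(printf '%03d' $(( sleep_time_ms % 1000 )))"
    sleep "$secs"
    if probe "$@"; then return 1; fi
  done
  return 0
}
```
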
Changes
-------

* 08/04/2012 - Hash together some ideas for interactive failover perl script
* 04/11/2012 - Experiment with different use of try blocks (currently can't use multiple lock sets inside try)
* 13/04/2014 - Update to work differently for Slony 2.2+
* 05/05/2014 - Experiment with autofailover ideas

Licence
-------
See the LICENCE file.