Symlinks are easy in Corosync/HA. Coupled with Cron, however, they get to be a bit of a pain. The problems that need to be addressed:
1. During a normal server startup Cron starts before Corosync, which means that if /var/spool/cron is missing, Cron will create it. This is not a bad thing, just not optimal if we want to link to an NFS share holding all the cron jobs that need to be available on all the HA nodes. The OCF Symlink resource will error out until you delete the directory, and once the symlink is made you have to restart Cron to get everything working. Kind of defeats the purpose of headache-free HA. ;)
2. Cron really does need to be running on all nodes in the cluster if only for log rotation.
3. One could combine the standard symlink RA with a crond RA and set the start order and grouping. However, you would then need a second clone group to ensure Cron is running on every node. Unfortunately this does not work due to race conditions.
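The clash described in point 1 can be reproduced without a cluster. The following is only a sketch under stated assumptions: demo_clash is a hypothetical helper, and every path lives in a throwaway scratch directory created with mktemp rather than in the real /var/spool/cron.

```shell
#!/bin/bash
# Illustrative sketch only: demo_clash is a hypothetical helper and all
# paths live in a scratch directory, not in the real filesystem locations.
demo_clash() {
    local scratch link target before after
    scratch=$(mktemp -d)
    mkdir -p "$scratch/nfs/var/spool/cron"   # stands in for the NFS share
    mkdir -p "$scratch/var/spool/cron"       # stands in for the directory Cron created
    link="$scratch/var/spool/cron"
    target="$scratch/nfs/var/spool/cron"

    # A directory is in the way: ln -s puts the new link inside it
    # instead of replacing it, so $link is still a plain directory.
    ln -s "$target" "$link"
    if [ -L "$link" ]; then before=symlink; else before=directory; fi

    # Remove the directory first, then create the link.
    rm -rf "$link"
    ln -s "$target" "$link"
    if [ -L "$link" ]; then after=symlink; else after=directory; fi

    echo "before=$before after=$after"
    rm -rf "$scratch"
}

demo_clash
```

The first attempt leaves a directory in place and the second yields a real symlink, which is why the directory has to go before the link can be made. A running Cron would then still need a restart to pick up the new location.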
By modifying the original Symlink RA script I was able to get a very nice Cron Symlink RA that works perfectly for me. A standard symlink would look like this:
primitive cronlinked ocf:heartbeat:symlink \
    params link="/var/spool/cron" target="/mnt/imports/nvwh2.bluedotmedia.de/var/spool/cron" \
    op monitor interval="15" timeout="15" on-fail="ignore" \
    meta target-role="Started"
As you can see, you have to set on-fail to “ignore”; otherwise the resource will just fail over because of the directory that Cron created when it started. Here, however, is the same resource using the new cronlink RA:
primitive cronlinked ocf:itadmins:cronlink \
    params link="/etc/cron.d/cronscript /var/spool/cron" target="/mnt/nfsshare/etc/cron.d/cronscript /mnt/nfsshare/var/spool/cron" croninit="/usr/sbin/service cron restart" \
    meta target-role="Started" \
    op monitor interval="15" timeout="15" on-fail="restart"
Exactly as before, except that you can now tell Corosync how to restart Cron once the link has been created, and you no longer have to set the resource to be “ignored” on failure. You can also set multiple link/target pairs (space separated), which works perfectly if you want certain jobs running on certain nodes in the cluster unless, of course, a fail-over ensues.
In the newest version I have had to move to Bash because of the use of arrays, and monitoring now fails on the first error it encounters. I have had no issues with this, but if you have any, let me know.
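How the space-separated link and target parameters line up can be sketched roughly as follows. pair_links is a hypothetical helper written for illustration and is not part of the agent; it assumes, as the agent does, that no path contains whitespace.

```shell
#!/bin/bash
# pair_links is a hypothetical helper for illustration only.
# $1: space-separated list of links, $2: space-separated list of targets.
pair_links() {
    # Unquoted on purpose: word splitting turns each list into an array.
    local links=($1) targets=($2)
    local i

    # The n-th link belongs to the n-th target, so the counts must match.
    if [ "${#links[@]}" -ne "${#targets[@]}" ]; then
        echo "link/target count mismatch" >&2
        return 1
    fi

    for i in "${!links[@]}"; do
        echo "${links[$i]} -> ${targets[$i]}"
    done
}

pair_links "/etc/cron.d/cronscript /var/spool/cron" \
           "/mnt/nfsshare/etc/cron.d/cronscript /mnt/nfsshare/var/spool/cron"
```

Pairing is purely positional, so the order of the entries in link and target matters.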
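The "fail on the first error" behaviour during monitoring amounts to something like the sketch below. check_link and monitor_all are hypothetical stand-ins for the agent's functions, and the two OCF return codes are defined by hand here only so the snippet runs without ocf-shellfuncs (0 and 7 are the standard OCF values for success and not running).

```shell
#!/bin/bash
# Standalone sketch: in a real agent these values come from ocf-shellfuncs.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# check_link: hypothetical, simplified stand-in for the per-link monitor.
check_link() {
    [ -L "$1" ] && return $OCF_SUCCESS
    return $OCF_NOT_RUNNING
}

# monitor_all: stop at the first link that is not healthy and return its code.
monitor_all() {
    local link rc
    for link in "$@"; do
        check_link "$link"
        rc=$?
        if [ "$rc" -ne "$OCF_SUCCESS" ]; then
            echo "first failure: $link (rc=$rc)"
            return "$rc"
        fi
    done
    return $OCF_SUCCESS
}

# Demo in a scratch directory: one good link, then a missing one.
scratch=$(mktemp -d)
ln -s /tmp "$scratch/good"
monitor_all "$scratch/good" "$scratch/missing" "$scratch/never-checked"
echo "monitor_all returned $?"
rm -rf "$scratch"
```

The third path is never inspected, because the loop returns as soon as the second one reports not running.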
#!/bin/bash
#
#
# An OCF RA that manages symlinks for Cron
#
# Copyright (c) 2011 Dominik Klein
# Modified by Charles Williams 2012 - 2015
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like. Any license provided herein, whether implied or
# otherwise, applies only to this software file. Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#

#######################################################################
# Initialization:

: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

#######################################################################

meta_data() {
    cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="cronlink">
<version>1.8</version>

<longdesc lang="en">
This resource agent manages a symbolic link (symlink) for Cron.

It is primarily intended to manage /var/spool/cron which is automatically
created by Cron when it starts. This resource removes that directory
(or another one) before creating the symlink and restarting Cron.

It will also create symlinks to cronjobs in the /etc/cron* directories as well.
This means no longer needing to "clusterafy" your cronjobs.

Link to target pairs can also be used for multiple symlinks as follows:

primitive cronlinked ocf:heartbeat:cronlink \
    params link="/etc/cron.d/cronscript /var/spool/cron" target="/mnt/nfsshare/etc/cron.d/cronscript /mnt/nfsshare/var/spool/cron" croninit="/usr/sbin/service cron restart" \
    meta target-role="Started" \
    op monitor interval="15" timeout="15" on-fail="restart"
</longdesc>
<shortdesc lang="en">Manages a symbolic link for Cron</shortdesc>

<parameters>
<parameter name="link" required="1">
<longdesc lang="en">
Full path of the symbolic link to be managed. This must obviously be
in a filesystem that supports symbolic links.
</longdesc>
<shortdesc lang="en">Full path of the symlink</shortdesc>
<content type="string"/>
</parameter>
<parameter name="target" required="1">
<longdesc lang="en">
Full path to the link target (the file or directory which the symlink points to).
</longdesc>
<shortdesc lang="en">Full path to the link target</shortdesc>
<content type="string" />
</parameter>
<parameter name="croninit" required="1">
<longdesc lang="en">
Full command to restart Cron.
</longdesc>
<shortdesc lang="en">Cron restart command</shortdesc>
<content type="string"/>
</parameter>
<parameter name="backup_suffix">
<longdesc lang="en">
A suffix to append to any files that the resource agent moves out of
the way because they clash with "link".

If this is unset (the default), then the resource agent will simply
refuse to create a symlink if it clashes with an existing file.
</longdesc>
<shortdesc lang="en">Suffix to append to backup files</shortdesc>
<content type="string" />
</parameter>
</parameters>

<actions>
<action name="start" timeout="15" />
<action name="stop" timeout="15" />
<action name="monitor" depth="0" timeout="15" interval="60"/>
<action name="meta-data" timeout="5" />
<action name="validate-all" timeout="10" />
</actions>
</resource-agent>
END
}

symlink_monitor() {
    # Checks a single pair: $1 is the link, $2 is the target.
    #
    # This applies the following logic:
    #
    # * If $OCF_RESKEY_link does not exist, then the resource is
    #   definitely stopped.
    #
    # * If $OCF_RESKEY_link exists and is a symlink that points to
    #   ${OCF_RESKEY_target}, then the resource is definitely started.
    #
    # * If $OCF_RESKEY_link exists, but is anything other than a
    #   symlink to ${OCF_RESKEY_target}, then the status depends on whether
    #   ${OCF_RESKEY_backup_suffix} is set:
    #
    #   - if ${OCF_RESKEY_backup_suffix} is set, then the resource is
    #     simply not running. The existing file will be moved out of
    #     the way, to ${OCF_RESKEY_link}${OCF_RESKEY_backup_suffix},
    #     when the resource starts.
    #
    #   - if ${OCF_RESKEY_backup_suffix} is not set, then an existing
    #     file ${OCF_RESKEY_link} is an error condition, and the
    #     resource can't start here.
    #
    # Cron-specific exception: a real directory in place of the link
    # (such as the one Cron creates at startup) is removed outright.

    rc=$OCF_ERR_GENERIC

    # Using ls here instead of "test -e", as "test -e" returns false
    # if the file does exist, but is a symlink to a file that doesn't
    ocf_log info "Checking if $1 is symlinked to $2"
    if ! ls "$1" >/dev/null 2>&1; then
        ocf_log debug "$1 does not exist"
        rc=$OCF_NOT_RUNNING
    elif [ ! -L "$1" ]; then
        if [ -d "$1" ]; then
            ocf_run rm -rf "$1"
            rc=$OCF_NOT_RUNNING
        elif [ -z "$OCF_RESKEY_backup_suffix" ]; then
            ocf_log err "$1 exists but is not a symbolic link!"
            exit $OCF_ERR_INSTALLED
        else
            ocf_log debug "$1 exists but is not a symbolic link, will be moved to ${1}${OCF_RESKEY_backup_suffix} on start"
            rc=$OCF_NOT_RUNNING
        fi
    elif readlink -f "$1" | grep -Eq "^${2}$"; then
        ocf_log debug "$1 exists and is a symbolic link to ${2}."
        rc=$OCF_SUCCESS
    else
        if [ -z "$OCF_RESKEY_backup_suffix" ]; then
            ocf_log err "$1 does not point to ${2}!"
            exit $OCF_ERR_INSTALLED
        else
            ocf_log debug "$1 does not point to ${2}, will be moved to ${1}${OCF_RESKEY_backup_suffix} on start"
            rc=$OCF_NOT_RUNNING
        fi
    fi

    return $rc
}

symlink_monitor_links() {
    # Split the space-separated parameters into arrays (unquoted on purpose).
    links=($OCF_RESKEY_link)
    targets=($OCF_RESKEY_target)

    if [ "${#links[@]}" -eq "${#targets[@]}" ]; then
        i=0
        while [ $i -lt ${#links[*]} ]; do
            symlink_monitor "${links[$i]}" "${targets[$i]}"
            rc=$?
            # Fail on the first pair that is not healthy.
            if [ $rc -ne $OCF_SUCCESS ]; then
                return $rc
            fi
            i=$(( $i + 1 ))
        done
        return $rc
    fi
}

symlink_start() {
    links=($OCF_RESKEY_link)
    targets=($OCF_RESKEY_target)
    success=0

    if [ "${#links[@]}" -eq "${#targets[@]}" ]; then
        i=0
        while [ $i -lt ${#links[*]} ]; do
            if ! symlink_monitor "${links[$i]}" "${targets[$i]}"; then
                if [ -e "${links[$i]}" ]; then
                    if [ -z "$OCF_RESKEY_backup_suffix" ]; then
                        # Shouldn't happen, because symlink_monitor should
                        # have errored out. But there is a chance that
                        # something else put that file there after
                        # symlink_monitor ran.
                        ocf_log err "${links[$i]} exists and no backup_suffix is set, won't overwrite."
                        #exit $OCF_ERR_GENERIC
                        success=1
                    else
                        ocf_log debug "Found ${links[$i]}, moving to ${links[$i]}${OCF_RESKEY_backup_suffix}"
                        #ocf_run mv -v ${links[$i]} ${links[$i]}${OCF_RESKEY_backup_suffix} || exit $OCF_ERR_GENERIC
                        ocf_run mv -v "${links[$i]}" "${links[$i]}${OCF_RESKEY_backup_suffix}" || success=1
                    fi
                fi
                # Log the pair being handled, not just the first array element.
                ocf_log info "Linking ${links[$i]} to ${targets[$i]}"
                ocf_run ln -sv "${targets[$i]}" "${links[$i]}"
                symlink_monitor "${links[$i]}" "${targets[$i]}"
            fi
            i=$(( $i + 1 ))
        done
        # Restart Cron so it picks up the new links. A failed restart is
        # returned as-is; otherwise fall through so that errors recorded
        # in $success above are still reported.
        ocf_run $OCF_RESKEY_croninit || return $?
    fi

    if [ $success -eq 0 ]; then
        return $OCF_SUCCESS
    else
        return $OCF_ERR_GENERIC
    fi
}

symlink_stop() {
    links=($OCF_RESKEY_link)
    targets=($OCF_RESKEY_target)
    success=0

    if [ "${#links[@]}" -eq "${#targets[@]}" ]; then
        i=0
        while [ $i -lt ${#links[*]} ]; do
            if symlink_monitor "${links[$i]}" "${targets[$i]}"; then
                ocf_run rm -vf "${links[$i]}" || exit $OCF_ERR_GENERIC
                if ! symlink_monitor "${links[$i]}" "${targets[$i]}"; then
                    if [ -e "${links[$i]}${OCF_RESKEY_backup_suffix}" ]; then
                        ocf_log debug "Found backup ${links[$i]}${OCF_RESKEY_backup_suffix}, moving to ${links[$i]}"
                        # if restoring the backup fails then still return with
                        # $OCF_SUCCESS, but log a warning
                        ocf_run -warn mv "${links[$i]}${OCF_RESKEY_backup_suffix}" "${links[$i]}"
                    fi
                    ocf_run $OCF_RESKEY_croninit
                else
                    ocf_log err "Removing ${links[$i]} failed."
                    #return $OCF_ERR_GENERIC
                    success=1
                fi
            else
                ocf_run $OCF_RESKEY_croninit
            fi
            i=$(( $i + 1 ))
        done
    fi

    if [ $success -eq 0 ]; then
        return $OCF_SUCCESS
    else
        return $OCF_ERR_GENERIC
    fi
}

symlink_validate_all() {
    if [ "x${OCF_RESKEY_link}" = "x" ]; then
        ocf_log err "Mandatory parameter link is unset"
        exit $OCF_ERR_CONFIGURED
    fi
    if [ "x${OCF_RESKEY_target}" = "x" ]; then
        ocf_log err "Mandatory parameter target is unset"
        exit $OCF_ERR_CONFIGURED
    fi
    if [ "x${OCF_RESKEY_croninit}" = "x" ]; then
        ocf_log err "Mandatory parameter croninit is unset"
        exit $OCF_ERR_CONFIGURED
    fi

    links=($OCF_RESKEY_link)
    targets=($OCF_RESKEY_target)

    # Links and targets are paired by position, so a count mismatch is a
    # configuration error. Without this check start, stop and monitor
    # would silently do nothing and report success.
    if [ "${#links[@]}" -ne "${#targets[@]}" ]; then
        ocf_log err "Parameters link and target must list the same number of paths"
        exit $OCF_ERR_CONFIGURED
    fi

    # Having a nonexistent target is technically not an error, as
    # symlinks are allowed to point to nonexistent paths. But it
    # still doesn't hurt to warn people if the target does not exist.
    i=0
    while [ $i -lt ${#links[*]} ]; do
        if [ ! -e "${targets[$i]}" ]; then
            ocf_log warn "${targets[$i]} does not exist!"
        fi
        i=$(( $i + 1 ))
    done

    return $OCF_SUCCESS
}

symlink_usage() {
    cat <<EOF
usage: $0 {start|stop|monitor|validate-all|meta-data}

Expects to have a fully populated OCF RA-compliant environment set.
EOF
}

if [ $# -ne 1 ]; then
    symlink_usage
    exit $OCF_ERR_ARGS
fi

case $__OCF_ACTION in
meta-data)
    meta_data
    exit $OCF_SUCCESS
    ;;
usage)
    symlink_usage
    exit $OCF_SUCCESS
    ;;
esac

# Everything except usage and meta-data must pass the validate test
symlink_validate_all || exit

case $__OCF_ACTION in
start)
    echo "Starting ..."
    symlink_start
    ;;
stop)
    symlink_stop
    ;;
status|monitor)
    symlink_monitor_links
    ;;
validate-all)
    ;;
*)
    symlink_usage
    exit $OCF_ERR_UNIMPLEMENTED
    ;;
esac
# exit code is the exit code (return code) of the last command (shell function)