Problem
This tip is probably useful to only a few Foswiki administrators. It was developed in an environment where we wanted to track Foswiki's code, data, pub, htpasswd & LocalSite.cfg in git, to more easily duplicate the production environment onto the staging & development servers.
It was also a moderately large wiki, with ~210,000 topics and almost 100GB of attachments. This will mostly appeal to those who want to manage their Foswiki installation with git. Reasons for doing this might include:
- You have a testing server which mirrors your production Foswiki instance. You would rather use git to keep the test/dev servers synchronised with prod (including testing bulk topic editing/deployment before pushing to prod).
- You are already using git elsewhere, and would like to incorporate git for disaster recovery, development, tracking config & data changes.
Context
The script below makes some assumptions:
- Directories have the setgid bit set, and are only owner/group read+writable, i.e. 2770 permission
- Files are owner/group read+writable, i.e. 0660
- Directories & files are owned by www-data:fwadmins, that is, owned by the webserver user, with a group dedicated to users who are expected to be able to manipulate Foswiki files directly
- Foswiki is configured to use data & pub directories located outside of the Foswiki installation
- Foswiki is configured to use a .htpasswd file located outside of the Foswiki installation, in a dedicated git repository
- Foswiki's configuration file, LocalSite.cfg, is located outside of the Foswiki installation, in a dedicated git repository (a symbolic link into the Foswiki installation's lib/ directory is used)
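The permission and ownership assumptions above can be checked before running anything else. The following is only a sketch: the function names `check_perms` and `check_owner` are made up for illustration, and the `.git` directories are skipped on the assumption that git manages its own internals.

```shell
# Hypothetical helpers (not part of Foswiki): list paths that break the
# assumptions above. An empty result means the tree matches.

check_perms() {
    # $1: directory tree to inspect, e.g. /path/to/foswiki/storage
    # Directories whose mode is not exactly 2770 (setgid, rwx for owner/group)
    find "$1" -name .git -prune -o -type d ! -perm 2770 -print
    # Regular files whose mode is not exactly 0660
    find "$1" -name .git -prune -o -type f ! -perm 0660 -print
}

check_owner() {
    # $1: directory tree, $2: expected owner, $3: expected group
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}
```

For example, `check_perms /path/to/foswiki/storage` followed by `check_owner /path/to/foswiki/storage www-data fwadmins` prints every path that needs attention.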
Solution
Notes:
- Due to the number of topics managed by our installation, we split data/ and pub/ up so that each root web has its own git repository; data/ and pub/ are also set up as parent repositories, which don't track any files themselves but are merely used to track the child repos for convenience (this allows us to do git submodule foreach <do stuff>...).
- The script below automatically detects any newly created root webs, initialises git there and adds each one as a new submodule of the parent repository.
- pub/ git repositories are configured not to do any compression or deltas. On our virtual machines, compression added an unmanageable amount of CPU overhead once repos exceeded the size of server RAM, and with most attachments being already-compressed .png, .jpg, .pdf files etc., it was probably not saving much disk space anyway.
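To make the parent/child layout concrete, here is a toy reproduction that can be run in a scratch directory. It is a sketch only: `demo_git`, `build_demo_parent`, the throwaway identity and the web names are all invented for the example, and each child repo gets one commit before being added because git needs a commit to point the submodule at.

```shell
# Hypothetical helper: git with a throwaway identity, so the demo commits
# also work on a machine without a configured user.name/user.email.
demo_git() {
    git -c user.name=demo -c user.email=demo@example.invalid "$@"
}

# build_demo_parent PARENT WEB...
# Creates PARENT as a git repo that tracks one child repo per WEB.
build_demo_parent() {
    demo_parent=$1
    shift
    git init -q "$demo_parent" || return 1
    for demo_web in "$@"; do
        # Each root web is its own repository with at least one commit
        git init -q "$demo_parent/$demo_web" || return 1
        echo "demo topic" > "$demo_parent/$demo_web/WebHome.txt"
        demo_git -C "$demo_parent/$demo_web" add WebHome.txt || return 1
        demo_git -C "$demo_parent/$demo_web" commit -q -m "initial import" || return 1
        # Register the existing child repo with the parent
        demo_git -C "$demo_parent" submodule --quiet add "./$demo_web" || return 1
    done
    # The parent tracks no topic files, only .gitmodules and the child pointers
    demo_git -C "$demo_parent" commit -q -m "track root webs"
}
```

After `build_demo_parent /tmp/demo/data Main Sandbox`, a command such as `git -C /tmp/demo/data submodule foreach 'git status --short'` visits every root web in turn.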
#!/bin/bash
# Fix perms
nice chown -R www-data:fwadmins /path/to/foswiki/storage
nice chmod -R ug+rwX /path/to/foswiki/storage
nice find /path/to/foswiki/storage -type d -exec chmod g+s {} \;
# Essentials
sudo -u www-data nice perl -I /path/to/foswiki/core/bin /path/to/foswiki/core/tools/mailnotify -user AdminUser
sudo -u www-data nice perl -I /path/to/foswiki/core/bin /path/to/foswiki/core/tools/tick_foswiki.pl
# Deal with any newly created root webs in data/ (set them up as git submodules)
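sudo -u www-data nice bash -c 'find /path/to/foswiki/storage/data -maxdepth 1 -mindepth 1 -type d -not -name .git |
while IFS= read -r dir; do
    # -e rather than -d: in a cloned submodule, .git is a file, not a directory
    if [ ! -e "$dir/.git" ]; then
        #echo "Doing $dir as $USER"
        cd "$dir"
        git init
        git config core.filemode false
        # Newer git versions refuse to add a submodule with no commit checked out
        git commit -q --allow-empty -m "cron-update: initial commit"
        cd ..
        git submodule add -q "./$(basename "$dir")"
    fi
done'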
# Deal with any newly created root webs in pub/ (set them up as git submodules)
sudo -u www-data nice bash -c 'find /path/to/foswiki/storage/pub -maxdepth 1 -mindepth 1 -type d -not -name .git -not -name images |
while IFS= read -r dir; do
    # -e rather than -d: in a cloned submodule, .git is a file, not a directory
    if [ ! -e "$dir/.git" ]; then
        #echo "Doing $dir as $USER"
        cd "$dir"
        git init
        git config core.filemode false
        # These settings assume a pub/RootWeb directory containing mostly binary
        # files. Without these settings, cloning, pulling, gc & repacking repos
        # approaching the size of the free RAM on the server is fantastically slow.
        #
        # These settings prevent delta compression, so if you have many unique
        # revs, this can take up a lot of disk. If you tweak these settings, run
        # "git gc" and/or "git repack" on the affected repo(s)
        git config pack.depth 1
        # Compressing already compressed attachments (.pdf, .png, .jpg, etc)
        # can bog down the CPU unnecessarily; especially if the repo
        # does not fit in available system RAM, it really thrashes the disk.
        # TODO: demonstrate a nice way to schedule compression/re-packing
        # (with deltas) in a separate cron job, e.g. over the weekend.
        git config core.compression 0
        git config core.loosecompression 0
        git config pack.compression 0
        # Ignore DirectedGraphPlugin & ImagePlugin temporary files
        printf "%s\n" "igp_*" "DirectedGraphPlugin_*" >> .gitignore
        # Newer git versions refuse to add a submodule with no commit checked out
        git commit -q --allow-empty -m "cron-update: initial commit"
        cd ..
        git submodule add -q "./$(basename "$dir")"
    fi
done'
# Add *.txt and *.txt,v in separate commits. Don't want lease/.changes or other
# cruft in here, otherwise it's difficult for Foswikis running a clone of data/
# to stay merged without conflicts in .changes/.lease, etc.
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/data && git submodule foreach "find . -name .git -prune -o -type d -exec bash -c \"cd {} && git add *.txt && git commit -q -m cron-update:txt || : && git add *.txt,v && git commit -q -m cron-update:txt,v || : \" \; && git commit -q -a -m cron-update:other || :"'
# Update the supermodule to point at the latest commits in the submodules.
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/data && git commit -q -a -m "cron-update"'
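# Commit everything in pub/. The "|| :" has to be part of the foreach command:
# a non-zero exit in any submodule (e.g. nothing to commit) stops the foreach,
# which would leave the remaining submodules uncommitted.
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/pub && git submodule foreach "git add . && git commit -q -a -m cron-update || :"'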
# Update the supermodule to point at the latest commits in the submodules.
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/pub && git commit -q -a -m "cron-update"'
# commit htpasswd file
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/htpasswd && git commit -q -a -m "cron-update"'
# Commit LocalSite config
sudo -u www-data nice bash -c 'cd /path/to/foswiki/storage/LocalSite && git commit -q -a -m "cron-update"'
# Update supermodule in case we've upgraded an extension/core
sudo -u www-data nice bash -c 'cd /path/to/foswiki && git commit -q -a -m "cron-update"'
We can now clone all of this by doing something like:
git clone server.org:/path/to/foswiki/storage/data
cd data
git submodule update --init
cd ..
git clone server.org:/path/to/foswiki/storage/pub
cd pub
git submodule update --init
cd ..
git clone server.org:/path/to/foswiki/storage/LocalSite
git clone server.org:/path/to/foswiki/storage/htpasswd
We could actually create a single super-parent repo to hold all these; then we'd only have one clone command, and we'd just need to do git submodule update --init --recursive. Anyway, we use something like this to keep clones up-to-date:
#!/bin/sh
bash -c 'cd /path/to/foswiki/storage/data && git pull origin && git submodule update --init'
bash -c 'cd /path/to/foswiki/storage/pub && git pull origin && git submodule update --init'
bash -c 'cd /path/to/foswiki/storage/htpasswd && git pull origin'
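The single super-parent repo mentioned earlier is not something we have set up, but it could be sketched roughly as follows. The function name `make_super_parent` is hypothetical, and the sketch assumes that every listed repo already exists under the storage directory with at least one commit, and that a git identity is configured for the user running it.

```shell
# Hypothetical helper: turn the storage directory itself into a repo that
# tracks the existing repos beneath it as submodules.
# make_super_parent STORAGE_DIR REPO...
make_super_parent() {
    super_dir=$1
    shift
    git init -q "$super_dir" || return 1
    for super_repo in "$@"; do
        # Relative URLs resolve against wherever the super-parent is cloned
        # from, so a clone should find each child next to its parent
        git -C "$super_dir" submodule --quiet add "./$super_repo" || return 1
    done
    git -C "$super_dir" commit -q -m "track storage repos"
}
```

On the server this might be called as `make_super_parent /path/to/foswiki/storage data pub LocalSite htpasswd`. A clone of the storage repo followed by git submodule update --init --recursive should then fetch the nested per-web repos as well.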
Known Uses
Known Limitations
See Also