Cleaning Up Statistics At Year End
The WebStatistics topics' rcs files get out of hand, with well over 4000 revisions by the end of the year.
Assumptions
- The revision history doesn't really serve much purpose, and slows everything down.
- We will archive each year into a per-year statistics file, for example WebStatistics2012.
- The WebStatistics topic header has been modified to search for prior year statistics topics in that web.
Manual cleanup
Here are the steps I've gone through to fix this up. The first time around it was done at midnight on Dec. 31st, but that's not all that practical. For the 2012 -> 2013 transition, the following steps were run, redirecting the output to a file that was then run as a shell script.
- Backup all the existing statistics files
  find /home/foswiki.org/public_html/data -iname 'WebStatistics.*' -exec tar -rf webstats3.tar {} +
- Copy all stats files into their 2012 versions
  for i in `find /home/foswiki.org/public_html/data/ -name 'WebStatistics\.*'` ; do echo sudo -u www cp $i `echo $i | sed 's/\.txt/2012.txt/'` ; done
- Delete the 2013 entry from the 2012 files
  for i in `find /home/foswiki.org/public_html/data/ -name 'WebStatistics2012.txt'`; do echo sudo -u www sed -i_bak -e "/Jan\ 2013/d" $i ; done
- Delete any 2012 statistics from the 2013 files
  for i in `find /home/foswiki.org/public_html/data/ -name 'WebStatistics.txt'`; do echo sudo -u www sed -i_bak -e "/[a-z]\ 2012/d" $i; done
- Move any rcs files out of the way (they are renamed with a .bak suffix rather than deleted).
  find /home/foswiki.org/public_html/data -iname WebStatistics\.txt,v -exec echo sudo -u www mv {} {}.bak \;
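The generate-then-run approach used above can be sketched as a small function that only prints the commands for the copy step and the two sed steps. This is an illustration, not the script that was actually used. The function name `gen_split_cmds` and its arguments are hypothetical, it omits `sudo -u www`, it matches only `WebStatistics.txt` (the original pattern also picks up the `,v` files), and it assumes paths contain no quotes, dollar signs, or backslashes.

```shell
#!/bin/sh
# Sketch: print (not run) the commands that split each WebStatistics.txt
# into an archived per-year topic and a trimmed current topic.
# Arguments (all illustrative): data directory, closing year, new year.
gen_split_cmds() {
    data_dir=$1
    old=$2
    new=$3
    # Match only the topic text files, not the ",v" rcs files.
    find "$data_dir" -name 'WebStatistics.txt' | while IFS= read -r f; do
        archive="${f%.txt}${old}.txt"
        # Copy the live topic to its per-year archive name.
        printf 'cp "%s" "%s"\n' "$f" "$archive"
        # Drop the new year's January row from the archive copy.
        printf 'sed -i_bak -e "/Jan %s/d" "%s"\n' "$new" "$archive"
        # Drop the closing year's rows from the live topic.
        printf 'sed -i_bak -e "/[a-z] %s/d" "%s"\n' "$old" "$f"
    done
}
```

Redirect the output to a file, review it, then run it with sh, for example `gen_split_cmds /home/foswiki.org/public_html/data 2012 2013 > split.sh` followed by `sh split.sh` (the file name is arbitrary). Rehearsing on a scratch copy of the data directory first is a cheap way to confirm the sed patterns match the rows you expect.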
After all steps are completed, the backup files need to be removed.
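A minimal sketch for finding those backup files before deleting anything is shown here. It assumes the backups in question are the `_bak` copies written by `sed -i_bak` and the `,v.bak` files produced by the rename step. The function name `list_stats_backups` is hypothetical, and the `webstats3.tar` archive is deliberately left alone.

```shell
#!/bin/sh
# Sketch: list the backup files left behind by the year-end steps.
# Assumes they are named WebStatistics*.txt_bak (from `sed -i_bak`)
# or WebStatistics.txt,v.bak (from renaming the rcs files).
list_stats_backups() {
    find "$1" -type f \( -name 'WebStatistics*.txt_bak' \
        -o -name 'WebStatistics.txt,v.bak' \)
}

# Usage: review the list first, then print the rm commands for a dry run.
#   list_stats_backups /home/foswiki.org/public_html/data
#   list_stats_backups /home/foswiki.org/public_html/data |
#       while IFS= read -r f; do echo sudo -u www rm "$f"; done
```

As with the earlier steps, the `echo` keeps the removal a dry run until the output has been checked.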