Item14202: PageCache tweaks to control dependency growth.
Priority: Urgent
Current State: Closed
Released In: 2.1.3
Target Release: patch
The dependencies table on foswiki.org experiences explosive growth. We really need to control growth of the dependencies table especially for bots.
One simple discovery (repeated often by others as a huge performance hit): the WebLeftBarExample topic includes WebLeftBarWebsList, which causes every system page to become dependent upon every WebHome and WebPreferences for every public web. Simple example: the dependencies for UserRegistration went from 77 to 31 topics after creating WebLeftBar, eliminating the weblist.
The 2nd issue is that every link to any internal topic results in a dependency, even though many topics will probably never be deleted (or added if missing). This also results in an exponential growth in the deps table. I think it would be good to be able to trade "accuracy" of the cache for topic additions/removals/renames with controlling growth of the dependencies table, making the cache more useful. It may be that it's more useful to track link "presence" only for logged in users.
Possible implementation to be discussed:
- Add new routine Foswiki::PageCache::addTopicRef, a wrapper for Foswiki::PageCache::addDependency()
   - Call from Foswiki::Render::_renderWikiWord()
   - Call from Foswiki::Search::formatResults()
- Add new configure key: {Cache}{TrackInternalLinks}
   - on: Behave as currently implemented. This is the default.
   - users: Track only for logged in users, not for WikiGuest
   - off: Don't track links.
- Add a new configure key: {Cache}{IgnoreQueryParams}
   - Comma separated list of parameters that should be ignored by genVariationKey
   - Default: cache_expire,cache_ignore,_.*,refresh,foswiki_redirect_cache,logout,topic (currently hardcoded)
   - Add redirectedfrom and validationkey
The downside is that deleting, renaming, or creating topics can leave stale cache entries with incorrectly rendered links. However, content rendered from INCLUDE macros, etc. will still be correctly cached and recorded as dependencies.
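To make the proposed {Cache}{TrackInternalLinks} behaviour concrete, here is a rough Python sketch (Foswiki itself is Perl, and this is not the actual patch). The class PageCacheSketch, the helper should_track_link, and the is_guest flag are hypothetical names invented for illustration; only the three modes (on, users, off) and the idea of a wrapper around addDependency come from the proposal above.

```python
# Illustrative sketch only: hypothetical names, not Foswiki's real API.

def should_track_link(mode: str, is_guest: bool) -> bool:
    """Decide whether an internal link is recorded as a cache dependency."""
    if mode == "on":      # current behaviour: always track
        return True
    if mode == "users":   # track only for logged-in users, not the guest user
        return not is_guest
    if mode == "off":     # never track plain links
        return False
    raise ValueError(f"unknown TrackInternalLinks mode: {mode!r}")


class PageCacheSketch:
    def __init__(self, track_internal_links: str = "on"):
        self.mode = track_internal_links
        self.deps = set()  # (web, topic) pairs the cached page depends on

    def add_dependency(self, web: str, topic: str) -> None:
        """Unconditional dependency, e.g. for included content."""
        self.deps.add((web, topic))

    def add_dependency_for_link(self, web: str, topic: str, is_guest: bool) -> None:
        """Wrapper used by link rendering; honours the configured mode."""
        if should_track_link(self.mode, is_guest):
            self.add_dependency(web, topic)


cache = PageCacheSketch(track_internal_links="users")
cache.add_dependency_for_link("System", "UserRegistration", is_guest=True)   # skipped
cache.add_dependency_for_link("System", "UserRegistration", is_guest=False)  # recorded
cache.add_dependency("System", "WebLeftBar")  # included content is always recorded
```

In this sketch only link-derived dependencies are filtered; dependencies added directly (as for INCLUDE) bypass the mode check, which matches the trade-off described above.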
--
GeorgeClark - 22 Oct 2016
Another observation. The RSS feed generates all links with a t=2016-10-22T03:02:18Z timestamp query param. It doesn't impact rendering; it just keeps changed topics at the top of the RSS feed. But we treat it as a new page variation. Maybe "t=" should be an ignored query param.
No, this is Foswiki.org specific. I've changed the Tasks, Development and Support WebRss topics to use _t, to keep it out of the cache.
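As an illustration of why renaming t to _t keeps it out of the cache, here is a small Python sketch of an ignore list applied while building a variation key. It assumes each comma-separated entry is treated as a regular expression that must match the whole parameter name (suggested by the _.* entry in the default list); the function variation_key and the key format are hypothetical and do not reproduce genVariationKey.

```python
import re

# Default list quoted from the proposal above; treating entries as
# whole-name regular expressions is an assumption of this sketch.
DEFAULT_IGNORE = "cache_expire,cache_ignore,_.*,refresh,foswiki_redirect_cache,logout,topic"


def compile_ignore_list(spec: str):
    """Turn a comma-separated spec into compiled patterns."""
    return [re.compile(p.strip()) for p in spec.split(",") if p.strip()]


def variation_key(params: dict, ignore_spec: str = DEFAULT_IGNORE) -> str:
    """Build a key from the query params that are not ignored (hypothetical format)."""
    patterns = compile_ignore_list(ignore_spec)
    kept = sorted(
        (name, value)
        for name, value in params.items()
        if not any(p.fullmatch(name) for p in patterns)
    )
    return "::".join(f"{name}={value}" for name, value in kept)


# "t" is not ignored, so every timestamp yields a distinct page variation.
key_t = variation_key({"t": "2016-10-22T03:02:18Z"})
# "_t" matches the "_.*" entry, so the page maps to the base variation.
key_underscore_t = variation_key({"_t": "2016-10-22T03:02:18Z"})
# Extending the list, as proposed, also drops redirectedfrom and validationkey.
extended = DEFAULT_IGNORE + ",redirectedfrom,validationkey"
key_redirect = variation_key({"redirectedfrom": "Download"}, extended)
```

Using fullmatch rather than search keeps an entry such as topic from also swallowing unrelated parameter names that merely contain that word.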
--
GeorgeClark - 22 Oct 2016
Another useless variation: "redirectedfrom", seen by anyone hitting Download, which redirects to the current release. (Extended above implementation to make this configurable.) Another option would be to change the RedirectPlugin to use _redirectedfrom, and change the RSS feeds to use _t for the timestamp.
--
GeorgeClark - 22 Oct 2016
validationkey also gets picked up as a variation after login redirect.
--
GeorgeClark - 23 Oct 2016
I think there might be a different bug with the validation_key being preserved as a query param. There is code to delete it from the passthru parameters, but the login redirect seems to preserve it anyway.
--
GeorgeClark - 23 Oct 2016
Changed name of function from addTopicRef to addDependencyForLink after discussion in release meeting. Original suggestion was to use addDependencyForWikiWord, but "Link" is more accurate as it applies to any internal links, not just WikiWord type links.
--
GeorgeClark - 31 Oct 2016