Item12477: Spurious entries being added to .changes
Priority: Urgent
Current State: Closed
Released In: 2.0.0
Target Release: major
mailnotify occasionally jumps way back in history, aggregating years of revisions into a notification.
For ex:
This is an automated e-mail from Foswiki.
New or changed topics in Foswiki.Extensions, since 21 Mar 2013 - 16:52:
Topics in Extensions web:
|
Changed: (now 16:15)
|
Changed by:
|
NatSkin
|
28 Mar 2013 - 15:48 - r3->r43
|
MichaelDaum
|
The Natural Skin
Description NatSkin improves your Foswiki experience by bringing together some of the most useful
|
|
However in the topic history,
43 28 Mar 2013 - 16:12 MichaelDaum
...
38 08 Jan 2011 - 00:57 PaulHarvey
...
3 04 Nov 2008 - 16:46 MichaelDaum
See also
Support.Question1281
--
GeorgeClark - 22 Apr 2013
Crawford, do you have any idea on why this might happen?
--
GeorgeClark - 20 Dec 2013
I did a low level trace and evetually determined that something is adding records to .changes that have spurious revision numbers. This is throwing off
MailerContrib. The problem is not with the contrib, but the core, so I changed the assignation of this task and the headline from "MailerContrib occasionally jumps way back in history for notifications". Here's an example, taken from the Community web:
AssociationArticles MichaelDaum 1395836777 1
AssociationArticles MichaelDaum 1395836777 49
AssociationArticles MichaelDaum 1395836973 1
AssociationArticles MichaelDaum 1395836974 49 minor, reprev
AssociationArticles MichaelDaum 1395837011 49 minor, reprev
AssociationArticles MichaelDaum 1395837035 49 minor, reprev
AssociationArticles OliverKrueger 1395837098 50
AssociationArticles OliverKrueger 1395837186 50 minor, reprev
AssociationArticles OliverKrueger 1395837558 50 minor, reprev
AssociationArticles OliverKrueger 1395837579 50 minor, reprev
AssociationArticles OliverKrueger 1395837596 50 minor, reprev
AssociationArticles MichaelDaum 1395837620 1
AssociationArticles MichaelDaum 1395837620 51
AssociationArticles OliverKrueger 1395837735 1
AssociationArticles OliverKrueger 1395837736 52
AssociationArticles OliverKrueger 1395837756 0
AssociationArticles OliverKrueger 1395837757 52 minor, reprev
Item8460 added recoredChange calls to many operations on the store. However these operations are recorded either with a rev of 0 (which
MailerContrib can always skip) but also with the rev of an attachment. It's impossible to tell from the change record which applies.
All the recordChanges calls must be reviewed, and sufficient (and correct) information passed to disambiguate them.
--
CrawfordCurrie - 27 Mar 2014
Marked the for Release 1.2, to get it into the release blocker list
--
GeorgeClark - 28 May 2014
What about recording attachments as something like
AssociationArticles{TheAttachmentName}
to record attachment names. And add a "more" field of "attachment" for the attachment changes.
It would then look like:
AssociationArticles OliverKrueger 1395837596 50 minor, reprev
AssociationArticles{FoswikiArticles.tex} MichaelDaum 1395837620 1 attachment
AssociationArticles MichaelDaum 1395837620 51
AssociationArticles{AnotherAttachment.txt} OliverKrueger 1395837735 1 attachment
AssociationArticles OliverKrueger 1395837736 52
--
GeorgeClark - 05 Jun 2014
Something like that; I had thought of using a / separator. But that really is the tip of the iceberg; ever recordChange call has to be examined on it's own merits, and the version information selected appropriately.
--
CrawfordCurrie - 08 Jun 2014
Yeah I was thinking about delimiters, Since attachments might allow spaces, something that clearly delimits the filename is needed. And / is for subwebs. Even though .changes is specific for a web, it might be a bit confusing when someone reviews this later.
Ug.
NameFilter allows { and } so we probably need to encode filename. File is tab delimited, so I guess the / is simplest even potentially confusing. We have to be a bit careful though. IIRC Sven did some work dealing with migration of ancient TWiki sites where "/" is legal in topic names. Migration assisted by disabling subweb support and removing / from
NameFilter. I don't recall the exact solution, but there might be a trap waiting there.
--
GeorgeClark - 08 Jun 2014
Details on calls to
recordChange()
(RCS =
Store/VC/Store.pm
Plain =
Store/Plainfile.pm
)
Event |
Plain |
RCS |
cuid |
revision |
verb |
Details recorded |
Auto-attach a file |
|
x |
x |
revisionn=0 |
autoattach |
newmeta, newattachment (Note the typo in the revision key passed) |
Move attachment |
x |
x |
x |
0 |
update |
oldmeta, oldattachment, newmeta, newattachment |
Copy attachment |
x |
x |
x |
0 |
insert |
newmeta, newattachment |
Move topic |
x |
x |
x |
y |
update |
handler(oldTopic),oldmeta, newmeta |
|
x |
x |
x |
y |
update |
handler(newtopic) oldmeta, newmeta |
Move web |
x |
x |
x |
0 |
update |
handler(newWeb) oldmeta, newmeta, more="moved from oldweb" |
Save attachment |
x |
x |
x |
y |
(update -or- insert) |
newmeta, newattachment |
more=minor - plainfile only |
Save topic |
x |
x |
x |
y |
(update -or- insert) |
newmeta, extra=minor? |
repRev topic |
x |
x |
x |
y |
update |
newmeta, more="minor, reprev" |
(Plainfile includes oldmeta, more=minor,reprev,options or save ) |
delRev topic |
x |
x |
x |
y |
update |
newmeta |
? delRev action isn't noted in the record. |
remove topic or attachment |
x |
x |
x |
0 |
remove |
oldmeta, oldattachment, more="Deleted Web.(Topic)?:(Attachment)?" |
Plainfile doesn't call for web. RCS skips web remove in the recordChange() routine. |
Calls generally appear consistent between RCS and Plainfile, with a couple of minor differences noted,. Note found a bug in RCS store, the Auto-attach function mis-spelled revision, so it's undef instead of 0.
However the call to recordChange doesn't use all the fields.
-
RCS
- The topic is picked up from $this->{topic} and is not taken from either new or old meta.
-
PlainFile
- The topic is picked up from the _meta parameter, not taken from either new or oldmeta.
The only field recorded are: (Topic | . ), cuid, time, revision, more. Tab delimited.
Possible changes needed / inconsistencies:
- Add Attachment name as a separate tabbed field.
- Move Topic records 2 changes, Move Attachment, only records a single change.
- delRev should indicate that the action happened.
- Need to address PlainFile vs. RCS more= difference
Modify file format, tab delimited. Add the verb, attachment name, and minor flag / reprev/delrev commands as separate fields. Use a consistent format for any move operation, topic or attachment. Record two changes, a remove on the old topic/attachment and an insert on the new name.
Topic |
Attachment |
verb |
cUID |
time |
revision |
minor |
cmd |
more |
Some topic |
|
update |
someuser |
1395837736 |
24 |
|
|
|
Some topic |
Somefile |
insert |
someuser |
1395837737 |
0 |
|
|
|
Some topic |
|
update |
admin |
1395837738 |
23 |
|
delRev |
|
Some topic |
Somefile |
remove |
someuser |
1395837738 |
3 |
|
|
|
. |
|
update |
admin |
1395837739 |
0 |
|
|
Moved from OldWeb |
--
GeorgeClark - 12 Jun 2014
Other considerations:
- File locking. .changes is not locked.
- File is read parsed and rewritten for every transaction. Consider move to
tick_foswiki
script
--
GeorgeClark - 12 Jun 2014
Running some quick tests, the whole recordChange is somewhat bogus. Both a combination of the recordChange() routine not recording information, and Store failing to pass complete information to recordChange.
Just from some quick testing, Delete attachment, doesn't pass the attachment name or any indication that it was a delete. Deleting an attachment should probably record 4 changes: remove attachment, update topic, insert attachment (trash), update topic.
What we probably need is some more thorough recordChange tests in
StoreImplementation tests that makes every possible change, and verifies that useful information is recorded.
for the eachChange processing there are a couple of alternatives:
- Add an API extension, request "version 2" to get all topic & attachment changes. Otherwise only return topic changes -or-
- update the users of eachChange to filter out undesired changes like attachments.
--
GeorgeClark - 13 Jun 2014
In pondering this, we have an issue for any site sharing the store between two releases, (for example Release 1.1 and Trunk on foswiki.org.) If we fix this by changing the .changes file format in 1.2, we'll break processing of the shared .changes files.
Current format (tab delimited)
topic |
cUID |
timestamp |
rev |
minor, reprev |
so a compatible format might be:
Topic |
cUID |
timestamp |
rev |
(more) minor, reprev, delrev |
verb |
Attachment |
--
GeorgeClark - 18 Jul 2014
I've started checking this into
Item12477 branch. My plans are:
- tie "rev" to only ever be a topic rev, never an attachment.
- Add Attachment / attchment rev
- 3 "verbs" Update, Insert, Remove Rename represented by two changes remove & insert
- add "Other web, topic, attachment & rev representing the from or to topic info for move, rename
- consistently create 2 records for any move / rename, within or between webs. Currently some operations are represented by a single change record. So a rename would result in:
- remove Topic rev Attachment rev (other points to new location)
- insert Topci rev Attachment rev (other points to from location)
- a create or remove without the "other" present would represent a new toipic/attachment or a delete.
--
GeorgeClark - 05 Dec 2014
OK, bottom line is that these recordChange calls are in the wrong place. It should not be up to the store implementation whether to record a change or not; it should just manage the recording.
The recordChange calls need to move to Meta, so they are done consistently for all stores. Not sure yet how to handle compatibility, but like as not I'll simply disable in-store calls to recordChange for 1.2
--
CrawfordCurrie - 17 Dec 2014
Done. Note that the format of the .changes file has changed - it is now encoded using JSON (selected JSON rather than XML to avoid adding dependencies). (
RCSStoreContrib and
PlainFileContrib). Old format files should still be read by the code. Third-party processors of the .changes may have problems, though (it's no longer emenable to grep, for example). George, you may want to review this, make sure it's compatible with your anti-spam efforts.
--
CrawfordCurrie - 19 Dec 2014
I don't think it will impact the
AntiSpam changes. Anything web based uses our official APIs. And the plugin does't proces the .changes files. However I frequently grep the .changes manually looking for activity from shell when cleaning up spammers. This is going to be rather inconvenient I suspect. My current spam cleanup runs:
find /path/to/foswiki.org/data -name .changes -exec grep -H $1 {} \;
I'm not sure how I'll be able to quickly find all changes made by a user across all webs. Suggestions welcomed.
--
GeorgeClark - 20 Dec 2014
cat .changes | perl -e 'use JSON; print join("\n", map {JSON::to_json($_) } @{JSON::from_json(<>)})' | grep
might work
--
CrawfordCurrie - 20 Dec 2014