Item13135: .gitignore management is unclear
Priority: Normal
Current State: Being Worked On
Released In: n/a
Target Release: n/a
Applies To: Extension
Component: SCM
Branches: master
Item13135 Release01x01
When setting up my dev environment I found many untracked files lying around and I found this confusing. There was also the issue in my mind about what .gitignore changes should or should not be committed back to the repo.
After a while I recognized that most of the files to .gitignore were symlinks, i.e. those generated by pseudo-install to create a working Foswiki etc. These symlinks are always a build/developer output and never need tracking. The original non-symlinked file is already tracked.
I have fixed this SCM bug in my set-up by deleting all the existing .gitignore files and then creating a single maintainable .gitignore at the distro level. (It's maintainable because it will very rarely change; indeed any changes here will be an integral part of our version control.)
The symlinks can be ignored by simply running 'find * -type l > .git/info/exclude' from the distro directory. This can be re-run at any time without concern, picking up all the symlinks in service at that time. Yes, this is volatile, but only locally: it's easily regenerated on demand and never committed back to the repo!
This is my proposed distro/.gitignore, which includes docs about its use.
# This is kept in the distro part of the developer structure. That way we can potentially ignore
# files/directories in any Extension that is pseudo-installed and not just core.
#
# This .gitignore file is NOT ignored, it is meant to be maintained as part of foswiki/distro
# and contain the standard .gitignore for FW development
#
# The standard .git/info/exclude file also lists excludes (effectively another .gitignore) and
# is NOT tracked ever. This is used for FW development to keep a generated list of ALL symlinks
# as they are in fact ALWAYS build outputs (via pseudo-install.pl and build.pl).
#
# See: https://help.github.com/articles/ignoring-files/
#
# This also leaves a developer free to create a core/.gitignore for his own needs without
# being concerned about impact on distro. This .gitignore should ignore itself so it's never
# maintained in git.
#
# A .gitignore could also be added to any extension but in general it's recommended that
# this is avoided (especially for default extensions). Of course a developer of
# a particular extension will know best about the .gitignore needs of their extension.
#
# By design we end up with only one .gitignore to maintain in distro with
# .git/info/exclude used to hold a generated list of all symlinks which we ignore.
# This list can be generated any time with 'find * -type l > .git/info/exclude'.
# Eventually, pseudo-install can be modified to maintain this list including handling
# platforms that do not have symlinks and have to copy the file.
# -----------------------------------------------------------------------------
# ----- Standard Foswiki gitignores -------------------------------------------
# Any file with gz extension is a build output which we do not want in git.
# What we do want in git are the source files the gz file is derived from.
*.gz
# Ignore any .changes in Webs and Sub-Webs (hence /**/)
core/data/**/.changes
# Similarly none of the following should be in git
core/data/.htpasswd
core/lib/LocalSite.cfg
core/test/unit/testlogs/
core/working/
# Suggestion to use this directory for all junk output (diffs, debugs, logs etc) that
# may be generated by a developer. Stops core being full of cruft and is implicitly
# ignored
core/z*
This change also means deleting the following:
deleted: CommentPlugin/.gitignore
deleted: ConfigurePlugin/.gitignore
deleted: EditRowPlugin/.gitignore
deleted: JEditableContrib/.gitignore
deleted: JQueryPlugin/.gitignore
deleted: JSCalendarContrib/.gitignore
deleted: JsonRpcContrib/.gitignore
deleted: NatEditPlugin/.gitignore
deleted: PatternSkin/.gitignore
deleted: SlideShowPlugin/.gitignore
deleted: SmiliesPlugin/.gitignore
deleted: SubscribePlugin/.gitignore
deleted: TinyMCEPlugin/.gitignore
deleted: TwistyPlugin/.gitignore
deleted: UpdatesPlugin/.gitignore
deleted: core/test/unit/.gitignore
--
JulianLevens - 05 Dec 2014
No please don't do that. Let's delegate gitignore to plugins as they are more manageable in a narrowed scope. Also, I'd like to tinker with distro/.gitignore myself and not let the distro repo cover it.
--
MichaelDaum - 05 Dec 2014
We have a lot of options for ignoring files:
I agree with MichaelDaum regarding extensions. .gitignore in extensions should probably be manually maintained, checked in, and not updated by pseudo-install. They should generally reflect the delta between the checked-in files and the file list in MANIFEST; that is, all the generated files. This should be a consistent practice across both default extensions in distro and per-extension repositories. They should also include the store-history files, *,v and *,pfv, so that they don't get checked in.
How about the following:
- global core.excludesfile: good for developer workflow (.bak, ~ editor backup files, etc.).
- [Extension]/.gitignore files will be owned by the project. They will be statically maintained and we'll remove the feature from pseudo-install that updates them. That should avoid a lot of extraneous checkins caused by the pseudo-install updates. It should include any generated files, and the output of the build process.
- core/.gitignore will also be project owned, a manually maintained list for standard Foswiki installations. That includes .htpasswd, the various config files, parts of the working directory, etc. (Not all of working should be ignored.)
- distro/.gitignore will not be checked in, and can be tweaked by individual developers. This is a good place for pseudo-install to create a single exclude for each "checked-out" extension so that they don't get added to distro.
- .git/info/exclude ... probably a good place to pipe your symlink info, like you suggest.
--
GeorgeClark - 06 Dec 2014
Note that GeorgeClark and JulianLevens were editing at the same time. As part of resolving the conflict JulianLevens moved this block to follow George's block that he was replying to.
I'm about 95% with you George.
For now let's forget about any tools - I'll reintroduce them later; let's just consider what should be in the repo and what should not. I'll start with distro (and cover non-distro extensions later). We have a .gitignore and .git/info/exclude available in distro/core and for each distro/defaultExtension.
Your suggestion is to use .gitignore as part of the repo, with changes committed as required, and I agree. Your suggestion is then to use .git/info/exclude for each developer's own requirements or whims; again I agree.
The most important thing is that one is managed within the repo and one is not. As such it's arguably arbitrary which is which as long as we agree. However, by design .git/info/exclude cannot be part of the repo. Therefore, as this article recommends (https://help.github.com/articles/ignoring-files/): .gitignore for the repo and .git/info/exclude for the dev.
When I consider distro/.gitignore and distro/.git/info/exclude, I would argue for following the same rules and not having distro/.gitignore as an exception used the other way around. It still leaves any developer the room (as in Michael's case) to handle things locally (via .git/info/exclude) and gives the project an extra .gitignore to use usefully within the project repo.
- .gitignore is always for the repo: nice and consistent
- .git/info/exclude is always for the dev: nice and consistent
The same principle would apply for the non-distro extensions; why be different?
As for any tool: well, that only makes sense for each and every .git/info/exclude; after all, these are never saved in the repo.
The problem now is that the tool could conflict with the dev's manually and lovingly created .git/info/exclude, as we do not have a third exclude file for core and each and every extension.
However a tool is a program. Therefore, we can take advantage of the comments in a .git/info/exclude to create marked sections for various tools and important include params so the dev can even say 'no thanks'. This offers maximum flexibility and control.
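To make the marked-section idea concrete, here is a minimal sketch of how a tool might rewrite only its own block in an exclude file and leave the hand-written lines alone. The function name update_block and the "# BEGIN name" / "# END name" marker format are assumptions for illustration only; nothing in the existing tools uses them.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Foswiki tool).
# Usage: generate-rules | update_block FILE NAME
# Replaces the block delimited by "# BEGIN NAME" / "# END NAME" in FILE
# with the rules read from stdin; all other lines are kept untouched.
update_block() {
    file=$1
    name=$2
    begin="# BEGIN $name"
    end="# END $name"
    touch "$file"
    tmp=$(mktemp)
    # Copy every line that is outside the old block for this tool.
    awk -v b="$begin" -v e="$end" '
        $0 == b { skip = 1; next }
        $0 == e { skip = 0; next }
        !skip   { print }
    ' "$file" > "$tmp"
    # Hand-written lines first, then the freshly generated block.
    { cat "$tmp"; echo "$begin"; cat; echo "$end"; } > "$file"
    rm -f "$tmp"
}
```

Run from the distro directory, 'find * -type l | update_block .git/info/exclude symlinks' would then refresh the symlink block while leaving the dev's own rules in place.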
Returning to the distro/.gitignore I provided above: assuming that symlinks have been ignored by some means, right now there is no need for another .gitignore elsewhere within distro. That's not to say this cannot change, but any additions would form a set of .gitignore files that are part of distro, to be pushed to the repo whenever required. But right now, why not maintain one .gitignore rather than 17 as we have now?
--
JulianLevens - 07 Dec 2014
Note that GeorgeClark and JulianLevens were editing at the same time. As part of resolving the conflict JulianLevens moved this block to follow his own although George would have been replying to his own earlier block in the first instance
Still tinkering with this a bit, trying to work out how to separate the manually maintained files from generated files, and reduce the size of them so they are more easily managed. Since these files are merged, we might be able to use that aspect as well. Consistent with my previous comment, but extending it a bit further:
- distro/.gitignore: Never checked in.
- distro/core/.gitignore: Maintained automatically by pseudo-install, but exclude extensions installed by pseudo-install.pl developer
- distro/core/bin/.gitignore: Project owned, manually maintained, "default install"
- distro/core/data/.gitignore: Project owned, manually maintained, "default install"
- distro/core/pub/.gitignore: Project owned, manually maintained, "default install"
--
GeorgeClark - 07 Dec 2014
Maybe this should all be in a brainstorming topic. And I've fixed up an edit / merge conflict. Using CommentPlugin is a bit safer than concurrent editing. So I don't think you saw my 2nd comment, as it was interleaved with your last response.
We need 3 categories of ignore, and git provides 2 (ignoring the global file). I'd really like to separate tool maintained from developer maintained.
- Maintained by a developer but private
  - This is easy: the .git/info/exclude
- Maintained by a tool
- Maintained by a developer and checked in
  - I think these two can both be in .gitignore, differentiated by position within the hierarchy.
Rather than repeating, I'll update my example above.
--
GeorgeClark - 07 Dec 2014
An additional problem to bear in mind is that a non-default extension by definition needs its own .gitignore and .git/info/exclude. Git will not honour these gitignores anywhere else in distro or other extensions and vice versa.
Therefore, the tool cannot write to one place to cover all eventualities. The only option is to split .git/info/exclude into marked sections, effectively giving us multiple sections to work with.
I also wondered about sub-directories containing .gitignore files. I would myself say follow the rules above and allow .gitignore there but always as part of the repo.
I say allow because I see no need right now. However, saying that "all .gitignore files are part of an ignore database within any repo" clearly allows that to change at any time; it's just another checkin with changes to this database.
--
JulianLevens - 07 Dec 2014
I think that we converged on a design, but need feedback from other git users. See this IRC log.
- .gitignore files will be manually maintained, checked in primarily at the level below the distro root: the root of each extension directory, and core.
- pseudo-install.pl will be updated to maintain the copied / symlinked files as commented "blocks" within .git/info/exclude. That includes:
  - An entry for the root directory of any cloned extension
  - Entries for any files copied or symlinked into core
Rather than try to manage a file-by-file merge into the core .gitignore as is currently done, a commented block of files within .git/info/exclude will be easier and simpler to maintain.
In the meantime I've added a "gitignore" target to BuildContrib. It works similarly to the "manifest" target, but generates a candidate .gitignore by comparing the existing MANIFEST file to the list of files known to git.
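The MANIFEST-versus-git comparison can be approximated in a few lines of shell. This is only a sketch of the idea, not the BuildContrib implementation: the function name candidate_ignores is made up, and it assumes each MANIFEST entry starts with the file path as its first whitespace-separated field, with '#' comments and '!' directives skipped.

```shell
#!/bin/sh
# Hypothetical helper, not the real "gitignore" build target.
# Usage: candidate_ignores MANIFEST TRACKED_LIST
#   TRACKED_LIST is a file holding the output of `git ls-files`.
# Prints, as anchored ignore rules, every MANIFEST path git does not track.
candidate_ignores() {
    manifest=$1
    tracked=$2
    m=$(mktemp)
    t=$(mktemp)
    # First field of each non-blank line that is not a comment or directive.
    awk 'NF && $1 !~ /^[#!]/ { print $1 }' "$manifest" | sort -u > "$m"
    sort -u "$tracked" > "$t"
    # comm -23 keeps lines found only in the MANIFEST list.
    comm -23 "$m" "$t" | sed 's|^|/|'
    rm -f "$m" "$t"
}
```

From an extension root this might be driven by 'git ls-files > /tmp/tracked' followed by 'candidate_ignores path/to/MANIFEST /tmp/tracked', with the output reviewed by hand before anything is committed.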
--
GeorgeClark - 07 Dec 2014
A variant idea to create sections of .git/info/exclude for tools is to create a directory called 'x' (literally or not) that contains many named sections as separate files (e.g. x/developer & x/symlinks). Of course this would require a final step equivalent to 'cat x/* > .git/info/exclude' to create the actual ignore file recognised by git.
--
JulianLevens - 08 Dec 2014
When a non-distro extension is git cloned then the distro git will see that extension as non-tracked. Therefore, we need to tell the distro git to ignore any git cloned extensions, ideally automatically within pseudo-install as extra extensions are cloned into the developer's area.
Any directory under distro that contains its own .git directory needs to be ignored by distro. One option is another command-line one-liner:
find * -name .git | grep -o -P '(.*?)(?=/.git$)' > x/extensions
Which together with
find * -type l > x/symlinks
Gives us two distinct generated ignore sections which can then be combined with:
cat x/* > .git/info/exclude
Which would also pick up x/developer for the developer's own special hand-crafted ignore rules.
Or similar logic built into the tools (and hence more portable).
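For anyone wanting to try the three one-liners together, a minimal wrapper might look like the sketch below. The function name combine_excludes is hypothetical, and sed is swapped in for 'grep -o -P' purely so that it does not depend on GNU grep; the 'x' directory is the same placeholder name used above.

```shell
#!/bin/sh
# Hypothetical wrapper around the one-liners above.
# Usage: combine_excludes /path/to/distro
combine_excludes() {
    (
        cd "$1" || exit 1
        mkdir -p x .git/info
        # Directories holding their own .git, i.e. cloned extensions.
        find * -name .git -prune -print | sed 's|/\.git$||' > x/extensions
        # Every symlink below distro.
        find * -type l > x/symlinks
        # x/developer, if the developer created one, is picked up here too.
        cat x/* > .git/info/exclude
    )
}
```

The body runs in a subshell so the caller's working directory is left unchanged.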
I note that create_new_extension does not 'git init' the new extension and update the ignore rules. For completeness an option of '-repo none' could suppress this. In the future we could conceivably allow a different git repo to be given than the default (https://github.com/foswiki/NewExtension.git) but that needs more thinking through.
--
JulianLevens - 14 Dec 2014
I was arguing earlier that distro/.gitignore should contain all the ignore rules for the whole of distro. An argument raised against this is that a separate .gitignore in sub-directories allows for more refined tuning, but this argument is bogus as you can be as refined as you like in distro/.gitignore.
The advantage of a single distro/.gitignore is to see the whole distro ignore picture in one.
A different argument against one .gitignore to rule over all of distro is to allow easier movement of extensions in and out of distro, but that's a very low frequency action.
OTOH an argument to keep a separate .gitignore file in distro sub directories is that the tools will need to manage the .git/info/exclude for each sub-directory and it would be a more consistent approach.
Hmm, when I reflect on all the arguments regarding this issue I have to concede that it's not a big deal either way. The important thing is that a .gitignore file anywhere in distro (or any extension) is seen as maintained within the repo.
Alas, I cannot make the next release meeting to discuss this, so I hope I have covered my thoughts adequately.
--
JulianLevens - 14 Dec 2014
Having finished my work I'll document it here for now:
The script is tools/git_excludes.pl:
- It is called without parameters
- It can be called from anywhere
  - It assumes it lives in tools and works out where distro is from there
- It always scans from the distro directory down
- It rebuilds all the required .git/info/exclude files
  - for distro
  - for core
  - and for each and every extension
  - will be made empty if not required
    - Necessary to blank when things change (∴ we do not leave old exclude cruft behind)
- It is designed to be re-run as often as is required; it just makes the 'excludes' clean
- It gives developers two 'ignore' files to work with
  - .gitignore: part of the Foswiki repo if created; this is of course just part of standard git and nothing new
  - .gitexclude: never part of the Foswiki repo, it's always for the developer's private excludes; this is a new 'ignore' file.
    - Only for distro and non-distro extensions
    - However distro/.gitexclude can be used to handle
      - distro excludes
      - distro Extension excludes
- pseudo-install.pl and create_new_extension call git_excludes.pl before returning to ensure any changes are immediately reflected
  - create_new_extension really needs a git init option
Apart from scanning for symlinks it also notes any non-distro extensions and .gitexclude files. These are combined into a new .git/info/exclude which is recreated each time.
For pedagogical purposes, a distro directory looks something like this:
Any files at this level are one of three types:
- Part of the project and therefore the distro repo, e.g.
  - .gitignore (if created this will become part of the project)
  - README.md
- Developer-created files not part of the project
- Build outputs (this could include symlinks)
There is nothing remarkable about the README.md file, it is quite correctly part of the project and already part of the repo and as it's not remarkable I'll stop remarking about it.
To neatly and correctly ignore the above you will need both a .gitignore and a .gitexclude file.
For example, the .gitignore file would appear as:
/*.build
This is because *.build files in the above pedagogical scenario are a valid output of the Foswiki build tools. Every developer will build these outputs and will need to .gitignore them. Therefore, a .gitignore file is created which is itself committed to the repo for everyone's benefit.
Conversely a .gitexclude file would appear as:
/diff*
As this developer is in the habit of creating diff outputs in the distro directory. The .gitexclude file is never part of the repo and is never committed. Similarly the diff* files are never tracked and hence never committed to the repo.
In practice the above two ignore files are unlikely in distro:
- No Foswiki build tools create output in the distro directory
  - ∴ there is currently no .gitignore file in the repo
  - If that ever changes then a .gitignore like the above would need to be created and committed. However, it does seem unlikely that we will ever place build outputs here
- Most developers probably place any ignorable non-Foswiki files in one of the sub-directories
  - Nonetheless this is quite possible and reasonable; we all have our own unique working habits
  - It will always remain local to that developer and never seen by anyone else
The sub-directories within distro also fall into two main categories:
- Those with a .git sub-directory
  - The list of these is captured and added as a set of directories to ignore at distro level
    - These are added to distro/.git/info/exclude
  - This is valid even if the sub-directory is actually a git repo for something non-Foswiki
    - It still needs to be ignored at distro level
  - They are also scanned internally for symlinks, which are added to extension/.git/info/exclude
    - In practice I think it is unlikely to have non-Foswiki repos here
    - Nonetheless this is a potential break point as symlinks may need to be managed quite differently in that repo
- Those without a .git sub-directory
  - Assumed to be core or a distro extension
  - All their symlinks are harvested
    - These are added to distro/.git/info/exclude
  - If it's not core or a distro extension then the developer will need to add this directory to a distro/.gitexclude file
    - Its symlinks are still harvested and added to distro/.git/info/exclude
    - However this is benign as they are just redundant
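The two-category walk described above can be sketched in shell. To be clear, this is not the real tools/git_excludes.pl: the function name rebuild_excludes is invented, the leading-slash anchoring of each rule is my own choice, and for brevity it only looks at sub-directories and at distro/.gitexclude (not at top-level symlinks or per-extension .gitexclude files).

```shell
#!/bin/sh
# Sketch only; the real script is Perl and may well differ.
# Usage: rebuild_excludes /path/to/distro
rebuild_excludes() {
    distro=$1
    out="$distro/.git/info/exclude"
    mkdir -p "$distro/.git/info"
    : > "$out"    # always start empty so no old cruft survives

    # The developer's private rules, if any.
    if [ -f "$distro/.gitexclude" ]; then
        cat "$distro/.gitexclude" >> "$out"
    fi

    for dir in "$distro"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        if [ -d "$dir/.git" ]; then
            # Its own repo: hide the whole directory from distro ...
            echo "/$name/" >> "$out"
            # ... and list its symlinks in its own exclude file.
            mkdir -p "$dir/.git/info"
            ( cd "$dir" && find . -name .git -prune -o -type l -print ) |
                sed 's|^\.||' > "$dir/.git/info/exclude"
        else
            # core or a distro extension: symlinks go to the distro exclude.
            ( cd "$distro" && find "$name" -type l -print ) |
                sed 's|^|/|' >> "$out"
        fi
    done
}
```

Because every exclude file is truncated and rewritten on each call, the function can be re-run as often as needed, which mirrors the behaviour described for the real script.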
Also checked in are a number of simplified .gitignore files. Therefore you MUST run git_excludes.pl to make git status sane again.
After running you can check your git status and it should be clean, with any .git/info/exclude files handling local excludes and the dynamically changing excludes.
Example .gitignores now required:
*,v
*,pfv
*.gz
/BuildContrib.md5
/BuildContrib.sha1
/BuildContrib.tgz
/BuildContrib.txt
/BuildContrib.zip
/BuildContrib_installer
/BuildContrib_installer.pl
*,v
*,pfv
*.gz
*.jslint
/ConfigurePlugin.md5
/ConfigurePlugin.sha1
/ConfigurePlugin.tgz
/ConfigurePlugin.txt
/ConfigurePlugin.zip
/ConfigurePlugin_installer
/ConfigurePlugin_installer.pl
None are more complicated than this, and some are even simpler:
*,v
*,pfv
*.gz
The above suggests a pretty standard .gitignore as a build target.
The core/.gitignore is only:
*,v
*,pfv
*.gz
/working/**/events.*
/working/**/error.*
/working/**/cgisess_*
/lib/LocalSite.cfg
#
# Open question whether to add or not, something to do with 'git clean' and the -x flag, ask GeorgeClark
# /data/**/*.lease
# /data/**/*.changes
I did have just /working/ as an ignore line, but that directory has various README files inside which describe the purpose of certain directories and sub-directories. The initial thought was to use !/working/README lines following the /working/ line to restore the READMEs. However, testing failed, so I checked the docs: you cannot reinstate individual files within an ignored directory. Therefore, I used the above rules to achieve something similar.
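The README behaviour is easy to reproduce in a throwaway repo. The sketch below is only a demonstration aid: visible_in_working is a made-up name, and the two files it creates (working/README and working/tmp/cgisess_1) are stand-ins, not a claim about what a real working directory contains.

```shell
#!/bin/sh
# Hypothetical demo helper; needs git on the PATH.
# Usage: visible_in_working 'GITIGNORE TEXT'
# Builds a scratch repo, writes the given text to its .gitignore and prints
# the untracked files under working/ that git does NOT ignore.
visible_in_working() {
    repo=$(mktemp -d)
    git init -q "$repo"
    mkdir -p "$repo/working/tmp"
    echo "about this directory" > "$repo/working/README"
    : > "$repo/working/tmp/cgisess_1"
    printf '%s\n' "$1" > "$repo/.gitignore"
    git -C "$repo" ls-files --others --exclude-standard -- working
    rm -rf "$repo"
}

# Ignoring the directory and then negating the README: prints nothing,
# because the README cannot be re-included.
#   visible_in_working '/working/
#   !/working/README'
#
# Ignoring only the generated files: prints working/README.
#   visible_in_working '/working/**/cgisess_*'
```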
--
JulianLevens
There is an issue for anyone building the "compiled" versions of extensions. In this case the minimized / compressed files are created in the Extension directory. Extensions that use javascript files need all of the output files generated by build also added to their .gitignore files.
MichaelDaum already had created .gitignore files for the JQuery and JavaScript based extensions that he maintains. Those need to be reverted back to their manual form.
Any new extensions, once they have been built with BuildContrib, can have a .gitignore file generated by the build target gitignore:
cd JQueryPlugin/lib/Foswiki/Plugins/JQueryPlugin
perl build.pl release
perl build.pl gitignore >> ../../../../.gitignore
I believe that Michael prefers to maintain a manually tuned .gitignore for his extensions, so best to just revert those files back to his last version.
--
GeorgeClark - 10 Mar 2015