Item12763: Solr plugin installation did not work as documented
Priority: Normal
Current State: Closed
Released In: n/a
Target Release: n/a
Applies To: Extension
Component: SolrPlugin
Branches:
With any luck, much of this is fixed already in upcoming Plugin versions. I was using the July 2013 released version of the plugin.
Here's a list of the issues that I ran into and what I did about them:
When I tried to start solr after installing the bin file, the first thing I ran into was that the solrstart script required me to pass the path to my foswiki installation as an argument. Otherwise, it defaulted to a trunk development directory. There seem to be similar issues with several of the scripts in the tools directory. They expect environment variables to be set - which I guess they are when called by foswiki itself - but they aren't when the scripts are run from the command line.
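One way a script could avoid silently falling back to a development directory is to resolve the installation path explicitly and fail when it is missing. The following is only a sketch, not code from the plugin: the function name resolve_foswiki_root and the variable FOSWIKI_ROOT are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: FOSWIKI_ROOT is an illustrative variable name,
# not one that the plugin's scripts are known to read.
resolve_foswiki_root() {
  # Prefer an explicit argument, then the environment; never fall back
  # to a hard-coded development directory.
  root="${1:-${FOSWIKI_ROOT:-}}"
  if [ -z "$root" ]; then
    echo "error: pass the foswiki directory or set FOSWIKI_ROOT" >&2
    return 1
  fi
  if [ ! -d "$root" ]; then
    echo "error: '$root' is not a directory" >&2
    return 1
  fi
  printf '%s\n' "$root"
}

# Example use inside a start script:
#   root=$(resolve_foswiki_root "$1") || exit 1
```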
After getting past this problem, Solr still didn't start properly. Manually starting it using java -jar start.jar, I was able to see the solr logs and to see that it was looking for webapps in the "contexts" directory - but there was no contexts directory under foswiki/
Also, it seemed that there should have been a foswiki directory under the multicore directory (and also an _template directory), but it didn't exist there. The directory contained conf/ lib/ and solr.xml only.
When I tried changing the Jetty configuration to look in the webapps dir instead of contexts, I got a bunch of Java errors.
At this point, since it seems that the solr and jetty configuration files weren't matching the actual file structure, I decided to download my own copy of solr to see if I could get it working.
- I downloaded my own copy of solr 4.6 and also solr 4.3 (the version that the plugin seemed to be based on). Solr 4.6 was easy to get working. I customized it as follows:
- renamed the example directory to solr-jetty (example just didn't make sense for a production deployment)
- removed some example sub-directories that didn't apply: example-DIH, exampledocs, example-schemaless, multicore
- created a core for foswiki based on the "collection1" example:
cp -r collection1/ foswiki
- edited core.properties to remove the name "collection1" so it would use the core directory name - foswiki - as the core name.
- ran java -jar start.jar in solr-jetty to check that both cores were up and running, which they were
- made changes to 3 files: solrconfig.xml, schema.xml, and stopwords.txt
- diffed the solrconfig.xml files from the default 4.3 installation and the one that came with the plugin, then manually applied all the changes that looked like they made sense to the solrconfig.xml file in my new foswiki core.
- also fixed some other things that were resulting in warnings (use of deprecated code)
- replaced the schema.xml file with the one from the plugin multicores/conf directory
- replaced all instances of stopwords- with stopwords_
- replaced stopwords_se with stopwords_sv for Swedish (just seemed to be incorrect for some reason)
- fixed a few other things that were resulting in warnings (use of deprecated code)
- copied foswiki/conf/lang/stopwords_en.txt to foswiki/conf/stopwords.txt
- copied over the mapping-japanese.txt file from the plugin multicores/conf directory
- disabled collection1 by renaming the core.properties file in the collection1 directory to core.properties.orig (alternatively the whole directory could be deleted)
- restarted the server, and checked the logs for errors and warnings.
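The file-level steps in the list above can be sketched as shell functions. This is a rough sketch under assumptions, not a tested installer: it assumes the solr home directory contains collection1/ with a core.properties holding a name= line, that the stopword references live in schema.xml, and that the stopword file names end in .txt. The function names are made up for illustration.

```shell
# Create the foswiki core from the collection1 example.
# Usage: make_foswiki_core SOLR_HOME
make_foswiki_core() {
  solr_home="$1"
  if [ ! -d "$solr_home/collection1" ]; then
    echo "error: no collection1 in $solr_home" >&2
    return 1
  fi
  if [ -e "$solr_home/foswiki" ]; then
    echo "error: $solr_home/foswiki already exists" >&2
    return 1
  fi
  cp -r "$solr_home/collection1" "$solr_home/foswiki" || return 1
  # Drop the explicit name so the directory name (foswiki) is used as the core name.
  props="$solr_home/foswiki/core.properties"
  sed '/^name=/d' "$props" > "$props.tmp" && mv "$props.tmp" "$props"
}

# Disable a core by renaming its core.properties.
# Usage: disable_core SOLR_HOME CORE
disable_core() {
  mv "$1/$2/core.properties" "$1/$2/core.properties.orig"
}

# Rewrite stopword references: stopwords- becomes stopwords_, and se becomes sv.
# Usage: fix_stopword_refs FILE
fix_stopword_refs() {
  sed -e 's/stopwords-/stopwords_/g' \
      -e 's/stopwords_se\.txt/stopwords_sv.txt/g' \
      "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

The functions only touch files; restarting the server and checking the logs for errors and warnings still has to be done by hand afterwards.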
When trying to follow the instructions for indexing the data, I ran into the problem that the instructions assume you are using virtual hosts, which I'm not - so I had to use the non-virtual host version of the scripts.
When running queries, I saw warnings in the Solr logs that Solr would use Highlighter instead of FastVectorHighlighter so I fixed this - and also a typo in the query where it says "Contignuous".
I also found that Solr was detecting some English pages as German - and figured out that I should have set CONTENT_LANGUAGE to be "en" in my site preferences so it didn't need to use language detection (since my site is all English).
Finally, I generally set up my foswiki server with at least 2 instances of foswiki - a production version and a test version. These aren't foswiki virtual hosts but apache named virtual hosts. From a solr perspective, I still need two cores though so I can have two separate indexes and two separate solr configurations to play with. It didn't make sense to leave solr and jetty installed in foswiki/solr - so I decided to install it under /opt/solr instead where it would make more sense to share it.
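A layout with one core per foswiki instance could be created along these lines. This is a sketch under assumptions: the function name make_cores and the core names in the usage comment are hypothetical, the location of the solr home under /opt/solr is not specified here and is shown as a placeholder, and it relies on a core.properties file being enough to register a core, as in the Solr 4.6 example setup.

```shell
# Create one core per foswiki instance from a shared template core directory.
# Usage: make_cores SOLR_HOME TEMPLATE_CORE NAME...
make_cores() {
  if [ "$#" -lt 3 ]; then
    echo "usage: make_cores SOLR_HOME TEMPLATE_CORE NAME..." >&2
    return 1
  fi
  solr_home="$1"
  template="$2"
  shift 2
  for name in "$@"; do
    core="$solr_home/$name"
    if [ -e "$core" ]; then
      echo "error: core directory $core already exists" >&2
      return 1
    fi
    # Each core gets its own copy of the template, so its configuration
    # and index can be changed independently of the other cores.
    cp -r "$template" "$core" || return 1
    printf 'name=%s\n' "$name" > "$core/core.properties" || return 1
  done
}

# Example (hypothetical paths and core names):
#   make_cores <solr home under /opt/solr> <template core dir> foswiki foswiki_test
```

If the template directory already contains index data, the copy duplicates it, so starting from a template without an index is probably the cleaner choice.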
--
LeilaPearson - 01 Mar 2014
I've attached to this ticket all of my modified files, along with the files they are based on so they can be diffed to see what specific changes were made.
- .new files are my new versions of the files
- .plugin files are the original plugin files - from the July 2013 release (which is the current release as of this writing)
- .solr43 are the corresponding files from the default solr 4.3.0 release - without the foswiki customizations.
- .solr461 are the corresponding files from the default solr 4.6.1 release - without the foswiki customizations.
My .new files are basically the diffs between the .plugin and .solr43 files applied on top of the .solr461 files, with a few additional fixes and changes related to the upgrade to solr 4.6.1.
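The relationship between the four file sets can be approximated with diff and patch. Note that this is an automated stand-in, not the procedure actually used: the changes were applied by hand and only where they made sense, so the output of this sketch is a starting point and any rejected hunks need manual review. The function name rebase_config is hypothetical.

```shell
# Rebase the plugin's customizations onto the stock solr 4.6.1 file.
# Usage: rebase_config NAME
# Expects NAME.solr43, NAME.plugin and NAME.solr461; writes NAME.patch and NAME.new.
rebase_config() {
  name="$1"
  # Capture what the plugin changed relative to stock solr 4.3.
  diff -u "$name.solr43" "$name.plugin" > "$name.patch"
  status=$?
  if [ "$status" -eq 0 ]; then
    # No plugin customizations: the stock 4.6.1 file is already the result.
    cp "$name.solr461" "$name.new"
    return
  elif [ "$status" -gt 1 ]; then
    # diff exits 1 when the files differ; anything higher is a real error.
    return 1
  fi
  # Apply those changes to the stock 4.6.1 file without modifying it;
  # the result goes to NAME.new.
  patch -o "$name.new" "$name.solr461" < "$name.patch"
}

# Example: rebase_config solrconfig.xml
```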
--
LeilaPearson - 01 Mar 2014
I also attached a modified version of SolrPlugin.txt and a tar file to go with it. If you follow these instructions, you should be able to get things up and running the way I did - however, this may not be the best way to do things - especially if you already have jetty or tomcat running on your server. The solr distribution is set up as an example instead of a production package, which means my method results in lots of unnecessary files, plus a bit of an odd file organization.
--
LeilaPearson - 01 Mar 2014
Thanks a lot for documenting what you did to install SolrPlugin. This is highly appreciated and will be of help when upgrading from solr-4.3 to newer ones.
I'll be cherry-picking some of your changes and integrating them into a next release while leaving aside some other changes. Here are my notes from inspecting your changes:
Mapping-japanese.txt didn't change.
The plugin's stopwords.txt has got a more extensive list of English stopwords ... I'll keep that.
I won't rename the stopwords files nor move them into a separate lang/ directory due to backwards compatibility.
The rest of the changes in schema.xml (replacing SortableXXX with TrieXXX) are good ones.
I'd always prefer to run solr using the server's own jetty (or tomcat) servlet engine controlled by its init process ... and not the one shipped inside the example jetty bundled in the solr tar ball.
I can't copy over your SolrPlugin.txt changes as they heavily rely on morphing the upstream example directory into a manually installed solr service ... which I'd rather not promote.
The core of the problem is that the upstream solr distribution is packaged in a rather odd way, i.e. the extra libraries are scattered all over the place in different directories. I tried to mitigate this pain by rebundling these binaries as the SolrPlugin-bin package in a way that makes more sense and needs a simple extraction process instead of wading through the upstream distributions while separating their jetty jars from the real net binaries that make up solr and its other goodies.
At the end of the day only four things need to be taken out of the upstream tarball:
- solr.war
- non-jetty lib jars
- diff of changes in stopword files
- diff of changes in solrconfig.xml and schema.xml
Finally, there are some extensive changes to solrconfig.xml that heavily alter the request handlers in there ... I am completely unsure what's going on there atm ... needs digging deeper.
--
MichaelDaum - 09 Jan 2015
This is going to change drastically with Solr 5. Some of the proposed changes have been added as part of Item13280.
--
MichaelDaum - 25 Feb 2015