Id | Summary | Priority | Current State | Creation Date | Last Edit |
---|---|---|---|---|---|
Item56 | kinosearch - enable admin to index just a specified list of webs' attachments | Enhancement | Confirmed | 02 Nov 2008 - 05:27 | 27 Feb 2010 - 01:26 |
Item8369 | LDAP Plugin seems to be interfering with the Index initialization script | Normal | Waiting for Feedback | 19 Dec 2009 - 06:16 | 11 Mar 2010 - 23:54 |
Item5647 | Make the indexing process more robust | Urgent | Being Worked On | 21 May 2008 - 16:26 | 18 Mar 2010 - 01:49 |
Item2308 | Kinosearch should set 'search' context for the kinosearch script | Low | New | 30 Oct 2009 - 09:40 | 16 Jul 2010 - 15:47 |
Item5581 | Full text search over form fields | Enhancement | Confirmed | 28 Apr 2008 - 19:38 | 26 Mar 2011 - 23:50 |
Item10769 | Sort out logging | Enhancement | New | 18 May 2011 - 17:02 | 18 May 2011 - 17:02 |
Item10775 | When updating a topic, we reindex all attachments | Enhancement | New | 19 May 2011 - 09:30 | 19 May 2011 - 09:30 |
Item10774 | Should not store .kinoupdate in webs data directory | Enhancement | New | 19 May 2011 - 09:32 | 19 May 2011 - 09:32 |
Item11223 | duplicate results after kinoupdate via bug in KinoSearchContrib/Index.pm sub changedTopics | Urgent | New | 31 Oct 2011 - 14:43 | 31 Oct 2011 - 14:43 |
Item11388 | kinoupdate seems broken with 1.1.4 | Urgent | New | 23 Dec 2011 - 00:09 | 23 Dec 2011 - 00:09 |
Item8083 | Using 'TWiki::Store::SearchAlgorithms::Kino' with RcsLite doesn't work | Urgent | Waiting for Feedback | 24 Mar 2009 - 13:54 | 21 Apr 2012 - 20:29 |
Item10543 | KinoSearchContrib incompatible with 1.1 when used as the stores search algorithm | Urgent | Confirmed | 25 Mar 2011 - 16:13 | 02 May 2012 - 14:27 |
antiword (in debian package antiword)
abiword (in debian package abiword)
wvhtml (in debian package wv)
pdf & ppt: xpdf (in debian package xpdf-utils), ppthtml (in debian package ppthtml)
root ~# aptitude install foswiki-kinosearchplugin
One might think that's it: I just have to create the index and it will work.
Unfortunately this is not the case. In fact many of the dependencies are still unresolved.
1) In the porting of the KinoSearchPlugin from TWiki to Foswiki the KinoSearchContrib has been split into: KinoSearchPlugin + StringifierContrib. This means that if you look for the dependencies of the KinoSearchPlugin you have to look at the sum of the dependencies of all 3 extensions (KinoSearchContrib, KinoSearchPlugin and StringifierContrib)!
2) Not every coded dependency in the foswiki extensions is automatically coded into the .deb packaging! To check whether a perl module is installed (IO::File in the example):
root ~# perl -e 'use IO::File; print $IO::File::VERSION."\n"'
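The one-liner above checks a single module. Here is a minimal sketch for checking several at once. The helper names `perl_module_installed` and `report_perl_modules` are made up for this example and are not part of foswiki:

```shell
#!/bin/sh
# Return 0 if the given Perl module can be loaded, non-zero otherwise.
# (hypothetical helper name, not shipped with foswiki)
perl_module_installed() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Print one status line per module name passed as an argument.
report_perl_modules() {
    for module in "$@"; do
        if perl_module_installed "$module"; then
            echo "$module: installed"
        else
            echo "$module: NOT installed"
        fi
    done
}
```

For example, `report_perl_modules IO::File KinoSearch CharsetDetector Spreadsheet::XLSX` lists which of the modules mentioned on this page still have to be installed.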
To check whether a debian package is installed (wv in the example):
root ~# dpkg -l | grep wv
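Note that `dpkg -l | grep wv` also matches any other package that merely has "wv" somewhere in its name or description. A stricter check, sketched here with `dpkg-query` (the function name `deb_installed` is just an illustration), asks for the status of exactly one package:

```shell
#!/bin/sh
# Return 0 only if the named debian package is fully installed.
# (hypothetical helper name; relies on dpkg-query from the dpkg package)
deb_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}
```

Usage could look like `deb_installed wv && echo "wv is installed"`.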
To find out which debian package provides a perl module, use apt-file:
root ~# apt-file find CharsetDetector|grep perl
The coded dependencies of each extension are listed at the end of /var/lib/foswiki/<extension_name>_installer:
root ~# tail /var/lib/foswiki/KinoSearchContrib_installer
<<<< DEPENDENCIES >>>>
Foswiki::Contrib::StringifierContrib,>0,1,perl,Required for indexing attachments
KinoSearch,>0,1,cpan,Required
Error,>0,1,cpan,Required
Time::Local,>0,1,cpan,Required
IO::File,>0,1,cpan,Required
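To pull only the names out of such a dependency list, something like the following sketch might help. It assumes the comma-separated layout seen in the output above (name, version condition, a third field, type, description); the function name `list_deps` is invented for this example:

```shell
#!/bin/sh
# Read installer text on stdin and print the names of all dependencies
# of the requested type (e.g. cpan or perl).
# Assumption: fields are name,version,<flag>,type,description as in the sample.
list_deps() {
    awk -F, -v want="$1" '
        /<<<< DEPENDENCIES >>>>/ { in_deps = 1; next }  # list starts after this marker
        /^<<<< /                 { in_deps = 0 }        # any other marker ends it
        in_deps && NF >= 4 && $4 == want { print $1 }
    '
}
```

For example `list_deps cpan < /var/lib/foswiki/KinoSearchContrib_installer` would print only the entries marked as cpan.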
Perl Module/Program | Installation Status | Deb Package y/n | Deb Package / CPAN |
---|---|---|---|
 | installed | yes | |
 | installed | yes | |
 | not installed | yes | |
 | installed | yes | |
 | not installed | NO | CPAN |
 | not installed | NO | CPAN |
 | installed | yes | |
 | installed | yes | |
 | not installed | yes | |
 | not installed | yes | |
 | not installed | yes | |
 | not installed | yes | |
 | not installed | yes | |
 | installed | yes | |
 | installed | yes | |
pptx2txt | installed | yes | |
 | not installed | NO | CPAN |
 | installed | yes | |
 | installed | yes | |
 | installed | yes | |
 | installed | yes | |
 | installed | yes | |
Install with aptitude (or apt-get) all the components which are available from the debian distro:
root ~# aptitude install ppthtml xpdf-utils antiword abiword wv
root ~# aptitude install libhtml-tree-perl
Note: you do not really need to install all of antiword, abiword and wv. antiword is the default, is the suggested one for a linux installation, and works pretty well.
Use CPAN to install the perl modules which are not available from the debian distro, accepting the resolution of all the suggested dependencies. (Note: to use cpan you need to have make and gcc installed):
root ~# cpan
install KinoSearch
install CharsetDetector
install Spreadsheet::XLSX
quit
Logfile cannot be opend in path-20100728.log.
You can see in configure that the paths needed for the log and the index do not exist!
{KinoSearchContrib}{LogDirectory} /var/lib/foswiki/pub/../kinosearch/logs
{KinoSearchContrib}{IndexDirectory} /var/lib/foswiki/pub/../kinosearch/index
You should create them by hand as the correct web user:
root ~# su - www-data
www-data ~$ mkdir -p /var/lib/foswiki/pub/../kinosearch/logs
www-data ~$ mkdir -p /var/lib/foswiki/pub/../kinosearch/index
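The two mkdir calls can also be wrapped in a small function that checks afterwards that the directories are really writable for the current user. This is only a sketch; the name `make_kino_dirs` is made up, and it has to be run as the web user (www-data here) to be meaningful:

```shell
#!/bin/sh
# Create the logs and index directories below the given base directory
# and verify that the current user can write to them.
# (hypothetical helper, not part of KinoSearchContrib)
make_kino_dirs() {
    base=$1
    for dir in "$base/logs" "$base/index"; do
        mkdir -p "$dir" || return 1
        if [ ! -w "$dir" ]; then
            echo "not writable: $dir" >&2
            return 1
        fi
    done
}
```

With the paths from configure this would be called as `make_kino_dirs /var/lib/foswiki/pub/../kinosearch`.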
/var/lib/foswiki/kinosearch/index directory.
{KinoSearchContrib}{WordIndexer} = antiword
{StringifierContrib}{WordIndexer} = antiword
Check from the console whether the docx2txt.pl file works:
root ~# /var/lib/foswiki/tools/docx2txt.pl ./Test5.docx
Failed to extract required information from <./Test5.docx>!
This is a symptom that unzip is not installed.
root ~# dpkg -l|grep unzip
root ~# aptitude install unzip
root ~# /var/lib/foswiki/tools/docx2txt.pl Test5.docx
root ~# ls
Test5.docx Test5.txt
root ~# cat Test5.txt
ciao ciao
OK, from the console it works, and if you now re-index kinosearch and run a search, the docx files are included!
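Since the missing unzip only showed up as a rather cryptic docx2txt.pl error, it may be worth checking for the external helper programs up front. A minimal sketch (the function name `require_tools` is invented for this example):

```shell
#!/bin/sh
# Report every listed program that is not found in PATH.
# Returns 0 if all are present, 1 if at least one is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For this setup a call could be `require_tools unzip antiword ppthtml make gcc`.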
root ~# cp /etc/mime.types /etc/mime.types.orig
Edit /etc/mime.types and add at the bottom the lines:
application/vnd.ms-word.document.macroEnabled.12 .docm
Restart apache:
root ~# /etc/init.d/apache2 restart
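If the mime.types change has to be repeated (for example on several servers), appending the line by script avoids duplicate entries. A sketch, where `add_line_once` is a made-up helper name:

```shell
#!/bin/sh
# Append a line to a file only if exactly that line is not already there.
# (hypothetical helper; assumes the file ends with a newline)
add_line_once() {
    file=$1
    line=$2
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Used as root, after the backup copy: `add_line_once /etc/mime.types 'application/vnd.ms-word.document.macroEnabled.12 .docm'`, followed by the apache restart.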
Note: foswiki itself has its own mime.types, defined in /var/lib/foswiki/data/mime.types.
This contains a “resolution” for at least the docx, pptx and xlsx formats:
application/vnd.openxmlformats docx pptx xlsx
This does not seem to prevent the weird behavior with the IE8 client.
I did not change anything in /var/lib/foswiki/data/mime.types because it did not seem to have any effect, so I kept everything “as default as possible”.