This question about Topic Markup Language and applications: Answered
I have problems with regular expressions.
I need to search for all topics starting with "AbteilungsPortal", and I want to show the first line (the header) of each topic rather than the summary, which is a fixed number of characters.
The header starts with + and ends with a CR/LF.
Thanks
Andreas
%SEARCH{
"^[^a-zA-Z0-9]*AbteilungsPortal"
type="regex"
format="$pattern(^([^a-zA-Z0-9]*?AbteilungsPortal.*?).*)"
}%
In the search string:
- First ^ anchors the search at the beginning of the topic (or if we had multiple="on", it would anchor to beginning of the line)
- [^a-zA-Z0-9] matches non-alphanumeric characters
- The following * means match the non-alphanumeric characters zero or more times
- Then AbteilungsPortal must follow that pattern.
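The anchoring described above can be checked outside Foswiki. This is only a minimal sketch in Python, whose `re` module reads this particular expression the same way; the two sample topic texts are invented for illustration, and `re.MULTILINE` merely approximates the per-line behaviour of multiple="on".

```python
import re

# The search string from the SEARCH above.
SEARCH_RE = r"^[^a-zA-Z0-9]*AbteilungsPortal"

# Hypothetical topic texts, invented for this example.
heading_first = "---+ AbteilungsPortal ELT\nSome body text\n"
heading_later = "Intro text\n---+ AbteilungsPortal ELT\n"

# Without MULTILINE, ^ anchors only at the very start of the text.
starts_topic = re.search(SEARCH_RE, heading_first) is not None
later_in_topic = re.search(SEARCH_RE, heading_later) is not None

# With MULTILINE, ^ also anchors at the start of every line.
later_per_line = re.search(SEARCH_RE, heading_later, re.MULTILINE) is not None

print(starts_topic, later_in_topic, later_per_line)  # True False True
```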
In the pattern string:
- ^ anchors the match to the beginning
- ( begins the pattern to be extracted
- [^a-zA-Z0-9] matches non-alphanumerics
- Following * means match the non-alphanumerics zero or more times
- ? makes the match "non-greedy" (in combination with *: match as few characters as possible before the first occurrence of AbteilungsPortal)
- .*? : . means "any character", * means "zero or more times", ? means "non-greedy"
- ) finishes the pattern to be extracted
- .* finishes the regex. In Foswiki, we must always finish $pattern() in this way.
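As a rough illustration of why the trailing .* matters, the sketch below imitates $pattern() in Python by replacing whatever the expression matches with its first capture group. The substitution model, the `re.DOTALL` flag, the `extract` helper and the sample text are all assumptions made for this example, not a copy of the Foswiki implementation.

```python
import re

def extract(pattern, text):
    # Replace the matched portion with capture group 1, once;
    # anything the pattern does not cover is left in place.
    return re.sub(pattern, r"\1", text, count=1, flags=re.DOTALL)

# Hypothetical topic text, invented for this example.
topic = "---+ AbteilungsPortal ELT\nBody text\n"

with_tail = extract(r"^([^a-zA-Z0-9]*?AbteilungsPortal.*?).*", topic)
without_tail = extract(r"^([^a-zA-Z0-9]*?AbteilungsPortal.*?)", topic)

print(repr(with_tail))     # '---+ AbteilungsPortal'
print(repr(without_tail))  # '---+ AbteilungsPortal ELT\nBody text\n'
```

Without the final .* the rest of the topic is not consumed by the match, so it stays in the output. The non-greedy .*? just before the closing parenthesis matches as little as possible, which in this sample is nothing at all.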
Result shown on this topic:
Number of topics: 1
--
PaulHarvey - 04 Apr 2010
Dear Paul,
I don't think I explained my problem very well.
In the company I work for we have a lot of departments (German: Abteilung).
Each department will have a portal topic. The topic names are
AbteilungsPortalA,
AbteilungsPortalB,
AbteilungsPortalC, and so on.
Each portal topic starts with a heading, for example "---+!! KA-1 Machinery construction".
My own attempt:
%SEARCH{"AbteilungsPortal" scope="topic" nonoise="on" format="[[$topic]] $pattern(.*?---\+!!*([\n\r]+).*)"}%
I search for topics whose names contain "AbteilungsPortal", and I want to show the first line of each topic found.
Andreas
UPDATE:
I got it...mostly. (It helps to read the manual carefully!)
%SEARCH{
"AbteilungsPortal"
scope="topic"
nonoise="on"
format="[[$topic]] $pattern(.*?([:blank:].*?([\n\r]+)).*)"
}%
New problems:
The topic I'm looking for ("AbteilungsPortalElt") starts with a level-1 heading:
---+!! ELT-Abteilung
---++ Internal documents
The search-result is
AbteilungsPortalElt LT-Abteilung
The 'E' is suppressed! If the text starts with 'A' or 'K' it is shown correctly.
I also tried to use the extracted string as a link.
%SEARCH{
"AbteilungsPortal"
scope="topic"
nonoise="on"
format="[[$topic][$pattern(.*?([:blank:].*?([\n\r]+)).*)]]"
}%
But this did not work?!?
Andreas
I think it is a current bug that [:classes:] are not recognised by $pattern(), and I don't know if it is easy to fix (we don't want to prevent future non-grep search algorithms; Development.NormaliseRegexSyntax and Development.AddMatchOperatorToQueryLanguage have some background).
Anyway, $pattern() is treating [:blank:] literally: matching the :, b, l, a, n, k characters. I would suggest using \s instead, but I seem to recall that $pattern() doesn't handle that notation either; I could be wrong though. That is why I wrote a pattern to match non-alphanumeric characters:
[^a-zA-Z0-9]
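The literal reading of [:blank:] can be reproduced outside Foswiki. The sketch below uses Python, which also parses [:blank:] as a plain set of the characters :, b, l, a, n and k. It assumes that $pattern() substitutes the match with the first group, matches case-insensitively and lets . match newlines; those flags and the `extract` helper are assumptions for this example. Under them, the first character of the set found in "ELT-Abteilung" is the 'L', which would account for the missing 'E'.

```python
import re

# Written without the outer brackets, [:blank:] is an ordinary set
# containing the characters ':', 'b', 'l', 'a', 'n' and 'k'.
PATTERN = r".*?([:blank:].*?([\n\r]+)).*"

def extract(text):
    # Assumption: case-insensitive match with '.' also matching newlines.
    flags = re.IGNORECASE | re.DOTALL
    return re.sub(PATTERN, r"\1", text, count=1, flags=flags)

elt = extract("---+!! ELT-Abteilung\n---++ Internal documents\n")
ka = extract("---+!! KA-1 Machinery construction\n---++ Internal documents\n")

print(repr(elt))  # 'LT-Abteilung\n'
print(repr(ka))   # 'KA-1 Machinery construction\n'
```

Headings that begin with 'K' or 'A' happen to start with a character from the set, so they come through intact.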
--
PaulHarvey - 06 Apr 2010
Try writing the class as [[:blank:]] - classes have to be within double square brackets. I'm not sure about the rest of the regex.
--
GeorgeClark - 07 Apr 2010
Dear Paul,
Your idea with [^a-zA-Z0-9] is good. Now I get the results I want!
- Note: you may also want to try George's suggestion that character classes should be written as [[:blank:]] instead of [:blank:]. This would be better, especially because the ranges in [^a-zA-Z0-9] do not cover accented characters, etc.
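The effect on accented characters is easy to see in a small experiment. The sketch below is in Python, which does not support POSIX classes such as [[:alnum:]], so the Unicode-aware `\W` stands in for them here. The `first_heading` helper, the pattern shape and the sample heading are hypothetical and only serve this example.

```python
import re

def first_heading(text, skip):
    # Skip leading characters matched by `skip`, then keep the rest
    # of the first line (hypothetical helper for this example).
    pattern = r"^" + skip + r"*([^\n\r]*).*"
    return re.sub(pattern, r"\1", text, count=1, flags=re.DOTALL)

# Hypothetical topic text with an accented first letter.
topic = "---+!! Übersicht\n---++ Internal documents\n"

# ASCII ranges treat 'Ü' as a non-alphanumeric character and skip it.
ascii_only = first_heading(topic, r"[^a-zA-Z0-9]")

# A Unicode-aware class keeps the accented letter.
unicode_aware = first_heading(topic, r"[\W_]")

print(ascii_only)     # bersicht
print(unicode_aware)  # Übersicht
```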
I made some attempts in the Sandbox. If you (or somebody else) have time, please have a look!
SandboxAndreas
--
AndreasEllguth - 07 Apr 2010
Thank you for the very clear questions you wrote.
I have moved them into the Support web, because they are a nice series of questions that could be useful to other users. I hope you don't mind.
NewlinesAndFormattedSearch
--
PaulHarvey - 07 Apr 2010