What is the meaning of a stop word in Endeca?
Stop words are words that are ignored if an application user includes them as part of a search. Typically,
common words like “the”, “and”, “a”, and so on are included in the stop word list.
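To illustrate the effect, here is a minimal Python sketch of stop word filtering. It is not Endeca's implementation: the STOP_WORDS set and the strip_stop_words helper are hypothetical names used only for illustration, and whitespace tokenization with case-insensitive matching is an assumption.

```python
# Illustrative only: a hypothetical stop word list, not an Endeca default.
STOP_WORDS = {"the", "and", "a", "of"}


def strip_stop_words(query, stop_words=STOP_WORDS):
    """Return the query terms that remain after stop words are dropped."""
    # Assumption: terms are split on whitespace and compared in lower case.
    return [term for term in query.split() if term.lower() not in stop_words]


if __name__ == "__main__":
    print(strip_stop_words("The history of the camera"))
```

In this sketch a search for "The history of the camera" is reduced to the terms "history" and "camera" before matching.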
How to add stop words in a CAS-based application?
Step 1 : Open the application-specific stop word configuration file.
This file is located at <Application Directory>/config/mdex/<Application_Name>.stop_words.xml
For example, for a Store application installed under /opt/app/endeca/apps/, the file is /opt/app/endeca/apps/Store/config/mdex/Store.stop_words.xml
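The path pattern from Step 1 can be sketched as a small Python helper. The function name stop_words_path is hypothetical, and the sketch assumes the application directory ends with the application name, as in the Store example.

```python
from pathlib import PurePosixPath


def stop_words_path(app_dir, app_name):
    """Build <Application Directory>/config/mdex/<Application_Name>.stop_words.xml."""
    return PurePosixPath(app_dir) / "config" / "mdex" / f"{app_name}.stop_words.xml"


if __name__ == "__main__":
    # Store application installed under /opt/app/endeca/apps/
    print(stop_words_path("/opt/app/endeca/apps/Store", "Store"))
```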
Step 2 : Add stop words.
By default, no stop words are configured. The following example adds four stop words.
============================================================
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE STOP_WORDS SYSTEM "stop_words.dtd">
<STOP_WORDS>
<STOP_WORD>of</STOP_WORD>
<STOP_WORD>the</STOP_WORD>
<STOP_WORD>how</STOP_WORD>
<STOP_WORD>when</STOP_WORD>
</STOP_WORDS>
============================================================
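If the list is maintained elsewhere, the file content can be generated instead of edited by hand. The following is a minimal Python sketch, assuming the file needs only the XML declaration, the DOCTYPE line, and one STOP_WORD element per word, as in the example above. The name render_stop_words is hypothetical.

```python
from xml.sax.saxutils import escape

# Header lines copied from the example stop word file.
HEADER = (
    '<?xml version="1.0" encoding="UTF-8" standalone="no" ?>\n'
    '<!DOCTYPE STOP_WORDS SYSTEM "stop_words.dtd">\n'
)


def render_stop_words(words):
    """Return stop word file content for the given words, keeping their order."""
    lines = ["<STOP_WORDS>"]
    seen = set()
    for word in words:
        if word in seen:  # skip repeated entries
            continue
        seen.add(word)
        lines.append(f"<STOP_WORD>{escape(word)}</STOP_WORD>")
    lines.append("</STOP_WORDS>")
    return HEADER + "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_stop_words(["of", "the", "how", "when"]), end="")
```

Writing the returned string to the file from Step 1 is left to the caller, so that an existing file is not overwritten by accident.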
Step 3 : Run a baseline update so that the new stop words take effect.
Important Points :
1. Words added to the stop word list are not expanded by other Endeca features such as stemming and
thesaurus. If you set the word “item” as a stop word, its plural form “items” is not
automatically marked as a stop word. If you want both forms on the stop word list, you must add
them individually.
2. Stop words must be single words and cannot contain any non-searchable characters. If more
than one word is entered as a stop word, neither the individual words nor the combined phrase
acts as a stop word. Non-searchable characters within a stop word cause the same behavior: entering
“full-book” as a stop word is treated as if you had entered “full book”, and has no effect
on searches.
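Both points can be checked before the file is edited. The Python sketch below is a rough pre-check, not Endeca's own validation. It assumes that only letters and digits are searchable, so an application that configures additional search characters would need a wider test. The helper names are hypothetical.

```python
def is_valid_stop_word(candidate):
    """Return True if candidate is a single word of letters and digits only."""
    # Assumption: every non-alphanumeric character is non-searchable.
    return candidate.isalnum()


def split_stop_words(candidates):
    """Separate usable stop words from entries that would have no effect."""
    accepted, rejected = [], []
    for word in candidates:
        (accepted if is_valid_stop_word(word) else rejected).append(word)
    return accepted, rejected


if __name__ == "__main__":
    # "item" and "items" are listed separately because stop words are not stemmed.
    accepted, rejected = split_stop_words(["item", "items", "full-book", "full book"])
    print("accepted:", accepted)
    print("rejected:", rejected)
```

Here “item” and “items” are both accepted because each was listed explicitly, while “full-book” and “full book” are rejected.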