Perceptive Search entity detection uses a combination of heuristic and dictionary-based techniques. The dictionary is called a "lexicon". Perceptive Search comes with a pre-defined lexicon, developed at the Perceptive Search research labs to perform well across a wide variety of scenarios without the need for user configuration.
However, you might have some specific requirements and wish to augment the standard lexicon.
The standard lexicon is in a compressed and encrypted file, called ISYS.ELX. The file resides in your Perceptive Search executable directory, usually C:\Program Files\ISYS 9. You may create your supplementary lexicon in a plain text file, called ISYS.ELX2, also placed in the Perceptive Search executable directory. This supplementary lexicon will apply globally to all your indexes. Alternatively, place your ISYS.ELX2 file in the same directory as your index (alongside your ISYS.CFG file), for index-level supplementary lexicons. You can also combine the two approaches.
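As a rough illustration of the two locations described above, the following Python sketch reports which supplementary lexicon files are actually present. The function name and its parameters are hypothetical and are not part of Perceptive Search; the sketch only checks for files and makes no claim about how the product combines global and index-level lexicons.

```python
from pathlib import Path

LEXICON_NAME = "ISYS.ELX2"  # supplementary lexicon file name from the text above


def find_supplementary_lexicons(executable_dir, index_dir=None):
    """Return the ISYS.ELX2 files present in the two documented locations.

    executable_dir: the Perceptive Search executable directory (global lexicon).
    index_dir: optional index directory, alongside ISYS.CFG (index-level lexicon).
    Both parameter names are assumptions made for this sketch.
    """
    candidates = [Path(executable_dir) / LEXICON_NAME]
    if index_dir is not None:
        candidates.append(Path(index_dir) / LEXICON_NAME)
    # Keep only the candidates that exist as regular files.
    return [path for path in candidates if path.is_file()]
```

Calling find_supplementary_lexicons(r"C:\Program Files\ISYS 9", index_dir) with your own index directory returns an empty list when no supplementary lexicon has been created yet.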
Any change to a lexicon file requires a full reindex, because every document must be re-read.
The ISYS.ELX2 file can be created using Windows Notepad or your favorite text editor.
It contains a series of entries in the following format:
[<EntityType>, <TriggerType>]
<Trigger>
<Trigger>
<EntityType> indicates which type of entity the lexicon entry is for, and may only be one of the following values:
<TriggerType> specifies the nature of the triggers to be listed in this section, and may only be one of the following values:
<Trigger> entries, listed one per line, are the actual triggers. Triggers may be single words or multi-word phrases. Diacritics are ignored, so, for example, a lexicon entry for the name René may be given as either Rene or René, and will match documents containing either form.
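The diacritic behavior can be pictured with a short Python sketch. This is only an illustration of the general idea using Unicode decomposition; it is an assumption, not a description of how Perceptive Search implements the matching internally, and the function names are hypothetical.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining marks removed, e.g. 'René' becomes 'Rene'."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep everything else unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_trigger(a: str, b: str) -> bool:
    """Compare two strings while ignoring diacritics (illustrative only)."""
    return fold_diacritics(a) == fold_diacritics(b)
```

Under this sketch, same_trigger("René", "Rene") is true, which mirrors the statement that either spelling in the lexicon matches either spelling in a document.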
The various sections of the lexicon may be given in any order. The file may be encoded in ANSI or UTF-8. Comment lines can be added by using an asterisk as the first character of a line. An example supplementary lexicon:
[Organization, Complete]
IED
GBRMPA
The Mighty Ducks
[Organization, Post]
Club
Inc
[Person, Component]
Twiggy
Hermione
Snape
Gilderoy
[Person, Complete]
Abu al Inc
[Person, Pre]
Mr
Fraulein
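Because a lexicon change requires a full reindex, it can be worth checking the file's structure first. The following Python sketch reads text in the format shown above and groups the triggers by section. It is a hypothetical helper, not a Perceptive Search tool: the name parse_lexicon is invented here, skipping blank lines is an assumption, and the entity and trigger type names are not validated.

```python
import re
from collections import defaultdict

# Section header of the form [<EntityType>, <TriggerType>].
HEADER = re.compile(r"^\[\s*([^,\]]+?)\s*,\s*([^,\]]+?)\s*\]$")


def parse_lexicon(text: str) -> dict:
    """Group triggers by their (entity_type, trigger_type) section."""
    sections = defaultdict(list)
    current = None
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Lines starting with an asterisk are comments; skipping blank
        # lines is an assumption of this sketch.
        if not line or line.startswith("*"):
            continue
        match = HEADER.match(line)
        if match:
            current = (match.group(1), match.group(2))
            sections[current]  # register the section even if it stays empty
            continue
        if current is None:
            raise ValueError(f"line {number}: trigger before any section header")
        sections[current].append(line)
    return dict(sections)
```

Reading the file with open(path, encoding="utf-8") and passing its contents to parse_lexicon gives a dictionary keyed by pairs such as ("Organization", "Complete"), which makes misplaced triggers easy to spot.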
Remember that the standard Perceptive Search lexicon is pre-populated, so you only need to supplement it if you are specifically interested in entities that have unusual names, or that are not detected by either the standard lexicon or the heuristics.