Adjusting the common word list

Common words are words that appear frequently in documents but convey little or no information. Typical examples are AND, THE, AT, IT, etc. Perceptive Search has a default list of 317 common words that it excludes from indexes in order to save disk space and to reduce unnecessary hits. The common words are stored in the ISYS.CWD file, which can be modified to suit your purposes.

When you perform a query, Perceptive Search skips over the common words when building the result.

Note: Because common words are not included in the index, they are sometimes also called "stop words".

Maintaining your Common Word List

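To get the most accurate results from your Perceptive Search searches, you may wish to modify the Common Word List. You may find that some words on the default list are significant in your documents, or that other words occur so often that they should be treated as common.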

For example: in an IT context, C is a programming language. To find occurrences of this in a document, you may enter a query such as C // program. If the letter C is in the Common Word List, Perceptive Search will only retrieve the word program. In this instance, removing the letter C from the common words file would allow the query to execute as intended.
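The effect described above can be illustrated with a minimal Python sketch. It is not Perceptive Search code: the function name effective_terms, the in-memory word sets, and the case-insensitive comparison are all assumptions made for illustration.

```python
def effective_terms(query_terms, common_words):
    """Return the query terms that remain after common words are skipped.

    Hypothetical helper: comparison is case-insensitive by assumption.
    """
    common = {word.upper() for word in common_words}
    return [term for term in query_terms if term.upper() not in common]


terms = ["C", "program"]

# With C in the common word list, only "program" is left to search for.
print(effective_terms(terms, {"AND", "THE", "AT", "IT", "C"}))  # ['program']

# After removing C from the list, both terms take part in the query.
print(effective_terms(terms, {"AND", "THE", "AT", "IT"}))  # ['C', 'program']
```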

A good way to select which words to add to the Common Word List is to use the Word Frequency report to list the most frequently occurring words. Generally, words that account for 1% or more of the vocabulary are good candidates. It is unlikely that you will need to make all of these words common, but the report is a good indication of which words to consider.
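The idea behind the 1% guideline can be sketched in Python as follows. This does not reproduce the Word Frequency report; it assumes that "1% of the vocabulary" means 1% of all word occurrences in the text, and the function name common_word_candidates and the simple letters-only tokenizer are hypothetical.

```python
import re
from collections import Counter


def common_word_candidates(text, threshold=0.01):
    """List (word, share) pairs whose share of all word occurrences
    is at least `threshold`, most frequent first.

    Assumption: a "word" is a run of letters or apostrophes, compared
    in upper case.
    """
    words = re.findall(r"[A-Za-z']+", text.upper())
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return [
        (word, count / total)
        for word, count in counts.most_common()
        if count / total >= threshold
    ]


sample = "the cat sat on the mat and the dog sat too"
# A high threshold is used here only because the sample is tiny.
for word, share in common_word_candidates(sample, threshold=0.15):
    print(f"{word}: {share:.0%}")
```

The output is only a list of candidates. As noted above, each word still needs to be reviewed before it is added to the list.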

You must perform a Reindex for any changes made to the Common Word List to take effect.


Adding a common word

You can add any word you like to the Common Word List, so that Perceptive Search will ignore it when it is included in a query. To add a common word:

  1. Select Tools > Common Word List from the menu
  2. Click the Add button. The Add Common Word window is shown
  3. Enter the word you wish to be common
  4. Click the OK button to save the list

Deleting a common word

You can remove any word from the Common Word List, so that Perceptive Search will recognize it when it is included in a query. To delete a common word from the list:

  1. Select Tools > Common Word List from the menu
  2. Select the word you wish to remove
  3. Click the Delete button
  4. Click the OK button to save the list

Editing a common word

You can edit any word that appears in the Common Word List. To change a common word:

  1. Select Tools > Common Word List from the menu
  2. Select the word you wish to edit
  3. Click the Edit button
  4. Make your changes
  5. Click the OK button to save the list

Making common words index-specific

By default, common words apply to all indexes. If you want to apply a special list of common words to a particular index, you need to do the following:

  1. Open the Perceptive Search program directory (the default location is C:\Program Files\Perceptive Enterprise Search) and locate the ISYS.CWD file
  2. Copy the ISYS.CWD file from the program directory into the desired index directory

When creating your indexes, Perceptive Search uses the ISYS.CWD file in the index directory if it exists; otherwise it uses the one in the program directory.
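The lookup order described above (index directory first, program directory as fallback) can be sketched in Python. This is an illustration only, not how Perceptive Search is implemented: the function names resolve_common_word_file and make_index_specific are hypothetical, and the demo uses temporary directories with a placeholder file, not a real installation or a real ISYS.CWD.

```python
import shutil
import tempfile
from pathlib import Path

CWD_NAME = "ISYS.CWD"


def resolve_common_word_file(index_dir, program_dir):
    """Prefer the index directory's copy; fall back to the program directory."""
    index_copy = Path(index_dir) / CWD_NAME
    if index_copy.is_file():
        return index_copy
    return Path(program_dir) / CWD_NAME


def make_index_specific(index_dir, program_dir):
    """Copy the program-wide file into the index directory (the copy step above)."""
    source = Path(program_dir) / CWD_NAME
    return Path(shutil.copy2(source, Path(index_dir) / CWD_NAME))


# Demo with throwaway directories standing in for the real locations.
with tempfile.TemporaryDirectory() as tmp:
    program_dir = Path(tmp) / "program"
    index_dir = Path(tmp) / "index"
    program_dir.mkdir()
    index_dir.mkdir()
    # Placeholder content: the real file format is not shown here.
    (program_dir / CWD_NAME).write_text("placeholder")

    before = resolve_common_word_file(index_dir, program_dir).parent.name
    make_index_specific(index_dir, program_dir)
    after = resolve_common_word_file(index_dir, program_dir).parent.name

print(before, after)  # program index
```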