Documents Options
These options determine how documents are indexed. The settings apply to all
email attachments, SQL BLOBs, Lotus Notes attachments, and other non-file-system
objects. These options can be overridden for File and FTP indexing rules.
Presentation
- Standard - Document text and properties only, maximum speed
Select this option to achieve the fastest indexing and browsing speeds possible
within your environment. When you browse a document, Perceptive Search uses its own
internal viewers, which allow for multiple hit colors.
- HiDef - Original document layout and formatting, maximum quality
Select this option if you want Perceptive Search to attempt to render your legacy documents,
such as Microsoft Word, Excel, and WordPerfect files, as HTML. This presentation is richly
formatted, but will not be identical to the original document.
Formatting
- Plain
Selecting this option from the drop-down list will use the default document options
for the indexing rule.
- Contains Ventura desk-top publishing paste-up markers
If you use the Ventura Publisher desktop publishing package and the documents covered
by this rule may contain Ventura paste-up markers, select this option from the drop-down
list. Perceptive Search will then correctly interpret the special markers that Ventura inserts
into documents, wherever they may occur.
- Every line has hard return. Two hard returns indicate paragraph ends.
Use this option if your documents are in a word-processor format in which every
line ends with a hard return and every paragraph ends with two consecutive hard
returns.
- Entire document is double-spaced. Three hard returns indicate paragraph ends.
Use this option if your documents are in a word-processor format, are completely
double-spaced and every paragraph ends with three consecutive hard returns.
The preceding two options apply only to documents in a word-processor format that
has the concept of a hard return. ASCII, by comparison, has no concept of
paragraphs, so it is normal for every line to end with a hard return.
- Document contains lines formatted wider than 78 characters
By default, the Perceptive Search browser automatically wraps wide documents to a right margin
of 76 characters for maximum browsing performance. Select this option if you need
to browse highly formatted, wide ASCII material, such as 132-column mainframe report
files. You can also compensate for wide documents by using the Quality presentation
option, or by amending the ISYS.CFG file.
- Documents should be interpreted as OEM (DOS ASCII) character set
In most situations with English-style languages, there is no real difference between
ASCII files (as usually generated by DOS programs), and ANSI files (as usually generated
by Windows programs). By default, Perceptive Search assumes ASCII files have been created using
the ASCII character set. If you have ASCII files that are actually coded in ANSI,
and if this distinction matters in your language, use the "ANSI" option
when creating your indexing rules.
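
To illustrate why the OEM/ANSI distinction only matters outside plain English text, here is a minimal Python sketch. It is not Perceptive Search code. It assumes code page 437 as a representative OEM (DOS) character set and Windows-1252 as a representative ANSI character set; the code pages actually in use depend on the system locale, and the function name decode_both is hypothetical.

```python
def decode_both(raw):
    """Decode the same bytes as OEM and as ANSI text.

    Assumption: cp437 stands in for "OEM (DOS ASCII)" and cp1252 for "ANSI".
    A few byte values are undefined in cp1252, so they are replaced rather
    than raising an error.
    """
    as_oem = raw.decode("cp437")
    as_ansi = raw.decode("cp1252", errors="replace")
    return as_oem, as_ansi


if __name__ == "__main__":
    # Bytes below 0x80 are plain ASCII and decode identically either way.
    print(decode_both(b"plain text"))

    # Byte 0x82 is an accented "e" in cp437 but a low quotation mark in cp1252.
    oem, ansi = decode_both(b"caf\x82")
    print(repr(oem), repr(ansi))
```

Text made up only of 7-bit characters is unaffected by the setting, which matches the observation above that English-language documents rarely show a difference.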
Extended Options
- This section allows you to set any extended options to be applied
during indexing.
- For example, you can set the number of index worker processes created during indexing
by adding the option indexworkers, whose value is the number of workers to use in addition
to the indexer itself.
- By default (or if the option is not set), the number of workers equals the number of cores
minus 2 (with a minimum of 1). Setting a value here overrides that default.
- A value of 0 means that no workers are created and that indexing is done
within a single process.
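
The worker-count rule described above can be sketched in Python as follows. This is only an illustration of the arithmetic, not Perceptive Search's implementation; the function names and the use of None to mean "option not set" are assumptions made for the example.

```python
import os


def default_index_workers(cores):
    """Default from the text above: cores minus 2, but never fewer than 1."""
    return max(1, cores - 2)


def effective_index_workers(configured, cores):
    """Return the number of worker processes that would be used.

    configured is the hypothetical value of the indexworkers option,
    or None when the option is not set.
    """
    if configured is None:
        return default_index_workers(cores)
    if configured < 0:
        raise ValueError("indexworkers must be 0 or greater")
    # 0 is a valid explicit value: no workers, indexing runs in one process.
    return configured


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print("cores:", cores)
    print("default workers:", effective_index_workers(None, cores))
    print("indexworkers=0:", effective_index_workers(0, cores))
```

Under this rule, an 8-core machine gets 6 workers by default, while 1-core and 2-core machines both get the minimum of 1.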