Resources

The resources use a data model supported by a set of client-side libraries, which are available on the files and libraries page.

A WADL document describing the resources API is also available.
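
A WADL document can be read programmatically to list each resource path with its HTTP methods. The following is a minimal sketch, not part of the product: the sample WADL text is hand-written for illustration (modeled on two paths from the table below), the base URL in it is hypothetical, and `list_resources` is an assumed helper name. Only the WADL namespace URI is standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C-submitted WADL 2009/02 schema.
WADL_NS = {"wadl": "http://wadl.dev.java.net/2009/02"}

# Hand-written sample for illustration only; the real WADL document
# served by the application will be larger and may be structured differently.
SAMPLE_WADL = """
<application xmlns="http://wadl.dev.java.net/2009/02">
  <resources base="http://localhost:8080/rest">
    <resource path="/streamset">
      <method name="PUT"/>
    </resource>
    <resource path="/streamset/{id}">
      <method name="DELETE"/>
      <method name="HEAD"/>
    </resource>
  </resources>
</application>
"""


def list_resources(wadl_text):
    """Map each <resource> path to the sorted names of its direct <method> children.

    Nested <resource> elements are reported under their own path attribute;
    this sketch does not join them with their parent's path.
    """
    root = ET.fromstring(wadl_text.strip())
    result = {}
    for resource in root.iterfind(".//wadl:resource", WADL_NS):
        methods = [m.get("name") for m in resource.findall("wadl:method", WADL_NS)]
        result[resource.get("path")] = sorted(methods)
    return result


if __name__ == "__main__":
    for path, methods in list_resources(SAMPLE_WADL).items():
        print(path, " ".join(methods))
```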

name path methods description
BatchStreamSetService
  • /streamset (PUT)
  • /streamset/{id} (DELETE, HEAD)
  • /streamset/{id}/document (POST)
  • /streamset/{id}/fields (GET, POST)
  • /streamset/{id}/utf8 (GET, PUT)
  • /streamset/{id}/file/cbm (GET, HEAD, POST)
  • /streamset/{id}/file/ptb (GET, HEAD, POST)
  • /streamset/{id}/file/zip (GET, HEAD)
Creation and management of batch stream sets.
FieldExtractorService
  • /extractor (PUT)
  • /extractor/{id} (DELETE, HEAD)
  • /extractor/{id}/extract (POST)
  • /extractor/{id}/fields (GET, POST)
  • /extractor/{id}/fieldtargets (POST)
  • /extractor/{id}/learn (POST)
  • /extractor/{id}/relearn (POST)
  • /extractor/{id}/usePositionalInformation (GET, PUT)
  • /extractor/{id}/utf8 (GET, PUT)
  • /extractor/{id}/file/extractor (GET, HEAD, POST)
  • /extractor/{id}/streamset/{streamSetId}/learn (GET)
  • /extractor/{id}/streamset/{streamSetId}/extract/{docNum} (GET)
Training of field extractors and extraction of fields.
LearnSetManagerService
  • /learnset/project (GET, PUT)
  • /learnset/ocr/available (GET)
  • /learnset/project/{projectId} (DELETE, GET, PUT)
  • /learnset/project/{projectId}/check (GET)
  • /learnset/project/{projectId}/class (GET, PUT)
  • /learnset/project/{projectId}/classfields (GET)
  • /learnset/project/{projectId}/docs (GET, POST)
  • /learnset/project/{projectId}/fields (GET, POST)
  • /learnset/project/{projectId}/isLearnable (GET)
  • /learnset/project/{projectId}/learn (GET)
  • /learnset/project/{projectId}/numDocs (GET)
  • /learnset/project/{projectId}/testset (POST, PUT)
  • /learnset/project/{projectId}/updateStreamSet (HEAD)
  • /learnset/project/{projectId}/upload (POST)
  • /learnset/project/{projectId}/check/extraction (GET)
  • /learnset/project/{projectId}/check/fieldtargets (GET)
  • /learnset/project/{projectId}/class/{classId} (DELETE, GET, PUT)
  • /learnset/project/{projectId}/fields/statistics (GET)
  • /learnset/project/{projectId}/testset/{testSetId} (DELETE, GET, HEAD, POST)
  • /learnset/project/{projectId}/upload/{uploadId} (DELETE, GET)
  • /learnset/project/{projectId}/class/{classId}/doc (GET, POST)
  • /learnset/project/{projectId}/class/{classId}/fields (GET, POST)
  • /learnset/project/{projectId}/doc/oldest/{noOfDocToBeDeleted} (DELETE)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId} (DELETE, GET)
  • /learnset/project/{projectId}/class/{classId}/doc/oldest/{noOfDocToBeDeleted} (DELETE)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/check (GET)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/class (PUT)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/fields (GET, POST)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/image (GET, POST)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/meta (GET)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/pageCnt (GET)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/pos (GET, POST)
  • /learnset/project/{projectId}/testset/{testSetId}/doc/{docId}/class (POST)
  • /learnset/project/{projectId}/testset/{testSetId}/doc/{docId}/extract (GET)
  • /learnset/project/{projectId}/testset/{testSetId}/doc/{docId}/image (GET)
  • /learnset/project/{projectId}/testset/{testSetId}/doc/{docId}/meta (GET)
  • /learnset/project/{projectId}/testset/{testSetId}/doc/{docId}/pos (GET)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/check/extraction (GET)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/check/fieldtargets (GET)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/locations/fields (GET)
  • /learnset/project/{projectId}/class/{classId}/doc/{docId}/locations/value (GET)
  • /learnset/project/{projectId}/testset/{testSetId}/doc/{docId}/locations/value (GET)
  • /learnset/project/{projectId}/testset/{testSetId}/doc/{docId}/extract/forceLearning/{learn} (GET)
LearnsetNoTouchModeService
  • /no-touch-mode/learnset/add-staging-data (POST)
  • /no-touch-mode/learnset/project/doc (POST)
LearnsetSchedulerService
  • /learnset/projects/learn-scheduler (GET)
  • /learnset/projects/learn-scheduler/start (POST)
  • /learnset/projects/learn-scheduler/{schedulerId} (DELETE)
  • /learnset/projects/learn-scheduler/{schedulerName} (GET)
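
Many of the paths listed above are templates with placeholders such as {projectId} or {docId}. As a rough sketch of how a client might turn a template into a concrete URL, assuming a hypothetical base URL and the hypothetical helper names `expand_path` and `resource_url` (neither belongs to the client-side libraries mentioned earlier):

```python
import re
from urllib.parse import quote

# Matches placeholders such as {projectId} or {docId}.
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def expand_path(template, **params):
    """Replace each {name} in the template with the URL-encoded value of params[name]."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing path parameter: {name}")
        # safe="" also encodes "/" so a value cannot introduce extra path segments.
        return quote(str(params[name]), safe="")

    return _PLACEHOLDER.sub(substitute, template)


def resource_url(base_url, template, **params):
    """Join a base URL (hypothetical; depends on the deployment) with an expanded path."""
    return base_url.rstrip("/") + expand_path(template, **params)


if __name__ == "__main__":
    print(resource_url(
        "http://localhost:8080/rest",  # assumed base URL, not from the documentation
        "/learnset/project/{projectId}/class/{classId}/doc/{docId}",
        projectId=7, classId="invoice", docId=42,
    ))
```

Encoding every parameter value keeps identifiers that contain spaces or slashes from altering the path structure; a missing parameter raises an error instead of leaving a literal placeholder in the URL.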

Data Types

JSON

type description
BoundingBox Container to carry the positional information of a word.
Candidate Container to carry the candidate related properties for inserting into the staging table.
ClassFieldDeclaration Container to carry the information of fields of the Document Class.
DataCell Simple container to carry extracted data string for a single cell. Used by ExtractedData.
DocumentAdapter Carrier for document information.

When training an extractor, make sure to fill the word list, the page list and the field list. When extracting fields from a document you only need to fill the word list and page list.

DocumentClass Information about a document class.
DocumentHeaderField Container to carry the corrected header fieldName and header fieldValue, as part of a request.
DocumentImportStatus Container to carry the document import status information.
DocumentUploadStatus Container to carry the upload status information of the documents. It tracks the total number of documents, the number of imported documents, and their status.
ExtendedClassFieldStatistics Field statistics for a single class, covering how many values have been found at all for a field within that class and for how many of those values a target can be located.
ExtractedData Contains the extraction result for a single field. There are usually multiple candidates which are provided as a list of DataCell.
FieldDeclaration Declaration of a field that can be extracted.
FieldInfo Container to carry field information.
FieldLocations Container to carry the information of word locations respective to the field.
FieldStatistics Basic field statistics, covering how many values exist for a given field in a project or class.
LearnsetDocumentAddRequest Container to carry all the properties required for the request payload that adds documents from the staging tables to the learnset tables.
LearnsetSchedulerProperties Container to carry the properties of a global scheduler.
PageInfo Container to carry the page orientation and positional information.
Project Container to carry the information of a Project created in the ALM application.
StagingDataAddRequest Container to carry all the properties required for the request payload that adds document data to the staging tables.
TmpALMDocument Container to carry all the required properties for populating TMPALMDOCUMENT table's data.
TmpALMField Container to carry all the required properties for populating TMPALMFIELDS table's data.
TrainingDocumentIncident Description of a failed plausibility check on a training document.
TrainingDocumentMetaData Metadata about a stored training document.
TrainingSetCheckResult Result of a training set plausibility check, including found incidents and field statistics.
WordInfo Container to carry word information.
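
The DocumentAdapter note above says that training requires the word list, page list and field list, while extraction requires only the word list and page list. A client could check this before sending a request. The sketch below is illustrative only: the JSON property names `words`, `pages` and `fields`, the sample values, and the helper `missing_lists` are all assumptions, since the actual DocumentAdapter property names are not given here.

```python
# Hypothetical property names; check the DocumentAdapter type for the real ones.
REQUIRED_LISTS = {
    "train": ("words", "pages", "fields"),  # training needs all three lists
    "extract": ("words", "pages"),          # extraction needs words and pages only
}


def missing_lists(document, purpose):
    """Return the required list names that are absent or empty for the given purpose."""
    required = REQUIRED_LISTS[purpose]
    return [name for name in required if not document.get(name)]


if __name__ == "__main__":
    # Made-up sample document with a word list and a page list but no field list.
    doc = {"words": [{"text": "Invoice"}], "pages": [{"number": 1}]}
    print(missing_lists(doc, "extract"))
    print(missing_lists(doc, "train"))
```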