The Transform Service handles the essential transforms for content such as Microsoft Office documents, images, and PDFs. Target formats include PNG for thumbnails, and PDF and JPEG for downloads and previews.
The main components of the Transform Service are:
- Content Repository (ACS): This is the repository where documents and other content reside. The repository produces and consumes events destined for the message broker (such as ActiveMQ or Amazon MQ). It also reads and writes documents to the shared file store.
- ActiveMQ: This is the message broker (either a self-managed ActiveMQ instance or Amazon MQ), where the repository and the Transform Router send image transform requests and responses. These JSON-based messages are then passed to the Transform Router.
- Transform Router: The Transform Router supports simple (single-step) and pipeline (multi-step) transforms, which it passes to the Transform Engines. The Transform Router and the Transform Engines run as independently scalable Docker containers.
- Transform Engines: The Transform Engines transform files referenced by the repository and retrieved from the shared file store. Here are some example transformations for each Transform Engine (this is not an exhaustive list):
  - LibreOffice (e.g. docx to pdf)
  - ImageMagick (e.g. resize)
  - Alfresco PDF Renderer (e.g. pdf to png)
  - Tika (e.g. docx to plain text)
  - Misc. (not included in diagram)
- Shared File Store: This is used as temporary storage for the original source file (stored by the repository), intermediate files for multi-step transforms, and the final transformed target file. The target file is retrieved by the repository after it's been processed by one or more of the Transform Engines.
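The difference between a simple and a pipeline transform can be illustrated with a minimal sketch. It assumes a hypothetical capability table built only from the example transforms listed above (the `png` to `jpg` entry for ImageMagick is an added assumption), and it uses a plain breadth-first search to find a route. This is not the Transform Router's actual configuration or routing algorithm; the names `ENGINE_TRANSFORMS` and `plan_transform` are invented for illustration.

```python
from collections import deque

# Hypothetical capability table: (source extension, target extension) -> engine.
# Entries mirror the example transforms in the list above; png -> jpg is assumed.
ENGINE_TRANSFORMS = {
    ("docx", "pdf"): "LibreOffice",
    ("pdf", "png"): "Alfresco PDF Renderer",
    ("docx", "txt"): "Tika",
    ("png", "jpg"): "ImageMagick",
}


def plan_transform(source, target):
    """Return a list of (engine, source, target) steps, or None if no route exists.

    A one-element list is a simple (single-step) transform.
    A longer list is a pipeline (multi-step) transform, where each
    intermediate file would be kept in the shared file store.
    """
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        current, steps = queue.popleft()
        if current == target and steps:
            return steps
        for (src, tgt), engine in ENGINE_TRANSFORMS.items():
            if src == current and tgt not in visited:
                visited.add(tgt)
                queue.append((tgt, steps + [(engine, src, tgt)]))
    return None


if __name__ == "__main__":
    print(plan_transform("docx", "pdf"))  # single step
    print(plan_transform("docx", "png"))  # two-step pipeline
```

In this sketch, `docx` to `pdf` resolves to one LibreOffice step, while `docx` to `png` resolves to a LibreOffice step followed by an Alfresco PDF Renderer step, with the intermediate PDF playing the role of the intermediate file described for the shared file store.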
The following diagram shows a simple representation of the Transform Service components:
Note that from Transform Service version 1.3.2, the metadata extraction that previously took place in the core repository's legacy transform engines has been moved into the separate transform engine processes. This enables metadata extraction to be scaled.
The diagram shows an example of how you can deploy into AWS, using a number of managed services:
- Amazon EKS - Amazon Elastic Kubernetes Service
- Amazon MQ - Managed message broker service for Apache ActiveMQ
- Amazon EFS - Amazon Elastic File System
You can replace the AWS services (EKS, MQ, and EFS) with a self-managed Kubernetes cluster, ActiveMQ (configured with failover), and a shared file store, such as NFS.
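For a self-managed ActiveMQ setup configured with failover, clients typically connect using ActiveMQ's failover transport URI, which lists the candidate brokers in parentheses. The following is a minimal sketch of building such a URI. The host names are placeholders, the helper name `failover_uri` is invented for illustration, and port 61616 is ActiveMQ's default OpenWire port, which may differ in your environment.

```python
def failover_uri(brokers, options=None):
    """Build an ActiveMQ failover transport URI.

    brokers: list of (host, port) tuples for the self-managed ActiveMQ nodes.
    options: optional dict of failover transport options, e.g. {"timeout": 3000}.
    """
    if not brokers:
        raise ValueError("at least one broker is required")
    # Each broker becomes a tcp:// URI; the failover transport tries them in turn.
    members = ",".join(f"tcp://{host}:{port}" for host, port in brokers)
    uri = f"failover:({members})"
    if options:
        uri += "?" + "&".join(f"{key}={value}" for key, value in options.items())
    return uri


if __name__ == "__main__":
    # Placeholder host names; replace with your own broker addresses.
    print(failover_uri([
        ("mq-1.example.internal", 61616),
        ("mq-2.example.internal", 61616),
    ]))
```

The resulting string would be supplied wherever your deployment configures the broker URL for the repository, the Transform Router, and the Transform Engines.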
Note: For more detailed representations of the Alfresco Content Services deployment (including the Transform Service), see the GitHub Docker Compose and Helm documentation.
The advantage of using Docker containers is that they provide a consistent environment for development and production. They also allow applications to run as part of a microservice architecture, which means you can upgrade an individual service with limited impact on other services.