Alfresco Content Services Version 7: The Features We're Excited About!
By Steve Stott – Principal Architect, Content Management Solutions
In April 2021 Hyland/Alfresco released the latest version of Alfresco Content Services (ACS), 7.0, along with a number of new or improved modules and tools. If you’re a Technology Executive or Architect, you’ll want to know how significant this release is to you and what its impact might be on your organization.
This article focuses on what ClearCadence considers to be the top features announced by Hyland at their launch webinar on April 13, 2021. I’ll be providing overviews of each and describing the business benefits to aid you in answering the question “Does this release deliver enough added value for me to choose ACS, or to upgrade my current ACS implementation?”
As a company, Alfresco has a track record of innovation in their feature-rich, open-source, digital business platform releases. Many would agree that versions 5.2 and 6.0 saw the biggest leaps in platform functionality – providing more and more rich core content management features, greatly expanded APIs, UI development options, and improvements in system operations. Happily, with Alfresco now being part of Hyland, this trend continues. With ACS 7.0 we see, of course, updates to core content management capabilities, point enhancements to APIs, an expansion of the Application Development Framework (ADF), and more of the natural improvements you’d expect in a maturing platform.
However, I feel the most exciting announcements in this release are for features that impact SysOps in their managing of deployments, system performance, and operating at scale.
That’s not to say that developers have been neglected – there’s now an important alternative to the traditional methods for platform customization and extension, and it’s one that Architects and Technology Executives can get excited about, too.
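So, what are we covering in this article? Four areas: simpler deployment options, performance at scale, search with Elasticsearch, and the new Alfresco SDK 5.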
Those of us old enough to remember the bygone days of ACS 5.x and earlier (it all seems so long ago!) often look back with affection to the ACS installer wizard. We stroke our gray beards and sigh for simpler times.
The release of ACS 6 came at a time when containerized deployment technologies matured, with increases in adoption rates across IT organizations. This was perfectly timed to coincide with the shift from ACS as an all-in-one application to a collection of services and modules comprising a platform. Deployment decisions could be based on preferred infrastructure choices.
Alfresco focused installation and deployment innovations on popular technologies such as Docker and Kubernetes. There was no longer an installer available for download with the platform. For anyone not implementing a containerized deployment approach, the alternative was a semi-manual implementation from zip files, with more product setup and the need to separately download, install, and configure versions of supporting products such as Java, ActiveMQ, and LibreOffice.
What, then, does the latest ACS 7 release bring?
Hyland/Alfresco has recognized the significant appetite many organizations have for simpler, non-containerized deployment options. Specifically, support has been introduced for Ansible playbooks, both for ACS 7.0 going forward and for 6.2.x. A set of sample Ansible playbooks for Linux-based implementations is now available, which goes a long way toward simplifying the installation process in non-containerized distributions. Ansible-based deployment is also now included in the ACS support documentation.
In addition to fully embracing Ansible as a deployment option, Hyland/Alfresco has also provided enhanced support for Docker Compose and Helm charts.
What this means for your IT organization is that you have the flexibility to choose a deployment architecture that best meets your infrastructure choices, skill set, and scale, ultimately resulting in an accelerated time to value for your implementation and streamlined maintenance processes going forward.
ACS has been proven to operate very effectively at scale. It’s been a number of years since the platform broke through the billion-document benchmark, and there are implementations in production today operating very successfully at and above this volume.
So, if it already can be demonstrated to work at scale, what would this latest release need to address? Well, there’s a big difference between ‘working’ at scale and being inherently suited to perform at scale.
Experience has shown that supporting such high volumes of content records in terms of storage, ingestion, retrieval, and consumption requires appropriate application configuration and customization, careful selection of infrastructure elements, optimization of database and storage performance, and specialization in maintenance processes.
The improvements announced in ACS 7 go a long way in providing the features that allow the platform to scale up naturally. In addition, they enhance performance and stability across all levels of implementation, irrespective of scale. So what are these features?
A New Query Accelerator
This enterprise-only feature optimizes selected queries in very large content repositories by allowing the definition of ‘Query Sets’. These are sets of properties and aspects for which faster queries can be executed. This is achieved with updates to database tables and indexes, which allow for improved performance versus traditional ACS queries against Solr and transactional metadata.
I can see this feature proving significant in providing enhanced response times, especially in turnkey processes relying on predictable, repetitive queries to return result sets, irrespective of scale.
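To make the idea behind Query Sets a little more concrete, here is a minimal sketch of the general technique: keeping a dedicated, indexed table for a chosen set of properties alongside a generic property table. It is an illustration only. The table names, column names, and sample data are all hypothetical, and it does not reflect the actual ACS schema or the Query Accelerator configuration format.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding the same data in two layouts.

    Both tables are hypothetical and exist only for this illustration.
    """
    db = sqlite3.connect(":memory:")
    # Generic layout: one row per (node, property) pair.
    db.execute("CREATE TABLE node_props (node_id INTEGER, prop TEXT, value TEXT)")
    # 'Query set' layout: one row per node, one column per selected property.
    db.execute(
        "CREATE TABLE qs_invoice (node_id INTEGER PRIMARY KEY, customer TEXT, status TEXT)"
    )
    db.execute("CREATE INDEX idx_qs_invoice ON qs_invoice (customer, status)")
    rows = [(1, "acme", "paid"), (2, "acme", "open"), (3, "globex", "open")]
    for node_id, customer, status in rows:
        db.execute("INSERT INTO node_props VALUES (?, 'customer', ?)", (node_id, customer))
        db.execute("INSERT INTO node_props VALUES (?, 'status', ?)", (node_id, status))
        db.execute("INSERT INTO qs_invoice VALUES (?, ?, ?)", (node_id, customer, status))
    db.commit()
    return db


def find_generic(db, customer, status):
    """Answer the query with a self-join on the generic property table."""
    sql = """
        SELECT a.node_id
        FROM node_props a JOIN node_props b ON a.node_id = b.node_id
        WHERE a.prop = 'customer' AND a.value = ?
          AND b.prop = 'status' AND b.value = ?
        ORDER BY a.node_id
    """
    return [row[0] for row in db.execute(sql, (customer, status))]


def find_query_set(db, customer, status):
    """Answer the same query from the dedicated, indexed table."""
    sql = "SELECT node_id FROM qs_invoice WHERE customer = ? AND status = ? ORDER BY node_id"
    return [row[0] for row in db.execute(sql, (customer, status))]
```

Both functions return the same nodes. The difference is that the second one reads a single indexed table, which is the kind of trade-off (extra storage and maintenance in exchange for predictable lookups) that a feature like this appears to make.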
Finer Rendition Control
It’s long been recognized that the generation of content renditions in ACS is a powerful feature. But as Peter Parker came to understand as Spider-Man, ‘With great power comes great responsibility’. Generating renditions for significant volumes of content carries a processing overhead that shouldn’t be underestimated.
Consider the processing needed to generate thumbnail renditions in use-cases where several thousand documents are ingested hourly. Consider the extra storage needed for the image files. Consider the database entries created to reference them.
Previous versions of ACS have thumbnail generation enabled on upload for all content by default. But are thumbnails always necessary or relevant? Thumbnails of images are very useful in a gallery view, but why generate a thumbnail for every Word document or other textual piece of content? What if no one ever navigates through folders or lists of content records? Why generate thumbnails that no one will ever use?
Control over things such as thumbnail generation has been limited in previous ACS versions. Configuration allowed for broad suppression of renditions. For control at a more granular level customizations might be required.
Among the capabilities provided by ACS 7 are:
Selective disabling of renditions based on type of content
Ability to disable automatic thumbnail creation for CMIS uploads and uploads from certain ADF applications such as the Alfresco Digital Workspace
Selective disabling of renditions when ingesting large quantities of content
Changes to the treatment of renditions as discrete nodes in the repository
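As a rough illustration of what selective rendition control amounts to, the sketch below models a policy that skips thumbnails for chosen content types and during large ingestion batches. The class, its field names, and the threshold are hypothetical. They are not ACS property names or APIs.

```python
from dataclasses import dataclass, field


@dataclass
class RenditionPolicy:
    """Hypothetical policy object; not an ACS class or configuration format."""

    # Content types for which thumbnails are never generated.
    disabled_mimetypes: set = field(default_factory=set)
    # Batches of this size or larger skip thumbnail generation entirely.
    bulk_ingest_threshold: int = 1000

    def should_generate_thumbnail(self, mimetype: str, batch_size: int = 1) -> bool:
        if mimetype in self.disabled_mimetypes:
            return False
        return batch_size < self.bulk_ingest_threshold
```

A policy created with `disabled_mimetypes={"text/plain"}` would still produce thumbnails for a single uploaded image, but not for plain text files or for any file arriving in a batch of several thousand.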
The benefits in application performance, resource management, and ultimately operating costs should not be underestimated.
Improved Performance of Clean-Up Jobs
File this one under ‘Greater Platform Stability.’ Cross reference it with ‘Less stress for the Sys Admin’.
There are certain background processes and jobs which run, by default, on a schedule in ACS. On the whole, these tick along unnoticed, doing their important work day by day. They handle the cleanup of deleted content. They manage the buildup of temp content. And they “housekeep” property records in the database. And they work pretty well. That is, until they are faced with situations where the scale of the clean-up needed reaches unexpected levels.
Imagine the scenario where vast numbers of transactions are happening constantly. Content is ingested in the hundreds of thousands, and content is deleted or updated at the same rate. One or more aspects are added to or removed from swathes of content in a day, all part of a valid and effective business process. That part works like a charm.
However, when the time comes for the clean-up jobs to get to work, they could find themselves dealing with extreme query result sets, or processing vast numbers of records that they haven’t been optimized to handle. Processing times can therefore be excessive and resources stretched, potentially resulting in system performance hits or, heaven forbid, production outages.
Hyland/Alfresco have recognized this as a significant problem and have introduced updates to the back-end clean-up jobs that take into account potential peaks in activity.
Temp file cleanup now allows configuration of the maximum number of files per process run. In addition, a new properties clean-up job has been created that maintains its performance when processing upwards of 100M files.
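The value of capping the work done per run is easy to show in a few lines. The following is a minimal sketch of the general pattern, not the ACS job itself. The function name and its parameters are my own assumptions for illustration.

```python
import time
from pathlib import Path


def clean_temp_files(temp_dir, max_age_seconds, max_files_per_run, now=None):
    """Delete at most max_files_per_run files older than max_age_seconds.

    Returns the number of files removed in this run. Anything left over
    is picked up by the next scheduled run.
    """
    now = time.time() if now is None else now
    removed = 0
    for path in sorted(Path(temp_dir).iterdir()):
        if removed >= max_files_per_run:
            break  # cap reached; stop so a single run cannot overrun
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed += 1
    return removed
```

With a cap in place, a sudden backlog of temp files is worked off over several runs of bounded size rather than in one long, resource-hungry pass.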
This benefit can be filed under "Fewer surprises" and cross referenced with "A better night's sleep, IT Management!"
I could probably keep this section very brief and just say ‘Elasticsearch is here!’
The core search engine for ACS has been, and remains, Solr, and very efficient it is, too. However, indexing and search technologies continue to advance apace and the new kid on the block is maturing fast.
I won’t cover the architectural differences between Solr and Elasticsearch. There is, however, one way they differ, as implemented with ACS, that is of particular significance: eventual consistency versus near real-time indexing.
Simply put, Solr indexes are maintained based on what amounts to frequent polling of the ACS repository for updates. Typically, the polling frequency and efficiency of indexing mean that, as far as users and background processes are concerned, new content becomes searchable pretty rapidly. At times of great upload or update activity, there can be a backlog of transactions to process. Even then, this is seldom an issue. However, this approach has inherent inefficiencies which need to be taken into account when designing applications or performing large-scale sets of transactions such as mass migrations or metadata updates.
The Elasticsearch implementation with ACS, on the other hand, uses ActiveMQ messaging to pass updates from the repository to indexing when a transaction commits. This, along with great Elasticsearch indexing performance, means little or no waiting for indexing to catch up.
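The difference between the two approaches can be sketched in a small simulation. This is a toy model under stated assumptions: the class names are hypothetical, and a direct callback stands in for the message broker, whereas real messaging is asynchronous.

```python
class Repository:
    """Toy repository that records commits and notifies subscribers."""

    def __init__(self):
        self.transactions = []  # committed (txn_id, doc) pairs, in order
        self.subscribers = []   # callbacks invoked on every commit

    def commit(self, doc):
        txn_id = len(self.transactions) + 1
        self.transactions.append((txn_id, doc))
        for notify in self.subscribers:
            notify(txn_id, doc)
        return txn_id


class PollingIndex:
    """Catches up only when poll() runs, so it can lag behind commits."""

    def __init__(self, repo):
        self.repo = repo
        self.last_txn = 0
        self.docs = set()

    def poll(self):
        # txn_id is 1-based, so slicing from last_txn yields unseen commits.
        for txn_id, doc in self.repo.transactions[self.last_txn:]:
            self.docs.add(doc)
            self.last_txn = txn_id


class EventDrivenIndex:
    """Is told about each commit as it happens."""

    def __init__(self, repo):
        self.docs = set()
        repo.subscribers.append(self.on_commit)

    def on_commit(self, txn_id, doc):
        self.docs.add(doc)
```

After a commit, the event-driven index already contains the new document, while the polling index does not see it until its next `poll()`. During a mass migration, that gap between polls is where a backlog builds up.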
ACS 7 allows for Alfresco Search Services 3.0 to use Elasticsearch as an alternative to Solr. This is an Early Access program from Alfresco and is entirely optional. Solr will continue to be supported as a search engine for ACS 7.x, and most likely beyond, though the overall direction of search going forward appears to be towards Elasticsearch.
The main benefit of the Elasticsearch solution is flexibility. It can be run uncoupled from the ACS implementation, as part of the install set, or even as a managed service. Since there are no ACS-specific customizations to Elasticsearch, it can be run effectively out of the box, which could potentially allow for the leveraging of existing deployments or shared enterprise instances.
Alfresco SDK 5
Development kits and solutions are generally of less concern to Architects and Technical Executives than more architectural or platform related features. However, the benefits of effective, flexible development toolkits should not be underestimated.
I’m certainly stating the obvious when citing the connection between better tools and faster, higher quality application development, deployments, and maintenance. Where things take on even more significance is when the tools are specifically tailored for a platform.
Now Hyland/Alfresco has released an alternative SDK. It does not replace the traditional one, but offers a different way to develop what they are referring to as ‘Out-of-Process’ extensions.
Without delving into all the gory details, this SDK provides extension points based on event triggers such as content upload, folder creation, and content change. In principle, these offer a more straightforward way to interact with the main events of content management. The intent is to simplify and streamline this type of development.
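For readers who want a feel for the event-trigger style of extension, here is a minimal, generic sketch of the pattern: handlers registered against event types and invoked when a matching event arrives. It is not the SDK 5 API. The router class, the event type strings, and the event fields are all hypothetical.

```python
from collections import defaultdict


class EventRouter:
    """Hypothetical dispatcher mapping event types to handler functions."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type):
        """Decorator that registers a handler for one event type."""
        def register(func):
            self._handlers[event_type].append(func)
            return func
        return register

    def dispatch(self, event):
        """Call every handler registered for the event's type."""
        return [handler(event) for handler in self._handlers.get(event["type"], [])]


router = EventRouter()


@router.on("content.uploaded")  # hypothetical event name
def tag_upload(event):
    return f"tagging {event['name']}"


@router.on("folder.created")  # hypothetical event name
def apply_template(event):
    return f"applying template to {event['name']}"
```

Because the handlers only react to events, code written this way can run in its own process, outside the repository, which is the point of the ‘Out-of-Process’ label.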
While this will be welcomed by developers, there may be a period of uncertainty as to which SDK to use: the new SDK 5 or the traditional SDK 4. The decision will be based on the functionality exposed in each.
Fundamental platform extensions and customizations along with detailed usage of core APIs will likely remain via SDK 4 for the time being. Development of functionality related to upload, change and manipulation of content (especially where invoked by external integrations) may well be best suited to SDK 5.
In this article I’ve provided my opinions on the top features I believe are most relevant to Architects and Technical Executives in the latest Hyland/Alfresco product release.
Support for Ansible Playbooks adds even more flexible deployment options based on the latest toolsets.
Limitations related to working with large repositories are addressed so you can confidently scale your implementation.
The option to use Elasticsearch for your search service greatly opens up your architecture choices.
Developers have more in their toolbox to choose from, with a new development kit designed with Out-of-Process extensions in mind.
There have long been compelling reasons to choose Alfresco Content Services and the wider set of Alfresco Digital Business Platform products. This latest product release from Hyland/Alfresco, with ACS 7.0 and the features described in this article (and many more not covered here), only goes to reinforce why it should be your choice for flexible, open-source, and scalable management and governance of content and business process. For those with an existing ACS implementation, the benefits of upgrading should be clear.
Steve Stott is a Principal Architect at ClearCadence, with a primary focus on designing and delivering effective content-based solutions for Fortune 1000 clients.
ClearCadence has a long track record of assisting customers with analyzing, planning, designing, and implementing solutions. Contact us at the link above to discuss how we can help you.