Apache Solr + TYPO3: Enterprise Search with Facets, Autocomplete and Vector Search

How EXT:solr turns TYPO3 into an enterprise search platform – from the Index Queue and faceted navigation to semantic vector search with the Solr module "llm" (Text-to-Vector / knn_text_to_vector) from Solr 9.8+ (from Solr 10: "language-models").

Overview

  • EXT:solr makes TYPO3 an enterprise search platform with full-text search, facets, autocomplete and highlighting – without an external SaaS service.
  • The Index Queue automatically synchronises pages, news, products and any records with the Solr server – including PSR-14 events for custom extensions.
  • Since Apache Solr 9.8, the Solr module llm (renamed to "language-models" from Solr 10; Text-to-Vector) delivers native vector and hybrid search – directly on the Solr server, without a separate vector database.
  • Hosting options range from DDEV locally and Docker production to hosted-solr.com as a managed service.

The built-in TYPO3 search queries page titles and content – sufficient for small projects, but a bottleneck for enterprise requirements. When you need to make thousands of pages, structured records, and files searchable, you need a dedicated search infrastructure. Apache Solr delivers exactly that – and the EXT:solr extension integrates this capability seamlessly into TYPO3.

This article demonstrates how EXT:solr works, what is possible since Solr 9.8 with the llm module (from Solr 10: language-models; embedding/vector pipeline, e.g., knn_text_to_vector), and how you can implement enterprise search in your project.



What EXT:solr Delivers  

EXT:solr connects TYPO3 to an Apache Solr server and provides a complete enterprise search infrastructure:

Full-Text Search

Searchable pages, news, products and arbitrary records – with language-specific stemming and tokenisation.

Faceted Navigation

Filter by category, date, type or any custom field – just like in any modern online shop.

Autocomplete

Real-time search suggestions, even before pressing the Enter key. With top-result previews.

Architecture Overview  

The system consists of four cooperating components:

EXT:solr architecture: From the editor to the search result

How the document lifecycle works:

  1. Editors create or edit content in TYPO3.
  2. Monitoring detects changes via PSR-14 events and writes an entry into the Index Queue.
  3. A Scheduler Task processes the queue and sends documents as JSON to the Solr server.
  4. The Search Plugin in the frontend sends queries to Solr and renders results via Fluid templates.
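
The hand-over in steps 2 and 3 can be pictured with a small, language-neutral sketch in Python. This is a toy model and not EXT:solr code: the class name `IndexQueue`, its methods and the document shape are all hypothetical and only illustrate that repeated edits to one record collapse into a single pending entry, which a worker later serialises as JSON.

```python
import json
from dataclasses import dataclass, field


@dataclass
class IndexQueue:
    """Toy stand-in for the Index Queue: one pending entry per (table, uid)."""
    items: dict = field(default_factory=dict)

    def mark_changed(self, table: str, uid: int, record: dict) -> None:
        # Step 2: monitoring notes the change; a repeated edit overwrites the entry.
        self.items[(table, uid)] = record

    def process(self, limit: int) -> list[str]:
        # Step 3: a worker takes up to `limit` entries and serialises them as JSON.
        batch = list(self.items.items())[:limit]
        documents = []
        for (table, uid), record in batch:
            documents.append(json.dumps({"id": f"{table}/{uid}", "type": table, **record}))
            del self.items[(table, uid)]
        return documents


queue = IndexQueue()
queue.mark_changed("pages", 12, {"title": "Draft"})
queue.mark_changed("pages", 12, {"title": "Products"})  # second edit, same record
queue.mark_changed("tx_news_domain_model_news", 3, {"title": "Release notes"})

for document in queue.process(limit=10):
    print(document)
```

Two documents are printed even though three changes were recorded, because the queue tracks records rather than individual edits.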

Component Overview  

Component | Package | Purpose | Required?
EXT:solr | apache-solr-for-typo3/solr | Core integration: indexing, search, backend module | Yes
EXT:tika | apache-solr-for-typo3/tika | Text extraction from PDFs, DOCX, XLSX (~1,200 formats) | Only file search
EXT:solrfal | Funding Extension | Index FAL files in Solr | Only file search
EXT:solrconsole | Funding Extension | Backend console for Solr management | Optional
EXT:solrdebugtools | Funding Extension | Query debugging, score analysis in the frontend | Recommended (dev)

Version Compatibility  

EXT:solr | TYPO3 | Apache Solr | Configset | PHP
13.1.x (13.1.1) | 13.4 LTS | 9.10.1 | ext_solr_13_1_0 | 8.2 – 8.4
12.1 | 12.4 LTS | 9.10.1 | ext_solr_12_1_0 | 8.1 – 8.3

Note on the TYPO3 12.4 row

The 12.4 LTS row serves as a comparison for migrations from older projects. For new development on TYPO3 v14, the matrix targets 14.3 + EXT:solr 14.0; on v13 LTS, 13.4 + EXT:solr 13.1 remains the appropriate stack.

Security: EXT:solr 13.1.x and EXT:tika

The 13.1.x line (currently including 13.1.1) targets Solr 9.10.1 and patches, among others, CVE-2025-66516, CVE-2026-22444 and CVE-2026-22022. Always check the Release Notes and the Version Matrix before going live. EXT:tika 13.1 requires Apache Tika Server 3.2.3+ and addresses, among others, CVE-2025-66516.

TYPO3 v14 and EXT:solr 14.0

The Version Matrix lists TYPO3 14.3 with EXT:solr 14.0, Apache Solr 9.10.1 and the Configset ext_solr_14_0_0. As long as no stable 14.0.x tags are published on Packagist, teams frequently use Composer branches like dev-main / 14.0.x-dev (often with minimum-stability: dev and prefer-stable: true).

Before every upgrade, check the current status on Packagist and at GitHub Releases, and switch to ^14.0 as soon as stable releases exist. The published matrix and releases take precedence over older integration branches mentioned in discussions.


Setup in Under 10 Minutes  

Installation via Composer  

Local Development with DDEV  

The fastest route to a local Solr instance is the official DDEV addon (ddev-typo3-solr).

Solr Image Version (DDEV)

The addon documents a default SOLR_BASE_IMAGE (e.g., solr:9.8). The Version Matrix recommends Solr 9.10.1 for TYPO3 13.4/14.x. For parity with production: e.g., ddev dotenv set .ddev/.env.solr --solr-base-image="solr:9.10.1" and rebuild the Solr service (see the ddev-typo3-solr README).

Configure cores in .ddev/typo3-solr/config.yaml:

And create the cores after startup in .ddev/config.yaml:

Command | Description
ddev solrctl apply | Create cores from configuration
ddev solrctl wipe | Wipe all cores
ddev launch :8984 | Open Solr Admin UI in the browser
ddev logs -s typo3-solr | View Solr log files

Docker Production  

For production, the official Docker image provides pre-configured cores for all languages:

Managed Hosting  

Two options for those who prefer not to run their own Solr server:

hosted-solr.com

Managed Solr by dkd – the maintainers of EXT:solr. Pre-configured cores from approx. 10 EUR/month ("Small" plan, based on pricing page; excluding VAT as noted by the provider).

Mittwald

Managed container platform with Solr service. Access via mw container port-forward --port 8983.

TYPO3 Site Configuration  

Add the connection in config/sites/<identifier>/config.yaml:

Environment Configuration

Never hardcode Solr credentials. Use helhum/dotenv-connector for environment-specific .env files with SOLR_HOST, SOLR_PORT and SOLR_CORE_*. Commit .env.example and keep .env gitignored.


Index Queue: The Heart of the System  

The Index Queue is the central mechanism through which TYPO3 content gets into Solr.

Document lifecycle: From the editor to the search result

Indexing Pages  

Pages are indexed out of the box – no additional configuration is necessary. The page indexer renders the page and sends the content to Solr.

Indexing Arbitrary Records  

Every TYPO3 table can be added to the index via TypoScript. Here is an example for EXT:news:

An overview of the three content objects:

Content Object | Purpose
SOLR_CONTENT | Removes HTML/RTE tags from field contents
SOLR_RELATION | Resolves relations (categories, tags), supports multiValue = 1
SOLR_MULTIVALUE | Splits comma-separated fields into multiple values

Dynamic Fields  

EXT:solr uses dynamic field suffixes to add custom fields without modifying the schema:

Suffix | Type | Multi? | Example
_stringS | String (not analysed) | No | category_stringS
_stringM | String (not analysed) | Yes | tags_stringM
_textS | Text (analysed) | No | description_textS
_intS | Integer | No | year_intS
_dateS | Date | No | published_dateS
_floatS | Float | No | price_floatS
_boolS | Boolean | No | active_boolS
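
The naming rule behind the table can be written down as a small lookup. The following Python sketch is only an illustration: the helper `dynamic_field_name` is hypothetical, and it covers exactly the suffixes listed above rather than guessing at further combinations.

```python
# Suffixes copied from the table above; other combinations are deliberately not guessed.
SUFFIXES = {
    ("string", False): "_stringS",
    ("string", True): "_stringM",
    ("text", False): "_textS",
    ("int", False): "_intS",
    ("date", False): "_dateS",
    ("float", False): "_floatS",
    ("bool", False): "_boolS",
}


def dynamic_field_name(base: str, kind: str, multi: bool = False) -> str:
    """Return `base` plus the dynamic-field suffix for the given type."""
    try:
        return base + SUFFIXES[(kind, multi)]
    except KeyError:
        raise ValueError(f"no suffix listed for {kind!r} (multi={multi})") from None


print(dynamic_field_name("category", "string"))          # category_stringS
print(dynamic_field_name("tags", "string", multi=True))  # tags_stringM
print(dynamic_field_name("year", "int"))                 # year_intS
```

Because the suffix alone selects the field type, a new field such as year_intS needs no schema change, only a consistent name.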

Monitoring  

EXT:solr detects record changes via PSR-14 events. Two modes are available:

  • Immediate (Default): Changes are processed directly during the DataHandler operation.
  • Delayed: Changes are queued in the event queue (tx_solr_eventqueue_item) and processed by a separate Scheduler task.

Configuration is done via Extension Settings → monitoringType.

Site Hash Strategy  

Under Admin Tools → Settings → Extension Configuration → solr, the siteHashStrategy controls how the site hash is generated in Solr documents. For new TYPO3 13.4+ projects, siteHashStrategy = 1 (Site-Identifier) is recommended; older defaults might still use the domain-based variant (0, deprecated). A change of strategy requires a complete re-indexing.


Basic Configuration  

Faceted Navigation  

Facets transform the search into an interactive filtering experience:

Available facet types: options (default), queryGroup, hierarchy, dateRange, numericRange.
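
EXT:solr builds the Solr request from its TypoScript configuration, so you normally never write the query by hand. To show what an options facet amounts to on the wire, here is a hedged Python sketch that assembles standard Solr request parameters (q, facet, facet.field, fq). The function name and the field choices (`type`, `category_stringS`) are assumptions for illustration, and values are not escaped.

```python
from urllib.parse import urlencode


def facet_query(term: str, facet_fields: list[str], selected: dict[str, str]) -> str:
    """Build a Solr query string that asks for facet counts and applies chosen filters."""
    params = [("q", term), ("facet", "true")]
    # One facet.field per facet the result page should display counts for.
    params += [("facet.field", name) for name in facet_fields]
    # Each option the visitor clicked becomes a filter query (fq).
    params += [("fq", f'{name}:"{value}"') for name, value in selected.items()]
    return urlencode(params)


print(facet_query("solr", ["type", "category_stringS"], {"type": "pages"}))
```

Keeping the selected options in fq rather than in q leaves the relevance score of the keyword query untouched while still narrowing the result set.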

Autocomplete / Suggest  

jQuery-free

The built-in Suggest feature uses jQuery by default. For a modern, jQuery-free implementation, EXT:solr allows the integration of a custom Vanilla JS frontend.

Highlighting & Spellchecking  


Vector and Hybrid Search with Solr 9.8+  

Since Apache Solr 9.8 (January 2025), Solr includes the llm module (Text-to-Vector, embedding pipeline; renamed to language-models from Solr 10). This enables semantic search directly on the Solr server, without a separate vector database and without additional middleware. Classes and configuration follow the packages org.apache.solr.llm.* / solr.llm.* (in Solr 10+: org.apache.solr.languagemodels.* / solr.languagemodels.*); check the Reference Guide for your Solr version to confirm the exact names.

The LLM Module (Solr 9.8: "llm" / from Solr 10: "language-models" / Text-to-Vector)  

The pipeline solves the vocabulary mismatch problem of classic keyword search: users phrase their questions differently to how the content is written. Texts and search queries are transformed into vectors that map semantic similarity.

Vector search: Texts and queries are converted into vectors and compared via KNN
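
The comparison step can be illustrated without any Solr involvement. The Python sketch below ranks hand-made three-dimensional vectors by cosine similarity using an exact, brute-force search; the document ids and numbers are invented, and a real Solr vector field works on an index and on embeddings with hundreds of dimensions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 for the same direction, 0.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def knn(query: list[float], docs: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k documents whose vectors are most similar to the query."""
    ranked = sorted(docs, key=lambda doc_id: cosine(query, docs[doc_id]), reverse=True)
    return ranked[:k]


# Hand-made 3-dimensional vectors; real embeddings have hundreds of dimensions.
docs = {
    "opening-hours": [0.9, 0.1, 0.0],
    "contact-form": [0.1, 0.9, 0.1],
    "holiday-closure": [0.8, 0.2, 0.1],
}
print(knn([1.0, 0.0, 0.0], docs, k=2))  # ['opening-hours', 'holiday-closure']
```

The two documents about opening times rank ahead of the contact form even though no keyword is compared, which is the effect the vocabulary mismatch argument relies on.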

Supported embedding providers (via LangChain4j):

Provider | Popular Model | Dimensions
OpenAI | text-embedding-3-small | 1536
Mistral AI | mistral-embed | 1024
Cohere | embed-v3 | 1024
HuggingFace | Various open-source models | variable

Hybrid Search: BM25 + Vector  

The most powerful configuration combines classic BM25 ranking with vector re-ranking:

This allows you to benefit from exact keyword matches and semantic similarity.
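
One way to express such a two-stage query is Solr's re-rank parser with a knn_text_to_vector query as the re-rank query. The Python sketch below only assembles the request parameters. The model name `my-embedding-model` and the field name `vector` are placeholders, and the exact combination should be checked against the Reference Guide for your Solr version, since it is an assumption that this matches the configuration the article had in mind.

```python
from urllib.parse import urlencode


def hybrid_params(text: str, model: str, vector_field: str, top_k: int = 50) -> dict[str, str]:
    """Keyword query first, then re-rank the top hits by vector similarity."""
    return {
        # Stage 1: ordinary lexical query, scored with BM25 by default.
        "q": text,
        # Stage 2: re-rank the best `top_k` lexical hits using the query in $rqq.
        "rq": f"{{!rerank reRankQuery=$rqq reRankDocs={top_k} reRankWeight=1}}",
        # The re-rank query lets Solr embed `text` with the configured model.
        "rqq": f"{{!knn_text_to_vector model={model} f={vector_field} topK={top_k}}}{text}",
    }


params = hybrid_params("opening hours", model="my-embedding-model", vector_field="vector")
print(urlencode(params))
```

Only the top lexical hits are re-scored in this arrangement, so a document that shares no keyword with the query cannot enter the result list through the vector stage alone.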

Use Cases  

Use Case | Approach
Intelligent page search | knn_text_to_vector finds semantically similar content
Similar articles | K-Nearest Neighbours on the vector of a document
Multilingual bridge | Embeddings match across languages
FAQ matching | Match user questions to FAQ answers despite different phrasing
RAG retrieval | Solr as a retrieval layer for LLM-supported Q&A systems

Consider costs and data protection

Vectorisation typically runs via external embedding APIs (configured in the Solr model store) – it is not fully "local" without your own model. Consider API costs, check data privacy requirements, and plan fallbacks for API outages.
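
A fallback for API outages can be as simple as catching the failure and answering with the keyword search instead. The following Python sketch shows the pattern with hypothetical callables; `search_with_fallback`, `broken_semantic` and `keyword_only` are invented names, and which exceptions a real client raises depends on the integration.

```python
from typing import Callable


def search_with_fallback(
    query: str,
    semantic: Callable[[str], list[str]],
    keyword: Callable[[str], list[str]],
) -> tuple[str, list[str]]:
    """Try the embedding-backed search first; fall back to keyword search on failure."""
    try:
        return "semantic", semantic(query)
    except (TimeoutError, ConnectionError):
        # Embedding provider unreachable or too slow: keep the search usable.
        return "keyword", keyword(query)


def broken_semantic(query: str) -> list[str]:
    # Simulates an embedding API that does not answer in time.
    raise TimeoutError("embedding API did not answer")


def keyword_only(query: str) -> list[str]:
    return [f"keyword hit for {query!r}"]


print(search_with_fallback("opening hours", broken_semantic, keyword_only))
```

Returning the mode alongside the hits makes it easy to log how often the fallback is used, which also gives a first indication of API reliability and cost.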

EXT:solr and vectors

Native end-to-end support (Index Queue → Vector fields → Fluid) in EXT:solr is continuously evolving. For custom fields, PSR-14 listeners (e.g., BeforeDocumentsAreIndexedEvent) combined with a custom embedding service are suitable – see TYPO3-Solr/ext-solr.


Troubleshooting in 6 Steps  

If the search doesn't work, the problem lies in one of four areas: Connection, Indexing, Solr Core, or Frontend. Proceed systematically.

Diagnostic Flowchart  

Systematic troubleshooting: From the connection to the frontend
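
Proceeding systematically means checking the areas in a fixed order and stopping at the first failure. The Python sketch below captures only that control flow; the individual checks are placeholders you would replace with real probes (for example, whether the Solr core answers or whether the queue still contains unprocessed items).

```python
from typing import Callable, Optional

# The four areas named above, checked from the connection towards the frontend.
AREAS = ["Connection", "Indexing", "Solr Core", "Frontend"]


def first_failing_area(checks: dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run the checks in order and return the first area that fails, if any."""
    for area in AREAS:
        if not checks[area]():
            return area
    return None


checks = {
    "Connection": lambda: True,
    "Indexing": lambda: False,  # e.g. queue items are not being processed
    "Solr Core": lambda: True,
    "Frontend": lambda: True,
}
print(first_failing_area(checks))  # Indexing
```

Stopping at the first failing area avoids chasing frontend symptoms that are only a consequence of an earlier problem.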

Common Errors and Solutions  

Symptom | Probable Cause | Solution
"Search unavailable" | Solr connection misconfigured | Check site configuration, initialise connection
No results | Site hash mismatch | Check siteHashStrategy, re-index
Empty Index Queue | TypoScript not loaded | Include static templates on the root page
Queue items are not being indexed | Scheduler task missing | Create "Index Queue Worker" task
Facets not visible | faceting = 0 or empty field | Enable faceting, check field values
Docker: Permission denied | Incorrect volume ownership | UID 8983 must own /var/solr
Jar loading error (Solr ≤ 9.7) | CVE-2025-24814 affects Solr up to 9.7 | Upgrade to Solr 9.8+; alternatively move typo3lib/ to the server root or use Docker 13.0.1+

Enabling Logging  

For deeper diagnostics, enable TypoScript logging (dev/staging only):

And configure the log file in config/system/additional.php:

Disable logging in production

Logging generates a significant volume of data and impairs performance. Remove plugin.tx_solr.logging and the writerConfiguration from additional.php before going live.


PSR-14 Events for Custom Extensions  

EXT:solr fires PSR-14 events at central points, which you can utilise for your own logic.

Adding Custom Fields to Documents  

Overview of Important Events  


TYPO3 v14: Changes for Developers  

If you are bringing EXT:solr to TYPO3 v14 / Fluid 5.0, the following points, among others, are relevant according to the TYPO3 Solr Skill reference and the upstream project:

  • Fluid 5.0: Search templates and facet partials now use strictly typed ViewHelper arguments and no underscore-prefixed template variables – check your custom overrides.
  • TypoScriptFrontendController: Custom indexers or plugins that use $GLOBALS['TSFE'] must migrate to request attributes / the new frontend simulation.
  • TCA: ctrl.searchFields is removed in v14 in favour of 'searchable' => true per column (primarily affects the backend search, not the Index Queue field mappings).

Production Checklist  

  • Configset version matches (ext_solr_13_1_0)
  • JVM memory is configured (SOLR_JAVA_MEM=-Xms512m -Xmx1g)
  • Solr is not publicly accessible – firewall or reverse proxy
  • Scheduler task "Index Queue Worker" runs regularly
  • Logging is disabled in production
  • Solr data volume is backed up regularly
  • CVE patches applied (CVE-2025-24814 affects Solr ≤ 9.7 – upgrading to 9.8+ fixes the issue; Tika Server 3.2.3+ for EXT:tika 13.1)
  • siteHashStrategy = 1 (site-identifier, recommended since EXT:solr 13.0)
  • One core per language with a language-specific schema (e.g., core_de, core_en)

Conclusion  

EXT:solr elevates TYPO3 search to enterprise level – with facets, autocomplete, highlighting and, since Solr 9.8, semantic vector search. The extension is mature, actively maintained by dkd Internet Service and ready to use locally with DDEV in just a few minutes.

The three core takeaways:

  1. EXT:solr indexes everything – pages, news, products, files. The Index Queue and PSR-14 events make the integration flexible and extensible.
  2. Vector search on the Solr server – The llm module (from Solr 10: language-models; Text-to-Vector, e.g., knn_text_to_vector) in Solr 9.8+ brings semantic search to the Solr core; check the current release for EXT:solr integration across all steps.
  3. The setup is straightforward – install the DDEV addon, configure cores, create the Scheduler task. Done.

As a rule of thumb for deciding between the standard search and a dedicated search infrastructure: as soon as you have more than a few hundred pages, need to make structured data searchable, or require facets, there is no way around Solr.

For Developers

TYPO3 Solr Skill

Detailed reference for EXT:solr — Index Queue, search, debugging, llm module/vector search. Main file SKILL.md; additions SKILL-FRONTEND.md (e.g., Suggest without jQuery) and SKILL-SOLRFAL.md (file index). Open-source Agent Skill for Claude, Cursor, and VS Code.


Parts of this content were created with the assistance of AI.