Apache Solr + TYPO3: Enterprise Search with Facets, Autocomplete, and Vector Search

How EXT:solr turns TYPO3 into an enterprise search platform – from the Index Queue via faceted navigation to semantic vector search using the Solr "language-models" module (Text-to-Vector / knn_text_to_vector) from Solr 9.8+.

Overview

  • EXT:solr turns TYPO3 into an enterprise search platform with full-text search, facets, autocomplete, and highlighting – without an external SaaS service.
  • The Index Queue automatically synchronises pages, news, products, and custom records with the Solr server – including PSR-14 events for custom extensions.
  • Since Apache Solr 9.8, the Solr language-models module (often referred to as 'LLM'/Text-to-Vector in the reference) delivers native vector and hybrid search – directly on the Solr server, without a separate vector database.
  • Hosting options range from DDEV locally and Docker in production to hosted-solr.com as a managed service.

The built-in TYPO3 search scans page titles and content – sufficient for small projects, but a bottleneck for enterprise requirements. When you need to make thousands of pages, structured records, and files searchable, you require a dedicated search infrastructure. Apache Solr provides exactly that – and the EXT:solr extension integrates this capability seamlessly into TYPO3.

This article demonstrates how EXT:solr works, what is possible since Solr 9.8 with the language-models module (embedding/vector pipeline, e.g. knn_text_to_vector), and how to implement enterprise search in your project.



What EXT:solr Achieves  

EXT:solr connects TYPO3 to an Apache Solr server and provides a complete enterprise search infrastructure:

Full-text Search

Searchable pages, news, products, and custom records – with language-specific stemming and tokenisation.

Faceted Navigation

Filter by category, date, type, or any custom field – just like in any modern online shop.

Autocomplete

Real-time search suggestions even before pressing the Enter key. With top-result previews.

Architecture Overview  

The system consists of four cooperating components:

EXT:solr Architecture: From the editor to the search result

How the document lifecycle works:

  1. Editors create or edit content in TYPO3.
  2. The monitoring detects changes via PSR-14 events and writes an entry to the Index Queue.
  3. A scheduler task processes the queue and sends documents as JSON to the Solr server.
  4. The search plugin in the frontend sends queries to Solr and renders results via Fluid templates.
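The first three steps of this lifecycle can be sketched as a small simulation. This is a toy model in Python, not EXT:solr code: the `IndexQueue` class, the `run_worker` function, and the document `id` format are hypothetical and only illustrate how a change becomes a queue item and later a JSON document.

```python
import json
from collections import deque
from dataclasses import dataclass, field


@dataclass
class IndexQueue:
    """Toy stand-in for the Index Queue: a FIFO of (table, uid) items."""
    items: deque = field(default_factory=deque)

    def enqueue(self, table: str, uid: int) -> None:
        # Step 2: a change was detected; remember the record once.
        if (table, uid) not in self.items:
            self.items.append((table, uid))


def run_worker(queue: IndexQueue, records: dict, limit: int = 50) -> list:
    """Step 3: take up to `limit` items and serialise them as JSON documents."""
    payloads = []
    while queue.items and len(payloads) < limit:
        table, uid = queue.items.popleft()
        record = records[(table, uid)]
        # Hypothetical id format, chosen only for this example.
        payloads.append(json.dumps({"id": f"{table}/{uid}", **record}))
    return payloads


# Step 1: an editor saves a page twice and a news record once.
records = {
    ("pages", 1): {"title": "Home"},
    ("tx_news_domain_model_news", 7): {"title": "Release notes"},
}
queue = IndexQueue()
for change in [("pages", 1), ("pages", 1), ("tx_news_domain_model_news", 7)]:
    queue.enqueue(*change)

payloads = run_worker(queue, records)
print(payloads)
```

The duplicate save of the page produces a single queue item, which is the main reason for placing a queue between the editor and the Solr server.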

Component Overview  

| Component | Package | Purpose | Required? |
|---|---|---|---|
| EXT:solr | apache-solr-for-typo3/solr | Core integration: indexing, search, backend module | Yes |
| EXT:tika | apache-solr-for-typo3/tika | Text extraction from PDFs, DOCX, XLSX (~1,200 formats) | Only file search |
| EXT:solrfal | Funding Extension | Index FAL files in Solr | Only file search |
| EXT:solrconsole | Funding Extension | Backend console for Solr management | Optional |
| EXT:solrdebugtools | Funding Extension | Query debugging, score analysis in the frontend | Recommended (dev) |

Version Compatibility  

| EXT:solr | TYPO3 | Apache Solr | Configset | PHP |
|---|---|---|---|---|
| 13.1.x (13.1.1) | 13.4 LTS | 9.10.1 | ext_solr_13_1_0 | 8.2 – 8.4 |
| 12.1 | 12.4 LTS | 9.10.1 | ext_solr_12_1_0 | 8.1 – 8.3 |

Note on the TYPO3 12.4 row

The 12.4 LTS row is provided for comparison during migrations from older projects. For new developments on TYPO3 v14, the matrix targets 14.3 + EXT:solr 14.0; on v13 LTS, 13.4 + EXT:solr 13.1 remains the appropriate stack.

Security: EXT:solr 13.1.x and EXT:tika

The 13.1.x line (currently including 13.1.1) uses Solr 9.10.1 and addresses CVE-2025-54988, CVE-2026-22444, and CVE-2026-22022 among others – always check the release notes and the Version Matrix before going live. EXT:tika 13.1 requires Apache Tika Server 3.2.3+ and addresses CVE-2025-54988 and CVE-2025-66516 among others.

TYPO3 v14 and EXT:solr 14.0

The Version Matrix lists TYPO3 14.3 with EXT:solr 14.0, Apache Solr 9.10.1, and the configset ext_solr_14_0_0. As long as stable 14.0.x tags have not yet been published on Packagist, teams often use Composer branches like dev-main / 14.0.x-dev (often with minimum-stability: dev and prefer-stable: true):

Before any upgrade, check the current status on Packagist and GitHub Releases and switch to ^14.0 as soon as stable releases exist. Older integration branches from discussions are secondary to the published matrix and releases.


Setup in under 10 minutes  

Installation via Composer  

Local Development with DDEV  

The fastest way to a local Solr instance is the official DDEV add-on:

Solr Image Version (DDEV)

The add-on documents a default SOLR_BASE_IMAGE (e.g., solr:9.8). The Version Matrix recommends Solr 9.10.1 for TYPO3 13.4/14.x. For parity with production: e.g., ddev dotenv set .ddev/.env.solr --solr-base-image="solr:9.10.1" and recreate the Solr service (see ddev-typo3-solr README).

Configure cores in .ddev/typo3-solr/config.yaml:

And configure .ddev/config.yaml to create the cores after starting:

| Command | Description |
|---|---|
| ddev solrctl apply | Create cores from configuration |
| ddev solrctl wipe | Delete all cores |
| ddev launch :8984 | Open Solr Admin UI in the browser |
| ddev logs -s typo3-solr | View Solr log files |

Docker Production  

For production, the official Docker image provides pre-configured cores for all languages:

Managed Hosting  

Two options for anyone who prefers not to operate their own Solr server:

hosted-solr.com

Managed Solr by dkd – the maintainers of EXT:solr. Pre-configured cores starting at approx. 10 EUR/month (Plan 'Small', based on pricing page; excl. VAT as stated by provider).

Mittwald

Managed container platform with Solr service. Access via mw container port-forward --port 8983.

TYPO3 Site Configuration  

Add the connection in config/sites/<identifier>/config.yaml:

Environment Configuration

Never hard-code Solr credentials. Use helhum/dotenv-connector for environment-specific .env files with SOLR_HOST, SOLR_PORT, and SOLR_CORE_*. Commit .env.example, keep .env gitignored.
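A deploy-time sanity check for these variables could look like the following Python sketch. It is not part of EXT:solr or helhum/dotenv-connector; the function name, the example host, the `http` scheme, and the concrete variable names `SOLR_CORE_DE` / `SOLR_CORE_EN` are assumptions. Only the `/solr/<core>` URL layout is standard Solr.

```python
def solr_core_urls(env: dict) -> dict:
    """Build one base URL per SOLR_CORE_* variable found in `env`."""
    host = env["SOLR_HOST"]        # KeyError here means the variable is missing
    port = int(env["SOLR_PORT"])   # ValueError here means it is not numeric
    prefix = "SOLR_CORE_"
    return {
        key[len(prefix):].lower(): f"http://{host}:{port}/solr/{value}"
        for key, value in sorted(env.items())
        if key.startswith(prefix)
    }


# Example values only; a real check would pass dict(os.environ) instead.
example_env = {
    "SOLR_HOST": "solr.internal",
    "SOLR_PORT": "8983",
    "SOLR_CORE_DE": "core_de",
    "SOLR_CORE_EN": "core_en",
}
urls = solr_core_urls(example_env)
print(urls)
```

Failing fast on a missing or malformed variable is the point of the sketch: a broken `.env` file then surfaces during deployment rather than as an empty search result page.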


Index Queue: The Core Component  

The Index Queue is the central mechanism through which TYPO3 content reaches Solr.

Document Lifecycle: From the editor to the search result

Indexing Pages  

Pages are indexed out of the box – no additional configuration is required. The page indexer renders the page and sends the content to Solr.

Indexing Custom Records  

Any TYPO3 table can be added to the index via TypoScript. Here is an example for EXT:news:

The three content objects at a glance:

| Content Object | Purpose |
|---|---|
| SOLR_CONTENT | Removes HTML/RTE tags from field content |
| SOLR_RELATION | Resolves relations (categories, tags), supports multiValue = 1 |
| SOLR_MULTIVALUE | Splits comma-separated fields into multiple values |
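To make the effect of two of these content objects concrete, here is a conceptual Python sketch. It mirrors only the behaviour described in the table (stripping markup, splitting comma-separated values) and is not the PHP implementation used by EXT:solr; both function names are hypothetical.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and ignores all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_markup(html: str) -> str:
    """Conceptually what SOLR_CONTENT is described as doing: drop tags, keep text."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    # Join with spaces so adjacent blocks do not run together, then
    # collapse repeated whitespace.
    return " ".join(" ".join(parser.chunks).split())


def split_multivalue(value: str, separator: str = ",") -> list:
    """Conceptually what SOLR_MULTIVALUE is described as doing: one entry per item."""
    return [part.strip() for part in value.split(separator) if part.strip()]


print(strip_markup("<p>Solr <b>rocks</b></p><p>Really.</p>"))
print(split_multivalue("typo3, solr ,search"))
```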

Dynamic Fields  

EXT:solr uses dynamic field suffixes to add custom fields without requiring a schema change:

| Suffix | Type | Multi? | Example |
|---|---|---|---|
| _stringS | String (unanalysed) | No | category_stringS |
| _stringM | String (unanalysed) | Yes | tags_stringM |
| _textS | Text (analysed) | No | description_textS |
| _intS | Integer | No | year_intS |
| _dateS | Date | No | published_dateS |
| _floatS | Float | No | price_floatS |
| _boolS | Boolean | No | active_boolS |
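The naming convention is mechanical enough to express as a small helper. The following Python sketch is purely illustrative: `dynamic_field` is a hypothetical function, and it only knows the suffixes listed in the table, so any other combination is rejected rather than guessed.

```python
# Suffixes copied from the table above; combinations not listed there are
# deliberately rejected because this sketch cannot confirm they exist.
KNOWN_SUFFIXES = {
    ("string", False): "_stringS",
    ("string", True): "_stringM",
    ("text", False): "_textS",
    ("int", False): "_intS",
    ("date", False): "_dateS",
    ("float", False): "_floatS",
    ("bool", False): "_boolS",
}


def dynamic_field(name: str, kind: str, multi: bool = False) -> str:
    """Return the Solr field name for a custom field of the given kind."""
    try:
        return name + KNOWN_SUFFIXES[(kind, multi)]
    except KeyError:
        raise ValueError(f"no known suffix for kind={kind!r}, multi={multi}") from None


print(dynamic_field("category", "string"))          # category_stringS
print(dynamic_field("tags", "string", multi=True))  # tags_stringM
print(dynamic_field("price", "float"))              # price_floatS
```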

Monitoring  

EXT:solr detects record changes via PSR-14 events. Two modes are available:

  • Immediate (Default): Changes are processed directly during the DataHandler operation.
  • Delayed: Changes are collected in the event queue (tx_solr_eventqueue_item) and processed by a separate scheduler task.

Configuration is done via Extension Settings → monitoringType.
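The difference between the two modes can be sketched as follows. This is a conceptual Python model, not EXT:solr code: the mode labels `"immediate"` and `"delayed"` are stand-ins for whatever values monitoringType actually accepts, and both lists are simplified substitutes for the real database tables.

```python
from collections import deque

index_queue = []        # what the Index Queue worker will pick up
event_queue = deque()   # simplified stand-in for tx_solr_eventqueue_item


def on_record_changed(table: str, uid: int, monitoring_type: str) -> None:
    """Called for every detected change; the mode decides where it lands."""
    if monitoring_type == "immediate":
        # The work happens inside the save operation itself.
        index_queue.append((table, uid))
    elif monitoring_type == "delayed":
        # Only a cheap event record is stored; the real work comes later.
        event_queue.append((table, uid))
    else:
        raise ValueError(f"unknown monitoring type: {monitoring_type!r}")


def process_event_queue() -> None:
    """Separate scheduler task: turn stored events into Index Queue items."""
    while event_queue:
        item = event_queue.popleft()
        if item not in index_queue:
            index_queue.append(item)


on_record_changed("pages", 1, "immediate")
on_record_changed("pages", 2, "delayed")
queued_before_task = list(index_queue)
process_event_queue()
print(queued_before_task, index_queue)
```

The delayed mode keeps the editor's save operation fast, at the price of one more scheduler task that has to run before changes become visible.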

Site Hash Strategy  

Under Admin Tools → Settings → Extension Configuration → solr, siteHashStrategy controls how the site hash is formed in Solr documents. For new TYPO3 13.4+ projects, siteHashStrategy = 1 (Site Identifier) is recommended; in older defaults, the domain-based variant (0, deprecated) may still be active. Changing the strategy requires a complete reindexing.


Basic Configuration  

Faceted Navigation  

Facets transform the search into an interactive filtering experience:

Available facet types: options (Default), queryGroup, hierarchy, dateRange, numericRange.
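At the Solr level, a facet request boils down to a few query parameters. The Python sketch below builds them with the standard Solr parameters `facet`, `facet.field`, `facet.mincount`, and `fq`. It does not claim to reproduce the request EXT:solr generates from TypoScript; the function name and the example field names are assumptions.

```python
from urllib.parse import parse_qs, urlencode


def facet_params(query: str, facet_fields: list, active_filters: dict = None) -> str:
    """Build a Solr query string with field facets and optional filters."""
    params = [("q", query), ("facet", "true"), ("facet.mincount", "1")]
    params += [("facet.field", name) for name in facet_fields]
    for name, value in (active_filters or {}).items():
        # Quote the value and escape backslashes and quotes inside it.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{name}:"{escaped}"'))
    return urlencode(params)


query_string = facet_params(
    "typo3",
    ["type", "category_stringM"],   # example field names
    {"type": "pages"},              # the visitor clicked the "pages" option
)
decoded = parse_qs(query_string)
print(query_string)
```

Each selected facet option becomes an `fq` filter, which narrows the result set without influencing the relevance score of the remaining documents.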

Autocomplete / Suggest  

jQuery-free

The built-in suggest feature uses jQuery by default. For a modern, jQuery-free implementation, EXT:solr allows the integration of a custom vanilla JS frontend.

Highlighting & Spellchecking  


Vector and Hybrid Search with Solr 9.8+  

Since Apache Solr 9.8 (January 2025), Solr ships the language-models module (Text-to-Vector, an embedding pipeline; often called the 'LLM module' in older texts). It enables semantic search on the Solr server itself, without a separate vector database or additional middleware. Classes and configuration follow the org.apache.solr.languagemodels.* / solr.languagemodels.* packages; the exact names depend on the Solr version, so check the Reference Guide.

The LLM Module (Solr: 'language-models' / Text-to-Vector)  

The pipeline solves the vocabulary mismatch problem of classic keyword search: Users formulate their questions differently than the content does. Texts and search queries are converted into vectors that map semantic similarity.

Vector search: Texts and queries are converted into vectors and compared via KNN
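The comparison step can be illustrated with a few lines of Python. This is a brute-force sketch with hand-made three-dimensional vectors; real embeddings have hundreds or thousands of dimensions, and Solr uses an approximate index (HNSW) rather than comparing against every document. The document ids and vector values are invented for the example.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def knn(query_vec: list, doc_vecs: dict, k: int) -> list:
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Invented vectors; an embedding model would produce these from the texts.
docs = {
    "opening-hours": [0.9, 0.1, 0.0],
    "pricing": [0.1, 0.9, 0.1],
    "contact": [0.7, 0.2, 0.3],
}
query = [1.0, 0.0, 0.1]   # e.g. the embedding of "When are you open?"

top = knn(query, docs, k=2)
print(top)
```

The query shares no keyword with the document ids, yet the nearest vectors are still found, which is the vocabulary-mismatch argument in miniature.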

Supported embedding providers (via LangChain4j):

| Provider | Popular Model | Dimensions |
|---|---|---|
| OpenAI | text-embedding-3-small | 1536 |
| Mistral AI | mistral-embed | 1024 |
| Cohere | embed-v3 | 1024 |
| HuggingFace | Various open-source models | variable |

Hybrid Search: BM25 + Vector  

The most powerful configuration combines classic BM25 ranking with vector re-ranking:

This way, you benefit from exact keyword matches and semantic similarity.
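One way to express such a combination is Solr's re-rank query with `knn_text_to_vector` as the re-ranking query. The Python sketch below only assembles the request parameters and sends nothing. The model name `my-embedding-model`, the field name `vector`, the `qf` fields, and the numeric defaults are assumptions; verify the parser syntax against the Reference Guide for your Solr version.

```python
def hybrid_params(
    user_query: str,
    model: str = "my-embedding-model",   # hypothetical name in the model store
    vector_field: str = "vector",        # hypothetical dense vector field
    top_k: int = 50,
    rerank_docs: int = 50,
    rerank_weight: float = 2.0,
) -> dict:
    """Keyword query first, then re-rank the top hits by vector similarity."""
    return {
        # First pass: classic BM25 keyword matching via edismax.
        "q": user_query,
        "defType": "edismax",
        "qf": "title content",
        # Second pass: re-score the best first-pass hits with the query in $rqq.
        "rq": (
            f"{{!rerank reRankQuery=$rqq reRankDocs={rerank_docs} "
            f"reRankWeight={rerank_weight}}}"
        ),
        # Solr embeds the text itself and runs the KNN search.
        "rqq": (
            f"{{!knn_text_to_vector model={model} f={vector_field} "
            f"topK={top_k}}}{user_query}"
        ),
    }


params = hybrid_params("opening hours")
for key, value in params.items():
    print(f"{key} = {value}")
```

Keeping `topK` at least as large as `reRankDocs` is a deliberate choice in this sketch: only documents that appear in both the keyword results and the KNN results receive a vector score.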

Use Cases  

| Use Case | Approach |
|---|---|
| Intelligent page search | knn_text_to_vector finds semantically similar content |
| Similar articles | K-Nearest Neighbours on the vector of a document |
| Multilingual bridge | Embeddings match across languages |
| FAQ matching | Match user questions to FAQ answers despite different phrasing |
| RAG retrieval | Solr as the retrieval layer for LLM-supported Q&A systems |

Consider Costs and Data Privacy

Vectorisation typically runs via external embedding APIs (configured in the Solr model store) – it is not fully 'local' without providing your own model. Take API costs into account, check data protection requirements, and plan fallbacks for API outages.

EXT:solr and Vectors

Native end-to-end support (Index Queue → vector fields → Fluid) in EXT:solr is still evolving. For custom fields, PSR-14 listeners (e.g., BeforeDocumentsAreIndexedEvent) combined with a custom embedding service are suitable – see TYPO3-Solr/ext-solr.


Troubleshooting in 6 Steps  

If the search is not working, the problem lies in one of four areas: Connection, Indexing, Solr Core, or Frontend. Proceed systematically.

Diagnosis Flowchart  

Systematic troubleshooting: From the connection to the frontend
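The idea of checking the four areas in a fixed order can be sketched in Python. The helper names are hypothetical and the check results are supplied by hand here; in practice each value would come from a real check, for example an HTTP request to the core's `/admin/ping` handler for the connection.

```python
from typing import Optional

# Order taken from the text: work from the connection towards the frontend.
AREAS = ["Connection", "Indexing", "Solr Core", "Frontend"]


def ping_url(core_base_url: str) -> str:
    """URL of Solr's ping handler for a core, e.g. for a connection check."""
    return core_base_url.rstrip("/") + "/admin/ping"


def first_failing_area(results: dict) -> Optional[str]:
    """Return the first area whose check failed, or None if all passed.

    Areas missing from `results` count as failed, so an unchecked area
    is never silently skipped.
    """
    for area in AREAS:
        if not results.get(area, False):
            return area
    return None


# Example: Solr is reachable, but nothing has been indexed yet.
results = {"Connection": True, "Indexing": False, "Solr Core": True, "Frontend": True}
print(ping_url("http://localhost:8983/solr/core_en/"))
print(first_failing_area(results))
```

Stopping at the first failure keeps the diagnosis honest: a frontend check is meaningless as long as the index is empty.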

Common Errors and Solutions  

| Symptom | Likely Cause | Solution |
|---|---|---|
| "Search not available" | Solr connection misconfigured | Check site configuration, initialise connection |
| No results | Site hash mismatch | Check siteHashStrategy, reindex |
| Empty Index Queue | TypoScript not loaded | Include static templates on root page |
| Queue items are not indexed | Scheduler task missing | Create "Index Queue Worker" task |
| Facets not visible | faceting = 0 or empty field | Enable faceting, check field values |
| Docker: Permission denied | Volume ownership incorrect | UID 8983 must own /var/solr |
| Jar loading error (Solr 9.8+) | CVE-2025-24814 migration required | Move typo3lib/ to server root or use Docker 13.0.1+ |

Activating Logging  

For deeper diagnosis, activate TypoScript logging (only dev/staging):

And configure the log file in config/system/additional.php:

Deactivate Logging in Production

Logging generates a considerable amount of data and impairs performance. Remove plugin.tx_solr.logging and the writerConfiguration from additional.php before going live.


PSR-14 Events for Custom Extensions  

EXT:solr triggers PSR-14 events at central points, which you can use for your own logic.

Adding Custom Fields to Documents  

Important Events at a Glance  


TYPO3 v14: Changes for Developers  

If you are upgrading EXT:solr to TYPO3 v14 / Fluid 5.0, the following points, among others, are relevant according to the TYPO3 Solr Skill (SKILL.md) and the upstream project:

  • Fluid 5.0: Search templates and facet partials require strictly typed ViewHelper arguments, no underscore-prefixed template variables – check your custom overrides.
  • TypoScriptFrontendController: Custom indexers or plugins that use $GLOBALS['TSFE'] must migrate to request attributes / the new frontend simulation.
  • TCA: ctrl.searchFields is dropped in v14 in favour of 'searchable' => true per column (primarily affects backend search, not Index Queue field mappings).

Production Checklist  

  • Configset version matches (ext_solr_13_1_0)
  • JVM memory configured (SOLR_JAVA_MEM=-Xms512m -Xmx1g)
  • Solr not publicly accessible – firewall or reverse proxy
  • Scheduler task "Index Queue Worker" runs regularly
  • Logging deactivated in production
  • Solr data volume is backed up regularly
  • CVE patches applied (Solr 9.8+ Jar migration CVE-2025-24814, Tika Server 3.2.3+ for EXT:tika 13.1)
  • siteHashStrategy = 1 (site identifier, recommended since EXT:solr 13.0)
  • One core per language with language-specific schema (e.g., core_de, core_en)

Conclusion  

EXT:solr elevates TYPO3 search to an enterprise level – with facets, autocomplete, highlighting, and, since Solr 9.8, semantic vector search. The extension is mature, actively maintained by dkd Internet Service, and locally ready to use with DDEV in just a few minutes.

The three key takeaways:

  1. EXT:solr indexes everything – pages, news, products, files. The Index Queue and PSR-14 events make integration flexible and extensible.
  2. Vector search on the Solr server – The language-models module (Text-to-Vector, e.g., knn_text_to_vector) in Solr 9.8+ brings semantic search to the Solr core; check the EXT:solr integration for all steps in the current release.
  3. The setup is straightforward – install the DDEV add-on, configure the cores, create the scheduler task. Done.

A rule of thumb for deciding between the standard search and a dedicated search infrastructure: as soon as you have more than a few hundred pages, need to make structured data searchable, or require facets, there is no getting around Solr.

For Developers

TYPO3 Solr Skill

Detailed reference for EXT:solr — Index Queue, search, debugging, language-models/vector search. Main file SKILL.md; additions SKILL-FRONTEND.md (e.g., Suggest without jQuery) and SKILL-SOLRFAL.md (file index). Open-source Agent Skill for Claude, Cursor, and VS Code.

Open on GitHub


Parts of this content were created with the assistance of AI.