
Elasticsearch: Scaling Search Beyond Relational Databases

At certain scale, traditional relational databases hit limits with search speed and relevance. That's when teams should consider adding a search-optimized database like Elasticsearch to the stack.

Sushant Dhimal · Covosys · 29th December 2025

In the early days of a software product, a simple backend built on top of a relational SQL database is usually enough to handle nearly 90 percent of the use cases the system needs to support.

As the product matures and more users come on board, the dataset that once contained a few thousand rows grows into hundreds of thousands or even millions of records. Queries that were once simple lookups evolve into complex joins and filters, and application performance starts to slow down.

Eventually, the application hits a point where the SQL database can no longer handle the load, and the team needs a solution optimized for search. Elasticsearch is the industry standard for search and analytics, and in this post I will walk through its capabilities and where it fits best.

The Search Problem Most Applications Eventually Hit

Most applications begin with SQL, whether that is PostgreSQL, MySQL, or SQL Server. These databases are extremely good at what they are designed for: ACID transactions and complex joins across normalized schemas.

Early implementation of search with SQL uses a pattern like this:

SELECT *
FROM products
WHERE name LIKE '%headphone%';

With a small dataset, this works well enough. As the number of rows and table joins increase, the queries start to evolve into something like:

SELECT *
FROM products
WHERE name ILIKE '%wireless headphone%'
   OR description ILIKE '%wireless headphone%'
   OR category ILIKE '%wireless headphone%'
ORDER BY
  CASE
    WHEN name ILIKE '%wireless headphone%' THEN 1
    WHEN category ILIKE '%wireless headphone%' THEN 2
    ELSE 3
  END,
  popularity DESC
LIMIT 20;

You can see there are several problems with this approach:

  • Leading wildcards force full table scans – indexes are effectively bypassed and performance degrades linearly with dataset size.
  • The search is brittle – any typo, pluralization difference, or variation in wording leads to empty results, even when relevant data exists.
  • Relevance ranking becomes ad hoc – implemented through increasingly complex CASE statements that are hard to reason about and harder to tune.
  • Semantic relationships are absent – synonyms or related terms are completely missing, which directly impacts discoverability.

PostgreSQL’s built-in full-text search can help bridge the gap, especially compared to basic LIKE queries, but it still has practical limits. It works well for simple keyword matching, yet starts to show strain when users expect forgiving searches that handle typos, rank results in a way that actually feels relevant, and respond instantly even as the dataset grows into the millions of rows.

What Elasticsearch Actually Is

The major thing to note is that Elasticsearch is not a database replacement. You still need a regular database as the system of record; Elasticsearch sits on top of it. It is a distributed search engine built specifically for querying large volumes of text efficiently and intelligently.

At its core, Elasticsearch uses an inverted index. Instead of scanning documents to find matching text, it indexes every term and maintains a mapping from each term to the documents in which it appears. This structure allows it to answer complex text queries extremely quickly, even across millions of documents.
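To make the inverted index idea concrete, here is a deliberately toy Python sketch. Real engines like Lucene add tokenization, stemming, relevance scoring, and compressed postings lists, but the core lookup works the same way:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

docs = {
    1: "wireless headphone with noise cancellation",
    2: "wired headphone",
    3: "wireless speaker",
}
index = build_inverted_index(docs)

# A query for "wireless headphone" intersects two small postings sets
# instead of scanning every document's text.
hits = index["wireless"] & index["headphone"]
```

The intersection is why lookups stay fast as the corpus grows: the work is proportional to the size of the postings lists for the query terms, not the total number of documents.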

This architectural difference is why Elasticsearch excels at tasks that are fundamentally hard for relational databases. It breaks text into terms and builds an index that lets it jump directly to relevant documents instead of scanning rows one by one. Because of this, things like full-text search, relevance ranking, typo tolerance, and aggregations perform exceptionally well in Elasticsearch.

Translating the Same Search Into Elasticsearch

The earlier SQL example becomes significantly simpler and more expressive when modeled as a search query:

GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "wireless headphone",
      "fields": ["name^3", "category^2", "description"],
      "fuzziness": "AUTO"
    }
  }
}

This single query expresses several important ideas clearly and declaratively. It searches across multiple fields simultaneously, assigns explicit relevance weights to each field, and tolerates minor spelling errors automatically. Relevance scoring is handled by the engine itself using well-established information retrieval algorithms rather than handcrafted SQL logic.

In practice, this often translates to response times dropping from seconds to tens of milliseconds, while result quality improves noticeably from a user's perspective.
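If you build query bodies from application code, a small helper keeps the field boosts in one place. A minimal Python sketch, where the fields and boosts (name^3, category^2) are just this article's example schema, not a prescribed layout:

```python
def build_product_search(query_text, fuzzy=True):
    """Build the multi_match body from the example above.

    Field names and boosts mirror the article's example products index.
    """
    body = {
        "query": {
            "multi_match": {
                "query": query_text,
                "fields": ["name^3", "category^2", "description"],
            }
        }
    }
    if fuzzy:
        # AUTO picks an edit distance based on term length
        body["query"]["multi_match"]["fuzziness"] = "AUTO"
    return body
```

The returned dict is exactly the JSON you would send to the /products/_search endpoint, whether via an HTTP client or the official Elasticsearch client library.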

Where Elasticsearch Is a Natural Fit

Search-Heavy Applications

Product catalogs, document repositories, internal knowledge bases, and developer tooling all rely on fast, forgiving search that behaves more like a search engine and less like a strict database filter.

Log Aggregation and Observability

In distributed systems with many services emitting logs, storing and querying logs in a relational database quickly becomes impractical. Elasticsearch, combined with Logstash and Kibana, allows engineers to search across massive log volumes in near real time, which is critical during incident response.

Real-Time Analytics

Elasticsearch often outperforms traditional databases for read-heavy workloads. Time-based aggregations, rolling metrics, and dashboards that update continuously are core use cases, not edge cases.
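As an illustration, a date_histogram aggregation with a nested avg is the typical shape of a rolling dashboard metric. A hedged sketch of such a request body, where timestamp and response_ms are hypothetical field names for this example:

```python
def rolling_latency_agg(interval="1m"):
    """Average latency per time bucket; field names are illustrative."""
    return {
        "size": 0,  # we only want the buckets, not the raw documents
        "aggs": {
            "per_interval": {
                "date_histogram": {
                    "field": "timestamp",
                    "fixed_interval": interval,
                },
                "aggs": {
                    "avg_latency": {"avg": {"field": "response_ms"}}
                },
            }
        },
    }
```

Setting size to 0 skips document retrieval entirely, so the cluster only computes and returns the buckets, which is what keeps continuously refreshing dashboards cheap.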

Geospatial Search

Queries like "entities within a radius, ranked by relevance and secondary metrics" are straightforward to express and efficient to execute without specialized extensions.
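For example, a geo_distance filter combined with a text match covers the "nearby, ranked by relevance" case. A sketch of the query body, where location (a geo_point) and name are hypothetical fields on the example index:

```python
def nearby_query(lat, lon, radius_km, text=None):
    """Filter to a radius; optionally rank by text relevance within it."""
    query = {
        "bool": {
            "filter": [
                {
                    # Filters don't affect scoring, only inclusion
                    "geo_distance": {
                        "distance": f"{radius_km}km",
                        "location": {"lat": lat, "lon": lon},
                    }
                }
            ]
        }
    }
    if text:
        # The match clause contributes the relevance score
        query["bool"]["must"] = [{"match": {"name": text}}]
    return {"query": query}
```

Putting the radius in a filter clause means it is cacheable and scored results inside the radius are still ranked by text relevance.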

How It Compares to SQL Databases

Relational databases remain the correct choice for transactional workloads, strong consistency guarantees, and complex relational queries. Elasticsearch trades some of those guarantees for speed and flexibility in search-oriented workloads.

In most mature systems, the two coexist. PostgreSQL or MySQL acts as the system of record, while Elasticsearch provides optimized search and analytics on a denormalized view of the data.

Use Case            | SQL                   | Elasticsearch
Transactions & ACID | Excellent             | Not designed for this
Complex JOINs       | Native support        | Limited, denormalize instead
Full-text search    | Basic support         | Purpose-built
Fuzzy matching      | Very limited          | Excellent
Real-time analytics | Possible with tuning  | Optimized
Log analysis        | Not recommended       | Industry standard

Lessons Learned From Using Elasticsearch in Production

  • Elasticsearch should never be treated as the primary source of truth. It is not ACID-compliant, and failures or inconsistencies are handled differently than in relational systems. Data should flow into Elasticsearch from a reliable backend, whether through change data capture or application-level events.
  • Schema design matters more than many teams expect. Changing field mappings later requires reindexing, which can be expensive at scale. Planning index structure early saves significant operational pain.
  • Memory usage must be monitored carefully. Elasticsearch is sensitive to heap pressure, and poorly tuned clusters can fail under load in ways that are difficult to debug during peak traffic.
  • Sharding should be approached conservatively. Over-sharding is a common mistake and often leads to worse performance rather than better scalability.
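One lightweight version of the "data should flow in from a reliable backend" point is application-level events: commit to the primary store first, then publish a change event that a consumer applies to the search index. A minimal Python sketch, with plain dicts and an in-memory queue standing in for a real database and message broker:

```python
import queue

events = queue.Queue()  # stands in for Kafka, SQS, or similar

def save_product(db, product):
    """Write to the system of record first, then emit a change event."""
    db[product["id"]] = product            # primary-store commit
    events.put(("upsert", product["id"]))  # indexing happens async

def index_worker(db, search_index):
    """Consumer that mirrors committed changes into the search index."""
    while not events.empty():
        op, pid = events.get()
        if op == "upsert":
            # Re-read from the source of truth, never from the event
            search_index[pid] = db[pid]

db, search_index = {}, {}
save_product(db, {"id": 1, "name": "wireless headphone"})
index_worker(db, search_index)
```

In production the worker would batch updates through the Elasticsearch bulk API, but the ordering is the part that matters: the database commit always happens before the index update, so the index can lag but never invent data.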

Deciding Whether You Actually Need It

Elasticsearch is not a default dependency. It should be introduced only when there is a clear mismatch between what SQL databases do well and what the application requires. Search speed, tolerance for imperfect input, relevance quality, and analytics latency are the usual drivers.

When evaluating it, testing with realistic data volumes is essential. Performance characteristics change significantly as data grows, and assumptions based on small datasets rarely hold in production.

The Trade-offs You Should Know About

Elasticsearch is powerful, but it is not without drawbacks. Before adopting it, teams should be aware of the operational and architectural costs that come with it.

  • Operational complexity is real. Running Elasticsearch in production requires monitoring heap usage, managing cluster health, handling shard allocation, and planning for node failures. It is not a set-and-forget system.
  • Resource consumption is high. Elasticsearch is memory-intensive. A cluster that seems fine during development can struggle under production load if JVM heap and disk I/O are not sized correctly.
  • Data synchronization adds complexity. Since Elasticsearch is not your source of truth, you need a reliable mechanism to keep it in sync with your primary database. This introduces latency, potential consistency issues, and another failure point to monitor.
  • No transaction support. Elasticsearch does not support multi-document transactions. If your use case requires atomic updates across multiple records, you will need to handle that at the application layer.
  • Schema changes can be painful. Unlike relational databases where you can alter columns, changing field mappings in Elasticsearch often requires reindexing the entire dataset, which can take hours or days at scale.
  • Licensing considerations. While Elasticsearch is open source, some features are only available under paid licenses. Additionally, the licensing model has changed over time, which may affect long-term planning.

These trade-offs do not make Elasticsearch a bad choice. They simply mean it should be adopted deliberately, with a clear understanding of what you are signing up for operationally.

Final Thoughts

Relational databases are still the backbone of almost every system I have worked on, and I do not see that changing anytime soon. PostgreSQL is my go-to database, and for most use cases it is a great choice and enough on its own.

That said, after working on multiple products and client projects at Covosys, I have repeatedly seen search become a pain point as systems grow. At a certain scale, queries become increasingly complex yet still fall short of real search needs. In those situations, Elasticsearch has consistently been the right tool. I keep PostgreSQL as the source of truth and use Elasticsearch as a search layer on top of it.

Two common issues I’ve encountered are the need to reindex whenever field types or mappings change, and keeping Elasticsearch in sync with the primary database. At Covosys, we handled syncing using change data capture tools like Debezium, though each team is free to choose the tool that fits their architecture best. For the reindexing issue, I suggest putting an alias in front of the existing index, creating a new index with the new mapping, and then switching the alias over. This gives you zero downtime, and you can keep the old index around for a while to be safe.
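An alias switch like this can be done atomically through Elasticsearch's _aliases API, so readers never observe a moment when the alias points at nothing. A sketch of the request payload, with placeholder index and alias names:

```python
def alias_switch_actions(alias, old_index, new_index):
    """Build a single _aliases request that removes the alias from the
    old index and adds it to the new one; both actions apply atomically."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

# After reindexing products_v1 into products_v2 with the new mapping:
actions = alias_switch_actions("products", "products_v1", "products_v2")
```

Because queries target the alias rather than a concrete index name, application code never changes, and products_v1 can be kept around until you are confident the new mapping behaves correctly.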

Got questions about search or data at scale?

Happy to chat if you're figuring out whether Elasticsearch makes sense for your use case.