Mohamed Arbi Nsibi
Qdrant 1.18: Faster, Smarter, and Easier Vector Search

May 19, 2026 · 5 min
Dynamic schema updates, TurboQuant compression, better monitoring, and stronger operational safeguards for production vector search.

1. A More Flexible Future for Vector Databases

Qdrant 1.18 is a big step toward making vector databases more flexible and easier to manage in production. Instead of treating vector indexes as something rigid that needs rebuilding every time you change them, Qdrant is moving toward a more dynamic system that can adapt while staying online.

For teams running AI systems at scale, this means better scalability, better monitoring, and fewer painful maintenance operations.

This release focuses on five major improvements:

One of the biggest improvements starts with solving a long-time issue in vector databases: re-indexing.


2. Dynamic Updates Without Rebuilding Collections

In AI applications, embedding models change often. Before Qdrant 1.18, updating a collection’s vector setup usually meant recreating the collection and re-uploading all data.

Now, Qdrant allows adding or removing named vectors directly using the PATCH API or schema endpoints, without downtime.

This means your collection structure can evolve together with your application while the cluster keeps running normally.

Real-World Benefit

A common use case is migrating to a new embedding model.

You can now:

All without shutting down the cluster or re-importing everything.

As collections become more dynamic, monitoring resource usage becomes even more important.

To add a new dense named vector to an existing collection:

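from qdrant_client import QdrantClient, models

# Assumes a Qdrant instance at the default local address; adjust the URL for your deployment.
client = QdrantClient(url="http://localhost:6333")

client.create_vector_name(
    collection_name="{collection_name}",
    vector_name="{vector_name}",
    vector_name_config=models.DenseVectorNameConfig(
        dense=models.DenseVectorConfig(
            size=256,
            distance=models.Distance.COSINE,
        ),
    ),
)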

To add a new sparse named vector to an existing collection:

client.create_vector_name(
    collection_name="{collection_name}",
    vector_name="{vector_name}",
    vector_name_config=models.SparseVectorNameConfig(
        sparse=models.SparseVectorConfig(
            modifier=models.Modifier.IDF,
        ),
    ),
)
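
Once a new named vector exists, existing points still need embeddings from the new model. The sketch below shows one way to prepare that backfill as batched request bodies for Qdrant's update-vectors REST endpoint (PUT /collections/{collection_name}/points/vectors). The helper name backfill_batches, the vector name "text-v2", the batch size, and the sample embeddings are all assumptions made for illustration, and actually sending the requests is left out.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def backfill_batches(
    embeddings: Iterable[Tuple[int, List[float]]],
    vector_name: str,
    batch_size: int = 2,
) -> Iterator[Dict]:
    """Yield request bodies for PUT /collections/{collection_name}/points/vectors.

    Each body sets only the given named vector on existing points, so other
    vectors and payloads on those points are left untouched.
    """
    batch: List[Dict] = []
    for point_id, vector in embeddings:
        batch.append({"id": point_id, "vector": {vector_name: vector}})
        if len(batch) == batch_size:
            yield {"points": batch}
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield {"points": batch}


# Hypothetical output of the new embedding model: (point id, new vector).
new_embeddings = [(1, [0.1, 0.2]), (2, [0.3, 0.4]), (3, [0.5, 0.6])]

bodies = list(backfill_batches(new_embeddings, vector_name="text-v2"))
print(json.dumps(bodies[0], indent=2))
```

Batching keeps each request small, so the backfill can run in the background while the old named vector continues to serve queries.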

3. Better Memory Monitoring and Observability

Memory usage in databases is often difficult to understand because operating system metrics don’t clearly show what is using RAM.

[Figure: Memory Monitoring Dashboard]

Qdrant 1.18 improves this with detailed memory monitoring for each collection component. These metrics are available both in the Web UI and through an API endpoint.

Memory Breakdown by Component

| Component | What It Tracks |
|---|---|
| Vectors | Dense vectors, HNSW indexes, quantization data, RAM usage, cached pages |
| Sparse Vectors | Sparse vector storage and indexing memory |
| Payload | Metadata storage attached to vectors |
| Payload Index | Memory used for metadata indexes |
| ID Tracker | Mapping between external IDs and internal IDs |

Understanding “Expected Cache”

Modern operating systems use free RAM as file cache to improve performance. When vectors are stored with on_disk: true, the OS automatically keeps frequently used data in memory.

The Expected Cache metric shows how much data should ideally stay cached for fast searches.

If the Cached value is high, that’s usually a good sign, not a memory leak.

Comparing these values helps determine whether your system is properly warmed up for production traffic.
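
As a rough illustration of that comparison, the sketch below turns the two values into a single warm-up ratio. The function name, the byte-valued inputs, and the 90% threshold are assumptions for this example; they are not field names or defaults from the Qdrant API.

```python
def cache_warmup_ratio(cached_bytes: int, expected_cache_bytes: int) -> float:
    """Return the fraction of the expected cache that is currently resident.

    The result is capped at 1.0, because caching more than expected does not
    make the collection "more than warm".
    """
    if expected_cache_bytes <= 0:
        return 1.0  # nothing is expected to be cached, so nothing is missing
    return min(cached_bytes / expected_cache_bytes, 1.0)


GIB = 1024 ** 3

# Hypothetical readings for one collection.
ratio = cache_warmup_ratio(cached_bytes=3 * GIB, expected_cache_bytes=4 * GIB)
print(f"{ratio:.0%} of the expected cache is resident")  # prints "75% ..."

WARM_THRESHOLD = 0.9  # assumed cut-off; tune it to your own latency targets
is_warm = ratio >= WARM_THRESHOLD
```

A low ratio right after a restart is normal; it should climb as queries pull pages back into the OS cache.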

For a deeper technical explanation, see my previous post:

TurboQuant: the compression Shannon would approve


4. TurboQuant: Better Compression Without Losing Too Much Accuracy

Before this release, there was a large gap between Scalar Quantization and Binary Quantization in terms of compression and recall quality.

TurboQuant fills that gap.

[Figure: TurboQuant Compression Performance]

It offers compression between 8x and 32x while still keeping strong recall performance.
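
To put the 8x to 32x range in concrete terms, here is a back-of-the-envelope estimate. The collection size (1 million vectors of 1,536 float32 dimensions) is an assumed example, and the calculation covers raw vector storage only, ignoring HNSW indexes, payloads, and any metadata the quantizer itself stores.

```python
def vector_storage_bytes(
    num_vectors: int,
    dim: int,
    compression: float = 1.0,
    bytes_per_component: int = 4,  # float32
) -> float:
    """Estimate raw vector storage in bytes at a given compression factor."""
    return num_vectors * dim * bytes_per_component / compression


NUM_VECTORS = 1_000_000  # assumed collection size
DIM = 1536               # assumed embedding dimension

uncompressed = vector_storage_bytes(NUM_VECTORS, DIM)
at_8x = vector_storage_bytes(NUM_VECTORS, DIM, compression=8)
at_32x = vector_storage_bytes(NUM_VECTORS, DIM, compression=32)

for label, size in [("float32", uncompressed), ("8x", at_8x), ("32x", at_32x)]:
    print(f"{label:>8}: {size / 1e9:.2f} GB")
```

Under these assumptions, about 6.14 GB of float32 vectors shrinks to roughly 0.77 GB at 8x and 0.19 GB at 32x.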

TurboQuant works using:

One important advantage is that the rotation step preserves distances, so queries stay efficient without requiring expensive reverse operations.
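
The distance-preserving property is easy to check numerically. The sketch below is not TurboQuant's actual rotation; it builds a generic rotation from random plane (Givens) rotations and shows that the Euclidean distance between two vectors is unchanged when both are rotated the same way. That is why a query can be rotated once and compared directly in the rotated space. All names here are illustrative.

```python
import math
import random
from typing import List, Tuple

Step = Tuple[int, int, float]  # (axis i, axis j, angle in radians)


def random_rotation_steps(dim: int, num_steps: int, rng: random.Random) -> List[Step]:
    """Describe a rotation as a sequence of random plane rotations."""
    steps = []
    for _ in range(num_steps):
        i, j = rng.sample(range(dim), 2)  # two distinct axes
        steps.append((i, j, rng.uniform(0.0, 2.0 * math.pi)))
    return steps


def rotate(vec: List[float], steps: List[Step]) -> List[float]:
    """Apply each plane rotation in order; every step is orthogonal."""
    out = list(vec)
    for i, j, theta in steps:
        c, s = math.cos(theta), math.sin(theta)
        out[i], out[j] = c * out[i] - s * out[j], s * out[i] + c * out[j]
    return out


def euclidean(a: List[float], b: List[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(0)
dim = 8
a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
b = [rng.gauss(0.0, 1.0) for _ in range(dim)]
steps = random_rotation_steps(dim, num_steps=32, rng=rng)

before = euclidean(a, b)
after = euclidean(rotate(a, steps), rotate(b, steps))
print(f"distance before: {before:.6f}, after: {after:.6f}")
```

The two printed distances agree up to floating-point rounding, so no inverse rotation is needed at query time.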

Practical Results

In some datasets, TurboQuant improved recall by more than 20 percentage points compared to standard binary quantization.

Performance Optimizations

TurboQuant also uses low-level SIMD optimizations to keep searches fast even with heavy compression.

This allows Qdrant to reduce memory usage while still maintaining strong throughput.

For a full technical breakdown of TurboQuant and its math, check:

https://mohamedarbi.xyz/posts/turboquant-qdrant


5. Better Security and Infrastructure Protection

Qdrant 1.18 also adds several features focused on operational safety and debugging.

Query Audit Logs

Administrators can now search JSON audit logs across the cluster using filters such as:

This simplifies security reviews and troubleshooting.

Request Tracing

Support for headers like:

makes it easier to connect client requests with server logs during debugging.

Strict Mode Improvements

New protections help avoid resource exhaustion:

Per-Collection Metrics

The new ?per_collection=true parameter helps identify latency spikes for specific collections in multi-tenant environments.


6. Conclusion

Qdrant 1.18 makes vector search systems more flexible, observable, and production-ready.

With dynamic schema updates, improved monitoring, TurboQuant compression, and stronger operational safeguards, this release helps teams scale AI applications more efficiently.

Upgrade Path

Recommended upgrade order:

Cloud Users

Select version 1.18 from the Qdrant Cloud dashboard for automatic migration.

Self-Hosted Users

Perform rolling node restarts and verify each node before moving to the next one to maintain availability.
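
The restart-then-verify loop can be sketched as follows. This is a minimal outline, not an official upgrade tool: the restart and is_ready callbacks, the node names, and the timeout values are placeholders you would replace with your own orchestration and health checks.

```python
import time
from typing import Callable, Dict, Iterable, List


def rolling_restart(
    nodes: Iterable[str],
    restart: Callable[[str], None],
    is_ready: Callable[[str], bool],
    timeout_s: float = 300.0,
    poll_interval_s: float = 5.0,
) -> List[str]:
    """Restart nodes one at a time, waiting for each to be ready before continuing.

    Raises TimeoutError and stops the rollout if a node does not become ready,
    so the remaining nodes keep serving traffic on the old version.
    """
    upgraded: List[str] = []
    for node in nodes:
        restart(node)
        deadline = time.monotonic() + timeout_s
        while not is_ready(node):
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{node} not ready after {timeout_s}s; stopping rollout")
            time.sleep(poll_interval_s)
        upgraded.append(node)
    return upgraded


# Stand-in callbacks so the sketch runs without a cluster.
restarted: List[str] = []
checks: Dict[str, int] = {}


def fake_restart(node: str) -> None:
    restarted.append(node)


def fake_is_ready(node: str) -> bool:
    checks[node] = checks.get(node, 0) + 1
    return checks[node] >= 2  # pretend each node is ready on the second check


done = rolling_restart(
    ["node-0", "node-1", "node-2"], fake_restart, fake_is_ready, poll_interval_s=0.0
)
print(done)
```

In a real deployment, is_ready could poll each node's readiness endpoint (Qdrant exposes /readyz) and restart could call your container or service manager.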

Overall, Qdrant 1.18 gives developers more flexibility to evolve their vector search systems without sacrificing stability or performance.
