Using Solr for reads and a database for writes is a common pattern in high-traffic applications. It is a form of Command Query Responsibility Segregation (CQRS) https://learn.microsoft.com/en-us/azure/architecture/patterns/cqrs, in which commands (writes) and queries (reads) are handled by separate models, here by separate data stores.
Solr for Reading, Database for Writing
- Writing (Command): All data modifications (create, update, delete) go to the main database, for example PostgreSQL, MySQL, or MongoDB.
- Reading (Query): Read operations are served by Solr https://solr.apache.org/, a fast, open-source search platform.
How it works
Data Flow
- When data is written to the database, it's also indexed in Solr.
- Indexing can happen in real time (on every write) or as a periodic batch process.
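The two indexing paths above can be sketched as follows. This is a minimal illustration, not a production design: SQLite stands in for the main database, and `FakeSolr` is a hypothetical in-memory stand-in for a Solr collection (a real client would POST the same documents as JSON to the collection's `/update` endpoint). The table and field names are assumptions.

```python
import sqlite3


class FakeSolr:
    """Hypothetical in-memory stand-in for a Solr collection."""

    def __init__(self):
        self.docs = {}

    def add(self, docs):
        # A real client would send these documents to /solr/<collection>/update.
        for doc in docs:
            self.docs[doc["id"]] = doc


def save_product(db, index, product):
    """Real-time path: write to the database, then index the same record."""
    with db:  # commits on success, rolls back if the INSERT raises
        db.execute(
            "INSERT OR REPLACE INTO products (id, name, price) VALUES (?, ?, ?)",
            (product["id"], product["name"], product["price"]),
        )
    index.add([product])


def reindex_all(db, index):
    """Batch path: rebuild the index from what the database currently holds."""
    rows = db.execute("SELECT id, name, price FROM products").fetchall()
    index.add([{"id": r[0], "name": r[1], "price": r[2]} for r in rows])
    return len(rows)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id TEXT PRIMARY KEY, name TEXT, price REAL)")
solr = FakeSolr()
save_product(db, solr, {"id": "p1", "name": "Red shoes", "price": 39.99})
```

The database write happens first, so the database stays the source of truth; if the indexing step fails, a later `reindex_all` run can repair the index.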
Querying
- Most read operations go to Solr instead of the database.
- Complex searches and filters are handled by Solr.
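As a rough sketch of what such a read looks like, the snippet below builds a request URL for Solr's `/select` handler using standard query parameters (`q`, `fq`, `facet`, `facet.field`, `rows`). The host, the collection name `products`, and the field names `name`, `description`, `price`, `category`, and `brand` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_search_url(base_url, text, max_price, facet_fields):
    """Build a Solr /select URL for a text search with a price filter and facets."""
    params = [
        ("q", text),
        ("defType", "edismax"),       # query parser suited to user-typed text
        ("qf", "name description"),   # fields to search (assumed schema)
        # Range filter: "[" is inclusive, "}" is exclusive, so price < max_price.
        ("fq", "price:[* TO " + str(max_price) + "}"),
        ("facet", "true"),
        ("rows", "20"),
    ]
    # One facet.field parameter per field used for faceted navigation.
    params += [("facet.field", field) for field in facet_fields]
    return base_url + "/select?" + urlencode(params)


url = build_search_url(
    "http://localhost:8983/solr/products", "red shoes", 50, ["category", "brand"]
)
```

Putting the price limit in `fq` rather than in `q` keeps it out of relevance scoring and lets Solr cache the filter separately.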
Example scenario
An e-commerce site during a Black Friday sale:
Writing:
- New orders are written to the PostgreSQL database.
- Product inventory updates are made in the database.
Reading:
- Product searches are handled by Solr.
- Faceted navigation (filtering by category, price, etc.) uses Solr.
- Product details are served from Solr.
Benefits
- Performance: A search for "red shoes under $50" across millions of products can be much faster in Solr than in a traditional database.
- Scalability: You can easily add more Solr nodes to handle increased read traffic without affecting the write database.
- Complex Queries: Solr excels at full-text search, faceting, and geospatial queries, which can be challenging for traditional databases.
- Reduced Database Load: During peak times, your database can focus on critical write operations, such as processing orders, while Solr handles the bulk of read queries.
- Flexibility: You can optimize your Solr schema for fast reading without worrying about write performance.
Challenges
- Data Synchronization
- Eventual Consistency
  - A just-placed order might not appear in search results immediately if there's a delay in indexing.
  - Solution: Implement a "write-through" cache (when data is written to the database, also write it directly to Solr) or use versioning (when serving search results, check if the Solr version matches the database version; if not, fetch from the database and update Solr).
- Increased Complexity
  - Your system now has two data stores to manage and keep in sync.
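The versioning approach mentioned under Eventual Consistency might look roughly like this. It is a sketch under stated assumptions: plain dictionaries stand in for Solr and the database, and every record carries a hypothetical integer `version` field that the database increments on each write.

```python
def fetch_product(product_id, index, database):
    """Serve from the index unless it is older than the database copy."""
    indexed = index.get(product_id)
    current = database[product_id]  # in practice, a cheap primary-key lookup
    if indexed is None or indexed["version"] < current["version"]:
        # The index is stale or missing the record: repair it and
        # serve the authoritative database copy.
        index[product_id] = dict(current)
        return current
    return indexed


database = {"p1": {"id": "p1", "stock": 4, "version": 2}}
index = {"p1": {"id": "p1", "stock": 5, "version": 1}}
product = fetch_product("p1", index, database)
```

The version check costs one database read per request, so it is best reserved for data where staleness matters, such as stock levels, rather than applied to every search result.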
During peak sales, you can also buffer writes: accept orders into Redis https://redis.io/ first, then persist them asynchronously to the database and Solr.
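A minimal sketch of this buffering idea is shown below. A `collections.deque` stands in for a Redis list (with real Redis, the equivalent would be `LPUSH` on the web tier and `RPOP` or `BRPOP` in a worker), and dictionaries stand in for the database and Solr. The function names are hypothetical.

```python
from collections import deque


def accept_order(buffer, order):
    """Fast path: enqueue the order and return immediately."""
    buffer.appendleft(order)  # analogous to LPUSH on a Redis list
    return "accepted"


def drain(buffer, database, index):
    """Worker path: persist buffered orders, oldest first."""
    persisted = 0
    while buffer:
        order = buffer.pop()  # analogous to RPOP: takes the oldest entry
        database[order["id"]] = order  # write to the source of truth first
        index[order["id"]] = order     # then make the order searchable
        persisted += 1
    return persisted


buffer, database, index = deque(), {}, {}
accept_order(buffer, {"id": "o1", "total": 59.0})
accept_order(buffer, {"id": "o2", "total": 12.5})
count = drain(buffer, database, index)
```

Until the worker has drained the buffer, an accepted order exists only in the buffer, so how durable the buffer is determines how safe this approach is.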