Sphinx Delta Index - Grows over time?
Here is an excerpt from the book "Introduction to Search with Sphinx":
We have 10 MB/second text indexing speed, 10 million documents in the
archive collection, 10,000 documents added daily, and 20 KB per average
document. Since we're also handling updates now, let's make that 10,000
added and 10,000 updated documents, totaling 20,000 documents, or 400 MB
of data.
So, our daily data set takes 40 seconds to reindex by the end of the day.
Tomorrow, that figure doubles, to 80 seconds. By the end of the week it's
200 seconds. In a year, 10,000 seconds or almost three hours.
If I understood it correctly, a delta index fetches only new documents each time it is built.
For example, a simple query for a delta:
sql_query = SELECT * FROM documents WHERE ts>=@maxts
It fetches only the new documents from the database and leaves the previously
fetched documents alone (right?).
Why does the sample say it would take 80 seconds on the next day?
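
For reference, the excerpt's figures can be reproduced with a short calculation. This is only a sketch under one assumption that the excerpt does not state outright: that the delta is rebuilt from every document changed since the main index was last built (that is, @maxts only moves forward when the main index is rebuilt), so the delta keeps accumulating. The function and constant names are made up for illustration.

```python
# Figures taken from the excerpt.
INDEXING_SPEED_MB_PER_S = 10   # text indexing speed
DOCS_CHANGED_PER_DAY = 20_000  # 10,000 added + 10,000 updated
AVG_DOC_SIZE_KB = 20           # average document size


def delta_reindex_seconds(days_since_main_rebuild: int) -> float:
    """Time to rebuild the delta index, in seconds.

    Assumption: the delta contains every document changed since the
    main index was last built, so its size grows linearly with days.
    """
    daily_mb = DOCS_CHANGED_PER_DAY * AVG_DOC_SIZE_KB / 1000  # 400 MB per day
    return days_since_main_rebuild * daily_mb / INDEXING_SPEED_MB_PER_S


if __name__ == "__main__":
    for days in (1, 2, 5, 250):
        print(days, delta_reindex_seconds(days))
```

Under that assumption, day 1 gives 40 seconds and day 2 gives 80 seconds, which matches the excerpt. The 200 and 10,000 second figures correspond to 5 and 250 days, which may mean the book counts working days; that reading is a guess.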