One problem the Nepomuk team had in the past was crashes in the strigi library (usually caused by a corrupted file or similar). For that reason, the indexer was moved into a separate process, which was executed for each file individually.

This is generally a good practice - ensuring that components cannot crash the main server. For those who don’t know, the Nepomuk server runs a few out-of-process services so that they can’t crash each other. And, as already stated, the file indexer service delegates its work to an external process. This is a rather nice design - hierarchical separation of risky components into external processes.

While this brings stability, it is not cheap. Launching processes takes valuable CPU time (loading, linking and other things your OS does whenever you start a program). This is not a big issue for Nepomuk’s services since there are only a few of them, and they are started only once, when the system boots.

But it was an issue for the indexer, since the executable was run once for every file that needed indexing. I was not comfortable with that, so I decided to make it a bit more sophisticated without decreasing stability.

Now, the external indexer process is started only once, and the server feeds it with a list of files that need indexing - one by one. If the indexer crashes, the server just restarts it and continues without a glitch.
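
To make the idea concrete, here is a minimal sketch of the pattern in Python. It is not the actual Nepomuk code (which is C++/Qt). All the names here (`IndexerSupervisor`, `index_all`, `WORKER_SOURCE`) are made up for illustration, the worker only pretends to index, and a path containing "corrupt" simulates a crash in the indexing library. Skipping the file that caused the crash is also my assumption about what "continues" means.

```python
import subprocess
import sys

# Hypothetical stand-in for the indexer executable: it reads one path per
# line from stdin and answers with one line per file. A path containing
# "corrupt" simulates a hard crash inside the indexing library.
WORKER_SOURCE = r'''
import os, sys
while True:
    line = sys.stdin.readline()
    if not line:
        break                      # the server closed the pipe, exit cleanly
    path = line.rstrip("\n")
    if "corrupt" in path:
        os._exit(1)                # simulated crash, no reply is sent
    sys.stdout.write("indexed " + path + "\n")
    sys.stdout.flush()
'''


class IndexerSupervisor:
    """Keeps one long-lived worker process and restarts it after a crash."""

    def __init__(self):
        self.worker = None
        self.starts = 0            # how many times a worker was launched

    def _ensure_worker(self):
        # Launch a worker only if none is running.
        if self.worker is None or self.worker.poll() is not None:
            self.worker = subprocess.Popen(
                [sys.executable, "-c", WORKER_SOURCE],
                stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
            )
            self.starts += 1

    def _discard_worker(self):
        # Close the pipes and reap the process so nothing is leaked.
        for stream in (self.worker.stdin, self.worker.stdout):
            try:
                stream.close()
            except OSError:
                pass
        self.worker.wait()
        self.worker = None

    def index(self, path):
        """Send one path to the worker; return False if the worker died."""
        self._ensure_worker()
        try:
            self.worker.stdin.write(path + "\n")
            self.worker.stdin.flush()
            reply = self.worker.stdout.readline()
        except OSError:
            reply = ""
        if not reply:              # empty read means the worker is gone
            self._discard_worker()
            return False
        return True

    def close(self):
        if self.worker is not None:
            self._discard_worker()


def index_all(paths):
    """Feed paths one by one; return (failed paths, number of launches)."""
    supervisor = IndexerSupervisor()
    failed = [p for p in paths if not supervisor.index(p)]
    supervisor.close()
    return failed, supervisor.starts
```

With this sketch, `index_all(["a.txt", "corrupt.pdf", "b.txt", "c.txt"])` launches the worker twice instead of four times: once at the start, and once more after the simulated crash. The cost of starting a process is paid per crash rather than per file, while a crash still cannot take the server down.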