Implementing Qt data models is anything but fun. For that reason, I don’t blame anyone for writing a beginResetModel / endResetModel combo any time a more complex change has happened.

That is something that we all do, that is something that even Qt people do in their classes (see QSqlTableModel::select for example).

Resetting a model is an expensive process that can induce overly many transition animations in QML views. And it is often not really necessary.

Updates are rarely that complex to warrant invalidation of all data in the model. Usually, they only consist of a few additions or removals.

Levenshtein distance

I was planning to fix this in the KActivities data models for some time now, but haven’t found the time until now.

Recently I started teaching algorithms to my students, coincidentally some modifications of the usual Levenshtein distance calculation algorithm. (if you don’t recall the algorithm courses you had, L.D. calculates how many insertions, deletions and replacements you have to make in order to convert one string to another)

Now, the Levenshtein distance is not really a fitting algorithm for this task. Here, at least in KActivities use-case, most edits are not per-item. They are more suitable to be represented as sequences of block operations. Block operations also fit the Qt data model API perfectly.

After some research, I found a paper by Walter F. Tichy. A paper that is as old as I am, talking about the string-to-string correction problem with block moves. The algorithm described there was almost a perfect fit.

The algorithm

It is a really simple algorithm, that can be described in only a few lines:

Take the first character in the destination string
Find all occurences of said character in the source string, and take the one that defines the longest matching block
Now, repeat the algorithm for the first character in the destination string that was outside of the previously detected block.

For example:

Source:      abcabcdefg
             | /----/
             |/
Destination: abcdefbcg

In this case, for the character ‘a’, it will find two blocks in the source string: ‘abc’ and ‘abcdef’. It will choose the second one, and register a block operation that moves it to the beginning of the string. The next one will detect ‘bc’, and so on.

In the case of KActivities, the situation is a bit simpler. A model can not contain repeated items, so there is no need to find the biggest block - there is only one possible block that starts with a specific element.

The algorithm has O(n²) complexity in the worst case, but in the common case it tends to be linear - O(n).

Testing

In order to test the updates, the model testing application now has both QWidget-based UI and QML-based one. All transitions in QML are set to last for one second, which makes all the events blatantly obvious.

Why the post title?

So, why is the title of this post ‘there and back again’?

While working on the above algorithm, I wanted to decouple the model cache implementation from the model itself.

The idea was for the cache class to have methods that do not modify anything, but to return a sequence of edit operations that need to be performed.

Something like this:

QList<EditOperation> setItems(...) const;
QList<EditOperation> trimData(...) const;

and a single non-const method that actually performs a specific operation:

void applyOperation(EditOperation eo);

The model class that uses this cache would get a list of operations that should be made, and execute one by one while invoking the corresponding begin(Insert/Remove/Move)Rows / end____Rows methods.

The idea was clean and beautiful, but overly complex and induced some unnecessary performance penalties. The main problem was in the above algorithm, because it needs to actually be able to change the cache data while generating the list of block edits.

For decoupling to work properly, it would need to create a temporary copy of the cached data just to be able to modify it while calculating the edit operations. Then, it would destroy it and return a list of operations it did on the temporary data to be performed on the actual cache.

So, in the end, I decided that efficiency is more important, and allowed the cache to directly call begin_Rows/end_Rows methods.

Conclusion

Is the moral of the story that algorithms are more important to know then object-oriented programming and encapsulation?

Yes… No… Maybe… :)

Chriss
The way QT handles Model - View is crap anyways. They dont even bother fixing it after so many versions.
I wonder why they dont build something simple and elegant like Delphi's TDataset or even a basic Entity Framework using information from QMetaObject.
They spend way too much time on UI stuff and animations but it is too difficult to produce even simple data binding UI's on QT. In other languages you can do this only from the designer
Ivan Čukić
@Chriss
Yes, Qt models are overly rcomplex. The point, I guess, was to have a system as generic as possible. And when you have a system that can abstract any kind of structure to any other kind, you get a beast like this. :)
It didn't have the sole aim of showing SQL-related results. (I think I've seen some 3rd party classes that do look like TDataset, but I can't remember where)
I agree that it would be nice to have a QMetaObject-based framework fot QSql.

Write a new comment...

Ivan Čukić

dr Ivan Čukić

Projects

Blog categories

Functional Programming in C++

Patronage

There and back again, an algorithm tale

Levenshtein distance

The algorithm

Testing

Why the post title?

Conclusion

Comments:

dr Ivan Čukić

Projects

Blog categories

Functional Programming in C++

Patronage

There and back again, an algorithm tale

Levenshtein distance

The algorithm

Testing

Why the post title?

Conclusion

Share:

Related:

Comments: