Sunday, July 19, 2009

New DLTK indexing is promising

For the last couple of weeks I've been working on improving the DLTK indexer, which was taken from JDT as is. The original bug report reads: "Indexing must be adapted for dynamic languages". Let me explain this point a little. In Java, every element reference is statically bound to its declaration. That is why the binding can be computed while the source code is parsed and then kept in memory (and updated when the referenced or referencing elements change). This is not the case for dynamic languages. Consider this example (PHP):

<?php
function __autoload($class_name) {
    require_once $class_name . '.php';
}

$obj = new MyClass();
?>

In this example a PHP file is included just before the class is loaded, and the IDE has no way to determine statically which file that is. To offer all the JDT-like features in a DLTK-based IDE, element bindings are resolved from scratch each time. This results in a lot of queries and updates to the index file, which are very I/O-intensive operations.
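To make the cost more concrete, here is a minimal sketch of what name-based resolution looks like. It is not DLTK code: the class NameIndexSketch, its methods, and the file paths are all made up for illustration, and a plain in-memory map stands in for the index file. The point is only that a reference such as MyClass cannot be bound once; it has to be looked up by name on every request, it may yield several candidates, and every re-parsed file forces an index update.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the DLTK implementation:
// a name-keyed index of type declarations.
final class NameIndexSketch {
    // simple type name -> files that declare a type with that name
    private final Map<String, List<String>> declarations = new HashMap<>();

    // Record that `file` declares a type called `typeName`.
    void addDeclaration(String typeName, String file) {
        declarations.computeIfAbsent(typeName, k -> new ArrayList<>()).add(file);
    }

    // Called when a file is re-parsed: forget everything it declared before.
    void removeFile(String file) {
        declarations.values().forEach(files -> files.removeIf(file::equals));
        declarations.values().removeIf(List::isEmpty);
    }

    // Resolve a reference by name. There may be zero, one or many candidates,
    // because nothing ties the reference to a single declaration.
    List<String> resolve(String typeName) {
        return Collections.unmodifiableList(
                declarations.getOrDefault(typeName, Collections.emptyList()));
    }

    public static void main(String[] args) {
        NameIndexSketch index = new NameIndexSketch();
        index.addDeclaration("MyClass", "lib/MyClass.php");   // assumed path
        index.addDeclaration("MyClass", "tests/MyClass.php"); // assumed path
        // Both files are valid candidates for `new MyClass()`.
        System.out.println(index.resolve("MyClass"));
    }
}
```

With an on-disk index, every resolve and every removeFile/addDeclaration pair turns into reads and writes, which is where the I/O load comes from.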

We've tried implementing the index on top of the H2 database, and the results are really amazing! Here's a screencast showing how fast the full type hierarchy for the 'Exception' class is built with the H2-based index. Compared to the older implementation, I must say it's 10 times faster. Thanks to the fast access to the model, most other features will perform well too: Code Assist, Source Navigation, Source Editing, Mark Occurrences, etc.
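For readers wondering why hierarchy building stresses the index so much, here is a rough sketch of the traversal. It is an illustration under assumptions, not the actual H2-based code: HierarchyIndexSketch and its methods are invented names, and an in-memory map replaces the database table. Each map lookup below stands for one index query of the kind "which types name X as their superclass?", and one such query is needed for every type in the hierarchy.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the DLTK/H2 implementation:
// an index from superclass name to the names of its direct subtypes.
final class HierarchyIndexSketch {
    private final Map<String, List<String>> subtypesBySuperName = new HashMap<>();

    // Record the declaration "class typeName extends superName".
    void addType(String typeName, String superName) {
        subtypesBySuperName.computeIfAbsent(superName, k -> new ArrayList<>()).add(typeName);
    }

    // Collect all direct and indirect subtypes of rootName, breadth-first.
    Set<String> allSubtypes(String rootName) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(rootName);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            // One index query per visited type.
            for (String sub : subtypesBySuperName.getOrDefault(current, Collections.emptyList())) {
                if (seen.add(sub)) { // also guards against cyclic declarations
                    queue.add(sub);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        HierarchyIndexSketch index = new HierarchyIndexSketch();
        index.addType("LogicException", "Exception");
        index.addType("RuntimeException", "Exception");
        index.addType("DomainException", "LogicException");
        System.out.println(index.allSubtypes("Exception"));
    }
}
```

The faster each individual lookup is, the faster the whole hierarchy appears, which is why the choice of index storage matters here.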

I hope this will be included in DLTK 2.0.0 and PDT 2.2.0.