Giovanni Guardalben, HiT Syndicaat

Developing Enterprise Syndication Management Systems

Location: Affi, Italy

Wednesday, July 13, 2005

Interesting Article: Interview with Patrick Chanezon

Read "Interview with Patrick Chanezon". The interview covers Enterprise Syndication, and HiT Syndicaat is mentioned as one of the two players in this space.

HiT Syndicaat is Powered by Apache Lucene

Originally, when we started development of HiT Syndicaat, we envisioned a rather heterogeneous pattern of usage. Basically, we expected the content to syndicate to be either data or text. In the former case, data is frequently updated. In the latter, content needs to be searched in sophisticated ways. For this reason, we designed the repository layer of our product architecture to support both a text repository and a relational repository. As it turned out, for internal reasons we started with the text repository support. Naturally, we immediately considered using Apache Lucene as our text search engine.

Two things we noticed right away were the incredible performance of text searches (try searches at our demo RSS sites: HiT Syndicaat RSS Demo Feeds) and the relative slowness of updates when changes must be immediately visible. If changes are instead cached and committed later, update performance is outstanding. Soon, however, we realized that content syndication is not about real-time performance (as in online transaction processing) but rather about efficient and powerful text queries. For this reason, we decided to postpone development of the relational repository support and to stick to the Lucene-based text search engine for the foreseeable future.
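
To make the update trade-off concrete, here is a minimal Python sketch. It is not Lucene and it is not HiT Syndicaat code: ToyIndex, its methods, and the sample items are all hypothetical, and the commit counter is only a stand-in for the expensive step of making changes visible to searches. It shows the same set of items indexed with a commit after every update versus a single commit at the end.

```python
from collections import defaultdict


class ToyIndex:
    """Toy inverted index (hypothetical; only illustrates buffered commits)."""

    def __init__(self):
        self._committed = defaultdict(set)  # term -> doc ids visible to searches
        self._pending = []                  # (doc_id, text) pairs not yet visible
        self.commit_count = 0               # proxy for the costly "make visible" step

    def add(self, doc_id, text, commit_now=False):
        # Buffer the document; optionally make it visible right away.
        self._pending.append((doc_id, text))
        if commit_now:
            self.commit()

    def commit(self):
        # Fold all buffered documents into the searchable index in one pass.
        for doc_id, text in self._pending:
            for term in text.lower().split():
                self._committed[term].add(doc_id)
        self._pending.clear()
        self.commit_count += 1

    def search(self, term):
        # Only committed documents are searchable.
        return sorted(self._committed.get(term.lower(), set()))


items = [
    (1, "Lucene powers text search"),
    (2, "RSS feeds are mostly text"),
    (3, "Syndication of enterprise content"),
]

# Every change is immediately visible: one commit per update.
immediate = ToyIndex()
for doc_id, text in items:
    immediate.add(doc_id, text, commit_now=True)

# Changes are cached and committed once at the end.
batched = ToyIndex()
for doc_id, text in items:
    batched.add(doc_id, text)
before_commit = batched.search("text")  # nothing visible yet
batched.commit()
after_commit = batched.search("text")   # both matching items now visible
```

In this sketch both indexes end up answering queries identically; the batched one simply pays the commit cost once instead of once per item, at the price of a delay before new content becomes searchable. For syndicated content, that delay is usually acceptable.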

Currently, as far as we know, not many web log or RSS platforms are Lucene-based (check Powered by Lucene). This is rather surprising, considering that RSS and web log content is predominantly text and that Lucene is so efficient (and, let's not forget, open source). What do you think?