Friday 10 December 2010

Text Mining and Linked Data Cloud

Text mining technologies and tools have been around for the past decade. Most text mining engines need a good set of bootstrapping entities in order to perform well (with respect to precision and recall). Different tools call these bootstrapping entities gazetteers, authority files, or lists.

With the emergence of the Linked Data cloud and its open datasets, there is a great opportunity for text mining to achieve even better results by using the entities in these datasets for bootstrapping.
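
To make the idea concrete, here is a minimal Python sketch of gazetteer-style lookup seeded from Linked Data entities. It is not GATE's API or gazetteer format. The names ENTITIES, build_gazetteer and annotate are hypothetical, and the hard-coded rows are an assumption standing in for whatever a query against an open dataset would return (label, URI, type).

```python
import re

# Hypothetical sample rows (label, URI, type), standing in for the result of
# a query against a Linked Data dataset. The values are illustrative only.
ENTITIES = [
    ("Berlin", "http://dbpedia.org/resource/Berlin", "Location"),
    ("Tim Berners-Lee", "http://dbpedia.org/resource/Tim_Berners-Lee", "Person"),
    ("University of Sheffield",
     "http://dbpedia.org/resource/University_of_Sheffield", "Organization"),
]


def build_gazetteer(entities):
    """Map each lower-cased label to its (uri, type) pair."""
    return {label.lower(): (uri, etype) for label, uri, etype in entities}


def annotate(text, gazetteer):
    """Return (start, end, matched_text, uri, type) for every gazetteer hit.

    Labels are tried longest first, so a multi-word label is preferred over
    a shorter label that starts at the same position.
    """
    if not gazetteer:
        return []
    labels = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(label) for label in labels) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for match in pattern.finditer(text):
        uri, etype = gazetteer[match.group(0).lower()]
        hits.append((match.start(), match.end(), match.group(0), uri, etype))
    return hits


if __name__ == "__main__":
    sample = "Tim Berners-Lee visited the University of Sheffield and then Berlin."
    for hit in annotate(sample, build_gazetteer(ENTITIES)):
        print(hit)
```

Each hit keeps the entity URI, so an annotation links straight back to the dataset it came from. A real system would also need to deal with ambiguous labels and with much larger lists, which this sketch ignores.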

Here is my take on using the Linked Data cloud with the information extraction system GATE.
Presentation at the GATE course in May 2010.