24 March 2015

Natural Language Processing

Natural Language Processing has come a long way from the era of rule-driven approaches. It now relies increasingly on Machine Learning techniques, which in turn is paving the way for more advanced hybrid methods. The area is diverse and growing steadily, with active research in the community. Natural Language Processing is also an applied discipline for almost all web and document extraction problems. However, there is still room for more scalable libraries and frameworks, since many of the existing ones emerge mainly from research and some carry restrictive licenses.

Natural Language Processing applications are usually designed as a pipeline, in which each stage (for example tokenization, tagging, or entity extraction) passes its output to the next. They can also draw on rich domain semantics from Linked Data ontologies, vocabularies, thesauri, or even commonsense knowledge bases, and increasingly they use deep learning methods. There are also formal frameworks backed by industry collaborations, such as UIMA, for building entire pipelines, as well as frameworks like GATE that provide a variety of pluggable libraries for different domains and pipeline tasks. The following are some interesting libraries that could be applied in Natural Language Processing applications.

solr (java)
elasticsearch (java)
coreNLP (java)
gate (java)
jgibbLDA (java)
kea/maui (java)
lingPipe (java)
minorthird (java)
openNLP (java)
sphinx (java)
spotlight (java)
weka (java)
tika (java)
carrot2 (java)
UIMA (java)
wordfreak (java)
cleartk (java)
dkpro (java)
balie (java)
simpleNLG (java)
openCCG (java)
glossary (javascript)
lingo (javascript)
natural (javascript)
nlp-node (javascript)
pos-js (javascript)
reds (javascript)
tfidf (javascript)
conceptnet (python)
gensim (python)
nltk (python)
pattern (python)
textblob (python)
rake (python)
spacy (python)
scalaNLP (scala)
linkgrammar (c)
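
To make the pipeline idea from the introduction a little more concrete, here is a minimal sketch in plain Python. It does not use any of the libraries listed above; the stage names (`tokenize`, `lowercase`, `remove_stopwords`), the `run_pipeline` helper, and the tiny stopword list are all hypothetical and chosen only for illustration. Real frameworks such as UIMA or GATE define their own, much richer, component interfaces.

```python
import re
from collections import Counter
from typing import Callable, List

# Hypothetical stage type: each stage takes a list of tokens
# and returns a new list of tokens.
Stage = Callable[[List[str]], List[str]]

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}


def tokenize(text: str) -> List[str]:
    """Split raw text into word tokens with a simple regular expression."""
    return re.findall(r"[A-Za-z']+", text)


def lowercase(tokens: List[str]) -> List[str]:
    """Normalize every token to lower case."""
    return [token.lower() for token in tokens]


def remove_stopwords(tokens: List[str]) -> List[str]:
    """Drop tokens that appear in the illustrative stopword list."""
    return [token for token in tokens if token not in STOPWORDS]


def run_pipeline(text: str, stages: List[Stage]) -> List[str]:
    """Tokenize the text, then pass the tokens through each stage in order."""
    tokens = tokenize(text)
    for stage in stages:
        tokens = stage(tokens)
    return tokens


def term_frequencies(tokens: List[str]) -> Counter:
    """Final consumer of the pipeline output: count each remaining token."""
    return Counter(tokens)


if __name__ == "__main__":
    text = "The cat and the hat are in the house"
    tokens = run_pipeline(text, [lowercase, remove_stopwords])
    print(tokens)
    print(term_frequencies(tokens))
```

Because every stage shares the same signature, stages can be reordered, removed, or swapped for a library-backed implementation without touching the rest of the pipeline, which is roughly the appeal of the pluggable designs mentioned earlier.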