Useful Links related to the project
Information Retrieval
- Apache Lucene (high-performance, full-featured text search engine library written in Java).
- Information Retrieval (definition of Information Retrieval on Wikipedia).
- Introduction to Information Retrieval (Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press, 2008).
- Modern Information Retrieval (Baeza-Yates, Ribeiro-Neto, Addison Wesley Longman).
- ACM SIGIR (ACM Special Interest Group on Information Retrieval – The group covers all aspects of information storage, retrieval and dissemination, including research strategies, output schemes and system evaluations. The site mainly provides information on conferences and calls for papers).
- TREC (Text REtrieval Conference – A conference that compares various approaches to indexing and searching for data in very large collections and has identified the most successful approaches in information retrieval. Sponsored by NIST, the conference helps translate theory into practice and provides an objective testbed).
- Asian Text Retrieval Workshop (Evaluation of Asian language text retrieval, question answering and text summarization, following on the US NIST TREC workshops. Also includes cross-language information retrieval in Chinese, Korean, Japanese and English).
- Web IR and IE (Information Extraction)
- Information Retrieval at the Illinois Institute of Technology (Research group with projects in improving retrieval performance, efficiency, visualization, integrating structured data and text, and so on. Includes an excellent IR links page).
- Jeff’s Search Engine Caffè (Information Retrieval research and search engine development discussion).
Natural Language Processing
- Natural Language Processing / Information Retrieval Software Repository (This directory and account hold centralized software and tools for natural language processing (NLP) and information retrieval (IR) research and teaching at the School of Computing at the National University of Singapore).
- OpenNLP (hosts a variety of Java-based NLP tools that perform sentence detection, tokenization, part-of-speech tagging, chunking and parsing, named-entity detection, and co-reference analysis using the Maxent machine learning package).
- Stanford Parser and Part-of-Speech (POS) Tagger (Java packages for sentence parsing and part-of-speech tagging from the Stanford NLP group. They include implementations of probabilistic natural language parsers, both highly optimized PCFG and lexicalized dependency parsers, and a lexicalized PCFG parser. Released under the full GNU GPL license).
Text Mining
Question & Answering
Knowledge Representation
Knowledge Management
