Tiedemann (2007) wrote an interesting paper on using Genetic Algorithms to improve passage retrieval in Question Answering systems. The paper compares four selection strategies for the evolutionary optimization of information retrieval (IR) in a question answering system. The IR index has been enhanced by linguistic features to improve the retrieval performance of potential [...]
Archive for the ‘Methodology’ Category
Question Answering systems & Genetic Algorithms
Posted in information retrieval, Methodology, passage retrieval, question and answering, tagged Genetic Algorithms, Tiedemann on April 23, 2010
Passage Retrieval – Some Algorithms
Posted in information retrieval, Methodology, passage retrieval, question and answering, trec, tagged algorithm, alicante, bm25, Clarke, clef, hovy, ibm, information, isi, Ittycheriah, lee, Light, llopis, mitre, multitext, okapi, okapi bm25, PR, Robertson, siteq, sliding window, tellex, trec, vicedo, voting on April 19, 2010
Research in Question Answering (QA) systems has been driven by the Text Retrieval Conference (TREC) series since 1999. Almost all QA systems fielded at TREC employ some passage retrieval technique to reduce the relevant document set to a manageable number of passages. Here are a bunch of algorithms that might be useful [...]
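Since this post is tagged with Okapi BM25, here is a minimal sketch of how passages could be ranked against a question with a BM25-style score. It is only an illustration, not the method of any of the systems listed in the post: the function name `bm25_scores`, the defaults `k1=1.2` and `b=0.75`, and the smoothed (always non-negative) IDF variant are my own assumptions, and passages are assumed to be already tokenized and non-empty.

```python
import math
from collections import Counter


def bm25_scores(query, passages, k1=1.2, b=0.75):
    """Return one BM25-style score per tokenized passage (hypothetical helper)."""
    n = len(passages)  # assumed to be at least 1
    avg_len = sum(len(p) for p in passages) / n
    # Document frequency: in how many passages each term occurs.
    df = Counter(term for p in passages for term in set(p))
    scores = []
    for p in passages:
        tf = Counter(p)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant (assumption): never negative.
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalisation: longer passages need more hits to score as high.
            norm = tf[term] + k1 * (1 - b + b * len(p) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


passages = [
    ["who", "wrote", "hamlet"],
    ["shakespeare", "wrote", "hamlet", "in", "1600"],
    ["the", "weather", "is", "nice"],
]
scores = bm25_scores(["hamlet", "author"], passages)
```

With these toy passages, the one that never mentions a query term scores zero, and the shorter of the two matching passages ranks higher because of the length normalisation.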
Ambiguity of the day
Posted in Methodology, ontology, tagged class, concept maps, concepts, individuals, ontologies, semantic networks, synonym, wordnet on April 19, 2010
“… semantic networks suffer from an inherent semantic ambiguity. For example, we were unable to differentiate individuals from concepts in the resulting concept maps. Moreover, due to the direct translation of written sentences into concept map sentences, various terms were used to express synonyms, resulting in further ambiguity …” Hummm…. interesting (or not). References: Building [...]
Evaluation in Information retrieval
Posted in information retrieval, Methodology, trec, tagged clef, evaluation of IR, IR, manning, precision, recall, text collections, trec on April 15, 2010
We have seen that there are many alternatives in designing an Information Retrieval (IR) system. However, how do we know which of these techniques are effective? To measure IR effectiveness in the standard way, we need a test collection consisting of three things: a document collection; a test suite of information needs, expressible as queries [...]
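The post is tagged with precision and recall, so here is a small sketch of how those two measures could be computed for a single query, assuming we already have the set of documents the system returned and the set judged relevant. The function name `precision_recall` and the document ids are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query from two collections of doc ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were returned
    # Precision: fraction of returned documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were returned.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical run: four documents returned, three judged relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
```

Returning 0.0 for an empty set is a convention chosen here to avoid dividing by zero; other conventions are possible.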
Agile Development within a Research Environment
Posted in Agile Development, Methodology, tagged agile software development, agile software development framework, SCRUM on April 12, 2010
For this research project, instead of a plan-driven or disciplined methodology, we use an agile software development framework, SCRUM. But what is SCRUM? SCRUM is an iterative, incremental framework for agile software development. Scrum encourages co-location and verbal communication across all team members and disciplines involved in the project. A key principle [...]
Some blueprints…
Posted in information retrieval, Methodology, ontology, passage retrieval, prymas, tagged name-entity, penn treebank, pos-tagging, stemming, stopwords, stopwords removal, tagger, triples on April 6, 2010
It works something like this: We need to build a system that is capable of automatically identifying highly relevant triples (pairs of concepts connected by a relation) over concepts from an existing ontology. By extracting relevant verbs and their grammatical arguments from a domain-specific text collection and computing corresponding relations through a combination of linguistic [...]
Tools: Lucene Java
Posted in information retrieval, Methodology, tools, tagged apache, api, compass, doug cutting, eb-eye, information, IR, jakarta, liferay, lucene, nutch, open-source, pangaea, solr on April 1, 2010
What is Lucene? Lucene is a high-performance, scalable Information Retrieval (IR) library, originally created by Doug Cutting. It provides indexing and searching features to applications. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. Lucene is a mature, open-source project implemented in Java, available for free download. Moreover, it’s [...]
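To give a feel for what "indexing and searching" means, here is a toy inverted index. To be clear, this is not Lucene and not its API: the class `TinyIndex`, its methods, and the whitespace tokenizer are all invented for this sketch, which only shows the general idea of mapping terms to the documents that contain them.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical, unrelated to the real Lucene API)."""

    def __init__(self):
        # term -> set of ids of the documents containing that term
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index a document: lowercase it, split on whitespace, record each term."""
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term (boolean AND)."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = TinyIndex()
index.add(1, "Lucene is a search library")
index.add(2, "Solr is built on Lucene")
both = index.search("lucene")
only_first = index.search("lucene library")
```

A real IR library adds a great deal on top of this idea, such as text analysis, relevance scoring and on-disk index structures.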
Gold Rush
Posted in knowledge management, Methodology, passage retrieval, prymas, tagged Entity Recognition, GATE, machine learning, POS, semantic network, Steeming, Tokenization, UIMA, unstructured knowledge on March 23, 2010
Usually X marks the spot, but the path for converting unstructured knowledge into a reliable and efficient knowledge database of facts isn’t straightforward. Even when the knowledge is already assembled in a machine-optimal representation, recovering information as an English natural language answer isn’t trivial. Nowadays, the amount of information that companies deal with is overwhelming. Being [...]
How to Think?
Posted in knowledge representation, Methodology, tagged frames, knowledge representation, MultiNet, semantic network, SNePS on March 23, 2010
How to represent knowledge in a way that can be efficiently manipulated by a machine program? How to formally represent the domain of a problem? How to achieve intelligent behavior? These are the $1M questions… The key problem is to find a representation and a supporting system that make inferences within the constraints, appropriated [...]
