Posts Tagged ‘triple extraction’


Triple extraction (Subject, Predicate, Object) is a good method to translate free-form sentences into knowledge.

The GATE Predicate-Argument Extractor Component (PAX) could be very useful for this task.

PAX is a GATE component for extracting predicate-argument structures (PAS). PASs are used in various contexts to represent relations within a sentence structure. Different “semantic” parsers extract relational information from sentences, but there is no common format for storing this information. The predicate-argument extractor component (PAX) takes the annotations generated by selected parsers and extracts/transforms the parsers’ results into predicate-argument structures represented as triples (subject-verb-object).

I spent the day testing this plug-in and I’m very satisfied with the first results.

By the way, I had some configuration problems at the beginning, but after a few emails with the project's creators everything started working fine. Thanks guys ;)

More info about this plug-in is available here.



The Stanford Parser just returns a list of dependencies between word tokens. To manipulate the dependencies, we will almost certainly want to put them in a graph data structure. We are going to try this using JGraphT.
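Before bringing in JGraphT, here is a minimal sketch of the idea in plain Java (16+ for records): typed dependencies go into an adjacency map, and a (subject, predicate, object) triple is read off each word that governs both an `nsubj` and a `dobj` edge. The class and record names (`DependencyTriples`, `Dependency`, `Triple`) and the sample sentence are made up for illustration, the input is hand-written rather than real parser output, and words are used directly as vertex keys, which assumes no word repeats in the sentence.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DependencyTriples {

    /** One typed dependency, relation(governor, dependent), e.g. nsubj(eats, cat). */
    record Dependency(String relation, String governor, String dependent) {}

    /** A (subject, predicate, object) triple. */
    record Triple(String subject, String predicate, String object) {}

    /** Builds a directed graph as an adjacency map: word -> its outgoing dependency edges. */
    static Map<String, List<Dependency>> buildGraph(List<Dependency> deps) {
        Map<String, List<Dependency>> graph = new LinkedHashMap<>();
        for (Dependency d : deps) {
            graph.computeIfAbsent(d.governor(), k -> new ArrayList<>()).add(d);
            // Make sure the dependent is also a vertex, even if it governs nothing.
            graph.computeIfAbsent(d.dependent(), k -> new ArrayList<>());
        }
        return graph;
    }

    /** Emits one triple for every (nsubj, dobj) pair that shares the same governor. */
    static List<Triple> extractTriples(Map<String, List<Dependency>> graph) {
        List<Triple> triples = new ArrayList<>();
        for (Map.Entry<String, List<Dependency>> entry : graph.entrySet()) {
            List<String> subjects = new ArrayList<>();
            List<String> objects = new ArrayList<>();
            for (Dependency d : entry.getValue()) {
                if (d.relation().equals("nsubj")) subjects.add(d.dependent());
                if (d.relation().equals("dobj")) objects.add(d.dependent());
            }
            for (String s : subjects) {
                for (String o : objects) {
                    triples.add(new Triple(s, entry.getKey(), o));
                }
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        // Hand-written dependencies for the sentence "The cat eats fish".
        List<Dependency> deps = List.of(
            new Dependency("det", "cat", "The"),
            new Dependency("nsubj", "eats", "cat"),
            new Dependency("dobj", "eats", "fish"));
        System.out.println(extractTriples(buildGraph(deps)));
    }
}
```

A graph library should make the same traversal less manual, and real parser output would need position-indexed tokens as vertices so that repeated words stay distinct.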

JGraphT is a free Java graph library that provides mathematical graph-theory objects and algorithms. JGraphT supports various types of graphs including:

  • directed and undirected graphs.
  • graphs with weighted / unweighted / labeled or any user-defined edges.
  • various edge multiplicity options, including: simple-graphs, multigraphs, pseudographs.
  • unmodifiable graphs – allow modules to provide “read-only” access to internal graphs.
  • listenable graphs – allow external listeners to track modification events.
  • subgraphs – graphs that are auto-updating subgraph views on other graphs.
  • all compositions of above graphs.

Although powerful, JGraphT is designed to be simple and type-safe (via Java generics). For example, graph vertices can be any objects. You can create graphs based on Strings, URLs, XML documents, etc.; you can even create graphs of graphs! This code example shows how.

Other features offered by JGraphT:

References: JGraphT



NLP2RDF is a framework that integrates multiple NLP tools and linguistic ontologies in order to explicate implicit meaning of natural language by means of RDF/OWL descriptions.
Natural language (a character sequence with implicit knowledge) is converted into a more expressive formalism – in this case OWL-DL – aiming to grasp the underlying meaning. This explicated meaning then serves as input for (high-level) algorithms and applications with a focus on machine learning.

Did I find the holy grail?

I’m very motivated to try this framework … I hope I can show some results in a few days!

References: NLP2RDF
