Can computers learn to read? We think so. “Read the Web” is a research project that attempts to create a computer system that learns over time to read the web. Since January 2010, a computer system called NELL (Never-Ending Language Learner) has been running continuously, attempting to perform two tasks each day:
- First, it attempts to “read,” or extract facts (e.g., playsInstrument(George_Harrison, guitar)) from text found in hundreds of millions of web pages.
- Second, it attempts to improve its reading competence, so that tomorrow it can extract more facts from the web, more accurately.
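To make the fact notation above concrete, here is a minimal Python sketch that parses a string such as playsInstrument(George_Harrison, guitar) into a relation and its two arguments. The names `Belief` and `parse_belief` are hypothetical and chosen only for illustration; this is not NELL's actual code or storage format.

```python
import re
from typing import NamedTuple


class Belief(NamedTuple):
    """Hypothetical container for one extracted fact: relation(subject, obj)."""
    relation: str
    subject: str
    obj: str


# Matches text of the form relation(arg1, arg2), where each part is made of
# letters, digits, or underscores. This is an assumed, simplified syntax.
_BELIEF_PATTERN = re.compile(r"^\s*(\w+)\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*$")


def parse_belief(text: str) -> Belief:
    """Split a 'relation(subject, object)' string into its three parts."""
    match = _BELIEF_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a binary relation: {text!r}")
    return Belief(*match.groups())


belief = parse_belief("playsInstrument(George_Harrison, guitar)")
print(belief.relation, belief.subject, belief.obj)
```

Storing each belief as a (relation, subject, object) triple is one simple way to make such facts easy to count, filter, and compare; the real knowledge base may be organized differently.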
At present, NELL has accumulated a knowledge base of 644,836 beliefs that it has read from various web pages. It is not perfect, but NELL is learning. You can follow NELL’s progress via @cmunell on Twitter, browse and download its knowledge base, read more about our technical approach, or join the discussion group.
