Named entity disambiguation project using DBpedia Database
This is the dataset used in the DBpedia Spotlight project. It consists of manual annotations that link the entities mentioned in 10 news articles to DBpedia resources. Here is the goldset (entities).
This is the dataset we compiled from 10 New York Times articles.
We manually annotated all articles and created a separate goldset for each article.
Evaluation of both Dataset1 and Dataset2 for Zemanta, Spotlight and NERSO
Each zipped file contains 10 text files. Each text file lists the surface forms and the annotated entities for the particular project, one pair per line.
For example, in the line “Search engine Web_search_engine”, “Search engine” is a surface form found in the given text and “Web_search_engine” is the corresponding DBpedia/Wikipedia URI or entity.
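The line format described above can be read with a few lines of Python. This is only a minimal sketch: it assumes that the entity is the last whitespace-separated token on the line (entity names use underscores instead of spaces) and that everything before it is the surface form. The function names `parse_annotation_line` and `parse_annotations` are hypothetical and are not part of the dataset or of any of the evaluated systems.

```python
def parse_annotation_line(line):
    """Split one annotation line into (surface_form, entity).

    Assumption: the entity is the final whitespace-separated token,
    and the surface form is everything that precedes it.
    Returns None for blank lines or lines with fewer than two tokens.
    """
    parts = line.strip().rsplit(None, 1)
    if len(parts) != 2:
        return None
    surface_form, entity = parts
    return surface_form, entity


def parse_annotations(text):
    """Parse the full contents of one annotation file into a list of pairs."""
    pairs = []
    for line in text.splitlines():
        parsed = parse_annotation_line(line)
        if parsed is not None:
            pairs.append(parsed)
    return pairs
```

For instance, `parse_annotation_line("Search engine Web_search_engine")` yields the surface form `"Search engine"` paired with the entity `"Web_search_engine"`. If the actual files use a different separator, such as a tab, the split rule would need to be adjusted accordingly.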