---QSPTGtBingRank_DS---

http://www.ccc.ipt.pt/~ricardo/datasets/QSPTGtBingRank_DS.html

http://www.ccc.ipt.pt/~ricardo/datasets/QSPTGtBingRank_DS.zip (for downloading data)


DATASET REFERENCE

This dataset may be used for any research purpose, provided that the following reference is cited:

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2015). Under Submission.

 

SUMMARY

The QSPTGtBingRank_DS is a dataset designed for evaluating the relationship between queries and snippets (q, Si). Our objective is to compare the results of our approach against two other ranking systems.

It consists of 25 implicit time-sensitive queries selected from the archives of the 2012 – 2014 Portuguese Google Trends, and 475 distinct (q, Si) pairs obtained by querying the Bing search engine for each of the 25 queries through the Bing Search API.

Thirty-three participants were recruited to evaluate the relevance of the 475 (q, Si) pairs using a 4-level relevance scale:

  (1) Not Relevant;

  (2) Fair;

  (3) Good;

  (4) Excellent.

The assessments were performed in March 2015 and did not involve any payment. Each worker evaluated all 475 (q, Si) pairs, resulting in 15675 (q, Si) assessments in total. Workers took one hour and a half on average to complete the task.
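The figures above can be cross-checked with a few lines of Python. This is only a sanity check on the numbers quoted in this description; the per-query value is an average, since the description does not state that every query has the same number of results.

```python
# Figures taken from the description above.
NUM_QUERIES = 25
NUM_PAIRS = 475
NUM_WORKERS = 33

# Every worker assessed every (q, Si) pair.
total_assessments = NUM_WORKERS * NUM_PAIRS

# Average number of results per query (an average only, see the note above).
avg_pairs_per_query = NUM_PAIRS / NUM_QUERIES

print(total_assessments)    # 15675
print(avg_pairs_per_query)  # 19.0
```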

To get familiar with the topic, workers were given a very short description of the query. They were then asked to consider the query, to look at the description and at the web search results, and to judge the relevance of each result by means of the 4-level scale.

 

The QSPTGtBingRank_DS dataset is an Excel file consisting of three worksheets, described below:

Results Description: comprises the title, snippet and URL for each of the 475 distinct (q, Si) pairs.

Column A has the query name and the corresponding id.

Column B has the title.

Column C has the snippet.

Column D has the URL.

 

Worker's Assessement Summary: table that gathers the relevance decisions of the 33 workers for the set of 25 queries.

Column A has the query name and the corresponding id.

Columns B - AH have the relevance decisions of workers 1 - 33.
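When reading this worksheet programmatically, it can help to turn the Excel column letters into numeric positions. The sketch below is a minimal illustration, not part of the dataset; the helper names (column_index, column_span, worker_column) are hypothetical, and it assumes that workers 1 - 33 occupy consecutive columns in order.

```python
def column_index(letters: str) -> int:
    """Convert an Excel column label such as 'A' or 'AH' to a 1-based index."""
    index = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column label: {letters!r}")
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def column_span(first: str, last: str) -> int:
    """Number of columns in the inclusive range first..last."""
    return column_index(last) - column_index(first) + 1


def worker_column(worker: int, first: str = "B") -> int:
    """1-based column index holding the decisions of a given worker.

    ASSUMPTION: workers are stored in consecutive columns starting at `first`.
    """
    if not 1 <= worker <= 33:
        raise ValueError("worker must be between 1 and 33")
    return column_index(first) + worker - 1


print(column_span("B", "AH"))   # 33, one column per worker
print(worker_column(33))        # 34, i.e. column AH
```

The same helpers apply to the re-scaled worksheet, where the worker columns run from E to AK (again 33 columns).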

 

Worker's Assessement Summary (Re-Scaled): table that gathers the relevance decisions (re-scaled) of the 33 workers for the set of 25 queries.

Column A has the query name and the corresponding id.

Column B has the summary of the workers' relevance decisions when the snippet is considered to be relevant (2).

Column C has the summary of the workers' relevance decisions when the snippet is considered to be non-relevant (1).

Columns E - AK have the relevance decisions of workers 1 - 33 (re-scaled to a binary labelling scheme).
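The re-scaling and the per-pair summary can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to build the file: the description does not say which of the four levels are mapped to relevant (2), so the cut-off is left as an explicit parameter, and the summary in columns B and C is assumed to be a simple count of workers. The function names and the sample ratings are made up.

```python
from collections import Counter

# Binary labels used in the re-scaled worksheet, as stated above.
RELEVANT = 2
NON_RELEVANT = 1


def rescale(rating: int, threshold: int) -> int:
    """Map a 4-level rating (1-4) to a binary label.

    ASSUMPTION: ratings >= threshold count as relevant. The description does
    not state where the cut falls, so the caller has to choose it.
    """
    if rating not in (1, 2, 3, 4):
        raise ValueError(f"rating must be between 1 and 4, got {rating}")
    return RELEVANT if rating >= threshold else NON_RELEVANT


def summarise(binary_labels):
    """Return (relevant_count, non_relevant_count) for one (q, Si) pair.

    ASSUMPTION: the summary in columns B and C is a count of workers.
    """
    counts = Counter(binary_labels)
    return counts[RELEVANT], counts[NON_RELEVANT]


# Made-up ratings from five hypothetical workers for a single pair.
ratings = [1, 2, 4, 3, 1]
labels = [rescale(r, threshold=2) for r in ratings]
print(labels)             # [1, 2, 2, 2, 1]
print(summarise(labels))  # (3, 2)
```

With a different cut-off (for example threshold=3) the same ratings would yield different binary labels, so the value should be checked against the worksheet before relying on it.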

 

Other datafiles (folders):

PagWeb - Survey: gathers the results for each of the 25 queries for the 3 systems under comparison.

 

DOWNLOAD

http://www.ccc.ipt.pt/~ricardo/datasets/QSPTGtBingRank_DS.zip

 

MORE INFO

If you have any further questions, please contact Ricardo Campos (ricardo.campos@ipt.pt).