Ricardo Campos
  • User Interfaces
  • Web Services
  • Datasets
  • Experiments

Here we provide two user interfaces so that the research community can test the GTE-Cluster and GTE-Rank temporal search engine applications. To retrieve query results, we rely on the recently launched Bing Search API (5000 transactions/month allowed), parameterized with the en-US market language to retrieve 50 results per query. The proposed solutions are computationally efficient and can easily be tested online. Although our work is motivated mainly by queries of a temporal nature, the implemented prototypes can execute any query, including non-temporal ones. Both user interfaces are described in detail below.

GTE-Cluster



GTE-Cluster is a temporal clustering search engine that offers the user two options: return all clusters (including the non-relevant ones) or return only the relevant ones. The value shown in front of each cluster is the similarity computed by the GTE similarity measure. Clusters with a similarity value < 0.35 are considered non-relevant and marked in red; relevant clusters are marked in blue.
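The display rule described above can be sketched in a few lines of Python. This is only an illustration of the 0.35 threshold and the two viewing options, not the GTE-Cluster implementation; the `Cluster` class, its field names and both helper functions are hypothetical.

```python
from dataclasses import dataclass

# From the text: clusters with a similarity value < 0.35 are non-relevant.
RELEVANCE_THRESHOLD = 0.35


@dataclass
class Cluster:
    label: str         # hypothetical field: the label naming the cluster
    similarity: float  # GTE similarity value shown in front of the cluster


def colour_for(cluster: Cluster) -> str:
    """Return the display colour: red if non-relevant, blue if relevant."""
    return "red" if cluster.similarity < RELEVANCE_THRESHOLD else "blue"


def select_clusters(clusters, only_relevant=False):
    """Mimic the two interface options: all clusters, or only the relevant ones."""
    if only_relevant:
        return [c for c in clusters if c.similarity >= RELEVANCE_THRESHOLD]
    return list(clusters)
```

Note that a cluster whose similarity is exactly 0.35 counts as relevant here, since the text only excludes values strictly below the threshold.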

 

GTE-Cluster is available upon request for academic purposes.

 

GTE-Rank



GTE-Rank is a temporal re-ranking search engine that offers the user two options: return all web snippets (including those without dates) or return only the web snippets with relevant dates. The number in red is the ranking position originally assigned by the Bing search engine. The value in front of the snippet ID is the ranking value computed by the GTE-Rank methodology.
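As a rough illustration of what the interface presents, the sketch below reorders snippets by a ranking value while keeping the original Bing position alongside each one. It does not compute the GTE-Rank score itself. The `Snippet` type, its field names, and the assumption that a higher ranking value means a better position are all hypothetical.

```python
from typing import NamedTuple, Optional


class Snippet(NamedTuple):
    snippet_id: int
    bing_position: int            # position originally assigned by Bing
    rank_value: float             # value assumed to come from GTE-Rank
    relevant_date: Optional[str]  # None when no relevant date was found


def rerank(snippets, only_with_relevant_dates=False):
    """Order snippets by rank value (assumed: higher is better).

    With only_with_relevant_dates=True, snippets lacking a relevant date
    are dropped first, mirroring the second interface option.
    """
    if only_with_relevant_dates:
        snippets = [s for s in snippets if s.relevant_date is not None]
    return sorted(snippets, key=lambda s: s.rank_value, reverse=True)
```

Keeping `bing_position` on each record makes it easy to show the original and re-ranked positions side by side.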

 

GTE-Rank is available upon request for academic purposes.

Here we make available a number of web services, so that each of the proposals can be tested by the research community.

 

Candidate Dates Extractor


Candidate Dates Extractor returns, in JSON format, a list of the candidate dates found in a given text. This web service can be invoked by means of an automatically generated JSON Interface or by URL.

Note that, in order to work, the service requires the user to specify one text (e.g., BRUSSELS — 1910 The capital of 10-12-2016 Belgium), one language (e.g., en-US, pt-PT), one typeOfDate (1 to retrieve dates as they were found; 2 to retrieve dates as they were found but in normalized form; 3 to retrieve only years, in normalized form), one MinDate (e.g., 1000) and one MaxDate (e.g., 2100).
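A minimal sketch of how an invocation by URL might be assembled from these parameters is shown below. The parameter names follow the description above, but the base URL is a placeholder and the exact query-string format expected by the service is an assumption; `build_request_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Placeholder only: the real service address is not given in this page.
BASE_URL = "https://example.org/CandidateDatesExtractor"


def build_request_url(text, language="en-US", type_of_date=1,
                      min_date=1000, max_date=2100):
    """Assemble a GET URL carrying the five parameters described above."""
    if type_of_date not in (1, 2, 3):
        raise ValueError("typeOfDate must be 1, 2 or 3")
    if min_date > max_date:
        raise ValueError("MinDate must not exceed MaxDate")
    params = {
        "text": text,
        "language": language,
        "typeOfDate": type_of_date,
        "MinDate": min_date,
        "MaxDate": max_date,
    }
    # urlencode percent-encodes spaces and non-ASCII characters in the text.
    return BASE_URL + "?" + urlencode(params)


url = build_request_url(
    "BRUSSELS — 1910 The capital of 10-12-2016 Belgium", type_of_date=2
)
```

Validating `typeOfDate` and the date range before sending the request avoids a round trip for inputs the description already rules out.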

 

Keywords Extractor1


Keywords Extractor1 returns, in JSON format, a list of relevant words for a given query. This web service can be invoked by means of an automatically generated JSON Interface or by URL.

Note that, in order to work, the service requires the user to specify one language (e.g., en-US, pt-PT), one query and one valid Bing Search API key.

 

Keywords Extractor2


Keywords Extractor2 returns, in JSON format, a list of relevant words for a given text (no query is required). This web service can be invoked by means of an automatically generated JSON Interface or by URL.

Note that, in order to work, the service requires the user to specify one language (e.g., en-US, pt-PT), one title (optional field, e.g., avatar movie) and one text (e.g., Avatar is a 2009 American epic science fiction motion capture film written and directed by James Cameron, and starring Sam Worthington, Zoe Saldana, Stephen Lang ...).

 

GTE1


GTE1 returns, in JSON format, the GTE similarity value calculated between the query and all the candidate dates. Please note that our system considers GTE values < 0.35 to be non-relevant. This web service can be invoked by means of an automatically generated JSON Interface or by URL.

Note that, in order to work, the service requires the user to specify one language (e.g., en-US, pt-PT), one query and one valid Bing Search API key.

 

GTE2


GTE2 returns, in JSON format, the GTE similarity value calculated for all the candidate dates found within a given text (no query is required). Please note that our system considers GTE values < 0.35 to be non-relevant. This web service can be invoked by means of an automatically generated JSON Interface or by URL.

Note that, in order to work, the service requires the user to specify one language (e.g., en-US, pt-PT), one heading (optional field), one title (optional field, e.g., avatar movie) and one text (e.g., Avatar is a 2009 American epic science fiction motion capture film written and directed by James Cameron, and starring Sam Worthington, Zoe Saldana, Stephen Lang ...).

 

GTE-Cluster


GTE-Cluster returns, in JSON format, the GTE similarity value calculated between the query and all the candidate dates, together with the corresponding contents (i.e., title, snippet and URL) in which the candidate dates appear. This web service can be invoked by means of an automatically generated JSON Interface or by URL.

Note that, in order to work, the service requires the user to specify one language (e.g., en-US, pt-PT), one query and one valid Bing Search API key.

 

GTE-Rank


GTE-Rank returns, in JSON format, the set of fifty re-ranked web snippets together with the corresponding Bing ranking position. This web service can be invoked by means of an automatically generated JSON Interface or by URL.

Note that, in order to work, the service requires the user to specify one language (e.g., en-US, pt-PT), one query, one alfa value (e.g., 0.9) and one valid Bing Search API key.

Query-Snippet Portuguese Google Trend Bing Ranking Dataset (QSPTGtBingRank_DS)


[QSPTGtBingRank_DS Webpage]

 

 

 

Query-Snippet Google Insights for Search Bing Ranking Dataset (QSGisBingRank_DS)


[QSGisBingRank_DS Webpage]

 

Web Content TREC Dataset (WC_TREC_DS)


[WC_TREC_DS Webpage]

 

 

Web Content Dataset (WC_DS)


[WC_DS Webpage]

 

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2012). GTE: A Distributional Second-Order Co-Occurrence Approach to Improve the Identification of Top Relevant Dates in Web Snippets. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012), Maui, Hawaii, October 29 - November 02, ISBN 978-1-4503-1156-4, pp 2035 - 2039. ACM Press.

 

Query Logs Dataset (QLog_DS)


[QLog_DS Webpage]

 

Campos, R., Jorge, A. and Dias, G. (2011). Using Web Snippets and Query-logs to Measure Implicit Temporal Intents in Queries. In Proceedings of the Query Representation and Understanding Workshop (QRU 2011) associated to 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR2011) Beijing, China, 28 July, pp 13 - 16.

 

Query Logs Dataset (AOL_DS)


[AOL_DS Webpage]

 

Campos, R., Dias, G. & Jorge, A. (2011). What is the Temporal Value of Web Snippets? In Proceedings of the 1st International Temporal Web Analytics Workshop (TWAW2011) associated to the 20th International World Wide Web Conference (WWW2011), pp 9 – 16, Hyderabad, India, 28th March, ISSN 1613 - 0073.

 

Google Insights for Search Future Dates Dataset (GISFD_DS)


[GISFD_DS Webpage]

 

Campos, R., Dias, G. & Jorge, A. (2011). An Exploratory Study on the impact of Temporal Features on the Classification and Clustering of Future-Related Web Documents. In L. Antunes and H.S. Pinto (Eds.), Lecture Notes in Artificial Intelligence - Progress in Artificial Intelligence, - 15th Portuguese Conference on Artificial Intelligence (EPIA2011) associated to APPIA: Portuguese Association for Artificial Intelligence Lisbon, Portugal, 10 - 13 October. (Vol. 7026-2011, pp. 581 - 596). ISBN: 978-3-642-24768-2. DBLP. Springer. Thomson ISI Web of Knowledge. ACM Press.

 

 

Google Insights for Search Query Classification Dataset (GISQC_DS)


[GISQC_DS Webpage]

 

Campos, R., Dias, G. & Jorge, A. (2011). What is the Temporal Value of Web Snippets? In Proceedings of the 1st International Temporal Web Analytics Workshop (TWAW2011) associated to the 20th International World Wide Web Conference (WWW2011), pp 9 – 16, Hyderabad, India, 28th March, ISSN 1613 - 0073.

 

 

GTE-Rank: Evaluating GTE-Rank under a set of Portuguese queries by means of a crowdsourcing experiment


[GTE-Rank Crowdsourcing Experiment Webpage]

 

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2016). GTE-Rank: a Time-Aware Search Engine to Answer Time-Sensitive Queries. In Information Processing & Management an International Journal. Elsevier, Vol 52(2), pp 273-298, ISSN 0306-4573.

 

GTE-Rank: Temporal Re-Ranking


[GTE-Rank Temporal Re-Ranking Experiment Webpage]

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2016). GTE-Rank: a Time-Aware Search Engine to Answer Time-Sensitive Queries. In Information Processing & Management an International Journal. Elsevier, Vol 52(2), pp 273-298, ISSN 0306-4573.

 

GTE-Cluster: Flat Temporal Clustering


[GTE-Cluster Flat Temporal Clustering Experiment Webpage]

 

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2012). Disambiguating Implicit Temporal Queries by Clustering Top Relevant Dates in Web Snippets. In Main Conference Proceedings of the 2012 IEEE/WIC/ACM International Conference on Web Intelligence, Macau, China, December 04 – 07.

 

Comparing a Web Content approach (WC_DS) against a Query Log one (QLog_DS)


[WC_DS vs. QLog_DS Experiment Webpage]

 

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2012). Enriching Temporal Query Understanding through Date Identification: How to Tag Implicit Temporal Queries? In Proceedings of the 2nd International Temporal Web Analytics Workshop (TWAW 2012) associated to the 21st International World Wide Web Conference (WWW2012), Lyon, France, 17 April. ISBN 978-1-4503-1188-5, pp 41 – 48. ACM Press.

 

GTE: Comparing GTE against a number of different association measures


[GTE Experiment Webpage]

 

Campos, R., Dias, G., Jorge, A. and Nunes, C. (2012). GTE: A Distributional Second-Order Co-Occurrence Approach to Improve the Identification of Top Relevant Dates in Web Snippets. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012), Maui, Hawaii, October 29 - November 02, ISBN 978-1-4503-1156-4, pp 2035 - 2039. ACM Press.

 

Classification and clustering of future-related texts


[Future Temporal Data Experiment Webpage]

 

Campos, R., Dias, G. & Jorge, A. (2011). An Exploratory Study on the impact of Temporal Features on the Classification and Clustering of Future-Related Web Documents. In L. Antunes and H.S. Pinto (Eds.), Lecture Notes in Artificial Intelligence - Progress in Artificial Intelligence, - 15th Portuguese Conference on Artificial Intelligence (EPIA2011) associated to APPIA: Portuguese Association for Artificial Intelligence Lisbon, Portugal, 10 - 13 October. (Vol. 7026-2011, pp. 581 - 596). ISBN: 978-3-642-24768-2. DBLP. Springer. Thomson ISI Web of Knowledge. ACM Press.

 

Temporal data analysis of web snippets and classification of queries with regards to the topical and temporal dimension


[Temporal Query Classification Experiment Webpage]

 

Campos, R., Dias, G. & Jorge, A. (2011). What is the Temporal Value of Web Snippets? In Proceedings of the 1st International Temporal Web Analytics Workshop (TWAW2011) associated to the 20th International World Wide Web Conference (WWW2011), pp 9 – 16, Hyderabad, India, 28th March, ISSN 1613 - 0073.

 

Temporal data analysis of explicit temporal queries. AOL dataset (AOL_DS)


[AOL_DS Experiment Webpage]

 

Campos, R., Dias, G. & Jorge, A. (2011). What is the Temporal Value of Web Snippets? In Proceedings of the 1st International Temporal Web Analytics Workshop (TWAW2011) associated to the 20th International World Wide Web Conference (WWW2011), pp 9 – 16, Hyderabad, India, 28th March, ISSN 1613 - 0073.