Oct 17, 2008

Beta Release of LETOR3.0, a Benchmark Dataset for Learning to Rank

A report from Microsoft Research Asia:

LETOR is a package of benchmark data sets for research on LEarning TO Rank, released by Microsoft Research Asia.
The package contains standard features, relevance judgments, data partitioning, evaluation tools, and several baselines for the OHSUMED data collection and the '.gov' data collection. Version 1.0 was released in March 2007 and version 2.0 in January 2008. Since the release of LETOR2.0, we have received valuable feedback from many people, including bug reports and feasibility studies of the tools. Based on this feedback, we launched the LETOR3.0 project several months ago.
The beta version of LETOR3.0 is now available at

http://research.microsoft.com/users/LETOR/.

What's new in LETOR3.0?
LETOR3.0 contains several significant updates:
1) Four new datasets were added: homepage finding 2003, homepage finding 2004, named page finding 2003 and named page finding 2004. Together with the three datasets from LETOR2.0 (OHSUMED, topic distillation 2003 and topic distillation 2004), LETOR3.0 contains seven datasets.
2) A more reasonable document sampling strategy was adopted. As a result, the documents associated with each query have changed in the three datasets carried over from LETOR2.0.
3) More features for learning were added.
4) Metadata for each document is provided to enable research on features for learning to rank.
5) More baseline algorithms are provided (the baselines will actually be included in the final version of LETOR3.0).

What's next for LETOR3.0?
1) We plan to release the final version of the LETOR3.0 datasets in early November 2008. If you find any problems with the current beta version, please let us know and we will refine the datasets accordingly. Our goal is to make the datasets reliable and useful for the community.
2) The baselines for LETOR3.0 will be released in mid-December 2008. If you would like your own algorithms to be included as official baselines in LETOR3.0, please contact us as soon as possible.
We would like to express our sincere thanks for your suggestions and help with LETOR over the past years, and we look forward to receiving your feedback.
Please feel free to contact the LETOR team by email at letor@microsoft.com.
Best regards,
Tao Qin, Tie-Yan Liu, Jun Xu and Hang Li
Microsoft Research Asia



With information ICSRG shares
