Thursday, January 28, 2016

Yahoo Releases Machine Learning Dataset for Academic Researchers

Dian Schaffhauser | January 20, 2016

Academic researchers now have free access to a sizable new dataset for the purposes of expanding the scientific world's understanding of Web sciences. Yahoo Labs released the "Yahoo News Recommendation" dataset, which consists of data on 110 billion events, taking up 13.5 terabytes in its uncompressed format.
Already the data has been used for research on the "effects of bid-pulsing on keyword performance in search engines" and the evaluation of "automatic image annotation using human descriptions at different levels of granularity."
The information inside the dataset is completely anonymized and maintained in the Yahoo Labs Webscope data-sharing program, the company reported. It's made up of user content interactions for about 20 million users during the period from February 2015 to May 2015. 

The Yahoo Webscope Program is a reference library of interesting and scientifically useful datasets for non-commercial use by academics and other scientists.
All datasets have been reviewed to conform to Yahoo's data protection standards, including strict controls on privacy. We have a number of datasets that we are excited to share with you.
Yahoo is pleased to make these datasets available to researchers who are advancing the state of knowledge and understanding in web sciences. The datasets are only available for academic use by faculty and university researchers who agree to the Data Sharing Agreement.

<more at; related links: (Webscope Datasets website) (The Effects of Bid-Pulsing on Keyword Performance in Search Engines. Savannah Wei Shi and Xiaojing Dong. July 2014. [Abstract: The objective of this study is to empirically examine whether and how the bid-pulsing affects keyword auction performance in search engine advertising. In keyword auctions, advertisers can choose a set-to-forget fixed bidding amount (fixed-bidding), or they can change the bid value based on certain type of rules (pulse-bidding). A keyword auction dataset from Yahoo! reveals that around 60% of advertisers frequently changed their bid value for a particular keyword category; moreover, such pulse-bidding behavior is observed throughout the entire time course for some companies. Both cross-sectional and longitudinal analyses on this dataset demonstrate that when a company pulses its bid values, the ranking, the number of exposures, and the number of clicks of the target ad listing will all benefit; in addition, the average bid price will be lower. The results are consistent across four keyword categories. Among the three keyword categories that are in ascending order of level of competitiveness, the magnitude of the bid-pulsing effect also increases. The study extends search engine advertising literature by providing empirical evidence of pulse-bidding strategy under the generalized second price (GSP) auction and by exploring the consequence of such strategy. Managerially, our results suggest that while keeping all other costs and the bidding environment the same, increasing the frequency and scale of bid pulsing will improve keyword performance; this is especially the case when bidding on a highly competitive keyword category.])
