Talk title: Large-Scale Topic Modeling on Twitter

  Speaker: Shuanghong Yang, Ph.D., Lead Scientist, ML Infra @ Twitter

  Time: 10:00 a.m., December 15, 2014

  Venue: Conference Room 1, 3rd Floor, Zhinenghua Building (智能化大厦)

  Abstract:

  We aim to provide a topic-aware, multi-channel experience on Twitter to facilitate content creation, discovery, and consumption. This requires the ability to organize, in real time, a continuous stream of sparse and noisy texts (i.e., tweets) into hundreds of topics with measurable and stringently high precision. We present a spectrum of techniques that contribute to a deployed tweet topic modeling system. These include scaling up LDA to real-time inference at full Twitter scale, high-precision topic filtering, taxonomy construction, non-topical tweet detection, automatic labeled data acquisition, evaluation with human computation, diagnostic and corrective learning, and, most importantly, high-precision topic prediction. I will briefly introduce these techniques and the machine learning infrastructure behind them.

  About the speaker

  Shuanghong is a Senior Researcher & Lead Scientist at Twitter, where he leads the machine learning infrastructure team. Prior to Twitter, he worked on machine learning and predictive analytics at Microsoft Research and Yahoo! Labs. He earned his Ph.D. from the Georgia Institute of Technology in 2012. He has published actively at leading academic conferences and in journals. He won the Yahoo! Key Scientific Challenge award (2011), was a Facebook Fellow finalist (2011), and received the ACM SIGIR 2011 Best Student Paper award, a UAI 2010 Best Student Paper nomination, and the PAKDD 2008 Best Student Paper award.

  

    
