Research Contributions and Achievements

Long-term Research Goal: To make smart computer programs for "learning to make our world better".

Summary of Research Areas and Contributions:

To work toward this long-term goal, I have pursued both theoretical and practical aspects of machine learning and data mining. My research contributions cover fundamental research on machine learning methodology and practical research on real-world application areas, ranging from multimedia information retrieval (the main application domain) to social media, web search and data mining, computational finance, bioinformatics and medical imaging, and computer vision and pattern recognition. Below is a summary of my major research contributions, divided into three categories: (i) foundation of machine learning and data mining, (ii) multimedia search & information systems, and (iii) knowledge discovery & intelligent systems.

1. Foundation of Machine Learning Methodology

1.1 Batch Mode Active Learning

Active learning is an important machine learning methodology. It plays a key role in many real-world applications where unlabeled data is abundant but manual labeling is expensive. Most previous research on active learning is limited to selecting a single unlabeled example at each learning iteration. This can be computationally inefficient, since the model has to be retrained after every newly labeled example. To address this limitation of conventional active learning methods, we presented the novel framework of “Batch Mode Active Learning” (BMAL) [26], which selects multiple informative examples simultaneously at each iteration. Based on this framework, we proposed several variants of BMAL algorithms that address specific research challenges raised by different learning tasks, including the kernel logistic regression based batch mode active learning method [14,26,82] and the semi-supervised SVM batch mode active learning method [37,52]. Our seminal work on batch mode active learning has inspired many studies in the literature, as reflected by 250+ citations for this series of work [14,26,37,52,82] according to Google Scholar. Our article on semi-supervised SVM batch mode active learning [37] ranks No. 1 among the most cited articles published in ACM Transactions on Information Systems (TOIS) over the last five years according to ISI journal citations (Web of Science).
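The difference between single-example and batch selection can be illustrated with a minimal sketch. It is not the BMAL algorithm of [26]: it is a naive top-k uncertainty heuristic that ignores redundancy among the selected examples, and the function name `select_batch` and its parameters are hypothetical names chosen for illustration.

```python
import numpy as np

def select_batch(probs, batch_size):
    """Return indices of the `batch_size` most uncertain unlabeled examples.

    `probs` holds the current model's predicted positive-class probability
    for each unlabeled example. Uncertainty is measured as closeness to 0.5.
    This is an illustrative heuristic, not the method of the cited papers.
    """
    probs = np.asarray(probs, dtype=float)
    margin = np.abs(probs - 0.5)               # small margin = high uncertainty
    order = np.argsort(margin, kind="stable")  # most uncertain first
    return order[:batch_size].tolist()

# One query round labels a whole batch, so the model is retrained once
# per batch instead of once per labeled example.
batch = select_batch([0.9, 0.52, 0.1, 0.45, 0.7], batch_size=2)
print(batch)
```

With `batch_size=1` the same function reduces to the conventional single-example setting, which makes the retraining cost per labeled example easy to compare.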

1.2 Distance Metric Learning

The concept of a distance metric or distance function is fundamental to many fields of science and engineering. Choosing an appropriate distance metric is vital to many real-world applications, including multimedia information retrieval and data mining tasks. I have worked extensively on the problem of Distance Metric Learning (DML) [7], with a focus on difficult scenarios where side information is scarce, noisy, or not given explicitly at all. We have proposed several novel DML algorithms, including the well-known Discriminative Component Analysis (DCA) [55], semi-supervised distance metric learning that exploits both labeled and unlabeled data [36,53], probabilistic distance metric learning for handling uncertain side information [33,50], and nonlinear distance function learning [21,63]. Our efforts in learning robust and effective distance metrics under these challenging conditions have had an impact on many real-world applications [13,33,36,53,55], as reflected by a total of 200+ citations from a variety of application domains in the literature.
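As a point of reference, a learned distance metric is often parameterized as a Mahalanobis distance with a positive semi-definite matrix M. The sketch below assumes that parameterization for illustration only; it is not taken from the cited papers (the nonlinear methods in [21,63], for example, go beyond it), and the function name is hypothetical.

```python
import numpy as np

def mahalanobis_distance(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semi-definite M.

    A metric learning algorithm would fit M from side information;
    here M is simply supplied by the caller.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(diff @ np.asarray(M, dtype=float) @ diff))

x, y = [0.0, 0.0], [3.0, 4.0]
print(mahalanobis_distance(x, y, np.eye(2)))            # identity M: Euclidean distance
print(mahalanobis_distance(x, y, np.diag([1.0, 0.0])))  # M ignores the second feature
```

Choosing M amounts to deciding which feature directions matter for similarity, which is why the quality of the side information used to fit it is so important.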

1.3 Kernel Machine Learning

Kernel methods are a family of important machine learning techniques that often achieve state-of-the-art performance on real-world problems. A well-known example is the support vector machine (SVM). For kernel methods, choosing a good kernel is essential to the final prediction performance. Many kernel methods assume either fixed kernels or kernels of certain parametric/semi-parametric forms. As one of the pioneering approaches, we presented the fully Non-Parametric Kernel Learning (NPKL) technique [25], which aims to learn the optimal kernel from data effectively. More recently, I have developed a family of efficient and scalable SimpleNPKL algorithms that make it possible to apply the NPKL technique to very large datasets [8,22], a key step toward real-world applications [47]. In addition to NPKL, I have also studied novel algorithms for other kernel machine learning tasks, such as Multiple Kernel Learning (MKL), which aims to learn the optimal combination of multiple kernels. We have proposed new algorithms to tackle different challenges of MKL, including unsupervised MKL [10] and multi-layer MKL [11]. Our series of work in this area has been cited 150+ times.

1.4 Online Learning for Scalable Machine Learning

Recently I have worked actively in the area of online learning, a family of efficient and scalable algorithms for massive-scale machine learning. We proposed novel techniques to overcome several limitations of traditional online learning methods. First, unlike traditional online learning methods that are usually designed to optimize mistake rate or classification accuracy, we proposed a novel Online AUC Maximization (OAM) method, which optimizes the Area Under the ROC Curve (AUC) [15]. Second, to make second-order online learning methods robust under noisy observations, we proposed a novel Soft Confidence-Weighted learning method [2], which can handle non-separable data and is more effective for noisy training examples. Third, to improve the learning efficacy of traditional kernel-based online learning methods, which often perform only a single update when adding a new support vector, we proposed a novel Double Updating Online Learning (DUOL) method [9,20], which updates two support vectors simultaneously and thus significantly boosts the learning efficacy. Last but not least, we proposed a novel framework of Online Multiple Kernel Learning (OMKL) [1,6,18], which efficiently learns predictive models by combining information from multiple heterogeneous sources.
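To see why optimizing AUC differs from optimizing accuracy, it helps to recall that AUC is defined over pairs of examples: it is the fraction of (positive, negative) pairs that the model ranks correctly. The sketch below computes this quantity directly. It illustrates the objective only and is not the OAM algorithm of [15]; the function name is hypothetical.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half correct.
    Labels equal to 1 are positives; everything else is a negative.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    correct = sum(
        1.0 if p > n else (0.5 if p == n else 0.0)
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))

print(pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Because every new example forms pairs with previously seen examples of the opposite class, an online learner cannot evaluate this objective from the current example alone, which is what makes the online setting nontrivial.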

2. Web-scale Multimedia Search and Information Systems

Besides fundamental research in machine learning methodology, I believe it is equally important to investigate the application of machine learning techniques to real-world challenges. One of my key application areas is multimedia search and information systems. The following is a summary of my major contributions in this area.

2.1 Collaborative Image Retrieval

A fundamental challenge in multimedia retrieval is the well-known semantic gap between the semantic meaning and the low-level features of multimedia data. To tackle this challenge, we have proposed a novel paradigm of image retrieval, termed Collaborative Image Retrieval (CIR) [40,41,53,54,57], which uses machine learning techniques to bridge the semantic gap by effectively mining the logs of users' search history. In particular, we have developed two types of approaches for CIR. The first approach, termed log-based relevance feedback [41,57], explicitly utilizes the logs of users' relevance feedback in an online fashion. The second approach exploits users' logs in an offline fashion. It learns robust distance metrics from noisy user log data for multimedia retrieval, using distance metric learning methods such as regularized metric learning [40], Laplacian regularized metric learning [36,53], and probabilistic metric learning. Our seminal work in this area has made a significant impact on multimedia retrieval, as reflected by a total of 300+ citations for this series of work.

2.2 Interactive Multimedia Retrieval

One way to close the semantic gap in content-based multimedia retrieval is to adopt an interactive retrieval paradigm based on relevance feedback. However, traditional relevance feedback methods suffer from some critical drawbacks, such as poor learning efficacy, class imbalance, and insufficient labeled data. I have attempted to overcome these limitations from a machine learning perspective. In particular, we proposed to apply batch mode active learning techniques to improve the learning efficacy by optimizing the selection of multiple examples for relevance feedback [14]. We addressed the class imbalance and insufficient labeled data issues by applying semi-supervised active learning algorithms [37,52,56]. These techniques have been applied to various multimedia retrieval tasks, including content-based image retrieval [14,52], multimodal news video retrieval [38], and medical image categorization and retrieval [26], with significant impact as reflected by a total of 250+ citations for this series of work.

2.3 Social Media Search and Mining

Social media, an emerging type of multimedia data, has raised many new research challenges and exciting real applications [29]. Recently we have worked actively on some of the key challenges in this area, including large-scale social image retrieval [34], automated photo tagging [33,46,50], and auto face annotation by mining web/social images [28,43,45]. For these real-world applications, we have applied our proposed machine learning techniques to solve the emerging research challenges of social media search and data mining. To encourage researchers and practicing engineers to contribute to this emerging area, I also co-founded and co-chaired the series of ACM SIGMM International Workshops on Social Media (WSM), held in conjunction with the top-tier ACM Multimedia conference over the past years. The ACM Multimedia conference has accepted "Social Media" as a new track this year and invited me as one of the Area Chairs to lead the new track.

3. Large-scale Knowledge Discovery and Intelligent Systems

The advent of the big data age has presented a number of challenges and opportunities for applying machine learning techniques to large-scale knowledge discovery and intelligent systems. Recently, we have investigated both theoretical and practical issues in mining big data. In addition to the online learning work mentioned above, which has shown promising performance for big data mining, we have also investigated several practical techniques for mining massive amounts of data in real-world applications. Examples of our related work include peer-to-peer machine learning in distributed environments [60], collaborative online learning [76], multi-view semi-supervised learning [65], multi-kernel boosting classification [16], and semi-supervised clustering [21]. Besides studying these novel algorithms, we have also applied machine learning techniques to multidisciplinary research across several real-world application domains [4,68], including computational finance, bioinformatics, medical imaging, computer vision and pattern recognition. The following is a summary of our contributions to multidisciplinary research in some of these areas.

3.1 Computational Finance

Machine learning and data mining techniques have emerged as one of the promising directions for solving many open challenges in computational finance. My major research contribution in this area focuses on the open problem of On-line Portfolio Selection [66], a critical component of many real-world intelligent financial systems. As a machine learning and data mining researcher, I take a very different perspective on this challenge. In particular, we have developed new strategies, based on machine learning techniques, for on-line portfolio selection. We proposed a family of new online trading algorithms [4,5,59,67] that exploit the mean reversion principle using state-of-the-art online learning techniques. Comprehensive empirical studies on a variety of large-scale real testbeds showed that the proposed strategies outperform the state-of-the-art strategies for intelligent portfolio management in the literature.
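The mean reversion idea can be sketched as a single passive-aggressive style update: if the current portfolio earned more than a threshold in the last period, weight is shifted toward assets that performed below the market average, on the expectation that they will rebound. The code below is a minimal sketch under stated assumptions. The threshold `eps`, the function names, and the exact update form are illustrative choices in the spirit of the passive-aggressive mean reversion approach named in [5], and whether they match the cited algorithms in every detail is an assumption.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {b : b >= 0, sum(b) = 1}."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def mean_reversion_update(b, x, eps=0.5):
    """One illustrative update of portfolio weights b given price relatives x.

    x[i] is the ratio of asset i's closing price to its previous closing price.
    If the portfolio return b.x is at most eps, the portfolio is left unchanged.
    """
    b = np.asarray(b, dtype=float)
    x = np.asarray(x, dtype=float)
    loss = max(0.0, float(b @ x) - eps)
    centered = x - x.mean()                   # deviation from the market average
    denom = float(centered @ centered)
    tau = loss / denom if denom > 0 else 0.0  # step size of the update
    return project_to_simplex(b - tau * centered)

# Asset 0 rose and asset 1 fell, so weight moves toward asset 1.
new_b = mean_reversion_update([0.5, 0.5], [1.2, 0.8], eps=0.5)
print(new_b)
```

The projection step keeps the weights non-negative and summing to one, so the result is always a valid long-only portfolio.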

3.2 Bioinformatics & Medical Imaging

In bioinformatics, we have addressed several open challenges by developing computational methods for the prediction of binding hot spots, an important task for understanding protein-protein interactions. In particular, we proposed novel predictive learning approaches to identify binding hot spots at the epitope sites of the HA1 proteins and at the paratope sites of the 2D1 antibody [64]. Besides predicting binding hot spots, we also investigated computational methods for identifying B-cell epitopes to facilitate the understanding of the basic recognition mechanism of the immune response [62], which in turn guides disease diagnosis, vaccine design and drug development. This work has been published or accepted in top bioinformatics journals. Last but not least, we also proposed machine learning techniques to tackle real-world challenges in the medical imaging domain, including medical image categorization [26] and medical image retrieval [13].

3.3 Computer Vision & Pattern Recognition

We have investigated novel techniques for large-scale vision and recognition systems using machine learning and data mining. The major achievements of my previous work include: (i) face alignment, tracking and annotation: we have investigated new machine learning based technologies for face alignment, tracking, and annotation for various applications in computer vision, multimedia, and augmented reality. For example, we have proposed a novel unsupervised face alignment technique [78]. In addition, we have investigated the face annotation problem extensively [39,43], using machine learning techniques to overcome insufficient and noisy labeled data; (ii) 3D object modeling, tracking, and compression: we have investigated several effective machine learning methods for 3D object modeling and tracking. In particular, we have proposed efficient 3D deformable techniques for modeling implicit surfaces to tackle the non-rigid shape recovery problem [79,80], and geometry image and geometry video approaches for efficiently compressing and streaming large-scale 3D meshes [44,48,49,71,72].

Selected Publications (by Topics & Years)

• Machine Learning Methodology

1. "Online Multiple Kernel Classification", Steven C.H. Hoi, Rong Jin, Peilin Zhao*, Tianbao Yang, Machine Learning (MLJ), 2012. In press. (Impact Factor = 1.663, 5-Year Impact Factor = 4.099)
2. "Exact Soft Confidence-Weighted Learning", Jialei Wang*, Steven C.H. Hoi, The 29th International Conference on Machine Learning (ICML2012), June 26 - July 1, Edinburgh, Scotland, 2012.
3. "Fast Bounded Online Gradient Descent Algorithms for Scalable Kernel-Based Online Learning", Peilin Zhao*, Jialei Wang*, Pengcheng Wu*, Rong Jin, Steven C.H. Hoi. The 29th International Conference on Machine Learning (ICML2012), June 26 - July 1, Edinburgh, Scotland, 2012.
4. "On-line Portfolio Selection with Moving Average Reversion", Bin Li*, Steven C.H. Hoi, The 29th International Conference on Machine Learning (ICML2012), June 26 - July 1, Edinburgh, Scotland, 2012.
5. "PAMR: Passive-Aggressive Mean Reversion Strategy for Portfolio Selection", Bin Li*, Peilin Zhao*, Steven C.H. Hoi, Vivekanand Gopalkrishnan, Machine Learning (MLJ), vol. 87, no.2, pp.221-258, 2012. (Impact Factor = 1.663, 5-Year Impact Factor = 4.099)
6. “Online Kernel Selection: Algorithms and Evaluations”, Tianbao Yang, Mehrdad Mahdavi, Rong Jin, Jinfeng Yi, Steven C.H. Hoi. In The Twenty-Sixth Conference on Artificial Intelligence (AAAI2012), Toronto, Ontario, Canada, July, 2012. (oral, acceptance rate = 294/1129=26%)
7. "Introduction to the Special Issue on Distance Metric Learning in Intelligent Systems" Steven C.H. Hoi, Rong Jin, Jinghui Tang, Zhi-hua Zhou, ACM Trans. on Intelligent Systems and Technology (TIST), 2012.
8. “A Family of Simple Non-Parametric Kernel Learning Algorithms from Pairwise Constraints”, Jinfeng Zhuang*, Ivor W Tsang, Steven C.H. Hoi, Journal of Machine Learning Research (JMLR), 2011. (Impact Factor = 2.789, 5-Year Impact Factor = 4.748)
9. "Double Updating Online Learning", Peilin Zhao*, Steven C.H. Hoi, Rong Jin, Journal of Machine Learning Research (JMLR), 2011. (Impact Factor=2.789, 5-Year Impact Factor = 4.748)
10. "Unsupervised Multiple Kernel Learning." Jinfeng Zhuang*, Jialei Wang^, Steven C. H. Hoi, Xiangyang Lan, Journal of Machine Learning Research (JMLR), W&CP, vol. 20, pp.129-144, Nov. 2011. (Impact Factor = 2.789, 5-Year Impact Factor = 4.748)
11. "Two-Layer Multiple Kernel Learning" Jinfeng Zhuang*, IW Tsang, Steven C.H. Hoi, Journal of Machine Learning Research (JMLR), W&CP, vol.15, 2011. pp.909-917. (Impact Factor=2.789, 5-Year Impact Factor = 4.748)
12. "Exclusive Lasso for Multi-task Feature Selection", Yang Zhou, Rong Jin, Steven C.H. Hoi, Journal of Machine Learning Research (JMLR), W&CP, vol. 9, Italy, May 2010. pp. 988-995. (Impact Factor=2.789, 5-Year Impact Factor = 4.748)
13. “A Boosting Framework for Visuality-Preserving Distance Metric Learning and Its Application to Medical Image Retrieval”, Liu Yang et al, Steven C.H. Hoi, M. Satyanarayanan, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol.32, no.1, pp. 33-44, 2010. (Impact Factor = 4.378, 5-Year Impact Factor = 6.424)
14. "Batch Mode Active Learning with Applications to Text Categorization and Image Retrieval," Steven C.H. Hoi, Rong Jin, Michael R. Lyu, in IEEE Transactions on Knowledge and Data Engineering (TKDE), 2009. (Impact Factor = 2.285, 5-Year Impact Factor = 3.691)
15. "Online AUC Maximization" Peilin Zhao*, Steven C.H. Hoi, Rong Jin, Tianbao Yang, The 28th International Conference on Machine Learning (ICML2011), 2011. (acceptance rate: 152 out of 589=25.8%)
16. "MKBoost: A Framework of Multiple Kernel Boosting", Hao Xia*, Steven C.H. Hoi, Eleventh SIAM International Conference on Data Mining (SDM2011), Phoenix/Mesa, Arizona, 28-30 April, 2011. (acceptance rate: 86/343 = 24.5%)
17. "OTL: A Framework of Online Transfer Learning" Peilin Zhao* and Steven C.H. Hoi, The 27th International Conference on Machine Learning (ICML2010), Haifa, Israel, 21-24 June, 2010 (Oral, acceptance rate: 152/594=25.6%)
18. "Online Multiple Kernel Learning: Algorithms and Mistake Bounds", Rong Jin, Steven C.H. Hoi, Tianbao Yang, The 21st International Conference on Algorithmic Learning Theory (ALT2010), Canberra, Australia, October 6 - 8, 2010.
19. "Two-View Transductive Support Vector Machines", Guangxia Li*, Kuiyu Chang, Steven C.H. Hoi, SIAM International Conference on Data Mining (SDM2010), Columbus, Ohio, 2010 (acceptance rate: 82/351=23.36%)
20. "DUOL: A Double Updating Approach for Online Learning," Peilin Zhao*, Steven C.H. Hoi, Rong Jin, Advances in Neural Information Processing Systems (NIPS2009), MIT Press, 2009 (acceptance rate=23.8%) (PREMIA's Best Student Paper Award)
21. "Learning Bregman Distance Functions and Its Application for Semi-Supervised Clustering," Lei Wu^, Rong Jin, Steven C.H. Hoi, Jianke Zhu, N. Yu, Advances in Neural Information Processing Systems (NIPS2009), MIT Press, 2009 (acceptance rate: 23.8%)
22. "SimpleNPKL: Simple Non-Parametric Kernel Learning", Jinfeng Zhuang*, I. Tsang, Steven C.H. Hoi, International Conference on Machine Learning (ICML2009), 2009. (oral, acceptance rate = 26%)
23. "Active Kernel Learning," Steven C.H. Hoi and Rong Jin, In Proceedings of International Conference on Machine Learning (ICML2008), Helsinki, Finland, 5-9 July, 2008. (full regular paper, oral presentation, acceptance rate: 155/583 = 26%)
24. "Semi-Supervised Ensemble Ranking," Steven C.H. Hoi and Rong Jin, In Proceedings of Association for the Advancement of Artificial Intelligence (AAAI2008), Chicago, 13-17 July, 2008. (full regular paper, oral presentation, acceptance rate: 227/937 = 24%) 
25. "Learning Non-Parametric Kernel Matrices from Pairwise Constraints,", Steven C.H. Hoi, Rong Jin and Michael R. Lyu, In The 24th Annual International Conference on Machine Learning (ICML2007), Corvallis, OR US, 20-24 June, 2007. (full regular paper, oral presentation, acceptance rate = 28%)
26. “Batch Mode Active Learning and Its Applications to Medical Image Classification”, Steven C.H. Hoi, Rong Jin, Jianke Zhu and Michael R. Lyu, In International Conference on Machine Learning (ICML2006), Pittsburgh, Penn, US, 2006. (full paper, oral, acceptance rate = 18%)

• Multimedia Search & Information Systems

27. "Boosting Multi-Kernel Locality-Sensitive Hashing for Scalable Image Retrieval", Hao Xia*, Steven C.H. Hoi, Pengcheng Wu*, Rong Jin, ACM SIGIR Conference (SIGIR2012), Portland, Oregon, USA, August 12-16, 2012. (Full paper, acceptance rate = 20%)
28. "A Unified Learning Framework for Auto Face Annotation by Mining Web Facial Images", Dayong Wang*, Steven C.H. Hoi, Ying He, Proceedings of The 21st ACM Conference on Information and Knowledge Management (CIKM2012), Hawaii, 2012. (Oral, acceptance rate = 146/1088 = 13.4%)
29. “Social Media Modeling and Computing”, Steven C.H. Hoi, Jiebo Luo, Susanne Boll, Dong Xu, Rong Jin, Irwin King, book series of Advances in Pattern Recognition, Springer Press, 2011. (Book Editor)
30. "Enhancing Bag-of-Words Models by Efficient Semantics-Preserving Metric Learning" Lei Wu^, Steven C.H. Hoi, IEEE Multimedia, vol.18, no.1, pp.24-37, 2011. (Impact Factor=1.661, 5-Year Impact Factor=2.020)
31. “Near-Duplicate Keyframe Retrieval by Semi-supervised Learning and Nonrigid Image Matching”, Jianke Zhu, Steven C.H. Hoi, M.R. Lyu, S. Yan, ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP), vol. 7, no. 1, pp. 4, 2011. (Impact Factor = 2.465)
32. "Active Multiple Kernel Learning for Interactive Intelligent 3D Object Retrieval Systems", Steven C.H. Hoi, Rong Jin, ACM Transactions on Interactive Intelligent Systems (TiiS), 1(1):3, 2011.
33. "Distance Metric Learning from Uncertain Side Information for Automated Photo Tagging" Lei Wu^, Steven C.H. Hoi, Rong Jin, Jianke Zhu, N. Yu, ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no. 2, pp. 13:1-28, 2011. (Invited Article for Special Issue)
34. "SIRE: A Social Image Retrieval Engine", Steven C.H. Hoi and Pengcheng Wu*, ACM International Multimedia Conference (MM2011), demo track, 2011.
35. "Semantics-Preserving Bag-of-words Models and Applications", Lei Wu^, Steven C.H. Hoi, Nenghai Yu, IEEE Transactions on Image Processing (TIP), vol. 19, no. 7, pp. 1908-1920, 2010. (Impact Factor = 2.848, 5-Year Impact Factor = 4.139)
36. "Semi-Supervised Distance Metric Learning for Collaborative Image Retrieval and Clustering," Steven C.H. Hoi, Wei Liu, and Shih-Fu Chang, ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP), vol. 6, no 3., 2010. (Impact Factor = 2.465)
37. "Semi-Supervised SVM Batch Mode Active Learning with Applications to Image Retrieval," Steven C.H. Hoi, R. Jin, J. Zhu, M.R. Lyu, ACM Transactions on Information Systems (TOIS), 27(3), 2009. (Impact Factor = 1.667, 5-Year Impact Factor = 5.774)
38. “A Multimodal and Multilevel Ranking Framework for Large-Scale Video Retrieval,” Steven C.H. Hoi and Michael R. Lyu, IEEE Transactions on Multimedia (TMM), vol. 10, no. 4, pp. 607-619, June 2008. (Impact Factor = 1.822, 5-Year Impact Factor = 2.372)
39. “Face Annotation using Transductive Kernel Fisher Discriminant”, Jianke Zhu, Steven C.H. Hoi and Michael R. Lyu, IEEE Transactions on Multimedia (TMM), vol 10, no. 01, pp. 86-96, Jan 2008. (Impact Factor = 1.822, 5-Year Impact Factor = 2.372)
40. “Collaborative Image Retrieval via Regularized Metric Learning,” Luo Si, Rong Jin, Steven C.H. Hoi and Michael R. Lyu. ACM Multimedia Systems Journal (MMSJ), Special Issue on Machine Learning Approaches to Multimedia Info Retrieval, pp.34-44, 2006. (5Y Impact Factor = 1.216)
41. “A Unified Log-based Relevance Feedback Scheme for Image Retrieval,” Steven C.H. Hoi, Michael R. Lyu, and Rong Jin. IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 18, no.4, pp. 509-524, 2006. (Impact Factor = 2.285, 5-Year Impact Factor = 3.691)
42. "Modeling Social Strength in Social Media Community via Kernel-based Learning," Jinfeng Zhuang*, Tao Mei, Steven C. H. Hoi, Xian-Sheng Hua, Shipeng Li, ACM International Multimedia Conference (MM2011), 2011. (Full Long Paper Track, acceptance rate = 17%).
43. "Retrieval-based Face Annotation by Weak Label Regularized Local Coordinate Coding" Dayong Wang*, Steven C.H. Hoi, Ying He, ACM International Multimedia Conference (MM2011), 2011. (Full Long Paper Track, acceptance rate = 17%).
44. "Modeling 3D Articulated Motions with Conformal Geometry Videos" Dao Thi Phuong Quynh*, Sun Qian, Jiazhi Xia, Ying He, Steven C.H. Hoi, ACM International Multimedia Conference (MM2011), 2011. (Full Long Paper Track, acceptance rate = 17%).
45. "Mining Weakly Labeled Web Facial Images for Search-based Face Annotation" Dayong Wang*, Steven C.H. Hoi, Ying He, The 34th Annual International ACM SIGIR Conference (SIGIR2011), Beijing, 2011. (acceptance rate: 108/545 = 19.8%)
46. "Mining Social Images with Distance Metric Learning for Automated Image Tagging," Pengcheng Wu*, Steven C.H. Hoi, Peilin Zhao*, Ying He, Fourth ACM International Conference on Web Search and Data Mining (WSDM2011), Hong Kong, 2011. (Full regular paper, selected as "Oral+Poster" presentation, acceptance rate: 32/372=8.6%)
47. "A Two-View Learning Approach for Image Tag Ranking", Jinfeng Zhuang*, Steven C.H. Hoi, Fourth ACM International Conference on Web Search and Data Mining (WSDM2011), Hong Kong, 2011. (Full regular paper, poster presentation, acceptance rate 51/372 = 13.7%)
48. "Modeling 3D motion data using geometry videos", Y. He, J. XIA, TP DAO, X. Chen, Steven CH Hoi, ACM International Conference on Multimedia (MM2010), Oct 2010. (Oral, acceptance rate = 15%)
49. "Streaming 3D Meshes Using Spectral Geometry Images", Ying He, B.S. Chew, Dayong Wang*, Steven C.H. Hoi, L.P. Chau, ACM International Conference on Multimedia (MM2009), Beijing, 2009 (acceptance rate = 16%)
50. "Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging", Lei Wu^, Steven C.H. Hoi, Jianke Zhu, Rong Jin, N. Yu, ACM International Conference on Multimedia (MM2009), Beijing, 2009 (acceptance rate = 16%)
51. "Near-Duplicate Keyframe Retrieval by Nonrigid Image Matching," Jianke Zhu, Steven C.H. Hoi, Michael R. Lyu and Shuicheng Yan, ACM International Conference on Multimedia (ACM MM2008), 2008. (Oral, full long paper, acceptance rate = 17%)
52. "Semi-Supervised SVM Batch Mode Active Learning for Image Retrieval," Steven C.H. Hoi, Rong Jin, Jianke Zhu and Michael R. Lyu, In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR2008), Alaska, US, June 24-26, 2008. (full paper, poster presentation, acceptance rate = 31%)
53. "Semi-Supervised Distance Metric Learning for Collaborative Image Retrieval," Steven C.H. Hoi, Wei Liu, and Shih-Fu Chang, In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR2008), Alaska, US, June 24-26, 2008. (full paper, poster presentation, acceptance rate = 31%)
54. “Output Regularized Metric Learning with Application to Collaborative Image Retrieval”, Wei Liu, Steven C.H. Hoi, J. Liu, and Xiaoou Tang, In Proc. European Conference on Computer Vision (ECCV2008), Marseille, France, October 2008 (full regular paper, acceptance rate = 21%)
55. “Learning Distance Metrics with Contextual Constraints for Image Retrieval,” Steven C.H. Hoi, Wei Liu, Michael R. Lyu, and Wei-Ying Ma, In Proc IEEE International Conference on Computer Vision and Pattern Recognition (CVPR2006), 2006. (full paper, acceptance rate: ~20%)
56. “A Semi-Supervised Active Learning Framework for Image Retrieval,” Steven C.H. Hoi and Michael R. Lyu. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR2005), pp. 302-309, San Diego, 20-26 June, 2005. (full paper, acceptance rate: ~20%)
57. “A Novel Log-based Relevance Feedback Technique in Content-based Image Retrieval,” Chu-Hong Hoi and Michael R. Lyu. In Proceedings of ACM Multimedia Conference (ACM MM2004), pp.24-31, New York, NY, USA, 10-16 October, 2004 (oral, long paper, acceptance rate = 16%)

• Knowledge Discovery & Intelligent Systems

58. "MKBoost: A Framework of Multiple Kernel Boosting", Hao Xia*, Steven C.H. Hoi, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2012. (Impact Factor=2.285, 5-Year Impact Factor=3.691)
59. "Confidence Weighted Mean Reversion Strategy for On-Line Portfolio Selection", Bin Li*, Steven C.H. Hoi, Peilin Zhao*, Vivek Gopalkrishnan, ACM Transactions on Knowledge Discovery from Data (TKDD), 2012. In press. (Impact Factor= 1.419)
60. "Predictive Handling of Asynchronous Concept Drifts in Distributed Environments", H.H. Ang, V. Gopalkrishnan, I. Zliobaite, M. Pechenizkiy, S.C.H. Hoi, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2012. (Impact Factor=2.285, 5-Year Impact Factor=3.691)
61. "Structural and functional analysis of multi-interface domains", Liang Zhao*, Steven C.H. Hoi, Limsoon Wong, T. Hamp, B. Rost, J. Li, PLoS One, 2012. (Impact Factor= 4.092, 5-Year Impact Factor = 4.537)
62. "Prediction of B-cell epitopes at both protrusive and planar surface areas of antigens with one or multiple epitopes", Liang Zhao*, Steven C.H. Hoi, Limsoon Wong, L. Lu, J. Li, BMC Bioinformatics, 2012. (Impact Factor = 2.751, 5-Year Impact Factor = 3.493)
63. "Learning Bregman Distance Functions with Applications to Semi-Supervised Clustering," Lei Wu^, Steven C.H. Hoi, Rong Jin, Jianke Zhu, N. Yu, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2012. To appear. (Impact Factor = 2.285, 5-Year Impact Factor = 3.691)
64. "Structural analysis of the hot spots in the binding between H1N1 HA and the 2D1 antibody: do mutations of H1N1 from 1918 to 2009 affect much on this binding?" Qian Liu*, Steven C. H. Hoi, C.T.T. Su, Zhenhua Li, C.K. Kwoh, L. Wong, J. Li: Bioinformatics 27(18): 2529-2536, 2011. (Impact Factor= 5.468, 5-Year Impact Factor = 6.051)
65. "Multi-view Semi-supervised Learning with Consensus" Guangxia Li*, Kuiyu Chang, Steven C.H. Hoi, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2011. (Impact Factor= 2.285, 5-Year Impact Factor = 3.691)
66. "CORN: Correlation-driven Nonparametric Learning Approach for Portfolio Selection", Bin Li*, Steven C.H. Hoi, V. Gopalkrishnan, ACM Trans. on Intelligent Systems and Technology (TIST), vol. 2, no. 3, 2011.
67. "Confidence Weighted Mean Reversion Strategy for On-Line Portfolio Selection" Bin Li*, Steven C.H. Hoi, Peilin Zhao*, Vivek Gopalkrishnan, Journal of Machine Learning Research (JMLR), W&CP, vol. 15, 2011. pp. 434-442. (Impact Factor = 2.789, 5-Year Impact Factor = 4.748)
68. "Software Process Evaluation: A Machine Learning Approach" Ning Chen*, Steven C. H. Hoi, Xiaokui Xiao, 26th IEEE/ACM International Conference On Automated Software Engineering (ASE2011), 2011. (Full oral paper, 37/252 = 14.6%)
69. "When Recommendation Meets Mobile: Contextual and Personalized Recommendation On The Go" Jinfeng Zhuang*, Tao Mei, Steven C.H. Hoi, Shipeng Li, 13th International Conference on Ubiquitous Computing (UbiComp2011), 2011. (acceptance rate: 50/302 = 16.6%)
70. “P2PDocTagger: Content management through automated P2P collaborative tagging”, H.H. Ang*, V. Gopalkrishnan, W.K. Ng, S.C.H. Hoi, PVLDB Journal, 3(2): 1601-1604, 2010.
71. "Modeling and Compressing 3D Facial Expressions Using Geometry Videos", Jiazhi Xia, Dao T. P. Quynh*, Ying He, Xiaoming Chen, Steven C. H. Hoi, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2011. (Impact Factor = 2.548, 5-Year Impact Factor = 3.187)
72. "Spectral Geometry Image: Image Based 3d Models for Digital Broadcasting Applications", B.-S. Chew, L.-P. Chau, Y. He, D. Wang*, Steven C.H. Hoi, IEEE Transactions on Broadcasting, vol. 57, no. 3, pp.636-64. 2011. (Impact Factor =1.444, 5-Year Impact Factor = 1.801)
73. “Robust Regularized Kernel Regression”, Jianke Zhu, Steven C.H. Hoi and Michael R. Lyu, IEEE Transactions on Systems, Man and Cybernetics - Part B (TSMC), vol. 38, no.6, December 2008. (Impact Factor = 3.007, 5-Year Impact Factor = 3.513)
76. "Micro-blogging Sentiment Detection by Collaborative Online Learning", Guangxia Li*, Steven C.H. Hoi, Kuiyu Chang, and Ramesh Jain, IEEE International Conference on Data Mining (ICDM2010), Sydney, Australia, December 14-17, 2010.
77. "Web Query Recommendation via Sequential Query Prediction", Qi He, Z. Liao, D. Jiang, Steven C.H. Hoi, K. Chang, E.P. Lim, and H. Li, IEEE Conference on Data Engineering (ICDE2009), Shanghai, China, 2009 (full industrial paper, acceptance rate = 16.8%)
78. "Unsupervised Face Alignment by Robust Nonrigid Mapping", Jianke Zhu, Luc Van Gool, Steven C.H. Hoi, IEEE International Conference on Computer Vision (ICCV2009), Japan 2009
79. "Nonrigid Shape Recovery by Gaussian Process Regression", Jianke Zhu, Steven C.H. Hoi, MR Lyu, IEEE CVPR Conference (CVPR2009), Florida, US, 2009 (acceptance rate = 22%)
80. “An Effective Approach to 3D Deformable Surface Tracking”, Jianke Zhu, Steven C.H. Hoi, Z. Xu, M.R Lyu, In Proc. European Conference on Computer Vision (ECCV2008), Marseille, France, Oct 2008 (full regular paper, acceptance rate = 21%)
81. “Learning the Unified Kernel Machine for Classification”, Steven C.H. Hoi, Edward Y Chang and Michael R. Lyu, In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD2006), Pittsburgh, US, 2006. (full paper, oral, acceptance rate = 11%) (Nominated for Best Paper Award)
82. "Large-Scale Text Categorization by Batch Mode Active Learning," Steven C.H. Hoi, Rong Jin and Michael R. Lyu, In 15th International World Wide Web conference (WWW2006), Edinburgh, Scotland, UK, 2006. (oral, acceptance rate = 11%)
83. "A Multi-Scale Tikhonov Regularization Scheme for Implicit Surface Modelling," Jianke Zhu, Steven C.H. Hoi, and Michael R. Lyu, In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR2007), June, 2007. (full regular paper, poster presentation)
84. "Real-Time Non-Rigid Shape Recovery via Active Appearance Models for Augmented Reality," Jianke Zhu, Steven C.H. Hoi, and Michael R. Lyu, In 9th European Conference on Computer Vision (ECCV2006), Graz, Austria, May 7 - 13, 2006. (full paper, poster, acceptance rate: 21%)
85. "Time-Dependent Semantic Similarity Measure of Queries Using Historical Click-Through Data", Qiankun Zhao, Steven C.H. Hoi, Tie-Yan Liu, et al., In 15th International World Wide Web conference (WWW2006), Edinburgh, Scotland, UK, 2006. (oral, acceptance rate = 11%)