Machine Learning and Analytics with Significantly Imbalanced Data and Applications in Information Systems
Bio
George A. Tsihrintzis is Full Professor and Head of the Department of Informatics at the University of Piraeus, Greece. He received the Diploma of Electrical Engineer (with honors) from the National Technical University of Athens, Greece, and the M.Sc. and Ph.D. degrees in Electrical Engineering from Northeastern University, Boston, Massachusetts, USA. His current research interests include Pattern Recognition, Machine Learning, Decision Theory, and Statistical Signal Processing and their applications in Multimedia Interactive Services, User Modeling, Knowledge-based Software Systems, Human-Computer Interaction and Information Retrieval. He has authored or co-authored over 300 research publications in these areas, which include 5 monographs and 14 edited volumes.
He is the Editor-in-Chief of Intelligent Decision Technologies (IOS Press) and the International Journal of Computational Intelligence Studies (Inderscience) and a member of the editorial boards of 8 additional journals.
He has chaired over 20 international conferences.
He has guest co-edited 9 special issues of international journals.
He was the recipient of the Best Poster Paper Award of the 5th International Conference on Information Technology: New Generations, Las Vegas, USA, April 7-9, 2008, for co-authoring a paper titled: "Evaluation of a Middleware System for Accessing Digital Music Libraries in Mobile Services."
He was the recipient of one of the Best Applications Papers Award of the 29th Annual International Conference of the British Computer Society Specialist Group on Artificial Intelligence, Cambridge, UK, December 15-17, 2009, for co-authoring a paper titled: "On Assisting a Visual-Facial Affect Recognition System with Keyboard-Stroke Pattern Information."
He was the recipient of one of the Best Student Paper Awards of the 5th IEEE International Conference on Information, Intelligence, Systems and Applications (IISA2014), Chania, Crete, Greece, July 7-9, 2014, for co-authoring a paper titled: "Genetic-AIRS: A Hybrid Classification Method based on Genetic Algorithms and Artificial Immune Systems."
He has delivered keynote speeches at several international conferences.
Abstract
Classification is a very common supervised machine learning and data analytics task, in which a piece of data needs to be assigned by the learning algorithm to one of a given number of potential classes of origin. More specifically, in classification, the machine is given a set of training samples for each of which the class of origin is known. The machine is then asked to learn inductively from the given samples and generalize into a rule for assigning data into classes of origin that allows it to classify samples other than the ones used for training. It is the usual assumption of the binary classification problem that the number of training samples available from one class is comparable to the number of training samples available from the other class. However, it is not uncommon in certain applications for the number of training samples from one class to be significantly higher than the number of training samples from the other class. For example, users of recommender systems are very willing to provide examples (samples) of items they like, but are reluctant to provide samples of items they do not like. Similarly, in a protected system, the number of samples of intruders may be relatively limited, while the number of available samples of allowed/legal users may be quite high. Classification problems with class imbalance arise in nature as well. For example, the immune system in vertebrate organisms needs to be able to discriminate between self cells and other antigens, so as to respond accordingly. A high number of samples from the class of self cells are available to train the immune system. On the other hand, the class of antigens is very broad, including cancer cells, cells from other organisms, molecules and other intruding substances, viruses, bacteria, and parasitic worms. The number of available training samples from the class of antigens is very limited when compared to the size and diversity of this class.
The imbalance in the number of samples from each class affects the performance of traditional binary classifiers. Indeed, in probabilistic terms, classification problems in which training samples from one class are significantly higher in number than training samples from the other class result in significantly uneven prior probabilities of the two classes. The class from which a higher number of samples is available (target class) will have higher prior probability, while the class from which only a limited number of samples is available (outlier class) will have much lower prior probability. In turn, this affects the posterior probabilities of a sample coming from one or the other class. As a result, a binary classifier will erroneously tend to decide more often that an unknown sample comes from the target class than from the outlier class. In recommender system applications, this would mean that the system would tend to recommend items that the user might not like. Similarly, in a protected system, intruders and other threats might not be recognized.
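The effect of uneven priors on the posterior can be illustrated with a small numerical sketch. It is not taken from the talk: it assumes two one-dimensional Gaussian classes with equal variance, and the function names, class means, and prior values below are hypothetical choices made only for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_target(x, prior_target, mean_target=0.0, mean_outlier=2.0, std=1.0):
    """P(target | x) by Bayes' rule for two Gaussian classes (assumed model)."""
    prior_outlier = 1.0 - prior_target
    joint_target = prior_target * gaussian_pdf(x, mean_target, std)
    joint_outlier = prior_outlier * gaussian_pdf(x, mean_outlier, std)
    return joint_target / (joint_target + joint_outlier)


def decision_threshold(prior_target, mean_target=0.0, mean_outlier=2.0, std=1.0):
    """Value of x at which both posteriors equal 0.5 (equal variances assumed)."""
    midpoint = 0.5 * (mean_target + mean_outlier)
    log_prior_ratio = math.log(prior_target / (1.0 - prior_target))
    return midpoint + std ** 2 / (mean_outlier - mean_target) * log_prior_ratio


if __name__ == "__main__":
    x = 2.0  # a sample located exactly at the assumed outlier-class mean
    for prior in (0.5, 0.9, 0.99):
        print(
            f"prior={prior:.2f}  "
            f"P(target|x)={posterior_target(x, prior):.3f}  "
            f"threshold={decision_threshold(prior):.3f}"
        )
```

Under these assumptions, balanced priors place the decision boundary midway between the two class means, while a target-class prior of 0.99 pushes the boundary past the outlier mean, so that even a sample located exactly at the outlier mean is assigned to the target class.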
In this presentation, we will discuss machine learning and analytics with extremely imbalanced data and investigate the applicability of these methodologies in the design of recommender systems that support information systems.
Cloud Service Management for Manufacturing
Bio
Lin Zhang is a professor at Beihang University, China. He received the B.S. degree in 1986 from Nankai University, China, and the M.S. and Ph.D. degrees in 1989 and 1992, respectively, from Tsinghua University, China. His research interests include service-oriented modeling and simulation, cloud manufacturing, model engineering for simulation, and agent-based control and simulation. He is currently the Past President of the Society for Modeling & Simulation International (SCS), a Fellow of SCS and ASIASIM, the executive vice president of the China Simulation Federation (CSF), a chief scientist of the 863 key projects, and an associate Editor-in-Chief or associate editor of 6 peer-reviewed international journals. He has authored or co-authored more than 200 papers and 10 books and chapters. He received the National Award for Excellent Science and Technology Books in 1999, the Outstanding Individual Award of the China High-Tech R&D Program (863) in 2001, and the National Excellent Scientific and Technological Workers Award in 2014.
Abstract
Service management is one of the most important tasks on a cloud. For a cloud designed for manufacturing, service management becomes very difficult because of several specific features of manufacturing services: their wide variety, heterogeneity, uncertainty, and multiple constraints, and the fact that many services consist of physical equipment. This lecture will summarize the state of the art of cloud service management for manufacturing and then discuss key technologies, including service creation, service search and matching, service composition and scheduling, service credibility evaluation, and service transaction. Some application examples will be presented.