Tutorials, Invited Talks

ICDATA usually hosts several tutorials and invited talks on data science topics such as Conformal Prediction and Big Data Analytics.

As soon as tutorials/talks are negotiated with speakers and confirmed, they will be published here.

All workshops, tutorials etc. at CSCE are free for all CSCE attendees.

So far, the following talks/tutorials are approved:

Tutorial 1

Speaker: Andrew Johnston
Mandiant (Consultant)
Topic/Title: Tutorial on "What's Yours is Mine: How Modern Attackers are Stealing Your Data"
Date & Time: t.b.a.
Description: This talk explores the state of modern cyberattack techniques against well-secured assets. Using examples gained from real-world compromises, modern attack techniques and tactics are explored with an emphasis on how attackers evade technical defenses and law enforcement. We will also explore how advanced attack groups such as nation states evade "next-generation" defenses that utilize machine learning and anomaly detection. Although attackers are getting more sophisticated, recommendations for securing large organizations and personal systems will be presented and discussed.
Short Bio: Andrew Johnston is a proactive consultant with Mandiant, a division of FireEye. He utilizes real-world attacker tools and techniques to identify weaknesses in enterprise security before attackers can find them. Prior to joining Mandiant, Andrew worked with the FBI in the Cyber and Counterterrorism divisions. Andrew has a Bachelor's degree from Fordham University in Computer Science and Applied Mathematics and is pursuing a Master's degree from Fordham University in Cybersecurity.

Tutorial 2

Speaker: Ulf Johansson
Department of Computer Science and Informatics, Jönköping University, Sweden, ulf.johansson@ju.se
Topic/Title: Tutorial on "Predicting with confidence – Conformal Prediction and Venn Predictors"
Date & Time: Duration approx. 2 hours
Description: How good is your prediction? In risk-sensitive applications, it is crucial to be able to assess the quality of a prediction, but traditional classification and regression models don't provide their users with any information regarding the trustworthiness of a prediction.
Conformal predictors, on the other hand, are predictive models that associate each of their predictions with a precise measure of confidence. Given a user-defined significance level E, a conformal predictor outputs, for each test instance, a prediction region (for classification a label set, and for regression a real-valued interval) that, under relatively weak assumptions, contains the true target value with probability 1-E. In other words, given a significance level E, the error rate of a conformal predictor will be exactly E, in the long run. Since all conformal predictors have this remarkable property, called validity, the main goal becomes minimizing the prediction regions, thus maximizing the informativeness.
The conformal prediction framework allows any traditional classification or regression model to be transformed into a confidence predictor with very little extra work, both in terms of implementation and computational complexity.
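As a rough illustration of how little extra work this takes, the following Python sketch wraps class probabilities from an arbitrary model in a split (inductive) conformal classifier. It is not taken from the tutorial material: the function name `conformal_label_sets`, the nonconformity score (one minus the probability of the candidate label), and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def conformal_label_sets(cal_probs, cal_labels, test_probs, significance):
    """Split (inductive) conformal classification on top of any probabilistic model.

    cal_probs:  (n_cal, n_classes) class probabilities for held-out calibration data
    cal_labels: (n_cal,) true integer labels of the calibration data
    test_probs: (n_test, n_classes) class probabilities for the test data
    Returns a boolean (n_test, n_classes) array; True marks labels in the prediction set.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    test_probs = np.asarray(test_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    n_cal = len(cal_labels)

    # Nonconformity score (an assumed, common choice): 1 - probability of the label.
    cal_scores = np.sort(1.0 - cal_probs[np.arange(n_cal), cal_labels])
    test_scores = 1.0 - test_probs  # one score per candidate label

    # For each candidate label, count calibration scores that are at least as large.
    n_greater_equal = n_cal - np.searchsorted(cal_scores, test_scores, side="left")

    # p-value of each candidate label; the +1 accounts for the test example itself.
    p_values = (n_greater_equal + 1) / (n_cal + 1)

    # Keep every label that cannot be rejected at the chosen significance level.
    return p_values > significance


# Synthetic demo. Assumption: labels are drawn from the stand-in "model"
# probabilities, so no real classifier or data set is involved.
rng = np.random.default_rng(0)
n_cal, n_test, n_classes = 1000, 2000, 3
probs = rng.dirichlet(np.ones(n_classes), size=n_cal + n_test)
labels = np.array([rng.choice(n_classes, p=p) for p in probs])

label_sets = conformal_label_sets(
    probs[:n_cal], labels[:n_cal], probs[n_cal:], significance=0.1
)
coverage = label_sets[np.arange(n_test), labels[n_cal:]].mean()
print(f"empirical coverage: {coverage:.3f}")
print(f"mean label-set size: {label_sets.sum(axis=1).mean():.2f}")
```

With a significance level of 0.1, the fraction of test examples whose label set contains the true label should be close to 0.9. This non-smoothed variant is conservative, so its long-run error rate is at most the significance level; the mean label-set size indicates how informative the underlying model is.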
For classification, the definition of validity in conformal prediction is often perceived as somewhat counter-intuitive, since the guarantee only applies a priori, i.e., once we have seen a specific prediction, the probability for that prediction to be wrong is no longer E. With this in mind, we recommend Venn predictors as a very strong alternative to conformal prediction for classification.
Venn predictors are multi-probabilistic predictors with proven validity properties. The standard impossibility result for probabilistic prediction is circumvented in two ways: (i) multiple probabilities for each label are outputted, with one of them being the valid one and (ii) the statistical tests for validity are restricted to calibration.
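The multi-probabilistic output can be sketched in a few lines of Python. This is a minimal inductive Venn predictor written for this page, not the tutorial's own code: the function name `venn_probability_intervals` is hypothetical, and the taxonomy (grouping examples by the label the underlying model predicts) is one simple assumed choice among many.

```python
import numpy as np


def venn_probability_intervals(cal_pred, cal_true, test_pred, n_classes):
    """Inductive Venn predictor with an assumed taxonomy: category = predicted label.

    cal_pred:  (n_cal,) labels predicted by the underlying model on calibration data
    cal_true:  (n_cal,) true labels of the calibration data
    test_pred: (n_test,) labels predicted by the underlying model on test data
    Returns (lower, upper), each (n_test, n_classes): the smallest and largest
    probability assigned to each label across all tentative test labels.
    """
    cal_pred = np.asarray(cal_pred)
    cal_true = np.asarray(cal_true)
    test_pred = np.asarray(test_pred)

    # counts[c, k]: calibration examples in category c whose true label is k.
    counts = np.zeros((n_classes, n_classes), dtype=float)
    np.add.at(counts, (cal_pred, cal_true), 1)

    # Label counts of the category each test example falls into.
    category_counts = counts[test_pred]
    # Category size once the test example (with a tentative label) is added.
    totals = category_counts.sum(axis=1, keepdims=True) + 1

    # Tentative label different from k gives the lower frequency of k,
    # tentative label equal to k gives the upper frequency of k.
    lower = category_counts / totals
    upper = (category_counts + 1) / totals
    return lower, upper


# Tiny hand-made example (made-up labels, for illustration only).
lower, upper = venn_probability_intervals(
    cal_pred=[0, 0, 0, 1, 1],
    cal_true=[0, 0, 1, 1, 1],
    test_pred=[0, 1],
    n_classes=2,
)
for lo_row, hi_row in zip(lower, upper):
    print([f"[{lo:.2f}, {hi:.2f}]" for lo, hi in zip(lo_row, hi_row)])
```

Each test example receives a probability interval per label instead of a single number. The intervals narrow as the categories fill with calibration examples, and their width is a direct signal of how much evidence backs the estimate.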
Hence, conformal prediction and Venn predictors are important tools that every data scientist should carry in their toolboxes, since they represent a straightforward way of associating the predictions of any predictive machine learning algorithm with confidence measures.
This tutorial aims to provide an introduction and an example-oriented presentation of the conformal prediction and Venn prediction frameworks, directed at machine learning researchers and professionals. The goal of the tutorial is to provide attendees with the knowledge necessary for implementing confidence predictors, and to highlight current research on the subject. The tutorial will contain examples of using confidence predictors in Python and KNIME.

The intended audience is machine learning researchers and professionals at intermediate to expert level. The participants are expected to have a good understanding of machine learning and data mining.
Short Bio: Prof. Ulf Johansson holds an M.Sc. in Computer Engineering and Computer Science from Chalmers University of Technology, and a PhD degree in Computer Science from the Institute of Technology, Linköping University, Sweden. Since 2016, he has been a full professor in computer science at the School of Engineering, Jönköping University.
Ulf Johansson’s main area of expertise is machine learning algorithms for data analytics. Most of the research is applied, and often co-produced with industry. Application areas include drug discovery, health science, marketing, high-frequency trading, game AI, sports analytics, sales forecasting and gambling. Prof. Johansson has published extensively in the fields of artificial intelligence, machine learning, soft computing and data mining. He is also a regular program committee member of the leading conferences in computational intelligence and machine learning. During the last few years, he has published several papers on conformal prediction and Venn predictors, some presented in top-tier venues like the Machine Learning journal and the ICDM conference.

Invited Talk 1

Speaker: Dr. Peter Geczy
National Institute of Advanced Industrial Science and Technology (AIST), Japan
Topic/Title: Data Science: An Interdisciplinary Perspective
Date & Time: t.b.a.
Duration: approx. 1 hour
Description: We are in the midst of a digital data explosion that is notably influencing nearly every part of contemporary society. Vast quantities of data are generated, transmitted and harvested daily across the globe. Data-aware commercial and governmental organizations have been collecting large amounts of data for various purposes. The primary drivers of data collection are value and knowledge extraction. Commercial value extraction from data has been a main target for businesses. Actionable knowledge extraction has been a focus of a broader range of organizations. Both tasks are challenging and present numerous difficulties as well as opportunities. Data Science is an emerging field attempting to address such challenges in an interdisciplinary manner. We will shed light on the pertinent interdisciplinary aspects of data science, spanning the interests of a spectrum of organizations.
Short Bio: Dr. Peter Geczy holds a senior position at the National Institute of Advanced Industrial Science and Technology (AIST). His recent research interests are in information technology intelligence. This multidisciplinary research encompasses development and exploration of future and cutting-edge information technologies. It also examines their impacts on societies, organizations and individuals. Such interdisciplinary scientific interests have led him across domains of technology management and innovation, data science, service science, knowledge management, business intelligence, computational intelligence, and social intelligence. Dr. Geczy received several awards in recognition of his accomplishments. He has been serving on various professional boards and committees, and has been a distinguished speaker in academia and industry. He is a senior member of IEEE and has been an active member of INFORMS and INNS.