Tutorial 1

Time: 9:00 AM to 3:30 PM
Name: Debdoot Sheet, Swanand Khare, Abir Das

Title: Mathematics for deep learning


Deep learning has, since the start of the 21st century, been witnessing a surge of interest similar to that witnessed by quantum mechanics at the start of the 20th century. Although it was still in its early days in the last decade, experimentalists have proven its might, and we now see a rise in the practical deployment of deep learning based solutions across vision, language, speech, signal processing, etc., including chatbots with ambient assisted intelligence such as Alexa, Siri, and Cortana. Although the advent of specialised toolboxes and the availability of numerous open source models have enabled rapid deployment of commercial solutions, the community lacks an in-depth understanding of the underlying mathematical foundations that make such technologies so powerful. As a result, the mathematical reasons for its success remain elusive and ill-understood. This tutorial will start with a mathematical understanding of the objectives and basis of machine learning, then traverse the linear algebra and data structures associated with representing different deep neural networks, bounds on their performance, the statistics and information theory required for understanding learning principles and optimisation, and the principles behind generative models and adversarial learning, and will conclude with the mathematical understanding of explainability for reasoning with a deep neural network. Throughout the tutorial, we will associate the mathematics of each step with standard function calls available in standard deep learning toolboxes, in order to familiarise the audience with the internal implementation of the libraries and help them appreciate the inherent strengths and weaknesses of this technology.


Debdoot Sheet is an assistant professor at the Department of Electrical Engineering and the Centre of Excellence in Artificial Intelligence at the Indian Institute of Technology Kharagpur, and founder of SkinCurate Research, a medical imaging AI company based in Kharagpur, India. He received the BTech degree in electronics and communication engineering in 2008 from the West Bengal University of Technology, Kolkata, and the MS and PhD degrees from the Indian Institute of Technology Kharagpur in 2010 and 2014 respectively. His current research interests include computational medical imaging, machine learning, deep learning, high-density tensor computation, and fairness-accountability-trust-explainability (FATE) of artificial intelligence (AI). He is a DAAD alumnus and was a visiting scholar at the Technical University of Munich during 2011-12. He is also a recipient of the IEEE Computer Society Richard E. Merwin Student Scholarship in 2012, the Fraunhofer Applications Award at the Indo-German Grand Science Slam in 2012, the GE Edison Challenge 2013, and the IEM Kolkata Distinguished Young Alumnus Award 2016. He is a senior member of IEEE, a member of SPIE and ACM, a life member of BMESI and IUPRAI, and has served as an Editor of IEEE Pulse since 2014.

Swanand Khare is an assistant professor at the Department of Mathematics and the Centre of Excellence in Artificial Intelligence at IIT Kharagpur. He obtained the M.Sc. and Ph.D. degrees from IIT Bombay in 2005 and 2011 respectively. He was a post-doctoral researcher at the University of Alberta, Canada from 2011 to 2014. His research interests include inverse eigenvalue problems, computational linear algebra, estimation, and computational issues in applied statistics. He is a recipient of the Excellent Young Teacher Award 2018 at IIT Kharagpur.

Abir Das is an assistant professor in the Computer Science and Engineering Department at IIT Kharagpur, India, where he leads the Computer Vision and Intelligence Research (CVIR) group. He received the BE degree in electrical engineering from Jadavpur University, India, in 2007, and the MS and PhD degrees in electrical engineering from the University of California, Riverside, in 2013 and 2015, respectively. He was a postdoctoral researcher in the Computer Science department at Boston University. His main research interests include computer vision, activity detection, person re-identification, explainable AI, and bias in machine learning.

Tutorial 2

Time: 9:00 AM to 3:30 PM
Name: Anurag Jain, Narasimha Pai, Pankaj Kumar Bajpai

Title: On Device Depth Estimation for Visual Scene Understanding


Over the past few years, smartphone camera technology has advanced rapidly, enabling consumers to capture high quality photos, which was previously only possible using high-end cameras. In addition, recent advances in computational capabilities for running deep learning networks have opened up new avenues for multi-modal imaging in general and computational photography in particular. This has enabled features like High Dynamic Range Imaging, Low Light enhancement and DSLR-like Bokeh effect on the smartphone. There is a continuous endeavor to bring the camera's visual understanding closer to human perception of the real world. In this regard, accurate estimation of scene depth is a critical technology.
In this tutorial, we intend to present a brief overview of different configurations of image sensors and their usage for estimating scene depth. Subsequently, we will look at the processing pipelines and challenges associated with each. We will also look at some hybrid methods for fusing information from multiple modalities. Ultimately, our goal is to present some of the exciting possibilities that depth estimation opens up in the mobile imaging & application space.
We will then present how depth technology enables a variety of applications, viz. Computational Photography, Computer Vision, Augmented Reality, 3D modelling & Medical Imaging for handheld devices.


Narasimha Pai works as an Architect in Imaging R&D at Samsung R&D Institute Bangalore, India & is currently leading the Camera & Imaging Solutions. He has been involved with the development of Computational Photography solutions, viz. High Dynamic Range Imaging (HDR), Low Light Enhancement & Depth based Bokeh solutions. He has an M.Tech from I.I.T. Kanpur & more than 24 years of experience in Multimedia, Imaging & Communications. Prior to Samsung, he worked with Ittiam Systems & Motorola India Electronics Limited on the development of Telephony & Voice band modem algorithms. His interests lie in Computational Photography, Imaging pipelines & new Sensors for cameras. He has 8 patent applications & 4 granted at the USPTO.

Anurag Jain works as a Director in Imaging R&D at Samsung R&D Institute Bangalore, India. He is currently leading the architecture, design and development efforts for a new Camera Systems Platform. He has an MSc (Engg) from the Indian Institute of Science and 22 years of experience working in the field of multimedia, imaging and video signal processing. Prior to joining Samsung, he worked for Texas Instruments as Senior Architect. His areas of interest lie in the fields of computer vision, image & video compression, heterogeneous computing & digital signal processing. He has 11 patent applications, with 5 granted by the USPTO.

Pankaj Kumar Bajpai has worked with Samsung R&D Institute Bangalore, India since 2007. In his current role as Associate Architect, he is leading the Depth Technology development efforts, from state-of-the-art algorithms to the best embedded solution. He completed his MTech in Signal Processing from IIT Kanpur in 2004. He has 15+ years of experience in the area of Image/Video processing algorithms and Computer Vision. He has 6 patents (4 international grants) and 3 international publications in the area of Image Stitching & AI Stereo Disparity. Prior to joining Samsung, he worked with Digibee Microsystem on developing Image and Video Codec algorithms.

Tutorial 3

Time: 9:00 AM to 11:30 AM
Name: Biplab Banerjee, Saurabh Kumar

Title: Deep learning for Satellite Imaging


Satellites are currently the primary way we observe the dynamics of our planet. With the private sector actively taking part in recent years, we have vast amounts of data coming in. Some vendors these days offer imagery of the whole planet every day! While getting one's hands on such data can be pricey, various public institutes provide data with a 3-4 day frequency for free. This vast image data is an exciting source for image processing and pattern recognition research for visual computing. We will briefly see how it all started and the current state of the datasets and techniques. In this workshop, we will see how the power of machine learning is being used in the field of remote sensing for various tasks on these datasets. We will also have tutorials on how to get started with acquiring such datasets and using them as a part of one’s research, processing them, and inferring various large-scale physical parameters from them. We will mainly see how ML and DL are being used for a breadth of tasks like remote sensing image classification, hyperspectral imaging, SAR imaging, and spectral unmixing. We will also see how to access these datasets and offer some pointers on building an ML pipeline around them to extract relevant information. We would also like to include 30-45 minutes for code demonstrations and hands-on work on a sample problem.


Dr. Biplab Banerjee is currently working as an assistant professor at CSRE, IIT Bombay. He obtained his PhD from IIT Bombay in 2015, for which he was awarded the "Excellence in Thesis" award. Subsequently, he worked as a post-doctoral researcher at the University of Caen in France and the Istituto Italiano di Tecnologia, Genova in Italy. Before joining IITB in June 2018, he served as an assistant professor at the Dept. of CSE, IIT Roorkee. He works in domains including computer vision, machine learning, deep learning, and remote sensing. Specifically, he is interested in the problems of domain adaptation, zero- to few-shot learning, multi-modal fusion, representation learning, etc.

Saurabh Kumar is a Ph.D. student at the Vision and Image Processing research group in the Department of Electrical Engineering, Indian Institute of Technology Bombay. His research interests are multimodal computational imaging, vision, and scientific machine learning.

Tutorial 4

Time: 12:00 PM to 3:30 PM
Name: Sherin Thomas, Surgan Jandial

Title: Tutorial on PyTorch by Facebook AI


This tutorial will make the audience familiar with PyTorch, one of the fastest-growing ML frameworks. We will start with an introduction covering its history, key features and the recent growth in the number of users. We will then walk through its well-known automatic differentiation engine and how the basic tensor operations are handled. With this basic idea of how it works, we will move on to a neural network example covering concepts like Dataloaders, Optimizers, Loss Functions and the other building blocks of deep learning. Following this, we will have a hands-on session to make the audience more familiar with the concepts introduced earlier. To conclude, we will talk about the recent developments in the framework and how the PyTorch ecosystem as a whole is growing so fast.


Sherin Thomas is a deep learning consultant, author, and international speaker currently working as Senior Developer at [Tensor]werk, a company developing deep learning infrastructure. At Tensorwerk, Sherin helps to build Hangar, a version control system for data, and RedisAI, a high-performance deep learning runtime. Sherin is also active in student communities and currently organizes developers.wtf and deep learning Bangalore. He has created a full-stack deep learning course for beginners and for people from non-computer science backgrounds, which is available at fullstackengineering.ai

Surgan Jandial is a PyTorch contributor and currently a student at IIT Hyderabad. He is among the top contributors to torchvision around the globe. At IIT Hyderabad, he is actively involved in computer vision research, where he uses PyTorch for day-to-day tasks. Apart from this, Surgan has been conducting various workshops and tutorial sessions aimed at teaching people about the workings of the framework and insights into it.