TVM: End-to-End Optimization Stack for Deep Learning

(Image Source: http://tvmlang.org/)

Abstract

  • Scalable frameworks, such as TensorFlow, MXNet, Caffe, and PyTorch, are optimized for a narrow range of server-class GPUs.
  • Deploying workloads to other platforms such as mobile phones, IoT devices, and specialized accelerators (FPGAs, ASICs) requires laborious manual effort.
  • TVM is an end-to-end optimization stack that exposes:

    • graph-level optimizations
    • operator-level optimizations

    ---> to provide performance portability to deep learning workloads across diverse hardware back-ends.

Kube In Action - 01: Introduction to Kubernetes

I've recently picked up an interest in containerization, so I started reading up on and playing with Kubernetes.

I have been taking notes as I go through Kubernetes in Action by Marko Lukša, and I wanted to share them with those who might have similar interests in containerization and distributed systems in general. This is the 1st installment of a series called Kube in Action. Every week or so, I’ll be summarizing and exploring Kubernetes fundamentals and concepts with hands-on examples as I learn more about Kubernetes.

Play interactively with C++ - Getting Started with Xeus-Cling

xeus-cling

This is the 1st installment of a new series called Play interactively with C++. Every week or so, I’ll be summarizing and exploring standard C++ programming in Jupyter notebooks using xeus-cling.

The source code (in notebook format) for this series can be found here.

xeus-cling is a Jupyter kernel for C++ based on the C++ interpreter cling and the native implementation of the Jupyter protocol xeus.
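To give a sense of what the series will look like, here is a minimal sketch of a notebook cell you might run with xeus-cling (assuming the C++ kernel has been installed, for example via conda, and selected for the notebook); the exact snippet is just illustrative:

```cpp
// A typical xeus-cling cell: no main() is needed, because cling
// interprets statements incrementally as the cell is executed.
#include <iostream>
#include <vector>

std::vector<int> squares;
for (int i = 1; i <= 5; ++i)
    squares.push_back(i * i);

for (int s : squares)
    std::cout << s << ' ';
std::cout << std::endl;   // prints: 1 4 9 16 25
```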

Why Parallel Computing?

For many years we’ve enjoyed the fruits of ever-faster processors. However, because of physical limitations, the rate of performance improvement in conventional processors is decreasing. In order to keep increasing the power of processors, chipmakers have turned to multicore integrated circuits, that is, integrated circuits with multiple conventional processors on a single chip. These extra cores do not speed up serial programs by themselves; to benefit from them, software has to be written to run in parallel.

(Source: Exponential growth of supercomputing power as recorded by the TOP500 list)
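As a rough sketch of what exploiting those cores looks like in practice, the example below splits a simple sum across hardware threads using std::thread (a standard C++11 facility; the data and the chunking scheme here are just illustrative assumptions, not from the book):

```cpp
// Illustrative only: divide a large sum across the available hardware threads.
#include <algorithm>
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    std::vector<long long> data(10'000'000, 1);   // 10 million ones
    unsigned n_threads = std::max(1u, std::thread::hardware_concurrency());

    std::vector<long long> partial(n_threads, 0); // one partial sum per thread
    std::vector<std::thread> workers;
    std::size_t chunk = data.size() / n_threads;

    for (unsigned t = 0; t < n_threads; ++t) {
        std::size_t begin = t * chunk;
        std::size_t end = (t == n_threads - 1) ? data.size() : begin + chunk;
        workers.emplace_back([&, t, begin, end] {
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& w : workers) w.join();             // wait for every thread to finish

    long long total = std::accumulate(partial.begin(), partial.end(), 0LL);
    std::cout << "sum = " << total << '\n';       // expected: 10000000
}
```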