Threshold for sample filtering defined using loss values for last “m” batches

This is the third article (see part 1 and part 2) in the series on importance sampling with the tensorflow-keras framework. This implementation calculates the loss threshold from the history of the latest “m” batches. In addition, it skips over simple batches (those whose representative loss value is below the threshold). This provides a couple of advantages:

  1. Makes the threshold more representative of the latest model performance than ema thresholding (part 2), where the threshold relied on the entire history and could be slower to react to changes.
  2. Saves on extra training steps on…
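As a rough illustration of the idea (not the article's tensorflow-keras code), the sliding-window threshold can be sketched in plain NumPy. The class name `WindowedLossThreshold`, the use of a percentile over the window, and the use of the batch mean as the "representative" loss are all assumptions made for this sketch.

```python
from collections import deque

import numpy as np


class WindowedLossThreshold:
    """Loss threshold computed from the per-sample losses of the last m batches."""

    def __init__(self, m=5, percentile=67.0):
        # deque(maxlen=m) drops the oldest batch automatically.
        self.history = deque(maxlen=m)
        self.percentile = percentile  # assumption: threshold is a percentile

    def update(self, batch_losses):
        """Store the per-sample losses of the batch that was just evaluated."""
        self.history.append(np.asarray(batch_losses, dtype=float))

    def value(self):
        """Current threshold; 0.0 while the window is still empty."""
        if not self.history:
            return 0.0
        window = np.concatenate(self.history)
        return float(np.percentile(window, self.percentile))

    def should_skip(self, batch_losses):
        """True if the batch looks 'simple' relative to the recent window."""
        # Assumption: the representative loss of a batch is its mean.
        return float(np.mean(batch_losses)) < self.value()


thr = WindowedLossThreshold(m=2, percentile=50)
thr.update([1, 2, 3])
thr.update([3, 4, 5])
thr.update([10, 10, 10])  # the first batch has now left the window
print(thr.value(), thr.should_skip([1, 1, 1]), thr.should_skip([20, 20]))
```

Because only the last “m” batches are stored, an old, easy phase of training stops influencing the threshold as soon as it leaves the window.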

Working with global statistics instead of per-batch statistics

In my earlier article (part 1), I discussed an implementation of importance sampling based on per-batch statistics. There, a sample whose loss value fell in the top nth percentile of its batch was selected for training.

The shortcoming of this approach is that most batches may contain only simple samples. Even after such a batch is filtered, the selected samples are still simple enough for the model. Therefore, a filtering scheme contingent on individual batch statistics cannot fully exploit the benefits of importance sampling.

With that thought, I wanted to employ whole dataset dependent statistics to…

Why Mahalanobis Distance?

  1. Mahalanobis distance is an effective measure of the distance between a point and a dataset distribution. It is a powerful tool for outlier detection because it uses the covariance matrix when calculating the distance between the dataset center and a given observation (figure below). Please check the references for more details.
Image by Author. “Red” is an outlier identified by its distance “d” from the centre of the distribution.

  2. It is a powerful anomaly detection and localization tool, as demonstrated by Defard et al. in PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization.
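For readers who want the definition in concrete form, the distance of a point x from a dataset with mean μ and covariance Σ is sqrt((x − μ)ᵀ Σ⁻¹ (x − μ)). The following is a minimal NumPy sketch of that formula, not the tensorflow implementation the article goes on to describe; the function name `mahalanobis_distance` and the use of a pseudo-inverse for numerical safety are choices made for this example.

```python
import numpy as np


def mahalanobis_distance(points, data):
    """Distance of each row of `points` from the distribution of `data`."""
    data = np.asarray(data, dtype=float)
    points = np.atleast_2d(np.asarray(points, dtype=float))

    mean = data.mean(axis=0)
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    # Pseudo-inverse so that a singular covariance matrix does not raise.
    cov_inv = np.linalg.pinv(cov)

    diff = points - mean
    # Row-wise quadratic form diff_i^T * cov_inv * diff_i for the whole batch.
    squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(squared, 0.0))


data = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
distances = mahalanobis_distance([[1.0, 1.0], [3.0, 1.0]], data)
print(distances)
```

The `einsum` call evaluates the quadratic form for every query point at once, which avoids a Python loop over 1-D arrays.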

Calculating Mahalanobis distance and reasons for tensorflow implementation

Now, there are various implementations of a Mahalanobis distance calculator, for example here and here.

However, these methods owing to their implementations with 1-D arrays are very…

In this article, I cover the implementation of a class built on top of keras’ ImageDataGenerator for creating a data pipeline for image pairs. A few example tasks with this requirement are:

  1. SuperResolution
  2. Deblurring

These tasks inherently require matching input-output pairs to be passed to the model. One could write a custom iterator function to pick matching pairs, but that means:

  • writing a lot of code from scratch
  • giving up on a lot of internal optimizations of tensorflow and keras
  • sacrificing the ease of usage of class
  • juggling your way around with tf.Tensor and np.array

Hence, in light of the…

In this article, I share my implementation of CBAM as proposed by S. Woo et al. in CBAM: Convolutional Block Attention Module, and my observations on its inclusion.

Environment specifications:

  • keras (v2.4.3)
  • tensorflow (v2.2.0) backend

Below is the code snippet for implementing the cbam layer in your NN.

Code by author. To include into your model please pass the output of a conv layer to cbam_block() above.
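As background for readers unfamiliar with the module, here is a small NumPy sketch of the channel-attention half of CBAM as described in the paper: average-pooled and max-pooled channel descriptors pass through a shared two-layer MLP, are summed, and go through a sigmoid to give per-channel weights. This is not the author's cbam_block(); the function name `channel_attention`, the random weights, and the omission of biases and of the spatial-attention step are simplifications assumed for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(features, w1, w2):
    """Reweight the channels of a feature map.

    features: (H, W, C) feature map
    w1:       (C, C // r) first shared MLP layer (r is the reduction ratio)
    w2:       (C // r, C) second shared MLP layer
    """
    avg_pool = features.mean(axis=(0, 1))  # (C,) average over spatial dims
    max_pool = features.max(axis=(0, 1))   # (C,) maximum over spatial dims

    def shared_mlp(v):
        # The same weights are applied to both pooled descriptors.
        return np.maximum(v @ w1, 0.0) @ w2

    weights = sigmoid(shared_mlp(avg_pool) + shared_mlp(max_pool))  # (C,)
    return features * weights  # broadcast over H and W


rng = np.random.default_rng(0)
feature_map = rng.normal(size=(4, 4, 8))
w1 = rng.normal(size=(8, 2))  # reduction ratio r = 4 (illustrative)
w2 = rng.normal(size=(2, 8))
attended = channel_attention(feature_map, w1, w2)
print(attended.shape)
```

In the full module, a spatial-attention map computed from channel-wise pooled features is applied after this step.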


The figure below outlines the overall model architecture employed for the target task. The cbam layer was included before SpatialDropout2D in the downsample block of the following neural network architecture.

Combined (classifier/autoencoder) model architecture. Dotted lines in the encoder represent skip connections. The classifier model uses separate multi-level classification blocks. Downsample blocks are shown below.

Training step executed only on the most lossy samples

In this article, I share my work on accelerating the training procedure using a very simple scheme for filtering the hardest samples (those with the highest loss).

Tools and dataset specifications:

  • keras (v2.4.3)
  • tensorflow (v2.2.0) backend
  • “cifar-10” dataset from pipeline

TL;DR: Below is an implementation of training on samples with maximum loss.

Code by author. importance_sampler() implements importance sampling for high loss sample

Main steps

  1. Each batch makes a forward pass (lines 109–113).
  2. Calculate per-sample loss values (lines 115–121) and get the loss threshold, the 67th percentile value (lines 80–83).
  3. Filter out the samples with loss…
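The thresholding step above can be sketched in a few lines of NumPy. This is an illustration only, not the author's importance_sampler(); the function name `filter_hardest` and the choice to keep samples whose loss equals the threshold are assumptions.

```python
import numpy as np


def filter_hardest(per_sample_loss, percentile=67.0):
    """Return a boolean mask of the samples to train on, plus the threshold."""
    loss = np.asarray(per_sample_loss, dtype=float)
    threshold = np.percentile(loss, percentile)
    # Assumption: samples exactly at the threshold are kept.
    keep = loss >= threshold
    return keep, threshold


batch_loss = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
keep_mask, loss_threshold = filter_hardest(batch_loss, percentile=67.0)
hard_losses = batch_loss[keep_mask]
print(loss_threshold, hard_losses)
```

The same mask would then be applied to the inputs and labels of the batch before the gradient step, so that only the high-loss samples contribute to the update.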

Remove previously saved models with early training stop feature

We all have come to appreciate the flexibility afforded to us by keras’ various in-built callbacks like TensorBoard, LearningRateScheduler etc. during model training. However, it is not uncommon either for them to occasionally leave us wanting. In one such instance, I looked for functionality in ModelCheckpoint to delete the saved model files.

The motivation was that I was working on a shared machine and running multiple training experiments, each running for hundreds of training epochs. It did not take long for me to fill up GBs and GBs of storage space. …
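The clean-up itself needs nothing beyond the standard library. Below is a minimal sketch of such a helper, assuming checkpoints are saved as files matching a glob pattern in one directory; the function name `prune_checkpoints`, the "*.h5" pattern, and the file names in the demo are hypothetical. It is not the callback from the article, but a function like this could be called from a custom callback at the end of each epoch.

```python
import glob
import os
import tempfile


def prune_checkpoints(directory, pattern="*.h5", keep_last=1):
    """Delete all but the `keep_last` most recently modified matching files."""
    paths = sorted(
        glob.glob(os.path.join(directory, pattern)), key=os.path.getmtime
    )
    stale = paths[:-keep_last] if keep_last > 0 else paths
    for path in stale:
        os.remove(path)
    return stale


# Demo in a throwaway directory with three fake checkpoint files.
with tempfile.TemporaryDirectory() as tmp:
    for i, name in enumerate(["epoch_01.h5", "epoch_02.h5", "epoch_03.h5"]):
        path = os.path.join(tmp, name)
        open(path, "w").close()
        os.utime(path, (1000 + i, 1000 + i))  # force distinct modification times
    removed = prune_checkpoints(tmp, keep_last=1)
    remaining = sorted(os.listdir(tmp))

print(remaining)
```

Sorting by modification time rather than by file name keeps the helper independent of the naming scheme used for the checkpoints.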

In this article I summarize the tensorflow implementation for 1) creating an imbalanced dataset, 2) oversampling of under-represented samples using

Who is this article aimed at? Do you want to

  1. work with class
  2. create an imbalanced dataset
  3. oversample the underrepresented samples in an imbalanced dataset
  4. apply image and batch level data augmentations
  5. create a split (train, validation) from a given dataset
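To make items 2 and 3 concrete, here is a small NumPy sketch that works on label arrays only. It is not the tensorflow code of the article; the function names `make_imbalanced` and `oversample`, the toy labels, and the strategy of repeating minority samples until every class matches the largest one are assumptions made for illustration.

```python
import numpy as np


def make_imbalanced(labels, minority_class, keep_fraction, rng):
    """Indices of a subset that keeps only `keep_fraction` of one class."""
    labels = np.asarray(labels)
    minority = np.flatnonzero(labels == minority_class)
    majority = np.flatnonzero(labels != minority_class)
    n_keep = max(1, int(len(minority) * keep_fraction))
    kept = rng.choice(minority, size=n_keep, replace=False)
    return np.sort(np.concatenate([majority, kept]))


def oversample(labels, rng):
    """Indices that repeat samples until every class matches the largest."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        # Draw the missing samples with replacement from the same class.
        extra = rng.choice(idx, size=target - count, replace=True)
        parts.append(np.concatenate([idx, extra]))
    return np.concatenate(parts)


rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 100)                # toy balanced labels
subset = make_imbalanced(labels, minority_class=2, keep_fraction=0.1, rng=rng)
imbalanced = labels[subset]
balanced = imbalanced[oversample(imbalanced, rng)]
print(np.bincount(imbalanced), np.bincount(balanced))
```

The returned index arrays can be used to select the matching images as well, so the labels and images stay aligned.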

Tool and dataset specifications:

  • keras (v2.4.3)
  • tensorflow (v2.2.0) backend
  • “cifar-10” dataset from pipeline

Below is the bare minimum code snippet that will fulfill these requirements. …

TL;DR: The lpips metric might not translate well into a loss function for training an image transformation/processing model. This is despite the fact that it might serve as a good quantitative evaluator that correlates well with human perception of image quality.

Following is the code snippet I utilized for implementing the lpips based loss function for a de-blurring model.

Except for the convolutional layer after SpatialDropout, all the weights were frozen. This refers to the lin configuration as defined by Zhang, et al. in The Unreasonable Effectiveness of Deep Features as a Perceptual Metric.

Here, I share the key insights through…

Keras has three methods for defining neural network architectures, namely the Sequential API, the Functional API, and model subclassing. More about this can be read here. This article introduces a method to import subclassed model weights into a Functional API model.

I have previously written about importing pre-trained tensorflow-1 model weights to keras. Back then, google-research had not yet provided a tensorflow-2 implementation of SimCLR. However, the tf2 version employs the model subclassing method for training and saving the weights. This implies that the model loaded from the saved model files will not be a keras object, as discussed here,

The object returned…

Anuj Arora

Budding AI Researcher
