TL;DR: The LPIPS metric may not translate well into a loss function for training an image transformation/processing model, even though it can serve as a good quantitative evaluator that correlates well with human perception of image quality.
Below is the code snippet I used to implement the LPIPS-based loss function for a de-blurring model.
All weights were frozen except for the convolutional layer after SpatialDropout; this corresponds to the lin configuration defined by Zhang et al. in The Unreasonable Effectiveness of Deep Features as a Perceptual Metric.
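A minimal sketch of such a loss in Keras, assuming a VGG16 feature extractor with illustrative layer choices (the backbone, layer names, and dropout rate below are assumptions, not necessarily the exact configuration I used), looks roughly like this:

import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

def build_lpips_like_loss(image_shape=(256, 256, 3)):
    """Rough LPIPS-style ('lin') loss: unit-normalized deep features,
    SpatialDropout followed by a trainable 1x1 conv per layer, then averaging."""
    backbone = VGG16(include_top=False, weights="imagenet", input_shape=image_shape)
    backbone.trainable = False  # freeze the feature extractor
    feat_layers = ["block1_conv2", "block2_conv2", "block3_conv3",
                   "block4_conv3", "block5_conv3"]
    extractor = Model(backbone.input,
                      [backbone.get_layer(n).output for n in feat_layers])

    # One trainable 1x1 conv ("lin" layer) per feature map; everything else is frozen.
    lin_convs = [layers.Conv2D(1, 1, use_bias=False) for _ in feat_layers]
    dropout = layers.SpatialDropout2D(0.5)

    def loss_fn(y_true, y_pred):
        feats_true = extractor(y_true)
        feats_pred = extractor(y_pred)
        total = 0.0
        for f_t, f_p, lin in zip(feats_true, feats_pred, lin_convs):
            # Unit-normalize along the channel dimension, as in Zhang et al.
            f_t = tf.math.l2_normalize(f_t, axis=-1)
            f_p = tf.math.l2_normalize(f_p, axis=-1)
            diff = (f_t - f_p) ** 2
            total += tf.reduce_mean(lin(dropout(diff)))
        return total

    return loss_fn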
Here, I share the key insights through…
Keras offers three ways of defining neural network architectures: the Sequential API, the Functional API, and model subclassing. More about this can be read here. This article introduces a method for importing subclassed-model weights into a Functional API model.
Previously, I wrote about importing pre-trained TensorFlow 1 model weights into Keras. Back then, google-research had not yet provided a TensorFlow 2 implementation of SimCLR. The TF2 version, however, uses the model subclassing method for training and saving the weights. This means that the model loaded from the saved model files will not be a Keras object, as discussed here.
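A rough sketch of the weight transfer, assuming a Functional API re-implementation of the same architecture already exists and that the loaded object exposes its variables (the path and the name-matching logic below are illustrative assumptions that need to be verified against the actual checkpoint):

import tensorflow as tf

# Load the subclassed SimCLR SavedModel; the result is not a Keras Model object.
loaded = tf.saved_model.load("path/to/simclr_saved_model")  # path is illustrative

def copy_weights(functional_model, loaded_obj):
    """Copy variables from the loaded object into a Functional API model,
    matching by variable name and shape (assumes the names line up)."""
    source = {v.name: v.numpy() for v in loaded_obj.variables}
    for w in functional_model.weights:
        if w.name in source and source[w.name].shape == w.shape:
            w.assign(source[w.name])
        else:
            print(f"Skipping {w.name}: no matching variable found")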
For the past few weeks, I have been experimenting with various generator-discriminator networks (GANs), and amid numerous trials and struggles, I have managed to come up with a few helpful strategies of my own. In this article, I share one such finding.
This one pertains to discriminator definitions, more specifically, discriminator definitions for image transformation (super-resolution, deblurring) generator models. Various discriminator formulations are out there, each with its own strengths and weaknesses, aptly covered by A. Jolicoeur-Martineau in The relativistic discriminator: a key element missing from standard GAN.
In the paper, Alexia advocates for using a relativistic discriminator with…
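As a rough illustration of the relativistic average formulation (a sketch based on the paper's idea, not its reference implementation; the function and tensor names are my own):

import tensorflow as tf

def relativistic_average_d_loss(real_logits, fake_logits):
    """Discriminator loss: real samples should look 'more real' than the
    average fake sample, and vice versa."""
    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    real_vs_avg_fake = real_logits - tf.reduce_mean(fake_logits)
    fake_vs_avg_real = fake_logits - tf.reduce_mean(real_logits)
    return (bce(tf.ones_like(real_logits), real_vs_avg_fake) +
            bce(tf.zeros_like(fake_logits), fake_vs_avg_real)) / 2.0

def relativistic_average_g_loss(real_logits, fake_logits):
    """Generator loss: the roles are swapped, so fakes should look
    'more real' than the average real sample."""
    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    real_vs_avg_fake = real_logits - tf.reduce_mean(fake_logits)
    fake_vs_avg_real = fake_logits - tf.reduce_mean(real_logits)
    return (bce(tf.zeros_like(real_logits), real_vs_avg_fake) +
            bce(tf.ones_like(fake_logits), fake_vs_avg_real)) / 2.0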
While using PyTorch's (v1.4.0) DataLoader with multiple workers (num_workers > 0), I encountered the following error:
Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.
Thus began my couple-of-hours-long struggle to increase the shared memory size. If one is running a Docker container with the docker run command, this issue can be handled by adding the following command-line argument:
--shm-size=desired_memory_size
However, when running the job on a Kubernetes cluster, one needs to include the relevant setting in the corresponding *.yaml file. …
Recently, while wrapping my Deep Neural Network (DNN) model in a web API, I faced multiple issues:
AttributeError: 'Image' object has no attribute 'read'
AttributeError: 'numpy.ndarray' object has no attribute 'read'
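Both errors typically arise when a PIL Image or a NumPy array is handed to code that expects a file-like object, i.e. something with a .read() method. A minimal sketch of one way to normalize the input (the helper name is illustrative):

import io
import numpy as np
from PIL import Image

def to_pil_image(payload):
    """Accept raw bytes, a PIL Image, or a NumPy array and return a PIL Image.
    Image.open() needs a path or a file-like object, hence the BytesIO wrap."""
    if isinstance(payload, Image.Image):
        return payload                      # already decoded
    if isinstance(payload, np.ndarray):
        return Image.fromarray(payload)     # avoid passing an ndarray to Image.open
    return Image.open(io.BytesIO(payload))  # raw bytes from the request body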
I could have switched from using absl-py to some other module (for example, argparse) to set the default values for my command line arguments (unsure if this would have worked, since I did not test) but that would…
In this article, I present three different methods for training a discriminator-generator (GAN) model using Keras (v2.4.3) on a TensorFlow (v2.2.0) backend. The methods vary in implementation complexity, speed of execution, and flexibility, and I share my observations on each of these aspects.
Method 1:
Carry out batch-wise updates of the discriminator and the generator alternately, inside nested for loops over epochs and training steps. Most references found through an internet search seem to use this method. The example code is designed for a data transformation model; make the necessary tweaks depending on the kind of model you require.
# Build…
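A minimal sketch of the alternating loop, assuming a compiled discriminator, a generator, a combined model (generator followed by a frozen discriminator, with a reconstruction loss and an adversarial loss), and x_train/y_train arrays already exist:

import numpy as np

batch_size, epochs = 16, 50
steps_per_epoch = len(x_train) // batch_size
real_labels = np.ones((batch_size, 1))
fake_labels = np.zeros((batch_size, 1))

for epoch in range(epochs):
    for step in range(steps_per_epoch):
        idx = np.random.randint(0, len(x_train), batch_size)
        x_batch, y_batch = x_train[idx], y_train[idx]

        # 1) Update the discriminator on real targets and generated images.
        generated = generator.predict_on_batch(x_batch)
        d_loss_real = discriminator.train_on_batch(y_batch, real_labels)
        d_loss_fake = discriminator.train_on_batch(generated, fake_labels)

        # 2) Update the generator through the combined model, asking the
        #    (frozen) discriminator to label its outputs as real.
        g_loss = combined.train_on_batch(x_batch, [y_batch, real_labels])
    print(f"epoch {epoch}: d_real={d_loss_real}, d_fake={d_loss_fake}, g={g_loss}")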
With GANs proving increasingly effective at helping networks construct more realistic images in tasks like super-resolution, deblurring, style transfer, etc., it becomes imperative to employ them in any task that requires image generation. With that thought, I decided to include them in my project as well.
After experimenting with different approaches (link) that were lacking in speed, I decided to go with TensorFlow's suggested way, Wrapping up: an end-to-end GAN example. At first glance it looked simple enough, but it was anything but straightforward. After struggling for a week or so, I finally succeeded in making the training work…
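The core of that approach is to subclass keras.Model and override train_step(); the stripped-down sketch below (for a plain latent-to-image GAN, simplified from the official guide rather than copied from it) captures the idea:

import tensorflow as tf
from tensorflow import keras

class GAN(keras.Model):
    def __init__(self, discriminator, generator, latent_dim):
        super().__init__()
        self.discriminator = discriminator
        self.generator = generator
        self.latent_dim = latent_dim

    def compile(self, d_optimizer, g_optimizer, loss_fn):
        super().compile()
        self.d_optimizer = d_optimizer
        self.g_optimizer = g_optimizer
        self.loss_fn = loss_fn

    def train_step(self, real_images):
        batch_size = tf.shape(real_images)[0]
        noise = tf.random.normal((batch_size, self.latent_dim))

        # Discriminator step: real images labelled 1, generated images 0.
        with tf.GradientTape() as tape:
            fake_images = self.generator(noise, training=True)
            real_pred = self.discriminator(real_images, training=True)
            fake_pred = self.discriminator(fake_images, training=True)
            d_loss = (self.loss_fn(tf.ones_like(real_pred), real_pred) +
                      self.loss_fn(tf.zeros_like(fake_pred), fake_pred))
        grads = tape.gradient(d_loss, self.discriminator.trainable_weights)
        self.d_optimizer.apply_gradients(zip(grads, self.discriminator.trainable_weights))

        # Generator step: try to make the discriminator output 1 for fakes.
        with tf.GradientTape() as tape:
            fake_pred = self.discriminator(self.generator(noise, training=True),
                                           training=True)
            g_loss = self.loss_fn(tf.ones_like(fake_pred), fake_pred)
        grads = tape.gradient(g_loss, self.generator.trainable_weights)
        self.g_optimizer.apply_gradients(zip(grads, self.generator.trainable_weights))
        return {"d_loss": d_loss, "g_loss": g_loss}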
Recently, I have been experimenting with knowledge distillation (KD). The key idea behind KD is to use the outputs of a bigger "teacher network" (TN) as supplementary feedback for a smaller "student network" (SN), in addition to the ground-truth (GT) values. These TN-generated soft labels (SL) reveal more information than the GT's hard labels (HL) and can purportedly help the SN learn better. The main motivation is to obtain smaller and faster production models.
In the following write-up, I share some of the key learnings.
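For concreteness, a common way to combine the two feedback signals for a classification SN looks like the sketch below (the temperature and weighting are assumed values, not necessarily the ones I used):

import tensorflow as tf

def distillation_loss(hard_labels, teacher_logits, student_logits,
                      temperature=4.0, alpha=0.1):
    """Weighted sum of the usual hard-label loss and a soft-label loss
    computed against the teacher's temperature-softened outputs."""
    hard_loss = tf.keras.losses.sparse_categorical_crossentropy(
        hard_labels, student_logits, from_logits=True)
    soft_teacher = tf.nn.softmax(teacher_logits / temperature)
    soft_student = tf.nn.softmax(student_logits / temperature)
    soft_loss = tf.keras.losses.kullback_leibler_divergence(soft_teacher, soft_student)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return alpha * hard_loss + (1.0 - alpha) * (temperature ** 2) * soft_loss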
Recently, while experimenting with knowledge distillation methods for downsizing deep neural network models, I wanted to try out a suggestion made by J. H. Cho et al. in their paper, On the Efficacy of Knowledge Distillation.
They claim that for better training of the student model, it helps if the training from the teacher model is stopped early. They also share an extensive set of results to support their claims. Inspired by this, I wanted to try it out for myself.
A simple strategy for this would be to change the weights of the loss functions during the training process, and…
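One way to do this in Keras is to make the distillation weight a tf.Variable that the loss reads at call time and to mutate it from a callback; the schedule and names below are illustrative assumptions:

import tensorflow as tf
from tensorflow import keras

# A mutable weight that the loss function reads at every step.
distill_weight = tf.Variable(1.0, trainable=False, dtype=tf.float32)

def weighted_distill_loss(y_true, y_pred):
    # Placeholder loss; the point is that distill_weight is read at call time.
    return distill_weight * tf.reduce_mean(tf.square(y_true - y_pred))

class DistillWeightScheduler(keras.callbacks.Callback):
    """Drop the distillation weight to zero after a chosen epoch,
    effectively stopping the feedback from the teacher early."""
    def __init__(self, stop_epoch):
        super().__init__()
        self.stop_epoch = stop_epoch

    def on_epoch_begin(self, epoch, logs=None):
        if epoch >= self.stop_epoch:
            distill_weight.assign(0.0)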
Keras provides several built-in metrics that can be used directly to evaluate model performance. However, it is not uncommon to include custom callbacks to extend beyond Keras' capabilities, as I myself had to do recently. In this post, I discuss how to use custom calculated values (metrics, losses) with built-in callbacks.
Before proceeding, we need to define our own custom callback to carry out the calculations we want for tracking training progress or evaluating model performance. Below is dummy code for defining a custom callback.
class CustomCallback(keras.callbacks.Callback): …
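A filled-out version of such a dummy callback might look like the sketch below: the key step is writing the custom value into the logs dictionary so that built-in callbacks placed after it in the callbacks list (e.g. ModelCheckpoint or EarlyStopping monitoring that key) can see it. The validation data and metric here are placeholders:

import numpy as np
from tensorflow import keras

class CustomCallback(keras.callbacks.Callback):
    """Computes a custom value at the end of each epoch and exposes it to
    other callbacks by inserting it into the `logs` dictionary."""
    def __init__(self, x_val, y_val):
        super().__init__()
        self.x_val = x_val
        self.y_val = y_val

    def on_epoch_end(self, epoch, logs=None):
        logs = logs if logs is not None else {}
        preds = self.model.predict(self.x_val, verbose=0)
        # Placeholder metric: mean absolute error on the held-out set.
        logs["custom_val_mae"] = float(np.mean(np.abs(preds - self.y_val)))

# Built-in callbacks can then monitor the injected key, e.g.:
# keras.callbacks.ModelCheckpoint("best.h5", monitor="custom_val_mae", mode="min")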