Autoencoders are a particular type of neural network architecture designed to learn a compressed representation of data.
The architecture generally looks like an hourglass: a stack of layers that tries to capture the essential representation of the input data.
The network takes the original data (an image, etc.) as input and regenerates it as output. In doing so, the network tries to learn the structure of the input domain through the weights of its layers.
The architecture generally has the shape shown in the image below.
It has two parts: an encoder and a decoder.
There is no limit on the number of layers or the shape of the layers we use when designing the encoder and decoder. Overall, the encoder is supposed to reduce the input to a smaller size, and the decoder is supposed to reproduce the input from the compressed representation.
The output of the encoder is generally referred to as the embedding or the latent space.
Overall, it is a single neural network, but to make it easier to understand, the first few layers that compress the data are referred to as the encoder, and the later part that expands the compressed data back into the original image is referred to as the decoder.
When training autoencoders, the commonly used losses are mean squared error and (for inputs scaled to [0, 1]) binary cross-entropy; both generally give good reconstruction results.
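As a minimal sketch, with dummy tensors standing in for a real batch and its reconstruction, both losses can be computed in PyTorch like this:

import torch
from torch.nn import MSELoss, BCELoss

original = torch.rand(8, 784)        # dummy input batch, values in [0, 1]
reconstruction = torch.rand(8, 784)  # dummy decoder output, values in [0, 1]

mse = MSELoss()(reconstruction, original)  # mean squared error
bce = BCELoss()(reconstruction, original)  # binary cross-entropy, valid for targets in [0, 1]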
We can say that autoencoders are somewhat like PCA, as both compress data, but autoencoders can capture non-linear representations, which PCA cannot. This is due to the presence of non-linear activations like Sigmoid, ReLU, Tanh, etc.
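To make the analogy concrete, here is the same compress-and-reconstruct idea with PCA from scikit-learn, using dummy data; unlike an autoencoder, the projection here is purely linear:

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 784)                  # dummy dataset of flattened images
pca = PCA(n_components=32)                    # linear compression to 32 dimensions
compressed = pca.fit_transform(X)             # analogous to the encoder output
restored = pca.inverse_transform(compressed)  # analogous to the decoder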
Once the autoencoder is trained, we can separate the encoder and the decoder and use them independently. The encoder can be used to create a smaller representation of the data, which saves disk space, and the decoder can regenerate the original data from the smaller representation produced by the encoder.
Over the years, many different types of autoencoders have been developed, such as denoising, sparse, variational, and convolutional autoencoders.
Apart from data compression, the output of the encoder part of the network has a few other use cases:

- EDA can be performed on the compressed data.
- Tasks like classification can be performed on the compressed data.
- The output of the encoder can be used as extracted features, and it can be combined with other features as well (see the sketch below).
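As a rough sketch of the last point, with a placeholder encoder and hypothetical extra features:

import torch
from torch.nn import Linear, Sequential, ReLU

encoder = Linear(784, 32)           # placeholder standing in for a trained encoder
x = torch.rand(8, 784)              # a dummy batch of flattened images
other_features = torch.rand(8, 10)  # hypothetical hand-crafted features

embedding = encoder(x)                                    # (8, 32) extracted features
combined = torch.cat([embedding, other_features], dim=1)  # (8, 42) combined feature set
classifier = Sequential(Linear(42, 16), ReLU(), Linear(16, 2))
logits = classifier(combined)       # downstream classification on the combined features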
Below, I have included a sample architecture of an autoencoder created using the deep-learning framework PyTorch.
import torch
from torch.nn import Module, Linear
import torch.nn.functional as F

class AutoEncoder(Module):
    def __init__(self, n_features, embed_size):
        super(AutoEncoder, self).__init__()
        ## Encoder: compress the input down to the embedding size
        self.lin1 = Linear(n_features, embed_size)
        ## Decoder: expand the embedding back to the input size
        self.lin2 = Linear(embed_size, n_features)

    def forward(self, x):
        ## Encode
        x = self.lin1(x)
        x = F.leaky_relu(x)
        ## Decode
        x = self.lin2(x)
        x = torch.sigmoid(x)  # torch.sigmoid, since F.sigmoid is deprecated
        return x

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoEncoder(28*28, 32)  # flattened 28x28 input, 32-dimensional embedding
model = model.to(device)
model
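For completeness, here is a minimal training-loop sketch for the model above, assuming a DataLoader named train_loader (a hypothetical name) that yields flattened 28x28 image batches scaled to [0, 1]:

from torch.optim import Adam
from torch.nn import MSELoss

optimizer = Adam(model.parameters(), lr=1e-3)
criterion = MSELoss()

for epoch in range(10):
    for batch in train_loader:        # assumed to yield tensors of shape (batch, 784)
        batch = batch.to(device)
        reconstruction = model(batch)
        loss = criterion(reconstruction, batch)  # compare the output with the input itself
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()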
The same idea extends to images by using convolutional layers. Below is a convolutional autoencoder for 28x28 single-channel images (e.g. MNIST), where the decoder mirrors the encoder using transposed convolutions.

from torch.nn import Module, Sequential
from torch.nn import Conv2d, ConvTranspose2d, LeakyReLU, Flatten, Linear, Sigmoid

class Reshape(Module):
    """Helper layer to reshape the flat decoder input back into feature maps."""
    def __init__(self, *dims):
        super(Reshape, self).__init__()
        self.dims = dims

    def forward(self, x):
        return x.reshape(self.dims)

class AutoEncoder(Module):
    def __init__(self, embed_size):
        super(AutoEncoder, self).__init__()
        ## Encoder: (1, 28, 28) -> (32, 26, 26) -> (64, 12, 12) -> (64, 5, 5) -> embedding
        self.encoder = Sequential(
            Conv2d(1, 32, kernel_size=(3,3), stride=1, padding=0),
            LeakyReLU(),
            Conv2d(32, 64, kernel_size=(3,3), stride=2, padding=0),
            LeakyReLU(),
            Conv2d(64, 64, kernel_size=(3,3), stride=2, padding=0),
            Flatten(),
            Linear(64*5*5, embed_size)
        )
        ## Decoder: embedding -> (64, 5, 5) -> (64, 12, 12) -> (32, 26, 26) -> (1, 28, 28)
        self.decoder = Sequential(
            Linear(embed_size, 64*5*5),
            Reshape(-1, 64, 5, 5),
            ConvTranspose2d(64, 64, kernel_size=(4,4), stride=2, padding=0),
            LeakyReLU(),
            ConvTranspose2d(64, 32, kernel_size=(4,4), stride=2, padding=0),
            LeakyReLU(),
            ConvTranspose2d(32, 1, kernel_size=(3,3), stride=1, padding=0),
            Sigmoid()  # outputs in [0, 1] to match normalized pixel values
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x
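Because the encoder and decoder are stored as separate Sequential modules, they can be called independently once trained, as mentioned earlier. A minimal sketch, with an untrained model and random data standing in for real images:

import torch

conv_model = AutoEncoder(embed_size=32)
images = torch.rand(8, 1, 28, 28)               # dummy batch of 28x28 grayscale images

embedding = conv_model.encoder(images)          # compressed representation, shape (8, 32)
reconstructed = conv_model.decoder(embedding)   # back to image shape (8, 1, 28, 28)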
All right, so in this blog, I tried to explain the autoencoder neural network in a few words. I hope you learned something from it.
If you like this blog, you might like other blogs on our website as well. Please feel free to explore the Blogs and Tutorials section to learn about other concepts.