A Beginner-friendly Introduction to KANs

An evolving alternative to MLPs.

By now, you have likely heard of Kolmogorov-Arnold Networks (KANs).

Over the last couple of weeks, they have gained a lot of traction, especially because they challenge traditional neural network design and offer an exciting new paradigm for building and training networks.

In this month’s free deep dive, I cover KANs in a detailed and beginner-friendly way: A Beginner-friendly Introduction to Kolmogorov Arnold Networks (KAN).

The idea behind KANs is pretty smart.

Traditionally, we stack many layers on top of each other, each of which is a linear transformation, and we deliberately introduce non-linearity with activation functions.

Moreover, these activation functions are fixed across the entire network and are applied at each node (shown below):
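For concreteness, here is a minimal sketch (my own illustration, not code from the article) of such a layer in PyTorch: a learnable linear map followed by a fixed activation applied at every node.

```python
import torch
import torch.nn as nn

class MLPLayer(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # The learnable part: a weight matrix and bias (a linear transformation).
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The non-linearity (ReLU here) is fixed: the same function at every
        # node, with no trainable parameters of its own.
        return torch.relu(self.linear(x))

layer = MLPLayer(4, 3)
print(layer(torch.randn(2, 4)).shape)  # torch.Size([2, 3])
```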

KANs, however, move the activation functions to the edges and make them learnable:

As a result, the transformation matrices are composed not of weights but of learnable activation functions (as shown below), and every one of them can be different:
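Here is a rough sketch of what such a layer could look like in code (again, my own simplified illustration, not the article's implementation): every edge gets its own univariate function, parameterized by learnable coefficients over a fixed basis. The original KAN paper uses B-splines for these functions; the Gaussian bumps below are just a simple stand-in to keep the example short.

```python
import torch
import torch.nn as nn

class KANLayer(nn.Module):
    """One KAN layer: a separate learnable univariate function on every edge."""

    def __init__(self, in_features: int, out_features: int, num_basis: int = 8):
        super().__init__()
        # Fixed basis centers on a grid over the expected input range [-1, 1].
        self.register_buffer("centers", torch.linspace(-1, 1, num_basis))
        # Learnable coefficients: one set per edge (out_features x in_features x K).
        self.coeffs = nn.Parameter(torch.randn(out_features, in_features, num_basis) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_features)
        # Evaluate the fixed basis at every input value: (batch, in, K).
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) ** 2) / 0.1)
        # phi_{j,i}(x_i) for every edge: (batch, out, in).
        edge_outputs = torch.einsum("bik,jik->bji", basis, self.coeffs)
        # Each output node sums the functions on its incoming edges.
        return edge_outputs.sum(dim=-1)  # (batch, out_features)

layer = KANLayer(4, 3)
print(layer(torch.rand(2, 4) * 2 - 1).shape)  # torch.Size([2, 3])
```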

To obtain the output of one layer, we pass each entry of the input vector through its edge functions and sum the results at every output node:

This produces the output of the layer:
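Written out, if phi_{l,j,i} denotes the learnable function on the edge from input node i to output node j of layer l (notation following the KAN paper), each output entry is simply the sum of the edge functions applied to the corresponding inputs:

```latex
x_{l+1,\,j} \;=\; \sum_{i=1}^{n_l} \phi_{l,j,i}\!\left(x_{l,i}\right),
\qquad j = 1, \dots, n_{l+1}
```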

Of course, there are many background details to be understood here, which you can find in the article: A Beginner-friendly Introduction to Kolmogorov Arnold Networks (KAN).

The article is NOT paywalled, so it is open to all readers.

I am pretty confident that if you have no idea what KANs are, how they work, or why they appear so powerful, this deep dive will clarify everything with the right intuition.

Have a good day!

Avi
