Monday, August 3, 2020

Fourier-Mellin Fractals

The Fourier-Mellin Transform (FMT) seems to be fundamental in encoding 2D fractals.

Here are example kernels of which the FMT composes all shapes:

Notice that these are standard Fourier transformation planar waves, but in log-polar space. Also notice that these are basic components of shapes with single scale symmetry (an image that is invariant to a single similarity transformation), you can build any sort of scale-symmetric spiral from summing these.

If we take a 2D image I:
and get its Fourier transformation F(I) then all shapes in any location will produce the same abs(F(I)):   (zoom in on centre to see):
Now, if we take the log of this log(abs(F(I))) (as though it were on a complex plane), i.e. convert to log polar coordinates: (horiz = angle, vertical = log radius from bottom to top)
The result is vertically periodic only if it is a broad class of scale-symmetric shape. (Above we see the lines getting softer towards the bottom, that is just because there aren't enough pixels close to the abs(F(I)) centre to represent the low spatial frequencies of the red squares shape. The precise result would be periodic).

The height offset of the vertical translation symmetry represents the scale involved in the fractal. In this case it is scale=2.

Next, we take the Fourier transformation of this, and get its absolute, to give the Fourier Mellin Transformation: FMT = abs(F(log(abs(F(I))))):
This gives an encoding of the shape that is now invariant to rotations and scalings.

Let's do it again, this time with a Menger sponge...
                      I                                      abs(F(I))                       log(abs(F(I)))      
                                               FMT = abs(F(log(abs(F(I)))))
and with the 2D Cantor set:
                                   I                                                 FMT(I)
As you can see, the Fourier-Mellin transformation is different to the squares examples. The intermediate images are hard to see, so you'll need to zoom in or brighten them to see better.

Now, this FMT is invariant to translation, rotation and scale. To illustrate, here is the FMT calculated for the Viscek fractal at two different scales and rotations:

Not only does the FMT represent the shape independent of its translation, rotation and scale, but for a large class of fractals, they can represent their infinitely detailed structure.

This is the class of fractal sets S of the form S=X u Ui(TiS)  where X is a set, U is union and Ti are a set of transformations whose scale are mutually rational, and whose rotations are rational multiples of 2PI. The class includes most simple fractals, including the ones above.

For such fractals, the log(abs(F(I))) image is vertically periodic (and horizontally periodic as it represents angle), this means its abs(FFT()) (the FMT of I) fully captures the entire shape.

However, we still need to quantise this FMT to some finite resolution, but the important thing here is that a finite resolution of the FMT does not inverse transform into a finite resolution image, instead the inverse FMT generates an unbounded fractal, with infinite detail as you zoom in to the centre. Further from the centre gets more blurry (faster if the FMT is lower res), but you can move the fractal so any different parts can go under the detailed centre. So there is access to the whole fractal in high detail, you just have a foviated view of it.


OK I have missed something vital from this description so far (mainly because I just realised it!). Any fractal in any position, orientation and scale will give the same FMT image. However, that FMT image does not fully identify the fractal shape. When you do an abs() it not only gets rid of translation information, but it also gets rid of relative phase information. So it gets rid of too much for the FMT to be a full description of the fractal, instead the FMT is a useful descriptor of the fractal; it is an identifier that could be used in matching or finding particular fractals in an image.

So why is this? It needs investigating, let's start with 1D.
A translation of t of a shape causes the phase of spatial frequency f to change by ft. When the frequencies are 2^i for whole numbers i, then the phase for f2 must always be twice the phase for f1, there is no location where they could possibly have phase 0.5 and 1.5, therefore the individual phases hold more information than just the location.

There are two remedies to make the FMT a description:
  1. Don't do an abs() but instead remove just the phase vector over all pixels that is unique to translation. Presumably this simply setting the lowest frequency phase back to zero and removing multiples of this phase from each of the other frequencies. For the FMT this amounts to placing the shape at its centre of mass, and rotating to have centre of rotation aligned vertically, and fixing the scale to where the bulk of the scale is. While this does work, it isn't necessarily great for matching purposes because it relies on just the lowest spatial frequency to pick the phase, if this has a tiny magnitude then the phase will be sensitive. A better pick would be some sort of weighted sum of the phases...
  2. If all frequencies in the Fourier decomposition were mutually irrational, then there will always be a location that produces any particular phase vector. Whether this is sufficient for FMT to fully characterise the fractal shape, I am not sure though, and certainly fast Fourier transforms are not possible

No comments:

Post a Comment