Talks: Learning to See without a Teacher, by Isola

This is such an interesting topic in computer vision. The lecture gave us a big picture of how a machine can learn something without a teacher.

Professor Isola presented his own research and that of others through multiple examples and simple diagrams of how the learning system works. It was quite amazing to see that, after training, the machine can predict the colors of a black-and-white picture. I was also drawn to the outputs it produced from hand-drawn sketches, although some of the generated pictures were ridiculous.

The technique is based on two networks: one generates the picture, and the other judges whether it is real or fake. The approach builds on Goodfellow's GAN (generative adversarial network).
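To make the two-network idea concrete, here is a minimal sketch of the two losses as I understand them from the original GAN formulation. It is not the code from the talk: the function names and the sample probabilities are my own assumptions, and a real system would compute these values from neural network outputs.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Loss for the judge: it wants d_real close to 1 and d_fake close to 0."""
    real_term = sum(-math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating loss for the generator: it wants d_fake close to 1."""
    return sum(-math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical judge outputs: the probability that each image is real.
confident_judge = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
fooled_judge = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.9, 0.8])
```

The judge's loss is low when it separates real from fake (confident_judge) and high when the generator fools it (fooled_judge). Training alternates between lowering each network's own loss.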

This talk gave me some inspiration, especially in two directions.

The first one is design and fashion.

In the lecture, some experimental examples were shown. What impressed me was the attempt to fill in edge drawings of bags and shoes. Although some outputs were ridiculous enough to make the audience laugh, they also showed that the system can create things beyond our imagination. What is more, if we take those fake images seriously, their imaginative patterns could inspire designers, giving them new ways of thinking when they design bags, shoes and clothes, or at least the colors and patterns of fashion products.

My second thought is about image quality improvement.

Although we have wider bandwidth and cheaper communication technology, we sometimes still need faster data transfer. For example, when we watch videos, we often care less about the resolution of each frame than about smooth playback. Still, higher quality would be much more satisfying. To achieve both goals, small transfer size and high image quality, I suggest building a self-training agent on the client side that enhances the low-quality images sent from the server. In this way, we could save a lot of bandwidth with little or even no impact on the experience of watching 1080p video.
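The server/client split can be sketched as follows. This is only an illustration under my own assumptions: the frame is a tiny grid of made-up grayscale values, and the client-side function simply repeats pixels, standing in for the trained enhancement network that the idea actually requires.

```python
def downsample(frame, factor=2):
    """Server side: keep every `factor`-th pixel in each direction."""
    return [row[::factor] for row in frame[::factor]]

def upsample_nearest(frame, factor=2):
    """Client side stand-in: repeat each pixel `factor` times in each direction.
    A trained enhancement network would replace this function."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def pixel_count(frame):
    return sum(len(row) for row in frame)

# Tiny 4x4 grayscale "frame" with hypothetical pixel values.
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [50, 50, 120, 120],
    [50, 50, 120, 120],
]
sent = downsample(original)          # what travels over the network
restored = upsample_nearest(sent)    # what the client displays
saving = 1 - pixel_count(sent) / pixel_count(original)
```

Halving the resolution in both directions sends a quarter of the pixels. This blocky toy frame happens to be restored exactly. A real frame would lose detail, and recovering that detail is the job I have in mind for the learned agent. Pixel count is also only a rough proxy for bandwidth, since real video is already compressed.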

Ways of saving bandwidth already exist for other media. One is hosted email filtering, in which your MX record points to a cloud server rather than to your own mail server, so that mail is filtered before it reaches you. Perhaps a GAN could be used to implement a hosted filter of this kind.

If the AI agent is intelligent enough, we could even apply it to storage. Nowadays, a camera's SD card may hold only about 1000 high-quality pictures, because each photo is over 10 MB. If we store only the structure of the picture in black-and-white format and let the agent restore the rest, we might shrink each file to 1/8 or even 1/16 of its original size.
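A rough back-of-the-envelope check of these ratios, under assumptions that are mine rather than the talk's: the photo is 4000 x 3000 pixels, sizes are counted uncompressed, and the helper name raw_size_bytes is only illustrative.

```python
def raw_size_bytes(width, height, bits_per_pixel):
    """Uncompressed size of an image, ignoring headers and file compression."""
    return width * height * bits_per_pixel // 8

# Assumed 12-megapixel photo; the dimensions are illustrative.
W, H = 4000, 3000
rgb = raw_size_bytes(W, H, 24)    # 8 bits for each of R, G and B
gray = raw_size_bytes(W, H, 8)    # one 8-bit brightness channel
gray4 = raw_size_bytes(W, H, 4)   # brightness coarsened to 16 levels
edges = raw_size_bytes(W, H, 1)   # 1-bit edge map

for name, size in [("gray", gray), ("gray4", gray4), ("edges", edges)]:
    print(f"{name}: 1/{rgb // size} of the RGB size")
```

Dropping color alone gives about 1/3, so reaching 1/8 or 1/16 would also require a coarser structure, somewhere between 16 gray levels (1/6) and a pure edge map (1/24). Actual camera files are already compressed, so the real ratios would differ.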

This amounts to a new compression technique. We can compress the file with a new algorithm and then use the decompressor that corresponds to that specific algorithm to recover the original. This kind of compression is lossy, and how much is lost depends on the decompressor's accuracy, so it is necessary to improve the accuracy of the predictor (the decompressor).
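The link between a matched compressor/decompressor pair and its loss can be sketched with a much simpler codec than a neural network. Everything here is an assumption for illustration: the codec just drops low-order bits, the pixel values are made up, and PSNR is one common way to score the reconstruction.

```python
import math

def compress(pixels, keep_bits=4):
    """Lossy step: keep only the top `keep_bits` bits (1 to 7) of each 8-bit value."""
    return [p >> (8 - keep_bits) for p in pixels]

def decompress(codes, keep_bits=4):
    """Matching decoder: map each code back to the middle of its bucket.
    A learned predictor would try to do better than this fixed guess."""
    shift = 8 - keep_bits
    return [(c << shift) + (1 << (shift - 1)) for c in codes]

def psnr(original, restored):
    """Peak signal-to-noise ratio in dB for 8-bit data; higher means less loss."""
    mse = sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(255 ** 2 / mse)

pixels = [0, 17, 64, 130, 200, 255]   # hypothetical 8-bit values
restored = decompress(compress(pixels))
quality_4bit = psnr(pixels, restored)
quality_6bit = psnr(pixels, decompress(compress(pixels, 6), 6))
```

The decoder only works with the encoder it was built for, and the score rises as the decoder's guesses get closer to the original. In the scheme I am proposing, the fixed bucket-middle guess would be replaced by the trained predictor.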

There are several challenges to consider.

One challenge is that the accuracy may be stable on average in theory but vary hugely when we pick a single input and look at its result.
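A small numerical illustration of this point, using made-up per-image error values for two hypothetical models (lower is better):

```python
from statistics import mean

def summarize(errors):
    """Return the average error and the worst single-input error."""
    return mean(errors), max(errors)

# Hypothetical per-image errors for two models.
steady = [0.10, 0.11, 0.09, 0.10, 0.10]
erratic = [0.02, 0.02, 0.02, 0.02, 0.42]

steady_avg, steady_worst = summarize(steady)
erratic_avg, erratic_worst = summarize(erratic)
```

Both models have the same average error, yet the erratic one is far worse on one particular image. Reporting the worst case, or the spread, alongside the average would expose this.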

Another challenge is that, as media keep developing, the model must be updated from time to time, so it needs the ability to learn by itself. That in turn requires substantial computing power, which may run into hardware limits.

All in all, this area is quite promising.