TLDR version: It’s not the aesthetics.
Having a little bit of understanding of what a deep artificial neural network is and how it works might help make this article a bit clearer. I try to explain it in non-technical terms here. If you’re not familiar with the territory, feel free to have a quick skim through it before coming back here.
First off I should make it clear that I am not one of the authors of the #deepdream #inceptionism thing that’s been going around lately. It was developed by Alexander Mordvintsev, Christopher Olah and Mike Tyka. I have played with their code, made a few things and tweeted about it. They have a nice explanation of #deepdream with images on their blog post.
And it’s blowing my mind!
But I should also point out: it’s not the aesthetic that’s turning me on.
The aesthetic is interesting. It’s trippy, surreal, abstract, psychedelic, painterly, rich in detail. But the novelty is likely to wear off quickly for most people except for the specially dedicated. Using new datasets or learning to control the output (so it’s not just puppy-slugs) can undoubtedly give it a new breath of life. And there is potential to do very interesting work conceptually pairing datasets with seed images. But it’s not the aesthetic that excites me.
Instead, like I said recently…
…the poetry behind the scenes is blowing my mind!
At a high level here’s what’s happening in #deepdream:
- An artificial neural network (i.e. the AI’s ‘brain’) has already been trained on over a million images (to recognise objects in the images)
- We show the trained network a brand new image
- While the network is processing this new image, we take a snapshot from a particular group of neurons inside the network
- We feed that new snapshot image back in, i.e. show it to the network (We can optionally apply small transformations, like zooming in etc.)
(If interested, see here for a more detailed non-technical technical explanation)
The poetry is blowing my mind at every step of the process…
When an artificial neural network receives an input such as an image, it tries to make sense of it based on what it already knows. The image data flows through the network, ‘activating’ neurons. Effectively the image is ripped apart and scanned for features that the network recognises. This can be thought of as asking the network “Based on what you already know, can you see anything here that you recognise?”.
This of course is how we make sense of the world. It’s analogous to asking us to recognise objects in clouds or ink / Rorschach tests. But I don’t even mean just visually. We try to frame everything that we see, hear or learn within the context of what we already know, and we build on top of that. This can be purely visual like seeing faces in clouds. Or it can be more critical as it affects how we learn, interpret information, make decisions, construct theories or develop prejudices based on the limited knowledge that we have. If we don’t have sufficient information, the assumptions we make are likely to be incorrect, as are the decisions we make as a result of them.
2. Confirmation Bias
When the network is processing the image, some of these recognitions might be weak firings within the network. These weak neural firings can be thought of as almost sub-conscious level “I think I see a little bit of a lizard-like texture over here, perhaps something that resembles a bridge over there”. But if these are very weak activations in the deep layers, they’ll dissipate within the network and won’t elevate to higher layers, or influence the final output.
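One way to picture a weak firing dissipating is sketched below. It is a toy illustration rather than a description of the actual network: it assumes a neuron in a higher layer only fires when its incoming signal clears a threshold (a ReLU with a negative bias). The function names, the weight and the `bias` value are all made up for the example.

```python
def relu(value):
    """Pass positive values through, silence everything else."""
    return max(0.0, value)


def higher_neuron(activation, weight=1.0, bias=-0.5):
    """Hypothetical neuron in a higher layer: it only fires if the
    weighted incoming activation exceeds the threshold set by `bias`."""
    return relu(weight * activation + bias)


weak_lizard_hint = 0.2    # "maybe a little bit of a lizard-like texture"
strong_lizard_hint = 0.9  # the same feature, but firing strongly

weak_out = higher_neuron(weak_lizard_hint)      # stays silent
strong_out = higher_neuron(strong_lizard_hint)  # gets passed upwards
print(weak_out, strong_out)
```

In this sketch the weak hint never makes it past the next layer, while the strong one does. That is the sense in which a faint recognition can exist inside the network without influencing the final output.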
But in the case of #deepdream, we choose a particular group of neurons dedicated to detecting particular features (e.g. those which respond to lizard-like features) and we take a snapshot image from inside the network. Whatever features a particular group of neurons respond to will be dominant in the snapshot image created from those neurons. (NB. Technically speaking, by ‘take a snapshot’ I mean we choose a group of neurons, and we modify the input image such that it amplifies the activity in that neuron group. See my other article for more non-technical technical info.)
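For the curious, the ‘modify the input image such that it amplifies the activity’ part can be sketched as a single step of gradient ascent on the image. This is a minimal toy, not the released #deepdream code: the ‘network’ is assumed to be one layer of ReLU neurons, the ‘image’ is four numbers, the quantity being increased (half the sum of squared activations) is my assumption, and names like `lizard_group` and `amplify_step` are hypothetical.

```python
def group_activity(weights, image):
    """Activity of the chosen neuron group: 0.5 * sum of squared ReLU
    activations, where each activation is relu(w . image)."""
    total = 0.0
    for w in weights:
        a = max(0.0, sum(wi * xi for wi, xi in zip(w, image)))
        total += 0.5 * a * a
    return total


def amplify_step(weights, image, step_size=0.1):
    """One gradient-ascent step on the image (the weights never change).

    The gradient of the activity with respect to the image is the sum,
    over neurons, of activation * w, so the pixels move in the direction
    that makes the already-active neurons fire harder.
    """
    gradient = [0.0] * len(image)
    for w in weights:
        a = max(0.0, sum(wi * xi for wi, xi in zip(w, image)))
        for i, wi in enumerate(w):
            gradient[i] += a * wi
    return [xi + step_size * gi for xi, gi in zip(image, gradient)]


# Hypothetical 4-"pixel" image and a group of two feature detectors.
lizard_group = [[1.0, -1.0, 1.0, -1.0],
                [0.5, 0.5, -0.5, -0.5]]
image = [0.3, 0.1, 0.2, 0.1]

before = group_activity(lizard_group, image)
snapshot = amplify_step(lizard_group, image)
after = group_activity(lizard_group, snapshot)
print(before, after)
```

The important detail is that the network itself is left untouched. Only the image changes, and it changes towards whatever the chosen neurons were already faintly responding to.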
This snapshot shows what that particular group of neurons are responding to, or ‘thinking about’.
When we feed that snapshot image back into the network as a new input, the network recognises those exact same features but with more confidence, because those patterns in the new image are now stronger, so those same neurons fire stronger. And when we take another snapshot of the same neurons, and feed that back in, it becomes even stronger. What was an initial “maybe I see inklings of little lizard-like features over here” on a deep sub-conscious level, starts to become “yea, I think they might be lizard-like features”, to “oh definitely, that’s a lizard-skin puppy-slug” at a well defined, visible high level. These activations are now strong enough to not dissipate and disappear in the depths of the network, and can propagate to higher levels, potentially even affecting the final output or decision.
This creates a positive feedback loop, reinforcing the bias in the system. Building confidence with each iteration. Transforming what were subtle, unnoticeable trends deep within the network into strong, visible, defining biases that affect the decisions of the network.
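The feedback loop can also be sketched in a few lines. Again this is a toy under stated assumptions, not the real thing: a single hypothetical linear neuron stands in for the neuron group, and the image is rescaled to the same overall strength after every pass so that any growth comes from the pattern taking over rather than from the numbers simply getting bigger. `resemblance` (a cosine similarity) is my stand-in for the network's ‘confidence’.

```python
import math


def normalise(vector):
    """Scale a vector to unit length so only its pattern matters."""
    length = math.sqrt(sum(v * v for v in vector))
    return [v / length for v in vector]


def resemblance(feature, image):
    """Cosine similarity between the image and the feature a neuron detects."""
    return sum(f * x for f, x in zip(normalise(feature), normalise(image)))


def feedback_loop(feature, image, iterations=10, step_size=0.2):
    """Repeatedly nudge the image towards the feature and feed it back in.

    Returns the starting resemblance followed by the value after each pass.
    """
    feature = normalise(feature)
    image = normalise(image)
    history = [resemblance(feature, image)]
    for _ in range(iterations):
        # The neuron's response is feature . image, so its gradient with
        # respect to the image is simply the feature itself.
        image = normalise([x + step_size * f for x, f in zip(image, feature)])
        history.append(resemblance(feature, image))
    return history


lizard_feature = [1.0, -1.0, 1.0, -1.0]  # hypothetical pattern the neuron likes
cloud_image = [0.9, 0.8, 1.0, 0.7]       # barely resembles it to begin with

confidence = feedback_loop(lizard_feature, cloud_image)
print([round(c, 2) for c in confidence])
```

The printed values start close to zero and climb towards one with every pass: the faint inkling in the original image ends up defining the whole image.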
This is almost like asking you to draw what you think you see in the clouds, and then asking you to look at your drawing and then draw a new image of what you think you are seeing in your drawing. And repeating this.
But that last sentence was not even fully accurate. It would be accurate, if instead of asking you to draw what you think you saw in the clouds, we scanned your brain, looked at a particular group of neurons which we know responds to a particular pattern, then we reconstructed an image based on the firing patterns of those neurons, and gave that image to you to look at. And then we scanned the same neurons again to produce a new image and showed you that etc.
The critical difference is, if we’d asked you what you saw in the clouds, we’d be representing the final conscious decision you made regarding what you saw. Whereas by scanning and extracting the snapshot from a group of neurons, we’re preying on and amplifying a particular thread of thought to create a strong bias. Like an indoctrination on a neurological level.
This of course is analogous to so many aspects of how our mind functions already. We see the world through the filter of a biased mind. A product of our upbringing, everything we've ever seen or learnt, the culture in which we live or come from. We project this bias onto everything we perceive, and if we’re not very careful, everything we perceive will in turn reinforce the very bias that shaped it.
The face in the clouds looks more and more like a face the more we think it’s a face. The shadow in the alley looks more and more like a mugger the more afraid we become. The image of the Virgin Mary on a piece of toast is more tangible the more we want to believe in it. The more convinced we are of a certain hypothesis, the more inclined we are, subconsciously or not, to find that every piece of evidence confirms that hypothesis.
Interestingly, even if you don’t agree with my previous points, you've probably already confirmed them. If you see a human- or ape-like face in the image below; or [bird, slug, reptile, worm, puppy, sloth]-like creatures; then you have just demonstrated it. Perhaps you see something else? Something I can’t see? Then you've confirmed my point even more strongly.
There are no faces, birds, slugs, reptiles, worms, puppies, sloths in the image above.
Your mind is projecting those meanings, trying to recognise patterns based on what it already knows, what it’s been trained on. Neurons in your brain stimulated by different features of these abstract shapes are trying to make sense of what you’re seeing and frame it in the context of something familiar. (NB. Also see apophenia and pareidolia.)
Just like the #deepdream neural network.
You’re looking into a mirror of your own mind.
4. Completion of the Cycle
Even more interestingly, remember that these images generated by the #deepdream process are not what the network is seeing on a high level. These images are extracts from inside the network. Abstract representations from the depths of its memory. Snapshots from inside the AI’s brain.
These were weak neural firings, that we amplified. Without us interfering, these firings might not have even elevated to higher levels, and would have remained latent in the network. On a higher level the AI might not even be aware of these features. But deep inside the artificial neural network, certain neurons fired weakly, responding to certain features that the network recognised.
And then you look at these images. You find patterns in them. You recognise the exact same features that the artificial neural network had recognised, and amplified. You project meaning onto this noise, the abstract representations extracted from inside the hidden depths of the artificial neural network. In recognising these forms, you are confirming what the #deepdream neural network recognises but doesn't know that it’s seeing. The same neurons which fired weakly in the depths of the artificial neural network, are now firing in your brain.
You are completing this cycle of recognition and meaning in your mind.
5. Acknowledgement of Self-fulfilment
A final word. I realise I might be wrong. You might think that I’m reading too much into all of this, and maybe I’m just talking complete rubbish. But in my mind this is what I see. This is what I believe. It just makes complete sense to me.
But I am open to the idea that maybe I am wrong. Maybe I'm making some incorrect assumptions somewhere, maybe I'm making errors in logic, maybe I’m just downright ignorant or stupid. Maybe due to something weird in my brain I’m just not processing the information correctly. Maybe there’s some kind of bias that’s distorting my view, and making me see these things that aren’t really there. A bias that’s there because of everything that I’ve learnt and been subjected to in my journey in life so far. So maybe I am wrong, and maybe because this is what I already believe, I'm projecting it onto what I see here. But then this discussion is just reinforcing this very bias to further confirm everything that I just stated, in an infinite positive feedback loop where I can feel every step of the process being mirrored in my own mind as the neurons in my brain spike like crazy while they recognise the confirmation bias of the process reinforcing the hypothesis and projecting back onto my speculations which match exactly the so-called predictions to complete this cycle, and when I think about how that relates to everything that I've just said…
…the poetry is blowing my mind.
Have a nice day.