Stable Diffusion and Unstable Illusion
Stable Diffusion is trendy, isn’t it?
Stable Diffusion itself has been explained in detail in plenty of other articles, so please refer to those.
So, as an homage to Stable Diffusion, I created Unstable Illusion.
Stable Diffusion = stable diffusion
vs.
Unstable Illusion = unstable vision
Anyone can simply run Stable Diffusion as-is, so I wanted to add a little twist of my own.
What I did was use TensorFlow’s DeepDream to over-interpret the images generated by Stable Diffusion, enhancing the patterns the network finds in them.
Dreams feel a lot like hallucinations, and the wordplay worked nicely, so I went with Unstable Illusion.
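In short, the pipeline has two stages (a rough sketch with hypothetical helper names; the actual code appears later in this article):

# Stage 1: Stable Diffusion turns a text prompt into an image.
# Stage 2: DeepDream over-interprets that image, amplifying whatever
#          patterns an ImageNet-trained network sees in it.
sd_image = generate_with_stable_diffusion("some prompt")  # hypothetical helper
dream_image = run_deep_dream(sd_image)                    # hypothetical helper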
“Be Here Now”
Have you ever read Be Here Now?
About 20 cm square, with Indian deities printed on brown straw paper alongside warnings and revelatory messages from Ram Dass. A pioneering work of the Eastern-thought wave that spread in the early 1970s as a counter to the established Western materialism, this novel, hip, and graphical book was immediately embraced by young people around the world and became a bestseller with more than two million copies sold. It is also known as a guide to yoga and meditation for opening up new worlds of perception, and as a gateway to the spiritual world.
That’s how the book is described.
DeepDream, which I mentioned earlier, ties into stories about dreams and hallucinations and has a spiritual air to it, so I turned to “Be Here Now” for reference.
To be honest, being a spiritual book I couldn’t really make sense of, it wasn’t that interesting, but it does contain one suspicious, somewhat surreal line:
“The big ice cream cone in the sky”
In this article, I’m going to take that line, “The big ice cream cone in the sky”, and generate an image from it using Stable Diffusion and TensorFlow’s DeepDream.
What to prepare
- Google Colaboratory GPU environment (*Jupyter Notebook also works; a quick check follows below)
- The distributed source code (*not required if you build everything from scratch)
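Before starting, it’s worth confirming that TensorFlow can actually see a GPU in the runtime:

import tensorflow as tf

# Should list at least one GPU device when the Colab runtime type is GPU.
print(tf.config.list_physical_devices('GPU'))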
Technology used
- Python
- TensorFlow
Generating an image with AI from the “Be Here Now” line “The big ice cream cone in the sky”
Stable Diffusion
Let’s install stable-diffusion-tensorflow.
!pip install git+https://github.com/fchollet/stable-diffusion-tensorflow --upgrade --quiet
!pip install tensorflow tensorflow_addons ftfy --upgrade --quiet
!apt install --allow-change-held-packages libcudnn8=8.1.0.77-1+cuda11.2
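Since libcudnn8 is pinned for CUDA 11.2 above, it can help to print the installed versions and confirm they line up:

import tensorflow as tf
import tensorflow_addons as tfa

print(tf.__version__)   # TensorFlow version
print(tfa.__version__)  # tensorflow_addons version
# CUDA version TF was built against (the key may be absent on CPU-only builds).
print(tf.sysconfig.get_build_info().get("cuda_version"))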
Let’s generate.
from stable_diffusion_tf.stable_diffusion import Text2Image
from PIL import Image
generator = Text2Image(
    img_height=512,
    img_width=512,
    jit_compile=False,
)
img = generator.generate(
    "The big ice cream cone in the sky",
    num_steps=50,                      # number of denoising steps
    unconditional_guidance_scale=7.5,  # how strongly to follow the prompt
    temperature=1,
    batch_size=1,
)
pil_img = Image.fromarray(img[0])
display(pil_img)
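Generation takes a while, so it can be worth saving the result before moving on (the filename is just an example):

# Save the Stable Diffusion output so it can be reloaded later
# without re-running the 50 denoising steps.
pil_img.save("big_ice_cream_cone.png")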
A large ice cream cone floating in the sky is generated. I’m already impressed at this point.
DeepDream
Import everything.
import tensorflow as tf
import numpy as np
import matplotlib as mpl
import IPython.display as display
import PIL.Image
Define helpers to convert the image to an array, de-process it, and display it.
def download(img, max_dim=None):
    # The image is already a PIL image here, so just convert it to a
    # NumPy array (max_dim is kept for interface compatibility but unused).
    return np.array(img)

def deprocess(img):
    # Undo InceptionV3 preprocessing: map values from [-1, 1] back to [0, 255].
    img = 255 * (img + 1.0) / 2.0
    return tf.cast(img, tf.uint8)

def show(img):
    display.display(PIL.Image.fromarray(np.array(img)))
original_img = download(pil_img, max_dim=500)
show(original_img)
Create the base model (InceptionV3 pretrained on ImageNet).
base_model = tf.keras.applications.InceptionV3(include_top=False, weights='imagenet')
Create a DeepDream model.
names = ['mixed3', 'mixed5']
layers = [base_model.get_layer(name).output for name in names]
dream_model = tf.keras.Model(inputs=base_model.input, outputs=layers)
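mixed3 and mixed5 are just one choice: InceptionV3 has eleven mixed blocks (mixed0 through mixed10), where lower layers respond to simple textures and higher ones to more object-like patterns. You can list them and experiment:

# List every 'mixed' concatenation layer in InceptionV3; swapping
# other names into `names` above changes the character of the dream.
print([layer.name for layer in base_model.layers if layer.name.startswith('mixed')])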
Create a loss function.
def calc_loss(img, model):
    # Pass the image through the model and sum the mean activation of each
    # chosen layer; gradient ascent on this loss amplifies the patterns
    # those layers respond to.
    img_batch = tf.expand_dims(img, axis=0)
    layer_activations = model(img_batch)
    if len(layer_activations) == 1:
        layer_activations = [layer_activations]

    losses = []
    for act in layer_activations:
        loss = tf.math.reduce_mean(act)
        losses.append(loss)

    return tf.reduce_sum(losses)
Create a DeepDream class.
class DeepDream(tf.Module):
    def __init__(self, model):
        self.model = model

    @tf.function(
        input_signature=(
            tf.TensorSpec(shape=[None, None, 3], dtype=tf.float32),
            tf.TensorSpec(shape=[], dtype=tf.int32),
            tf.TensorSpec(shape=[], dtype=tf.float32),
        )
    )
    def __call__(self, img, steps, step_size):
        loss = tf.constant(0.0)
        for n in tf.range(steps):
            with tf.GradientTape() as tape:
                # `img` is a plain tensor, not a Variable, so watch it explicitly.
                tape.watch(img)
                loss = calc_loss(img, self.model)

            # Gradient ascent: normalize the gradients, then step in the
            # direction that increases the loss, strengthening the patterns.
            gradients = tape.gradient(loss, img)
            gradients /= tf.math.reduce_std(gradients) + 1e-8

            img = img + gradients * step_size
            img = tf.clip_by_value(img, -1, 1)

        return loss, img
Instantiate the class.
deepdream = DeepDream(dream_model)
Create a DeepDream execution function.
def run_deep_dream_simple(img, steps=100, step_size=0.01):
    # Convert the image to the [-1, 1] range InceptionV3 expects.
    img = tf.keras.applications.inception_v3.preprocess_input(img)
    img = tf.convert_to_tensor(img)
    step_size = tf.convert_to_tensor(step_size)
    steps_remaining = steps
    step = 0
    while steps_remaining:
        # Run at most 100 steps per call so progress is shown along the way.
        if steps_remaining > 100:
            run_steps = tf.constant(100)
        else:
            run_steps = tf.constant(steps_remaining)
        steps_remaining -= run_steps
        step += run_steps

        loss, img = deepdream(img, run_steps, tf.constant(step_size))

        display.clear_output(wait=True)
        show(deprocess(img))
        print("Step {}, loss {}".format(step, loss))

    result = deprocess(img)
    display.clear_output(wait=True)
    show(result)

    return result
Unstable Illusion
Let’s generate.
dream_img = run_deep_dream_simple(
    img=original_img,
    steps=100,
    step_size=0.01
)
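run_deep_dream_simple returns a uint8 tensor, so the result can be saved in the same way as before (again, the filename is just an example):

# Convert the uint8 tensor to a NumPy array for PIL and save it.
PIL.Image.fromarray(np.array(dream_img)).save("unstable_illusion.png")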
Let’s strengthen the patterns by running DeepDream at several scales (octaves): patterns found at small scales get reinforced and detailed at larger ones.
import time
start = time.time()

OCTAVE_SCALE = 1.30

img = tf.constant(np.array(original_img))
base_shape = tf.shape(img)[:-1]
float_base_shape = tf.cast(base_shape, tf.float32)

# Dream at five scales, from smaller to larger.
for n in range(-2, 3):
    new_shape = tf.cast(float_base_shape * (OCTAVE_SCALE ** n), tf.int32)
    img = tf.image.resize(img, new_shape).numpy()
    img = run_deep_dream_simple(img=img, steps=50, step_size=0.01)

display.clear_output(wait=True)
img = tf.image.resize(img, base_shape)
img = tf.image.convert_image_dtype(img / 255.0, dtype=tf.uint8)
show(img)

end = time.time()
end - start
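For denser or sparser hallucinations, the knobs to turn are OCTAVE_SCALE, the octave range, and the steps per octave (the values below are purely illustrative):

# Example: a wider octave range and more steps per octave give
# denser, more detailed patterns, at the cost of runtime.
img = tf.constant(np.array(original_img))
for n in range(-3, 4):  # wider than range(-2, 3) above
    new_shape = tf.cast(float_base_shape * (OCTAVE_SCALE ** n), tf.int32)
    img = tf.image.resize(img, new_shape).numpy()
    img = run_deep_dream_simple(img=img, steps=100, step_size=0.01)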
Conclusion
Spiritual-ish…