Some Boring Dall E2

I am legitimately having some issues with WordPress eating edits, so this is likely to be short and ill-informed.

Yesterday I ran into Don Feasel (https://donaldfeasel.com/) in the hall. We talked a little about the upcoming semester at SJSU, and I started talking about the Machine Learning class I am prepping for. I think I am going to start a short series where I try to do variations using Dall-E 2 on the work of painters I know, at least to start with. Here are a few of his images and the Dall-E 2 variations, shared with his permission, of course:

I am interested in abstraction here only partly because there is a lot of subjectivity about what makes a good image. Of course we have line, form, composition and half a dozen other features that I just don’t have a giant amount of experience with. I asked if he felt that the variations were soulless. Dall-E seems to capture a sense of the image, but detail is lacking. I am also noticing that Dall-E really favors symmetry.

Don’s work is in the upper left and has a distinct shape to the action of the work. The variations are much more gridlike, though they capture some of the energy.

The second image is interesting in the forms that are created, but Dall-E seems to be stuck with an X form. Don’s work has a much more dimensional quality and he uses the space of the image in a much more sophisticated way. The Dall-E variations seem to be variations on parts of the image and not the greater composition.

The Artist says: “These collected bodies of work trace a journey from more to less control. A growing sense of the futility of manipulating materials to produce expressive content consciously set me on this path. I enjoy constructing compositional processes that give me surprising results when they run their course. While painting I try to sustain a state of tension between knowing what I’m doing and being lost. I’m free to reject the efforts that don’t sustain my attention as every attempt pushes the boundaries of a self-imposed system. It provides the incentive to try again—as any gambler would play one more card.”

The artist’s work is far richer and more complex than the variations. How much longer until that gap closes?

Art Fart

I am using the variations on classic art, and I hope to see some aspects that go beyond mere style.

Did Duchamp steal this from the Baroness Elsa von Freytag-Loringhoven? Do I feel guilty using AI to create variations? This fulfills the prophecy. I love the porcelain but I love the background even more. Toilets for everyone!

Ad Reinhardt managed to take abstraction to its logical conclusion. Surprisingly, Dall-E caught something from the unphotographable image. Unsurprisingly, I think it comes from the extended edge of the image.

All black becomes all white, and though I have had my challenges with Malevich, the algorithm finds the most boring options for the variations. Unsuprematist.

Rothko broke Dall-E for the first round. I felt super guilty about cropping the image and so I cannot really get into the results. Same colors, though.

Have you heard about the painter Vincent Van Gogh,
Who loved color and who let it show.

This is the basest, most boring level of using these multi-billion-parameter language models, but I cannot help but be a little fascinated by it. Maybe it is about walking some of art history through this to see how it responds. I still hear in my mind responses to Malevich and Reinhardt. Good times.

Comparing Dall-E and Disco Diffusion

I have managed to spend some quality time with Disco Diffusion 5.2. Since the beginning of May I have paid Google $10 per month to host some Colab notebooks. I have run well over 100 prompts, many of them for 100-200 images each, and I have about 6000 created images in my drive. The process is excruciating, as I have to run a browser and not close the window during generation. If I chose to, I could pay $50 to be able to run it in the background. I have considered this, but I am tabling the idea as I have located some high performance computing on campus. The translation from Google Colab to Jupyter Notebooks is reportedly simple, but I am just not familiar enough with the structure to make it seamless. Anyhow, that is the cost of DD for me at present, and though it takes about 7-8 minutes to generate each image, it seems like a decent deal. I have started to create cards / tarot cards and am including an example:

“mystic surreal arcane tarot card”

Regardless, 6000 images satisfies my need to make lots of images with these algorithms.

Dall-E 2 (https://openai.com/dall-e-2/) is another beast entirely. It only runs on OpenAI’s cloud and costs about $15 for 115 “credits”, each of which can generate four images from an image or text prompt. Output is 1024×1024 and there is no tweaking the algorithm.
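To put the pricing in per-image terms, here is a minimal arithmetic sketch in Python. It uses only the figures quoted above ($15, 115 credits, four images per credit); the function names are made up for illustration, and actual pricing may differ.

```python
# Figures quoted in the post; the four-images-per-credit rate is the post's claim.
PACK_PRICE_USD = 15.00
CREDITS_PER_PACK = 115
IMAGES_PER_CREDIT = 4


def images_per_pack(credits, images_per_credit):
    """Total images one credit pack can produce."""
    return credits * images_per_credit


def cost_per_image(pack_price, credits, images_per_credit):
    """Price of a single generated image, in dollars."""
    return pack_price / images_per_pack(credits, images_per_credit)


if __name__ == "__main__":
    total = images_per_pack(CREDITS_PER_PACK, IMAGES_PER_CREDIT)
    each = cost_per_image(PACK_PRICE_USD, CREDITS_PER_PACK, IMAGES_PER_CREDIT)
    print(f"{total} images per pack, about ${each:.3f} per image")
```

By these numbers a pack covers 460 images at a little over three cents each.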

Dall E2 variation on the top of the previous image

Dall-E’s images are much cleaner and much more concise. They come off as legitimate illustrations of the text concepts, and creating variations off an existing image is just beautiful. As with Disco Diffusion, there is a certain amount of time and experimentation that goes into learning how to talk to the algorithm, and thankfully OpenAI provides a LOT of guidance on crafting prompts.

original and three variations by Dall E

Dall-E is quick and the visual variations are stunning, but as of this writing (8/22) it will not accept human faces as prompts, or as any part of the image, and it is fairly restrictive about what it will permit in terms of text prompts. Poking at a few of these, I have come to love the prompt of “erotic Cthulhu” as part of

(this post was eaten. Well half of it was and I need to go back and rewrite it. 8-8-22)

Not Lovecraft

Shadow of Aldeberan by Dall E2

I have long had a fascination with the works of HP Lovecraft, and though he has proven to have some pretty awful viewpoints, the ideas of existential dread and a universe filled with uncaring beings of immense power continue to be a powerful metaphor for human existence.

I fed all of the HPL texts that I could locate into a GPT-2 instance and am delighted with the progress. It is capable of creating small chunks of very believable text, and I cannot help but think of this as a great aid to writing. Perhaps I’ll try it out for the next NaNoWriMo.

The shade of the monument in the shadow of Aldeberan’s Grove is very thin, and marks the end of the line of mining. It is the great beacon of the Sabbat’s call to all the gods and to all men. In the distance the low walls of Arkham are crumbling, and the great hearths of Innsmouth are whispering with fright. A shadow covers the western slope of the hill, and the golden spires and domes of New Town tower afar off. Westward the faces of the valley slope, but still they stare black.

The first people were almost from the valley itself, and all that valley was the bane of their days. For over a century they had resented the intrusion of a garrulous people from the upper woods, and had regarded them as strangers and usurpers of their lands and liberties. Their ways were seldom listened to, and only by the prayers of their devoted priestesses did they attempt to erect a tower on the top of the hill. Their prayer was that no man should come within the walls of the city, for these walls were the tokens of their sovereignty and law.

shadow of aldeberan by gpt2

Files are here:

lovecraft-text – as one big zip file

lovecraft-text – folder .txt files

https://www.hplovecraft.com/writings/texts/

https://app.runwayml.com/models/rubaiyat/HP-Lovecraft-v2

This is not meant to be a full-on how-to, but a capture of the process I used to write the last piece, Shadow of Aldeberan.

The work was generated by GPT-2, running with RunwayML in their GPU cloud, from the first characters “The shade of the monument in the shadow of Aldeberan”.
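For anyone repeating this with the folder of .txt files, one plausible preparation step is merging them into a single training file. The sketch below is an assumption, not a record of what RunwayML required: the function name `build_corpus` is hypothetical, and using GPT-2's `<|endoftext|>` token as a delimiter between texts is a common convention that may or may not suit a given training tool.

```python
from pathlib import Path

# GPT-2's end-of-text token, assumed here as the delimiter between stories.
SEPARATOR = "<|endoftext|>"


def build_corpus(folder, out_file, separator=SEPARATOR):
    """Concatenate every non-empty .txt file in `folder` into one file.

    Returns the number of texts written. Write `out_file` somewhere that
    will not be picked up as an input on a later run.
    """
    written = 0
    paths = sorted(Path(folder).glob("*.txt"))
    with open(out_file, "w", encoding="utf-8") as out:
        for path in paths:
            text = path.read_text(encoding="utf-8", errors="replace").strip()
            if not text:
                continue  # skip empty files
            out.write(text + "\n" + separator + "\n")
            written += 1
    return written


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    count = build_corpus("lovecraft-text", "lovecraft-corpus.out")
    print(f"merged {count} texts")
```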

Working with GANs and Images v.1

GAN image generation:

The objective is to create new images based on my own copylefted files. So, start with this: 

Or, more precisely, about 200 images I took on a walk in the winter of 2021 in the Upper Peninsula of Michigan (https://ruby-yacht.github.io/miichgan/eskie/index.html). My software liked squares, so I took pictures in landscape and split each into two squares, making about 400 images, and fed them into a Generative Adversarial Network.
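The split into squares can be sketched as plain arithmetic. This is one possible way to do it, not necessarily how these images were cut: it takes a left-aligned and a right-aligned square, so the two overlap in the middle unless the photo is exactly 2:1. The function name is hypothetical; the box order (left, upper, right, lower) is the one Pillow's `Image.crop` accepts.

```python
def square_crop_boxes(width, height):
    """Return two (left, upper, right, lower) boxes for a landscape image.

    Each box is a height-by-height square: one flush with the left edge,
    one flush with the right edge.
    """
    if width < height:
        raise ValueError("expected a landscape image (width >= height)")
    side = height
    left_box = (0, 0, side, side)
    right_box = (width - side, 0, width, side)
    return left_box, right_box


if __name__ == "__main__":
    # A hypothetical 3:2 photo; the two squares share the middle 2000 pixels.
    for box in square_crop_boxes(6000, 4000):
        print(box)
```

With Pillow installed, each box could be passed to `image.crop(box)` to produce the square files.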

To create this: https://ruby-yacht.github.io/miichgan/miichGAN/index.html

This is image creation with my own source images. The higher the quality of the selected images, the narrower and more identifiable the images that are created. I like this dreamy, mashed-up quality.

I can see the inspiration from the source in the constructed images and I look forward to manipulating this by selecting images more specifically.

To process this, I used Runway ML, NVIDIA’s GPU cloud and about $50. Runway handled all of the code and let me work with the data. This is my goal for student use: the details are less important, but will be available. I’ve made the model public if you want to browse the latent space: https://app.runwayml.com/models/rubaiyat/eskie-21

Here is another example with more limited images. I decided to cheat with video, and the impact was significant but also more boring: https://ruby-yacht.github.io/miichgan/springan1/index.html and the model: https://app.runwayml.com/models/rubaiyat/spring-lake

Art and the Future, a Disco Diffusion experience

I have been playing with Disco Diffusion 5.2 for about a month now. A friend of the library connected me to Colab notebooks, which pretty much astonished me with their ease of use and the simple face they put on something like Python. I have worked with neural networks before and found it interesting to engage with this as an artist. I am listing a few of the experiments that I have made over the last 30 days. Totally worth the $10 that I paid Google. I am going to try to include the prompts I used because I think some of the interpretations are interesting. Here too I have frequently found that small updates in the language I use can have a positive impact on the images.

“erotic cthulhu at a party in 4 dimensional city” draw me like one of your elder things… I quite like Heiny in the tentacle.
“an artist at San Jose State University” found images that I was familiar with, from when the blue construction walls were up and the dirty brushes club had painted murals on them. I am convinced I could find the original pictures for this one but haven’t looked for them.
"an old photograph of a crow during a satanic ritual" was an attempt to work with an image style, but it worked really well with the subject. I was slow to put the satanic ritual in, but pleased with the result.
"A page out of a demonic black magic spell book with diagrams." Putting a little emphasis on occult topics and diagrams starts to hit some interesting synergies in that the text is so language like but unreadable. Very inspirational.
"There is a fifth dimension, beyond that which is known to man. It is a dimension as vast as space and as timeless as infinity. It is the middle ground between light and shadow, between science and superstition, and it lies between the pit of man's fears and the summit of his knowledge." I was curious how much of the feel of the show would come out of the intro. Wow, all of these images fit very well, I am not sure how it aligned with this.
"the last supper but in minecraft" I was playing around with styles and found it interesting that it could catch a good deal of the feel of Minecraft but still miss some essential qualities.
"Seven Hundred Step to the Dreamlands" This is an attempt to align with some fiction that I am working around and think about illustration in the RPG game space. This is Lovecraft dream cycle inspired and though I don't think it fully lands, much of it is usable.
"Death drinking a boba tea." This was a response to a local boba shop closing and the threat of replacing illustrators and illustration with generative art. This took about tries and I was quite happy with the result.
"A videogame about dreams in an 8 bit RGB style." For this one I went down to 512x512 and am interested in going even lower with the resolution. There are some 8-bit and 16-bit elements. This is a whole project waiting for me.
"three cards from a strange, dreamy tarot deck" "a cards from a future tarot deck" I ran a bunch of these and think there is a great project here, but it will have to be guided a bit. 78 images with meaning?! I like the quality of the images and the dreamy quality, though I guess that was also part of the prompt.
"a boardgame board from the future" I noticed that the algorithm seemed to work well with these sorts of discrete collections. Some of these look playable.
"kleptogenesis" I just love the word and the concept. I ran another with the definition of kleptogenesis, but I didn't think the images were as interesting.
"a can of beer with a picture of a computer on the label" Okay, I should make some beer.
"a movie poster of the queen of fairies" I love the perversion of the figure and the insect qualities, really fey.

I haven’t finished with DD, I have a few projects that I plan to do. At least one revolves around the multiplicity of the latent space and the meaninglessness in the fake language that comes out of it. Now all I need to do is produce it. I’ll post when it rolls off.

found my place in Escanaba

my place

I remember being a young gothy kid who hung out in graveyards because they were cool, and I find it really compelling to have seen this stone from the road and had it bring me into the cemetery. Shortly after I saw this, I started circling the cemetery on my walks, and I am sure the only reason I didn’t dive in sooner was the whole cliche of the thing.

As it stands this is the biggest park inside the town and the gates are open most of the time (I have never seen them closed). I think Every Day the Same Dream called this “a quiet place” and that sticks with me even now. It is quiet.

I fear that if I were to stay here, the only option for me is the grave.

dark moments

from the sea
looking back
from the stars

A little triptych, put together under the influence of the onslaught of spring in the UP and a desire to make art. These have been printed.


Spring Lake Nuance

https://ruby-yacht.github.io/miichgan/springan1/images/index.html

I am looking for a certain nuance in the images, and as the seasons change I have a strong interest in capturing that. These images do not quite get there. I used a bit of a different method for generating the images: shooting 3 short 1080p videos, stringing them together and exporting the frames. I posted the video, which is embarrassingly bad (https://www.youtube.com/watch?v=inbXrJForl8), and made it available for download as well. This process was much faster than shooting 200 images, and it made the images very similar to one another. I think I had an initial FID score of 88 after about 2000 steps, but the images also felt a little off, so I ran another 2k. The score went up to 109, which by FID is worse since lower is better, but the images felt a little better. The latent space walk video is pretty boring, except for perhaps the texture of the water (https://www.youtube.com/watch?v=axBomAHiUGk).
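For readers new to the term, a latent space walk is usually a sequence of latent vectors that moves gradually from one point to another, with each vector rendered by the trained generator as one video frame. The sketch below shows only the interpolation step, using straight linear interpolation. How RunwayML builds its walks is not documented here, so treat this as an illustration: the 512-value latent size is an assumption, and the generator itself is not shown.

```python
import random


def lerp(a, b, t):
    """Linear interpolation between two equal-length vectors, t in [0, 1]."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def latent_walk(start, end, steps):
    """Return `steps` latent vectors evenly spaced from `start` to `end`."""
    if steps < 2:
        raise ValueError("a walk needs at least two steps")
    return [lerp(start, end, i / (steps - 1)) for i in range(steps)]


if __name__ == "__main__":
    rng = random.Random(0)
    dim = 512  # assumed latent size; the real model may differ
    z0 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    z1 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    frames = latent_walk(z0, z1, steps=60)
    # Each entry in `frames` would be fed to the generator to render one frame.
    print(len(frames), len(frames[0]))
```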

Visually this feels like composition. I am thinking that if I can dial in my images I can start to predict the output a little better. There are some interesting patterns emerging because of the trees, which I expect to see budding soon, and I am looking forward to the next comparison. Keep an eye out for springan2.

Here is my project page, and I am happy to share images if there is a desire.

https://ruby-yacht.github.io/miichgan/springan1/index.html