13 votes

AI, Stable Diffusion, Models and Prompts

Howdy Tildes wizards.

I decided to have a looksy at Stable Diffusion on my local computer (Manjaro, AMD 7500x CPU, 32GB) using Easy Diffusion. I've gotten my head around the basics, grabbed a MidJourney V4-style model, and now I'm learning how to prompt.

So far I've generated some cool cyberpunk cyborg things, landscapes, etc. One of the things I wanted to use Stable Diffusion for is generating silhouettes. Sounds weird, I know, but they're great to use with decal and vinyl printing for my wife's business.

Any ideas on ways to do silhouette generation?

Next: what's good to read to learn about model types and what all of the settings really do?

I'm ordering a GPU (3060) to improve the horrendous render times, so don't worry about the underpowered rig; I'm still in toy mode.

6 comments

  1. cold_porridge

    For sure check out civitai for rendering models to play around with. Most of them are designed to produce different styles of women to be frank, but a lot aren't and are very good.

    Read up a bit on unfamiliar parameters; in particular, you'll want to know about noise schedulers and guidance scale.

    Toy around with inference steps (roughly correlated with image quality, at the cost of speed).

    When you're ready, start playing with inpainting for finer control.
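
    If you ever want to script any of this, here's roughly where those knobs live in Hugging Face's diffusers library (a minimal sketch; the checkpoint name and values are just examples, not recommendations):

    import torch
    from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler
    
    # Load any Stable Diffusion 1.5-compatible checkpoint
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")
    
    # Swap the noise scheduler (the "sampler" in most UIs); each one trades
    # speed against quality a little differently
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    
    image = pipe(
        "cyberpunk cyborg, detailed, dramatic lighting",
        guidance_scale=7.5,      # how strongly the image is pulled toward the prompt
        num_inference_steps=30,  # more steps = more refinement, slower render
    ).images[0]
    image.save("out.png")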

    And finally, regarding silhouettes - unless you find a particular model trained for it, I suspect this is best done with a little manual labor atop Stable Diffusion. I know that's not the most fun answer, but I believe it's probably the best one. Use Photoshop or equivalent to break an image down into its component shapes, then turn those shapes black. If you're lucky, though, you might be able to specify "in the style of a silhouette" and get solid results. If you try, let me know how it goes.
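
    If you'd rather script that manual step, a plain luminance threshold with OpenCV gets you most of the way on renders with a clean, bright background (a quick sketch; the filename and cutoff are placeholders to tune):

    import cv2
    
    # Load the render as grayscale
    image = cv2.imread("render.png", cv2.IMREAD_GRAYSCALE)
    
    # Pixels brighter than the cutoff become white (the background),
    # everything darker becomes black (the silhouette); tune per image
    _, silhouette = cv2.threshold(image, 200, 255, cv2.THRESH_BINARY)
    cv2.imwrite("silhouette.png", silhouette)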

    Most importantly, have fun with image generation you can run on your own computer. Hope that helps!

    8 votes
  2. kru

    You piqued my curiosity and I did a few quick prompt tests. I got some pretty good silhouette generations with the prompt:

    "silhouette, flat 2d style, blank white background,
    <content prompt>"

    where <content prompt> is replaced by what type of shape I'm hoping to see. Some prompts I tried were:

    "silhouette, flat 2d style, blank white background,
    action shot, warrior, sword"
    and I got some nice silhouettes of warriors holding swords. The dark sushi mix model also tended to give some nice "double exposure" shots along with pure silhouettes.

    "silhouette, flat 2d style, blank white background,
    basketball, dribbling, hoop, sneakers"
    which gave images that could be put next to a Nike logo.

    I tried 3 different models - an anime, a cartoon, and a photoreal model - and each provided some good results:
    dark sushi mix,
    cartunafied,
    aZoyvaPhotoreal v2
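
    Since the template is fixed, it's also trivial to script the substitution and batch-try subjects (a throwaway sketch):

    # Fill the fixed template with different content prompts
    TEMPLATE = "silhouette, flat 2d style, blank white background, {content}"
    
    for content in ["action shot, warrior, sword", "basketball, dribbling, hoop, sneakers"]:
        print(TEMPLATE.format(content=content))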

    4 votes
  3. DataWraith

    I haven't tried generating silhouettes with Stable Diffusion directly, but I've recently played with Facebook's Segment Anything Model, which might fit the bill.

    If you're familiar with Python, or willing to work through the hassle of installing the required libraries (PyTorch, etc.), this could be an alternative for generating silhouettes.

    Segment Anything works fairly well for separating a given subject or object from the background in arbitrary images. You pass in an image and the coordinates of a single pixel (ideally somewhere near the center of the thing you're trying to silhouette), and the neural network gives you back a mask of the object at that position. Turning that into a silhouette is fairly simple once you've installed the required libraries:

    Python code for silhouetting
    import cv2
    import numpy as np
    import torch
    from segment_anything import SamPredictor, sam_model_registry
    
    # Read the input image
    image = cv2.imread("dragon.jpg")
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    
    # Select a pixel coordinate from inside the object you want to segment
    input_point = np.array([[512, 512]])
    input_label = np.array([1])
    
    # Create the model (checkpoint must have been downloaded into the current folder)
    sam = sam_model_registry["default"](checkpoint="sam_vit_h_4b8939.pth")
    
    # Move it to the GPU if possible
    device = "cuda" if torch.cuda.is_available() else "cpu"
    sam.to(device)
    
    # Feed the image into the model
    predictor = SamPredictor(sam)
    predictor.set_image(image)
    
    # Predict masks
    masks, _, _ = predictor.predict(
        point_coords=input_point, point_labels=input_label, multimask_output=True
    )
    
    # Convert masks to PNGs and save them as "mask_{1,2,3}.png"
    for i, mask in enumerate(masks):
        height, width = mask.shape[-2:]
    
        color = np.array([0, 0, 0, 255]) # Black
    
        # Convert mask into pixels
        mask_image = np.clip(
            mask.reshape(height, width, 1) * color.reshape(1, 1, -1), a_min=0, a_max=255
        ).astype(np.uint8)
    
        # Fill in the transparent parts (everything outside the mask) with white;
        # boolean indexing on the alpha channel does this in one step
        mask_image[mask_image[:, :, 3] == 0] = [255, 255, 255, 255]  # White
    
        # Save the image
        cv2.imwrite(f"mask_{i+1}.png", mask_image)
    

    For a 1024x1024 image, this takes about a minute to process without a GPU.

    I've applied it to an image I've generated with Stable Diffusion XL as an example: https://imgur.com/a/IrZdkSE.

    3 votes
  4. [3]
    Greg

    > I'm ordering a GPU (3060) to improve the horrendous render times, so don't worry about the underpowered rig; I'm still in toy mode.

    Worth mentioning that Google Cloud will give you $300 of free trial credits for signing up - at spot prices that’s about 900 hours on a g2-standard-8, which has an L4 GPU - roughly equivalent to a 3090 for this work. You can spin up an instance just for the time you need on a given day, so those hours can go a fairly long way.

    You’ll definitely need to have (or learn) a bit of familiarity with SSH to set things up, but I figure given you’re a Linux user looking to self-host ML models that might not be too much of a stretch?

    2 votes
    1. [2]
      DataWraith

      There's also the option of using volunteer GPUs from the AI Horde, a compute grid where people donate GPU time to others who want to run Stable Diffusion or LLMs. It's slightly gamified, so people with more Kudos have priority for generation, but, except on weekends, it will usually return an image within a few minutes.
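
      The REST API is simple enough to script, too. Here's a rough sketch of the async flow with the anonymous API key (endpoint and field names are from my reading of the docs at https://stablehorde.net - double-check the current schema):

      import time
      import requests
      
      API = "https://stablehorde.net/api/v2"
      HEADERS = {"apikey": "0000000000"}  # anonymous key; lowest queue priority
      
      # Submit an async generation request
      job = requests.post(
          f"{API}/generate/async",
          headers=HEADERS,
          json={"prompt": "silhouette, flat 2d style, blank white background, warrior"},
      ).json()
      
      # Poll until a volunteer GPU has picked up and finished the job
      while not requests.get(f"{API}/generate/check/{job['id']}").json()["done"]:
          time.sleep(5)
      
      # Fetch the finished generation(s)
      result = requests.get(f"{API}/generate/status/{job['id']}").json()
      print(result["generations"][0]["img"])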

      3 votes
      1. Greg

        Oh that's cool, it's one I hadn't come across before - always makes me smile to see people pitching in like that!

        1 vote