perceptor.models

perceptor.models.CLIP(architecture: str, precision: Optional[str] = None)[source]
Parameters
  • architecture (str) – name of the CLIP model. Available models: RN50 [-quickgelu], RN101 [-quickgelu], RN50x4, RN50x16, RN50x64, ViT-B-32 [-quickgelu], ViT-B-16, ViT-L-14, ViT-L-14-336px

  • precision (str) – precision of the model. Options are “fp32” and “fp16”

class perceptor.models.DeepImagePrior(shape=(128, 256, 256), offset_type='none', n_scales=2, sigmoid=True, decorrelate_rgb=True, output_channels=3)[source]

Bases: Module

property device
forward(latents: InheritTensor) → InheritTensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the forward pass is defined in this function, call the Module instance itself rather than forward() directly: calling the instance runs the registered hooks, whereas calling forward() silently skips them.
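
The difference between calling the instance and calling forward() directly can be illustrated with a minimal sketch. TinyModule below is a hypothetical, heavily simplified stand-in for torch.nn.Module (the real class passes the inputs to hooks as a tuple and supports several other hook types); it only shows why hooks run in one case and not in the other.

```python
class TinyModule:
    """Hypothetical, simplified stand-in for torch.nn.Module (illustration only)."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Hooks are called as hook(module, input, output) after forward().
        self._forward_hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        # Calling the instance wraps forward() and then runs the hooks.
        output = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, output)
        return output


calls = []
module = TinyModule()
module.register_forward_hook(lambda m, inp, out: calls.append((inp, out)))

result_call = module(3)             # runs forward() and the hook
result_forward = module.forward(3)  # runs forward() only, the hook is skipped
```

Both calls return the same value, but only the first one is recorded by the hook.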

fourier_latents(size=1, n_channels=None, min_log2_frequency=0.0, max_log2_frequency=9.0, log2_space=False)[source]
property height
property input_channels
noisy_image_latents(images, n_channels=None, log_snr=-1.0)[source]
non_offset_parameters()[source]
offset_parameters()[source]
parameter_dicts(learning_rate)[source]
random_latents(size=1, n_channels=None)[source]
training: bool
property width
class perceptor.models.GuidedDiffusion(name='standard')[source]

Bases: Module

alphas(indices) → Tensor[source]
property device
diffuse_images(denoised_images, indices, noise=None) → Tensor[source]
forward(diffused_images, indices) → Predictions[source]

indices(indices) → Tensor[source]
predicted_noise(diffused_images, from_indices) → Tensor[source]
predictions(diffused_images, indices) → Predictions[source]
random_diffused(shape) → Tensor[source]
schedule_indices(n_steps=500, from_index=999, to_index=0, rho=7.0) → Tensor[source]
sigmas(indices) → Tensor[source]
training: bool
class perceptor.models.MonsterDiffusion(name='all')[source]

Bases: Module

static alphas(ts)[source]
c_in(ts)[source]
c_noise(ts)[source]
c_out(ts)[source]
c_skip(ts)[source]
denoised_(diffused_images: InheritTensor, ts: InheritTensor, nonleaky_augmentations: Optional[InheritTensor] = None) → InheritTensor[source]

Parameterization from https://arxiv.org/pdf/2206.00364.pdf
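
As a rough illustration of that parameterization, the sketch below writes out the preconditioning coefficients from the cited paper as functions of the noise level sigma. It is not taken from perceptor: the methods above take ts rather than sigmas, and how ts map to sigmas is not shown here. SIGMA_DATA = 0.5 is the paper's default and is an assumption for this library; the function names only mirror the method names for readability.

```python
import math

SIGMA_DATA = 0.5  # assumption: data standard deviation used in the paper


def c_skip(sigma):
    # Weight of the skip connection (the noisy input itself).
    return SIGMA_DATA ** 2 / (sigma ** 2 + SIGMA_DATA ** 2)


def c_out(sigma):
    # Scale applied to the raw network output.
    return sigma * SIGMA_DATA / math.sqrt(sigma ** 2 + SIGMA_DATA ** 2)


def c_in(sigma):
    # Scale applied to the network input so that it has unit variance.
    return 1.0 / math.sqrt(sigma ** 2 + SIGMA_DATA ** 2)


def c_noise(sigma):
    # Noise-level conditioning passed to the network.
    return math.log(sigma) / 4.0


def denoised(network, diffused, sigma):
    """D(x; sigma) = c_skip * x + c_out * F(c_in * x, c_noise)."""
    raw = network(c_in(sigma) * diffused, c_noise(sigma))
    return c_skip(sigma) * diffused + c_out(sigma) * raw
```

Here network is any callable taking the scaled input and the noise conditioning. At very small sigma the skip term dominates and the denoised value approaches the input.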

property device
static diffuse(images: InheritTensor, ts, noise=None)[source]
elucidated_sample(size, n_evaluations=100, progress=False, diffused_images=None)[source]

Elucidated stochastic sampling from https://arxiv.org/pdf/2206.00364.pdf
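
A minimal one-dimensional sketch of the stochastic sampler described in the cited paper follows. It is not the perceptor implementation. Assumptions: the noise schedule uses the paper's defaults (sigma_min=0.002, sigma_max=80, rho=7), the churn amount s_churn=10 is arbitrary, the paper's S_tmin/S_tmax window is omitted so churn is applied at every step, and a closed-form denoiser for Gaussian toy data stands in for a trained network. All function names are hypothetical.

```python
import math
import random

SIGMA_DATA = 0.5  # assumed std of the toy data distribution N(0, SIGMA_DATA^2)


def karras_sigmas(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, followed by a final 0."""
    inv = 1.0 / rho
    sigmas = [
        (sigma_max ** inv + i / (n_steps - 1) * (sigma_min ** inv - sigma_max ** inv)) ** rho
        for i in range(n_steps)
    ]
    return sigmas + [0.0]


def ideal_denoiser(x, sigma):
    """Exact denoiser for N(0, SIGMA_DATA^2) data; stands in for a trained network."""
    return x * SIGMA_DATA ** 2 / (SIGMA_DATA ** 2 + sigma ** 2)


def stochastic_sample(n_steps=40, s_churn=10.0, s_noise=1.0, rng=random):
    sigmas = karras_sigmas(n_steps)
    x = rng.gauss(0.0, 1.0) * sigmas[0]
    gamma = min(s_churn / n_steps, math.sqrt(2.0) - 1.0)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        # Temporarily raise the noise level ("churn") before stepping down.
        sigma_hat = sigma * (1.0 + gamma)
        x = x + math.sqrt(sigma_hat ** 2 - sigma ** 2) * s_noise * rng.gauss(0.0, 1.0)
        # Euler step from sigma_hat to sigma_next.
        d = (x - ideal_denoiser(x, sigma_hat)) / sigma_hat
        x_next = x + (sigma_next - sigma_hat) * d
        if sigma_next > 0.0:
            # Second-order (Heun) correction using the slope at the new point.
            d_next = (x_next - ideal_denoiser(x_next, sigma_next)) / sigma_next
            x_next = x + (sigma_next - sigma_hat) * 0.5 * (d + d_next)
        x = x_next
    return x
```

Drawing many samples, for example with rng=random.Random(0), should give values whose standard deviation is close to SIGMA_DATA, since the toy denoiser is exact for that distribution.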

forward(diffused_images: InheritTensor, ts: InheritTensor, nonleaky_augmentations: Optional[InheritTensor] = None)[source]

static gamma(ts, n_steps)[source]
inject_noise(diffused_images, ts, reversed_ts)[source]
static linear_multistep_coeff(order, sigmas, from_index, to_index)[source]
linear_multistep_sample(size, n_evaluations=100, progress=False, diffused_images=None, order=4)[source]

Katherine Crowson’s linear multistep method from https://github.com/crowsonkb/k-diffusion/blob/4fdb34081f7a09f16c33d3344a042e5bea8e69ee/k_diffusion/sampling.py
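
The core of a linear multistep method is the set of coefficients that weight the stored derivatives from previous steps. The sketch below follows the structure of the coefficient function in the linked k-diffusion file, but it is not copied from perceptor, the name lms_coefficient is hypothetical, and it swaps the SciPy quadrature used there for Simpson's rule to stay dependency-free. Simpson's rule is exact for polynomials up to degree 3, so the sketch is limited to order 4 and below.

```python
def lms_coefficient(order, sigmas, i, j):
    """Weight of the derivative from step i - j when stepping from sigmas[i] to sigmas[i + 1].

    Integrates the j-th Lagrange basis polynomial through the last `order`
    noise levels over the interval [sigmas[i], sigmas[i + 1]].
    """
    if order > 4:
        raise ValueError("Simpson's rule is only exact up to order 4")
    if order - 1 > i:
        raise ValueError(f"order {order} needs {order - 1} earlier steps, but step index is {i}")

    def basis(tau):
        product = 1.0
        for k in range(order):
            if k != j:
                product *= (tau - sigmas[i - k]) / (sigmas[i - j] - sigmas[i - k])
        return product

    a, b = sigmas[i], sigmas[i + 1]
    # Simpson's rule over [a, b].
    return (b - a) / 6.0 * (basis(a) + 4.0 * basis((a + b) / 2.0) + basis(b))
```

A sampler step would then add the sum over j of lms_coefficient(order, sigmas, i, j) times the derivative stored at step i - j to the current sample. With order 1 the single coefficient is sigmas[i + 1] - sigmas[i], which reduces to an Euler step.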

predictions(diffused_images: InheritTensor, ts: InheritTensor, nonleaky_augmentations: Optional[InheritTensor] = None)[source]
predictions_(diffused_images: InheritTensor, ts: InheritTensor, nonleaky_augmentations: Optional[InheritTensor] = None)[source]
static random_noise(size)[source]
static reversed_ts(ts, n_steps)[source]
sample(size, n_evaluations=100, progress=False, diffused_images=None)[source]
schedule_ts(n_steps)[source]
static sigmas(ts)[source]
training: bool
static training_ts(size)[source]
class perceptor.models.SimulacraAesthetic(model_name='ViT-B-32')[source]

Bases: Module

forward(images)[source]

training: bool
class perceptor.models.StableDiffusion(name: str = 'runwayml/stable-diffusion-v1-5', decoder_name: Optional[str] = 'stabilityai/sd-vae-ft-mse', fp16: bool = True, auth_token: Union[bool, str] = True, flash_attention: bool = True, attention_slicing: Optional[Union[int, Literal['auto']]] = None)[source]

Bases: Module

alphas(indices) → Tensor[source]
conditioning(texts: List[str] = [''], inpainting_masks: Optional[InheritTensor] = None, inpainting_images: Optional[InheritTensor] = None, mask_blur=4.0) → Conditioning[source]

Create a conditioning object from a list of texts. An empty string gives unconditional conditioning.

Parameters
  • texts – A list of texts to condition on. Use an empty string for unconditional conditioning

  • inpainting_masks – A tensor of masks to condition on. Must be 1-channel with values between 0 and 1

  • inpainting_images – A tensor of images to condition on. Must be 3-channel with values between 0 and 1

decode(latents: InheritTensor) → InheritTensor[source]
property device
diffuse_latents(denoised_latents, indices, noise=None) → Tensor[source]
encode(images: InheritTensor, method='mode') → InheritTensor[source]
finetuneable_vae()[source]
with diffusion_model.finetuneable_vae():
    images = diffusion_model.decode(latents)

forward(diffused_latents: Tensor, indices: Tensor, conditioning: Optional[Conditioning] = None) → Predictions[source]

images(latents: InheritTensor) → InheritTensor[source]
indices(indices) → Tensor[source]
latent_masks(masks, blur)[source]
latents(images: InheritTensor) → InheritTensor[source]
predicted_noise(diffused_latents, from_indices, conditioning: Conditioning) → Tensor[source]
predictions(diffused_latents, indices, conditioning) → Predictions[source]
random_diffused_latents(shape) → Tensor[source]
sample(text: str, from_index: int = 999, to_index: int = 0, n_steps: int = 50, guidance_scale: float = 7.0, n_resample: int = 0, init_image: Optional[Tensor] = None, inpainting_mask: Optional[Tensor] = None, mask_blur: float = 4.0, replace_diffused: bool = True)[source]

Helper function to sample a single image.

Parameters
  • text – The text to condition on

  • from_index – The index to start sampling from

  • to_index – The index to end sampling at

  • n_steps – The number of steps to take between from_index and to_index

  • guidance_scale – The scale of the guidance signal

  • n_resample – The number of times to resample at each step

  • init_image – The initial image to start sampling from (also used for inpainting)

  • inpainting_mask – The mask to use for inpainting

  • mask_blur – The amount of blur to apply to the inpainting mask

  • replace_diffused – Whether to replace the diffused latents at each step (peeks into the init image so it’s not true inpainting)
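
To give a feel for what guidance_scale does, here is a minimal sketch that assumes it follows the usual classifier-free guidance formulation, in which the unconditional noise prediction is extrapolated towards the text-conditioned one. Whether sample() combines the predictions in exactly this way is an assumption; guided_noise is a hypothetical helper, and plain lists of floats stand in for tensors.

```python
def guided_noise(unconditional, conditional, guidance_scale=7.0):
    """Classifier-free guidance on two noise predictions given as lists of floats.

    guidance_scale = 0 returns the unconditional prediction, 1 returns the
    conditional one, and larger values extrapolate past the conditional one.
    """
    return [
        u + guidance_scale * (c - u)
        for u, c in zip(unconditional, conditional)
    ]


unconditional = [0.5, -1.0]
conditional = [1.5, -1.0]
guided = guided_noise(unconditional, conditional, guidance_scale=7.0)
```

Elements where the two predictions agree are left unchanged, while differences between them are amplified by the scale.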

schedule_indices(n_steps=500, from_index=999, to_index=0, rho=3.0) → Tensor[source]
property shape
sigmas(indices) → Tensor[source]
text_encodings(texts)[source]
training: bool
class perceptor.models.TransformersOpenAICLIP(name='openai/clip-vit-large-patch14', bfloat16=True)[source]

Bases: Module

property device
encode_images(images) → Encodings[source]
encode_texts(texts) → Encodings[source]
forward(_)[source]

static spherical_distance(encodings_a: Encodings, encodings_b: Encodings) → Tensor[source]
tokenize(texts)[source]
training: bool
class perceptor.models.VelocityDiffusion(name='yfcc_2')[source]

Bases: Module

alphas(ts) → Tensor[source]
conditioning(texts=None, images=None, encodings=None) → Tensor[source]
property device
diffuse(denoised_images, ts, noise=None) → Tensor[source]
forward(diffused_images, ts, conditioning=None) → Predictions[source]

inject_noise(diffused_images, ts, reversed_ts, extra_noise_multiplier=1.003) → Tensor[source]
predictions(diffused_images, ts, conditioning=None) → Predictions[source]
random_diffused(shape) → Tensor[source]
static schedule_ts(n_steps=500, from_ts=1.0, to_ts=0.01, rho=7.0) → Tensor[source]
property shape
sigmas(ts) → Tensor[source]
static sigmas_to_ts(sigmas) → Tensor[source]
to(device)[source]

Moves and/or casts the parameters and buffers.

This can be called as

to(device=None, dtype=None, non_blocking=False)[source]
to(dtype, non_blocking=False)[source]
to(tensor, non_blocking=False)[source]
to(memory_format=torch.channels_last)[source]

Its signature is similar to torch.Tensor.to(), but it only accepts floating point or complex dtypes. In addition, this method only casts the floating point or complex parameters and buffers to dtype (if given). Integral parameters and buffers are moved to device, if one is given, but their dtypes are left unchanged. When non_blocking is set, it tries to convert or move asynchronously with respect to the host if possible, e.g., when moving CPU Tensors with pinned memory to CUDA devices.

See below for examples.

Note

This method modifies the module in-place.

Parameters
  • device (torch.device) – the desired device of the parameters and buffers in this module

  • dtype (torch.dtype) – the desired floating point or complex dtype of the parameters and buffers in this module

  • tensor (torch.Tensor) – Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module

  • memory_format (torch.memory_format) – the desired memory format for 4D parameters and buffers in this module (keyword only argument)

Returns

self

Return type

Module

Examples:

>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
training: bool
velocities(diffused, t, conditioning=None) → Tensor[source]