In this post, I want to give a shout-out to the HighCWu/FLUX.1-dev-4bit model. Its documentation made it very easy to adapt to Google Colab with ControlNet. For the ControlNet, we are going to use InstantX/FLUX.1-dev-Controlnet-Union, a comprehensive ControlNet that supports several control types in a single model.
We start by cloning the repository.
!git clone https://github.com/HighCWu/flux-4bit
We change the directory to the cloned repository and install the requirements.
%cd flux-4bit
!pwd
!pip install -r requirements.txt
At the time of writing this post, the official diffusers release on pip doesn't include the ControlNet pipelines for Flux, so we install diffusers directly from its GitHub repository.
!pip install git+https://github.com/huggingface/diffusers.git
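Since the note above is tied to the time of writing, it can be worth checking whether the diffusers version already installed exposes the pipeline before installing from source. The following is a minimal sketch of such a check; `has_flux_controlnet` is a hypothetical helper name, and the only thing it assumes is the class name `FluxControlNetPipeline` that this post imports later.

```python
def has_flux_controlnet():
    """Return True if the installed diffusers exposes FluxControlNetPipeline."""
    try:
        import diffusers
    except Exception:
        # diffusers is missing or failed to import.
        return False
    try:
        # hasattr only swallows AttributeError, so any other error raised
        # while diffusers resolves the attribute is caught here.
        return hasattr(diffusers, "FluxControlNetPipeline")
    except Exception:
        return False


print(has_flux_controlnet())
```

If this prints True, the install from GitHub can probably be skipped. If it prints False, run the install command above and restart the runtime.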
We need to run the model file, since it contains the custom definitions for T5EncoderModel and FluxTransformer2DModel.
%run /content/flux-4bit/model.py
Now we import the required libraries and load the Flux model components and the ControlNet.
import torch
# from diffusers import FluxPipeline
from diffusers.utils import load_image
from diffusers import FluxControlNetPipeline, FluxControlNetModel
text_encoder_2: T5EncoderModel = T5EncoderModel.from_pretrained(
    "HighCWu/FLUX.1-dev-4bit",
    subfolder="text_encoder_2",
    torch_dtype=torch.bfloat16,
    # hqq_4bit_compute_dtype=torch.float32,
)
transformer: FluxTransformer2DModel = FluxTransformer2DModel.from_pretrained(
    "HighCWu/FLUX.1-dev-4bit",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
)
controlnet = FluxControlNetModel.from_pretrained(
    'InstantX/FLUX.1-dev-Controlnet-Union',
    torch_dtype=torch.bfloat16
)
pipe = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    text_encoder_2=text_encoder_2,
    controlnet=controlnet,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
For the free Colab environment, we need to use the enable_model_cpu_offload() function to run the model without consuming too much VRAM.
pipe.enable_model_cpu_offload()
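On other Colab environments with more VRAM, we can use pipe.to('cuda') to run the whole model on the GPU. Calling remove_all_hooks() first clears any offloading hooks left over from an earlier enable_model_cpu_offload() call.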
pipe.remove_all_hooks()
pipe.to('cuda')
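The choice between the two options above comes down to how much VRAM the runtime has. Here is a minimal sketch of making that choice in code. The helper name `pick_device_strategy` and the 20 GiB threshold are assumptions for illustration, not measured values, so adjust the threshold to what works on your runtime.

```python
def pick_device_strategy(total_vram_bytes, min_full_gpu_gib=20.0):
    """Return 'cuda' if the GPU looks large enough, else 'cpu_offload'.

    min_full_gpu_gib is an assumed threshold, not a measured requirement.
    """
    total_gib = total_vram_bytes / 1024**3
    return "cuda" if total_gib >= min_full_gpu_gib else "cpu_offload"


# Example with two made-up GPU sizes: 15 GiB and 40 GiB.
print(pick_device_strategy(15 * 1024**3))  # cpu_offload
print(pick_device_strategy(40 * 1024**3))  # cuda
```

In the notebook, the real value can be read with torch.cuda.get_device_properties(0).total_memory. Depending on the result, call either pipe.enable_model_cpu_offload() or pipe.to('cuda').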
Let's run a simple image generation task using a Canny edge image as the control input.
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/canny.jpg")
controlnet_conditioning_scale = 0.5
control_mode = 0
width, height = control_image.size
prompt = 'A realistic-style, raytraced, female travel blogger with sun-kissed skin and messy beach waves.'
image = pipe(
    prompt,
    control_image=control_image,
    control_mode=control_mode,
    width=width,
    height=height,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    num_inference_steps=24,
    guidance_scale=3.5,
).images[0]
image.save("image.jpg")
image
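The control_mode = 0 used above selects the Canny mode of the Union ControlNet. To try other control types, a small lookup table is easier to read than bare integers. The sketch below uses the indices as I understand them from the InstantX/FLUX.1-dev-Controlnet-Union model card, so treat them as an assumption and verify them against the card. `CONTROL_MODES` and `control_mode_for` are hypothetical names.

```python
# Mode indices assumed from the InstantX/FLUX.1-dev-Controlnet-Union model card;
# verify them against the card before relying on them.
CONTROL_MODES = {
    "canny": 0,
    "tile": 1,
    "depth": 2,
    "blur": 3,
    "pose": 4,
    "gray": 5,
    "lq": 6,
}


def control_mode_for(name):
    """Look up the integer control mode for a control type name."""
    try:
        return CONTROL_MODES[name.lower()]
    except KeyError:
        valid = ", ".join(CONTROL_MODES)
        raise ValueError(f"Unknown control type {name!r}; expected one of: {valid}") from None


print(control_mode_for("canny"))  # 0
```

With this in place, control_mode = control_mode_for("depth") could replace the hard-coded integer, provided the control image is a depth map.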
Flux with ControlNet is a very powerful combination, and we can use it for a variety of image generation tasks. Feel free to follow me on social media for more updates and posts.