42 votes

Fooocus - The most user-friendly local image-gen interface to date

15 comments

  1. teaearlgraycold
    Link

    I've been using this and it's pretty fast on my 3080 Ti (20s for 1024x1024 on "Quality" mode) and pretty much always produces good results. I made my wallpaper with it. I upscaled with the 2x upscaler twice for this one.

    16 votes
  2. [3]
    FluffyKittens
    Link

    Like a lot of other people on this site, I've been playing around with Stable Diffusion and kin a fair bit over the past year (mostly for TTRPG assets in my personal case).

    The Automatic1111 and ComfyUI interfaces are quite helpful, but can be rather intimidating for beginners and have required some hard-to-follow setup IME. An HN thread from the past week had a lot of people talking about Fooocus, so I decided to give it a whirl this weekend.

    From scratch, the Linux install was a matter of copying & pasting six lines, which took ~3 minutes to run, and I was instantly getting fast, high-quality inference. The interface is extremely intuitive, with nice, user-friendly config options and preconfigured style toggles.
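
    For anyone curious, the six lines were basically the conda route from the project README. I'm sketching this from memory, so double-check the repo before copying (filenames may have changed since):

    git clone https://github.com/lllyasviel/Fooocus.git
    cd Fooocus
    conda env create -f environment.yaml
    conda activate fooocus
    pip install -r requirements_versions.txt
    python entry_with_update.py

    On first launch it should pull down the default SDXL models and then open the web UI in your browser.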

    Personal screencap: https://imgur.com/MMppg5U.png

    Highly recommend this software for anyone who's been holding off on generative AI, waiting for the tooling to become a bit more ergonomic.

    11 votes
    1. [2]
      danke
      Link Parent

      Do you have a screenshot of the "User-friendly ControlNets"? Last time I caught up on this stuff, I thought this rudimentary live canvas script by /u/arjan_M was a very streamlined way of doing things and I'm wondering what the current "state-of-the-art" is.

      4 votes
      1. FluffyKittens
        Link Parent

        Sure, this is what I'm seeing: https://imgur.com/sybd5ac.png

        I'm not sure it's meant to support that sort of live-painting though.

        4 votes
  3. vord
    (edited)
    Link

    Alright, firing up on the Steam Deck, let's see how it goes (venv install)...

    Segfault. I set the VRAM to 4GB in BIOS, but still segfaults.

    Damn. Anybody wanna send their OLED Deck to the devs lol?

    Edit: Some progress, but still no dice
    https://github.com/lllyasviel/Fooocus/issues/627#issuecomment-1833735735

    Edit: Processing now, after also adding the --use-split-cross-attention flag. Unsurprisingly slow, but we'll see if anything comes of it...
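
    For reference, with the venv install that meant launching with something like the line below (the flag name is as given above; I'm assuming the standard entry script, so adjust if your setup differs):

    python entry_with_update.py --use-split-cross-attention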

    No dice after a few hours of running at 100% GPU... there were some index-out-of-bounds errors early on which were probably more fatal than they looked.

    8 votes
  4. JCAPER
    Link

    I’ve been playing around with it and the photos it produces of people are scary good, especially with the default settings in “run_realistic.bat”. In a lot of the photos produced, I legitimately could not find hints that they were fake. The only ones where I could were a few with hands (it still struggles with fingers), overly elaborate/detailed backgrounds, or text.
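
    For those not on Windows: as far as I can tell, that .bat is just a wrapper that launches the usual entry script with a preset, roughly:

    python entry_with_update.py --preset realistic

    so the same “realistic” defaults should be reachable from a plain terminal too (treat the exact flag as an assumption and check the script you have).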

    They also included neat features like face swapping, expanding the image, using pictures together with prompts, mixing them, etc.

    This project is better than Midjourney and DALL-E 3 in my opinion.

    The downside is that we may really be approaching the level where AI photos are so good that we will have trouble figuring out which are real and which aren’t, maybe sooner rather than later.

    6 votes
  5. Well_known_bear
    Link

    Gave it a shot and loved the idiot-proof install process (although A1111 now also has a similar automated process) and the simple user interface.

    This is a great option for anyone who wants to just jump in and generate some images, but only being able to use SDXL models as your main model is a real bummer. A lot of the models floating around right now are SD 1.5 and that's likely to remain the case in the short term.

    3 votes
  6. jmpavlec
    Link

    Been playing with this from the HN thread on my Mac M1 with 32 GB RAM. Setup was very easy and the results are great. Image generation takes about 2-3 minutes per image, which is acceptable given I have no dedicated GPU. Watching the image preview is also fun, as it makes little tweaks here and there after the initial image it creates. You can also watch the terminal to see the exact prompts it puts into each model. Overall very nice to be able to generate stuff offline (after the models have downloaded).

    2 votes
  7. [2]
    gco
    Link

    Very keen to try this out! Previously I've felt intimidated by the complexity of setting all this up by myself. Are there any repositories of models or training data available online to plug into this tool? I'd be interested in experimenting and seeing what results I can get.

    2 votes
    1. Well_known_bear
      Link Parent

      The default install will give you a couple of models to start out with, and there are plenty of others you can download and try out. Civitai is a good starting point.
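
      If you do grab something from there, the main thing is knowing where to drop the file. Assuming the folder layout I have (check your own install), a downloaded SDXL checkpoint goes into the models/checkpoints folder inside the Fooocus directory, and LoRAs go into models/loras, e.g.:

      mv ~/Downloads/some-sdxl-model.safetensors Fooocus/models/checkpoints/

      (That filename is just a placeholder.) It should then show up in the base model dropdown under the Advanced settings.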

      2 votes
  8. [5]
    Quanttek
    (edited)
    Link

    I took a look at the example tests linked at the top of the Readme and, while they are truly stunning, I noticed a dire lack of diversity in the generated images: everyone is white despite the generic prompts, save for one Asian woman and one Middle Eastern-looking man. There are no Black people in the world of Fooocus - at least this version's.

    Of course, it is a consequence of biased training sets, but there is a high risk that the images generated when e.g. prompting for "beautiful woman" will further reinforce what we think is normally beautiful, amplify racial biases and problematic beauty standards, and contribute to the continuing lack of diverse representation in media.

    8 votes
    1. [3]
      Wuju
      Link Parent

      I'm not super into AI generative stuff, so some of this comment may be incorrect. But from what I can tell, this is mainly just an interface for AI image generation. It does seem to have some preset models, but I would imagine you could quite easily add your own more diverse generative models if you find that to be an issue.

      Additionally, as far as I'm aware, it's an issue that is present in much of AI image generation. So much so, that some of them, such as Bing's will add words like "ethically ambiguous" to some of your prompts just to try and diversify it without completely rebuilding their models from scratch.

      15 votes
      1. atomicshoreline
        Link Parent

        "as far as I'm aware, it's an issue that is present in much of AI image generation."

        This is correct. This software is simply a user interface. The issues stem not from the model itself; it's doing its job of mimicry fine. The problem is that the data the model is trained on (largely website content) is influenced by the biases of its creators and users. Generative AI systems are simply holding up a proverbial mirror.

        There is an element of the is/ought fallacy going on here where people irrationally assume the system will reflect a different world than the one present in its training data. Multiple different approaches have been tried to coerce these systems into behaving in ways inconsistent with their training, all of which degrade the functionality of the system. Here are a few examples of what has been tried so far.

        1. Train the system to refuse certain requests -- The system ends up denying requests that are harmless, and users and developers are in a constant cat-and-mouse game with jailbreaks (requests that cause the system to ignore its own rules).

        2. Modify the user's request by injecting words such as "ethnically ambiguous" -- The prompt is modified unpredictably, often causing the injected elements to leak into the image as text or otherwise alter the image in undesirable ways.

        3. Alter the training data to be more balanced and representative -- The size of your dataset is very important when building a system like this. Usually all or nearly all the data that is available is used in the training process. Getting more balanced ratios would involve removing existing data rather than gathering new data, which would degrade system performance for obvious reasons. Stability AI tried reducing the amount of porn that Stable Diffusion 2.0 was trained on, and it ended up hurting its ability to draw overall. Generating new data with AI also tends to train models to pick up on and exaggerate the peculiarities of AI-generated data, which leads to something akin to inbreeding and is not a good way around the issue.

        9 votes
      2. ThrowdoBaggins
        Link Parent

        I assume you meant “ethnically ambiguous” but I prefer the mental image of the AI behind Bing being asked to “make it just a little bit evil, but not too obvious”

        14 votes
    2. updawg
      (edited)
      Link Parent

      I just asked DALL-E to generate a beautiful woman and it asked me:

      Could you provide more details or specific characteristics for the image you'd like to create? For example, any particular setting, attire, or additional elements that should be included? This will help in generating an image that closely matches your request.

      I just said "no" to see what it would give me, and I got two images of what I would judge to be either light-skinned Black women or possibly half-Black, half-white women, with this description:

      Here are the images of a beautiful woman, portrayed with a focus on diversity and individuality. The portraits feature a warm expression and modern elegance. If you have any specific adjustments in mind or other requests, feel free to let me know!

      Then I asked it to "Make them more diverse" and I got two images each with one dark-skinned Black woman and one East Asian woman.

      Here are the images showcasing the beauty of diverse women. The portraits feature two women of different ethnic backgrounds, each with unique features and expressions. The focus is on their individual beauty in a respectful and inclusive manner. If you have more specific ideas or requests, feel free to share them!

      I had to ask for more diversity four times before it showed anyone who didn't look like a supermodel.

      The fifth time, things really started going off the rails. Women with beards, horrible disfigurements (AI artifacts), random textures on faces, weird flag things in the background, etc.

      10 votes