Launch HN: Rubbrband (YC W23) – Deformity detection for AI-generated images

Fission · on Aug 10, 2023

The male astronaut with coffee [1] (that I believe you're using as a "verified" example) has an extra finger on his right hand

[1] https://www.rubbrband.com/static/media/astronaut_with_coffee...

toddmorey · on Aug 10, 2023

And his cup backwards

jumpkick · on Aug 10, 2023

Look more closely: the cup has two handles.

jrmylee · on Aug 10, 2023

haha I thought this was a funny example to use. On second thought we'll replace it with something better!

wentin · on Aug 11, 2023

why is showing your algorithm doesn't work funny?

abhinavgopal · on Aug 11, 2023

That example actually wasn't meant to be related to the hands/deformity algorithm-- it was meant to talk about the prompt alignment. Regardless, it does make sense for us to change up that image to avoid confusion!

RadiozRadioz · on Aug 12, 2023

Bad way to advertise your product. It's entirely reasonable to make the conclusion that your product only does one thing at a time, also that the team lacks attention to detail (for a product that's all about attention to detail)

upwardbound · on Aug 11, 2023

Doesn't the new one have too few fingers now? And the pointer finger looks like it's shaped like a thumb, huge and gross.

upwardbound · on Aug 11, 2023

Doesn't the new one have too few fingers now? Looks like a total of only 3 fingers. And the pointer finger looks like it's shaped like a thumb, huge and gross.

belter · on Aug 11, 2023

Pivot and pitch it as a grossness validating algorithm?

ghgr · on Aug 11, 2023

And his coffee should be boiling

bjourne · on Aug 10, 2023

Isn't this product kind of impossible? Like a compression program that compresses compressed files? If you have an algorithm for determining whether a generated image is good or bad couldn't the same logic be incorporated into the network so that it doesn't generate bad images?

joefourier · on Aug 11, 2023

Not impossible at all - classifier networks are much, much easier to train than generative networks. However you can’t directly integrate the logic into the generator, you’d have to train the generator against the discriminator network. This is essentially the principle of a GAN and although many tricks have been developed in recent years, they tend to be finicky and difficult to train.

Diffusion models like SD are trained with a very simple loss function instead, which is just the L2 loss of an iterative denoising process. This tends to result in stabler training than using GANs. However, you could fine tune SD with reinforcement learning using the deformity detector as the reward, but it’s not a panacea as it could lead to overfitting and performance degradation.

bjourne · on Aug 11, 2023

> Not impossible at all - classifier networks are much, much easier to train than generative networks. However you can’t directly integrate the logic into the generator, you’d have to train the generator against the discriminator network.

Generative networks are ime not at all difficult to train because the amount of training data is typically orders of magnitudes larger. In this case, the idea is to train something to classify images as high or low quality, which I think is just as hard as generating images. Regardless, if you had such logic, I don't see why you couldn't incorporate that into the network's own loss function? That's how it is done for L1 and L2 regularization and many other techniques for "tempering" the training process.

The problem is that you want the model to be creative but not "too creative" (e.g eight finger hands). But preventing it from being too creative risks making it boring and bland (e.g only generating stock images). I don't think you can solve that with a post-processing filter. Generating say 100 images and picking the "best" one might just be the same as picking the most bland one.

__loam · on Aug 10, 2023

That's essentially how using a GAN works.

E: or how it's supposed to work.

thumbuddy · on Aug 11, 2023

Kind of phenomological but both parts of the GAN are the same model.

darren_hsu · on Aug 10, 2023

We’re optimistic about using our own algorithms and models to evaluate another model. In theoretical computer science, it is easier to verify a correct solution than to generate a correct solution (P vs NP problem).

bagels · on Aug 11, 2023

I don't think p vs np has anything to do with it, but also, I don't think your maxim is always (but maybe often is) true anyways.

Problem: traveling salesman, solution: one particular path. I think verification that the solution is optimal in this case is exactly the same problem as finding the solution.

vore · on Aug 11, 2023

Not to nitpick but this is NOT the right takeaway from P vs NP.

_bkyr · on Aug 10, 2023

Do you, or will you, use human labor in any instance on evaluating images?

darren_hsu · on Aug 10, 2023

We currently don't since its not scalable

version_five · on Aug 10, 2023

Fundamentally it sounds like you built an ML model(s) and are trying to monetize it behind an API. How does that work medium-term? Are you expecting there won't be open source alternatives, is your value in hosting the model (and if so will you open source yours) or is there another angle. I've built ML models and looked into how to monetize them, and overall it seems like a tough play without the model being part of some bigger thing that has more of a moat. What are your thoughts?

jrmylee · on Aug 10, 2023

Yeah it's a great q

The way we think about it is that we're building a product for organizations in scaling mode, and they have deep needs on the product-side. Flexibility on filtering, different client-libraries, a clean observability interface, etc...

It's possible that we open-source parts of our models, but fundamentally we think we can capture value by building a great all-around web product, and not just a set of eval models.

notahacker · on Aug 11, 2023

I'd either want the deformity testing either integrated with the generation service (really deeply if it's a GAN!) or a post processing tool (a "look, dodgy hands" layer in Photoshop which could then allow you to fix those deformities) rather than a separate web service

If the quality of the model is difficult to replicate (which seems to be a big "if" at the pace of NN image processing improvements), I guess there might be licensing or plugin sale opportunities there

alecbell · on Aug 10, 2023

This is a brilliant idea. Whenever I look at an image these days that has the "texture" of a generated image, I immediately start looking at certain features such as "more than 5 fingers" to determine whether it's real or not. If you could immediately detect those features and block the generated image from making it to production, that'd be a huge value gain.

jrmylee · on Aug 11, 2023

appreciate it :)

autoexec · on Aug 11, 2023

Although I can see how it could be useful now, a "Human Deformity Detector" app still seems strange. I can see this being abused to make fun of actual people or at the very least amusing someone who discovers their selfie has a high deformity score. If it works as advertised people who consider themselves deformed now have an objective ranking system I guess.

VincentEvans · on Aug 10, 2023

What happens if you run it against pictures of regular people? :) Damn these beauty standards!

I kid, I kid.

neovialogistics · on Aug 10, 2023

Posting communities will inevitably try to find false positives, and it'll be fun to see if there are any.

I'm wondering about real photos that have deliberately been shot to screw with normative standards of photo composition, like Weston's headshot of Igor Stravinsky. Another genre of photo that may be flagged are sci-fi and/or fantasy film set candid shots, such as photos featuring an actor (or actors) partially out of costume.

Come to think of it, various photography hall of fame galleries could be great testing suites.

abhinavgopal · on Aug 10, 2023

Interesting convo, for now we haven't focused heavily on photos that are not AI-generated, because there are cases of intentional style changes (that are not true deformities or photo quality errors). I can definitely seeing some of the photos you've mentioned here being a bit confusing.

illegalmemory · on Aug 10, 2023

I tried it out, the first result itself was wrong. I am sort of a potential customer for a product I am working on. The feature detection and scene description even was off. https://app.rubbrband.com/image/f2e1h.png

darren_hsu · on Aug 10, 2023

Hey! Feel free to shoot us an email at contact@rubbrband.com. Happy to set the record straight

schopra909 · on Aug 10, 2023

This is really cool! As a next step, it would be even more useful if you could auto-inpaint on regions of the detected issues (e.g., malformed hands). That way you could maintain the subject / most of the image -- if you like everything but the deformed features. That way you tackle both the QA analyst problem and the "engineer" problem of trying to path the model for the end user/customer.

darren_hsu · on Aug 10, 2023

That's great feedback! We're working on this so stay tuned for updates ;)

keiferski · on Aug 10, 2023

Cool idea. I do think that filtering large batches AI-generated data is potentially a huge market, especially if you get it to filter for certain characteristics. I.e., remove images that are a specific color or contain certain background elements.

I use Midjourney quite a bit and I end up using the creation process itself as an ongoing filtering process, but on a per image basis. It would be a lot more efficient to generate 1,000 images at once, filter out the images that are deformed/not fitting the requirements, then see what you've got to work with.

abhinavgopal · on Aug 10, 2023

Thanks! We're interested in this use case too, it seems that a lot of people find random seeds to be quite powerful in the diversity of images produced. We built this so that you can upload the images from those seeds and then quickly figure out which of those are not deformed.

brucethemoose2 · on Aug 10, 2023

Amazing! There are boatloads of imagegen businesses that should pay you for this.

And this is quite a rabbit hole. There are an infinite class of distortions to detect, from "fused limbs" (which I get quite frequently) to cfg "burn" (which I see in some of those example images) to bad cropping and such.

I can see y'all expanding to generative video once that takes off. I can only imagine the kinds of artifacts that will show up... real video is already a problem!

abhinavgopal · on Aug 10, 2023

Thank you! Definitely a lot of problems to tackle for us. We are interested in possibly eventually expanding to generative video, and it'll give us a new world of problems to look at in that space too.

kikijiji · on Aug 11, 2023

Congrats on the launch. It's an interesting idea.

What are your thoughts on DPPO[1] and if your model(s) could be integrated into that or a similar process to fine tune the original model? Could that potentially remove the need for this product?

[1] https://arxiv.org/abs/2305.13301

Alanhlwang · on Aug 10, 2023

This is awesome, the ability to assess image composition according to photography 'rules' especially is super intriguing. Could you provide some examples of the specific rules your algorithm considers and how they contribute to the overall image quality evaluation? Do you guys see this feature being able to replace studio photographers?

darren_hsu · on Aug 10, 2023

Our algorithm evaluates image composition by taking into account several well-known photography rules, such as the rule of thirds and symmetry. We don't see this feature replacing studio photographers since it's designed to enhance the photography process and aid in composition analysis (basically helping them make the most informed creative decision).

dimatura · on Aug 11, 2023

I'm not sure about how the longevity of this business model. It seems to me that the quality of generative models will continuously improve over time (in part by incorporating similar kinds of "bad output filters"), making this kind of service less and less useful.

atentaten · on Aug 10, 2023

I have an app that needs to generate hundreds and eventually thousands of images. What tools can I use to generate these images at scale. After I figure that out, then I will be looking at Rubbrband for quality control.

jrmylee · on Aug 10, 2023

There's a few companies that provide APIs for image-gen. Personally I've been using the Kandinsky model off Replicate. It's really high quality, I'd recommend it: https://replicate.com/ai-forever/kandinsky-2.2

sakopov · on Aug 11, 2023

Wow, that page generates some striking images. Thanks for sharing the link!

michaelt · on Aug 11, 2023

If you've got an nvidia graphics card with 8GB of vram made in the last decade, AUTOMATIC1111's stable diffusion web ui [1] will crank out a few thousand images every 24 hours. Depending on settings and how fast your card is, naturally.

And there's a large ecosystem of downloadable models available online for specific looks and concepts, like models trained for photorealism.

[1] https://github.com/AUTOMATIC1111/stable-diffusion-webui

toddmorey · on Aug 10, 2023

Question: what’s a use case for generating so many images in a batch at scale?

michaelt · on Aug 11, 2023

The majority of AI art you see online is actually cherrypicked best results after someone tried 20 different seeds and prompts until they got something they really liked.

Far better UX if you can show a grid of 20 images, and let the user choose.

abhinavgopal · on Aug 10, 2023

Stock photos, e-commerce, gaming, marketing all generate images at a very high scale

bboylen · on Aug 10, 2023

This seems pretty useful for companies generating images at scale. Are you at all worried that generative models will get so good that you don't need to check for deformities?

brucethemoose2 · on Aug 10, 2023

I can't see this ever happening. Every new genAI model and finetune I've seen brings fun new classes of distortion.

zirgs · on Aug 10, 2023

We'll probably need higher resolution models. SD 1.5 is 512x512 and it usually outputs deformed faces in full body shots simply because its resolution is too small. SD XL is 1024x1024 and that no longer happens there.

We'll see how outputs from 2048 and 4096 models are going to look like. But those models will need lots of VRAM.

jrmylee · on Aug 10, 2023

It's something we think about for sure

We think that image-gen models are lagging behind LLMs by about a year, so the problems that these models will have should look quite different in the future.

It'll require us to be adaptable and also to take chances on solving problems that aren't huge issues just yet, but are likely to be once models improve.

zirgs · on Aug 10, 2023

Is there any model that doesn't output deformed hands?

Utkarsh_Mood · on Aug 10, 2023

iirc midjourney has figured it out although its closed source if you were wondering.

zirgs · on Aug 11, 2023

Checked midjourney discord - it still generates weird hands from time to time, but it's much better than before.

thatcherthorn · on Aug 10, 2023

This is a perfect problem if you're able to tackle it. I'm super curious on the approach.

YOLO style object detection on a manually created dataset would be my first pass at this problem.

dragonwriter · on Aug 10, 2023

Its part of the process used in some existing SD workflows in popular open-source consumer tooling use to detect and automatically correct deformities, so I’d agree its a useful approach, but I’d also be skeptical of someone using it as the core of a startup they are launching that is selling only the first half of that.

gsharma · on Aug 10, 2023

https://github.com/Bing-su/adetailer works very well on faces and hands.

This is a good solution for all the use cases I’ve dealt with.

Are there widespread use cases where, knowing a deformity exists is required than just fixing the deformities?

orliesaurus · on Aug 10, 2023

There are so many possible imperfections, where do you stop?

jrmylee · on Aug 10, 2023

The short answer to this is to build evaluators for imperfections that our customers see as most pressing.

We have to balance that with our view of what we think will still be a problem in the future, as image generation models get better.

Eventually we want to build a vision model that's great at fine-grained details of images, while will take some time.

michaelmior · on Aug 10, 2023

Maybe it's just me, but with the aspect ratios in the first three images being clearly off, the website immediately feels broken.

jrmylee · on Aug 10, 2023

Thanks will fix this

sriyam · on Aug 10, 2023

Super cool and will def be needed by companies if they want to use gen-ai images at scale and use in production.

ssijak · on Aug 10, 2023

Bug report: clicking "pricing" while on the pricing page leads to blank page

ssijak · on Aug 10, 2023

Clicking on "get started for free" in the blue pricing box while on pricing page opens blank page in the new window

abhinavgopal · on Aug 10, 2023

Thanks, fixed both!

ssijak · on Aug 10, 2023

For anything that is not Midjourney you can just flag as deformed :) Jokes aside, Midjourney is far far ahead of all the others in what it can generate.

zirgs · on Aug 10, 2023

Midjourney still straight up ignores prompts. From my experiments it seems that SD XL follows my promts much better.

darren_hsu · on Aug 10, 2023

We're building prompt alignment models too so you can objectively see how well SDXL follows your prompt :)

dragonwriter · on Aug 10, 2023

Are you building prompt alignment model alignment models so I can objectively see how well what your prompt alignment model is judging SDXL’s interpretation of my prompt against aligns with my prompt?

Quis custodiet ipsos custodes?

darren_hsu · on Aug 10, 2023

Its how your prompt aligns with the SDXL image output, not how your prompt aligns with SDXL's interpretation of your prompt

dragonwriter · on Aug 10, 2023

Isn't it actually how the SDXL image output (that is, SDXL’s interpretation of my prompt) aligns with your model’s intepretation of my prompt?

andybak · on Aug 10, 2023

SDXL feels pretty close now. Have you spent much time with it?

ssijak · on Aug 10, 2023

yes, it's much better than the previous version but still not close to midjourney. there is that threshold in image generation via these models where until you cross it every picture feels off in some way and obviously ai generated. with midjourney I am continuously amazed at what I get out of it. then if you want high res pictures pass them through Gigapixel and voila.

Animats · on Aug 11, 2023

Coming soon from YC: SpelCheck.

abatilo · on Aug 10, 2023

A meta question for y'all if you're willing to share. I've seen y'all launch what I think is now 5 different ideas under the same name? I want to say that I remember a platform for musicians, and a platform to automatically convert a codebase into hostable inference servers, among a few other things.

Asking for myself as someone who has a hard time sticking to one idea to explore: are the pivots coming from finding it in hard to land customers? Is it hard to stay motivated with a single product? The pivots look pretty drastic as an external observer but maybe they've all been very organic from the inside.

jrmylee · on Aug 10, 2023

actually very cool to me that you remember these pivots. From the inside these pivots were about solving problems we faced with our previous idea.

We mainly pivoted either because we discovered the market wasn't great, or we didn't have founder-market fit with the idea. There's a gut feeling aspect that plays in there as well, but it's mostly been analytical approach.

abatilo · on Aug 10, 2023

Thank you for answering. Best of luck with this idea!

jrmylee · on Aug 10, 2023

thanks, good luck with your startup as well!

sebastianconcpt · on Aug 11, 2023

Every day I find AI creepier and creepier. Like a non-chemical hallucinogenic inducing stimuli.