{"id":19787,"date":"2022-08-15T22:50:08","date_gmt":"2022-08-16T05:50:08","guid":{"rendered":"http:\/\/jnack.com\/blog\/?p=19787"},"modified":"2022-08-15T22:50:11","modified_gmt":"2022-08-16T05:50:11","slug":"clip-interrogator-reveals-what-your-robo-artist-assistant-sees","status":"publish","type":"post","link":"https:\/\/jnack.com\/blog\/2022\/08\/15\/clip-interrogator-reveals-what-your-robo-artist-assistant-sees\/","title":{"rendered":"CLIP interrogator reveals what your robo-artist assistant sees"},"content":{"rendered":"\n<p>Ever since DALL\u2022E hit the scene, I&#8217;ve been wanting to know what words its model for language-image pairing would use to describe images:<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"550\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">Mobile DALL\u2022E app idea: capture an image, then use GPT-3 to generate descriptive text, then use that as prompts for <a href=\"https:\/\/twitter.com\/hashtag\/dalle?src=hash&amp;ref_src=twsrc%5Etfw\">#dalle<\/a>. <br><br>I want to know how to produce lettering like this: <a href=\"https:\/\/t.co\/dprziqKWzE\">pic.twitter.com\/dprziqKWzE<\/a><\/p>&mdash; John Nack (@jnack) <a href=\"https:\/\/twitter.com\/jnack\/status\/1533498433398378497?ref_src=twsrc%5Etfw\">June 5, 2022<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n\n<p>Now the somewhat scarily named <a href=\"https:\/\/colab.research.google.com\/github\/pharmapsychotic\/clip-interrogator\/blob\/main\/clip_interrogator.ipynb#scrollTo=YQk0eemUrSC7\">CLIP Interrogator<\/a> promises exactly that kind of insight:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>What do the different OpenAI CLIP models see in an image? What might be a good text prompt to create similar images using CLIP guided diffusion or another text to image model? The CLIP Interrogator is here to get you answers!<\/p><\/blockquote>\n\n\n\n<p>Here&#8217;s hoping it helps us get some interesting image -> text -> image flywheels spinning.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Ever since DALL\u2022E hit the scene, I&#8217;ve been wanting to know what words its model for language-image pairing would use to describe images: Now the somewhat scarily named CLIP Interrogator promises exactly that kind of insight: What do the different OpenAI CLIP models see in an image? 
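The core trick is simpler than it sounds: CLIP embeds images and text into the same space, so you can score how well candidate phrases match an image and keep the winners. Here's a minimal sketch of that idea using OpenAI's open-source `clip` package; the model choice, the `photo.jpg` filename, and the candidate phrases are my own illustrative stand-ins, not the notebook's actual term lists.

```python
# Sketch of the idea behind CLIP Interrogator: rank candidate phrases
# by CLIP image-text similarity. Assumes OpenAI's `clip` package is
# installed (pip install git+https://github.com/openai/CLIP.git).
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-L/14", device=device)

# Any local image will do; "photo.jpg" is a placeholder.
image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)

# Illustrative candidates; the real tool uses much larger curated
# lists of artists, mediums, and style modifiers.
candidates = [
    "a watercolor painting",
    "an oil painting",
    "a 35mm photograph",
    "pixel art",
    "art deco lettering",
]
text = clip.tokenize(candidates).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

# Cosine similarity between the image and each phrase.
image_features /= image_features.norm(dim=-1, keepdim=True)
text_features /= text_features.norm(dim=-1, keepdim=True)
scores = (image_features @ text_features.T).squeeze(0)

for phrase, score in sorted(zip(candidates, scores.tolist()),
                            key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {phrase}")
```

The notebook itself goes further, roughly speaking, by stitching the best-scoring terms from several such lists onto a generated caption to assemble a full reusable prompt.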