Introducing The Generative Edge by Contiamo

Keeping you up-to-date on the world of generative AI, one week at a time

Mar 03, 2023

Our logo running through ControlNet and Stable Diffusion

Welcome to the first edition of The Generative Edge, your weekly digest of all things generative AI. With so much happening in this rapidly evolving field, we know it can be tough to stay up to date. That's why we're here to help! From cutting-edge research to practical applications, we've got you covered.

So, let's dive in and see what the world of generative AI has been up to this week.

Generative image models

You blink twice and AI image generation has become even more powerful than it already was, and it really has made huge strides since the release of Dall-E in April and Stable Diffusion in August of last year. Let’s see what’s new in this space.

Stable Diffusion

The image generation model with the most vibrant open source community around it is pushing the frontier of image generation on a daily basis.

Fine-tuned models for a variety of styles

A community around specifically fine-tuned image models has sprung up, and you can choose if you want to generate cinematic, woolly, synthwave, papercut-art or any other style of image and choose the model that fits what you’re looking for.
A new system called LoRa makes fine-tuning these models yourself much easier
No idea where to start? Check out this guide: https://stable-diffusion-art.com/beginners-guide/

ControlNet

Generative image models always had significant limitations, they are quite random in their output and difficult to finely control. ControlNet solves some of these limitations.

ControlNet is a collection of neural networks that can influence a generative model and force it to generate things in a very specific way (pose, depth, outlines and many more)
The variations of our company logo you see at the top was generated via Stable Diffusion, HED ControlNet and the knollingcase model.
Learn more about ControlNet here: https://levelup.gitconnected.com/controlnet-control-your-ai-art-generation-616c86c88964

Try ControlNet for yourself, but fair warning, it’s a little on the technical side: https://huggingface.co/spaces/hysts/ControlNet

ChatGPT API

ChatGPT has taken the world by storm, and now developers can finally integrate with its API.

ChatGPT model name gpt-3.5-turbo-0301 is significantly (10x!) cheaper than the existing GPT-3 (davinci) model
Check out https://platform.openai.com/docs/guides/chat for more information
Open AI’s amazing, open and state of the art transcription model Whisper also received an fresh API, so that it’s even easier to perform audio transcription from within your apps.
For more information see https://platform.openai.com/docs/guides/speech-to-text

Large Language Model safety

Programs written in code can be exploited, and it turns out, language models (like GPT) can be too! It’s good to keep this in mind when building your own chatbot.

Giving input text to a large language model (LLM) like ChatGPT is effectively running code, with the LLM acting as the execution environment.

A paper released last week discusses various ways a language model could be exploited. Code can be malicious, and natural language can be too!
Check out this example for prompt injection called DAN,
Expect “Neural Network security” to become a thing! Many attack vectors already exist, from prompt injection, to training data poisoning and malicious classifier embedding.
We will keep you updated on developments in this space, and also ways to mitigate these risks.

Voice synthesis

Voice synthesis (generating human sounding voice) has made huge leaps over the last months, and voice cloning (train a model on a specific voice and then use it for synthesis) alongside it

It’s easier than ever before to create professional sounding audio.
elevenlabs.io has one of the easiest, cheapest, fastest and probably best voice synth (and cloning) tech right now. You can use it for narration, voice overs, podcasts and whatever else you can think of.
To give you an example what this can do, we created 60 seconds of a fake data engineering podcast which uses voices cloned from our actual voices. Check it out:

And that’s it for this week!

Find all of our updates on our Substack at thegenerativeedge.substack.com, get in touch via our office hours if you want to talk about Generative AI and visit our website at contiamo.com.

Have a wonderful weekend everyone!

Daniel
Generative AI engineer at Contiamo

The Generative Edge by Contiamo