Unify.ai (and Gemini) with open-webui
# | #tech, #projects, #opensource
I use unify.ai for API access to LLMs. They aggregate different API providers (Anthropic, OpenAI, Vertex AI, Together AI, Groq, ...), so I only need to buy Unify credits. This is a tutorial on setting up open-webui to use Unify.

Google offers very generous free-tier usage limits, so I have included Gemini as well.
- Install and run open-webui. I first tried a plain `pip install`; after a good 3 minutes I killed it and used uv instead, which took about 3 seconds:

  ```shell
  uv pip install open-webui
  open-webui serve
  ```
- Create an account and add an open-webui function for using Unify and Google:
  - Go to Admin Settings > Functions > add function (the + button)
  - Paste this function, give it a name, save, and enable it
  - Click the function's settings button and set the Unify and/or Google API keys
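The function itself mostly boils down to forwarding open-webui's chat payload to Unify's OpenAI-compatible chat-completions endpoint. Here is a minimal sketch of the request it builds — the base URL and the `model@provider` id format are assumptions from Unify's docs as I remember them, so double-check them against the real function:

```python
import json
import urllib.request

# Assumption: Unify's OpenAI-compatible base URL.
UNIFY_BASE = "https://api.unify.ai/v0"

def build_request(api_key: str, model: str, messages: list, stream: bool = True) -> urllib.request.Request:
    """Build the chat-completions request the function would send to Unify."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        f"{UNIFY_BASE}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Unify routes on a `model@provider` id (e.g. `claude-3.7-sonnet@anthropic`), which is why one function can cover all the providers at once.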

I built this with a lot of help from gemini-2.0-flash-thinking-exp-01-21. I tried Perplexity's deep research (o3-mini) and Grok's deep research, to no real help. I ended up gathering the documentation from open-webui and Unify and feeding it in myself; 2-3 prompts later I had a working version. Later I added a few things:
- image input support
- thinking mode for Claude 3.7 - you can customize the token limits for low, medium, and high in the function settings (alongside the API key)
- low, medium, high settings for o3-mini
- some debug prints to the terminal
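The low/medium/high setting has to be translated differently per model: OpenAI's o3-mini takes a `reasoning_effort` string directly, while Claude 3.7's extended thinking takes an explicit token budget. A sketch of that mapping — the budget numbers below are made-up defaults standing in for whatever you set in the function settings:

```python
# Hypothetical defaults; the real values come from the function's settings.
THINKING_BUDGETS = {"low": 4096, "medium": 16384, "high": 32768}

def apply_reasoning_effort(payload: dict, model: str, effort: str) -> dict:
    """Translate open-webui's low/medium/high choice into per-model parameters."""
    if model.startswith("o3-mini"):
        # OpenAI reasoning models accept the effort string as-is.
        payload["reasoning_effort"] = effort
    elif "claude-3.7" in model:
        # Anthropic extended thinking wants an explicit token budget.
        payload["thinking"] = {"type": "enabled", "budget_tokens": THINKING_BUDGETS[effort]}
    return payload
```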
Things to improve/fix:
- there's a 30s timeout on chat completion requests; o3-mini high is definitely going to blow past it
- the stop-generation button does not work - if there's no response you just have to wait out the 30s
- idk how to pass documents as full files in open-webui - uploading a PDF defaults to RAG 🤦♂️
- there's some issue with `max_tokens` (likely on Unify's end) for 3.7 - I see stop reason `length`; unsure how to fix
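For the timeout issue, the likely fix is to stop hard-coding 30s and pick a timeout per model, since reasoning models can sit for minutes before the first token arrives. A sketch of that idea — the 600s ceiling is an arbitrary guess, not a tested value:

```python
DEFAULT_TIMEOUT_S = 30
REASONING_TIMEOUT_S = 600  # hypothetical ceiling for o3-mini high / extended thinking

def request_timeout(model: str, effort=None) -> int:
    """Give reasoning-heavy requests a much longer timeout than ordinary chat."""
    if model.startswith("o3-mini") or effort == "high":
        return REASONING_TIMEOUT_S
    return DEFAULT_TIMEOUT_S
```

The returned value would then be passed as the HTTP client's request timeout (e.g. `requests.post(..., stream=True, timeout=request_timeout(model, effort))`) instead of the flat 30s.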