Unify.ai (and Gemini) with open-webui
# | #tech, #projects, #opensource
I use unify.ai for API access to LLMs. They aggregate different API providers (Anthropic, OpenAI, Vertex AI, Together AI, Groq, ...), so I only need to buy Unify credits. This is a tutorial on setting up open-webui to use Unify.

Google offers very generous free-tier usage limits, so I have included Gemini as well.
- Install and run open-webui. I first tried a plain `pip install`; after a good 3 minutes I killed it and used uv instead, which took about 3 seconds:

  ```shell
  uv pip install open-webui
  open-webui serve
  ```
- Create an account and add an open-webui function for using Unify and Google:
  - Go to Admin Settings > Functions > add function (the + button)
  - Paste this function, give it a name, save, and enable it
  - Click the function's settings button and set the Unify and/or Google API keys
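The function itself mostly boils down to forwarding open-webui's chat payload to Unify's OpenAI-compatible chat-completions endpoint. Here is a minimal sketch of the request it builds — the base URL and the `model@provider` id format are assumptions from Unify's docs as I remember them, so double-check them against the real function:

```python
import json
import urllib.request

# Assumption: Unify's OpenAI-compatible base URL.
UNIFY_BASE = "https://api.unify.ai/v0"

def build_request(api_key: str, model: str, messages: list, stream: bool = True) -> urllib.request.Request:
    """Build the chat-completions request the function would send to Unify."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        f"{UNIFY_BASE}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Unify routes on a `model@provider` id (e.g. `claude-3.7-sonnet@anthropic`), which is why one function can cover all the providers at once.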

I built this with a lot of help from gemini-2.0-flash-thinking-exp-01-21. I tried Perplexity's deep research (o3-mini) and Grok's deep research, to no real help. I ended up gathering the documentation from open-webui and Unify and feeding it in myself; 2-3 prompts later I had a working version. Later I added a few things:
- image input support
- thinking mode for Claude 3.7 - you can customize the token limits for low, medium, and high in the function settings (alongside the API key)
- low, medium, high settings for o3-mini
- some debug prints to the terminal
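The low/medium/high setting has to be translated differently per model: OpenAI's o3-mini takes a `reasoning_effort` string directly, while Claude 3.7's extended thinking takes an explicit token budget. A sketch of that mapping — the budget numbers below are made-up defaults standing in for whatever you set in the function settings:

```python
# Hypothetical defaults; the real values come from the function's settings.
THINKING_BUDGETS = {"low": 4096, "medium": 16384, "high": 32768}

def apply_reasoning_effort(payload: dict, model: str, effort: str) -> dict:
    """Translate open-webui's low/medium/high choice into per-model parameters."""
    if model.startswith("o3-mini"):
        # OpenAI reasoning models accept the effort string as-is.
        payload["reasoning_effort"] = effort
    elif "claude-3.7" in model:
        # Anthropic extended thinking wants an explicit token budget.
        payload["thinking"] = {"type": "enabled", "budget_tokens": THINKING_BUDGETS[effort]}
    return payload
```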
Things to improve/fix:
- there's a 30s timeout on chat completion requests; o3-mini high is definitely going to blow past it
- the stop-generation button does not work - if there's no response you just have to wait out the 30s
- idk how to pass documents as full files in open-webui - uploading a PDF defaults to RAG 🤦♂️
- there's some issue with `max_tokens` (likely on Unify's end) for 3.7 - I see stop reason `length`; unsure how to fix
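For the timeout issue, the likely fix is to stop hard-coding 30s and pick a timeout per model, since reasoning models can sit for minutes before the first token arrives. A sketch of that idea — the 600s ceiling is an arbitrary guess, not a tested value:

```python
DEFAULT_TIMEOUT_S = 30
REASONING_TIMEOUT_S = 600  # hypothetical ceiling for o3-mini high / extended thinking

def request_timeout(model: str, effort=None) -> int:
    """Give reasoning-heavy requests a much longer timeout than ordinary chat."""
    if model.startswith("o3-mini") or effort == "high":
        return REASONING_TIMEOUT_S
    return DEFAULT_TIMEOUT_S
```

The returned value would then be passed as the HTTP client's request timeout (e.g. `requests.post(..., stream=True, timeout=request_timeout(model, effort))`) instead of the flat 30s.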