Dify + OpenRouter + k8s: Quickly Building a Pre-Production Environment LLM Application Development Platform
Since OpenAI released GPT-3.5 at the end of 2022, the large model market has boomed, and LLMs have become a technology that cannot be ignored. As major vendors compete, prices keep falling: the recently released gpt-4o-mini charges roughly 60 cents per million output tokens, enough text for a book of about 2,500 pages. Against this backdrop, model pricing is unlikely to be a bottleneck in the foreseeable future. To develop and deploy LLM applications efficiently, choosing the right platform and tools is crucial. This article shows how to quickly build a pre-production LLM application development platform using Dify, OpenRouter, and Kubernetes (k8s), so you can use Dify to quickly build custom large model agents.
What is Dify?
Dify is an open-source large language model (LLM) application development platform tailored for developers who want to quickly build production-grade generative AI applications. It combines Backend-as-a-Service with LLMOps, making the development process more efficient and smooth.
Dify supports a variety of advanced large language models, such as Claude 3, OpenAI's GPT series, and Gemini, and collaborates closely with multiple model providers to offer developers flexible choices, so every project can find the most suitable solution. The platform also features robust dataset management, supporting the upload and management of text and structured data, and simplifies prompt orchestration and application operations through intuitive visualization tools, making AI application development remarkably simple.
Dify Technical Architecture
The technical architecture of Dify.AI includes several key components, providing developers with a one-stop solution:
- Powerful Tech Stack Support: Dify comes with a top-notch tech stack needed for building LLM applications, supporting hundreds of models, with an intuitive prompt orchestration interface, high-quality RAG (Retrieval-Augmented Generation) engine, and a flexible agent framework.
- Visual Orchestration and Operation Management: Dify provides visual prompt orchestration, operation, and dataset management features, allowing developers to complete AI application development within days, quickly integrate into existing systems, and continuously optimize and improve.
- Rich Tech Stack: Dify's backend is written in Python, with a TypeScript/Next.js frontend, choices that give Dify.AI high flexibility and extensibility and make it easy to integrate with the broader Python NLP ecosystem.
- Out-of-the-Box Application Templates and Orchestration Frameworks: Dify offers comprehensive application templates and orchestration frameworks, enabling developers to quickly build large language model-driven generative AI applications based on these resources and seamlessly scale as needed to drive business growth.
- Dify Orchestration Studio: This is a professional-grade visual orchestration tool that provides an integrated working environment for generative AI applications, allowing developers to build and manage their AI projects more efficiently.
Together, these components make Dify.AI a comprehensive, flexible, and easy-to-use platform for developers, supporting generative AI applications end to end, from rapid development to stable deployment.
What is OpenRouter?
As the name suggests, OpenRouter is a powerful large model API router. It is designed to seamlessly integrate various AI models and services into a unified interface. Users can easily access multiple pre-trained large models with simple configuration and invocation, without the need to deploy and maintain them themselves. This innovative concept significantly lowers the threshold for using AI technology, enabling more people to effortlessly utilize large models to solve practical problems.
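OpenRouter exposes an OpenAI-compatible HTTP API, so switching between providers is largely a matter of changing the model identifier in the request. The sketch below (stdlib only) builds such a request; the model id "openai/gpt-4o-mini" is an example and the exact catalog of ids may change, so treat it as an assumption to verify against OpenRouter's model list.

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for OpenRouter.

    Routing to a different provider is just a different `model` string,
    e.g. "openai/gpt-4o-mini" or "anthropic/claude-3-haiku" (example ids).
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        OPENROUTER_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    # Requires a real OpenRouter key to actually run.
    req = build_chat_request("sk-or-...", "openai/gpt-4o-mini", "Hello!")
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the response follows the OpenAI schema, client code written against one model keeps working when you swap in another.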
Quickly Building a Pre-Production LLM Application Development Platform
Environment Preparation
Before starting, please ensure you have completed the following preparations:
- Install Kubernetes: Ensure that a Kubernetes cluster is installed and configured on your machine. If not, refer to the official Kubernetes documentation for installation.
- Install kubectl: kubectl is a command-line tool for interacting with Kubernetes clusters. Refer to the kubectl installation guide for installation.
- Container Image Access: Ensure your cluster can pull images from a container image registry (e.g., Docker Hub); the Dify deployment below uses publicly hosted images.
For a simple trial, you can also install a lightweight distribution such as microk8s (https://microk8s.io/) or k3s (https://k3s.io/).
Step 1: Deploy Dify
- Deploy Dify and its dependencies using the community-maintained open-source k8s manifest:
kubectl apply -f https://raw.githubusercontent.com/Winson-030/dify-kubernetes/main/dify-deployment.yaml
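After applying the manifest, wait until all pods are up before moving on. A minimal sketch of a readiness check, assuming the manifest creates its resources in the `dify` namespace (the same namespace used by the ingress below):

```python
import json
import subprocess

def unready_pods(pods_json: str) -> list[str]:
    """Return the names of pods whose phase is not Running or Succeeded."""
    items = json.loads(pods_json).get("items", [])
    return [p["metadata"]["name"] for p in items
            if p["status"]["phase"] not in ("Running", "Succeeded")]

if __name__ == "__main__":
    # Requires kubectl configured against your cluster.
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", "dify", "-o", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    bad = unready_pods(out)
    print("all pods ready" if not bad else f"still waiting on: {bad}")
```

Equivalently, `kubectl get pods -n dify -w` lets you watch the same rollout interactively.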
- Configure an Ingress in k8s to expose the deployment for external access:
kind: Ingress
apiVersion: networking.k8s.io/v1
metadata:
  name: dify-ingress
  namespace: dify
spec:
  rules:
    - host: dify.fflow.link
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: dify-nginx
                port:
                  number: 80
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: dify-nginx
                port:
                  number: 80
          - path: /console/api
            pathType: Prefix
            backend:
              service:
                name: dify-nginx
                port:
                  number: 80
          - path: /v1
            pathType: Prefix
            backend:
              service:
                name: dify-nginx
                port:
                  number: 80
          - path: /files
            pathType: Prefix
            backend:
              service:
                name: dify-nginx
                port:
                  number: 80
- Access dify.fflow.link (be sure to replace it with your actual domain or IP) to verify the installation. The system will prompt for an initial password; for Dify installed through the yaml above, the initial password is `password`.
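Once DNS (or a hosts-file entry) points at the ingress, a quick smoke test of the routed paths confirms that nginx is reachable. A minimal sketch, assuming the host and paths from the ingress above; replace the host with your own domain or IP:

```python
import urllib.error
import urllib.request

# Host is a placeholder from the example ingress; paths mirror its rules.
HOST = "http://dify.fflow.link"
PATHS = ["/", "/api", "/console/api", "/v1", "/files"]

def build_urls(host: str, paths: list[str]) -> list[str]:
    """Join the ingress host with each routed path."""
    return [host.rstrip("/") + p for p in paths]

if __name__ == "__main__":
    for url in build_urls(HOST, PATHS):
        try:
            status = urllib.request.urlopen(url, timeout=5).status
        except urllib.error.HTTPError as e:
            status = e.code  # a 4xx still proves the route reached Dify
        print(url, status)
```

Any HTTP response at all (even 401/404) shows the ingress is routing; a connection timeout points at DNS or the ingress controller instead.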
Step 2: Configure OpenRouter
- Open Dify's Settings and select OpenRouter as the Model Provider.
- Enter the API Key obtained from OpenRouter.
Practical Example
Suppose we need to build a translation API based on Dify. Here are the specific steps:
1. Choose a Template: Pick a Workflow template that closely matches our application scenario.
2. Step Orchestration: Orchestrate the entire translation process on the Dify platform, testing continuously and choosing the most suitable model along the way.
3. Publish the API: Click Publish to allow external access to the workflow through an API endpoint and API Key.
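Once published, the workflow can be called over HTTP with the app's API key. The sketch below builds such a call with the stdlib; the `/v1/workflows/run` endpoint and Bearer-key auth follow Dify's published API, but the base URL, key, and the `text` input field are placeholders — check your app's API Access page for the exact input names.

```python
import json
import urllib.request

# Hypothetical values: replace with your ingress domain and app API key.
DIFY_BASE = "http://dify.fflow.link"
APP_KEY = "app-..."

def build_workflow_request(base: str, api_key: str,
                           inputs: dict, user: str) -> urllib.request.Request:
    """Build a blocking-mode run request for a published Dify workflow."""
    body = json.dumps({
        "inputs": inputs,            # field names must match the workflow's inputs
        "response_mode": "blocking", # wait for the full result in one response
        "user": user,                # any stable end-user identifier
    }).encode("utf-8")
    return urllib.request.Request(
        base.rstrip("/") + "/v1/workflows/run",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

if __name__ == "__main__":
    # Requires the workflow to be published and the key to be valid.
    req = build_workflow_request(DIFY_BASE, APP_KEY,
                                 {"text": "Hello, world"}, "demo-user")
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```

Setting `response_mode` to `streaming` instead returns results incrementally, which suits chat-style frontends.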
FAQs
1. What are the advantages of OpenRouter compared to native APIs like OpenAI?
OpenRouter provides access to many models through a single account and API, and it accepts international credit cards such as Visa, which is convenient for developers in China who cannot pay OpenAI directly. This makes it well suited to application development platforms that need to compare and validate multiple models.
Summary
By following the steps in this article, you can quickly build a pre-production LLM application development platform: Dify for application development and process orchestration, OpenRouter for model integration, and k8s for container orchestration and management of Dify itself. Together they greatly simplify development and deployment while improving efficiency and stability. I hope this article is helpful to you, and I wish you success on your LLM application development journey!