Posted by Nodus Labs | January 29, 2025
Setting Up DeepSeek on a Local or Private Server: A Step-by-Step Tutorial

DeepSeek’s new LLM has made a lot of noise over the last few days, but many people have also raised concerns about privacy. Multiple screenshots of DeepSeek’s privacy policy highlight the fact that your data travels all the way to China and stays there for quite a while. Not that it’s much different from OpenAI’s ChatGPT, but still, with DeepSeek you actually do have a choice. So why not?
The best way to avoid that is to install DeepSeek on your own server, a local computer, or a public server that you control, and to use it from there. In this article, we present the best ways to achieve this with as little technical knowledge as possible. This information will be useful both for individuals and for enterprises that work with sensitive data they don’t want exposed.
Why Should I Run My Own DeepSeek?
There are plenty of apps offering to run DeepSeek locally. But there’s a caveat. How do you know they are not storing and using your data? The model might be on your local device, sure, but what prevents the owner of the app from sending your inputs to the server once the app is connected to the internet?
That’s why, if you care about security and privacy, you don’t want your model locked inside some closed-source package: you don’t know what’s going to happen to your data. You are much better off using a popular open-source package both for installing and running the model and for serving it through an open-source user interface (or directly through the Terminal if you like it raw). Below I explain how to do that in the easiest, least technical way possible.
Install DeepSeek on Your Own Computer (Local Model)
This is the easiest and safest way to use DeepSeek so far. However, be prepared for it to be quite slow, especially with the bigger, more advanced models. The reason is that inference speed depends on GPU power, and your laptop or desktop has only one GPU, which is probably not the most powerful one. So if you want speed that isn’t annoying, you’ll probably have to settle for DeepSeek R1:8B (about 5 GB), which works fine on a 2022 MacBook Pro and on most modern desktop and laptop computers. If you want a bigger and more powerful model, you’ll probably need to install it on an external server, in which case you can skip directly to the next section.
Steps to install DeepSeek Locally:
1) Get the Ollama software — it’s the best open-source app for managing open-source LLMs. It works on Mac, Windows, and Linux.
2) Follow Ollama’s instructions to install the LLM model you want. Basically, you just need to open the Terminal and run “ollama run deepseek-r1:8b” (you can also run other models; see the full list on Ollama’s website and the Terminal sketch right after this list).
3) Once the model is downloaded and running, you can choose the interface you want to use to interact with it. The easiest way is to use the Terminal itself, but that may be too raw for most users. If you’re a technical user, you will notice that the model is also available via http://localhost:11434, which means other apps can talk to it as well. Here are some of the choices you have.
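If you want to check that everything is wired up before picking an interface, here is a minimal Terminal sketch, assuming the default Ollama port (11434) and the deepseek-r1:8b tag from step 2:

```bash
# download (first run only) and start an interactive chat with the model
ollama run deepseek-r1:8b

# in a second Terminal window: list the models Ollama currently serves
curl http://localhost:11434/api/tags

# send a one-off prompt to the local API instead of using the chat prompt
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:8b",
  "prompt": "Summarize what a knowledge graph is in two sentences.",
  "stream": false
}'
```

If the curl calls respond, any of the interfaces below will be able to talk to the same local server.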
DeepSeek Interface choices:
It mainly comes down to installing a ChatGPT-like interface that runs in your browser (more complicated, but lots of settings), using an existing tool like VSCode (the easiest install and greater control over the context), or using an external app that you can hook up to the localhost Ollama server.
a) My recommendation is to use VSCode (a popular open-source software development tool, which is basically a text editor on steroids with multiple extensions). The advantage is that you can open it in any folder, which automatically becomes the context for your model, and you can then start querying it directly on your text files. You will also be able to save your chats and control what stays and what doesn’t, which is pretty cool. If you decide to go for this option, install VSCode and then get the “Continue” extension, an open-source AI chatbot used for coding. Open it, and it will detect the models installed in Ollama; choose the DeepSeek model, and you can start using it straight away.
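If Continue does not pick up the model automatically, you can point it at Ollama yourself. As a rough sketch (the exact config location and schema depend on your Continue version, so treat this as an assumption and check the extension’s docs):

```bash
# Rough sketch: add a local DeepSeek entry to Continue's config file.
# This overwrites ~/.continue/config.json, so merge by hand if you
# already have other settings there.
cat > ~/.continue/config.json <<'EOF'
{
  "models": [
    {
      "title": "DeepSeek R1 8B (local)",
      "provider": "ollama",
      "model": "deepseek-r1:8b"
    }
  ]
}
EOF
```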

b) Another option is to install a ChatGPT-like interface called Open WebUI that you can open in your browser locally. It does require some experience with the Terminal, because the best way to install it is via Docker: you need to download Docker first, run it, and then use the Terminal to pull and start the Open WebUI container (see the example command below). Make sure to choose DeepSeek, and you can start using it straight away. It takes a bit of time, but you get very good controls and can choose the model’s parameters.
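For reference, the command the Open WebUI documentation suggests at the time of writing looks roughly like this; it assumes Ollama is already running on your machine on the default port:

```bash
# run Open WebUI in Docker and let it reach the Ollama server on the host
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main

# then open http://localhost:3000 in your browser and pick deepseek-r1:8b
```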

In both cases, you can use the InfraNodus graph visualization to get a direct visual representation of your LLM interaction (via VSCode extension or the Chrome browser plugin) and to be able to guide your model’s thinking process in a more creative direction as we describe in the video below.
c) Finally, you can also use an external app that lets you connect to your localhost Ollama server (or to your computer’s address if you access it from another device on your local network). I wouldn’t be too creative here: just download the Enchanted app listed on Ollama’s GitHub, as it’s open source and runs on your phone, Apple Vision Pro, or Mac. Then you just add the link to your Ollama DeepSeek server, and it’s ready to use. The problem here is that you have fewer controls than in the ChatGPT-like Open WebUI interface or VSCode (especially for specifying the context).
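One thing to keep in mind: by default Ollama only listens on localhost, so an app on your phone or Vision Pro won’t see it. A minimal sketch of how to expose it on your local network (assuming you are fine with every device on that network being able to reach the model):

```bash
# make Ollama listen on all network interfaces instead of just localhost
OLLAMA_HOST=0.0.0.0:11434 ollama serve

# in the external app, use your computer's local IP instead of localhost,
# e.g. http://192.168.1.23:11434 (the IP here is just an example;
# find yours in your network settings)
```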
Install DeepSeek on a Server
If you have a local version of DeepSeek running, you can access it on your local network. However, there are cases where you might want to make it available to the outside world. Hosting an LLM on an external server ensures that it works faster, because you have access to better GPUs and scaling. However, it comes at a price: the cheapest rate is about $0.50 per hour, which makes about $12 a day, or roughly $360 a month if it runs around the clock.
Option 1: 1-Click Server Setup for testing and non-technical users
The easiest way to do that is to deploy DeepSeek via Ollama on a server using Koyeb, a cloud service provider from France. The fastest one-click option is the Open WebUI deployment button on Koyeb, which includes both Ollama and the Open WebUI interface. Make sure to choose a GPU instance for faster inference (check the price first!). Once the service is deployed, you will get a public URL you can use to access it, create an admin account, and download DeepSeek R1:8b via the web interface.
However, this solution does not have persistent storage, which means that as soon as the service goes down, you lose all your settings and chats and have to download the model again.
For a more durable option, you can install Ollama separately via Koyeb on a GPU instance with one click and then Open WebUI with another (choose a cheap CPU instance for it, at about $10 a month). Then attach a storage volume to the Open WebUI service to make its data persistent.

The best setup, however, is to create a separate PostgreSQL database to store Open WebUI settings and chats and then use that database for storage. You can set this database up on a service of your choice or even on Koyeb itself. We created a deployment instruction at http://github.com/noduslabs/ollama-ui-docker-deploy which you can use. If you decide to go for this setup, you can even use your service in production, as your data will be persistent, which means you can share your deployment with other people within your organization and create and administer user accounts.
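The key piece of that setup is pointing Open WebUI at the database instead of its default local SQLite file. The exact steps are in the repository linked above, but as a rough sketch (the connection string and hostnames below are placeholders, and you should check the current Open WebUI docs for the exact variable format), Open WebUI reads a standard DATABASE_URL environment variable:

```bash
# example for a local Docker run; on Koyeb you set the same variables
# in the service's environment settings instead
docker run -d -p 3000:8080 \
  -e DATABASE_URL="postgresql://openwebui:YOUR_PASSWORD@your-db-host:5432/openwebui" \
  -e OLLAMA_BASE_URL="https://your-ollama-service.example.com" \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```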
Another advantage of having a PostgreSQL DB is that you can have the same chats and settings available to you on multiple deployments.
Important note: with any of these setups, you can stop your service at any time, so you can theoretically switch it off when you’re not using it to save money. However, if you don’t use persistent storage or PostgreSQL, all your settings will be lost. So it’s much better to use the PostgreSQL database, because then every time you restart your instance, your chats and settings come back. It’s also much easier to port this data somewhere else, even to your local machine: all you need to do is clone the DB, and you can use it anywhere (see the sketch below).
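Cloning the database is the standard PostgreSQL routine; a minimal sketch, with placeholder connection details you would replace with your own:

```bash
# dump the remote Open WebUI database to a file...
pg_dump "postgresql://openwebui:YOUR_PASSWORD@your-db-host:5432/openwebui" > openwebui-backup.sql

# ...and load it into a local (or any other) PostgreSQL instance
createdb openwebui
psql openwebui < openwebui-backup.sql
```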
Option 2: Technical setup for advanced users, SMEs, enterprises, and production use
For a more “serious” setup where you have a high degree of control, you can set up an AWS EC2 instance running Ollama with DeepSeek R1 and Open WebUI. You will need to create an account on AWS and request access to GPU instances, but you can then start building your own AI stack on top. The lowest price is also about $350 per month, but you get the highest level of flexibility here in terms of adding load balancing, deployment rules, private access rules, etc. This is probably the best choice for enterprises, because it provides the highest level of security and control. However, the setup is much more difficult and requires some knowledge of server operations.
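To give a rough idea of what the manual part looks like once the GPU instance is running (Ubuntu and Docker are assumed here, and you still need to handle security groups, HTTPS, and access rules yourself):

```bash
# install Ollama with its official install script and pull the model
curl -fsSL https://ollama.com/install.sh | sh
ollama pull deepseek-r1:8b   # or a larger DeepSeek tag if your GPU has enough VRAM

# run Open WebUI on top of it
docker run -d -p 80:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```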
…
Our experts at Nodus Labs can help you set up a private LLM instance on your servers and adjust all the necessary settings in order to enable local RAG for your private knowledge base. Please contact us if you need any help.