Serve Ollama on embedded devices without pain
This repository provides tools and scripts for running Ollama on embedded systems and local networks with minimal setup. It includes solutions for network service configuration and SSH connection management with automatic key generation.
GitHub Repository: https://github.com/wronai/ollama/
# Download all scripts
wget https://raw.githubusercontent.com/wronai/ollama/main/ollama.sh
wget https://raw.githubusercontent.com/wronai/ollama/main/ssh.sh
wget https://raw.githubusercontent.com/wronai/ollama/main/monitor.sh
wget https://raw.githubusercontent.com/wronai/ollama/main/test.sh
# Make executable
chmod +x *.sh
# Start Ollama network service
./ollama.sh
# Connect to remote server with automatic SSH key setup
./ssh.sh root@192.168.1.100
# Monitor system performance in real-time
./monitor.sh
ollama.sh
Automatically configures Ollama to serve across your entire network, making AI models accessible from any device on your local network.
ssh.sh
Fixes common SSH authentication issues and automatically manages SSH keys.
monitor.sh
Advanced ASCII-based system monitoring tool with beautiful real-time visualizations.
ollama.sh
Main script for setting up the Ollama network service.
Key Features:
- Binds Ollama to 0.0.0.0 so models are reachable from the whole local network
- Managed as the ollama-network systemd service (status, logs, auto-start via systemctl)
- Custom port (-p), built-in tests (--test), API examples (--examples), and model installation (--install-model)
ssh.sh
SSH connection script with automatic key management.
Key Features:
- Generates an ed25519 key pair automatically if none exists
- Fixes common authentication issues when connecting to a host
monitor.sh
Real-time ASCII system monitoring tool.
Key Features:
- Live graphs with history for CPU, memory, temperature, and disk I/O
- Runs in any terminal; exit with Ctrl+C or 'q'
# Make executable
chmod +x ollama.sh
# Start with default settings (port 11434)
./ollama.sh
# Start on custom port
./ollama.sh -p 8081
# Test the service
./ollama.sh --test
# View API examples
./ollama.sh --examples
# Check service status
sudo systemctl status ollama-network
# Start/stop/restart service
sudo systemctl start ollama-network
sudo systemctl stop ollama-network
sudo systemctl restart ollama-network
# View logs
sudo journalctl -u ollama-network -f
# Enable/disable auto-start
sudo systemctl enable ollama-network
sudo systemctl disable ollama-network
Once running, your Ollama service will be accessible from any device on your network:
# Replace 192.168.1.100 with your server's IP
curl http://192.168.1.100:11434/api/tags
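To find the address to use in place of 192.168.1.100, you can list the server's own IPv4 addresses and pick the LAN one (standard Linux tools, nothing from this repository required):
```bash
# Print all addresses assigned to this machine
hostname -I
# Or show them per interface
ip -4 addr show
```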
| Command | Description |
|---|---|
| `./ollama.sh` | Start service with default settings |
| `./ollama.sh -p PORT` | Start on specific port |
| `./ollama.sh --stop` | Stop the service |
| `./ollama.sh --status` | Show service status |
| `./ollama.sh --test` | Run comprehensive tests |
| `./ollama.sh --examples` | Show API usage examples |
| `./ollama.sh --install-model` | Install DeepSeek Coder model |
| `./ollama.sh --logs` | View service logs |
# Make executable
chmod +x ssh.sh
# Connect to remote host (will generate keys if needed)
./ssh.sh root@192.168.1.100
# Connect to different host
./ssh.sh user@hostname.local
~/.ssh/id_ed25519     - Private key
~/.ssh/id_ed25519.pub - Public key
# Make executable
chmod +x monitor.sh
# Start monitoring
./monitor.sh
# Exit with Ctrl+C or press 'q'
⚡ CPU Usage: 25%
   Overall: █████░░░░░░░░░░░░░░░ [25%]
   History: ▂▃▄▅▄▃▂▃

💾 Memory Usage: 45% (3584MB / 7928MB)
   Usage:   █████████░░░░░░░░░░░ [45%]
   History: ▄▄▅▅▄▄▅▄

🌡️ Temperature: 42°C (38°C 41°C 42°C 39°C)
   Current: ████████░░░░░░░░░░░░ [42°C]
   History: ▃▄▅▄▃▄▄▃

💿 Disk I/O: Read: 15.2MB/s  Write: 8.7MB/s
   Read:    ██████░░░░░░░░░░░░░░ [15.2MB/s]
   Write:   ███░░░░░░░░░░░░░░░░░ [8.7MB/s]
   R.Hist:  ▃▄▅▆▄▃▄▅
   W.Hist:  ▂▃▄▃▂▃▃▂
Generate system load for testing the monitor:
# CPU stress test
stress --cpu 4 --timeout 30s
# Memory stress test
stress --vm 2 --vm-bytes 1G --timeout 30s
# Disk I/O test
dd if=/dev/zero of=/tmp/test bs=1M count=500
sudo hdparm -t /dev/mmcblk0
# Temperature monitoring during load
# Monitor will show real-time changes in all metrics
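If the stress tool is not installed, on Debian/Ubuntu-based systems it is available as the stress package (other distributions may package it under a different name):
```bash
sudo apt-get update && sudo apt-get install -y stress
```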
For Ollama Service:
curl -fsSL https://ollama.ai/install.sh | sh
For SSH Script: a standard OpenSSH client (ssh, ssh-keygen).
For System Monitor: a terminal with read access to /proc and /sys (used for CPU, thermal, and disk statistics).
# Download and setup everything
curl -fsSL https://raw.githubusercontent.com/wronai/ollama/main/install.sh | bash
wget https://raw.githubusercontent.com/wronai/ollama/main/ollama.sh
wget https://raw.githubusercontent.com/wronai/ollama/main/ssh.sh
wget https://raw.githubusercontent.com/wronai/ollama/main/monitor.sh
chmod +x ollama.sh ssh.sh monitor.sh
curl -fsSL https://ollama.ai/install.sh | sh
./ollama.sh
./ssh.sh user@hostname
./monitor.sh
### Alternative Download Methods
```bash
# Using curl
curl -O https://raw.githubusercontent.com/wronai/ollama/main/ollama.sh
curl -O https://raw.githubusercontent.com/wronai/ollama/main/ssh.sh
curl -O https://raw.githubusercontent.com/wronai/ollama/main/monitor.sh
# Clone entire repository
git clone https://github.com/wronai/ollama.git
cd ollama
chmod +x *.sh
```
curl -s http://192.168.1.100:11434/api/tags | jq '.models[].name'
curl -X POST http://192.168.1.100:11434/api/generate \
-H 'Content-Type: application/json' \
-d '{
"model": "deepseek-coder:1.3b",
"prompt": "Write a Python function to sort a list",
"stream": false
}'
curl -X POST http://192.168.1.100:11434/api/chat \
-H 'Content-Type: application/json' \
-d '{
"model": "deepseek-coder:1.3b",
"messages": [
{"role": "user", "content": "Explain recursion in programming"}
],
"stream": false
}'
curl -X POST http://192.168.1.100:11434/api/pull \
-H 'Content-Type: application/json' \
-d '{"name": "mistral:latest"}'
curl -X POST http://192.168.1.100:11434/api/generate \
-H 'Content-Type: application/json' \
-d '{
"model": "deepseek-coder:1.3b",
"prompt": "Create a REST API in Python",
"stream": true
}'
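With "stream": true the API returns one JSON object per line. A minimal way to read only the generated text from the shell, assuming jq is installed (it is already used in the model-listing example above):
```bash
# -N disables curl buffering; jq -rj prints each chunk's "response" field as it arrives
curl -sN -X POST http://192.168.1.100:11434/api/generate \
  -H 'Content-Type: application/json' \
  -d '{"model": "deepseek-coder:1.3b", "prompt": "Create a REST API in Python", "stream": true}' \
  | jq -rj '.response'
echo    # final newline
```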
import requests
import json

class OllamaClient:
    def __init__(self, host="192.168.1.100", port=11434):
        self.base_url = f"http://{host}:{port}"

    def generate(self, model, prompt, stream=False):
        response = requests.post(
            f"{self.base_url}/api/generate",
            json={
                "model": model,
                "prompt": prompt,
                "stream": stream
            }
        )
        return response.json()

    def chat(self, model, messages, stream=False):
        response = requests.post(
            f"{self.base_url}/api/chat",
            json={
                "model": model,
                "messages": messages,
                "stream": stream
            }
        )
        return response.json()

    def list_models(self):
        response = requests.get(f"{self.base_url}/api/tags")
        return response.json()

# Usage
client = OllamaClient()

# Generate code
result = client.generate(
    "deepseek-coder:1.3b",
    "Write a function to calculate fibonacci numbers"
)
print(result['response'])

# Chat
messages = [{"role": "user", "content": "What is machine learning?"}]
result = client.chat("deepseek-coder:1.3b", messages)
print(result['message']['content'])
# Connect to server (first time - will setup keys)
./ssh.sh root@192.168.1.100
# Connect to different user
./ssh.sh admin@server.local
# Connect to server with non-standard port (manual)
ssh -p 2222 -i ~/.ssh/id_ed25519 root@192.168.1.100
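If you prefer to set up the keys manually (or ssh.sh is unavailable), the equivalent steps are roughly the following sketch; the key path matches what ssh.sh uses, and the host is just the example address:
```bash
# Generate an ed25519 key pair if one does not exist yet
[ -f ~/.ssh/id_ed25519 ] || ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -N ""
# Copy the public key to the remote host (prompts for the password once)
ssh-copy-id -i ~/.ssh/id_ed25519.pub root@192.168.1.100
# Subsequent logins use the key
ssh -i ~/.ssh/id_ed25519 root@192.168.1.100
```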
| Endpoint | Method | Description |
|---|---|---|
| `/api/tags` | GET | List available models |
| `/api/generate` | POST | Generate text completion |
| `/api/chat` | POST | Chat conversation |
| `/api/pull` | POST | Download model |
| `/api/push` | POST | Upload model |
| `/api/show` | POST | Show model information |
| `/api/copy` | POST | Copy model |
| `/api/delete` | DELETE | Delete model |
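The endpoints not demonstrated elsewhere in this README follow the same pattern; for example, model details and deletion. The payload field name mirrors the pull example above; consult your Ollama version's API reference if it differs:
```bash
# Show information about an installed model
curl -X POST http://192.168.1.100:11434/api/show \
  -H 'Content-Type: application/json' \
  -d '{"name": "deepseek-coder:1.3b"}'

# Delete a model
curl -X DELETE http://192.168.1.100:11434/api/delete \
  -H 'Content-Type: application/json' \
  -d '{"name": "deepseek-coder:1.3b"}'
```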
Request:
{
"model": "deepseek-coder:1.3b",
"prompt": "Write hello world in Python",
"stream": false,
"options": {
"temperature": 0.7,
"top_p": 0.9
}
}
Response:
{
"model": "deepseek-coder:1.3b",
"created_at": "2025-06-02T08:42:07.311663549Z",
"response": "print(\"Hello, World!\")",
"done": true,
"context": [...],
"total_duration": 1234567890,
"load_duration": 1234567,
"prompt_eval_duration": 1234567,
"eval_count": 10,
"eval_duration": 1234567890
}
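To send this request from the shell, pass the JSON body directly to curl (same address and model as in the earlier examples):
```bash
curl -X POST http://192.168.1.100:11434/api/generate \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-coder:1.3b",
    "prompt": "Write hello world in Python",
    "stream": false,
    "options": {"temperature": 0.7, "top_p": 0.9}
  }'
```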
# Check if port is in use
ss -tlnp | grep :11434
# Check service logs
sudo journalctl -u ollama-network -f
# Restart service
sudo systemctl restart ollama-network
# Check firewall
sudo ufw status
# Test local connection first
curl http://localhost:11434/api/tags
# Check firewall rules
sudo ufw allow 11434/tcp
# Check if service binds to all interfaces
ss -tlnp | grep :11434
# Should show 0.0.0.0:11434, not 127.0.0.1:11434
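If the service is still bound to 127.0.0.1 only, one workaround is to re-run ./ollama.sh, or to start Ollama manually with an explicit bind address (the same pattern used by the quick-fix snippet near the end of this README):
```bash
# Stop any running instance, then bind to all interfaces on the default port
sudo pkill -f ollama || true
OLLAMA_HOST=0.0.0.0:11434 OLLAMA_ORIGINS=* ollama serve &
```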
# Clear SSH agent
ssh-add -D
# Remove old host keys
ssh-keygen -R hostname
# Try with password only
ssh -o IdentitiesOnly=yes -o PreferredAuthentications=password user@host
# Debug connection
ssh -v user@host
# Check disk space
df -h
# Check internet connection
curl -I https://ollama.ai
# Try smaller model
ollama pull tinyllama
# Check Ollama logs
tail -f /tmp/ollama.log
# Increase model context (if you have RAM)
export OLLAMA_CONTEXT_LENGTH=8192
# Enable GPU acceleration (if available)
export OLLAMA_INTEL_GPU=1
# Adjust concurrent requests
export OLLAMA_MAX_QUEUE=512
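Exported variables only affect an Ollama instance started from that shell. If Ollama runs under the systemd unit set up by ollama.sh, a drop-in override is one way to make the settings persistent; a sketch, assuming the unit is named ollama-network as above:
```bash
# Open (or create) a drop-in override for the service
sudo systemctl edit ollama-network
# Add lines such as:
#   [Service]
#   Environment=OLLAMA_CONTEXT_LENGTH=8192
#   Environment=OLLAMA_MAX_QUEUE=512
# Then apply the change
sudo systemctl daemon-reload
sudo systemctl restart ollama-network
```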
# CPU and memory usage
htop
# GPU usage (if applicable)
nvidia-smi
# Disk usage
df -h ~/.ollama/models/
This repository includes comprehensive support for RK3588's hardware acceleration capabilities, including both the Mali-G610 GPU and the Neural Processing Unit (NPU).
# Install GPU drivers and OpenCL
./rkgpu.sh
# Install NPU drivers and tools
./rknpu.sh
# Test GPU functionality
./testgpu.sh
# Test NPU functionality
./testnpu.sh
# Enable GPU acceleration
OLLAMA_GPU=1 ollama serve
# Enable NPU acceleration (experimental)
OLLAMA_NPU=1 ollama serve
For detailed documentation, see RK3588 Documentation and Testing Guide.
ollama run qwen3:4b
OLLAMA_NPU=1 ollama serve
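Before running the dedicated test scripts below, a quick manual sanity check can confirm the kernel sees the accelerators at all; driver and device names vary between vendor kernels, so treat this only as a rough check (clinfo must be installed separately):
```bash
# Look for GPU / NPU driver messages in the kernel log
sudo dmesg | grep -iE 'mali|rknpu' | tail -20
# List OpenCL platforms visible to user space
clinfo | head -20
```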
We provide comprehensive testing tools to verify your RK3588 hardware acceleration setup:
testgpu.sh
# Run all GPU tests
./testgpu.sh
# Run specific test category
./testgpu.sh --category opencl
testnpu.sh
# Run all NPU tests
./testnpu.sh
# Test specific model
./testnpu.sh --model path/to/model.rknn
For detailed testing documentation, see Test.md.
sudo journalctl -u ollama-network -f
./ollama.sh --test
./ollama.sh --examples
sudo systemctl status ollama-network
# Check terminal capabilities
echo $TERM
tput cols; tput lines
# Test system access
cat /proc/stat | head -5
ls /sys/class/thermal/
cat /proc/diskstats | head -5
# Complete system status check
./ollama.sh --test
./monitor.sh & # Start monitor in background
# View all running services
systemctl list-units --type=service --state=running | grep ollama
# Check network configuration for Ollama
ip addr show
ss -tlnp | grep ollama
# Monitor real-time logs
tail -f /var/log/syslog | grep ollama
# System performance baseline
# Terminal 1: Start monitoring
./monitor.sh
# Terminal 2: Generate test load
stress --cpu 2 --vm 1 --vm-bytes 500M --timeout 60s
sudo hdparm -t /dev/mmcblk0
Check service status: sudo systemctl status ollama-network.service
bash ollm.sh
# Stop any existing Ollama processes
sudo pkill -f ollama || true
# Start Ollama with network access on port 8081
OLLAMA_HOST=0.0.0.0:8081 OLLAMA_ORIGINS=* ollama serve &
# Allow through firewall
sudo ufw allow 8081/tcp || true
# Wait a moment, then test
sleep 5
curl -s http://localhost:8081/api/tags
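Once the local test succeeds, verify access from another machine on the LAN, replacing the IP with your server's address as in the earlier examples:
```bash
curl http://192.168.1.100:8081/api/tags
```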
Contributions welcome! Please feel free to submit issues and pull requests to the GitHub repository.
# Create a feature branch
git checkout -b feature-name
This project is open source. Feel free to modify and distribute.
Repository: https://github.com/wronai/ollama/
Happy coding with Ollama! 🚀🤖