Local AI copilot on VSCode
A few people have asked me about my local AI environment, so I thought I’d document it here. Setup subject to change.
Ollama with AMD Radeon GPU (ROCm) on Docker
I like to run services on Docker with Compose files. Here's my compose.yaml for ollama-rocm, which provides the AI backend for VSCode and other applications:
```yaml
version: '3'
services:
  ollama:
    image: ollama/ollama:0.1.27-rocm
    container_name: ollama
    restart: unless-stopped
    volumes:
      - ${HOME}/.ollama:/root/.ollama
    group_add:
      - video
    # AMD
    devices:
      - /dev/kfd
      - /dev/dri
    environment:
      - ROCR_VISIBLE_DEVICES=0
    # NVIDIA
    # devices:
    #   - /dev/nvidia0
    # environment:
    #   - NVIDIA_VISIBLE_DEVICES=all
    ports:
      - 11434:11434
    cap_add:
      - SYS_PTRACE
    security_opt:
      - seccomp:unconfined
    ipc: host
```
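With the compose file in the current directory, bringing the service up and pulling the models looks roughly like this (model names match the Continue config further down; adjust to taste):

```shell
# Start the Ollama container in the background
docker compose up -d

# Pull models into the volume mounted at ~/.ollama
docker exec -it ollama ollama pull codellama:34b
docker exec -it ollama ollama pull stable-code
```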
A local ollama CLI installation can communicate with the running Docker container. I've left the service port open to my local area network, but it can be bound to localhost for added security.
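For the localhost-only variant, only the port mapping in the compose file changes, binding the published port to the loopback address:

```yaml
    ports:
      - "127.0.0.1:11434:11434"
```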
VSCodium extension for local AI: Continue
There are many extensions for local AI. I tried out a few and settled on Continue. The extension is available for VSCode (or VSCodium) and JetBrains.
Continue can generate or refactor your code based on your input, letting you accept or cancel changes from a line-by-line diff. You can also ask the AI questions and reference files or code selections in the discussion. More information is available on their website, continue.dev
Continue lets you configure multiple language models side by side. I've set it up to use ollama by default, with a small model for autocomplete and a larger model for discussions, code refactoring and iteration. I also have GPT-4 available through the OpenAI API, if I want to pay for some extra power.
The different models are available in the drop-down menu of Continue, so it’s quick to switch.
AI Models
My current choice is to run CodeLlama 34B as the main AI, along with stable-code 3B for autocompletion. Here’s the relevant config for Continue:
"tabAutocompleteModel": {
"title": "Tab Autocomplete stable-code-3b",
"provider": "ollama",
"model": "stable-code",
"apiBase": "http://127.0.0.1:11434"
},
"models": [
{
"title": "CodeLlama 34B",
"provider": "ollama",
"model": "codellama:34b",
"apiBase": "http://127.0.0.1:11434"
},
{
"title": "stable-code 7B",
"provider": "ollama",
"model": "stable-code",
"apiBase": "http://127.0.0.1:11434"
},
{
"title": "OpenAI GPT-4-turbo",
"provider": "openai",
"apiKey": "sk-XXXX",
"model": "gpt-4-turbo-preview"
},
{
"title": "OpenAI GPT-4 (full)",
"provider": "openai",
"apiKey": "sk-XXXX",
"model": "gpt-4"
}
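Both ollama entries point at the same apiBase, so the backend can also be sanity-checked outside the editor. Ollama's /api/generate endpoint streams its reply as newline-delimited JSON objects, each carrying a `response` fragment. A minimal parsing sketch in Python (the sample stream below is illustrative, not real model output):

```python
import json

def join_stream(ndjson_text: str) -> str:
    """Concatenate the 'response' fragments from an Ollama NDJSON stream."""
    parts = []
    for line in ndjson_text.strip().splitlines():
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Illustrative stream, shaped like Ollama's /api/generate output
sample = "\n".join([
    '{"model":"codellama:34b","response":"def ","done":false}',
    '{"model":"codellama:34b","response":"add(a, b):","done":false}',
    '{"model":"codellama:34b","response":"","done":true}',
])
print(join_stream(sample))  # def add(a, b):
```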