local llm 101

set up MLX

ask ChatGPT/Gemini for the install steps
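
If you'd rather skip the chatbot, a minimal sketch (assuming Python 3 on an Apple Silicon Mac) is installing the mlx-lm package, which provides mlx_lm.server:

# install the mlx-lm package from PyPI
pip install mlx-lm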

run a model via MLX in the terminal

run the model via MLX (if the model isn't in the local cache yet, it gets downloaded first)

hardcore: a 14B model on 24 GB of (unified) RAM

python -m mlx_lm.server --model mlx-community/Qwen2.5-Coder-14B-Instruct-4bit

baby model 7B

python -m mlx_lm.server --model mlx-community/Qwen2.5-7B-Instruct-4bit
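
Once the server is up, you can sanity-check it with curl; this sketch assumes the standard OpenAI-compatible chat endpoint on the default port 8080:

curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "say hi"}], "max_tokens": 32}'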


💡 An alternative is to run the MLX model via the LM Studio app on the Mac, but then the opencode config looks a bit different... I haven't tested that end to end.

install opencode
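
One install path (an assumption - check the opencode docs for the current recommended one) is the npm package:

npm install -g opencode-ai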

opencode config (~/.config/opencode/opencode.json)

{ "$schema": "https://opencode.ai/config.json", "provider": { "local-mlx": { "npm": "@ai-sdk/openai-compatible", "name": "MLX Local", "options": { "baseURL": "http://127.0.0.1:8080/v1" }, "models": { "mlx-community/Qwen2.5-14B-Instruct-4bit": { "name": "Qwen 2.5 Coder 14B", "quantization": "4bit", "max_context": 4096, "cache_dir": "~/.cache/huggingface/hub", "tools": true, "parameters": { "stop": [ "" ], "format": "json" }, "reasoning": true, "tool_call": true }, "mlx-community/Qwen2.5-7B-Instruct-4bit": { "name": "Qwen 2.5 Coder 7B", "quantization": "4bit", "max_context": 4096, "cache_dir": "~/.cache/huggingface/hub", "tools": true, "parameters": { "stop": [ "" ], "format": "json" }, "reasoning": true, "tool_call": true } } } } }

run opencode
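
For example, from the project directory ("my-rails-app" is a hypothetical name):

cd my-rails-app
opencode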

pick a model: ctrl+p, then "switch model"

Tab switches the agent between build and plan mode

build mode: in the current directory I have an initialized Ruby on Rails app skeleton... (it's like PHP - Ruby is a dynamic language, so the source files are right there, with the .rb extension); see the sketch below if you need a skeleton to test against
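
A standard way to generate such a skeleton (assumes a working Ruby install; "my-rails-app" is the same hypothetical name as above):

gem install rails
rails new my-rails-app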

prompt: find files with the .rb extension in the current directory and subdirectories that contain the word "ActiveRecord"
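
The agent answers this with tool calls; the result should match roughly what a plain grep would find:

# roughly what the agent's search is doing
grep -rl --include="*.rb" "ActiveRecord" .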

the first request to the model also sends the context, roughly 10,000 tokens, so be patient

prompt: "display content of application_job.rb"

here you can see the prompt is no longer hell in terms of token counts - the calls aren't just my own prompt (tokens) anymore, but also additional automatic requests triggered by the tools