open source · free for all
Oskis — Open Source Skilskis. Free for all.
Oskis are community-contributed AI skills, free to run via the skil.ski public MCP endpoint. 127 are indexed across 6 upstream sources. Attribution is required; there is no paywall and no lock-in.
install · 30 seconds
Get all 127 Oskis into your agent
01
Sign up free
Create a free skil.ski account. No card. Oskis ship with every account.
02
Copy your MCP token
Settings → MCP → Generate token. The free tier scope is the Oski catalog.
03
Add the server
Paste the URL + token into Claude Desktop, Cursor, ChatGPT Custom GPT, Codex, or any MCP client.
Free MCP endpoint
https://skil.ski/api/mcp/free
Free tier · scoped to the Oski catalog. Pro and Elite endpoints unlock the full skill registry.
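A minimal sketch of generating a client config entry for this endpoint, assuming your MCP client accepts an `mcpServers` map with `url` and `headers` keys, as in the page's example config. The function name `build_mcp_config` is hypothetical and not part of any skil.ski tooling; only the endpoint URL, the server name, and the Bearer header come from this page.

```python
import json

# Endpoint URL as given on this page.
FREE_ENDPOINT = "https://skil.ski/api/mcp/free"


def build_mcp_config(token: str, server_name: str = "skil-ski-oskis") -> dict:
    """Return an mcpServers entry that sends the token as a Bearer header.

    Hypothetical helper for illustration; the key layout is an assumption
    based on the example config shown on this page.
    """
    # Refuse the literal placeholder so a broken config is never written out.
    if not token or token == "<token>":
        raise ValueError("replace the placeholder with a real MCP token")
    return {
        "mcpServers": {
            server_name: {
                "url": FREE_ENDPOINT,
                "headers": {"Authorization": f"Bearer {token}"},
            }
        }
    }


def render_mcp_config(token: str) -> str:
    """Serialize the config as indented JSON, ready to paste into a client."""
    return json.dumps(build_mcp_config(token), indent=2)
```

Calling `render_mcp_config(...)` with the token from Settings → MCP yields JSON in the same shape as the example config. Keeping the token out of the source file (for instance, reading it from an environment variable of your choosing) avoids committing it by accident.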
example config
{
  "mcpServers": {
    "skil-ski-oskis": {
      "url": "https://skil.ski/api/mcp/free",
      "headers": {
        "Authorization": "Bearer <token>"
      }
    }
  }
}
browse oskis
20 Orchestra Oskis
Autoresearch
Orchestra · Autoresearch Skill
Orchestrates end-to-end autonomous AI research projects using a two-loop architecture. The inner loop runs rapid experiment iterations with clear optimization targets. The outer loop synthesizes results, identifies patterns, and steers research direction. Routes to domain-specific skills for execution, supports continuous agent operation via Claude Code /loop and OpenClaw heartbeat, and produces r
MIT · Get via MCP →
Implementing Llms Litgpt
Orchestra · Model Architecture
Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, an educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.
MIT · Get via MCP →
Mamba Architecture
Orchestra · Model Architecture
State-space model with O(n) complexity vs Transformers' O(n²). 5× faster inference, million-token sequences, no KV cache. Selective SSM with hardware-aware design. Mamba-1 (d_state=16) and Mamba-2 (d_state=128, multi-head). Models 130M-2.8B on HuggingFace.
MIT · Get via MCP →
Nanogpt
Orchestra · Model Architecture
Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for learning transformers. By Andrej Karpathy. Perfect for understanding GPT architecture from scratch. Train on Shakespeare (CPU) or OpenWebText (multi-GPU).
MIT · Get via MCP →
Rwkv Architecture
Orchestra · Model Architecture
RNN+Transformer hybrid with O(n) inference. Linear time, infinite context, no KV cache. Train like GPT (parallel), infer like RNN (sequential). Linux Foundation AI project. Used in production in Windows, Office, and NeMo. RWKV-7 (March 2025). Models up to 14B parameters.
MIT · Get via MCP →
Distributed Llm Pretraining Torchtitan
Orchestra · Model Architecture
Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and distributed checkpointing.
MIT · Get via MCP →
Huggingface Tokenizers
Orchestra · Tokenization
Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track alignments, handle padding/truncation. Integrates seamlessly with transformers. Use when you need high-performance tokenization or custom tokenizer training.
MIT · Get via MCP →
Sentencepiece
Orchestra · Tokenization
Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory), deterministic vocabulary. Used by T5, ALBERT, XLNet, mBART. Train on raw text without pre-tokenization. Use when you need multilingual support, CJK languages, or reproducible tokenization.
MIT · Get via MCP →
Axolotl
Orchestra · Fine Tuning
Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support
MIT · Get via MCP →
Llama Factory
Orchestra · Fine Tuning
Expert guidance for fine-tuning LLMs with LLaMA-Factory - WebUI no-code, 100+ models, 2/3/4/5/6/8-bit QLoRA, multimodal support
MIT · Get via MCP →
Peft Fine Tuning
Orchestra · Fine Tuning
Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with minimal accuracy loss, or for multi-adapter serving. HuggingFace's official library integrated with transformers ecosystem.
MIT · Get via MCP →
Unsloth
Orchestra · Fine Tuning
Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization
MIT · Get via MCP →
Nnsight Remote Interpretability
Orchestra · Mechanistic Interpretability
Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when you need to run interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.
MIT · Get via MCP →
Pyvene Interventions
Orchestra · Mechanistic Interpretability
Provides guidance for performing causal interventions on PyTorch models using pyvene's declarative intervention framework. Use when conducting causal tracing, activation patching, interchange intervention training, or testing causal hypotheses about model behavior.
MIT · Get via MCP →
Sparse Autoencoder Training
Orchestra · Mechanistic Interpretability
Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.
MIT · Get via MCP →
Transformer Lens Interpretability
Orchestra · Mechanistic Interpretability
Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints and activation caching. Use when reverse-engineering model algorithms, studying attention patterns, or performing activation patching experiments.
MIT · Get via MCP →
Nemo Curator
Orchestra · Data Processing
GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora.
MIT · Get via MCP →
Ray Data
Orchestra · Data Processing
Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine to 100s of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.
MIT · Get via MCP →
Grpo Rl Training
Orchestra · Post Training
Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training
MIT · Get via MCP →
Miles Rl Training
Orchestra · Post Training
Provides guidance for enterprise-grade RL training using miles, a production-ready fork of slime. Use when training large MoE models with FP8/INT4, needing train-inference alignment, or requiring speculative RL for maximum throughput.
MIT · Get via MCP →
open contribution
Want to publish an Oski?
Oskis are open-source Skilskis — contribute yours via GitHub. Include a SKILL.md, a license, and a source attribution and we will list it here.
Submit a PR →
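Before opening a PR, a local check along these lines can confirm the three required pieces are present. This is only a sketch under stated assumptions: the page names SKILL.md but does not say where the license or attribution must live, so the `LICENSE*` filename pattern and the `Source:` line inside SKILL.md are hypothetical conventions, as is the function name `check_oski_dir`.

```python
from pathlib import Path


def check_oski_dir(skill_dir: str) -> list[str]:
    """Return a list of missing items; an empty list means the basics are present."""
    root = Path(skill_dir)
    problems = []

    # SKILL.md is the one filename the page states explicitly.
    skill_md = root / "SKILL.md"
    if not skill_md.is_file():
        problems.append("missing SKILL.md")

    # Assumption: the license is a top-level file whose name starts with LICENSE.
    if not any(p.is_file() for p in root.glob("LICENSE*")):
        problems.append("missing license file")

    # Assumption: attribution is a line in SKILL.md starting with "Source:".
    text = skill_md.read_text(encoding="utf-8") if skill_md.is_file() else ""
    has_source = any(
        line.strip().lower().startswith("source:") for line in text.splitlines()
    )
    if not has_source:
        problems.append("missing source attribution")

    return problems
```

Running `check_oski_dir("path/to/my-skill")` returns an empty list when all three checks pass. Adjust the license and attribution checks to whatever the contribution guidelines in the GitHub repository actually specify.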