🔭 AI Tools Scout · Data updated every 6 hours

Best Open Source AI Inference Projects

95 inference projects ranked by GitHub stars, weekly growth, and maintenance health.

Project data last synced 13h ago.
# | Project | Category | Stars | Weekly | Trend | Health | Language | Updated
1
Ollama
Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
⚡ Inference | 171.2K stars | +450 weekly | health 95 | Go | 12h ago
2
Prompts.chat
f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.
⚡ Inference | 162.0K stars | 0 weekly | health 57 | HTML | 1d ago
3
llama.cpp
LLM inference in C/C++
⚡ Inference | 109.6K stars | +1.2K weekly | health 100 | C++ | 13h ago
4
vLLM
A high-throughput and memory-efficient inference and serving engine for LLMs
⚡ Inference | 79.7K stars | +606 weekly | health 93 | Python | 13h ago
5
Llm Course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
⚡ Inference | 79.2K stars | 0 weekly | health 30 | — | 3mo ago
6
Llamafactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
⚡ Inference | 71.1K stars | 0 weekly | health 56 | Python | 4d ago
7
Caveman
🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman
⚡ Inference | 58.3K stars | 0 weekly | health 64 | JavaScript | 1d ago
8
Trendradar
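⭐ AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts. 🎯 Say goodbye to information overload with an AI public-opinion monitoring assistant and trending-topic filter. Aggregates trending topics from multiple platforms plus RSS subscriptions, with precise keyword filtering. AI news filtering, AI translation, and AI analysis briefings are pushed straight to your phone; MCP integration is also supported, enabling natural-language conversational analysis, sentiment insight, and trend prediction. Supports Docker, with data self-hosted locally or in the cloud. Integrates smart push notifications via WeChat, Feishu, DingTalk, Telegram, email, ntfy, bark, slack, and other channels.
⚡ Inference | 57.3K stars | 0 weekly | health 39 | Python | 12d ago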
9
Context7
Context7 Platform -- Up-to-date code documentation for LLMs and AI code editors
⚡ Inference | 55.0K stars | 0 weekly | health 66 | TypeScript | 16h ago
10
Mempalace
The best-benchmarked open-source AI memory system. And it's free.
⚡ Inference | 52.0K stars | 0 weekly | health 80 | Python | 1d ago
11
Pi Mono
AI agent toolkit: coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, vLLM pods
⚡ Inference | 48.2K stars | 0 weekly | health 73 | TypeScript | 21h ago
12
LocalAI
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
⚡ Inference | 46.2K stars | +130 weekly | health 91 | Go | 14h ago
13
Milvus
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
⚡ Inference | 44.2K stars | 0 weekly | health 75 | Go | 13h ago
14
Kong
🦍 The API and AI Gateway
⚡ Inference | 43.4K stars | 0 weekly | health 40 | Lua | 1mo ago
15
Jan
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.
⚡ Inference | 42.5K stars | +88 weekly | health 80 | TypeScript | 22h ago
16
Lightrag
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
⚡ Inference | 35.0K stars | 0 weekly | health 80 | Python | 22h ago
17
Graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
⚡ Inference | 32.9K stars | 0 weekly | health 56 | Python | 14h ago
18
New Api
A unified AI model hub for aggregation & distribution. It supports cross-converting various LLMs into OpenAI-compatible, Claude-compatible, or Gemini-compatible formats. A centralized gateway for personal and enterprise model management. 🍥
⚡ Inference | 32.5K stars | 0 weekly | health 74 | Go | 1d ago
19
Self Llm
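"A Guide to Consuming Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment.
⚡ Inference | 30.4K stars | 0 weekly | health 37 | Jupyter Notebook | 17d ago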
20
Void
⚡ Inference | 28.7K stars | 0 weekly | health 36 | TypeScript | 4mo ago
21
Sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
⚡ Inference | 27.7K stars | 0 weekly | health 77 | Python | 13h ago
22
Gitleaks
Find secrets with Gitleaks 🔑
⚡ Inference | 26.8K stars | 0 weekly | health 33 | Go | 1mo ago
23
Awesome Generative Ai Guide
A one stop repository for generative AI research updates, interview resources, notebooks and much more!
⚡ Inference | 26.6K stars | 0 weekly | health 42 | HTML | 3d ago
24
Hands On Large Language Models
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
⚡ Inference | 26.2K stars | 0 weekly | health 32 | Jupyter Notebook | 18d ago
25
Llmfit
Hundreds of models & providers. One command to find what runs on your hardware.
⚡ Inference | 25.8K stars | 0 weekly | health 70 | Rust | 1d ago
26
Scrapegraph Ai
Python scraper based on AI
⚡ Inference | 25.0K stars | 0 weekly | health 60 | Python | 1d ago
27
llamafile
Distribute and run LLMs with a single file.
⚡ Inference | 24.4K stars | +44 weekly | health 65 | C++ | 7d ago
28
Llm Action
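This project aims to share the technical principles and hands-on experience behind large models (LLM engineering and putting LLM applications into production).
⚡ Inference | 24.3K stars | 0 weekly | health 30 | HTML | 1d ago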
29
MLC LLM
Universal LLM Deployment Engine with ML Compilation
⚡ Inference | 22.6K stars | +36 weekly | health 62 | Python | 14h ago
30
Awesome Chinese LLM
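A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.
⚡ Inference | 22.6K stars | 0 weekly | health 41 | — | 2d ago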
31
Unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
⚡ Inference | 22.1K stars | 0 weekly | health 43 | Python | 3mo ago
32
Skyvern
Automate browser based workflows with AI
⚡ Inference | 21.6K stars | 0 weekly | health 68 | Python | 14h ago
33
Datasets
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
⚡ Inference | 21.5K stars | 0 weekly | health 60 | Python | 21h ago
34
Free Llm Api Resources
A list of free LLM inference resources accessible via API.
⚡ Inference | 21.3K stars | 0 weekly | health 30 | Python | 2d ago
35
Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
⚡ Inference | 21.1K stars | 0 weekly | health 46 | Python | 2mo ago
36
Peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
⚡ Inference | 21.1K stars | 0 weekly | health 60 | Python | 1d ago
37
Heretic
Fully automatic censorship removal for language models
⚡ Inference | 20.8K stars | 0 weekly | health 54 | Python | 3d ago
38
Dyad
Local, open-source AI app builder for power users ✨ v0 / Lovable / Replit / Bolt alternative 🌟 Star if you like it!
⚡ Inference | 20.3K stars | 0 weekly | health 72 | TypeScript | 13h ago
39
Llama Cookbook
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family on various provider services.
⚡ Inference | 18.3K stars | 0 weekly | health 37 | Jupyter Notebook | 20d ago
40
Web Llm
High-performance In-browser LLM Inference Engine
⚡ Inference | 18.0K stars | 0 weekly | health 46 | TypeScript | 6d ago
41
Ml Engineering
Machine Learning Engineering Open Book
⚡ Inference | 17.9K stars | 0 weekly | health 36 | Python | 1mo ago
42
Airllm
AirLLM 70B inference with single 4GB GPU
⚡ Inference | 17.7K stars | 0 weekly | health 31 | Jupyter Notebook | 2mo ago
43
Qbot
[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ qbot-mini: https://github.com/Charmve/iQuant
⚡ Inference | 17.3K stars | 0 weekly | health 30 | Jupyter Notebook | 2mo ago
44
Code Review Graph
Local knowledge graph for Claude Code. Builds a persistent map of your codebase so Claude reads only what matters — 6.8× fewer tokens on reviews and up to 49× on daily coding tasks.
⚡ Inference | 16.1K stars | 0 weekly | health 77 | Python | 4d ago
45
RWKV LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding.
⚡ Inference | 14.5K stars | 0 weekly | health 23 | Python | 4d ago
46
Easy Dataset
A powerful tool for creating datasets for LLM fine-tuning, RAG, and Eval
⚡ Inference | 14.2K stars | 0 weekly | health 56 | JavaScript | 10d ago
47
Outlines
Structured Outputs
⚡ Inference | 13.8K stars | 0 weekly | health 52 | Python | 7d ago
48
Omlx
LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar
⚡ Inference | 13.6K stars | 0 weekly | health 78 | Python | 1d ago
49
Awesome Generative Ai
A curated list of modern Generative Artificial Intelligence projects and services
⚡ Inference | 12.0K stars | 0 weekly | health 43 | — | 6d ago
50
Tensorzero
TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.
⚡ Inference | 11.4K stars | 0 weekly | health 72 | Rust | 16h ago
51
Llm Engineer Toolkit
A curated list of 120+ LLM libraries category wise.
⚡ Inference | 10.4K stars | 0 weekly | health 39 | — | 1mo ago
52
Openvino
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
⚡ Inference | 10.2K stars | 0 weekly | health 72 | C++ | 13h ago
53
Unity Mcp
Unity MCP acts as a bridge, allowing AI assistants (like Claude, Cursor) to interact directly with your Unity Editor via a local MCP (Model Context Protocol) Client. Give your LLM tools to manage assets, control scenes, edit scripts, and automate tasks within Unity.
⚡ Inference | 9.5K stars | 0 weekly | health 75 | C# | 7d ago
54
Ipex Llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
⚡ Inference | 8.8K stars | 0 weekly | health 42 | Python | 3mo ago
55
Toonflow App
Toonflow is an open-source, all-in-one AI tool that quickly turns novels and scripts into animated short dramas. It integrates AI scriptwriting, smart storyboarding, and character and video generation in a lightweight cross-platform desktop app, helping creators produce visual content in bulk at low cost.
⚡ Inference | 7.8K stars | 0 weekly | health 71 | HTML | 3d ago
56
Prompt Master
A Claude skill that writes accurate prompts for any AI tool. Zero tokens or credits wasted. Full context and memory retention.
⚡ Inference | 7.4K stars | 0 weekly | health 42 | — | 8d ago
57
Transformer Explainer
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
⚡ Inference | 7.3K stars | 0 weekly | health 28 | JavaScript | 1mo ago
58
Local Deep Research
~95% on SimpleQA (e.g. Qwen3.6-27B on a 3090). Supports all local and cloud LLMs (llama.cpp, Ollama, Google, ...). 10+ search engines - arXiv, PubMed, your private documents. Everything Local & Encrypted.
⚡ Inference | 7.2K stars | 0 weekly | health 75 | Python | 14h ago
59
Openllmetry
Open-source observability for your GenAI or LLM application, based on OpenTelemetry
⚡ Inference | 7.1K stars | 0 weekly | health 51 | Python | 20h ago
60
Vespa
AI + Data, online. https://vespa.ai
⚡ Inference | 6.9K stars | 0 weekly | health 66 | Java | 16h ago
61
Llm Wiki
LLM Wiki is a cross-platform desktop application that automatically turns your documents into an organized, interlinked knowledge base. Instead of traditional RAG (retrieve-and-answer from scratch every time), the LLM incrementally builds and maintains a persistent wiki from your sources.
⚡ Inference | 6.9K stars | 0 weekly | health 65 | TypeScript | 1d ago
62
Learning
A log of things I'm learning
⚡ Inference | 6.9K stars | 0 weekly | health 30 | — | 9d ago
63
LTX 2
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.
⚡ Inference | 6.6K stars | 0 weekly | health 31 | Python | 23h ago
64
Firecrawl Mcp Server
🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
⚡ Inference | 6.3K stars | 0 weekly | health 37 | JavaScript | 4d ago
65
Sqlbot
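🔥 An intelligent data question-answering system based on LLMs and RAG; a powerful tool for conversational data analysis. Text-to-SQL Generation via LLMs using RAG.
⚡ Inference | 6.1K stars | 0 weekly | health 65 | JavaScript | 21h ago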
66
Pgai
A suite of tools to develop RAG, semantic search, and other AI applications more easily with PostgreSQL
⚡ Inference | 5.8K stars | 0 weekly | health 34 | PLpgSQL | 2mo ago
67
Taxhacker
Self-hosted AI accounting app. LLM analyzer for receipts, invoices, transactions with custom prompts and categories
⚡ Inference | 5.6K stars | 0 weekly | health 40 | TypeScript | 25d ago
68
Alignment Handbook
Robust recipes to align language models with human and AI preferences
⚡ Inference | 5.6K stars | 0 weekly | health 37 | Python | 1mo ago
69
Ultrarag
A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines
⚡ Inference | 5.5K stars | 0 weekly | health 43 | Python | 20h ago
70
Chronos Forecasting
Chronos: Pretrained Models for Time Series Forecasting
⚡ Inference | 5.3K stars | 0 weekly | health 41 | Python | 21d ago
71
5ire
5ire is a cross-platform desktop AI assistant and MCP client. It is compatible with major service providers and supports a local knowledge base and tools via Model Context Protocol servers.
⚡ Inference | 5.2K stars | 0 weekly | health 47 | TypeScript | 1mo ago
72
Sparrow
Structured data extraction and instruction calling with ML, LLM and Vision LLM
⚡ Inference | 5.2K stars | 0 weekly | health 43 | Python | 2d ago
73
Transformerlab App
The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU clusters.
⚡ Inference | 4.9K stars | 0 weekly | health 74 | Python | 14h ago
74
Bifrost
Fastest enterprise AI gateway (50x faster than LiteLLM) with adaptive load balancer, cluster mode, guardrails, 1000+ models support & <100 µs overhead at 5k RPS.
⚡ Inference | 4.8K stars | 0 weekly | health 74 | Go | 12h ago
75
Shimmy
⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
⚡ Inference | 4.8K stars | 0 weekly | health 44 | Rust | 1mo ago
76
Claude Obsidian
Claude + Obsidian knowledge companion. Persistent, compounding wiki vault based on Karpathy's LLM Wiki pattern. /wiki /save /autoresearch
⚡ Inference | 4.8K stars | 0 weekly | health 54 | Python | 18d ago
77
Mlx Vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
⚡ Inference | 4.7K stars | 0 weekly | health 68 | Python | 14h ago
78
Vllm Omni
A framework for efficient model inference with omni-modality models
⚡ Inference | 4.7K stars | 0 weekly | health 74 | Python | 19h ago
79
Llm Twin Course
🤖 Learn for free how to build an end-to-end production-ready LLM & RAG system using LLMOps best practices: ~ source code + 12 hands-on lessons
⚡ Inference | 4.3K stars | 0 weekly | health 28 | Python | 22d ago
80
LLM RL Visualized
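🌟 100+ original LLM/RL principle diagrams 📚, presented by the author of the book "大模型算法" ("Large Model Algorithms")! 💥 (100+ LLM/RL Algorithm Maps)
⚡ Inference | 4.3K stars | 0 weekly | health 34 | Python | 3d ago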
81
Spark Nlp
State of the Art Natural Language Processing
⚡ Inference | 4.1K stars | 0 weekly | health 48 | Scala | 2d ago
82
Lemonade
Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our discord: https://discord.gg/5xXzkMu8Zk
⚡ Inference | 3.9K stars | 0 weekly | health 69 | C++ | 19h ago
83
Scikit Llm
Seamlessly integrate LLMs into scikit-learn.
⚡ Inference | 3.5K stars | 0 weekly | health 29 | Python | 10d ago
84
Optimum
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
⚡ Inference | 3.4K stars | 0 weekly | health 42 | Python | 5d ago
85
Horizon
📡 Your own AI-powered news radar. Generates daily briefings in English & Chinese.
⚡ Inference | 3.4K stars | 0 weekly | health 54 | Python | 13h ago
86
Hallucination Leaderboard
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
⚡ Inference | 3.2K stars | 0 weekly | health 32 | Python | 18h ago
87
Landppt
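An LLM-based presentation generation platform that automatically converts document content into professional PPT presentations. The platform supports multiple AI models and offers a rich selection of templates and styles, enabling users to create high-quality presentations.
⚡ Inference | 3.2K stars | 0 weekly | health 38 | Python | 16d ago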
88
Xturing
Build, personalize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6
⚡ Inference | 2.7K stars | 0 weekly | health 47 | Python | 2mo ago
89
Aix DB
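Aix-DB is built on the LangChain/LangGraph framework and, combined with an MCP Skills multi-agent collaboration architecture, delivers end-to-end conversion from natural language to data insights.
⚡ Inference | 2.1K stars | 0 weekly | health 49 | JavaScript | 27d ago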
90
Rapid MLX
The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool parsers, prompt cache, reasoning separation, cloud routing. Drop-in OpenAI replacement. Works with Claude Code, Cursor, Aider.
⚡ Inference | 2.1K stars | 0 weekly | health 76 | Python | 14h ago
91
Lucebox Hub
Lucebox optimization hub: hand-tuned LLM inference, built for specific consumer hardware.
⚡ Inference | 1.9K stars | 0 weekly | health 63 | C++ | 17h ago
92
Detikzify
Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ.
⚡ Inference | 1.8K stars | 0 weekly | health 18 | Python | 3mo ago
93
Mindpipe
A powerful model compression framework for LLMs and LVLMs, adapted for NVIDIA GPUs and Huawei Ascend NPUs.
⚡ Inference | 1.0K stars | 0 weekly | health 43 | Python | 1d ago
94
Llm Internals
Learn LLM internals step by step - from tokenization to attention to inference optimization.
⚡ Inference | 978 stars | 0 weekly | health 21 | — | 1d ago
95
Vllm Studio
Control panel for VLLM, Sglang, llama.cpp, exllamav3
⚡ Inference | 908 stars | 0 weekly | health 45 | TypeScript | 1d ago
