| 1 |  Ollama Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models. | ⚡ Inference | 171.2K | +450 | | 95 | Go | 12h ago |
| 2 |  Prompts.chat f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy. | ⚡ Inference | 162.0K | 0 | | 57 | HTML | 1d ago |
| 3 |  llama.cpp LLM inference in C/C++ | ⚡ Inference | 109.6K | +1.2K | | 100 | C++ | 13h ago |
| 4 |  vLLM A high-throughput and memory-efficient inference and serving engine for LLMs | ⚡ Inference | 79.7K | +606 | | 93 | Python | 13h ago |
| 5 |  Llm Course Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. | ⚡ Inference | 79.2K | 0 | | 30 | — | 3mo ago |
| 6 |  Llamafactory Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) | ⚡ Inference | 71.1K | 0 | | 56 | Python | 4d ago |
| 7 |  Caveman 🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman | ⚡ Inference | 58.3K | 0 | | 64 | JavaScript | 1d ago |
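| 8 |  Trendradar ⭐AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts.🎯 Say goodbye to information overload: your AI public-opinion monitoring assistant and trending-topic filter. Aggregates trending topics from multiple platforms plus RSS subscriptions, with precise keyword filtering. AI news filtering, AI translation, and AI analysis briefings pushed straight to your phone; also supports the MCP architecture for AI natural-language conversational analysis, sentiment insight, and trend prediction. Supports Docker, with data self-hosted locally or in the cloud. Integrates smart push notifications via WeChat, Feishu, DingTalk, Telegram, email, ntfy, bark, and Slack. | ⚡ Inference | 57.3K | 0 | | 39 | Python | 12d ago |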
| 9 |  Context7 Context7 Platform -- Up-to-date code documentation for LLMs and AI code editors | ⚡ Inference | 55.0K | 0 | | 66 | TypeScript | 16h ago |
| 10 |  Mempalace The best-benchmarked open-source AI memory system. And it's free. | ⚡ Inference | 52.0K | 0 | | 80 | Python | 1d ago |
| 11 |  Pi Mono AI agent toolkit: coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, vLLM pods | ⚡ Inference | 48.2K | 0 | | 73 | TypeScript | 21h ago |
| 12 |  LocalAI LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required. | ⚡ Inference | 46.2K | +130 | | 91 | Go | 14h ago |
| 13 |  Milvus Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search | ⚡ Inference | 44.2K | 0 | | 75 | Go | 13h ago |
| 14 |  Kong 🦍 The API and AI Gateway | ⚡ Inference | 43.4K | 0 | | 40 | Lua | 1mo ago |
| 15 |  Jan Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. | ⚡ Inference | 42.5K | +88 | | 80 | TypeScript | 22h ago |
| 16 |  Lightrag [EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation" | ⚡ Inference | 35.0K | 0 | | 80 | Python | 22h ago |
| 17 |  Graphrag A modular graph-based Retrieval-Augmented Generation (RAG) system | ⚡ Inference | 32.9K | 0 | | 56 | Python | 14h ago |
| 18 |  New Api A unified AI model hub for aggregation & distribution. It supports cross-converting various LLMs into OpenAI-compatible, Claude-compatible, or Gemini-compatible formats. A centralized gateway for personal and enterprise model management. 🍥 | ⚡ Inference | 32.5K | 0 | | 74 | Go | 1d ago |
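| 19 |  Self Llm "A Guide to Consuming Open-Source LLMs": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment. | ⚡ Inference | 30.4K | 0 | | 37 | Jupyter Notebook | 17d ago |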
| 20 | | ⚡ Inference | 28.7K | 0 | | 36 | TypeScript | 4mo ago |
| 21 |  Sglang SGLang is a high-performance serving framework for large language models and multimodal models. | ⚡ Inference | 27.7K | 0 | | 77 | Python | 13h ago |
| 22 |  Gitleaks Find secrets with Gitleaks 🔑 | ⚡ Inference | 26.8K | 0 | | 33 | Go | 1mo ago |
| 23 |  Awesome Generative Ai Guide A one stop repository for generative AI research updates, interview resources, notebooks and much more! | ⚡ Inference | 26.6K | 0 | | 42 | HTML | 3d ago |
| 24 |  Hands On Large Language Models Official code repo for the O'Reilly Book - "Hands-On Large Language Models" | ⚡ Inference | 26.2K | 0 | | 32 | Jupyter Notebook | 18d ago |
| 25 |  Llmfit Hundreds of models & providers. One command to find what runs on your hardware. | ⚡ Inference | 25.8K | 0 | | 70 | Rust | 1d ago |
| 26 |  Scrapegraph Ai Python scraper based on AI | ⚡ Inference | 25.0K | 0 | | 60 | Python | 1d ago |
| 27 |  llamafile Distribute and run LLMs with a single file. | ⚡ Inference | 24.4K | +44 | | 65 | C++ | 7d ago |
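| 28 |  Llm Action This project shares the technical principles behind large models along with hands-on experience (LLM engineering and deploying LLM applications in production). | ⚡ Inference | 24.3K | 0 | | 30 | HTML | 1d ago |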
| 29 |  MLC LLM Universal LLM Deployment Engine with ML Compilation | ⚡ Inference | 22.6K | +36 | | 62 | Python | 14h ago |
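| 30 |  Awesome Chinese LLM A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials. | ⚡ Inference | 22.6K | 0 | | 41 | — | 2d ago |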
| 31 |  Unilm Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities | ⚡ Inference | 22.1K | 0 | | 43 | Python | 3mo ago |
| 32 |  Skyvern Automate browser based workflows with AI | ⚡ Inference | 21.6K | 0 | | 68 | Python | 14h ago |
| 33 |  Datasets 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools | ⚡ Inference | 21.5K | 0 | | 60 | Python | 21h ago |
| 34 |  Free Llm Api Resources A list of free LLM inference resources accessible via API. | ⚡ Inference | 21.3K | 0 | | 30 | Python | 2d ago |
| 35 |  Qwen The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud. | ⚡ Inference | 21.1K | 0 | | 46 | Python | 2mo ago |
| 36 |  Peft 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. | ⚡ Inference | 21.1K | 0 | | 60 | Python | 1d ago |
| 37 |  Heretic Fully automatic censorship removal for language models | ⚡ Inference | 20.8K | 0 | | 54 | Python | 3d ago |
| 38 |  Dyad Local, open-source AI app builder for power users ✨ v0 / Lovable / Replit / Bolt alternative 🌟 Star if you like it! | ⚡ Inference | 20.3K | 0 | | 72 | TypeScript | 13h ago |
| 39 |  Llama Cookbook Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama model family and using them on various provider services | ⚡ Inference | 18.3K | 0 | | 37 | Jupyter Notebook | 20d ago |
| 40 |  Web Llm High-performance In-browser LLM Inference Engine | ⚡ Inference | 18.0K | 0 | | 46 | TypeScript | 6d ago |
| 41 |  Ml Engineering Machine Learning Engineering Open Book | ⚡ Inference | 17.9K | 0 | | 36 | Python | 1mo ago |
| 42 |  Airllm AirLLM 70B inference with single 4GB GPU | ⚡ Inference | 17.7K | 0 | | 31 | Jupyter Notebook | 2mo ago |
| 43 |  Qbot [🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ qbot-mini: https://github.com/Charmve/iQuant | ⚡ Inference | 17.3K | 0 | | 30 | Jupyter Notebook | 2mo ago |
| 44 |  Code Review Graph Local knowledge graph for Claude Code. Builds a persistent map of your codebase so Claude reads only what matters — 6.8× fewer tokens on reviews and up to 49× on daily coding tasks. | ⚡ Inference | 16.1K | 0 | | 77 | Python | 4d ago |
| 45 |  RWKV LM RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and free sentence embedding. | ⚡ Inference | 14.5K | 0 | | 23 | Python | 4d ago |
| 46 |  Easy Dataset A powerful tool for creating datasets for LLM fine-tuning, RAG, and Eval | ⚡ Inference | 14.2K | 0 | | 56 | JavaScript | 10d ago |
| 47 |  Outlines Structured Outputs | ⚡ Inference | 13.8K | 0 | | 52 | Python | 7d ago |
| 48 |  Omlx LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar | ⚡ Inference | 13.6K | 0 | | 78 | Python | 1d ago |
| 49 |  Awesome Generative Ai A curated list of modern Generative Artificial Intelligence projects and services | ⚡ Inference | 12.0K | 0 | | 43 | — | 6d ago |
| 50 |  Tensorzero TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation. | ⚡ Inference | 11.4K | 0 | | 72 | Rust | 16h ago |
| 51 |  Llm Engineer Toolkit A curated list of 120+ LLM libraries category wise. | ⚡ Inference | 10.4K | 0 | | 39 | — | 1mo ago |
| 52 |  Openvino OpenVINO™ is an open source toolkit for optimizing and deploying AI inference | ⚡ Inference | 10.2K | 0 | | 72 | C++ | 13h ago |
| 53 |  Unity Mcp Unity MCP acts as a bridge, allowing AI assistants (like Claude, Cursor) to interact directly with your Unity Editor via a local MCP (Model Context Protocol) Client. Give your LLM tools to manage assets, control scenes, edit scripts, and automate tasks within Unity. | ⚡ Inference | 9.5K | 0 | | 75 | C# | 7d ago |
| 54 |  Ipex Llm Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc. | ⚡ Inference | 8.8K | 0 | | 42 | Python | 3mo ago |
| 55 |  Toonflow App Toonflow is an open-source, one-stop AI tool that quickly turns novels and scripts into animated short dramas. Features AI scriptwriting, smart storyboarding, and character and video generation. A lightweight cross-platform desktop app that helps creators produce visual content in batches at low cost. | ⚡ Inference | 7.8K | 0 | | 71 | HTML | 3d ago |
| 56 |  Prompt Master A Claude skill that writes the accurate prompts for any AI tool. Zero tokens or credits wasted. Full context and memory retention | ⚡ Inference | 7.4K | 0 | | 42 | — | 8d ago |
| 57 |  Transformer Explainer Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization | ⚡ Inference | 7.3K | 0 | | 28 | JavaScript | 1mo ago |
| 58 |  Local Deep Research ~95% on SimpleQA (e.g. Qwen3.6-27B on a 3090). Supports all local and cloud LLMs (llama.cpp, Ollama, Google, ...). 10+ search engines - arXiv, PubMed, your private documents. Everything Local & Encrypted. | ⚡ Inference | 7.2K | 0 | | 75 | Python | 14h ago |
| 59 |  Openllmetry Open-source observability for your GenAI or LLM application, based on OpenTelemetry | ⚡ Inference | 7.1K | 0 | | 51 | Python | 20h ago |
| 60 |  Vespa AI + Data, online. https://vespa.ai | ⚡ Inference | 6.9K | 0 | | 66 | Java | 16h ago |
| 61 |  Llm Wiki LLM Wiki is a cross-platform desktop application that automatically turns your documents into an organized, interlinked knowledge base. Instead of traditional RAG (retrieve-and-answer from scratch every time), the LLM incrementally builds and maintains a persistent wiki from your sources. | ⚡ Inference | 6.9K | 0 | | 65 | TypeScript | 1d ago |
| 62 |  Learning A log of things I'm learning | ⚡ Inference | 6.9K | 0 | | 30 | — | 9d ago |
| 63 |  LTX 2 Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model. | ⚡ Inference | 6.6K | 0 | | 31 | Python | 23h ago |
| 64 |  Firecrawl Mcp Server 🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients. | ⚡ Inference | 6.3K | 0 | | 37 | JavaScript | 4d ago |
| 65 |  Sqlbot 🔥 An intelligent data-query system built on large models and RAG, for conversational data analysis. Text-to-SQL Generation via LLMs using RAG. | ⚡ Inference | 6.1K | 0 | | 65 | JavaScript | 21h ago |
| 66 |  Pgai A suite of tools to develop RAG, semantic search, and other AI applications more easily with PostgreSQL | ⚡ Inference | 5.8K | 0 | | 34 | PLpgSQL | 2mo ago |
| 67 |  Taxhacker Self-hosted AI accounting app. LLM analyzer for receipts, invoices, transactions with custom prompts and categories | ⚡ Inference | 5.6K | 0 | | 40 | TypeScript | 25d ago |
| 68 |  Alignment Handbook Robust recipes to align language models with human and AI preferences | ⚡ Inference | 5.6K | 0 | | 37 | Python | 1mo ago |
| 69 |  Ultrarag A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines | ⚡ Inference | 5.5K | 0 | | 43 | Python | 20h ago |
| 70 |  Chronos Forecasting Chronos: Pretrained Models for Time Series Forecasting | ⚡ Inference | 5.3K | 0 | | 41 | Python | 21d ago |
| 71 |  5ire 5ire is a cross-platform desktop AI assistant and MCP client. It is compatible with major service providers and supports a local knowledge base and tools via Model Context Protocol servers. | ⚡ Inference | 5.2K | 0 | | 47 | TypeScript | 1mo ago |
| 72 |  Sparrow Structured data extraction and instruction calling with ML, LLM and Vision LLM | ⚡ Inference | 5.2K | 0 | | 43 | Python | 2d ago |
| 73 |  Transformerlab App The open source research environment for AI researchers to seamlessly train, evaluate, and scale models from local hardware to GPU clusters. | ⚡ Inference | 4.9K | 0 | | 74 | Python | 14h ago |
| 74 |  Bifrost Fastest enterprise AI gateway (50x faster than LiteLLM) with adaptive load balancer, cluster mode, guardrails, 1000+ models support & <100 µs overhead at 5k RPS. | ⚡ Inference | 4.8K | 0 | | 74 | Go | 12h ago |
| 75 |  Shimmy ⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever. | ⚡ Inference | 4.8K | 0 | | 44 | Rust | 1mo ago |
| 76 |  Claude Obsidian Claude + Obsidian knowledge companion. Persistent, compounding wiki vault based on Karpathy's LLM Wiki pattern. /wiki /save /autoresearch | ⚡ Inference | 4.8K | 0 | | 54 | Python | 18d ago |
| 77 |  Mlx Vlm MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. | ⚡ Inference | 4.7K | 0 | | 68 | Python | 14h ago |
| 78 |  Vllm Omni A framework for efficient model inference with omni-modality models | ⚡ Inference | 4.7K | 0 | | 74 | Python | 19h ago |
| 79 |  Llm Twin Course 🤖 Learn for free how to build an end-to-end production-ready LLM & RAG system using LLMOps best practices: ~ source code + 12 hands-on lessons | ⚡ Inference | 4.3K | 0 | | 28 | Python | 22d ago |
| 80 |  LLM RL Visualized 🌟 100+ original LLM/RL algorithm maps 📚, presented by the author of "Large Model Algorithms" (《大模型算法》) 💥 | ⚡ Inference | 4.3K | 0 | | 34 | Python | 3d ago |
| 81 |  Spark Nlp State of the Art Natural Language Processing | ⚡ Inference | 4.1K | 0 | | 48 | Scala | 2d ago |
| 82 |  Lemonade Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our discord: https://discord.gg/5xXzkMu8Zk | ⚡ Inference | 3.9K | 0 | | 69 | C++ | 19h ago |
| 83 |  Scikit Llm Seamlessly integrate LLMs into scikit-learn. | ⚡ Inference | 3.5K | 0 | | 29 | Python | 10d ago |
| 84 |  Optimum 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools | ⚡ Inference | 3.4K | 0 | | 42 | Python | 5d ago |
| 85 |  Horizon 📡 Your own AI-powered news radar. Generates daily briefings in English & Chinese. | ⚡ Inference | 3.4K | 0 | | 54 | Python | 13h ago |
| 86 |  Hallucination Leaderboard Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents | ⚡ Inference | 3.2K | 0 | | 32 | Python | 18h ago |
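| 87 |  Landppt An LLM-based presentation generation platform that automatically converts document content into professional PPT presentations. The platform supports multiple AI models and offers a rich selection of templates and styles so users can create high-quality presentations. | ⚡ Inference | 3.2K | 0 | | 38 | Python | 16d ago |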
| 88 |  Xturing Build, personalize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6 | ⚡ Inference | 2.7K | 0 | | 47 | Python | 2mo ago |
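| 89 |  Aix DB Aix-DB is built on the LangChain/LangGraph framework and combines it with an MCP Skills multi-agent collaboration architecture to turn natural language into data insights end to end. | ⚡ Inference | 2.1K | 0 | | 49 | JavaScript | 27d ago |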
| 90 |  Rapid MLX The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool parsers, prompt cache, reasoning separation, cloud routing. Drop-in OpenAI replacement. Works with Claude Code, Cursor, Aider. | ⚡ Inference | 2.1K | 0 | | 76 | Python | 14h ago |
| 91 |  Lucebox Hub Lucebox optimization hub: hand-tuned LLM inference, built for specific consumer hardware. | ⚡ Inference | 1.9K | 0 | | 63 | C++ | 17h ago |
| 92 |  Detikzify Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ. | ⚡ Inference | 1.8K | 0 | | 18 | Python | 3mo ago |
| 93 |  Mindpipe A powerful model compression framework for LLMs and LVLMs, adapted for NVIDIA GPUs and Huawei Ascend NPUs. | ⚡ Inference | 1.0K | 0 | | 43 | Python | 1d ago |
| 94 |  Llm Internals Learn LLM internals step by step - from tokenization to attention to inference optimization. | ⚡ Inference | 978 | 0 | | 21 | — | 1d ago |
| 95 |  Vllm Studio Control panel for VLLM, Sglang, llama.cpp, exllamav3 | ⚡ Inference | 908 | 0 | | 45 | TypeScript | 1d ago |