Top AI Developers by monthly star count
Top AI Organization Accounts by AI repo star count
Top AI Projects by category star count
Fastest-Growing list, ranked by the speed of gaining stars
Top list of little-known developers who create influential repos
Rank | Related Project | Project intro | Star count | |
---|---|---|---|---|
1 | YuE | YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open | 3.8K | |
2 | UI-TARS-desktop | A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language. | 2.7K | |
3 | node-DeepResearch | Keeps searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget) | 2.3K | |
4 | yek | A fast Rust based tool to serialize text-based files in a repository or directory for LLM consumption | 1.6K | |
5 | BrowserAI | Run local LLMs like llama, deepseek-distill, kokoro and more inside your browser | 822 | |
6 | minuet-ai.el | 💃 Dance with Intelligence in Your Code. Minuet offers code completion as-you-type from popular LLMs including OpenAI, Gemini, Claude, Ollama, Codestral, and more. | 101 | |
7 | vllm-ascend | Community maintained hardware plugin for vLLM on Ascend | 74 | |
8 | local-dream | Run Stable Diffusion on Android Devices with Snapdragon NPU acceleration. Also supports CPU inference. | 51 | |
9 | toolkit | Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives | 42 | |
10 | step_game | Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLMs to engage in public conversation before secretly picking a move (1, 3, or 5 steps). Whenever two or more players choose the same number, all colliding players fail to advance. | 32 | |
11 | devops-gpt | | 30 | |
12 | AIN | AIN - The First Arabic Inclusive Large Multimodal Model. It is a versatile bilingual LMM excelling in visual and contextual understanding across diverse domains. | 29 | |
13 | azure-ai-agent-service-enterprise-demo | Demonstrates how to build a streaming enterprise agent using Azure AI Agent Service. This sample integrates local HR and company policy documents, Bing for external context, and GPT-4o (though the service is model-agnostic) to deliver real-time Q&A and automated workflows. | 27 | |
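14 | CiscoPacketTracerChinese | High-quality Chinese localization of Cisco Packet Tracer. CiscoPacketTracerChinese is translated with Zhipu Qingyan's glm-4-plus large model and then manually corrected, reaching unprecedented localization quality. | 25 | |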
15 | LLMRank | PageRank for LLMs | 23 | |
16 | SubtitleAI | An AI-powered tool for summarizing YouTube videos by generating scene descriptions, translating them, and creating subtitled videos with text-to-speech narration | 17 | |
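17 | llm_sts | A real-time voice conversation system based on WebSocket and LLMs, integrating automatic speech recognition (ASR), a large language model (LLM), and text-to-speech (TTS). | 12 | |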
18 | Cross-the-Gap | [ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion | 11 | |
19 | fedichatbot | An LLM-powered chatbot for fediverse. A tech demo for BotKit. | 11 | |
20 | hana | Hana is an AI agent built with the Zen package. | 11 | |
21 | llama-bot-framework | Enable your ollama model to participate in online chats. | 11 | |
22 | VLM_S2H | Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? | 10 | |
23 | Green-AI-Resources | A curated collection of research, tools, and best practices for environmentally sustainable AI development and deployment. | 10 | |
24 | PixelWorld | The official code of "PixelWorld: Towards Perceiving Everything as Pixels" | 9 | |
25 | sentinel | Securade.ai Sentinel - A monitoring and surveillance application that enables visual Q&A and video captioning for existing CCTV cameras. | 9 | |
26 | Futurestic-Ai | Futurestic-Ai is an innovative artificial intelligence platform designed to revolutionize the way businesses operate by providing advanced predictive analytics and automation capabilities. It offers cutting-edge solutions for optimizing processes, identifying trends, and making intelligent decisions based on real-time data analysis. | 9 | |
27 | ComfyUI-DeepSeek-Toolkit | ComfyUI-DeepSeek-Toolkit is a node developed within the ComfyUI framework, designed to bring powerful LLMs like DeepSeek and Qwen into practical ComfyUI development. The goal is to leverage the capabilities of these advanced LLMs to create a series of useful nodes for ComfyUI. | 9 | |
28 | WebHive | Meet WebHive, the AI-powered browser that takes care of tasks for you. No more endless clicks, tell it what you need, and it gets it done. | 7 | |
29 | ctxs.ai | an open-source, community-curated directory of contexts for use with LLMs | 7 | |
30 | DeepSeaGLM | Deep Sea GLM large model competition | 7 | |
31 | private-machine | I'll be your machinery. | 6 | |
32 | komorebi | LM/VLM implementations | 5 | |
33 | sider-ai-api | An API library for sider.ai, providing alternative access to ChatGPT, Gemini, Claude, and other models. An alternative way to access ChatGPT, Gemini, and Claude from within China, via a request library for the sider.ai API. | 5 | |
34 | Erniu-Inventory-management-AI-agent | Inventory Management AI Assistant A smart inventory system using Feishu Sheets and Deepseek AI for natural language operations. Ideal for small teams, it supports product and warehouse management with real-time notifications and profit calculations. | 5 | |
35 | ant-chat | An AI web client based on ant-design-x | 5 | |
36 | ComfyUI-Venice-API | A custom node implementation for ComfyUI that integrates with venice.ai's Flux and SDXL image generation models | 4 | |
37 | vswarm | Vswarm (extended version of Swarm): Extended multi-agent orchestration framework for AI, supporting OpenAI, Gemini, Ollama and other providers. | 4 | |
38 | Langraph-GLM | LangGraph using GLM-4-flash, with 2 LLMs and 1 tool | 4 | |
39 | comfyui-fonts | Comfy font universal font node, supporting classic font formats such as TrueType, FreeType, and Open1, image formats such as JPG and PNG, video formats such as GIF and short videos, and API interfaces. | 3 | |
40 | DelphiGenAI | The GenAI API wrapper for Delphi is designed to integrate OpenAI’s latest models (GPT-4o, O1, and O3) seamlessly, offering robust features for chat interactions, text generation, vision processing, audio analysis, JSON configuration, and asynchronous operations with efficient error handling and testing support. | 3 | |
41 | docsifer | Docsifer is a powerful tool for converting various data formats into Markdown for applications such as indexing, text analysis, and more. It supports PDF, PowerPoint, Word, Excel, Images, Audio, HTML, and other text-based formats, and leverages LLMs to enhance performance. | 3 | |
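42 | po-translator | PO-Translator is an automated translation tool based on large language models (LLMs), built for internationalization (i18n) and localization (l10n) of .po files. The project uses advanced LLMs (such as kimi, deepseek, qwen, glm, openai, or similar models) to provide high-quality translations and, combined with the power of the Python programming language, offers developers and translation teams an efficient, intelligent solution. | 3 | |
43 | LMS | Learning Management System \| MERN stack | 3 | |
44 | RAG-LLM-Metric | | 3 | |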
45 | BloomBee | Decentralized LLMs fine-tuning and inference with offloading | 3 | |
46 | vmr-vlm | Evaluating the performance of VLMs on visual mathematical reasoning problems and proposing a novel scoring system | 2 | |
47 | Multi-Round-VLM-powered-Multimodal-Conversational-AI-Navigation-Bot | Streamlit App Combining Vision, Language, and Audio AI Models | 2 | |
48 | ComfyUI-OpenSoraPlan | Another Comfy implementation of PKU-YuanGroup/Open-Sora-Plan, with support for the latest 1.3.0 and 1.2.0 versions and image-to-video features | 2 | |
49 | Shard | Open Source Video Understanding API and Large Vision Model Observability Platform. | 2 | |
50 | Using-stable-diffusion-webui-in-Google-Colab | Easily run Stable Diffusion Web UI on Google Colab. This repo provides notebooks to download and manage models, LoRAs, and more for AUTOMATIC1111's web UI. | 2 | |
51 | microblog-ai | A full-stack AI-powered microblogging application leveraging Azure Static Web Apps, Azure Functions, and Remix SSR with Azure OpenAI GPT-4o. | 2 | |
52 | coffee-chat-voice-assistant | Coffee Chat Voice Assistant is a voice-driven ordering system powered by Azure OpenAI GPT-4o Realtime API and Azure AI Speech, simulating the experience of ordering coffee with a café barista. It supports natural conversations, live order updates, and real-time transcription, showcasing the power of AI for seamless customer interactions. | 2 | |
53 | gptbmw.github.io | Tutorial on upgrading to a ChatGPT Plus subscription from within China: how to register a ChatGPT account, and how mainland China users can use a WildCard virtual credit card to pay for ChatGPT Plus. | 2 | |
54 | AI-Jailbreaks | This GitHub repository features a variety of unique prompts to jailbreak ChatGPT and other AI models into going against OpenAI policy. Please read the notice at the bottom of the README.md file for more information. | 2 | |
55 | schema-doctor | .NET helper library to fix schema issues when working with LLMs | 2 | |
56 | MiniMax-01 | MiniMax-01 is a simple implementation of the MiniMax algorithm, a widely used strategy for decision-making in two-player turn-based games like Tic-Tac-Toe. The algorithm aims to minimize the maximum possible loss for the player, making it a popular choice for developing AI opponents in various game scenarios. | 2 | |
57 | glm-realtime-sdk | zhipu glm realtime api python sdk | 2 | |
58 | College-Type-Prediction | This project entails the analysis, utilizing the glm() function in R, to construct a Logistic Regression model, making use of the College dataset from the ISLR library to determine if a university was designated as private or public (James, Witten, Hastie, & Tibshirani, 2013). | 2 | |
59 | GPT4Scene | GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models | 2 | |
60 | convertpdf-gpt | Convert a pdf file automatically to markdown, using the OpenAI API | 2 | |
61 | PunditAI | Effortlessly generate up to 10K words and more through existing LLMs | 1 | |
62 | StoryScape | A generative-AI storytelling iOS application (it generates stories with pictures) built on a custom LLaMA 3.2 3B-Instruct model fine-tuned on Hindi stories, with a provision to generate English stories via a call to OpenAI GPT-4o. | 1 | |
63 | stable-diffusion-tinygrad-f16 | Stable Diffusion with f16 compute | 1 | |
64 | Polymath-Project | JARVIS-like, with Gemini | 1 | |
65 | ailert-nextjs | This repository contains the frontend code for Ailert.tech, built on Next.js, Tailwind CSS, and Python. | 1 | |
66 | vlmusicstudio | Web Application for VL Music Studios, Bangalore | 1 | |
67 | vlm-seminar | | 1 | |
68 | robot_manipulation_survey | A survey summarizing work on robotic arm grasping. | 1 | |
69 | Euler_VLM | | 1 | |
70 | VLM_Implementation | Implementing a Video Language Model from scratch | 1 | |
71 | Archival_AI_DEPOSIT | Archival AI DEscription Pipeline for Open-Source Image Tagging VLMs | 1 | |
72 | VLM-LwEIB | The official PyTorch implementation of our IJCV-2025 paper "Learning with Enriched Inductive Biases for Vision-Language Models". | 1 | |
73 | Multimodal-OCR | OCR Vision Language Model | 1 | |
74 | vlm_mia | Code for paper "Membership Inference Attacks Against Vision-Language Models" | 1 | |
75 | mini-paligemma2 | Minimalist implementation of PaliGemma 2 & PaliGemma VLM from scratch | 1 | |
76 | VLM_study | Vision Language Model study | 1 | |
77 | Radiology_Report_Generation_VLM | | 1 | |
78 | paulthebest1000-sorasaki-hina-website | | 1 | |
79 | soraandharu.github.io | | 1 | |
80 | sora-vscode-1.0.0 | VS code extension for Sora Programming Language | 1 | |
81 | Sora-The-call-assistant- | Granville Tech Company Assignment | 1 | |
82 | train-text2video-scratch | This repository provides a PyTorch implementation of a video diffusion model, similar to OpenAI's Sora, allowing you to train and generate videos from text prompts using a configurable architecture and diffusion process. | 1 | |
83 | my-dockerized-sd | Collection of Dockerfiles for Stable Diffusion on AMD hardware | 1 | |
84 | 100-Days-of-Stable-Diffusion | 100 Days of learning Stable Diffusion | 1 | |
85 | sd | Stable Diffusion Code In Python | 1 | |
86 | MVD.AI | MVD.AI: A modular AI management hub with web-based interface and API support, integrating OpenAI, GoogleAI, ClaudeAI, Stable Diffusion and more. It is a personal project with experimental features. | 1 | |
87 | inpaint-ui | This is not graphic design software. This tool is designed specifically for creating masks to be used for InPainting with models like SDXL. The goal is to eliminate the need for fully featured graphic design software if all you need is a great masking tool that produces a black and white mask. This is a work in progress, more features are coming. | 1 | |
88 | WAI-NSFW-illustrious-character-select-EN | character select for WAI-NSFW-illustrious-SDXL | 1 | |
89 | HDR_SDXL | | 1 | |
90 | openai | [OpenAI] Free guide to using OpenAI ChatGPT in China (supports GPT-4, no VPN required) [latest update, February]. A comprehensive ChatGPT usage guide, supporting GPT-4 and usable without a VPN. It provides instructions for the Chinese version of ChatGPT, recommended ChatGPT mirror sites, and answers to common usage questions, to help you use ChatGPT efficiently in daily life, study, and work, with unlimited use of the GPT-4, 4o, and o1 models. | 1 | |
91 | chatgpt-cn-site | ChatGPT Chinese version: access guide for China (supports GPT-4o, no VPN required) [updated February 2025] | 1 | |
92 | doggo-rag-chatbot | A Streamlit GenAI chatbot that uses retrieval-augmented generation (RAG) to teach a public LLM to answer questions about my dog. Powered by OpenAI's gpt-4o and embeddings model and embedding-based search (ChromaDB), this app offers real-time interactions with personalized answers. | 1 | |
93 | pdf2md-gpt | Converts PDF documents to Markdown format using GPT-4o-mini's vision capabilities. | 1 | |
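94 | ChatGPT | ChatGPT Chinese version: free usage guide for China (supports GPT-4, 4o, and o1, no VPN required) [latest, February 2025]. This guide explains in detail how to use the Chinese version of ChatGPT for free in China and recommends several ChatGPT mirror sites that are accessible without a VPN. It covers mirror sites available in China and official usage tutorials, to help you use ChatGPT flexibly in study, work, and daily life, with unrestricted GPT-4o and o1 access. | 1 | |
95 | chatgpt-4o | ChatGPT Chinese version: free usage guide for China (supports GPT-4, GPT-4o, and o1, no VPN required) [latest, February 2025] | 1 | |
96 | ChatGPT_CN | ChatGPT Chinese version: latest 2025 usage guide for China (supports GPT-4, no VPN required) [latest 2025 update]. Use ChatGPT easily without a VPN, with unlimited use of the GPT-4, 4o, and o1 models. This guide explains in detail how to use the Chinese version of ChatGPT in China and helps you get started quickly. | 1 | |
97 | ChatGPT-CN | ChatGPT Chinese version: access guide for China (supports GPT-4, GPT-4o, and GPT-o1, no VPN required) | 1 | |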
98 | MediaMind | MediaMind is an AI-powered tool that transcribes and summarizes .mov video files using OpenAI's Whisper model for transcription and GPT-4o for summarization. | 1 | |
99 | Langflow-Customer-Support-Agent | Langflow Customer Support Agent: An LLM-powered, multi-agent chatbot using Langflow, Streamlit, and OpenAI GPT-4o mini, with RAG-based retrieval via AstraDB. It dynamically routes queries to specialized agents (FAQAgent, OrderLookupAgent) for accurate, context-aware responses. Includes vector search and file upload for knowledge expansion. | 1 | |
100 | ChatGPT-CN | ChatGPT Chinese version: free recommendations of ChatGPT mirror sites in China (supports GPT-4, 4o, and o1) [updated February 2025] | 1 | |