[CVPR 2025] Official repository of SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning
Rankings | Related Project | Project intro | Star count | |
---|---|---|---|---|
1 | llama-ocr | Document to Markdown OCR library with Llama 3.2 vision | 2.2K | |
2 | papersgpt-for-zotero | Zotero chat PDF with DeepSeek, GPT 4.5, ChatGPT, Claude, Gemini | 1.3K | |
3 | StableAnimator | [CVPR2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference image and a sequence of poses. | 1.2K | |
4 | LLaMA-Mesh | Unifying 3D Mesh Generation with Language Models | 910 | |
5 | MV-Adapter | [768 Resolution] [Any "SDXL" Model] [Various Conditions] [Arbitrary Views] Official impl. of "MV-Adapter: Multi-view Consistent Image Generation Made Easy" | 600 | |
6 | vlmrun-hub | A hub for various industry-specific schemas to be used with VLMs. | 459 | |
7 | copilot-more | GPT-4o and Claude-3.7-Sonnet APIs for coding. | 393 | |
8 | codegate | CodeGate: CodeGen Privacy and Security | 326 | |
9 | WavChat | A Survey of Spoken Dialogue Models (60 pages) | 270 | |
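10 | sanic-web | A lightweight, full-pipeline LLM application project that is easy to extend, supporting models such as DeepSeek and Qwen2.5. A one-stop LLM application development project built on Dify, Ollama & Vllm, Sanic, and Text2SQL 📊, with a modern UI built with Vue3, TypeScript, and Vite 5. It supports LLM-based graphical data Q&A through ECharts 📈 and table Q&A over CSV files 📂. It can also easily connect to third-party open-source RAG and retrieval systems 🌐 to support broad general-knowledge Q&A. | 199 | |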
11 | swift-chat | A Cross-platform AI chat application built with React Native and powered by Amazon Bedrock | 161 | |
12 | ChatRex | Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding | 156 | |
13 | llm4ad | LLM4AD: A Platform for Algorithm Design with Large Language Model | 131 | |
14 | GLM-Edge | GLM Series Edge Models | 129 | |
15 | ffpa-attn-mma | 📚FFPA(Split-D): Yet another Faster Flash Prefill Attention with O(1) GPU SRAM complexity for headdim > 256, ~2x↑🎉vs SDPA EA. | 125 | |
16 | BALROG | Benchmarking Agentic LLM and VLM Reasoning On Games | 117 | |
17 | llm-codenames | Implementation of the board game Codenames, re-imagined as a collaborative game between LLM agents | 102 | |
18 | pyvisionai | The PyVisionAI Official Repo | 97 | |
19 | VLM-surveys | A frontier collection and survey of vision-language model papers and model GitHub repositories | 66 | |
20 | 3d-conditioning | Enhance and modify high-quality compositions using real-time rendering and generative AI output without affecting a hero product asset. | 61 | |
21 | SDXL-Training-Improvements | SDXL Training Improvements | 57 | |
22 | OLA-VLM | OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024 | 51 | |
23 | gemini-gradio | | 51 | |
24 | VLMnav | End-to-End Navigation with VLMs | 48 | |
25 | one | Build AI powered websites with Astro, Shadcn and Vercel AI SDK | 47 | |
26 | Build-An-LLM-RAG-Chatbot-With-LangChain-Python | Build-An-LLM-RAG-Chatbot-With-LangChain-Python | 38 | |
27 | PhoneLM | | 37 | |
28 | Thinking-GPT4o | The prompts which could enable GPT-4o to think and act like GPT-o1-mini | 37 | |
29 | DPO_pLM | | 36 | |
30 | maux-calories-tracker | 🤖 AI-powered food analysis tool that instantly calculates calories and nutrients from images. Built with Next.js 15, Vercel AI SDK, and GPT-4o. | 35 | |
31 | GLMix | [NeurIPS 2024] official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone". | 31 | |
32 | parsemypdf | Collection of PDF parsing libraries like AI based docling, claude, openai, llama-vision, unstructured-io, and pdfminer, pymupdf, pdfplumber etc for efficient snapshot, text, table, and metadata extraction. | 31 | |
33 | ama | Ask Me Anything for any website, powered by Firecrawl and OpenAI GPT-4o-mini | 30 | |
34 | AI-Frontiers-Digest | AI Frontiers Digest leverages LLMs to intelligently curate and summarize the latest developments in AI, while also generating engaging Podcasts. | 30 | |
35 | realtime-gpt4o-videochat | Real-time GPT-4o video/photo/voice chat | 28 | |
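36 | poe | Latest Poe subscription tutorial: how do you top up and buy a Poe membership account from within China? Get unrestricted use of the undegraded ChatGPT-4.5, full-strength DeepSeek-R1, Musk's Grok-3, Google Gemini-2 Pro, and more! This article walks through, step by step, how to quickly obtain a "YEKA (野卡) virtual credit card" and use it to pay for Poe. The tutorial has been personally tested and is real, effective, safe, and reliable; the whole Poe purchase process can be completed in as little as fifteen minutes. | 23 | |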
37 | Free-Unoffical-OpenAI-API | A powerful, unofficial OpenAI-compatible API service offering free access to GPT-4o, GPT-4-turbo, and audio preview models like gpt-4o-audio-preview & Realtime Models like gpt-4o-realtime. Features streaming responses, voice synthesis, TTS with no authentication requirements. Hosted on Hugging Face & Railway (Free tier) | 21 | |
38 | netaivideoanalyzer | This repository contains a series of samples on how to analyse a video using multimodal Large Language Models, like GPT-4o or GPT-4o-mini. | 20 | |
39 | IPLoc | Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples | 20 | |
40 | gemini-coder | | 20 | |
41 | LayoutVLM | Official code for "LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models" | 18 | |
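42 | AdvDiffVLM | | 18 | |
43 | chatgpt-4o | ChatGPT Chinese edition, ChatGPT official site, ChatGPT web version. This article provides a complete guide to using the Chinese edition of ChatGPT and recommends ChatGPT mirror sites accessible in China, with support for GPT-4 and GPT-4o, permanently free, no VPN required, suited to Chinese-speaking users. The project aims to provide a one-stop guide to the Chinese edition of ChatGPT, and also collects ChatGPT mirror sites usable in China and tutorials for the official site, to help you get started with ChatGPT quickly, whether for personal use or professional needs. | 18 | |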
44 | gpt-resolve | Can GPT solve Brazilian university entrance exams? | 17 | |
45 | VISTA | The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025] | 14 | |
46 | LLMling | Easy MCP (Model Context Protocol) servers and AI agents, defined as YAML. | 14 | |
47 | Chatchat-Lite | Build RAG and Agent applications on local models from scratch with LangGraph and Streamlit | 13 | |
48 | llmtest | A behavioral testing library for LLM applications that allows developers to write natural language specifications for unit and integration tests. Validate LLM application behavior using plain English assertions in a simple assert(str, str) form factor. | 13 | |
49 | k0lmena | Automation framework in TypeScript and Playwright | 13 | |
50 | Pycon-mini-Tokai-2024-VLM-Colaboratory-Sample | Sample collection presented in the PyCon mini Tokai 2024 talk "Trying VLMs on Google Colaboratory" | 12 | |
51 | DG-SoRA | [CVPR 2025] Official repository of SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning | 12 | |
52 | hands-on-llama | | 11 | |
53 | computer-agent-arena-hub | Computer Agent Arena Hub: Compare & Test AI Agents on Crowdsourced Real-World Computer Use Tasks | 11 | |
54 | DeWm | Automatically remove watermarks from illustrations using AI (Stable Diffusion). | 11 | |
55 | lecca-io | Lecca.io \| AI Agents & Automations | 11 | |
56 | UnrealGenAISupport | This project aims to build a long-term support (LTS) plugin for various cutting-edge LLM/GenAI models & foster a community around it. It currently includes OpenAI's GPT4o & GPT4o-mini for Unreal Engine 5.1 or higher, with plans to add Claude Sonnet 3.5, Deepseek, Gemini, Grok 2 & realtime APIs soon. Note that this plugin is still under development. | 11 | |
57 | LLM-Based-Multi-Agent-Stock-Analysis-and-Investment-Advisor | This project leverages Large Language Models (LLMs) and a multi-agent framework to analyze stock prices, gather relevant news, and generate comprehensive financial investment reports for companies. | 11 | |
58 | ObsiAI | An AI chatbot plugin for Obsidian using the Gemini API for note summarization, content generation, and more. Enhance your workflow with AI assistance like the Notion AI bot. | 10 | |
59 | dish-ai-commit | 🤖 AI-Powered VSCode extension for generating standardized Git/SVN commit messages. ✨ Supports multiple AI services: OpenAI, ChatGPT, Ollama, Zhipu, DashScope, Doubao, Gemini and VS Code built-in AI. 🌍 Multi-language support (EN/CN/JP/KR/Other). 📊 Auto-generate weekly reports. | 10 | |
60 | Korean-SAT-LLM-Leaderboard | Korean SAT leaderboard | 9 | |
61 | chat-with-image | AI Gemini | 9 | |
62 | aibook | (WIP) 🦀 An Insanely Fast 🚀 Full Stack Content Generation SaaS Platform Powered by Dioxus, Dioxus Server Functions, Axum, Unsplash, Gemini AI & MongoDB. | 8 | |
63 | FreeIPCC | Call center, intelligent outbound calling, LLM inbound-call robot, LLM outbound-call robot, customer service system, ticketing system, open-source call center system, telephony system, intelligent outbound calling system, intelligent outbound phone calls, call center system, LLM customer service, outbound phone calls, customer service center, online customer service, LLM call center, inbound-call robot, LLM robot, LLM callcenter, contactcenter, Call, IPCC, Customer Service, Voice, AI, Call Center, Contact Center, LLM, TTS, ASR, NLP, Chatbot, FreeSWITCH, OpenSips, Kamailio, Asterisk, WebRTC, Robot, Outbound, LangChain, RAG! | 8 | |
64 | codebase-dump | Dump your codebase into a single file, so you can use it as an input to LLMs like ChatGPT, Google Gemini (directly or through Google AI Studio), Claude, and others. This project is the lightweight version of Codebase Digest - check it for example LLM prompts. | 8 | |
65 | fastllm.cpp | A low-latency, fault-tolerant API for accessing LLMs, written in C++ using llama.cpp. | 8 | |
66 | vlm-api | REST API for computing cross-modal similarity between images and text using the ColPaLI vision-language model | 7 | |
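67 | Funasr-Qwen-GPTSovits | <Integrated> An AI program that uses Funasr for speech recognition, calls the Qwen LLM for answers, and outputs speech through GPTSovits. The model is currently called online; an offline LLM will be added later. | 7 | |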
68 | covercraft-ai | Customized Cover Letter PDF Generator using OpenAI's GPT-4o model | 7 | |
69 | llama-pruning | This project provides tools to load and prune large language models using a structured pruning method. | 7 | |
70 | chatgpt-4o | ChatGPT Chinese edition: guide to access from within China (supports GPT-4, no VPN required, unlimited use of GPT-4o and o1) | 7 | |
71 | Local-Diffusion | Flutter GUI wrapper for stable-diffusion.cpp - Run Stable Diffusion locally on Android | 7 | |
72 | ai-summarizer-extension | A browser extension that helps you summarize web content and YouTube videos using your existing AI accounts (ChatGPT, Claude, or Gemini). Hence it's free! | 7 | |
73 | DecorateLM | | 6 | |
74 | openweights | A python sdk for LLM finetuning and inference on runpod infrastructure | 6 | |
75 | VLM_TriTraining | Construct a Tri-Training framework using VLMs as base estimators, and evaluate its accuracy on multiple semi-supervised learning benchmarks. | 5 | |
76 | gemini-flashcards | Revolutionizing learning through AI-powered flashcards and adaptive study methods. | 5 | |
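77 | NekoACM | 🐱🐾 An LLM-based automatic problem-setting system for ACM-ICPC algorithm problems that can automatically generate problems, test cases, and solution code. It can run as a standalone service or be integrated as a module into an OJ system. | 5 | |
78 | llama-index-cloud-sql-pg-python | | 5 | |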
79 | LightGPT | A lightweight GPT-style large language model (LLM) with high parameter and memory efficiency. | 5 | |
80 | XmodelLM-1.5 | | 4 | |
81 | OpenResearch | Open-Research.ai is an AI-driven search engine that leverages OpenAI and Serper.dev to deliver a powerful search experience. 1-Click deploy to Vercel. | 4 | |
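82 | chatgpt-mirrors-free | [Free online in China] A roundup of free Chinese-language ChatGPT mirror sites accessible in China (2024/12/17). Use ChatGPT for free without a VPN; this site lists free ChatGPT mirror sites usable in China, with unrestricted use of GPT-4o and o1. Free Chinese-language ChatGPT mirror sites that can be used online directly from within China. All of these sites are free web versions: ChatGPT in China, free ChatGPT, ChatGPT Chinese edition, ChatGPT mirror sites. | 4 | |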
83 | par_gpt | CLI LLM tool | 4 | |
84 | build-n-roll-tg-bot | Telegram bot for fast and efficient creation of D&D characters | 4 | |
85 | feedforge | A feedback app for mipyme bits in Peru using Gemini, OpenAI, and many different and useful integrations (AYNI Hackathon regional winner project) | 4 | |
86 | leapMiceGemini | The simplest Leap Motion-based mouse | 4 | |
87 | Llama_impact-3.2 | "GovEase" is a simple platform that connects citizens with essential government information and services. | 4 | |
88 | PoseMaster-Dynamic-Person-Repose-ChangeClothes-and-Face-Transformation | PoseMorphAI is a comprehensive pipeline built using ComfyUI and Stable Diffusion, designed to reposition people in images, modify their facial features, and change their clothes seamlessly. This solution leverages advanced pose estimation, facial conditioning, image generation, and detail refinement modules for high-quality output. | 4 | |
89 | opencoder-llm.github.io | | 4 | |
90 | Search-GPT | ChatGPT with real-time internet search capabilities. Built using Python, LangChain, Streamlit, OpenAI, and Google Search Results. | 4 | |
91 | FitnessLM | | 4 | |
92 | lm_dam | | 4 | |
93 | agent-gpt | AgentGPT: Distributed RL training with AWS and remote environments. | 3 | |
94 | Chat-with-PDF | A RAG Application powered by Llama 3.2b and HuggingFace Embeddings | 3 | |
95 | SDXL_Anime_Arena | | 3 | |
96 | ThinkGPT | | 3 | |
97 | ai-artist | An AI Artist application utilizing LLM and Stable Diffusion. | 3 | |
98 | MLLM-applied-to-autonomous-driving-across-various-weather-conditions | MLLM-AD-4o leverages GPT-4o in LimSim++ with CARLA, tested under adverse conditions using various sensor setups (front and back cameras, LiDAR). This project explores MLLMs in autonomous driving to enhance performance across challenging environments. | 3 | |
99 | xpon-lms-plugin | | 3 | |
100 | awesome-text-to-video-plus | The Ultimate Guide to Effortlessly Creating AI Videos for Social Media: Go From Text to Eye-Catching Videos in Just a Few Steps | 3 |