GitHub Data

Followers: 4433
Following: 0

Links

AI Project

Public repos: 1277
Public gists: 0

auto-round

Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU. Seamlessly integrated with Torchao, Transformers, and vLLM.
stars: 524
forks: 42
language: Python
created at: 2024-01-04
updated at: 2025-06-27
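As background for what a quantization algorithm like auto-round improves on, here is a minimal sketch of the round-to-nearest (RTN) baseline: group-wise symmetric INT4 weight quantization. This is illustrative pure Python with hypothetical function names, not the auto-round API.

```python
# Minimal sketch of group-wise round-to-nearest (RTN) INT4 weight
# quantization -- the naive baseline that learned-rounding algorithms
# such as auto-round aim to beat. Function names are hypothetical.

def quantize_rtn_int4(weights, group_size=4):
    """Quantize a flat list of floats to signed INT4 values per group.

    Returns (quantized ints in [-8, 7], one scale per group)."""
    qs, scales = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        # Symmetric scale: map the largest magnitude in the group to 7.
        scale = max(abs(w) for w in group) / 7 or 1.0
        scales.append(scale)
        qs.extend(max(-8, min(7, round(w / scale))) for w in group)
    return qs, scales

def dequantize(qs, scales, group_size=4):
    """Reconstruct approximate floats from INT4 codes and group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(qs)]

w = [0.12, -0.55, 0.31, 0.07, 1.4, -0.9, 0.02, 0.66]
q, s = quantize_rtn_int4(w)
w_hat = dequantize(q, s)
```

Per-group scales keep the rounding error proportional to each group's largest weight, which is why smaller group sizes trade memory for accuracy; auto-round-style methods go further by tuning the rounding decisions themselves.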

llm-on-ray

Pretrain, finetune and serve LLMs on Intel platforms with Ray
stars: 109
forks: 30
language: Python
created at: 2023-11-13
updated at: 2025-02-09

ipex-llm-tutorial

Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using ipex-llm
stars: 157
forks: 39
language: Jupyter Notebook
created at: 2023-07-27
updated at: 2025-02-04

xFasterTransformer

stars: 428
forks: 69
language: C++
created at: 2023-06-14
updated at: 2025-06-17

intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
stars: 2.2K
forks: 211
language: Python
created at: 2022-11-11
updated at: 2025-02-10

neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
stars: 2.3K
forks: 262
language: Python
created at: 2020-07-21
updated at: 2025-02-10
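Alongside quantization, the description above mentions sparsity. A common building block there is unstructured magnitude pruning: zeroing the smallest-magnitude weights. The sketch below is illustrative pure Python under that assumption, not the neural-compressor API.

```python
# Minimal sketch of unstructured magnitude pruning -- one of the basic
# sparsity techniques behind model-compression toolkits. Pure Python for
# illustration; this is not the neural-compressor API.

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude entries until the requested
    fraction of weights is zero."""
    n_prune = int(len(weights) * sparsity)
    # Indices of the n_prune smallest-magnitude entries.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:n_prune])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
pruned = magnitude_prune(w, sparsity=0.5)
# The three smallest magnitudes (0.01, -0.05, 0.2) are zeroed.
```

In practice the surviving weights are usually fine-tuned afterwards to recover accuracy, and structured variants prune whole blocks or channels so the sparsity maps to real speedups.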

lms

stars: 35
forks: 10
language: C++
created at: 2019-04-09
updated at: 2025-02-09

lmbench

stars: 305
forks: 140
language: C
created at: 2018-07-31
updated at: 2025-04-23

ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.
stars: 8.1K
forks: 1.4K
language: Python
created at: 2016-08-29
updated at: 2025-07-01