IPEX-LLM transformers-style API

Hugging Face transformers AutoModel

You can apply IPEX-LLM optimizations to any Hugging Face Transformers model by loading it through the standard AutoModel APIs.
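As a minimal sketch of this loading pattern, the snippet below loads a causal language model through `ipex_llm.transformers.AutoModelForCausalLM` with low-bit weights. The helper names `build_load_kwargs` and `load_optimized_model` are hypothetical and exist only for this illustration. The sketch assumes that `ipex-llm` and `transformers` are installed, and that `load_in_4bit=True` corresponds to the symmetric INT4 format while `load_in_low_bit` selects other formats by name. Check the API reference for the options your installed version supports.

```python
from typing import Any, Dict


def build_load_kwargs(low_bit: str = "sym_int4") -> Dict[str, Any]:
    """Hypothetical helper: pick the from_pretrained keyword for a low-bit format."""
    if low_bit == "sym_int4":
        # Assumption: load_in_4bit=True selects symmetric INT4 weights.
        return {"load_in_4bit": True}
    # Assumption: other formats are requested by name via load_in_low_bit.
    return {"load_in_low_bit": low_bit}


def load_optimized_model(model_path: str, low_bit: str = "sym_int4"):
    """Hypothetical helper: load a model and tokenizer with IPEX-LLM optimizations."""
    # Imported lazily so this file can be imported without ipex-llm installed.
    from ipex_llm.transformers import AutoModelForCausalLM
    from transformers import AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(
        model_path, **build_load_kwargs(low_bit)
    )
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    return model, tokenizer
```

Calling `load_optimized_model("path/to/model")` with a local checkpoint directory or a Hugging Face model ID of your choice should return a model that can be used with the usual `generate` workflow. The only change from plain Transformers code is the import location of the AutoModel class and the low-bit keyword argument.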

AutoModelForCausalLM

AutoModel

AutoModelForSpeechSeq2Seq

AutoModelForSeq2SeqLM

Native Model

For the llama, chatglm, bloom, gptneox, and starcoder model families, you can also convert the model and run it with the native (C++) implementation for maximum performance.