Are you ready to dive into the exciting world of Large Language Models (LLMs) and run them directly on your Mac? This Exo LLM Guide is designed for tech enthusiasts and developers eager to explore on-device LLM capabilities using Swift and CoreML. If you’re aiming to build cutting-edge Mac, iOS, or visionOS applications powered by local LLMs, you’ve landed in the right place. For those simply curious about LLMs, this might be a deeper dive than you were expecting!
The buzz around Apple’s demonstration of on-device CoreML LLMs at WWDC was undeniable. It sparked my immediate interest: how can we harness this technology? At conduct.edu.vn, we’re dedicated to helping individuals and organizations leverage on-device LLMs. Why rely solely on cloud infrastructure when powerful hardware sits right at your fingertips? On-device processing enhances data security, ensures offline functionality, and optimizes resource utilization. This guide offers a streamlined path to get you started.
Getting Started with Local LLMs: A Pain-Free Approach
“There has to be a better way!”
This was my exact thought while navigating fragmented instructions and code snippets to get a local LLM operational on my Mac. The process can be riddled with potential pitfalls. Frustrated by the lack of a clear, end-to-end guide, I decided to create one. This guide provides a straightforward walkthrough, enabling you to run a local LLM in approximately 10 minutes.
For optimal results, this guide is tailored for macOS Sequoia 15.1 beta and Xcode 16.1 beta 2 (16B5014f). Compatibility with older versions isn’t guaranteed, so macOS Sequoia beta is highly recommended.
Part 1: Setting Up Your Model and Command-Line LLM
Our LLM of choice, Mistral 7B v0.3, is hosted on Hugging Face, a hub for countless models. Begin by creating a free Hugging Face account and logging in at https://huggingface.co (email verification required).
Request access to the Mistral 7B model by clicking ‘Agree and access repository’ here: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3
Generate a Hugging Face access token with read permissions. Securely copy and save this token; it will be a long random string beginning with hf_. You can create a token here: https://huggingface.co/settings/tokens
If you’ve previously attempted to set up local LLMs, it’s crucial to delete the Documents/huggingface directory on your Mac. Previous, unsuccessful attempts might have left cached files that can interfere with the process.
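On most setups, that means running the following from Terminal (be aware this permanently removes any cached models):
rm -rf ~/Documents/huggingface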
Open Terminal and create a working directory. For this guide, we’ll use Downloads/apple-llm-test on your Mac.
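In Terminal, that boils down to two commands:
mkdir -p ~/Downloads/apple-llm-test
cd ~/Downloads/apple-llm-test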
Clone the Swift Transformers project, which is essential for running CoreML models using Swift, then navigate into its Examples directory. Important: Ensure you’re using the preview branch of Swift Transformers, as in the command below.
git clone --depth 1 --branch preview https://github.com/huggingface/swift-transformers
cd swift-transformers/Examples
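If you want to double-check that the clone is on the preview branch before continuing, git can report the current branch (requires git 2.22 or later); it should print preview:
git branch --show-current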
‼️⚠️💥 Critical Step: Setting the Hugging Face Environment Variable
Do Not Skip This Step. Set the Hugging Face environment variable in your Terminal window. The token authenticates your downloads from Hugging Face, including the gated Mistral repository you just requested access to; without it, the setup will fail. The variable lasts only for the current Terminal window. For persistent command-line access, add it to your shell configuration (see below). Replace YOURTOKENHERE with your actual Hugging Face access token.
export HUGGINGFACE_TOKEN=YOURTOKENHERE
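To make the token survive across Terminal sessions, append the same export line to your shell configuration. Assuming zsh, the default shell on modern macOS, that would be:
echo 'export HUGGINGFACE_TOKEN=YOURTOKENHERE' >> ~/.zshrc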
Install the Hugging Face tools required to download the model.
pip install huggingface_hub
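If your Mac reports that pip can’t be found, try the Python 3 variant of the command instead:
pip3 install huggingface_hub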
Alternative for Homebrew Users Only: If you prefer Homebrew, install the Hugging Face CLI formula instead (otherwise, skip this step). Note that the Homebrew package is named huggingface-cli rather than huggingface_hub.
brew install huggingface-cli
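Whichever route you took, you can verify the tools are on your PATH by asking the CLI for its help text:
huggingface-cli --help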
Now, for the most time-consuming part: downloading the model. We’ll use the Int4 quantized version of Mistral for a smaller download size. Ensure you are in the swift-transformers/Examples directory when executing this command.
swift run --package-path .. mistral-download --model-id mistralai/Mistral-7B-Instruct-v0.3-coreml-int4
If everything has been set up correctly, you can now interact with your local LLM via the command line.
swift run --package-path .. mistral --model-id mistralai/Mistral-7B-Instruct-v0.3-coreml-int4
You should see output similar to this:
[MistralApp.Mistral] Starting chat session
>>> User: What is the capital of France?
[MistralApp.Mistral] Paris is the capital of France.
[MistralApp.Mistral] Tokens per second: 42.77
>>> User:
Notice the performance metric: tokens per second (42.77 in this example), measured on a 2023 MacBook Pro M2 Max. This metric will be crucial for future model and code optimization, especially when deploying to mobile platforms.
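If you later instrument your own Swift code, the metric is simply the number of generated tokens divided by wall-clock seconds. Here’s a minimal, self-contained sketch of that calculation; the tokensPerSecond helper and the generation closure are hypothetical stand-ins, not part of Swift Transformers:
import Foundation

// Minimal sketch: compute tokens per second for a generation run.
// `runGeneration` is a hypothetical stand-in for your model call;
// it should return the number of tokens it produced.
func tokensPerSecond(_ runGeneration: () -> Int) -> Double {
    let start = Date()
    let tokenCount = runGeneration()
    let elapsed = Date().timeIntervalSince(start)
    return Double(tokenCount) / elapsed
}

// Example usage with a dummy workload:
let tps = tokensPerSecond {
    // ...call your LLM here and return the generated-token count...
    return 128
}
print(String(format: "Tokens per second: %.2f", tps))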
Next Steps in Your LLM Journey
Congratulations! Achieving command-line output signifies a significant step towards running LLMs on your Mac and beyond. Experiment with different prompts and explore the capabilities of your local LLM through the command line.
To further your exploration, Part 2 of this guide will demonstrate how to utilize the Swift Chat app and Xcode to run the model within a graphical user interface (GUI) on your Mac.
Sources for Further Exploration
Delve deeper into the source materials for a more comprehensive understanding:
https://huggingface.co/blog/mistral-coreml
https://huggingface.co/apple/mistral-coreml
https://huggingface.co/blog/swift-coreml-llm
https://github.com/huggingface/swift-transformers
https://github.com/huggingface/swift-transformers/tree/preview