Building ExecuTorch LLaMA iOS Demo App

This LLaMA chat app demonstrates a local, on-device inference use case with ExecuTorch.

To get started, clone the ExecuTorch repository, initialize its submodules, and create a Python virtual environment:


git clone -b release/0.2 https://github.com/pytorch/executorch.git
cd executorch
git submodule update --init

python3 -m venv .venv && source .venv/bin/activate


Exporting models

Please refer to the ExecuTorch Llama2 docs to export the model.

Run the App

  1. Open the project in Xcode.

  2. Run the app (cmd+R).

  3. In the app UI, pick a model and tokenizer to use, type a prompt, and tap the arrow button.


The ExecuTorch runtime is distributed as a Swift package that provides several .xcframework bundles as prebuilt binary targets. Xcode downloads and caches the package on the first run, which takes some time.

Copy the model to Simulator

  1. Drag and drop the model and tokenizer files onto the Simulator window and save them somewhere inside the iLLaMA folder.

  2. Pick the files in the app dialog, type a prompt and click the arrow-up button.

Copy the model to Device

  1. Connect the device to your Mac with a cable and open its contents in Finder.

  2. Navigate to the Files tab and drag and drop the model and tokenizer files onto the iLLaMA folder.

  3. Wait until the files are copied.
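
Before copying files to the Simulator or a device, it can help to confirm that the exported model and tokenizer actually exist and are not empty, since a failed export may leave a zero-byte file behind. The following is a minimal sketch of such a pre-flight check, not part of ExecuTorch itself. The function name `check_artifacts` and the file names in the usage comment are hypothetical; substitute whatever your export step produced.

```shell
#!/bin/sh
# Pre-flight check (illustrative helper, not an ExecuTorch tool):
# returns 0 only if every path given as an argument is an existing,
# non-empty file; reports each problem file on stderr.
check_artifacts() {
  status=0
  for f in "$@"; do
    # -s is true only when the file exists and has a size greater than zero
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage (file names below are assumptions, adjust to your export output):
#   check_artifacts llama2.pte tokenizer.bin && echo "ready to copy"
```

The check is a shell function with no side effects, so you can paste it into a terminal session or source it from a script before the drag-and-drop step.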

[Image: iOS app running a LLaMA model]

Reporting Issues

If you encounter any bugs or issues while following this tutorial, please file an issue on GitHub.

