Improving performance of LLaMPPL models

If your LLaMPPL model is running slowly, consider exploiting the following features to improve performance:

  • Auto-Batching — to run multiple particles concurrently, with batched LLM calls (see the first sketch after this list)
  • Caching — to cache key and value vectors for long prompts (second sketch below)
  • Immutability hinting — to significantly speed up the bookkeeping performed by SMC inference (third sketch below)
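
Auto-batching needs no special code beyond writing `step` as an async method and awaiting `self.sample` / `self.observe`: the inference engine interleaves particles and gathers their pending next-token queries into single batched LLM calls. The following is a minimal sketch, not a definitive implementation; the model name, the prompt, and the `LLM.batch_size` value are placeholders, and `batch_size` is assumed to be the attribute that caps how many queries are batched per forward pass.

```python
import asyncio
from hfppl import CachedCausalLM, LMContext, Model, smc_standard

LLM = CachedCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
LLM.batch_size = 40  # assumed knob: max particle queries per batched forward pass

class Generation(Model):
    def __init__(self, lm, prompt):
        super().__init__()
        self.context = LMContext(lm, prompt)        # stateful LLM context, seeded with the prompt
        self.eos_token = lm.tokenizer.eos_token_id

    async def step(self):
        # Awaiting here yields control so other particles can reach their own
        # next-token queries; the engine batches all pending queries into one
        # LLM call instead of issuing them one at a time.
        token = await self.sample(self.context.next_token())
        if token.token_id == self.eos_token:
            self.finish()

# Running SMC with many particles is what makes the batching pay off.
particles = asyncio.run(smc_standard(Generation(LLM, "The weather today is"), 20))
```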
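
When every particle shares a long, fixed prompt, the key and value vectors for the prompt tokens can be computed once up front and reused, so particles only pay for the tokens they add. A hedged sketch, assuming a `CachedCausalLM.cache_kv` method that accepts the tokenized prompt (the method name and signature are assumptions about the library's API), reusing the `Generation` model from the previous sketch:

```python
prompt = "You are a helpful assistant. <long instructions and few-shot examples>"

# Assumed behavior: precompute and store the key/value vectors for these
# prompt tokens, so later queries whose context starts with this prefix
# skip re-running attention over the prompt.
LLM.cache_kv(LLM.tokenizer.encode(prompt))

# Any context created from this prompt now starts from the cached prefix.
particles = asyncio.run(smc_standard(Generation(LLM, prompt), 20))
```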
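
SMC resampling duplicates particles, and by default the whole model object is copied. Attributes that are never mutated after `__init__` can be hinted as immutable so that copies share them instead. The sketch below assumes a `Model.immutable_properties` hook returning a set of attribute names to share rather than copy; the hook name, its semantics, and the `mask_dist` conditioning pattern are assumptions about the hfppl API.

```python
class ConstrainedGeneration(Model):
    def __init__(self, lm, prompt, forbidden_tokens):
        super().__init__()
        self.context = LMContext(lm, prompt)
        self.eos_token = lm.tokenizer.eos_token_id
        self.forbidden_tokens = forbidden_tokens  # built once, never mutated

    def immutable_properties(self):
        # Assumed hook: attributes named here are shared (not deep-copied)
        # when particles are duplicated during resampling.
        return set(["forbidden_tokens"])

    async def step(self):
        # Condition on the next token not being forbidden, then sample it.
        await self.observe(self.context.mask_dist(self.forbidden_tokens), False)
        token = await self.sample(self.context.next_token())
        if token.token_id == self.eos_token:
            self.finish()
```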