Python in Plain English

New Python content every day. Follow to join our 3.5M+ monthly readers.

Creating a Llama2 Managed Endpoint in Azure ML and Using it from Langchain

Part 1: A step-by-step guide to creating a Llama2 model in Azure ML.

7 min read · Aug 3, 2023


Today, we are going to show, step by step, how to create a Llama2 model (from Meta), or any other model you select from Azure ML Studio, and, most importantly, how to use it from LangChain.

Once you are logged in to Azure ML, click on Model Catalog on the left.

Then, under Introducing Llama 2, click on View models.


Select the model you want; in my case, I selected the smallest one, Llama2 7B.

On the summary page, click on Deploy and select Real-time endpoint.


Then choose to deploy with Azure AI Content Safety, which can detect and filter harmful content:


All of these steps essentially lead you to a Jupyter Notebook in which we will replace some variables to deploy the Azure ML managed endpoint.

Click on Clone this notebook, and let's start reviewing the code.

Let's start with some basic variables. With this piece of code we define the registry we will get the model from, the name of the model, the name of the endpoint, the deployment name, the compute instance size, and the severity threshold for content to be filtered by Azure AI Content Safety.
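As a sketch, the variables look roughly like the following. The names mirror the pattern used in Azure's sample notebooks, but every value here is a placeholder assumption; substitute your own:

```python
# Placeholder values -- replace with your own. Variable names follow the
# pattern of Azure ML's Llama 2 sample notebook, but all values below
# are assumptions for illustration.
registry_name = "azureml-meta"            # registry hosting Meta's models
model_name = "Llama-2-7b"                 # model selected in the catalog
endpoint_name = "llama2-7b-endpoint"      # managed online endpoint name
deployment_name = "llama2-7b-deployment"  # deployment within the endpoint
instance_type = "Standard_NC24s_v3"       # GPU compute size for the deployment
content_safety_threshold = 2              # severity at or above which content is filtered
```

The endpoint name must be unique within your Azure region, so it is common to append a random suffix to it.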


Please note that the selected standard SKU is a GPU VM; here you can find a list of the supported sizes:
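As a quick sanity check before deploying, you could verify your chosen size against a few of the GPU SKUs Azure offers for managed online endpoints. The list below is illustrative only; the authoritative list is in the linked documentation and varies by region and quota:

```python
# A few GPU VM SKUs commonly available for Azure ML managed online
# endpoints (illustrative only -- check the Azure docs for the
# authoritative, region-specific list).
GPU_SKUS = {
    "Standard_NC6s_v3",         # 1x V100
    "Standard_NC12s_v3",        # 2x V100
    "Standard_NC24s_v3",        # 4x V100
    "Standard_NC24ads_A100_v4", # 1x A100
}

def is_gpu_sku(size: str) -> bool:
    """Cheap local check that the chosen compute size is a known GPU SKU."""
    return size in GPU_SKUS

print(is_gpu_sku("Standard_NC24s_v3"))  # True
```

A check like this only catches typos early; the deployment itself will still fail if your subscription lacks quota for the chosen SKU in that region.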

