feat: implement managed identity (closes #56) (#58)

* feat: use managed identity for storage
* feat: use openai managed identity
* chore: add identity package
* fix: db connection string
* refactor: add getCredentials helper
* chore: remove key references
* chore: update dependencies
* docs: migrate docs to mention AI search
* feat: replace MongoDB with AI search in infra
* fix: infra template
* chore: fix infra template to disable local auth
* feat: migrate DB code to AI Search
* fix: migrate filename metadata
* chore: disable blob public access
* chore: update dependency
* docs: add RBAC permissions
* chore: update chat model and reduce db cost
* chore: update infra
* docs: suggest default region
* chore: revert gpt version
* chore: fix typo

Azure-Samples · Apr 30, 2024 · d7c7a29
1 parent 7a7f4bb

Showing 18 changed files with 799 additions and 326 deletions.
diff --git a/README.md b/README.md
@@ -23,7 +23,7 @@
 
 </div>
 
-This sample shows how to build a serverless AI chat experience with Retrieval-Augmented Generation using [LangChain.js](https://js.langchain.com/) and Azure. The application is hosted on [Azure Static Web Apps](https://learn.microsoft.com/azure/static-web-apps/overview) and [Azure Functions](https://learn.microsoft.com/azure/azure-functions/functions-overview?pivots=programming-language-javascript), with [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/) as the vector database. You can use it as a starting point for building more complex AI applications.
+This sample shows how to build a serverless AI chat experience with Retrieval-Augmented Generation using [LangChain.js](https://js.langchain.com/) and Azure. The application is hosted on [Azure Static Web Apps](https://learn.microsoft.com/azure/static-web-apps/overview) and [Azure Functions](https://learn.microsoft.com/azure/azure-functions/functions-overview?pivots=programming-language-javascript), with [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search) as the vector database. You can use it as a starting point for building more complex AI applications.
 
 > [!TIP]
 > You can test this application locally without any cost using [Ollama](https://ollama.com/). Follow the instructions in the [Local Development](#local-development) section to get started.
@@ -44,7 +44,7 @@ This application is made from multiple components:
 
 - A serverless API built with [Azure Functions](https://learn.microsoft.com/azure/azure-functions/functions-overview?pivots=programming-language-javascript) and using [LangChain.js](https://js.langchain.com/) to ingest the documents and generate responses to the user chat queries. The code is located in the `packages/api` folder.
 
-- A database to store the text extracted from the documents and the vectors generated by LangChain.js, using [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/).
+- A database to store the text extracted from the documents and the vectors generated by LangChain.js, using [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search).
 
 - A file storage to store the source documents, using [Azure Blob Storage](https://learn.microsoft.com/azure/storage/blobs/storage-blobs-introduction).
 
@@ -161,13 +161,16 @@ Note that the documents are uploaded automatically when deploying the sample to
 
 - **Azure account**. If you're new to Azure, [get an Azure account for free](https://azure.microsoft.com/free) to get free Azure credits to get started. If you're a student, you can also get free credits with [Azure for Students](https://aka.ms/azureforstudents).
 - **Azure subscription with access enabled for the Azure OpenAI service**. You can request access with [this form](https://aka.ms/oaiapply).
+- **Azure account permissions**:
+  - Your Azure account must have `Microsoft.Authorization/roleAssignments/write` permissions, such as [Role Based Access Control Administrator](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#role-based-access-control-administrator-preview), [User Access Administrator](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#user-access-administrator), or [Owner](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#owner). If you don't have subscription-level permissions, you must be granted [RBAC](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#role-based-access-control-administrator-preview) for an existing resource group and [deploy to that existing group](docs/deploy_existing.md#resource-group).
+  - Your Azure account also needs `Microsoft.Resources/deployments/write` permissions on the subscription level.
 
 #### Deploy the sample
 
 1. Open a terminal and navigate to the root of the project.
 2. Authenticate with Azure by running `azd auth login`.
 3. Run `azd up` to deploy the application to Azure. This will provision Azure resources, deploy this sample, and build the search index based on the files found in the `./data` folder.
-   - You will be prompted to select a base location for the resources. Choose a location that is closest to you.
+   - You will be prompted to select a base location for the resources. If you're unsure of which location to choose, select `eastus2`.
    - By default, the OpenAI resource will be deployed to `eastus2`. You can set a different location with `azd env set AZURE_OPENAI_RESOURCE_GROUP_LOCATION <location>`. Currently only a short list of locations is accepted. That location list is based on the [OpenAI model availability table](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#standard-deployment-model-availability) and may become outdated as availability changes.
 
 The deployment process will take a few minutes. Once it's done, you'll see the URL of the web app in the terminal.
@@ -194,7 +197,7 @@ Here are some resources to learn more about the technologies used in this sample
 - [LangChain.js documentation](https://js.langchain.com)
 - [Generative AI For Beginners](https://github.com/microsoft/generative-ai-for-beginners)
 - [Azure OpenAI Service](https://learn.microsoft.com/azure/ai-services/openai/overview)
-- [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/)
+- [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search)
 - [Ask YouTube: LangChain.js + Azure Quickstart sample](https://github.com/Azure-Samples/langchainjs-quickstart-demo)
 - [Chat + Enterprise data with Azure OpenAI and Azure AI Search](https://github.com/Azure-Samples/azure-search-openai-javascript)
 - [Revolutionize your Enterprise Data with Chat: Next-gen Apps w/ Azure OpenAI and AI Search](https://aka.ms/entgptsearchblog)

diff --git a/docs/faq.md b/docs/faq.md
@@ -7,7 +7,7 @@ Retrieval-Augmented Generation (RAG) is a method used in artificial intelligence
 
 At its core, RAG involves two main components:
 
-- **Retriever**: Think "_like a search engine_", finding relevant information from a knowledgebase, usually a vector database. In this sample, we're using Azure CosmosDB for MongoDB vCore as our vector database.
+- **Retriever**: Think "_like a search engine_", finding relevant information from a knowledgebase, usually a vector database. In this sample, we're using Azure AI Search as our vector database.
 
 - **Generator**: Acts like a writer, taking the prompt and information retrieved to create a response. We're using here a Large Language Model (LLM) for this task.
 

diff --git a/docs/images/architecture.drawio.png b/docs/images/architecture.drawio.png
diff --git a/docs/readme.md b/docs/readme.md
@@ -17,7 +17,7 @@ description: Build your own serverless AI Chat with Retrieval-Augmented-Generati
 
 <!-- Learn samples onboarding: https://review.learn.microsoft.com/en-us/help/contribute/samples/process/onboarding?branch=main -->
 <!-- prettier-ignore -->
-This sample shows how to build a serverless AI chat experience with Retrieval-Augmented Generation using [LangChain.js](https://js.langchain.com/) and Azure. The application is hosted on [Azure Static Web Apps](https://learn.microsoft.com/azure/static-web-apps/overview) and [Azure Functions](https://learn.microsoft.com/azure/azure-functions/functions-overview?pivots=programming-language-javascript), with [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/) as the vector database. You can use it as a starting point for building more complex AI applications.
+This sample shows how to build a serverless AI chat experience with Retrieval-Augmented Generation using [LangChain.js](https://js.langchain.com/) and Azure. The application is hosted on [Azure Static Web Apps](https://learn.microsoft.com/azure/static-web-apps/overview) and [Azure Functions](https://learn.microsoft.com/azure/azure-functions/functions-overview?pivots=programming-language-javascript), with [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search) as the vector database. You can use it as a starting point for building more complex AI applications.
 
 ![Animation showing the chat app in action](./images/demo.gif)
 
@@ -37,7 +37,7 @@ This application is made from multiple components:
 
 - A serverless API built with [Azure Functions](https://learn.microsoft.com/azure/azure-functions/functions-overview?pivots=programming-language-javascript) and using [LangChain.js](https://js.langchain.com/) to ingest the documents and generate responses to the user chat queries. The code is located in the `packages/api` folder.
 
-- A database to store the text extracted from the documents and the vectors generated by LangChain.js, using [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/).
+- A database to store the text extracted from the documents and the vectors generated by LangChain.js, using [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search).
 
 - A file storage to store the source documents, using [Azure Blob Storage](https://learn.microsoft.com/azure/storage/blobs/storage-blobs-introduction).
 
@@ -48,6 +48,9 @@ This application is made from multiple components:
 - [Git](https://git-scm.com/downloads)
 - Azure account. If you're new to Azure, [get an Azure account for free](https://azure.microsoft.com/free) to get free Azure credits to get started. If you're a student, you can also get free credits with [Azure for Students](https://aka.ms/azureforstudents).
 - Azure subscription with access enabled for the Azure OpenAI service. You can request access with [this form](https://aka.ms/oaiapply).
+- Azure account permissions:
+  - Your Azure account must have `Microsoft.Authorization/roleAssignments/write` permissions, such as [Role Based Access Control Administrator](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#role-based-access-control-administrator-preview), [User Access Administrator](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#user-access-administrator), or [Owner](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#owner). If you don't have subscription-level permissions, you must be granted [RBAC](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#role-based-access-control-administrator-preview) for an existing resource group and [deploy to that existing group](docs/deploy_existing.md#resource-group).
+  - Your Azure account also needs `Microsoft.Resources/deployments/write` permissions on the subscription level.
 
 ## Setup the sample
 
@@ -65,7 +68,7 @@ You can run this project directly in your browser by using GitHub Codespaces, wh
 1. Open a terminal at the root of the project.
 2. Authenticate with Azure by running `azd auth login`.
 3. Run `azd up` to deploy the application to Azure. This will provision Azure resources, deploy this sample, and build the search index based on the files found in the `./data` folder.
-   - You will be prompted to select a base location for the resources. Choose a location that is closest to you.
+   - You will be prompted to select a base location for the resources. If you're unsure of which location to choose, select `eastus2`.
    - By default, the OpenAI resource will be deployed to `eastus2`. You can set a different location with `azd env set AZURE_OPENAI_RESOURCE_GROUP_LOCATION <location>`. Currently only a short list of locations is accepted. That location list is based on the [OpenAI model availability table](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#standard-deployment-model-availability) and may become outdated as availability changes.
 
 The deployment process will take a few minutes. Once it's done, you'll see the URL of the web app in the terminal.
@@ -109,7 +112,7 @@ Here are some resources to learn more about the technologies used in this sample
 - [LangChain.js documentation](https://js.langchain.com)
 - [Generative AI For Beginners](https://github.com/microsoft/generative-ai-for-beginners)
 - [Azure OpenAI Service](https://learn.microsoft.com/azure/ai-services/openai/overview)
-- [Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/azure/cosmos-db/mongodb/vcore/)
+- [Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search)
 - [Ask YouTube: LangChain.js + Azure Quickstart sample](https://github.com/Azure-Samples/langchainjs-quickstart-demo)
 - [Chat + Enterprise data with Azure OpenAI and Azure AI Search](https://github.com/Azure-Samples/azure-search-openai-javascript)
 - [Revolutionize your Enterprise Data with Chat: Next-gen Apps w/ Azure OpenAI and AI Search](https://aka.ms/entgptsearchblog)

diff --git a/infra/core/ai/cognitiveservices.bicep b/infra/core/ai/cognitiveservices.bicep
@@ -20,6 +20,7 @@ param networkAcls object = empty(allowedIpRules) ? {
   ipRules: allowedIpRules
   defaultAction: 'Deny'
 }
+param disableLocalAuth bool = false
 
 resource account 'Microsoft.CognitiveServices/accounts@2023-05-01' = {
   name: name
@@ -30,6 +31,7 @@ resource account 'Microsoft.CognitiveServices/accounts@2023-05-01' = {
     customSubDomainName: customSubDomainName
     publicNetworkAccess: publicNetworkAccess
     networkAcls: networkAcls
+    disableLocalAuth: disableLocalAuth
   }
   sku: sku
 }
@@ -51,4 +53,3 @@ resource deployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01
 output endpoint string = account.properties.endpoint
 output id string = account.id
 output name string = account.name
-output apiKey string = account.listKeys().key1
diff --git a/infra/core/database/cosmos-mongo-db-vcore.bicep b/infra/core/database/cosmos-mongo-db-vcore.bicep
diff --git a/infra/core/search/search-services.bicep b/infra/core/search/search-services.bicep
@@ -0,0 +1,68 @@
+metadata description = 'Creates an Azure AI Search instance.'
+param name string
+param location string = resourceGroup().location
+param tags object = {}
+
+param sku object = {
+  name: 'standard'
+}
+
+param authOptions object = {}
+param disableLocalAuth bool = false
+param disabledDataExfiltrationOptions array = []
+param encryptionWithCmk object = {
+  enforcement: 'Unspecified'
+}
+@allowed([
+  'default'
+  'highDensity'
+])
+param hostingMode string = 'default'
+param networkRuleSet object = {
+  bypass: 'None'
+  ipRules: []
+}
+param partitionCount int = 1
+@allowed([
+  'enabled'
+  'disabled'
+])
+param publicNetworkAccess string = 'enabled'
+param replicaCount int = 1
+@allowed([
+  'disabled'
+  'free'
+  'standard'
+])
+param semanticSearch string = 'disabled'
+
+var searchIdentityProvider = (sku.name == 'free') ? null : {
+  type: 'SystemAssigned'
+}
+
+resource search 'Microsoft.Search/searchServices@2021-04-01-preview' = {
+  name: name
+  location: location
+  tags: tags
+  // The free tier does not support managed identity
+  identity: searchIdentityProvider
+  properties: {
+    authOptions: disableLocalAuth ? null : authOptions
+    disableLocalAuth: disableLocalAuth
+    disabledDataExfiltrationOptions: disabledDataExfiltrationOptions
+    encryptionWithCmk: encryptionWithCmk
+    hostingMode: hostingMode
+    networkRuleSet: networkRuleSet
+    partitionCount: partitionCount
+    publicNetworkAccess: publicNetworkAccess
+    replicaCount: replicaCount
+    semanticSearch: semanticSearch
+  }
+  sku: sku
+}
+
+output id string = search.id
+output endpoint string = 'https://${name}.search.windows.net/'
+output name string = search.name
+output principalId string = !empty(searchIdentityProvider) ? search.identity.principalId : ''
+
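Note: a minimal sketch of how the new `search-services.bicep` module might be consumed with key-based authentication turned off. The calling file, its resource-group scope, the relative module path, and the service name `my-search-service` are assumptions for illustration and are not part of this commit.

```bicep
// Hypothetical caller deployed at resource-group scope; the path is
// relative to an assumed file in the infra/ folder.
module search 'core/search/search-services.bicep' = {
  name: 'search'
  params: {
    // Placeholder name (assumption); search service names must be globally unique.
    name: 'my-search-service'
    // Disable API keys so that only Microsoft Entra ID (RBAC) access is accepted.
    // The module then passes authOptions as null.
    disableLocalAuth: true
  }
}

// The module derives the endpoint from the service name.
output searchEndpoint string = search.outputs.endpoint
// Empty string when the 'free' SKU is used, because the module skips
// the system-assigned identity for that tier.
output searchPrincipalId string = search.outputs.principalId
```

With the default `standard` SKU the service gets a system-assigned identity, so `principalId` can be passed on to a role assignment.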
diff --git a/infra/core/security/role.bicep b/infra/core/security/role.bicep
@@ -0,0 +1,21 @@
+metadata description = 'Creates a role assignment for a service principal.'
+param principalId string
+
+@allowed([
+  'Device'
+  'ForeignGroup'
+  'Group'
+  'ServicePrincipal'
+  'User'
+])
+param principalType string = 'ServicePrincipal'
+param roleDefinitionId string
+
+resource role 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
+  name: guid(subscription().id, resourceGroup().id, principalId, roleDefinitionId)
+  properties: {
+    principalId: principalId
+    principalType: principalType
+    roleDefinitionId: resourceId('Microsoft.Authorization/roleDefinitions', roleDefinitionId)
+  }
+}
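Note: a minimal sketch of how the new `role.bicep` module might be called to grant a managed identity access to a data plane. The calling file, its resource-group scope, the `principalId` parameter, and the choice of role are assumptions for illustration. The GUID shown is believed to be the built-in Storage Blob Data Contributor role, and should be checked against the Azure built-in roles reference before use.

```bicep
// Hypothetical caller deployed at resource-group scope.
// Object ID of the identity that needs access (assumption: supplied by the caller,
// for example the principal ID of a Function App's managed identity).
param principalId string

module storageRole 'core/security/role.bicep' = {
  name: 'storage-contributor-role'
  params: {
    principalId: principalId
    // Matches the module default; use 'User' when assigning to a signed-in developer.
    principalType: 'ServicePrincipal'
    // Assumed to be Storage Blob Data Contributor; verify before deploying.
    roleDefinitionId: 'ba92f5b4-2d11-453d-a403-e96b0029c9fe'
  }
}
```

Because the module names the assignment with `guid(subscription().id, resourceGroup().id, principalId, roleDefinitionId)`, redeploying with the same inputs updates the same assignment instead of creating a duplicate. Deploying it requires the `Microsoft.Authorization/roleAssignments/write` permission mentioned in the README changes.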
diff --git a/infra/core/storage/storage-account.bicep b/infra/core/storage/storage-account.bicep
@@ -60,8 +60,5 @@ resource storage 'Microsoft.Storage/storageAccounts@2022-05-01' = {
   }
 }
 
-var sharedKey = storage.listKeys().keys[0].value
-
 output name string = storage.name
 output primaryEndpoints object = storage.properties.primaryEndpoints
-output connectionString string = 'DefaultEndpointsProtocol=https;AccountName=${storage.name};AccountKey=${sharedKey};EndpointSuffix=core.windows.net'