-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Azure Goverment storage #15523
Add support for Azure Goverment storage #15523
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for the PR. I have left some comments.
* <code>"blob.core.chinacloudapi.cn"</code> or | ||
* <code>"blob.core.usgovcloudapi.net</code>" | ||
*/ | ||
public AzureUtils(String blobStorageEndpointSuffix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since its a utils class, lets not add a constructor here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -40,5 +40,6 @@ To use this Apache Druid extension, [include](../../configuration/extensions.md# | |||
|`druid.azure.protocol`|the protocol to use|http or https|https| | |||
|`druid.azure.maxTries`|Number of tries before canceling an Azure operation.| |3| | |||
|`druid.azure.maxListingLength`|maximum number of input files matching a given prefix to retrieve at a time| |1024| | |||
|`druid.azure.endpointSuffix`|the endpoint suffix to use.|Examples: `core.windows.net`, `core.usgovcloudapi.net`|`core.windows.net`| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add some more colour here? when would one want to override this? Pointing to external documentation is fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added usecase with link to docs
extensions-core/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureUtils.java
Outdated
Show resolved
Hide resolved
|
||
// The azure storage hadoop access pattern is: | ||
// wasb[s]://<containername>@<accountname>.blob.core.windows.net/<path> | ||
// wasb[s]://<containername>@<accountname>.<blobStorageEndpointSuffix>/<path> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// wasb[s]://<containername>@<accountname>.<blobStorageEndpointSuffix>/<path> | |
// wasb[s]://<containername>@<accountname>.blob.<endpointSuffix>/<path> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
return endpointSuffix; | ||
} | ||
|
||
public String getBlobStorageEndpointSuffix() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it the whole endpoint itself? why are we calling this method getBlobStorageEndpointSuffix
instead of getBlobStorageEndpoint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
@@ -59,7 +64,7 @@ FileUtils.FileCopyResult getSegmentFiles( | |||
"Loading container: [%s], with blobPath: [%s] and outDir: [%s]", containerName, blobPath, outDir | |||
); | |||
|
|||
final String actualBlobPath = AzureUtils.maybeRemoveAzurePathPrefix(blobPath); | |||
final String actualBlobPath = azureUtils.maybeRemoveAzurePathPrefix(blobPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could pass the endpoint suffix as an argument than passing it in the AzureUtils constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last comment on the tests. Please do fix the conflicts as well.
String outputBlob = AzureUtils.maybeRemoveAzurePathPrefix("blob.core.usgovcloudapi.net/container/blob", config.getBlobStorageEndpoint()); | ||
Assert.assertEquals("container/blob", outputBlob); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests should move to AzureUtilsTest. Here you can test that AzureAccountConfig is correctly initialized with the properties you are setting. And in AzureUtilsTest, you will add these tests but without creating an AzureAccountConfig object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -40,5 +40,6 @@ To use this Apache Druid extension, [include](../../configuration/extensions.md# | |||
|`druid.azure.protocol`|the protocol to use|http or https|https| | |||
|`druid.azure.maxTries`|Number of tries before canceling an Azure operation.| |3| | |||
|`druid.azure.maxListingLength`|maximum number of input files matching a given prefix to retrieve at a time| |1024| | |||
|`druid.azure.endpointSuffix`|The endpoint suffix to use. Could be overriden for connecting with [Azure Goverment](https://learn.microsoft.com/en-us/azure/azure-government/documentation-government-get-started-connect-to-storage#getting-started-with-storage-api).|Examples: `core.windows.net`, `core.usgovcloudapi.net`|`core.windows.net`| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|`druid.azure.endpointSuffix`|The endpoint suffix to use. Could be overriden for connecting with [Azure Goverment](https://learn.microsoft.com/en-us/azure/azure-government/documentation-government-get-started-connect-to-storage#getting-started-with-storage-api).|Examples: `core.windows.net`, `core.usgovcloudapi.net`|`core.windows.net`| | |
|`druid.azure.endpointSuffix`|The endpoint suffix to use. Override the default value to connect to [Azure Goverment](https://learn.microsoft.com/en-us/azure/azure-government/documentation-government-get-started-connect-to-storage#getting-started-with-storage-api).|Examples: `core.windows.net`, `core.usgovcloudapi.net`|`core.windows.net`| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, I've added your variant!
@@ -206,4 +207,18 @@ public void test_azureRetry_RunTimeExceptionWrappedInRunTimeException_returnsFal | |||
boolean retry = AzureUtils.AZURE_RETRY.apply(RUNTIME_EXCEPTION_WRAPPED_IN_RUNTIME_EXCEPTON); | |||
Assert.assertFalse(retry); | |||
} | |||
|
|||
@Test | |||
public void testGetAzureUtilsWithDefaultProperties() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public void testGetAzureUtilsWithDefaultProperties() | |
public void testRemoveAzurePathPrefixDefaultEndpoint() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
} | ||
|
||
@Test | ||
public void testGetAzureUtilsWithDefaultCustomBlobPath() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public void testGetAzureUtilsWithDefaultCustomBlobPath() | |
public void testRemoveAzurePathPrefixCustomEndpoint() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
@@ -259,6 +259,26 @@ public void testAllCredentialsUnset() | |||
); | |||
} | |||
|
|||
@Test | |||
public void testGetAzureUtilsWithDefaultProperties() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public void testGetAzureUtilsWithDefaultProperties() | |
public void testGetBlobStorageEndpointWithDefaultProperties() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
} | ||
|
||
@Test | ||
public void testGetAzureUtilsWithDefaultCustomBlobPath() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public void testGetAzureUtilsWithDefaultCustomBlobPath() | |
public void testGetBlobStorageEndpointWithCustomBlobPath() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
|`druid.azure.protocol`|the protocol to use|http or https|https| | ||
|`druid.azure.maxTries`|Number of tries before canceling an Azure operation.| |3| | ||
|`druid.azure.maxListingLength`|maximum number of input files matching a given prefix to retrieve at a time| |1024| | ||
|Property| Description |Possible Values|Default| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The formatting change needs to be reverted. LGTM otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, changes reverted
thank you for your contribution @nozjkoitop |
Description
This pull request introduces a configurable Azure Storage Endpoint Suffix in the Druid Azure-Extensions.
Release note
Added support for Azure Government storage in Druid Azure-Extensions. This enhancement allows the Azure-Extensions to be compatible with different Azure storage types by updating the endpoint suffix from a hardcoded value to a configurable one.
Key changed/added classes in this PR
AzureAccountConfig
AzureUtils
AzureDataSegmentPuller
AzureStorageDruidModule
This PR has: