Merge pull request #254 from trz42/configurable_upload_directory
make upload directory in S3 bucket configurable
bedroge authored Feb 26, 2024
2 parents b03baa1 + 8b09bec commit 41c1ab1
Showing 5 changed files with 267 additions and 25 deletions.
98 changes: 98 additions & 0 deletions README.md
@@ -404,6 +404,20 @@ submit_command = /usr/bin/sbatch
```
`submit_command` is the full path to the Slurm job submission command used for submitting batch jobs. You may want to verify if `sbatch` is provided at that path or determine its actual location (using `which sbatch`).

```
build_permission = GH_ACCOUNT_1 GH_ACCOUNT_2 ...
```
`build_permission` defines which GitHub accounts have the permission to trigger
build jobs, i.e., for which accounts the bot acts on `bot: build ...` commands.
If the value is left empty, everyone can trigger build jobs.

```
no_build_permission_comment = The `bot: build ...` command has been used by user `{build_labeler}`, but this person does not have permission to trigger builds.
```
`no_build_permission_comment` defines a comment (template) that is used when
the account trying to trigger build jobs has no permission to do so.


#### `[bot_control]` section

The `[bot_control]` section contains settings for configuring the feature to
@@ -485,6 +499,43 @@ This defines a message that is added to the status table in a PR comment
corresponding to a job whose tarball should have been uploaded (e.g., after
setting the `bot:deploy` label).


```
metadata_prefix = LOCATION_WHERE_METADATA_FILE_GETS_DEPOSITED
tarball_prefix = LOCATION_WHERE_TARBALL_GETS_DEPOSITED
```

These two settings are used to define where (which directory) in the S3 bucket
(see `bucket_name` above) the metadata file and the tarball will be stored. The
value `LOCATION...` can be a string value to always use the same 'prefix'
regardless of the target CVMFS repository, or can be a mapping of a target
repository id (see also `repo_target_map` below) to a prefix.
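
As a rough illustration of the two accepted forms, the following Python sketch resolves the prefix for a given target repository id. The helper name `resolve_prefix`, the JSON-based parsing and the sample prefixes are assumptions made for this example only; the bot's actual parsing code is not shown here.

```python
import json


def resolve_prefix(raw_value, repo_id):
    """Return the prefix configured for repo_id.

    Hypothetical helper: raw_value is the text found after
    `metadata_prefix =` or `tarball_prefix =` in the config file.
    """
    value = raw_value.strip()
    if not value:
        # empty setting: the caller falls back to the old/legacy prefix
        return ""
    if value.startswith("{"):
        # mapping of target repository id -> prefix
        mapping = json.loads(value)
        return mapping.get(repo_id, "")
    # plain string: the same prefix is used for every target repository
    return value


# illustrative values, not taken from a real configuration
print(resolve_prefix("staging/uploads", "eessi.io-2023.06"))
print(resolve_prefix('{"eessi.io-2023.06": "new/uploads"}', "eessi.io-2023.06"))
```

In this sketch a repository id that is missing from the mapping is treated like an empty setting.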

The prefix itself can use some (environment) variables that are set within
the upload script (see `tarball_upload_script` above). Currently those are:
* `'${github_repository}'` (which would be expanded to the full name of the GitHub
repository, e.g., `EESSI/software-layer`),
* `'${legacy_aws_path}'` (which expands to the legacy/old prefix being used for
storing tarballs/metadata files, the old prefix is
`EESSI_VERSION/TARBALL_TYPE/OS_TYPE/CPU_ARCHITECTURE/TIMESTAMP/`), _and_
* `'${pull_request_number}'` (which would be expanded to the number of the pull
request from which the tarball originates).

Note: it is important to single-quote (`'`) the variables as shown above, because
they are likely not defined when the bot calls the upload script.

The list of supported variables can be shown by running
`scripts/eessi-upload-to-staging --list-variables`.

**Examples:**
```
metadata_prefix = {"eessi.io-2023.06": "new/${github_repository}/${pull_request_number}"}
tarball_prefix = {
"eessi-pilot-2023.06": "",
"eessi.io-2023.06": "new/${github_repository}/${pull_request_number}"
}
```
If a setting is left empty, the old/legacy prefix is used.
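
To make the expansion step more concrete, here is a minimal Python sketch of what the upload script does in Bash with `envsubst`. It uses `string.Template`, which understands the same `${name}` syntax. The function names, the tarball file name and the pull request number are assumptions chosen for illustration; the real logic lives in `scripts/eessi-upload-to-staging`.

```python
import os
import re
from collections import defaultdict
from string import Template


def legacy_path_from_tarball(tarball):
    """Derive the legacy prefix from the tarball file name.

    Mirrors the script's pipeline: turn '-' into '/', drop the leading
    'eessi' element and strip the '.tar.gz' suffix.
    """
    name = os.path.basename(tarball)
    name = re.sub(r"/+", "/", name.replace("-", "/"))  # like `tr -s '-' '/'`
    name = re.sub(r"^eessi.", "", name)
    return re.sub(r"\.tar\.gz$", "", name)


def expand_prefix(prefix, tarball, github_repository, pull_request_number):
    """Expand the supported variables in prefix (hypothetical helper)."""
    legacy = legacy_path_from_tarball(tarball)
    if not prefix:
        # empty setting: use the old/legacy prefix
        return legacy
    # undefined variables expand to an empty string, as with envsubst
    variables = defaultdict(
        str,
        github_repository=github_repository,
        legacy_aws_path=legacy,
        pull_request_number=str(pull_request_number),
    )
    return Template(prefix).substitute(variables)


# made-up tarball name following the legacy naming scheme
tarball = "eessi-2023.06-software-linux-x86_64-generic-1708000000.tar.gz"
print(expand_prefix("", tarball, "EESSI/software-layer", 123))
print(expand_prefix("new/${github_repository}/${pull_request_number}",
                    tarball, "EESSI/software-layer", 123))
```

The first call returns the legacy prefix, the second one a prefix built from the repository name and the pull request number.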
#### `[architecturetargets]` section
The section `[architecturetargets]` defines the targets (OS/SUBDIR, for example `linux/x86_64/amd/zen2`) for which the EESSI bot should submit jobs, and which additional `sbatch` parameters are used to request a compute node with the CPU microarchitecture needed to build the software stack.
@@ -657,6 +708,53 @@ job_test_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for de
`job_test_unknown_fmt` is used in case no test file (produced by `bot/check-test.sh`
provided by target repository) was found.
#### `[download_pr_comments]` section
The `[download_pr_comments]` section sets templates for messages related to
downloading the contents of a pull request.
```
git_clone_failure = Unable to clone the target repository.
```
`git_clone_failure` is shown when `git clone` failed.
```
git_clone_tip = _Tip: This could be a connection failure. Try again and if the issue remains check if the address is correct_.
```
`git_clone_tip` should contain some hint on how to deal with the issue. It is shown when `git clone` failed.
```
git_checkout_failure = Unable to checkout to the correct branch.
```
`git_checkout_failure` is shown when `git checkout` failed.
```
git_checkout_tip = _Tip: Ensure that the branch name is correct and the target branch is available._
```
`git_checkout_tip` should contain some hint on how to deal with the failure. It
is shown when `git checkout` failed.
```
curl_failure = Unable to download the `.diff` file.
```
`curl_failure` is shown when downloading the `PR_NUMBER.diff` file failed.
```
curl_tip = _Tip: This could be a connection failure. Try again and if the issue remains check if the address is correct_
```
`curl_tip` should contain some hint on how to deal with failed downloads of the `.diff` file.
```
git_apply_failure = Unable to download or merge changes between the source branch and the destination branch.
```
`git_apply_failure` is shown when applying the `.diff` file with `git apply`
failed.
```
git_apply_tip = _Tip: This can usually be resolved by syncing your branch and resolving any merge conflicts._
```
`git_apply_tip` should guide the contributor/maintainer about resolving the cause
of `git apply` failing.
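
The relationship between a failing download step and the templates above can be sketched as follows. This is only an illustration: the stage names, the `STAGE_TO_KEYS` table, the function `build_download_comment` and the layout of the resulting comment are assumptions, not the bot's actual implementation in `tasks/build.py`.

```python
# Hypothetical stage names mapped to the (failure, tip) setting names
# of the [download_pr_comments] section.
STAGE_TO_KEYS = {
    "git clone": ("git_clone_failure", "git_clone_tip"),
    "git checkout": ("git_checkout_failure", "git_checkout_tip"),
    "curl": ("curl_failure", "curl_tip"),
    "git apply": ("git_apply_failure", "git_apply_tip"),
}


def build_download_comment(templates, error_stage, error_output):
    """Return a PR comment for a failed download step, or None on success."""
    if error_stage == "none":
        # nothing went wrong, so there is nothing to report
        return None
    failure_key, tip_key = STAGE_TO_KEYS[error_stage]
    return "\n".join([
        templates[failure_key],
        templates[tip_key],
        f"Error output: {error_output}",
    ])


# two of the templates documented above
templates = {
    "curl_failure": "Unable to download the `.diff` file.",
    "curl_tip": "_Tip: This could be a connection failure. Try again and "
                "if the issue remains check if the address is correct_",
}
print(build_download_comment(templates, "curl", "connection timed out"))
```

Keeping the failure message and the tip in separate settings lets a site reword the hint without touching the short error line.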
# Instructions to run the bot components
The bot consists of three components:
22 changes: 22 additions & 0 deletions app.cfg.example
@@ -147,6 +147,28 @@ deploy_permission =
# template for comment when user who set a label has no permission to trigger deploying tarballs
no_deploy_permission_comment = Label `bot:deploy` has been set by user `{deploy_labeler}`, but this person does not have permission to trigger deployments

# settings for where (directory) in the S3 bucket to store the metadata file and
# the tarball
# - Can be a string value to always use the same 'prefix' regardless of the target
# CVMFS repository, or can be a mapping of a target repository id (see also
# repo_target_map) to a prefix.
# - The prefix itself can use some (environment) variables that are set within
# the script. Currently those are:
# * 'github_repository' (which would be expanded to the full name of the GitHub
# repository, e.g., 'EESSI/software-layer'),
# * 'legacy_aws_path' (which expands to the legacy/old prefix being used for
# storing tarballs/metadata files) and
# * 'pull_request_number' (which would be expanded to the number of the pull
# request from which the tarball originates).
# - The list of supported variables can be shown by running
# `scripts/eessi-upload-to-staging --list-variables`.
# - Examples:
# metadata_prefix = {"eessi.io-2023.06": "new/${github_repository}/${pull_request_number}"}
# tarball_prefix = {"eessi-pilot-2023.06": "", "eessi.io-2023.06": "new/${github_repository}/${pull_request_number}"}
# If left empty, the old/legacy prefix is used.
metadata_prefix =
tarball_prefix =


[architecturetargets]
# defines both for which architectures the bot will build
109 changes: 86 additions & 23 deletions scripts/eessi-upload-to-staging
@@ -41,7 +41,7 @@ function create_metadata_file
_tarball=$1
_url=$2
_repository=$3
_pull_request=$4
_pull_request_number=$4
_pull_request_comment_id=$5

_tmpfile=$(mktemp)
@@ -56,31 +56,43 @@ function create_metadata_file
--arg sha256 "$(sha256sum "${_tarball}" | awk '{print $1}')" \
--arg url "${_url}" \
--arg repo "${_repository}" \
--arg pr "${_pull_request}" \
--arg pr "${_pull_request_number}" \
--arg pr_comment_id "${_pull_request_comment_id}" \
'{
uploader: {username: $un, ip: $ip, hostname: $hn},
payload: {filename: $fn, size: $sz, ctime: $ct, sha256sum: $sha256, url: $url},
link2pr: {repo: $repo, pr: $pr, pr_comment_id: $pr_commend_id},
link2pr: {repo: $repo, pr: $pr, pr_comment_id: $pr_comment_id},
}' > "${_tmpfile}"

echo "${_tmpfile}"
}

function display_help
{
echo "Usage: $0 [OPTIONS] <filenames>" >&2
echo " -e | --endpoint-url URL - endpoint url (needed for non AWS S3)" >&2
echo " -h | --help - display this usage information" >&2
echo " -i | --pr-comment-id - identifier of a PR comment; may be" >&2
echo " used to efficiently determine the PR" >&2
echo " comment to be updated during the" >&2
echo " ingestion procedure" >&2
echo " -n | --bucket-name BUCKET - bucket name (same as BUCKET above)" >&2
echo " -p | --pull-request NUMBER - a pull request NUMBER; used to" >&2
echo " link the upload to a PR" >&2
echo " -r | --repository FULL_NAME - a repository name ACCOUNT/REPONAME;" >&2
echo " used to link the upload to a PR" >&2
echo "Usage: $0 [OPTIONS] <filenames>" >&2
echo " -e | --endpoint-url URL - endpoint url (needed for non AWS S3)" >&2
echo " -h | --help - display this usage information" >&2
echo " -i | --pr-comment-id - identifier of a PR comment; may be" >&2
echo " used to efficiently determine the PR" >&2
echo " comment to be updated during the" >&2
echo " ingestion procedure" >&2
echo " -l | --list-variables - list variables that are available" >&2
echo " for expansion" >&2
echo " -m | --metadata-prefix PREFIX - a directory to which the metadata" >&2
echo " file shall be uploaded; BASH variable" >&2
echo " expansion will be applied; arg '-l'" >&2
echo " lists variables that are defined at" >&2
echo " the time of expansion" >&2
echo " -n | --bucket-name BUCKET - bucket name (same as BUCKET above)" >&2
echo " -p | --pull-request-number INT - a pull request number (INT); used to" >&2
echo " link the upload to a PR" >&2
echo " -r | --repository FULL_NAME - a repository name ACCOUNT/REPONAME;" >&2
echo " used to link the upload to a PR" >&2
echo " -t | --tarball-prefix PREFIX - a directory to which the tarball" >&2
echo " shall be uploaded; BASH variable" >&2
echo " expansion will be applied; arg '-l'" >&2
echo " lists variables that are defined at" >&2
echo " the time of expansion" >&2
}

if [[ $# -lt 1 ]]; then
@@ -106,8 +118,16 @@ endpoint_url=

# provided via command line arguments
pr_comment_id="none"
pull_request="none"
repository="EESSI/software-layer"
pull_request_number="none"
github_repository="EESSI/software-layer"

# provided via options in the bot's config file app.cfg and/or command line argument
metadata_prefix=
tarball_prefix=

# other variables
legacy_aws_path=
variables="github_repository legacy_aws_path pull_request_number"

while [[ $# -gt 0 ]]; do
case $1 in
@@ -119,20 +139,36 @@ while [[ $# -gt 0 ]]; do
display_help
exit 0
;;
-l|--list-variables)
echo "variables that will be expanded: name (default value)"
for var in ${variables}
do
echo " ${var} (${!var:-unset})"
done
exit 0
;;
-i|--pr-comment-id)
pr_comment_id="$2"
shift 2
;;
-m|--metadata-prefix)
metadata_prefix="$2"
shift 2
;;
-n|--bucket-name)
bucket_name="$2"
shift 2
;;
-p|--pull-request)
pull_request="$2"
-p|--pull-request-number)
pull_request_number="$2"
shift 2
;;
-r|--repository)
repository="$2"
github_repository="$2"
shift 2
;;
-t|--tarball-prefix)
tarball_prefix="$2"
shift 2
;;
-*|--*)
@@ -168,23 +204,50 @@ for file in "$*"; do
basefile=$( basename ${file} )
if check_file_name ${basefile}; then
if tar tf "${file}" | head -n1 > /dev/null; then
aws_path=$(basename ${file} | tr -s '-' '/' \
# 'legacy_aws_path' might be used in tarball_prefix or metadata_prefix
# its purpose is to support the old/legacy method to derive the location
# where to store the tarball and metadata file
export legacy_aws_path=$(basename ${file} | tr -s '-' '/' \
| perl -pe 's/^eessi.//;' | perl -pe 's/\.tar\.gz$//;' )
if [ -z "${tarball_prefix}" ]; then
aws_path=${legacy_aws_path}
else
export pull_request_number
export github_repository
aws_path=$(envsubst <<< "${tarball_prefix}")
fi
aws_file=$(basename ${file})
echo "Creating metadata file"
url="${bucket_base}/${aws_path}/${aws_file}"
metadata_file=$(create_metadata_file "${file}" "${url}" \
"${repository}" "${pull_request}" \
echo "create_metadata_file file=${file} \
url=${url} \
github_repository=${github_repository} \
pull_request_number=${pull_request_number} \
pr_comment_id=${pr_comment_id}"
metadata_file=$(create_metadata_file "${file}" \
"${url}" \
"${github_repository}" \
"${pull_request_number}" \
"${pr_comment_id}")
echo "metadata:"
cat ${metadata_file}

echo Uploading to "${url}"
echo " store tarball at ${aws_path}/${aws_file}"
upload_to_staging_bucket \
"${file}" \
"${bucket_name}" \
"${aws_path}/${aws_file}" \
"${endpoint_url}"

if [ -z "${metadata_prefix}" ]; then
aws_path=${legacy_aws_path}
else
export pull_request_number
export github_repository
aws_path=$(envsubst <<< "${metadata_prefix}")
fi
echo " store metadata file at ${aws_path}/${aws_file}.meta.txt"
upload_to_staging_bucket \
"${metadata_file}" \
"${bucket_name}" \
11 changes: 10 additions & 1 deletion tasks/build.py
@@ -49,6 +49,7 @@
ERROR_GIT_APPLY = "git apply"
ERROR_GIT_CHECKOUT = "git checkout"
ERROR_GIT_CLONE = "git clone"
ERROR_NONE = "none"
GITHUB = "github"
GIT_CLONE_FAILURE = "git_clone_failure"
GIT_CLONE_TIP = "git_clone_tip"
@@ -399,6 +400,9 @@ def download_pr(repo_name, branch_name, pr, arch_job_dir):
error_stage = ERROR_GIT_APPLY
return git_apply_output, git_apply_error, git_apply_exit_code, error_stage

# need to return four items also in case everything went fine
return 'downloading PR succeeded', 'no error while downloading PR', 0, ERROR_NONE


def comment_download_pr(base_repo_name, pr, download_pr_exit_code, download_pr_error, error_stage):
"""
@@ -862,6 +866,8 @@ def request_bot_build_issue_comments(repo_name, pr_number):
status_table (dict): dictionary with 'arch', 'date', 'status', 'url' and 'result'
for all the finished builds;
"""
fn = sys._getframe().f_code.co_name

status_table = {'arch': [], 'date': [], 'status': [], 'url': [], 'result': []}
cfg = config.read_config()

@@ -882,9 +888,12 @@ def request_bot_build_issue_comments(repo_name, pr_number):
first_line = comment['body'].split('\n')[0]
arch_map = get_architecture_targets(cfg)
for arch in arch_map.keys():
target_arch = '/'.join(arch.split('/')[-1])
# drop the first element in arch (which names the OS type) and join the remaining items with '-'
target_arch = '-'.join(arch.split('/')[1:])
if target_arch in first_line:
status_table['arch'].append(target_arch)
else:
log(f"{fn}(): target_arch '{target_arch}' not found in first line '{first_line}'")

# get date, status, url and result from the markdown table
comment_table = comment['body'][comment['body'].find('|'):comment['body'].rfind('|')+1]
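
The effect of the `target_arch` change in the last hunk can be checked with a small standalone Python sketch. The two helper functions are hypothetical wrappers around the old and new expressions, and the architecture string is the example used in the README.

```python
def target_arch_old(arch):
    # old expression: picks the last element and joins its characters with '/'
    return '/'.join(arch.split('/')[-1])


def target_arch_new(arch):
    # new expression: drops the OS type and joins the remaining items with '-'
    return '-'.join(arch.split('/')[1:])


arch = "linux/x86_64/amd/zen2"
print(target_arch_old(arch))  # z/e/n/2
print(target_arch_new(arch))  # x86_64-amd-zen2
```

The old expression passed a single string to `join`, which iterates over its characters, so the result could not match the first line of a bot comment.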