diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index c8f84986..9a4b80af 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -27,7 +27,9 @@
- [ ] `CHANGELOG.md` has been updated
+- [ ] `xdmod_data/__version__.py` has been updated to the next development version
- [ ] The milestone is set correctly on the pull request
- [ ] The appropriate labels have been added to the pull request
- [ ] Running the automated tests (see `docs/developing.md`) produces no errors
- [ ] Updates have been made to the `xdmod-notebooks` repository as necessary, and the notebooks all run successfully
+- [ ] The changes in this PR have been ported/backported to other branches as needed
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0bc3f24c..a154d432 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,8 +1,9 @@
# xdmod-data Changelog
-## Main development branch
+## v1.x.y development branch
- Document Open XDMoD compatibility in changelog ([\#31](https://github.com/ubccr/xdmod-data/pull/31)).
+- Fix IOPub error when showing progress with `get_raw_data()` ([\#37](https://github.com/ubccr/xdmod-data/pull/37)).
## v1.0.1 (2024-09-27)
diff --git a/docs/developing.md b/docs/developing.md
index 67f934a4..991fca4c 100644
--- a/docs/developing.md
+++ b/docs/developing.md
@@ -90,7 +90,6 @@
1. Go to the [GitHub milestones](https://github.com/ubccr/xdmod-data/milestones) and close the milestone for the version.
## After release
-
1. Make a new branch of `xdmod-data` and:
1. Make sure the version number is updated in `xdmod_data/__version__.py` to a pre-release of the next version, e.g., `1.0.1-01`.
1. Update `CHANGELOG.md` to add a section at the top called `Main development branch`.
diff --git a/tests/regression/data/jobs-dimensions.csv b/tests/regression/data/jobs-dimensions.csv
index 798d3541..7c7e2c2b 100644
--- a/tests/regression/data/jobs-dimensions.csv
+++ b/tests/regression/data/jobs-dimensions.csv
@@ -2,7 +2,7 @@ id,label,description
none,None,Summarizes jobs reported to the ACCESS allocations service (excludes non-ACCESS usage of the resource).
allocation,Allocation,A funded project that is allowed to run jobs on resources.
fieldofscience,Field of Science,The field of science indicated on the allocation request pertaining to the running jobs.
-gateway,Gateway,A science gateway is a portal set up to aid submiting jobs to resources.
+gateway,Gateway,A science gateway is a portal set up to aid submitting jobs to resources.
grant_type,Grant Type,A categorization of the projects/allocations.
jobsize,Job Size,A categorization of jobs into discrete groups based on the number of cores used by each job.
jobwaittime,Job Wait Time,A categorization of jobs into discrete groups based on the total linear time each job waited.
@@ -19,7 +19,7 @@ resource,Resource,A resource is a remote computer that can run jobs.
resource_type,Resource Type,A categorization of resources into by their general capabilities.
provider,Service Provider,A service provider is an institution that hosts resources.
username,System Username,The specific system username of the users who ran jobs.
-person,User,"A person who is on a PIs allocation, hence able run jobs on resources."
+person,User,"A person who is on a PIs allocation, hence able to run jobs on resources."
institution,User Institution,Organizations that have users with allocations.
institution_country,User Institution Country,The name of the country of the institution of the person who ran the compute job.
institution_state,User Institution State,The location of the institution of the person who ran the compute job.
diff --git a/tests/regression/data/jobs-metrics.csv b/tests/regression/data/jobs-metrics.csv
index f5a9f14e..b9ff7429 100644
--- a/tests/regression/data/jobs-metrics.csv
+++ b/tests/regression/data/jobs-metrics.csv
@@ -1,4 +1,5 @@
id,label,description
+utilization,ACCESS CPU Utilization (%),"The percentage of the ACCESS obligation of a resource that has been utilized by ACCESS jobs.
ACCESS CPU Utilization: The ratio of the total CPU hours consumed by ACCESS jobs over a given time period divided by the total CPU hours that the system is contractually required to provide to ACCESS during that period. It does not include non-ACCESS jobs.
It is worth noting that this value is a rough estimate in certain cases where the resource providers don't provide accurate records of their system specifications, over time."
avg_ace,ACCESS Credit Equivalents Charged: Per Job (SU),"The average amount of ACCESS Credit Equivalents charged per compute job.
The ACCESS Credit Equivalent is a measure of how much compute time was used on each resource.
@@ -15,7 +16,6 @@ The ACCESS Credit Equivalent allows comparison between usage of node-allocated,
resources. It also allows a comparison between resources with different compute power per core.
The ACCESS allocations exchange calculator
lists conversion rates between an ACCESS Credit Equivalent and a service unit on a resource."
-utilization,ACCESS Utilization (%),"The percentage of the ACCESS obligation of a resource that has been utilized by ACCESS jobs.
ACCESS Utilization: The ratio of the total CPU hours consumed by ACCESS jobs over a given time period divided by the total CPU hours that the system is contractually required to provide to ACCESS during that period. It does not include non-ACCESS jobs.
It is worth noting that this value is a rough estimate in certain cases where the resource providers don't provide accurate records of their system specifications, over time."
rate_of_usage,Allocation Usage Rate (XD SU/Hour),The rate of ACCESS allocation usage in XD SUs per hour.
rate_of_usage_ace,Allocation Usage Rate ACEs (SU/Hour),The rate of ACCESS allocation usage in ACCESS Credit Equivalents per hour.
avg_cpu_hours,CPU Hours: Per Job,"The average CPU hours (number of CPU cores x wall time hours) per ACCESS job.
For each job, the CPU usage is aggregated. For example, if a job used 1000 CPUs for one minute, it would be aggregated as 1000 CPU minutes or 16.67 CPU hours."
@@ -82,7 +82,7 @@ Current TeraGrid supercomputers have complex multi-core and memory hierarchies.
Note: The actual charge will depend on the specific requirements of the job (e.g., the mapping of the cores across the machine, or the priority you wish to obtain).
-Note 2: The SUs show here have been normalized against the XSEDE Roaming service. Therefore they are comparable across resources."
+Note 2: The SUs shown here have been normalized against the XSEDE Roaming service. Therefore they are comparable across resources."
total_su,XD SUs Charged: Total,"The total amount of XD SUs charged by ACCESS jobs.
XD SU: 1 XSEDE SU is defined as one CPU-hour on a Phase-1 DTF cluster.
SU - Service Units: Computational resources on the XSEDE are allocated and charged in service units (SUs). SUs are defined locally on each system, with conversion factors among systems based on HPL benchmark results.
@@ -91,4 +91,4 @@ Current TeraGrid supercomputers have complex multi-core and memory hierarchies.
Note: The actual charge will depend on the specific requirements of the job (e.g., the mapping of the cores across the machine, or the priority you wish to obtain).
-Note 2: The SUs show here have been normalized against the XSEDE Roaming service. Therefore they are comparable across resources."
+Note 2: The SUs shown here have been normalized against the XSEDE Roaming service. Therefore they are comparable across resources."
diff --git a/tests/regression/data/machine-learning-notebook-example-every-1000.csv b/tests/regression/data/machine-learning-notebook-example-every-1000.csv
index 7f44b29e..a14ba1f5 100644
--- a/tests/regression/data/machine-learning-notebook-example-every-1000.csv
+++ b/tests/regression/data/machine-learning-notebook-example-every-1000.csv
@@ -1,44 +1,44 @@
,Nodes,Requested Wall Time,Wait Time,Wall Time,CPU User,"Mount point ""home"" data written","Mount point ""scratch"" data written",Total memory used
-0,1,172800,11,506,,,,
-1000,1,86400,1,66,,,,
-2000,1,86400,18,752,,,,
-3000,1,86400,8,5434,,,,
-4000,1,86400,6,1572,,,,
-5000,1,172800,7,2592,,,,
-6000,1,14400,7,2800,,,,
-7000,1,3600,2894,1357,,,,
-8000,1,21600,116,7277,,,,
-9000,1,21600,2173,6764,,,,
-10000,1,21600,3574,7095,,,,
-11000,1,9000,4,3564,88.01518903173182,992.4354304606816,267087841.6178405,811231171.4285715
-12000,1,21600,158,5565,,,,
-13000,1,21600,59,6965,,,,
-14000,1,21600,9,7760,,,,
-15000,1,3600,22122,1335,,,,
-16000,1,28800,130,9421,12.262731018331898,19749.432156075072,0,787292327.46875
-17000,1,28800,6,1990,,,,
-18000,1,172800,13,73,,,,
-19000,1,172800,7,129,,,,
-20000,1,25200,4,25211,82.16279473845965,0,5113844.667942916,240912572.72941175
-21000,1,21600,18,6099,,,,
-22000,1,21600,27,7131,,,,
-23000,1,1800,61,1079,35.02319701051263,5818814744.200479,0,91742777.25
-24000,1,3600,5,2306,0.11814596015380158,,,33983854837.760006
-25000,1,960,1,59,2.025062333453586,0,0,118141168
-26000,1,172800,1,20494,87.54061396105656,548.1123048956289,0,224020798.15942028
-27000,4,7200,2,7214,99.2396948622311,441.0514348202534,34392345950.27519,1104895888.4
-28000,1,21600,13,55,1.2148444482641405,,,
-29000,1,21600,171,40,,,,
-30000,1,960,0,42,1.5504320217730077,0,0,112133180
-31000,1,1800,11,183,25.94758412119134,,,129784697856
-32000,1,21600,372,114,1.5571541609296509,,,92681043968
-33000,2,1800,134,139,55.875186246345784,533.6963000565588,754284385.8435647,137041136
-34000,1,7200,74,9,0.94096807333301,4681.666820975073,0,145688575
-35000,1,172800,22,83953,98.95217460976379,,,91787361316.73543
-36000,1,6000,8,152,0.4601673251104144,,,85277047466.66667
-37000,1,900,124,137,96.18834348033303,,,35703571797.333336
-38000,1,21600,12,56,24.892228849477622,,,
-39000,1,21600,12,134,26.487756894710913,,,113801609216
-40000,1,21600,12,229,45.74138522053433,,,42761861802.66667
-41000,1,21600,20,307,0.9428384414161763,,,27184930360.88889
-42000,1,21600,130,386,1.68608777466353,,,49510031732.36363
+0,1,172800,2,15048,67.78143277322484,0,0,149919987.98039216
+1000,1,1800,7,133,,,,
+2000,1,86400,8,1997,,,,
+3000,2,60,7,10,,,,
+4000,1,172800,448,88,,,,
+5000,1,72000,1514,5277,,,,
+6000,1,3600,5575,1340,,,,
+7000,1,900,155,252,,,,
+8000,1,172800,6,12013,,,,
+9000,1,3600,111,36,1.8511046269271951,0,12415575.890133914,223612927
+10000,1,21600,9,5993,,,,
+11000,1,3600,18425,1346,,,,
+12000,1,21600,9,7839,,,,
+13000,1,3600,22445,1321,,,,
+14000,1,3600,211,3,,,,
+15000,1,86400,1681,108,,,,
+16000,2,172800,2,85924,35.32048284494427,0,0,669471630.8666667
+17000,1,28800,48,39,,,,
+18000,1,7200,0,611,1.6317304418827532,0,0,242301667
+19000,1,21600,12,1662,,,,
+20000,1,21600,26,7206,,,,
+21000,1,3600,0,1095,20.298036056662443,307784.68622169655,0,377195866
+22000,1,172800,3,280,87.38755792994296,,,131463737856
+23000,1,960,3,42,1.573028452792688,0,0,153550667
+24000,1,960,0,42,1.4618214897575181,0,0,141788220
+25000,256,172800,169623,169681,90.53302268073315,716.4750310720816,110578427985.13992,426185127.56684494
+26000,1,21600,13,72,49.75276270147344,,,118342819840
+27000,1,21600,191,36,,,,
+28000,1,21600,17,21,,,,
+29000,1,1800,11,211,50.00127478407303,,,124171051827.2
+30000,1,21600,328,184,36.69212927476799,,,144036221952
+31000,1,18000,13,6,,,,
+32000,1,960,1,41,1.5277511080658916,0,0,172152952
+33000,1,129600,476,15,,,,
+34000,1,172800,1,15897,98.96305566793583,,,105552713859.87883
+35000,1,7200,0,7217,,,,
+36000,1,21600,1,93,17.130505149503232,,,147102908416
+37000,1,21600,11,150,1.5495503254849237,,,155952593578.66666
+38000,1,21600,1,256,31.266221286764996,,,60418209792
+39000,1,21600,334,10,,,,
+40000,1,21600,334,221,6.50711299161715,,,53499913011.2
+41000,1,3600,1,115,0.2959079693229115,,,163743516672
+42000,4,7200,0,7215,98.47234983404904,441.1070124335988,4020681635.3062363,1288843817.88
diff --git a/tests/regression/data/realms.csv b/tests/regression/data/realms.csv
index f0fb63e9..4d403729 100644
--- a/tests/regression/data/realms.csv
+++ b/tests/regression/data/realms.csv
@@ -5,4 +5,5 @@ Cloud,Cloud
Gateways,Gateways
Jobs,Jobs
Requests,Requests
+ResourceSpecifications,Resource Specifications
SUPREMM,SUPREMM
diff --git a/xdmod_data/__version__.py b/xdmod_data/__version__.py
index 41086d70..61b75eba 100644
--- a/xdmod_data/__version__.py
+++ b/xdmod_data/__version__.py
@@ -1,2 +1,2 @@
__title__ = 'xdmod-data'
-__version__ = '2.0.0-01'
+__version__ = '1.0.2.dev1'
diff --git a/xdmod_data/_http_requester.py b/xdmod_data/_http_requester.py
index ad5bc939..25454343 100644
--- a/xdmod_data/_http_requester.py
+++ b/xdmod_data/_http_requester.py
@@ -63,15 +63,12 @@ def _request_raw_data(self, params):
response = {'fields': line_json}
else:
data.append(line_json)
- if params['show_progress']:
- progress_msg = (
- 'Got ' + str(i) + ' row' + ('' if i == 1 else 's')
- + '...'
- )
- print(progress_msg, end='\r')
+ # Only print every 10,000 rows to avoid IOPub rate-limit errors in Jupyter.
+ if params['show_progress'] and i % 10000 == 0:
+ self.__print_progress_msg(i, '\r')
i += 1
if params['show_progress']:
- print(progress_msg + 'DONE')
+ self.__print_progress_msg(len(data), 'DONE\n')
else:
num_rows = limit
offset = 0
@@ -83,16 +80,11 @@ def _request_raw_data(self, params):
partial_data = response['data']
data += partial_data
if params['show_progress']:
- progress_msg = (
- 'Got ' + str(len(data)) + ' row'
- + ('' if len(data) == 1 else 's')
- + '...'
- )
- print(progress_msg, end='\r')
+ self.__print_progress_msg(len(data), '\r')
num_rows = len(partial_data)
offset += limit
if params['show_progress']:
- print(progress_msg + 'DONE')
+ self.__print_progress_msg(len(data), 'DONE\n')
return (data, response['fields'])
def _request_filter_values(self, realm_id, dimension_id):
@@ -210,3 +202,10 @@ def __get_raw_data_limit(self):
else:
raise
return self.__raw_data_limit
+
+ def __print_progress_msg(self, num_rows, end='\n'):
+ progress_msg = (
+ 'Got ' + str(num_rows) + ' row' + ('' if num_rows == 1 else 's')
+ + '...'
+ )
+ print(progress_msg, end=end)
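
The throttled progress reporting introduced in `_request_raw_data` can be sketched in isolation as below. This is a minimal illustration under stated assumptions, not the library's implementation: `format_progress_msg`, `consume_rows`, and the `every` and `out` parameters are hypothetical names introduced here so the behavior can be exercised without an XDMoD server.

```python
import io


def format_progress_msg(num_rows):
    """Build the 'Got N row(s)...' text, mirroring the helper in the diff."""
    return (
        'Got ' + str(num_rows) + ' row' + ('' if num_rows == 1 else 's')
        + '...'
    )


def consume_rows(rows, every=10000, out=None):
    """Collect rows, writing a progress message only every `every` rows.

    `every` and `out` are hypothetical parameters for this sketch; `out`
    defaults to standard output when left as None.
    """
    data = []
    for row in rows:
        data.append(row)
        # Throttle output so a notebook front end is not flooded with writes.
        if len(data) % every == 0:
            print(format_progress_msg(len(data)), end='\r', file=out)
    # Report the final count from the collected data, not a loop counter.
    print(format_progress_msg(len(data)), end='DONE\n', file=out)
    return data
```

Calling `consume_rows(range(25), every=10, out=io.StringIO())` writes two intermediate messages (at 10 and 20 rows) followed by the final `Got 25 rows...DONE` line. Taking the final count from `len(data)` keeps it equal to the number of rows actually collected, regardless of how any loop counter is initialized.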