This repository analyzes multivariate workload data from Google Cluster machines.

JavadDogani/Multivariate-Cloud-workload-analysis

This repository analyzes multivariate Google Cluster machine load data.

The dataset contains host load traces from a Google data center and is available at https://github.com/google/cluster-data. The trace covers the resource usage of more than 12,500 physical machines over 29 days, recording 672,074 jobs and over 26 million tasks. Measurements were taken every five minutes. The task usage table shows how much of its available resources each task consumed. Each record in this table has twenty fields, including the average and maximum CPU usage, the amount of memory assigned to the task, the task ID, the machine ID, the unmapped and total page cache usage, the maximum disk I/O time, and the disk usage.
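As a rough illustration of working with the twenty-field task usage records, the sketch below reads a headerless CSV into a pandas DataFrame and counts records per machine. The column names are shorthand chosen for this example, and their order is an assumption that should be checked against the schema document shipped with the trace. The two sample rows are made up and only stand in for a real trace file.

```python
import io

import pandas as pd

# Shorthand names for the 20 task usage fields.
# ASSUMPTION: names and order are illustrative; verify against the trace schema.
TASK_USAGE_COLUMNS = [
    "start_time", "end_time", "job_id", "task_index", "machine_id",
    "mean_cpu_usage", "canonical_memory_usage", "assigned_memory_usage",
    "unmapped_page_cache", "total_page_cache", "max_memory_usage",
    "mean_disk_io_time", "mean_local_disk_space", "max_cpu_usage",
    "max_disk_io_time", "cycles_per_instruction",
    "memory_accesses_per_instruction", "sample_portion",
    "aggregation_type", "sampled_cpu_usage",
]


def load_task_usage(source):
    """Read one headerless task usage CSV (a path or a file-like object)."""
    return pd.read_csv(source, header=None, names=TASK_USAGE_COLUMNS)


# Two made-up records (not real trace data) covering one five-minute window.
sample = io.StringIO(
    "600000000,900000000,1,0,101,0.03,0.05,0.06,0.001,0.002,"
    "0.055,0.0004,0.0001,0.08,0.003,1.9,0.004,0,1,0.03\n"
    "600000000,900000000,2,0,102,0.01,0.02,0.03,0.001,0.001,"
    "0.025,0.0002,0.0001,0.04,0.001,2.1,0.006,0,1,0.01\n"
)
usage = load_task_usage(sample)

# Number of usage records observed on each machine.
records_per_machine = usage.groupby("machine_id").size()
print(records_per_machine)
```

Passing a file path instead of the in-memory sample should work the same way, since `pd.read_csv` accepts both.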

The five machines that executed the most jobs were selected from this dataset. Each series was checked for stationarity with the Augmented Dickey-Fuller (ADF) test, and the relationships between variables were examined with multivariate Granger causality analysis. The study used three variables: average CPU usage, canonical memory usage, and maximum memory usage.
