This repository contains the data for the assignment in the Coursera "Getting and Cleaning Data" course.
The repository contains the following files:
- 'README.md': This file, which gives an overview of the contents of the repository.
- 'CodeBook.md': The codebook that describes the variables used in the 'tidy.txt' file.
- 'run_analysis.R': The R program that processes the raw data and produces the 'tidy.txt' file.
- 'tidy.txt': The result file written by the 'run_analysis.R' script.

The 'run_analysis.R' script performs the following steps:
- If the data Zip file has not already been downloaded, it is downloaded to 'dataset.zip'.
- If the 'dataset.zip' file has not already been unpacked, it is unpacked.
- All required files are loaded into R.
- The test data and training data are merged into combined dataframes.
- All features whose names contain mean() or std() are identified.
- The names of these features are tidied: all spaces, dashes, brackets and commas are removed.
- Only the measurements for mean and standard deviation are kept from the training data and test data.
- The activity codes are replaced with their readable descriptions.
- Feature names, activities, subject ids and measurements are merged into one dataframe.
- The dataframe is grouped by subject and activity.
- The mean of each measurement is calculated for each group from the previous step.
- The resulting dataframe is written to 'tidy.txt'.
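
The processing steps above can be sketched in base R roughly as follows. This is a minimal illustration on made-up toy data, not the actual contents of 'run_analysis.R': the helper `fetch_dataset`, its parameters `data_url` and `data_dir`, the sample feature names and all numeric values are assumptions chosen for the example, and the real script may use different functions or packages.

```r
# Hypothetical helper: 'data_url' and 'data_dir' are assumed parameters,
# since this README does not name the download URL or the unpacked folder.
# It is defined here for illustration only and is not called below.
fetch_dataset <- function(data_url, data_dir, zip_file = "dataset.zip") {
  if (!file.exists(zip_file)) {
    download.file(data_url, destfile = zip_file, mode = "wb")
  }
  if (!dir.exists(data_dir)) {
    unzip(zip_file)
  }
}

# Toy stand-ins for the loaded raw files (all values are made up).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")
activity_labels <- c("1" = "WALKING", "2" = "SITTING")
x_train <- matrix(c(0.2, 0.4, 1.0, 1.2, 9, 9), nrow = 2)  # filled column-wise
x_test  <- matrix(c(0.6, 0.8, 1.4, 1.6, 9, 9), nrow = 2)
subject  <- c(1, 1, 1, 2)  # training rows first, then test rows
activity <- c(1, 1, 1, 2)

# Merge training and test measurements into one matrix.
x_all <- rbind(x_train, x_test)

# Identify the mean() and std() features and tidy their names by
# removing dashes, brackets, commas and spaces.
keep <- grep("mean\\(\\)|std\\(\\)", features)
tidy_names <- gsub("[-(), ]", "", features[keep])

# Keep only the selected measurements.
measurements <- as.data.frame(x_all[, keep, drop = FALSE])
names(measurements) <- tidy_names

# Replace activity codes with readable labels and combine everything.
combined <- data.frame(
  subject = subject,
  activity = unname(activity_labels[as.character(activity)]),
  measurements
)

# Group by subject and activity and take the mean of each measurement.
tidy <- aggregate(. ~ subject + activity, data = combined, FUN = mean)

# Write the result; a temporary file is used here so that the sketch
# does not overwrite the real 'tidy.txt'.
out_file <- tempfile(fileext = ".txt")
write.table(tidy, file = out_file, row.names = FALSE)
```

In this sketch `aggregate` from base R does the grouping and averaging in one call; a `dplyr` pipeline with `group_by` and `summarise` would be an equally reasonable choice.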