- Added support for Kubernetes 1.17
- Updated defaults, including the default cluster version, now 1.15.
- Upgraded CentOS7 nodes to 1908.
- Verified that DC/OS installation is still working fine, even with the switch to 2.0. Added kernel modules for raid1 and dm_raid in support of new DC/OS Storage Service.
I can't deprecate DC/OS when it just keeps working. Mesosphere sets the bar.
- Added Ubuntu 18.04 LTS variant of `kubeadm` k8s as `ubuntu-k8s`.
- Added the `cluster-update` command. Although still experimental, this command updates Kubernetes clusters in place (minor and major release upgrades and underlying OS security patching). See the README for more details.
- Verified HA Kubernetes 1.13 through 1.16 for:
  - `centos-k8s`
  - `fedora-k8s`
  - `ubuntu-k8s`
- Cleaned up Ansible output with `ANSIBLE_DISPLAY_SKIPPED_HOSTS=True`.
- Updated to Fedora 30.
- Added secrets encryption to all Kubernetes clusters.
- Added Kubeflow setup scripts in xtras.
- Validated Calico CNI as a working option to Canal (the default).
- Aligned NGINX and Canal component versions across the 1.12 through 1.16 cluster configurations.
**Experimental**
- Added initial scaffold for CentOS 8, but thus far no luck installing `kubeadm` k8s; still hung up on the CNI.
- Added initial scaffold for the CRI-O runtime as an alternative to containerd, but with no stable version yet.
- Added initial support for Kubernetes 1.16.
- Upgraded NGINX to 0.25.
- Introduced Kubernetes Dashboard 2.0 and Metrics Server into the 1.15 and 1.16 cluster deployments.
- Re-introduced the Kubernetes 1.12.x series, as this is the current stable GKE version (who knew Google's GCP would trail their own releases by so many versions?).
- Removed Docker Swarm provisioning, a deprecated, never-was orchestration system.
- Standardized toolset on Python3 and pip3.
- Removed Python2 dependency on CentOS7 nodes.

In preparation for the deprecation of Python2, please rebuild your CentOS ova templates and ensure that `ansible_python_interpreter=/usr/bin/python3` is set in all your `centos-k8s` hosts files going forward.
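For reference, the interpreter setting might sit in a hosts file like this (a sketch only — the group and node names are hypothetical, not repo defaults):

```ini
# Hypothetical centos-k8s inventory fragment.
[all:vars]
ansible_python_interpreter=/usr/bin/python3

[k8s_masters]
centos-k8s-m1
```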
- Renamed all `fusion_` parameters to `desktop_` parameters, as they apply to both VMware Fusion and VMware Workstation desktop products.

Make sure to update all locally deployed cluster `hosts` files accordingly, as the following parameters have changed for desktop deployment:
**Old**

```ini
vmware_target=fusion
fusion_vm_folder="../virtuals"
fusion_net="vmnet2"       # this should be vmnet8 for Windows and Linux
fusion_net_type="custom"  # this should be nat for Windows and Linux
```

**New**

```ini
vmware_target=desktop
desktop_vm_folder="../virtuals"
desktop_net="vmnet2"       # this should be vmnet8 for Windows and Linux
desktop_net_type="custom"  # this should be nat for Windows and Linux
```
- Created configuration template and verified CentOS/Fedora Kubernetes 1.15.
- Updated Canal to version 3.8.
- Updated to `docker-ce` stable on Fedora 29.
- Removed Python2 dependency on Fedora 29 nodes.

In preparation for the deprecation of Python2, please rebuild your Fedora nodes and ensure that `ansible_python_interpreter=/usr/bin/python3` is set in all your `fedora-k8s` hosts files going forward. In the future this will be the default.
- Removed deprecations and tested with Packer 1.4.5.
- New xtras:
  - Working EFK stack in xtras/k8s/elastic will automatically aggregate new container logs, along with base node metrics, using the targetd storage class.
  - Basic Kafka manifests and a port of the Lenses dev box in xtras/k8s/kafka that also use the targetd storage class.
  - Tiller RBAC patch to enable Helm installation, which in turn enables KEDA installation.
- Updated and verified CentOS Kubernetes 1.14.
- Tested and validated multi-master HA configuration for 1.13/1.14 on `centos-k8s` and `fedora-k8s` deployments. HA deployments are now fully supported by cluster-builder.
- Removed support for 1.12 as this is now approaching EOL.
- Removed calico-policy until a working version can be completed.
- Moved custom Istio and Knative deployments into standalone ansible playbooks to reduce the complexity and dependency matrix in cluster deployment. Knative now installs the required Istio dependency.
- Fedora 29 Kubernetes 1.13 and 1.14 are stable on the 5.x kernel series.
- CentOS 7.6 DC/OS 1.12 and Docker Swarm CE 17.12.1 updated with fixes for deprecated ansible modules.
- Added durability and retry logic to mitigate `vmrun` instability issues on macOS with Fusion 11+.
- Added kube-config and web-ui-token instructions to the cluster-deploy output for Kubernetes deployments.
Please rebuild all OVA template images.
- Added durability and retry logic to many of the ansible deployment tasks, and replaced several hard VM restarts with soft.
- Pruned OVA templates, with numerous fixes, reinstating firmware for increased disk stability in VMware.
- Script cleanup, numerous fixes in stability of ESXi deployment.
- Working fedora-k8s 1.13.4 cluster with Knative 0.3.0. Surprisingly stable for Fedora, and the 4.20 kernel is a bit less dated.
- Added `protected=true|false` option to the cluster hosts file, which guards against `cluster-control destroy` or `cluster-deploy` re-deployment to prevent accidental destruction of a deployed cluster.
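As a sketch, the guard might appear in a hosts file alongside the other cluster-wide settings:

```ini
# Illustrative fragment: with protected=true, cluster-control destroy and
# cluster-deploy refuse to act on this cluster until it is flipped back to false.
protected=true
```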
- Streamlined the ESXi deployment scripts by reducing the arbitrary wait times and improving unattended execution performance.
- Fixed latest stable defaults for CentOS 7 `kubeadm` Kubernetes:
  - CentOS 7.6 1810 w/ updates
  - Kubernetes - v1.13.4
  - Canal CNI w/ Network Policy - v3.5
  - MetalLB Native Load Balancer - v0.7.3
  - NGINX Ingress Controller - v0.21
  - Knative Serverless Platform - v0.3.0
- Updated examples and readme to reflect new stable formula.
- Added experimental Kafka manifests in `xtras`.
- Set base `node-packer` CentOS ova to use the base distro docker version by default, as this is preferred by k8s and DC/OS.
- Removed last remnants of Tectonic CoreOS and updated documentation accordingly.
- Added CentOS kernel parameter tuning for network performance on nodes.
- Initial implementation of cluster deployment to local machines on Windows and Linux VMware Workstation hosts. Windows can now deploy to both vSphere/ESXi and local VMware Workstation environments. Windows is now a first class cluster builder!
- Verified and tested DC/OS 1.11 local deployment on Windows and macOS. Linux mileage may vary for some reason still under investigation.
- Verified and tested Docker CE Swarm on CentOS deployment locally on Windows, Linux and macOS.
- Verified and tested CentOS 7.5 Vanilla Kubernetes 1.12 deployment locally on Windows, Linux and macOS. A solid, up-to-date reference Kubernetes. Early support for 1.13.
- Updated all CentOS 7 clusters from 7.5 (1804) to 7.6 (1810). Rebuild ova template images to upgrade.
- Integrated MetalLB into the Vanilla Kubernetes configuration.
- Added support for Calico CNI Plugin w/ Istio and Network Policy for both CentOS 7.5 and Fedora 29. `k8s_network_cni` allows selection of canal (default), calico and calico-policy (which includes Istio and Network Policy). Load tested and validated Network Policy functionality on `centos-k8s` with calico-policy. However, testing of canal and metal-lb against Calico shows a performance delta large enough to make the overhead of Istio a consideration. Likely not worth it unless you plan on using Istio, and even then...
- Changed the default ingress controller on `kubeadm` clusters to `nginx`, and added a setting for selecting it: `k8s_ingress_controller` is one of nginx (default), traefik-nodeport, traefik-daemonset and traefik-deployment.
- Completed and tested an initial Packer build of a candidate Windows 2016 node in preparation for Windows 2019 Kubernetes networking support.
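The two selection settings above might land in a hosts file like this (a sketch under the assumption that the variable is spelled `k8s_network_cni`; only the documented option values are shown):

```ini
# CNI selection: canal (default), calico, or calico-policy.
k8s_network_cni=calico-policy

# Ingress controller: nginx (default), traefik-nodeport,
# traefik-daemonset, or traefik-deployment.
k8s_ingress_controller=nginx
```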
- Bugs in Fusion vmrun appear to be fixed in the 11.02 release, and Fusion 11.02 has been tested with Packer 1.3.2, with applicable fixes applied.
- Removed the advanced swarm deployment model with separate control and data plane configuration, as this approach will not be carried forward.
- Removed PhotonOS related artifacts as PhotonOS is no longer relevant due to VMware's misstep with PKS. Proof that better technology does not always trump legacy enterprise software politics.
- Removed UCP/Docker EE related artifacts as Docker EE is no longer supported, and not likely to have a future.
- Cleaned up the code base and pruned out artifacts no longer required or relevant.
With support now on Windows and Linux workstations, ESXi deployment should work as expected on all platforms; however, the cluster-builder-control station is still recommended. Also known conceptually as a bastion server or jump box, it is no longer limited to the CentOS version supplied and can, in fact, be a Windows based jump box.
- Updated `kubeadm` custom built Kubernetes variants for CentOS 7.5 and Fedora 29 for Kubernetes 1.12, and updated the configuration format and Canal networking manifests. Also removed the explicit `etcd` install in the scripts, as `kubeadm` now handles creation of the KV store.
- Fixed `kubeadm` Kubernetes deployments to `1.12.1`.
- Added `coreDNS` patch to increase available memory and correct the CrashLoopBackoff issue mentioned here.
- Verified and tested DC/OS.
- Verified and tested Docker Swarm on CentOS.
- PhotonOS is now deprecated and no longer supported.
An OVA rebuild is recommended with this release due to performance optimizations introduced in the Packer phase kickstart files.
- Included custom built Kubernetes variants for CentOS 7.5 and Fedora 28, using `kubeadm`, and implementing Canal CNI for network policy and the iSCSI provisioner for persistent storage.
- Updated CentOS7 to 7.5 (1804).
- Added `coreos-ansible.yml` wrapper script for bootstrapping CoreOS with `PyPy` to enable Ansible management of CoreOS nodes.
- Initial implementation of Ansible based CoreOS iSCSI configuration playbook.
- Initial implementation of Targetd Storage Appliance VM with dynamic iSCSI provisioning and an integrated iscsi-provisioner Tectonic CoreOS configuration.
- Fixes in support of Ansible 2.5.
- Implemented recommendations for `elevator=deadline` in all CentOS7/RHEL and CoreOS VMs, as per the Red Hat documentation: via kickstart files for RHEL based VMs, and via the `coreos-init.yml` ansible playbook for optimizing CoreOS post PXE deployment, after the `coreos-ansible.yml` install playbook.
- Tested with VMware Fusion Pro 10 for Mac.
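For illustration, the `elevator=deadline` recommendation amounts to a kernel command-line entry. In a `/etc/default/grub` fragment it might look like this (the other flags shown are placeholders, not the repo's actual kickstart content):

```shell
# Fragment of /etc/default/grub - appending elevator=deadline selects the
# deadline I/O scheduler at boot, per the Red Hat guidance for virtualized disks.
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet elevator=deadline"
```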
- Tested with VMware Workstation Pro 14 for Linux on the cluster control station appliance.
- Updated and validated Tectonic CoreOS install for 1.9.6-tectonic.1.
- Validated installation of DC/OS 1.11, still no changes required to cluster-builder after 4 version upgrades.
- Fully automated the CoreOS PXE deployment process, including the `coreos-init.yml` and iscsi-provisioner deployment.
- Included static ip assignment in the post deployment `coreos-init.yml` configuration to remove the dependency on the coreos provisioner for runtime operation of the clusters. It is now only needed temporarily during new cluster deployment.
- The `coreos-init.yml` script has been tuned so it can be properly re-applied, in support of node additions, recovery, etc.
- Fixed a bug in the grep logic for VM ID derivation that caused conflicts with subset names.
- Included the ability to trigger coreos provisioner Matchbox CoreOS version image downloads by specifying `coreos_linux_version` and `coreos_linux_channel` in the Ansible hosts file.
- Integrated the MariaDB Galera/Drupal7 Load Testing Stack into the core cluster-builder codebase to enable regular load testing and performance validation of deployed clusters.
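Pinning a CoreOS image in the hosts file might then look like this (version and channel values are illustrative, not defaults):

```ini
# Matchbox downloads the matching CoreOS image for PXE deployment.
coreos_linux_version=1967.3.0
coreos_linux_channel=stable
```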
- Fixed default volume layout for CentOS/RHEL. Customized allocation so that /var now has a dedicated 180GB volume, / is allocated 40GB, /boot 1GB and /tmp 10GB. Note that nodes remain thinly provisioned.
- Increased the system wide file descriptor limit via `fs.file-max = 100000` and adjusted /etc/security/limits.conf to raise service level limits to 65536.
- Ensured that /etc/hosts on the nodes can be correctly populated with FQDN host names.
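Concretely, the two limit changes might be applied as fragments like these (file placement and the wildcard scope are assumptions for illustration):

```ini
# Fragment for /etc/sysctl.conf (or a /etc/sysctl.d drop-in):
fs.file-max = 100000

# Fragment for /etc/security/limits.conf - raise per-service open-file limits:
* soft nofile 65536
* hard nofile 65536
```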
- Added wrappers for optional DNS entries in nic configuration.
- Updated and validated Tectonic CoreOS install for 1.8.9-tectonic.1.
- More README and documentation cleanup.
- Added additional documentation for iSCSI configuration for Tectonic CoreOS.
- Cleaned up READMEs and documentation.
- Added cluster-update functionality for applying rolling updates to CentOS7 and RHEL Docker Swarm cluster nodes (in support of regular patching).
- Pinned Docker engine version on CentOS7 and RHEL CE to Docker 17.09.1-ce.
- Added support for Tectonic CoreOS with documentation.
Tectonic is a late addition but moving into the top spot quickly - it has the polish and stability of DC/OS, with the momentum of Kubernetes. If you have a Docker Swarm headache (as I do), this is the cure.
Note that this release requires all Packer output OVA VM images to be rebuilt due to significant changes in the ova profile.
- Updated CentOS to Docker 17.12.0-ce, with explicit versioning.
- Already downgraded back to Docker 17.09.1-ce, as this issue appeared almost immediately in the swarm: moby/libnetwork#2045. Likely a better policy to skip Docker's .0 releases.
- Migrated to a Packer centric node provisioning approach moving most of the ansible logic from post OVA provisioning into the packer OVA creation process. This includes the monitoring agents and underlying plugin dependencies. This results in a ready-to-go node OVA that can be deployed into service simply by:
- Assigning a static IP
- Assigning a hostname (and DNS entries)
- Joining a Swarm
- The Packer centric approach favors a model where OVAs are versioned along with deployments for reliable recovery (TODO: future releases should build this into the model explicitly). It should also enable the creation of simple "add node to cluster" scripts for easily expanding existing cluster deployments.
- Updated the CentOS variant to use Overlay2 (supported on 1708/7.4). This simplifies the CentOS/RHEL images, eliminating the need for a dedicated 2nd VMDK and the rather cumbersome direct-lvm approach.
- Added `ovftool_parallel=true` option to speed up ESXi ovftool deployments.
- Added `docker_swarm_mgmt_cn` and `docker_swarm_mgmt_gw` to allow external DNS names to be used in the advanced deployment model (where access is through a load balancer external to the cluster).
- Added the cluster-passwd high level script for managing cluster admin/root user passwords, and integrated it into the deployment process.
- Cleaned up the CE based Swarm Secure API implementation and migrated dockerd settings to standardize on daemon.json configuration.
- Changed the designation of the target node for prometheus from an ansible group definition of `[docker_prometheus_server]` to a simple variable of `docker_prometheus_server=servername`.
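In hosts-file terms the change looks like this (the node name is hypothetical):

```ini
# Before: an Ansible group definition
# [docker_prometheus_server]
# swarm-01

# After: a simple variable
docker_prometheus_server=swarm-01
```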
- Underlying metric support from dockerd, cAdvisor and Node Exporter has been built into the OVA and is no longer optional (`docker_enable_metrics` is now deprecated; any need to upgrade or adjust these components can be considered a post-deployment activity, and moving the entire monitoring system to an independent stack deployment is under consideration).
- Added support for separate Control and Data interface swarm deployments where each node has 2 NICS on separate subnets, partitioned accordingly.
- Enabled ELK Logging via logstash and gelf. When the `docker_elk_target` variable is set, logstash containers are distributed via service `mode=global`, and the Docker daemon.json is configured for gelf logging to the local logstash instance.
- Added a "dns" entry to the Docker daemon.json based on `data_network_dns` and `network_dns` respectively. The first two DNS entries will be entered into the Docker daemon.json to support DNS name resolution in containers. If a custom dns entry is desired for the Docker daemon.json, the `docker_daemon_dns_override` variable can be used.
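A resulting daemon.json might carry the entry as follows (the addresses are placeholders; with `docker_daemon_dns_override` set, its value would appear here instead):

```json
{
  "dns": ["192.168.100.1", "192.168.100.2"]
}
```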
- Cleaned up the base CentOS OVA and moved DC/OS specific items into post-ova ansible deployment
- FIXED: issue with cAdvisor not being accessible to prometheus when prometheus is running on the node. All targets now show as UP immediately after deployment.
- Atomic Swarm: deprecated. What is the point? I fought with rpm-ostree enough just trying to get it to stay on their antiquated idea of docker-latest as 1.13, and it kept reverting back to 1.12... and I think I've had enough. Docker 1.13 in 2018 is of no use to anyone.