This repository has been archived by the owner on Sep 11, 2020. It is now read-only.

Comments/Suggestions on gc source repo page #101

Open
obrien-j opened this issue Jan 4, 2019 · 23 comments

@obrien-j
Contributor

obrien-j commented Jan 4, 2019

Source document: GC Source Code Repo analysis

1: Suggest reframing the Options Analysis section (the primary question of internal vs. external, and the various options) to be in line with the technology architecture from Annex C of ref. 1, specifically Section 2.3.11[1,2,3] Use Cloud First, while also taking into consideration the potential need for multiple instances due to classification or network restrictions.

Order by (as an example):

  • Public Cloud
    • SaaS
    • PaaS
    • IaaS
  • Hybrid Cloud
  • Private Cloud
  • non-cloud onprem

Given that C.2.3.8.3 of ref. 1, Annex C, also directs that all code written by government must be released in an open format, I would suggest that the majority of code will fall into the 'unclassified/not protected' bucket and would fit quite easily into a public cloud, SaaS-based model (of which there are several good options).

Also noting that an exemption path is provided in C.2.3.9.5 of ref. 1, Annex C: "Share code publicly when appropriate, and when not, share within the Government of Canada". With 'when appropriate' as yet undefined, there will obviously be a need for a 'protected' option in the options analysis above. I would strongly recommend that departments or teams that feel they have source code, or potentially network-based restrictions on development and access to data, that fits into this bucket make their concerns known. Ensuring that the developer experience for people working in multiple networks/systems is seamless will go a long way towards minimizing friction.

refs: 1: Directive on Management of IT

@obrien-j obrien-j added the enhancement New feature or request label Jan 4, 2019
@gcharest
Member

gcharest commented Jan 7, 2019

Good idea, will use the ref arch standards as well as the business requirements checklist identified in the same folder of the repo.

Thanks @obrien-j !

@CalvinRodo

I figure I'll add some suggestions here rather than open a new issue.

I'd like to add to the Functional and Non-Functional Requirements the requirement that the solution provide an API for introspection as well as interaction.

For instance, GitHub, GitLab, and Bitbucket all provide this functionality to some extent.

This will allow us to build tools that can ensure compliance such as what CDS is doing with their Symmorfosi project.

It will also allow us more novel and potentially easier ways to administer the sites, for instance using tools like Terraform to handle administering and configuring the solution as code.

Examples of SaaS/Cloud solutions with APIs
https://developer.github.com/v3/
https://docs.gitlab.com/ee/api/
https://developer.atlassian.com/bitbucket/api/2/reference/
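
As a rough illustration of the kind of compliance tooling such an API makes possible, here is a minimal Python sketch. It assumes repository records shaped like those returned by GitHub's REST `GET /orgs/{org}/repos` endpoint (`name`, `private`, `license`, `description`). The two rules and the function names `fetch_org_repos` and `find_noncompliant` are hypothetical and are not taken from any existing GC or CDS tool.

```python
import json
import urllib.request


def fetch_org_repos(org, base_url="https://api.github.com"):
    """Fetch the first page of an organization's repos (no auth, no pagination)."""
    req = urllib.request.Request(
        f"{base_url}/orgs/{org}/repos",
        headers={"Accept": "application/vnd.github.v3+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def find_noncompliant(repos):
    """Return {repo name: [problems]} for public repos that break the example rules."""
    report = {}
    for repo in repos:
        if repo.get("private"):
            continue  # the example rules only cover public repos
        problems = []
        if not repo.get("license"):
            problems.append("no licence detected")
        if not repo.get("description"):
            problems.append("no description")
        if problems:
            report[repo["name"]] = problems
    return report


if __name__ == "__main__":
    # Hand-written sample records; a real run would pass fetch_org_repos("some-org").
    sample = [
        {"name": "good", "private": False, "license": {"spdx_id": "MIT"}, "description": "ok"},
        {"name": "bare", "private": False, "license": None, "description": ""},
    ]
    print(find_noncompliant(sample))
```

The same check could be pointed at GitLab or Bitbucket by swapping the fetch function, which is part of why an API requirement matters more than any one vendor's implementation.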

@britthurley

Just wanted to chime in to agree with @CalvinRodo on the importance of exposed APIs. The current ecosystem for developers across departments is a collection of on-prem, self-managed tools for different needs, and a barrier to adoption will be integration with the existing tools that people currently rely on.

As I am just jumping in, I am wondering if we're scoping this to enabling the use of CD functionality?
There's specific mention of CI, but making CD a focus would be a big enabler for a lot of existing dev teams. That means things like ensuring we can open flows to end-state SSC data centers, whether because the solution is hosted there or because SSC has gotten on board to support that. If it's out of scope, ignore me!

@gcharest
Member

gcharest commented Jan 8, 2019

OK, will set up a quick call around this topic either today or tomorrow.

@MikeNwin
Contributor

MikeNwin commented Jan 9, 2019

Suggest reframing the Options Analysis section
...
Order by (as an example):

  • Public Cloud
    • SaaS
    • PaaS
    • IaaS
  • Hybrid Cloud
  • Private Cloud
  • non-cloud onprem

A suggested format could be a table--e.g.:

Option SaaS PaaS IaaS Hybrid Cloud Private Cloud On-premises
A ✔️
B ✔️ ✔️
C ✔️ ✔️

Which would enable adding more criteria columns--e.g.:

Option SaaS PaaS IaaS Hybrid Cloud Private Cloud On-premises REST API GraphQL API Webhooks
A ✔️ ✔️ ✔️ ✔️
B ✔️ ✔️ ✔️ ✔️
C ✔️ ✔️
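
One way to keep such a matrix maintainable is to store it as data and render the table from it. The Python sketch below assumes that approach. The option names and the criteria assigned to each are placeholders for illustration only and do not reproduce the check marks in the tables above. `render_matrix` is a hypothetical helper.

```python
def render_matrix(options, criteria):
    """Render a plain-text table: one row per option, one column per criterion."""
    name_width = max([len("Option")] + [len(name) for name in options])
    lines = ["Option".ljust(name_width) + " | " + " | ".join(criteria)]
    for name, supported in options.items():
        # Mark a cell with "x" when the option supports that criterion.
        cells = [("x" if c in supported else "").center(len(c)) for c in criteria]
        lines.append(name.ljust(name_width) + " | " + " | ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    criteria = ["SaaS", "PaaS", "IaaS", "REST API", "Webhooks"]
    # Placeholder data: which criteria each option satisfies.
    options = {
        "A": {"SaaS", "REST API", "Webhooks"},
        "B": {"SaaS", "IaaS", "REST API"},
        "C": {"PaaS"},
    }
    print(render_matrix(options, criteria))
```

Adding a criterion then means adding one entry to `criteria` rather than re-aligning every row by hand.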

@handshape

The format proposed by @MikeNwin makes sense to me as a way of classifying at a glance. A way to flag which criteria are mandatory would be nice, too.
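
If the matrix were kept as data, flagging mandatory criteria could be as simple as a set comparison. The Python sketch below works under that assumption. The option names, the criteria, and the helper `meets_mandatory` are all hypothetical.

```python
def meets_mandatory(options, mandatory):
    """Split options into those covering every mandatory criterion and those missing some."""
    passing, failing = [], {}
    for name, supported in options.items():
        missing = sorted(set(mandatory) - set(supported))
        if missing:
            failing[name] = missing  # record exactly what is missing
        else:
            passing.append(name)
    return passing, failing


if __name__ == "__main__":
    # Placeholder data for illustration only.
    options = {
        "A": {"SaaS", "REST API", "Webhooks"},
        "B": {"SaaS", "REST API"},
        "C": {"PaaS"},
    }
    print(meets_mandatory(options, mandatory={"REST API", "Webhooks"}))
```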

The importance of a clean exit path can't be overstated in this decision, IMHO -- the source will almost certainly count as RBVs from the standpoint of the Directive on IM.
Another spot that might bear some policy wrangling will be disposition authority; commits are forever, but disposition authorities are not. It would probably make sense to first look at what's involved in getting a disposition authority with a very long lifespan before chasing a technical solution.
Looking forward to chatting this one over.

@gcharest
Member

gcharest commented Jan 9, 2019

Hi folks!

I think the breakdown in #102 of the scope can also be used with the proposed format for an initial discussion.

Before even getting to the technical options/reference architectures, we need to understand what the specific requirements are for each scoped use case.

@Zulban

Zulban commented Jan 9, 2019

Glad to join the chat! My thoughts:

As I am just jumping in, I am wondering if we're scoping this to enabling the use of CD functionality?

  1. CI is a first step to CD. CD should be the final goal, and enabled by the platform. I strongly recommend we take inspiration from this excellent paper on software development at Google, which was recently number one on Hacker News. Our platform solutions should enable at least half of that workflow (section 2).

  2. Search engines like Google should be able to index our repos (and READMEs). Currently, much of the science GitLab is not indexable. Which leads us to:

  3. Open repos by default. It is currently a burden getting my science GitLab repos to be truly public. This means that 98% of users will not bother (or even realize it). If a user is managing confidential information, it should be their responsibility to flag it as such. We cannot expect 99% of users to flag their repos as public - even if it's as easy as one button. Default behaviour rules all.

  4. If a developer accidentally commits and pushes confidential information to a public repo, they need some way to purge that commit. So the platform needs to allow destructive force pushes. CI can help with this: for example, we can recommend standard CI templates which scan for accidental commits of telephone numbers. You then get an email: "Hey, do you realize there is a phone number in your commit? Oops! Click this button to purge."

  5. One of our highest priorities should be platform-agnostic solutions. For every feature, we should identify if it works on Windows, Mac, and Linux. If it doesn't work identically on all three, we shouldn't even consider it a feature. This is complicated, as sometimes things look the same but on the backend they are implemented differently for different platforms. For example: "there is a Skype client for Linux... for now". If a major core library needs to be built from scratch to support all platforms, then that feature is fragile and will not support future platforms. This is a very easy-to-understand requirement which has a ton of good implications for software longevity and breaking out of vendor lock-in. Works on all platforms or it is not a feature.

  6. The platform must very clearly distinguish at least three classes of user:

  • Public servant.
  • Private industry.
  • Member of the public not from industry.

This role needs to be made clear for all comments, commits, threads, everything. Otherwise we're opening ourselves up to a ton of stealth influence from full time salespeople and communicators. I checked manually, for example, each user in this thread before writing here because I wasn't sure who was from where. I simply won't trust using a platform that does any less.

  7. Permanent links to repos - domain controlled by GoC. When we provide a link to a repo it should remain permanent for as long as the GoC wants. No one but the GoC should have control over changing our URLs, or breaking them, or adding some "/repo/" prefix to the URL when they make a new update to their platform.
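
The phone-number scan suggested in item 4 could look roughly like the Python sketch below. It is an illustration under several assumptions: the CI job feeds it unified diff text (for example, the output of `git diff`), the regular expression only targets North American formats, and the function name `scan_added_lines` is hypothetical. It only detects candidates. The purge itself (a history rewrite followed by a force push) is not shown.

```python
import re

# Loose pattern for North American numbers such as 613-555-0199 or (613) 555-0199.
PHONE_RE = re.compile(r"(?<!\d)(?:\(\d{3}\)\s*|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)")


def scan_added_lines(diff_text):
    """Return (line number within the diff, matched text) for added lines with a phone-like number."""
    hits = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect added lines; skip "+++ b/file" headers
        for match in PHONE_RE.finditer(line):
            hits.append((number, match.group()))
    return hits


if __name__ == "__main__":
    sample_diff = "\n".join([
        "+++ b/contacts.txt",
        "+Call me at 613-555-0199",
        "-old line (613) 555-0100",
        "+version = 1.2.3",
    ])
    for number, text in scan_added_lines(sample_diff):
        print(f"diff line {number}: possible phone number {text}")
```

A pattern this loose will produce false positives and miss other formats, so it is better suited to prompting a human review than to triggering an automatic purge.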

@schindld

schindld commented Jan 9, 2019

For the Functional/Non-Functional Requirements section, the solution(s) would need connectivity to (or built-in) container registries to be useful for many of us. SSH access from off the GoC network would also be nice for protected/non-classified repos.

@CalvinRodo

So I'm reading some of the comments. @gcharest, is the goal to have a single solution that provides both inner-sourcing and open source hosting?

Is there even anything that would currently allow that without it becoming a nightmare to administer? My view would be two separate solutions: something similar to GCCode for inner sourcing and a separate solution for open source projects. For the latter, I was under the impression we were just going to use existing platforms, places that already have large open source communities. Is that no longer correct, and are we looking at a centralized location for all GoC open source projects?

As for @Zulban's suggestion of a way to purge commits, I do agree with that, and in fact that's something we can do with git, so 👍 for that. But if, as @handshape stated, "the source will almost certainly count as RBVs from the standpoint of the Directive on IM", allowing users to purge commits from the repository could cause some concerns if there is no backed-up copy of the source code that records those purged commits. I'm definitely not an expert on Records of Business Value, though, so maybe it's not considered IRBV and so not an issue. We may need to get some clarification on that point from someone more knowledgeable in that domain.

@gcharest
Member

gcharest commented Jan 9, 2019

@CalvinRodo the scope is part of the discussion for the reference architecture, in my humble opinion. The initial ask was for a recommendation for a GC-wide VCS. It was followed by a question around whether this should be for both internal and external repos.

I think part of the exercise needs to include this exact discussion. I've thus broken down the high-level requirements into a more distinct series of requirements to support the varying legal and policy constraints.

Really happy with the feedback by the way. Lots of very interesting points.

@MikeNwin
Contributor

MikeNwin commented Jan 9, 2019

So I'm reading some of the comments @gcharest is the goal to have a single solution that provides both inner-sourcing and open source hosting?

Is there even anything that would currently allow that without it becoming a nightmare to administer

GitHub Connect helps to simplify this by providing a unified platform to enable inner-source on GitHub Enterprise and open source on www.GitHub.com.

@Zulban

Zulban commented Feb 4, 2019

A unified solution is an absolute requirement in my opinion. Choosing whether a project is private/public should be a choice on a platform, not a choice of a platform. I could go on a spiel explaining why, but I think I'll just say that if we split the platform, the default will drift towards never using the public one (or using it only as a repo file dump and not for active work).

@Zulban

Zulban commented Feb 4, 2019

Something else worth mentioning: at the Canadian Meteorological Centre we have unfortunately deployed our own GitLab instance (in addition to existing ones like SSC). The justification for this is so that it has 24/7 support, as many of our operations require 24/7 support and monitoring. We therefore have a duplicate platform with fewer resources supporting it (evidenced by its version being out of date).

I am still investigating, but I can't imagine how our platform with fewer man-hours supporting it could possibly be more stable and "24/7" than the SSC one. Both have had outages.

So @gcharest it may be a good requirement for this platform to say it has 24/7 support. That would help me eventually convince our groups to migrate to it instead of fracturing. Or define some number of hours/CS folks dedicated to supporting it.

@gcharest
Member

gcharest commented Feb 4, 2019

@Zulban Yes, I've quickly added such a requirement re: 24/7 support.

Regarding the choice of platform, I need to take more time (I don't have enough...) to articulate the document, but there's most likely a Venn diagram of needs here, depending on the level of classification.

I'm trying to present this as a clear(er) picture of the needs so that we can make an informed recommendation in terms of options analysis. I do believe that a single platform is most likely the most logical step forward. However, there's a (lack of) governance challenge: we are all different organizations, and enforcing a single platform is not done lightly.

It may however be our recommendation if all the requirements point to it.

@Zulban

Zulban commented Feb 4, 2019

I understand if it's tricky. I'm just glad to give my input. Great job so far. Do what you can 😄

@harsh

harsh commented Feb 10, 2019

Is there even anything that would currently allow that without it becoming a nightmare to administer, my view would be two separate solutions something similar to GCCode for inner sourcing and a separate solution for Open source projects.

@CalvinRodo In addition to what @MikeNwin outlined here, as of January 8th, GitHub provides a "Unified SKU" so large organizations like the GC can have a private instance of GitHub Enterprise on premises or in a virtual private cloud (open to every employee) AND have multiple GitHub Enterprise SaaS organizations, but only pay for unique users. A lot of organizations already have plans on GitHub SaaS, so this would allow the GC to quickly connect the SaaS orgs with their on-premises server. Not only is this a huge cost saving, but it also allows the GC to decide where to InnerSource, since there are unlimited public and private repositories in both cases. More sensitive code can stay on the on-premises Enterprise instance, and teams can then use GitHub Connect to get access to an org on github.com and decide what to open source.

Humbly speaking, GitHub is the only vendor that provides a single platform for both InnerSource and seamless access to Open Source as well as the Open Source community. So yes @Zulban we can satisfy your requirement above AND GitHub offers 24/7 support.

@gcharest Happy to clarify any of the information above 😄

@Zulban

Zulban commented Mar 5, 2019

I wish public servants could discuss the government of Canada requirements without hearing advertisements from Microsoft.

@britthurley

My personal hope is that this doesn't result in a single product, but a service offering that can satisfy each use case.

Focusing on a single platform is a major risk to adoption, especially when a VCS platform must be interoperable with tools that support additional capabilities outside of just inner and open sourcing code.

@gcharest
Member

gcharest commented Mar 5, 2019

@Zulban to be fair, we have this discussion in the open and I'd welcome any and all service providers to share their ideas.

I do not know everything and I know some of the very problems we're trying to solve have been addressed already.

@gcharest
Member

gcharest commented Mar 5, 2019

@britthurley I would totally support that. I have been delayed on working on this file but getting back to it now.

I'll put some new ideas for discussion shortly.

@Zulban

Zulban commented Apr 2, 2019

@Zulban to be fair, we have this discussion in the open and I'd welcome any and all service providers to share their ideas.

I do not know everything and I know some of the very problems we're trying to solve have been addressed already.

What other service providers are part of this conversation?

@gcharest
Member

gcharest commented Apr 2, 2019 via email


9 participants