Features/mongo (#33)
* add mongo module

* mongo serializer

* add mongodb metadata

* mongo deserialize + test

* add attribute filter sub attribute function

* transform filter + tests

* mongo persistence

* index capability of mongo database

* refactor argument and context

* modify README and mongo/README

* core readme
imulab authored Dec 22, 2019

1 parent f4cd5f0 commit bed1efa
Showing 77 changed files with 8,496 additions and 310 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -2,7 +2,7 @@

> GoSCIM aims to be a fully featured implementation of the [SCIM v2](http://www.simplecloud.info/) specification. It provides basic building blocks for SCIM functions and a functional out-of-the-box server. It is also designed with extensibility in mind to make customizations easy.
**Caution** This is the early stage of `v2.0.0` version of go-scim. We are now at `v2.0.0-m3` ([release notes](https://github.com/imulab/go-scim/releases/tag/v2.0.0-m3)). This second major release will introduce drastic changes to the way resources are handled in the system.
**Caution** This is the early stage of `v2.0.0` version of go-scim. We are now at `v2.0.0-m4` ([release notes](https://github.com/imulab/go-scim/releases/tag/v2.0.0-m4)). This second major release will introduce drastic changes to the way resources are handled in the system.

For the current stable version, check out tag `v1.0.1`, or go [here](https://github.com/imulab/go-scim/tree/v1.0.1).

@@ -21,7 +21,7 @@ For the currently stable version, checkout tag `v1.0.1`, or go to [here](https:/
The project is in the early stage of `v2.0.0`. For now, to try out the functionality covered by the tests:

```
# cd into one of core, protocol, server
# cd into one of core, protocol, mongo, server
$ go test ./...
```

@@ -47,9 +47,9 @@ The project will continue to use a single tag until the official release of `v2.
While the fundamentals of the functions were delivered in `v2.0.0-m1`, we are still hard at work to deliver the rest. In the coming weeks and months, the remaining functions towards `v2.0.0` will be released.
In addition to the scheduled functions, tests and documentation will also be added.

- `v2.0.0-m4` to (re-)introduce mongo db persistence, and integration test on the server
- `v2.0.0-m5` to tackle resource root query and bulk operations.
- `v2.0.0-rc1` to complete tests and documentations
- `v2.0.0-rc1` to complete tests
- `v2.0.0-rc2` to complete documentations

After the release of `v2.0.0`, more features are planned. The list includes:
- [SCIM Password Management Extension](https://tools.ietf.org/id/draft-hunt-scim-password-mgmt-00.txt)
116 changes: 116 additions & 0 deletions core/README.md
@@ -0,0 +1,116 @@
# Core Module

The core module provides features described in [RFC 7643 - SCIM: Core Schema](https://tools.ietf.org/html/rfc7643), as well as the foundation for features described in [RFC 7644 - SCIM: Protocol](https://tools.ietf.org/html/rfc7644).

## Attributes

At the core of the SCIM specification is the attribute, which describes data type and constraints. The core module uses an extended version of the attributes from the specification. In addition to defined properties such as `type`, `required`, `multiValued`, etc., we added four more internal properties:

- `id`: The unique id of an attribute. This id can be used to identify the attribute and correlate other metadata to it, and thus serves as the major extension point.
- `_path`: The complete path from the resource's root. The path reduces processing complexity because the traversal mechanism does not have to remember its trail in order to report the full path of an attribute.
- `_index`: The index specifies a relative ascending order for an attribute among its sibling attributes. Although SCIM does not specify attribute order and it is okay to scramble them, it is much nicer to be able to return attributes in a deterministic order.
- `_annotations`: The string-based annotations applied to the attribute. Other functions may detect annotations on an attribute and carry out extra logic. This is another major extension point. The module comes with several built-in annotations in `annotations/annotations.go`.

The added internal properties are never leaked to the API. This is ensured by the `json.Marshaler` implementation.
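
As an illustration of how a `json.Marshaler` implementation can keep internal properties out of API output, here is a minimal sketch. The `attribute` type, its fields, and the `render` helper are hypothetical stand-ins and do not reproduce the actual `spec.Attribute`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// attribute is a hypothetical, trimmed-down stand-in for spec.Attribute.
type attribute struct {
	name        string
	typ         string
	multiValued bool

	// Internal properties that must never reach the API.
	id          string
	path        string
	index       int
	annotations []string
}

// MarshalJSON emits only the properties defined by the SCIM specification.
func (a *attribute) MarshalJSON() ([]byte, error) {
	return json.Marshal(struct {
		Name        string `json:"name"`
		Type        string `json:"type"`
		MultiValued bool   `json:"multiValued"`
	}{Name: a.name, Type: a.typ, MultiValued: a.multiValued})
}

// render returns the JSON form of the attribute, or an empty string on error.
func render(a *attribute) string {
	raw, err := json.Marshal(a)
	if err != nil {
		return ""
	}
	return string(raw)
}

func main() {
	attr := &attribute{
		name:        "emails",
		typ:         "complex",
		multiValued: true,
		id:          "example-id", // placeholder value
		path:        "emails",
		index:       10,
		annotations: []string{"@ExclusivePrimary"},
	}
	fmt.Println(render(attr))
}
```

Because the internal fields are unexported and the marshaler builds its output from an explicit allow-list, adding another internal property later cannot leak it by accident.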

## Schema and ResourceType

The core module lets users define schemas and then assemble them into resource types. The schema definition completely follows the specification, with the addition of the internal properties described above in its attributes. It is important to cache schemas in the `SchemaHub` after parsing them.

```go
schema := new(spec.Schema)
if err := json.Unmarshal(rawJsonBytes, schema); err != nil {
	return err
}
spec.SchemaHub.Put(schema)
```

After all schemas are parsed, we can then parse the resource types, which reference schemas and schema extensions by id.

```go
resourceType := new(spec.ResourceType)
if err := json.Unmarshal(rawJsonBytes, resourceType); err != nil {
	return err
}
```
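
The ordering requirement (schemas first, resource types second) can be sketched as follows. This is only an illustration under assumptions: the `hub`, `schema`, and `resourceType` types and the `Get` and `resolve` functions are hypothetical; only a `Put` call on `spec.SchemaHub` appears in the snippets above.

```go
package main

import (
	"fmt"
	"sync"
)

// schema and resourceType are hypothetical stand-ins for spec.Schema
// and spec.ResourceType.
type schema struct {
	ID string
}

type resourceType struct {
	Name       string
	SchemaID   string
	Extensions []string
}

// hub is a hypothetical registry playing the role of spec.SchemaHub.
type hub struct {
	mu      sync.RWMutex
	schemas map[string]*schema
}

func newHub() *hub { return &hub{schemas: make(map[string]*schema)} }

// Put caches a parsed schema under its id.
func (h *hub) Put(s *schema) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.schemas[s.ID] = s
}

// Get returns the cached schema for id, if any.
func (h *hub) Get(id string) (*schema, bool) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	s, ok := h.schemas[id]
	return s, ok
}

// resolve looks up every schema id referenced by the resource type and
// fails if one of them has not been cached yet.
func resolve(h *hub, rt *resourceType) ([]*schema, error) {
	ids := append([]string{rt.SchemaID}, rt.Extensions...)
	resolved := make([]*schema, 0, len(ids))
	for _, id := range ids {
		s, ok := h.Get(id)
		if !ok {
			return nil, fmt.Errorf("schema %q is not registered", id)
		}
		resolved = append(resolved, s)
	}
	return resolved, nil
}

func main() {
	h := newHub()
	user := &resourceType{Name: "User", SchemaID: "urn:ietf:params:scim:schemas:core:2.0:User"}

	_, err := resolve(h, user)
	fmt.Println("before Put:", err)

	h.Put(&schema{ID: "urn:ietf:params:scim:schemas:core:2.0:User"})
	schemas, err := resolve(h, user)
	fmt.Println("after Put:", len(schemas), err)
}
```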

Check out the `internal` folder for schema and resource type definitions in test cases.

## Property and Container

`Property` references an attribute that describes its data requirement and holds a piece of resource data that conforms to that attribute. `Property` is the reason why the project can traverse dynamic data without resorting to reflection.

All SCIM attribute types have their corresponding property:

- `stringProperty`: holds and represents data of Go's `string` type, or `nil` if unassigned.
- `integerProperty`: holds and represents data of Go's `int64` type, or `nil` if unassigned.
- `decimalProperty`: holds and represents data of Go's `float64` type, or `nil` if unassigned.
- `booleanProperty`: holds and represents data of Go's `bool` type, or `nil` if unassigned.
- `referenceProperty`: holds and represents data of Go's `string` type, or `nil` if unassigned.
- `dateTimeProperty`: holds data of Go's `time.Time` type, represents data of Go's `string` type in ISO8601 format, or `nil` if unassigned.
- `binaryProperty`: holds data of Go's `[]byte` type, represents data of Go's `string` type in base64 encoded format, or `nil` if unassigned.
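
The distinction between the value a property holds and the value it represents can be sketched for the dateTime case. This is a hedged illustration: the `property` interface and the method signatures are assumptions (the method names `Raw` and `Replace` are borrowed from calls visible elsewhere in this commit), and `time.RFC3339` is assumed as the ISO8601 layout.

```go
package main

import (
	"fmt"
	"time"
)

// property is a hypothetical, minimal view of what a Property offers.
type property interface {
	Raw() interface{}
	Replace(value interface{}) error
}

// dateTimeProperty holds a time.Time internally but represents it as a string.
type dateTimeProperty struct {
	value *time.Time // nil means unassigned
}

// Raw returns the string representation, or nil if unassigned.
func (p *dateTimeProperty) Raw() interface{} {
	if p.value == nil {
		return nil
	}
	return p.value.Format(time.RFC3339)
}

// Replace accepts the string representation and stores the parsed time.
func (p *dateTimeProperty) Replace(value interface{}) error {
	s, ok := value.(string)
	if !ok {
		return fmt.Errorf("expects string value, got %T", value)
	}
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return err
	}
	p.value = &t
	return nil
}

func main() {
	var p property = &dateTimeProperty{}
	fmt.Println(p.Raw() == nil) // unassigned

	if err := p.Replace("2019-12-22T10:00:00Z"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.Raw())
}
```

A type switch on the incoming value, as in `Replace`, is what lets this style of property validate dynamic data without reflection.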

Just as `Property` holds data, `Container` holds `Property`. Two containers are present:

- `complexProperty`: holds a list of sub properties, corresponds to SCIM's `complex` type
- `multiValuedProperty`: holds a list of member properties, corresponds to SCIM attribute where `multiValued=true`.

The `multiValuedProperty` is special because this type is not explicitly modelled in the SCIM specification. Modelling it makes traversing the resource structure much easier, although it adds one complication: the attribute of its member properties must be derived. It is derived by setting `multiValued=false` and appending `$elem` to the id of the container attribute. For instance:

```
# emails attribute (incomplete for brevity)
{
  "id": "urn:ietf:params:scim:schemas:core:2.0:User",
  "name": "emails",
  "type": "complex",
  "multiValued": true
}
# derived emails member attribute (incomplete for brevity)
{
  "id": "urn:ietf:params:scim:schemas:core:2.0:User$elem",
  "name": "emails",
  "type": "complex",
  "multiValued": false
}
```
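
The derivation shown above can be expressed as a small function. This is a sketch only: the `attribute` struct and `deriveElementAttribute` are hypothetical names, not the module's actual API.

```go
package main

import "fmt"

// attribute is a hypothetical, trimmed-down stand-in for spec.Attribute.
type attribute struct {
	ID          string
	Name        string
	Type        string
	MultiValued bool
}

// deriveElementAttribute returns the attribute describing one member of a
// multiValued container: same definition, id suffixed with "$elem", and
// multiValued switched off.
func deriveElementAttribute(container attribute) attribute {
	elem := container // value copy; the container attribute is left untouched
	elem.ID = container.ID + "$elem"
	elem.MultiValued = false
	return elem
}

func main() {
	emails := attribute{
		ID:          "urn:ietf:params:scim:schemas:core:2.0:User",
		Name:        "emails",
		Type:        "complex",
		MultiValued: true,
	}
	fmt.Printf("%+v\n", deriveElementAttribute(emails))
}
```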

### Traversal

The core module introduces two ways to traverse the resource data structure: `Visitor` and `Navigator`, the latter also available through its fluent API variant `FluentNavigator`. These two mechanisms support different use cases.

`Visitor` is an interface to be implemented by the party that wants to visit the resource in depth-first-search order. The visiting party may choose to skip or visit a certain property through callbacks defined in the interface. However, the main control lies with the DFS traversal on the resource side. This is a passive traversal mechanism. It is usually used in serialization scenarios.

`Navigator` is a structure that exposes methods to focus on sub properties addressable by different types of handles (e.g. by name, by index, by criteria). When the caller is done with the property, it can go back to the previous state by calling `Retract`. This is an active traversal mechanism. It is usually used in deserialization scenarios.
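
An active traversal can be modelled as a stack of focused properties. The sketch below is an assumption-laden illustration: the method names `FocusName`, `Current`, and `Retract` echo calls that appear in this commit's tests, but the `node` type, `FocusIndex`, and all signatures here are hypothetical.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a property in the resource tree.
type node struct {
	name     string
	children []*node
}

// navigator keeps a stack of focused nodes; the top is the current one.
type navigator struct {
	stack []*node
}

func newNavigator(root *node) *navigator {
	return &navigator{stack: []*node{root}}
}

// Current returns the node that is currently in focus.
func (n *navigator) Current() *node {
	return n.stack[len(n.stack)-1]
}

// FocusName moves the focus to the named child of the current node.
func (n *navigator) FocusName(name string) (*node, error) {
	for _, child := range n.Current().children {
		if child.name == name {
			n.stack = append(n.stack, child)
			return child, nil
		}
	}
	return nil, fmt.Errorf("no sub property named %q", name)
}

// FocusIndex moves the focus to the child at the given index.
func (n *navigator) FocusIndex(index int) (*node, error) {
	children := n.Current().children
	if index < 0 || index >= len(children) {
		return nil, fmt.Errorf("index %d out of range", index)
	}
	n.stack = append(n.stack, children[index])
	return children[index], nil
}

// Retract returns to the previously focused node; the root is never popped.
func (n *navigator) Retract() {
	if len(n.stack) > 1 {
		n.stack = n.stack[:len(n.stack)-1]
	}
}

func main() {
	root := &node{name: "user", children: []*node{
		{name: "meta", children: []*node{{name: "version"}}},
	}}

	nav := newNavigator(root)
	_, _ = nav.FocusName("meta")
	_, _ = nav.FocusName("version")
	fmt.Println(nav.Current().name)
	nav.Retract()
	fmt.Println(nav.Current().name)
}
```

A deserializer benefits from this shape because the input, not the resource, dictates which property comes next.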

### Event System

The property design also features an event system, which allows different parts of the resource to react to changes in other non-local properties.

When a property value is modified, an event is generated describing the modification. This event is propagated up the resource tree and eventually arrives at the root of the resource. Along the way, `Subscriber`s can subscribe to a property for such modification events and react to them.

Subscribers are added to the property using the attribute annotation. Some subscribers are already built in. For instance, `prop.NewExclusivePrimarySubscriber` produces a subscriber that attaches itself to properties annotated with `@ExclusivePrimary` (usually a multiValued complex property with a `primary` boolean sub property). It reacts to boolean property changes from its sub properties and maintains at most one primary property that has the value `true`. This implements the SCIM requirement:

> The primary attribute value "true" MUST appear no more than once.

To add custom subscribers, use the `prop.AddEventFactory` method.
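
The exclusive-primary behaviour can be sketched in isolation. Everything in this example (`member`, `event`, `subscriber`, `multiValued`, `exclusivePrimary`) is a hypothetical simplification and not the `prop` package's actual types; it only shows a subscriber on the container reacting to a change made on one of its members.

```go
package main

import "fmt"

// member is a hypothetical element of a multiValued complex property.
type member struct {
	value   string
	primary bool
}

// event describes a modification; source is the member that changed.
type event struct {
	source *member
}

// subscriber reacts to events that reach the property it is attached to.
type subscriber interface {
	Notify(container *multiValued, ev event)
}

// multiValued is a hypothetical container with attached subscribers.
type multiValued struct {
	members     []*member
	subscribers []subscriber
}

// SetPrimary modifies one member, then propagates the resulting event
// to the subscribers of the container.
func (c *multiValued) SetPrimary(index int, primary bool) {
	m := c.members[index]
	m.primary = primary
	for _, s := range c.subscribers {
		s.Notify(c, event{source: m})
	}
}

// exclusivePrimary keeps at most one member marked as primary.
type exclusivePrimary struct{}

func (exclusivePrimary) Notify(c *multiValued, ev event) {
	if !ev.source.primary {
		return // turning primary off never violates the constraint
	}
	for _, m := range c.members {
		if m != ev.source {
			m.primary = false
		}
	}
}

// primaries lists the values of all members currently marked primary.
func primaries(c *multiValued) []string {
	var out []string
	for _, m := range c.members {
		if m.primary {
			out = append(out, m.value)
		}
	}
	return out
}

func main() {
	emails := &multiValued{
		members:     []*member{{value: "work"}, {value: "home"}},
		subscribers: []subscriber{exclusivePrimary{}},
	}
	emails.SetPrimary(0, true)
	emails.SetPrimary(1, true)
	fmt.Println(primaries(emails))
}
```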

## Expression

The `expr` package provides the `CompilePath` and `CompileFilter` functions to compile SCIM paths and SCIM filters into a hybrid of an abstract syntax tree and a linked list. The resulting data structure can then be used to guide the traversal of the resource.

`expr.Expression` is the main data structure; it represents a meaningful token in the path or filter expression. For instance, for the path `emails[value eq "[email protected]"].primary`, compilation turns it into a structure where each component is a separate `*expr.Expression`.

```
emails -> eq -> primary
         /  \
     value  "[email protected]"
```
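
One way to picture such a hybrid structure is a node with both list and tree links. The sketch below is purely illustrative: the `expression` struct, its field names, and the `render` helpers are assumptions rather than the real `expr.Expression`, the example is assembled by hand instead of by a compiler, and the email literal is a placeholder.

```go
package main

import (
	"fmt"
	"strings"
)

// expression is a hypothetical node: next links path segments into a list,
// while left and right hang filter operands off an operator node.
type expression struct {
	token string
	next  *expression
	left  *expression
	right *expression
}

// renderTree prints an operator node with its operands, or a plain token.
func renderTree(e *expression) string {
	if e == nil {
		return ""
	}
	if e.left == nil && e.right == nil {
		return e.token
	}
	return "(" + renderTree(e.left) + " " + e.token + " " + renderTree(e.right) + ")"
}

// render follows the list links and renders each element along the way.
func render(head *expression) string {
	var parts []string
	for cur := head; cur != nil; cur = cur.next {
		parts = append(parts, renderTree(cur))
	}
	return strings.Join(parts, " -> ")
}

func main() {
	// Hand-built equivalent of: emails[value eq "user@example.com"].primary
	primary := &expression{token: "primary"}
	eq := &expression{
		token: "eq",
		left:  &expression{token: "value"},
		right: &expression{token: `"user@example.com"`},
		next:  primary,
	}
	emails := &expression{token: "emails", next: eq}

	fmt.Println(render(emails))
}
```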

One peculiarity to watch out for is that SCIM paths can be prefixed by namespaces. For instance, `emails` is equivalent to `urn:ietf:params:scim:schemas:core:2.0:emails`. Notice that the `2.0` part in the namespace violates the SCIM attribute name syntax by introducing a dot that does not indicate path separation. To properly compile these paths, the compiler needs to know all the expected namespaces so that, when it sees a dot, it can distinguish between a path separator and a dot inside a namespace. To do so:

```go
expr.Register(resourceType)
```

This registers the ids of all schemas and schema extensions of the resource type with the expression compiler.
154 changes: 153 additions & 1 deletion core/json/deserialize.go
@@ -5,6 +5,9 @@ import (
	"github.com/imulab/go-scim/core/prop"
	"github.com/imulab/go-scim/core/spec"
	"strconv"
	"unicode"
	"unicode/utf16"
	"unicode/utf8"
)

// Entry point of JSON deserialization. Unmarshal the JSON input bytes into the unassigned
@@ -293,7 +296,11 @@ func (d *deserializeState) parseStringProperty() error {
		return d.errInvalidSyntax("expects string literal value for '%s'", p.Attribute().Path())
	}

	return d.navigator.Current().Replace(string(d.data[start+1 : end-1]))
	v, ok := unquote(d.data[start:end])
	if !ok {
		return d.errInvalidSyntax("failed to unquote json string for '%s'", p.Attribute().Path())
	}
	return d.navigator.Current().Replace(v)
}

// Parses a JSON integer. This method expects an integer literal and the null literal.
@@ -459,3 +466,148 @@ func (d *deserializeState) scanNext() {
d.off = len(data) + 1 // mark processed EOF with len+1
}
}

// unquote converts a quoted JSON string literal s into an actual string t.
// The rules are different than for Go, so cannot use strconv.Unquote.
func unquote(s []byte) (t string, ok bool) {
	s, ok = unquoteBytes(s)
	t = string(s)
	return
}

func unquoteBytes(s []byte) (t []byte, ok bool) {
	if len(s) < 2 || s[0] != '"' || s[len(s)-1] != '"' {
		return
	}
	s = s[1 : len(s)-1]

	// Check for unusual characters. If there are none,
	// then no unquoting is needed, so return a slice of the
	// original bytes.
	r := 0
	for r < len(s) {
		c := s[r]
		if c == '\\' || c == '"' || c < ' ' {
			break
		}
		if c < utf8.RuneSelf {
			r++
			continue
		}
		rr, size := utf8.DecodeRune(s[r:])
		if rr == utf8.RuneError && size == 1 {
			break
		}
		r += size
	}
	if r == len(s) {
		return s, true
	}

	b := make([]byte, len(s)+2*utf8.UTFMax)
	w := copy(b, s[0:r])
	for r < len(s) {
		// Out of room? Can only happen if s is full of
		// malformed UTF-8 and we're replacing each
		// byte with RuneError.
		if w >= len(b)-2*utf8.UTFMax {
			nb := make([]byte, (len(b)+utf8.UTFMax)*2)
			copy(nb, b[0:w])
			b = nb
		}
		switch c := s[r]; {
		case c == '\\':
			r++
			if r >= len(s) {
				return
			}
			switch s[r] {
			default:
				return
			case '"', '\\', '/', '\'':
				b[w] = s[r]
				r++
				w++
			case 'b':
				b[w] = '\b'
				r++
				w++
			case 'f':
				b[w] = '\f'
				r++
				w++
			case 'n':
				b[w] = '\n'
				r++
				w++
			case 'r':
				b[w] = '\r'
				r++
				w++
			case 't':
				b[w] = '\t'
				r++
				w++
			case 'u':
				r--
				rr := getu4(s[r:])
				if rr < 0 {
					return
				}
				r += 6
				if utf16.IsSurrogate(rr) {
					rr1 := getu4(s[r:])
					if dec := utf16.DecodeRune(rr, rr1); dec != unicode.ReplacementChar {
						// A valid pair; consume.
						r += 6
						w += utf8.EncodeRune(b[w:], dec)
						break
					}
					// Invalid surrogate; fall back to replacement rune.
					rr = unicode.ReplacementChar
				}
				w += utf8.EncodeRune(b[w:], rr)
			}

		// Quote, control characters are invalid.
		case c == '"', c < ' ':
			return

		// ASCII
		case c < utf8.RuneSelf:
			b[w] = c
			r++
			w++

		// Coerce to well-formed UTF-8.
		default:
			rr, size := utf8.DecodeRune(s[r:])
			r += size
			w += utf8.EncodeRune(b[w:], rr)
		}
	}
	return b[0:w], true
}

// getu4 decodes \uXXXX from the beginning of s, returning the hex value,
// or it returns -1.
func getu4(s []byte) rune {
	if len(s) < 6 || s[0] != '\\' || s[1] != 'u' {
		return -1
	}
	var r rune
	for _, c := range s[2:6] {
		switch {
		case '0' <= c && c <= '9':
			c = c - '0'
		case 'a' <= c && c <= 'f':
			c = c - 'a' + 10
		case 'A' <= c && c <= 'F':
			c = c - 'A' + 10
		default:
			return -1
		}
		r = r*16 + rune(c)
	}
	return r
}
2 changes: 1 addition & 1 deletion core/json/deserialize_test.go
@@ -430,7 +430,7 @@ func (s *JSONDeserializeTestSuite) TestDeserialize() {
}
{
_, _ = nav.FocusName("version")
assert.Equal(t, "W/\\\"1\\\"", nav.Current().Raw())
assert.Equal(t, "W/\"1\"", nav.Current().Raw())
nav.Retract()
}
nav.Retract()
18 changes: 18 additions & 0 deletions core/spec/attribute.go
@@ -225,6 +225,24 @@ func (attr *Attribute) SubAttributeForName(name string) *Attribute {
return nil
}

// Find the sub attribute that matches the criteria, or nil
func (attr *Attribute) FindSubAttribute(criteria func(subAttr *Attribute) bool) *Attribute {
	for _, eachSubAttribute := range attr.subAttributes {
		if criteria(eachSubAttribute) {
			return eachSubAttribute
		}
	}
	return nil
}

// Perform a depth-first-traversal on the given attribute.
func (attr *Attribute) DFS(callback func(a *Attribute)) {
	callback(attr)
	for _, each := range attr.subAttributes {
		each.DFS(callback)
	}
}

// Return true if one or more of this attribute's sub attributes is marked as identity
func (attr *Attribute) HasIdentitySubAttributes() bool {
for _, subAttr := range attr.subAttributes {