cEP 23: Separation of bears' metadata
Closes coala#138
1 parent
c1a4b7e
commit da6a447
# Separation of bears' metadata

| Metadata |                                                 |
|----------|-------------------------------------------------|
| cEP      | 0023                                            |
| Version  | 1.0                                             |
| Title    | Separation of bears' metadata                   |
| Authors  | Muhammad Kaisar Arkhan <mailto:[email protected]> |
| Status   | Proposed                                        |
| Type     | Feature                                         |

## Abstract

This cEP proposes a method of separating bears' metadata from their code,
removing the need to use Python when writing bears.

## How bears are written currently

Most bears consist of Python boilerplate code containing the metadata coala
needs, some more metadata to identify what the bear is, and docstrings for the
bear description.

[GoVetBear][GoVetBear]

Of course, not all bears are just boilerplate code. Some require Python code to
help coala execute the linters, parse logs, make configuration files, etc.

[CoffeeLintBear][CoffeeLintBear]

Some bears are written in-house by the coala team.

[SpaceConsistencyBear][SpaceConsistencyBear]

## Problems with the current way of writing bears

### Duplicate code all over the place

When writing a bear, you have to copy the Python boilerplate and fill in the
metadata.

This duplication becomes a burden when a new feature deprecates the old way of
doing things: we have to change the code of almost every bear.

[Example 1][Example 1]

### Python is not needed

Bears such as [GoVetBear][GoVetBear] don't need Python to declare metadata.

The `@linter` decorator helps suppress a lot of boilerplate code, but it still
requires Python just to declare metadata.

Some projects or organizations may need to write their own bears so coala can
use their exclusive tools (such as commercial code safety checks that are
commonly used by embedded software projects).

Not every project or organization wants snippets of Python code in its
repository just to declare how to use a linter, and not everyone can write
Python.

### Development is slow

This is specific to bears that are made in-house or require a lot of custom
code to run.

When writing a bear, we have to test it.

This requires setting up a coala development environment, making sure
coala-bears isn't installed or declaring the bears directory (which may result
in a conflict), and then running coala with a long list of arguments or writing
a `.coafile`.

Alternatively, you can work the other way around: write the tests first and
just run `py.test` to test your fresh new bear.

Either way, both approaches add a lot of time to testing a bear during
development. You shouldn't need to write a lot of unnecessary boilerplate code
just to run the bear ad hoc. It should be as simple as running it in your
shell.

### Dual functionality of bears

Are bears linters, or are they just metadata that instructs coala to run
linters?

Should bears just declare metadata, with the code that makes a tool usable by
coala kept separate?

This has been an issue for a while, and it generates inconsistencies all over
the place.

Some bears need extra code to generate configuration files, such as
[CoffeeLintBear][CoffeeLintBear].

Some bears implement the linting logic themselves, such as
[SpaceConsistencyBear][SpaceConsistencyBear].

Some of the Python bears just call the linter's functions, such as
[PEP8Bear][PEP8Bear].

I believe bears should simply be metadata, while the actual linter tool should
be separated from them.

Extra code, such as generating config files, can easily be moved into an
external script.

### Dependency Hell

Tracking coala and coala-bears has been a problem. coala and coala-bears must
be released together, and releases are quite slow because coala needs a lot of
changes, while bears should be able to be released sooner.

This holds back a lot of new bears and bug fixes.

coala-bears should have a steady and frequent release cycle so people can enjoy
bug fixes and new bears without coala development holding them back.

Sadly, this is hard to do because coala-bears is a bunch of Python code that
calls things from coala that may or may not be there.

This creates a dependency cycle between coala and coala-bears that should not
be ignored.

### Security

When bear code runs inside the context of the coala process, it is possible to
introduce bugs that have access to the coala process.

This is bad since such bugs can leak information and possibly lead to code
execution. In theory, this makes it possible to exploit services that use
coala, such as continuous integration, and leak information such as secret keys
for deployment (to the Play Store, for example).

coala should simply run linters separately. It should not run them inside its
own context.

If we treat bears as just metadata, implementing good security practices such
as privilege separation, operating-system-specific mitigations, and many more
becomes possible and much easier.

## Objective

coala-bears can be simplified by an order of magnitude if it is treated as a
repository filled with metadata that instructs coala on how to use linters.
coala-bears should operate independently of coala development, enabling a
faster release cycle that delivers bug fixes and new bears sooner.

## Structure of Bears

Collections of bears will be put inside the directories declared in
`$COALA_BEAR_PATH`, with defaults such as
`$HOME/.coala/bears:/usr/local/lib/coala/bears:/usr/lib/coala/bears`. In
addition, a project may have a local `.coala` directory, in which bears are
located inside `.coala/bears`.

```
/usr/local/lib/coala/bears
...
|
|_ GoVetBear
|  |_ metadata.toml
|
|_ CoffeeLintBear
|  |_ metadata.toml
|  |_ bear.py
|  |_ generate_config.py
|
|_ SpaceConsistencyBear
|  |_ metadata.toml
|  |_ bear.py
|
|_ PEP8Bear
|  |_ metadata.toml
|  |_ bear.py
...
.coala/bears
|_ AeroplaneSafetyComplianceBear
|  |_ metadata.toml
|
|_ MemoryStructureFormatBear
   |_ metadata.toml
   |_ check_memory_structure.sh
```

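
The lookup described above could be sketched roughly as follows. This is only
an illustration, not part of the proposal: the function names are hypothetical,
and two details are assumptions that the cEP does not specify, namely that the
project-local `.coala/bears` directory is searched first and that the first
bear found under a given name wins.

```python
import os
from pathlib import Path

# Default search path quoted from this cEP. Everything else in this sketch
# (names, precedence rules) is an assumption made for illustration.
DEFAULT_BEAR_PATH = (
    "$HOME/.coala/bears:/usr/local/lib/coala/bears:/usr/lib/coala/bears"
)


def bear_search_dirs(project_dir, environ=None):
    """Return the directories to search, project-local directory first."""
    environ = os.environ if environ is None else environ
    raw = environ.get("COALA_BEAR_PATH", DEFAULT_BEAR_PATH)
    home = environ.get("HOME", os.path.expanduser("~"))
    dirs = [Path(project_dir) / ".coala" / "bears"]
    # The cEP separates entries with ":", so that separator is used here.
    for entry in raw.split(":"):
        if entry:
            dirs.append(Path(entry.replace("$HOME", home)))
    return dirs


def discover_bears(project_dir, environ=None):
    """Map bear name to bear directory for every directory holding a
    metadata.toml file."""
    bears = {}
    for base in bear_search_dirs(project_dir, environ):
        if not base.is_dir():
            continue
        for metadata in sorted(base.rglob("metadata.toml")):
            # setdefault keeps the first match, so earlier directories win.
            bears.setdefault(metadata.parent.name, metadata.parent)
    return bears
```

Calling `discover_bears(".")` from a project root would then return a
dictionary of bear names and the directories that contain their
`metadata.toml`.
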
The `metadata.toml` file will declare the metadata required to instruct coala
on how to use the tool, what arguments to give when executing it, what
dependencies it requires, etc.

Inside the bear's directory, a script or an executable can be added. It does
not need coala in order to execute, which removes the dependency cycle.

The script will be launched using a general fork+exec model to prevent it from
doing malicious things inside the context of coala.

This also enables coala itself to add more safety features, such as
operating-system-specific mechanisms (FreeBSD Capsicum, OpenBSD pledge, Linux
seccomp, etc.) and more fine-grained privilege separation. However, those
aren't part of this cEP and will be covered another time.

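
As a rough illustration of the fork+exec model, the sketch below runs a bear
executable as a child process using Python's standard `subprocess` module and
collects its exit code and output. The function name `run_isolated` and the
timeout value are assumptions; the operating-system-specific mitigations
mentioned above are deliberately left out.

```python
import subprocess


def run_isolated(argv, cwd=None, timeout=60):
    """Run a bear executable in a child process and capture its output.

    The child shares no Python state with the caller: it only receives its
    arguments and returns an exit code plus whatever it wrote to
    stdout and stderr.
    """
    completed = subprocess.run(
        argv,
        cwd=cwd,
        capture_output=True,  # collect stdout and stderr separately
        text=True,            # decode output as text
        timeout=timeout,      # hypothetical limit, not specified by the cEP
        check=False,          # a non-zero exit code is data, not an error
    )
    return completed.returncode, completed.stdout, completed.stderr
```

For instance, `run_isolated(["go", "vet"], cwd="path/to/project")` would return
the exit code of `go vet` together with its captured output.
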
## `metadata.toml`

`metadata.toml` is essentially a TOML file declaring the information coala
needs.

TOML is chosen since it has enough features to do what we want. We may need to
research whether ini files are good enough, since support for those is already
in Python's standard library.

Here are a couple of examples:

**GoVetBear/metadata.toml**
```toml
[identity]
name = "GoVetBear"
description = """\
Analyze Go code and raise suspicious constructs, such as printf calls \
whose arguments do not correctly match the format string, useless \
assignments, common mistakes about boolean operations, unreachable code, \
etc.\
"""
languages = ["Go"]
authors = ["The coala developers"]
authors_email = ["[email protected]"]
license = "AGPL-3.0"
can_detect = ["Unused code", "Smell", "Unreachable Code"]

[[requirements]]
type = "AnyOneOf"

[[requirements.child]]
type = "binary"
name = "go"

[[requirements.child]]
type = "apt"
name = "golang"

[[requirements]]
type = "GoRequirement"
package = "golang.org/cmd/vet"
flag = "-u"

[run]
executable = "go"
arguments = "vet"
use_stdout = false
use_stderr = true
output_format = "regex"
# Literal (single-quoted) string, because `\d` is not a valid escape
# sequence inside a double-quoted TOML string.
output_regex = '.+:(?P<line>\d+): (?P<message>.*)'
```

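
To make the `[run]` table more concrete, here is a minimal sketch of how its
`use_stdout`, `use_stderr`, and `output_regex` keys might be applied to the
output of the linter. The dictionary mirrors the TOML above as it would look
after parsing. The function name, the shape of the returned results, and the
sample output line in the usage note are assumptions for illustration.

```python
import re

# The [run] table of GoVetBear/metadata.toml as a parsed dictionary.
RUN = {
    "executable": "go",
    "arguments": "vet",
    "use_stdout": False,
    "use_stderr": True,
    "output_format": "regex",
    "output_regex": r".+:(?P<line>\d+): (?P<message>.*)",
}


def parse_regex_output(run, stdout, stderr):
    """Extract (line, message) results from the streams selected in `run`."""
    streams = []
    if run.get("use_stdout", False):
        streams.append(stdout)
    if run.get("use_stderr", False):
        streams.append(stderr)

    pattern = re.compile(run["output_regex"])
    results = []
    for text in streams:
        for raw_line in text.splitlines():
            match = pattern.match(raw_line)
            if match:  # lines that do not match the pattern are skipped
                results.append({
                    "line": int(match.group("line")),
                    "message": match.group("message"),
                })
    return results
```

Given a hypothetical stderr line such as `main.go:12: unreachable code`, the
function yields one result with line `12` and message `unreachable code`, while
anything written to stdout is ignored because `use_stdout` is `false`.
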
**SpaceConsistencyBear/metadata.toml**
```toml
[identity]
name = "SpaceConsistencyBear"
description = """\
Check and correct spacing for all textual data. This includes usage of \
tabs vs. spaces, trailing whitespace and (missing) newlines before \
the end of the file.\
"""
languages = ["All"]
authors = ["The coala developers"]
authors_email = ["[email protected]"]
license = "AGPL-3.0"
can_detect = ["Formatting"]

[[params]]
name = "use_spaces"
description = "True if spaces are to be used instead of tabs."
type = "bool"

[[params]]
name = "allow_trailing_whitespace"
description = "Whether to allow trailing whitespace or not."
type = "bool"
default = false

[[params]]
name = "indent_size"
description = "Number of spaces per indentation level"
type = "int"
default = 8

[[params]]
name = "enforce_newline_at_EOF"
description = "Whether to enforce a newline at the end of file"
type = "bool"
default = true
format = "enforce-newline={}"

[run]
executable = "bear.py"
local = true
use_coala_logging_style = true
```

As you can see from the SpaceConsistencyBear example, it is treated not as
Python code running under coala but rather as if it were its own linter. The
`local` variable simply indicates that the file is inside the bear's directory
and not in `$PATH`, and the `use_coala_logging_style` variable tells coala that
the executable is going to use the common log format.

Parameters will be given to the process via command-line arguments when
launching. With the defaults of the above example, this results in the
following command being executed:

```sh
/usr/local/lib/coala/bears/general/SpaceConsistencyBear/bear.py \
--allow_trailing_whitespace=false \
--indent_size=8 \
enforce-newline=true
```

The above example is formatted for readability; the real command will be on one
line.

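
The mapping from `[[params]]` entries to command-line arguments can be sketched
as below. The rules are inferred from the example command rather than stated by
the cEP, so treat them as assumptions: a parameter without a value is skipped,
the default template is `--<name>={}`, a `format` key overrides that template,
and booleans are rendered as lowercase `true`/`false`. The function names are
hypothetical.

```python
# The [[params]] tables of SpaceConsistencyBear/metadata.toml as they would
# look after parsing (descriptions omitted for brevity).
PARAMS = [
    {"name": "use_spaces", "type": "bool"},
    {"name": "allow_trailing_whitespace", "type": "bool", "default": False},
    {"name": "indent_size", "type": "int", "default": 8},
    {"name": "enforce_newline_at_EOF", "type": "bool", "default": True,
     "format": "enforce-newline={}"},
]


def render_value(value):
    """Render a value the way the example command does (lowercase booleans)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def build_arguments(params, overrides=None):
    """Turn parameter declarations into a list of command-line arguments."""
    overrides = overrides or {}
    arguments = []
    for param in params:
        name = param["name"]
        if name in overrides:
            value = overrides[name]
        elif "default" in param:
            value = param["default"]
        else:
            continue  # no user value and no default: nothing to pass
        template = param.get("format", "--" + name + "={}")
        arguments.append(template.format(render_value(value)))
    return arguments
```

With no overrides, `build_arguments(PARAMS)` produces the three arguments shown
in the command above, in the same order.
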
**CoffeeLintBear/metadata.toml**
```toml
[identity]
name = "CoffeeLintBear"
description = "Check CoffeeScript for a clean and consistent file"
url = "http://www.coffeelint.org"
languages = ["CoffeeScript"]
authors = ["The coala developers"]
authors_email = ["[email protected]"]
license = "AGPL-3.0"
can_detect = ["Syntax", "Formatting", "Smell", "Complexity", "Duplication"]

[severity_map]
normal = "warn"
major = "error"
info = "ignore"

[[requirements]]
type = "binary"
name = "coffeelint"

[[params]]
name = "max_line_length"
description = "Maximum number of characters per line."
type = "int"
default = 79

...

[prerun]
executable = "generate_config.py"
local = true
use_coala_logging_style = true

[run]
executable = "bear.py"
ignore_params = true
local = true
use_coala_logging_style = true
```

The CoffeeLintBear example above shows what the metadata will look like if a
bear requires special treatment, such as generating configuration files and
translating the output of the linter.

If it requires some special treatment after the linter is executed, a `postrun`
section can be added as well.

The `prerun` and `postrun` sections will have the same format as the `run`
section.

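
Since all three sections share one format, coala could treat them uniformly.
The sketch below only plans the stages: it walks `prerun`, `run`, and `postrun`
in that order, skips sections that are absent, and resolves executables marked
`local` relative to the bear's directory. The function name and the exact
resolution rule are assumptions, and actually launching each stage is out of
scope here.

```python
from pathlib import Path

# Stage order as implied by the section names in this cEP.
STAGES = ("prerun", "run", "postrun")


def resolve_stages(metadata, bear_dir):
    """Return (stage, executable) pairs in the order they should be run."""
    plan = []
    for stage in STAGES:
        section = metadata.get(stage)
        if section is None:
            continue  # optional section that this bear does not declare
        executable = section["executable"]
        if section.get("local", False):
            # `local = true`: the file lives in the bear's own directory.
            executable = str(Path(bear_dir) / executable)
        plan.append((stage, executable))
    return plan
```

For the CoffeeLintBear metadata this gives `generate_config.py` followed by
`bear.py`, both resolved inside the bear's directory, whereas GoVetBear's `go`
is left untouched so that it is looked up in `$PATH`.
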
## Process

TODO

[GoVetBear]: https://github.com/coala/coala-bears/blob/3cb9b148adc0dda51ac890188b38fd968f6058fd/bears/go/GoVetBear.py
[CoffeeLintBear]: https://github.com/coala/coala-bears/blob/3cb9b148adc0dda51ac890188b38fd968f6058fd/bears/coffee_script/CoffeeLintBear.py
[SpaceConsistencyBear]: https://github.com/coala/coala-bears/blob/3cb9b148adc0dda51ac890188b38fd968f6058fd/bears/general/SpaceConsistencyBear.py
[PEP8Bear]: https://github.com/coala/coala-bears/blob/c5a5e201a42c44c159b9c118b062417e4ae4b17f/bears/python/PEP8Bear.py
[Example 1]: https://github.com/coala/coala-bears/commit/3cb9b148adc0dda51ac890188b38fd968f6058fd