Use web streams instead of Node.js streams #61

Draft
wants to merge 17 commits into
base: master
8 changes: 8 additions & 0 deletions .eslintignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
.DS_Store
node_modules
coverage
dist
*.log
.vscode/
esm
.eslintcache
12 changes: 6 additions & 6 deletions .github/workflows/pull_request.yml
@@ -4,27 +4,27 @@ on: pull_request

jobs:
lint:
name: Lint on node 14 and ubuntu-latest
name: Lint on node 16 and ubuntu-latest
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Use Node.js 14
- name: Use Node.js 16
uses: actions/setup-node@v1
with:
node-version: '14'
node-version: '16'
- name: Install deps and build (with cache)
uses: bahmutov/npm-install@v1
- name: Lint codebase
run: yarn lint
test:
name: Test and lint on node 14.x and ubuntu-latest
name: Test and lint on node 16.x and ubuntu-latest
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Use Node.js 14.x
- name: Use Node.js 16.x
uses: actions/setup-node@v2
with:
node-version: '14'
node-version: '16'
- name: Install deps (with cache)
uses: bahmutov/npm-install@v1
- name: Test codebase
6 changes: 3 additions & 3 deletions .github/workflows/push.yml
@@ -4,14 +4,14 @@ on: push

jobs:
test:
name: Test and lint on node 14.x and ubuntu-latest
name: Test and lint on node 16.x and ubuntu-latest
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Use Node.js 14.x
- name: Use Node.js 16.x
uses: actions/setup-node@v2
with:
node-version: '14'
node-version: '16'
- name: Install deps (with cache)
uses: bahmutov/npm-install@v1
- name: Test codebase
212 changes: 125 additions & 87 deletions README.md
@@ -17,57 +17,115 @@ specification](https://github.com/The-Sequence-Ontology/Specifications/blob/mast
with `disableDerivesFromReferences`)
- only compatible with GFF3

## Compatibility

Works in the browser and with Node.js v16 and up.

## Install

$ npm install --save @gmod/gff

## Usage

### Node.js example

```js
const gff = require('@gmod/gff').default
// or in ES6 (recommended)
import gff from '@gmod/gff'
import {
createReadStream,
createWriteStream,
readFileSync,
writeFileSync,
} from 'fs'
// Readable.toWeb and Writable.toWeb are only available in Node.js v18 and up
// in Node.js 16, you'll have to provide your own stream source and sink
import { Readable, Writable } from 'stream'
// TransformStream is available without importing in Node.js v18 and up
import { TransformStream } from 'stream/web'
import {
formatSync,
parseStringSync,
GFFTransformer,
GFFFormattingTransformer,
} from '@gmod/gff'

// parse a file from a file name. parses only features and sequences by default,
// set options to parse directives and/or comments
;(async () => {
const readStream = createReadStream('/path/to/my/file.gff3')
const streamOfGFF3 = Readable.toWeb(readStream).pipeThrough(
new TransformStream(
new GFFTransformer({ parseComments: true, parseDirectives: true }),
),
)
for await (const data of streamOfGFF3) {
if ('directive' in data) {
console.log('got a directive', data)
} else if ('comment' in data) {
console.log('got a comment', data)
} else if ('sequence' in data) {
console.log('got a sequence from a FASTA section')
} else {
console.log('got a feature', data)
}
}

const fs = require('fs')
// parse a string of gff3 synchronously
const stringOfGFF3 = readFileSync('/path/to/my/file.gff3', 'utf8')
const arrayOfGFF3ITems = parseStringSync(stringOfGFF3)

// format an array of items to a string
const newStringOfGFF3 = formatSync(arrayOfGFF3ITems)
writeFileSync('/path/to/new/file.gff3', newStringOfGFF3)

// read a file, format it, and write it to a new file. inserts sync marks and
// a '##gff-version 3' header if one is not already present
await Readable.toWeb(createReadStream('/path/to/my/file.gff3'))
.pipeThrough(
new TransformStream(
new GFFTransformer({ parseComments: true, parseDirectives: true }),
),
)
.pipeThrough(new TransformStream(new GFFFormattingTransformer()))
.pipeTo(Writable.toWeb(createWriteStream('/path/to/new/file.gff3')))
})()
```

// parse a file from a file name
// parses only features and sequences by default,
// set options to parse directives and/or comments
fs.createReadStream('path/to/my/file.gff3')
.pipe(gff.parseStream({ parseAll: true }))
.on('data', (data) => {
if (data.directive) {
### Browser example

```js
import { GFFTransformer } from '@gmod/gff'

// parse a file from a URL. parses only features and sequences by default, set
// options to parse directives and/or comments
;(async () => {
const response = await fetch('http://example.com/file.gff3')
if (!response.ok) {
throw new Error('Bad response')
}
if (!response.body) {
throw new Error('No response body')
}
const reader = response.body
.pipeThrough(new TransformStream(new GFFTransformer({ parseComments: true, parseDirectives: true })))
.getReader()
let result
do {
result = await reader.read()
if (result.done) {
continue
}
const data = result.value
if ('directive' in data) {
console.log('got a directive', data)
} else if (data.comment) {
} else if ('comment' in data) {
console.log('got a comment', data)
} else if (data.sequence) {
} else if ('sequence' in data) {
console.log('got a sequence from a FASTA section')
} else {
console.log('got a feature', data)
}
})

// parse a string of gff3 synchronously
const stringOfGFF3 = fs.readFileSync('my_annotations.gff3').toString()
const arrayOfThings = gff.parseStringSync(stringOfGFF3)

// format an array of items to a string
const newStringOfGFF3 = gff.formatSync(arrayOfThings)

// format a stream of things to a stream of text.
// inserts sync marks automatically.
myStreamOfGFF3Objects
.pipe(gff.formatStream())
.pipe(fs.createWriteStream('my_new.gff3'))

// format a stream of things and write it to
// a gff3 file. inserts sync marks and a
// '##gff-version 3' header if one is not
// already present
gff.formatFile(
myStreamOfGFF3Objects,
fs.createWriteStream('my_new_2.gff3', { encoding: 'utf8' }),
)
} while (!result.done)
})()
```
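
Both examples above switch from truthiness checks (`data.directive`) to `'directive' in data`. A minimal sketch of that dispatch as a standalone helper follows. `describeItem` is a hypothetical name, not part of `@gmod/gff`, and the item shapes are hand-written stand-ins rather than real parser output.

```js
// Hypothetical helper (not part of @gmod/gff): name the kind of a parsed item
// using the same `in` checks as the examples above.
function describeItem(data) {
  if ('directive' in data) return 'directive'
  if ('comment' in data) return 'comment'
  if ('sequence' in data) return 'sequence'
  // anything else is treated as a feature
  return 'feature'
}

// Hand-written stand-ins for parsed items (shapes assumed for illustration)
const items = [
  { directive: 'gff-version', value: '3' },
  { comment: 'a comment' },
  { id: 'ctgA', description: 'test contig', sequence: 'ACTG' },
  [{ seq_id: 'ctgA', type: 'gene', start: 1, end: 100 }],
]
const kinds = items.map(describeItem)
console.log(kinds)
```

One practical difference from a truthiness check: `describeItem({ comment: '' })` still returns `'comment'`, whereas `data.comment` would be falsy for an empty comment.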

## Object format
@@ -198,23 +256,22 @@ ACTGACTAGCTAGCATCAGCGTCGTAGCTATTATATTACGGTAGCCA`)[

- [ParseOptions](#parseoptions)
- [disableDerivesFromReferences](#disablederivesfromreferences)
- [encoding](#encoding)
- [parseFeatures](#parsefeatures)
- [parseDirectives](#parsedirectives)
- [parseComments](#parsecomments)
- [parseSequences](#parsesequences)
- [parseAll](#parseall)
- [bufferSize](#buffersize)
- [parseStream](#parsestream)
- [GFFTransformer](#gfftransformer)
- [Parameters](#parameters)
- [parseStringSync](#parsestringsync)
- [Parameters](#parameters-1)
- [formatSync](#formatsync)
- [Parameters](#parameters-2)
- [formatStream](#formatstream)
- [FormatOptions](#formatoptions)
- [minSyncLines](#minsynclines)
- [insertVersionDirective](#insertversiondirective)
- [GFFFormattingTransformer](#gffformattingtransformer)
- [Parameters](#parameters-3)
- [formatFile](#formatfile)
- [Parameters](#parameters-4)

### ParseOptions

@@ -226,12 +283,6 @@ Whether to resolve references to derives from features

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

#### encoding

Text encoding of the input GFF3. default 'utf8'

Type: BufferEncoding

#### parseFeatures

Whether to parse features, default true
@@ -256,29 +307,20 @@ Whether to parse sequences, default true

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

#### parseAll

Parse all features, directives, comments, and sequences. Overrides other
parsing options. Default false.

Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

#### bufferSize

Maximum number of GFF3 lines to buffer, default 1000
Maximum number of GFF3 lines to buffer, default 50000

Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

### parseStream
### GFFTransformer

Parse a stream of text data into a stream of feature, directive, comment,
and sequence objects.

#### Parameters

- `options` **[ParseOptions](#parseoptions)** Parsing options (optional, default `{}`)

Returns **GFFTransform** stream (in objectMode) of parsed items
- `options` **[ParseOptions](#parseoptions)** Parser options (optional, default `{}`)
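
Because the usage examples pass the instance to `new TransformStream(...)`, `GFFTransformer` presumably follows the web streams transformer contract: an object with `transform(chunk, controller)` and, optionally, `flush(controller)`. The sketch below shows that contract with a hypothetical `LineSplitter` class. It is not the library's parser, only an illustration of the object shape `TransformStream` accepts.

```js
// Hypothetical transformer, NOT the library's GFFTransformer: it splits
// incoming text chunks into complete lines.
class LineSplitter {
  constructor() {
    this.remainder = ''
  }

  // called once per incoming chunk
  transform(chunk, controller) {
    const lines = (this.remainder + chunk).split(/\r?\n/)
    // the last element is an incomplete line (or ''), keep it for the next chunk
    this.remainder = lines.pop()
    for (const line of lines) {
      controller.enqueue(line)
    }
  }

  // called once when the input ends
  flush(controller) {
    if (this.remainder) {
      controller.enqueue(this.remainder)
    }
  }
}

// Drive it by hand with a stub controller to show the contract
const emitted = []
const stubController = { enqueue: (line) => emitted.push(line) }
const splitter = new LineSplitter()
splitter.transform('ctgA\t.\tgene\n##gff', stubController)
splitter.transform('-version 3\nlast', stubController)
splitter.flush(stubController)
console.log(emitted)
```

In a real pipeline the instance would be wrapped the same way as in the usage examples, for instance `new TransformStream(new LineSplitter())`.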

### parseStringSync

@@ -288,7 +330,7 @@ parsed items.
#### Parameters

- `str` **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** GFF3 string
- `inputOptions` **({disableDerivesFromReferences: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)?, encoding: BufferEncoding?, bufferSize: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)?} | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))?** Parsing options
- `inputOptions` **({disableDerivesFromReferences: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)?, bufferSize: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)?} | [undefined](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/undefined))?** Parsing options

Returns **[Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<(GFF3Feature | GFF3Sequence)>** array of parsed features, directives, comments and/or sequences

@@ -303,51 +345,46 @@ GFF3. Does not insert synchronization (###) marks.

Returns **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** the formatted GFF3

### formatStream
### FormatOptions

Format a stream of features, directives, comments and/or sequences into a
stream of GFF3 text.
Formatter options

Inserts synchronization (###) marks automatically.
#### minSyncLines

#### Parameters
The minimum number of lines to emit between sync (###) directives, default
100

- `options` **FormatOptions** parser options (optional, default `{}`)
Type: [number](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Number)

Returns **FormattingTransform**
#### insertVersionDirective

### formatFile
Whether to insert a version directive at the beginning of a formatted
stream if one does not exist already, default true

Format a stream of features, directives, comments and/or sequences into a
GFF3 file and write it to the filesystem.
Type: [boolean](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Boolean)

Inserts synchronization (###) marks and a ##gff-version
directive automatically (if one is not already present).
### GFFFormattingTransformer

#### Parameters
Transform a stream of features, directives, comments and/or sequences into a
stream of GFF3 text.

Inserts synchronization (###) marks automatically.

- `stream` **Readable** the stream to write to the file
- `writeStream` **Writable**
- `options` **FormatOptions** parser options (optional, default `{}`)
- `filename` the file path to write to
#### Parameters

Returns **[Promise](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Promise)\<null>** promise for null that resolves when the stream has been written
- `options` **[FormatOptions](#formatoptions)** Formatter options (optional, default `{}`)
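
To make the two `FormatOptions` concrete, here is a rough sketch of how they could interact. `joinWithSyncMarks` is a hypothetical function, not the library's implementation, and it assumes the input is already grouped into independent features. How the real formatter decides when a sync mark is safe is not shown in this diff.

```js
// Hypothetical sketch, not the library's implementation. `groups` is an array
// of string arrays, each holding the GFF3 lines of one independent feature.
function joinWithSyncMarks(groups, options = {}) {
  const { minSyncLines = 100, insertVersionDirective = true } = options
  const out = []
  const firstLine = groups.length > 0 ? groups[0][0] : undefined
  // add the version header only if the input does not already start with one
  if (insertVersionDirective && !(firstLine || '').startsWith('##gff-version')) {
    out.push('##gff-version 3')
  }
  let linesSinceSync = 0
  for (const group of groups) {
    out.push(...group)
    linesSinceSync += group.length
    // sync only between groups, and only once enough lines have gone by
    if (linesSinceSync >= minSyncLines) {
      out.push('###')
      linesSinceSync = 0
    }
  }
  return out.join('\n') + '\n'
}

const text = joinWithSyncMarks(
  [
    [
      'ctgA\t.\tgene\t1\t100\t.\t+\t.\tID=g1',
      'ctgA\t.\tmRNA\t1\t100\t.\t+\t.\tParent=g1',
    ],
    ['ctgA\t.\tgene\t200\t300\t.\t+\t.\tID=g2'],
  ],
  { minSyncLines: 2 },
)
console.log(text)
```

With `minSyncLines: 2`, the sketch emits a `###` line after the first two-line group and none after the single-line group that follows.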

## About `util`

There is also a `util` module that contains super-low-level functions for dealing with lines and parts of lines.

```js
// non-ES6
const util = require('@gmod/gff').default.util
// or, with ES6
import gff from '@gmod/gff'
const util = gff.util
import { util } from '@gmod/gff'

const gff3Lines = util.formatItem({
seq_id: 'ctgA',
...
}))
})
```

## util
@@ -533,9 +570,10 @@ into one or more lines of GFF3.

#### Parameters

- `itemOrItems` **([GFF3FeatureLineWithRefs](#gff3featurelinewithrefs) | [GFF3Directive](#gff3directive) | [GFF3Comment](#gff3comment) | [GFF3Sequence](#gff3sequence) | [Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<([GFF3FeatureLineWithRefs](#gff3featurelinewithrefs) | [GFF3Directive](#gff3directive) | [GFF3Comment](#gff3comment) | [GFF3Sequence](#gff3sequence))>)** A comment, sequence, or feature, or array of such items
- `item` **([GFF3FeatureLineWithRefs](#gff3featurelinewithrefs) | [GFF3Directive](#gff3directive) | [GFF3Comment](#gff3comment) | [GFF3Sequence](#gff3sequence))**&#x20;
- `itemOrItems` A comment, sequence, or feature, or array of such items

Returns **([string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String) | [Array](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array)<[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)>)** A formatted string or array of strings
Returns **[string](https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String)** A formatted string or array of strings

### GFF3Attributes

5 changes: 3 additions & 2 deletions jest.config.js
@@ -1,5 +1,6 @@
/** @type {import('ts-jest/dist/types').InitialOptionsTsJest} */
/** @type {import('ts-jest/dist/types').JestConfigWithTsJest} */
module.exports = {
preset: 'ts-jest',
testEnvironment: 'node',
};
collectCoverageFrom: ['**/src/*.ts'],
}