In this section, we will see how you can deploy your pipeline as a REST API using the power of FastAPI.
Let's create a simple NLP model that detects mentions of COVID-19 and qualifies them (negation, hypothesis, family context, reported speech).
You know the drill:
```python
import edsnlp, edsnlp.pipes as eds

nlp = edsnlp.blank("eds")
nlp.add_pipe(eds.sentences())
nlp.add_pipe(
    eds.matcher(
        regex=dict(
            covid=[
                "covid",
                r"covid[-\s]?19",
                r"sars[-\s]?cov[-\s]?2",
                r"corona[-\s]?virus",
            ],
        ),
        attr="LOWER",
    ),
)
nlp.add_pipe(eds.negation())
nlp.add_pipe(eds.family())
nlp.add_pipe(eds.hypothesis())
nlp.add_pipe(eds.rspeech())
```
FastAPI is an incredibly efficient framework, built from the ground up on Python type hints with the help of Pydantic (another great library for modern Python). We won't go into too much detail about FastAPI in this tutorial; for further information on how the framework operates, head over to its excellent documentation!
We'll need to create two things: the Pydantic models that define the shape of the API's output, and the FastAPI application itself.
```python
from typing import List

from pydantic import BaseModel


class Entity(BaseModel):  # (1)

    # OMOP-style attributes
    start: int
    end: int
    label: str
    lexical_variant: str
    normalized_variant: str

    # Qualifiers
    negated: bool
    hypothesis: bool
    family: bool
    reported_speech: bool


class Document(BaseModel):  # (2)
    text: str
    ents: List[Entity]
```
1. The `Entity` model contains attributes that define a matched entity, as well as variables that hold the output of the qualifier components.
2. The `Document` model contains the input text and the list of detected entities.

Having defined the output models and the pipeline, we can move on to creating the application itself:
```python
from typing import List

import edsnlp
from fastapi import FastAPI

from pipeline import nlp
from models import Entity, Document


app = FastAPI(title="EDS-NLP", version=edsnlp.__version__)


@app.post("/covid", response_model=List[Document])  # (1)
async def process(
    notes: List[str],  # (2)
):

    documents = []

    for doc in nlp.pipe(notes):
        entities = []

        for ent in doc.ents:
            entity = Entity(
                start=ent.start_char,
                end=ent.end_char,
                label=ent.label_,
                lexical_variant=ent.text,
                normalized_variant=ent._.normalized_variant,
                negated=ent._.negation,
                hypothesis=ent._.hypothesis,
                family=ent._.family,
                reported_speech=ent._.reported_speech,
            )
            entities.append(entity)

        documents.append(
            Document(
                text=doc.text,
                ents=entities,
            )
        )

    return documents
```
1. This decorator declares a `POST` endpoint at `/covid`, whose responses are serialised following the `Document` model.
2. Thanks to the type hint, FastAPI parses the list of strings directly from the `POST` request body. As a bonus, you get data validation for free.

Our simple API is ready to launch! We'll just need to install FastAPI along with an ASGI server to run it. This can be done in one go:
```console
$ pip install fastapi uvicorn
---> 100%
Successfully installed fastapi
```
Launching the API is trivial:
```console
$ uvicorn app:app --reload
```
Go to `localhost:8000/docs` to admire the automatically generated documentation!
You can try the API directly from the documentation. Otherwise, you may use the `requests` package:
```python
import requests

notes = [
    "Le père du patient n'est pas atteint de la covid.",
    "Probable coronavirus.",
]

r = requests.post(
    "http://localhost:8000/covid",
    json=notes,
)

r.json()
```
You should get something like:
```json
[
  {
    "text": "Le père du patient n'est pas atteint de la covid.",
    "ents": [
      {
        "start": 43,
        "end": 48,
        "label": "covid",
        "lexical_variant": "covid",
        "normalized_variant": "covid",
        "negated": true,
        "hypothesis": false,
        "family": true,
        "reported_speech": false
      }
    ]
  },
  {
    "text": "Probable coronavirus.",
    "ents": [
      {
        "start": 9,
        "end": 20,
        "label": "covid",
        "lexical_variant": "coronavirus",
        "normalized_variant": "coronavirus",
        "negated": false,
        "hypothesis": true,
        "family": false,
        "reported_speech": false
      }
    ]
  }
]
```
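Downstream consumers will typically want to post-process this payload. As a sketch using only the standard library (with the response above inlined as a string), you could keep only the mentions that concern the patient themself and are not negated:

```python
import json

# The response body returned by the /covid route, inlined for the example.
response_body = """
[
  {"text": "Le père du patient n'est pas atteint de la covid.",
   "ents": [{"start": 43, "end": 48, "label": "covid",
             "lexical_variant": "covid", "normalized_variant": "covid",
             "negated": true, "hypothesis": false,
             "family": true, "reported_speech": false}]},
  {"text": "Probable coronavirus.",
   "ents": [{"start": 9, "end": 20, "label": "covid",
             "lexical_variant": "coronavirus", "normalized_variant": "coronavirus",
             "negated": false, "hypothesis": true,
             "family": false, "reported_speech": false}]}
]
"""

documents = json.loads(response_body)

# Keep entities that are neither negated nor attributed to a family member.
affirmed = [
    ent["lexical_variant"]
    for doc in documents
    for ent in doc["ents"]
    if not (ent["negated"] or ent["family"])
]
```

Here only `"coronavirus"` survives the filter: the first mention is both negated and about a family member. Depending on your use case, you may also want to discard hypothetical or reported mentions.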
In this section, we review a few advanced use cases, such as combining EDS-NLP's rule-based components with machine-learning models like static word vectors and Transformers.
All of EDS-NLP's ready-to-use components are rule-based. However, that does not stop you from exploiting spaCy's machine-learning capabilities: you can mix and match machine-learning pipelines, trainable or not, with EDS-NLP's rule-based components.
In this tutorial, we will explore how you can use static word vectors trained with Gensim within spaCy.
Training the word embedding itself, however, is outside the scope of this post. You'll find well-designed resources on the subject in Gensim's documentation.
spaCy provides an `init vectors` CLI utility that takes a Gensim-trained binary and converts it into a spaCy-readable pipeline. Using it is straightforward:

```console
$ spacy init vectors fr /path/to/vectors /path/to/pipeline
---> 100%
Conversion successful!
```

See the documentation for implementation details.

Using Transformer models

spaCy v3 introduced support for Transformer models through the helper library `spacy-transformers`, which interfaces with HuggingFace's `transformers` library. Using transformer models can significantly increase your model's performance.