This software exposes the functionality of Stanford CoreNLP as an HTTP server that returns XML. This avoids the time-consuming initialization that otherwise happens every time CoreNLP is started. It is very similar to projects like this Python wrapper.
The server listens at http://localhost:8080. The text you want to analyze needs to be POSTed as the field text:
curl --data 'text=Hello world!' http://localhost:8080
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="CoreNLP-to-HTML.xsl" type="text/xsl"?>
<root>
<document>
<sentences>
<sentence id="1">
<tokens>
<token id="1">
<word>Hello</word>
<lemma>hello</lemma>
<CharacterOffsetBegin>0</CharacterOffsetBegin>
<CharacterOffsetEnd>5</CharacterOffsetEnd>
<POS>UH</POS>
<NER>O</NER>
</token>
<token id="2">
<word>world</word>
<lemma>world</lemma>
<CharacterOffsetBegin>6</CharacterOffsetBegin>
<CharacterOffsetEnd>11</CharacterOffsetEnd>
<POS>NN</POS>
<NER>O</NER>
</token>
<token id="3">
<word>!</word>
<lemma>!</lemma>
<CharacterOffsetBegin>11</CharacterOffsetBegin>
<CharacterOffsetEnd>12</CharacterOffsetEnd>
<POS>.</POS>
<NER>O</NER>
</token>
</tokens>
<parse>(ROOT (S (VP (NP (INTJ (UH Hello)) (NP (NN world)))) (. !))) </parse>
<dependencies type="basic-dependencies">
<dep type="root">
<governor idx="0">ROOT</governor>
<dependent idx="2">world</dependent>
</dep>
<dep type="discourse">
<governor idx="2">world</governor>
<dependent idx="1">Hello</dependent>
</dep>
</dependencies>
<dependencies type="collapsed-dependencies">
<dep type="root">
<governor idx="0">ROOT</governor>
<dependent idx="2">world</dependent>
</dep>
<dep type="discourse">
<governor idx="2">world</governor>
<dependent idx="1">Hello</dependent>
</dep>
</dependencies>
<dependencies type="collapsed-ccprocessed-dependencies">
<dep type="root">
<governor idx="0">ROOT</governor>
<dependent idx="2">world</dependent>
</dep>
<dep type="discourse">
<governor idx="2">world</governor>
<dependent idx="1">Hello</dependent>
</dep>
</dependencies>
</sentence>
</sentences>
</document>
</root>
Note that you can also try this online at Stanford University. Make sure you choose "XML" as the output format. The output you get there differs only slightly from the XML shown here.
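The response shown above can be consumed with any XML parser. The following is a minimal Python sketch, not part of this project, that POSTs text as the form field text and pulls tokens and dependencies out of the reply. The function names (build_request, extract_tokens, extract_dependencies, analyze) are hypothetical helpers introduced for illustration, and the element names are taken from the sample output above.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def build_request(text, url="http://localhost:8080"):
    """Build a POST request carrying the input as the form field 'text'."""
    body = urllib.parse.urlencode({"text": text}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def extract_tokens(xml_text):
    """Return (word, lemma, POS, NER) tuples for every token in the response."""
    root = ET.fromstring(xml_text)
    return [
        tuple(token.findtext(tag) for tag in ("word", "lemma", "POS", "NER"))
        for token in root.iter("token")
    ]


def extract_dependencies(xml_text, kind="basic-dependencies"):
    """Return (relation, governor, dependent) tuples for one dependency block."""
    root = ET.fromstring(xml_text)
    return [
        (dep.get("type"), dep.findtext("governor"), dep.findtext("dependent"))
        for deps in root.iter("dependencies")
        if deps.get("type") == kind
        for dep in deps.iter("dep")
    ]


def analyze(text, url="http://localhost:8080"):
    """POST the text to a running server and return the extracted tokens."""
    with urllib.request.urlopen(build_request(text, url)) as response:
        return extract_tokens(response.read())
```

With the server running, analyze("Hello world!") sends the same request as the curl command above. The parsing helpers work on any saved response, so they can be tried without a server.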
- Clone the repository:
git clone https://github.com/nlohmann/StanfordCoreNLPXMLServer.git
- Download and install the third party libraries:
cd StanfordCoreNLPXMLServer
ant libs
- Compile the JAR file:
ant jar
- Run the server:
ant run
- The server is now waiting on http://localhost:8080 for HTTP POST requests. Note that initialization can take a few minutes, because several modules and resources of Stanford CoreNLP need to be loaded.
You can also choose a port:
ant run -Dport=9000
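Because startup can take a few minutes, a client script may want to wait until the chosen port accepts connections before sending text. The following Python sketch polls the port; wait_for_port is a hypothetical helper, and it assumes the server only opens its port once initialization has finished, which this README does not confirm.

```python
import socket
import time


def wait_for_port(host="localhost", port=8080, timeout=300.0, interval=2.0):
    """Return True once a TCP connection succeeds, False if the timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For a server started with ant run -Dport=9000, the call would be wait_for_port(port=9000).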
- Oracle JDK or OpenJDK version 6 or later
- Apache Ant
The Stanford CoreNLP XML Server uses the following third party libraries:
- Stanford CoreNLP, a suite of core NLP tools
- Simple, a Java based HTTP engine
The libraries can be downloaded and set up using the ant target libs (see Installation).
- Stanford CoreNLP is licensed under the GNU General Public License (v2 or later).
- Simple is licensed under the Apache License, Version 2.0.
Due to compatibility issues (see GNU.org and Apache.org), the Stanford CoreNLP XML Server is licensed under the GNU General Public License Version 3.