This application scrapes a product’s Amazon reviews, extracts entities, and reveals sentiment.
This application uses the jsoup library to scrape and parse HTML from a URL and extract product reviews using CSS selectors.
This application communicates with the TextRazor API to extract entities.
This application communicates with the Google Cloud Natural Language API to detect sentiment score and magnitude.
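As a rough illustration of how a score and magnitude pair might be read, the sketch below maps the two numbers to a coarse label. The Natural Language API reports score in the range -1.0 to 1.0 and magnitude as a non-negative number. The class name SentimentLabel, the method describe, and both thresholds are hypothetical; they are not part of this project or of the Google client library.

```java
final class SentimentLabel {
    // Hypothetical cut-offs chosen only for illustration; tune them for your data.
    static final double SCORE_THRESHOLD = 0.25;
    static final double MIXED_MAGNITUDE = 2.0;

    /**
     * Maps a sentiment score (-1.0 to 1.0) and magnitude (0.0 and up)
     * to one of "positive", "negative", "mixed", or "neutral".
     */
    static String describe(double score, double magnitude) {
        // Written with negated range checks so that NaN inputs are rejected too.
        if (!(score >= -1.0 && score <= 1.0) || !(magnitude >= 0.0)) {
            throw new IllegalArgumentException(
                "score must be in [-1, 1] and magnitude must be >= 0");
        }
        if (score >= SCORE_THRESHOLD) {
            return "positive";
        }
        if (score <= -SCORE_THRESHOLD) {
            return "negative";
        }
        // A score near zero with a high magnitude suggests that strong positive
        // and negative statements cancel out; a low magnitude suggests little emotion.
        return magnitude >= MIXED_MAGNITUDE ? "mixed" : "neutral";
    }

    public static void main(String[] args) {
        System.out.println(describe(0.8, 3.0));
        System.out.println(describe(0.0, 4.0));
        System.out.println(describe(0.1, 0.0));
    }
}
```

A review with a near-zero score can still be strongly opinionated, which is why the sketch looks at magnitude before calling a text neutral.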
After cloning the project, you must set up the following two environment variables:
- TEXT_RAZOR_API_KEY: ***
- GOOGLE_APPLICATION_CREDENTIALS: service-account-file.json
For more info, see https://www.textrazor.com/signup & https://cloud.google.com/docs/authentication/production.
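A missing variable usually surfaces only when the first API call fails, so a start-up check can save some debugging time. The following is a minimal sketch, assuming nothing about how the project itself reads its configuration; the class EnvCheck and its method missing are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class EnvCheck {
    // The two variables named in the setup instructions above.
    static final List<String> REQUIRED =
        List.of("TEXT_RAZOR_API_KEY", "GOOGLE_APPLICATION_CREDENTIALS");

    /** Returns the required variable names that are absent or blank in env. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // System.getenv() returns the environment of the current process.
        List<String> absent = missing(System.getenv());
        if (!absent.isEmpty()) {
            System.err.println(
                "Missing environment variables: " + String.join(", ", absent));
            System.exit(1);
        }
        System.out.println("Environment looks complete.");
    }
}
```

Passing the environment in as a Map keeps the check easy to exercise without changing the real process environment.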
You may also need to install Apache Maven (https://maven.apache.org/) on your system.
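Compile:
mvn clean compile
Package the application with the Maven Assembly Plugin:
mvn clean compile assembly:single
Run:
mvn -q clean compile exec:java -Dexec.mainClass="service.Main"
Run the tests together with the Checkstyle and SpotBugs checks:
mvn clean compile test checkstyle:check spotbugs:check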
To see bug details in the SpotBugs GUI, run:
mvn spotbugs:gui
To create an XML report, run:
mvn spotbugs:spotbugs
To fail the build when SpotBugs finds bugs, run:
mvn spotbugs:check
For more info see https://spotbugs.readthedocs.io/en/latest/maven.html
Checkstyle configuration files are in the config/ directory. The Maven Checkstyle plugin is set to use the Google code style.
Generate a report in XML format:
mvn checkstyle:check
target/checkstyle-checker.xml
target/checkstyle-result.xml
Generate a report in HTML format:
mvn checkstyle:checkstyle
target/site/checkstyle.html