<!DOCTYPE html>
<html lang="en">
<head>
<title>NLP and WordCloud</title>
<link rel="stylesheet" href="vendor/tachyons/tachyons.min.css">
<script src="js/compromise.js"></script>
</head>
<body class="Roboto pa1 bg-near-white">
<div class="f4 link tl"><a href="./index.html">Home</a></div>
<section class="cf w-100 pv2 flex flex-wrap">
<div class="fl w-100 f4 tc">
<h1>NLP and WordCloud</h1>
<p class="f4 lh-copy">
I had answered 829 questions on Quora, then promptly scraped almost all of those answers and created my own simple GitHub page. I wanted to do more with what I have written, and the first step is to create word clouds.
Word cloud algorithms offer a distinctive solution to the problem of placing words of different sizes and shapes. Jason Davies' blog post on creating word clouds is eye-opening, and his repo on d3.cloud() is the center of attraction in this notebook.
</p>
</div>
</section>
<div id="observablehq-nlpCloud-7ba061a0"></div>
<script type="module">
import {Runtime, Inspector} from "https://cdn.jsdelivr.net/npm/@observablehq/runtime@4/dist/runtime.js";
import define from "https://api.observablehq.com/@solverbot/word-clouding-quora-answers-with-d3.js?v=3";
new Runtime().module(define, name => {
if (name === "nlpCloud") return new Inspector(document.querySelector("#observablehq-nlpCloud-7ba061a0"));
});
</script>
<section>
<div class="fl w-100 f4">
<h3>Libraries Used:</h3>
<ul>
<li>Compromise by Spencer Kelly</li>
<li>d3 Cloud by Jason Davies</li>
<li>Observablehq Notebook</li>
</ul>
<p class="f4 lh-copy">Importing the word cloud directly from the Observable notebook still had some challenges. One is that the HTML section element can constrain how the word cloud SVG is rendered. When the imported element was inside the section element, the DOM was populated with all the sub-elements, but they were not visible on the page. After moving the imported element outside the section element, the cloud rendered.
</p>
<p class="f4 lh-copy">
After reading through multiple notebooks on NLP libraries, the <a href="https://observablehq.com/@randomfractals/nlp-word-cloud">NLP with Compromise</a> notebook nailed it. Compromise does the heavy lifting, with helper functions that get word frequencies out of the answers and put them on the cloud itself.</p>
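<p class="f4 lh-copy">The word-frequency step can be sketched in plain JavaScript. This is only a minimal sketch, assuming simple regex tokenization and a tiny hypothetical stop-word list. It does not use Compromise's API, and the names <code>wordFrequencies</code> and <code>STOP_WORDS</code> are illustrative, not taken from the notebook.</p>

```javascript
// Hypothetical stop-word list for illustration; a real one would be much longer.
const STOP_WORDS = new Set(["the", "and", "that", "with", "for", "are", "was"]);

// Count how often each word appears across a list of answer strings.
// Returns an array of { text, count } objects, most frequent first.
function wordFrequencies(answers, minLength = 3) {
  const counts = new Map();
  for (const answer of answers) {
    // Lowercase the text, then extract runs of letters
    // (an apostrophe is allowed inside a word, as in "don't").
    const words = answer.toLowerCase().match(/[a-z]+(?:'[a-z]+)*/g) || [];
    for (const word of words) {
      if (minLength > word.length) continue; // skip very short words
      if (STOP_WORDS.has(word)) continue;    // skip common filler words
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  // Convert the Map entries to objects and sort by descending count.
  return Array.from(counts, ([text, count]) => ({ text, count }))
    .sort((a, b) => b.count - a.count);
}

// Usage with two made-up sentences standing in for scraped answers.
const sample = [
  "Word clouds place words of many sizes.",
  "The words with higher counts get bigger sizes.",
];
const top = wordFrequencies(sample);
console.log(top.slice(0, 2));
```

<p class="f4 lh-copy">An array of word and count pairs like this is the kind of input a cloud layout needs, since each count can be mapped to a font size before the words are placed.</p>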
<p class="f4 lh-copy">The d3-cloud library that is native inside Observable is different from the library that is referred to by the books
and tutorials out there. Custom layouts like the word cloud need additional study and a lot more time to get right. My study here is still <strong>incomplete.</strong> I have decided to spend a lot more time programming with text in JavaScript going forward.
</p>
</div>
</section>
</body>