<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title>README</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<meta name="title" content="README"/>
<meta name="generator" content="Org-mode"/>
<meta name="generated" content="2013-05-10 00:54:43 PDT"/>
<meta name="author" content="Jim Blomo"/>
<meta name="description" content=""/>
<meta name="keywords" content=""/>
<link rel="stylesheet" type="text/css" href="slides/production/common.css" />
<link rel="stylesheet" type="text/css" href="slides/production/screen.css" media="screen" />
<link rel="stylesheet" type="text/css" href="slides/production/projection.css" media="projection" />
<link rel="stylesheet" type="text/css" href="slides/production/presenter.css" media="presenter" />
</head>
<body>
<div id="preamble">
</div>
<div id="content">
<h1 class="title">README</h1>
<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#sec-1">1 Data Mining 290</a></li>
<li><a href="#sec-2">2 Syllabus</a></li>
</ul>
</div>
</div>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1"><span class="section-number-2">1</span> Data Mining 290 <span class="tag"><span class="slide">slide</span></span></h2>
<div class="outline-text-2" id="text-1">
<dl>
<dt>Description</dt><dd>Learn how to obtain, clean, visualize, understand, and model data, and
use it to make predictions about the world around you. Grading consists of homework
(30%), a midterm (30%), and a project (40%).
</dd>
<dt>Instructor</dt><dd>Jim Blomo &lt;jblomo@ischool&gt;
</dd>
<dt>GSI</dt><dd>Shreyas &lt;shreyas@ischool&gt;
</dd>
<dt>Textbook</dt><dd>Han, J., Kamber, M., &amp; Pei, J. (2011). <span style="text-decoration:underline;">Data Mining: Concepts and Techniques</span> <b>(3rd ed.)</b>. Morgan Kaufmann.
</dd>
</dl>
</div>
</div>
<div id="outline-container-2" class="outline-2">
<h2 id="sec-2"><span class="section-number-2">2</span> Syllabus <span class="tag"><span class="slide">slide</span></span></h2>
<div class="outline-text-2" id="text-2">
<p>DM followed by a number indicates a chapter or section of the text, <span style="text-decoration:underline;">Data Mining</span>; for example, DM4 is Chapter 4 and DM11.1 is Section 11.1.
</p>
<table border="2" cellspacing="0" cellpadding="6" rules="groups" frame="hsides">
<caption></caption>
<colgroup><col class="left" /><col class="left" /><col class="left" /><col class="left" />
</colgroup>
<thead>
<tr><th scope="col" class="left">Date</th><th scope="col" class="left">Readings</th><th scope="col" class="left">Slides</th><th scope="col" class="left">Homework / Project</th></tr>
</thead>
<tbody>
<tr><td class="left">Jan 25</td><td class="left"><a href="http://try.github.com">Try Github</a> ; <a href="http://www.dataists.com/2010/09/a-taxonomy-of-data-science/">A Taxonomy of Data Science</a></td><td class="left"><a href="slides/2013-01-25-Intro.html">Class Intro</a> ; Tools Intro by <i>GUEST: Shreyas</i></td><td class="left"><a href="#https-github.com-seekshreyas-Introduction-to-Git-Github">Git Intro</a></td></tr>
<tr><td class="left">Feb 1</td><td class="left">DM1 ; <a href="http://hbswk.hbs.edu/item/6836.html">The Yelp Factor: Are Consumer Reviews Good for Business?</a></td><td class="left"><a href="slides/2013-02-01-CaseStudies.html">Case Studies</a> ; <a href="slides/2013-02-01-Obtaining-Data.html">Obtaining Data</a></td><td class="left"><a href="slides/2013-02-01-Lab.html">Obtain &amp; Explore Data</a></td></tr>
<tr><td class="left">Feb 8</td><td class="left">DM2, DM3</td><td class="left"><a href="slides/2013-02-08-Probability.html">Probability</a> ; <a href="slides/2013-02-08-Preprocessing.html">Preprocessing</a></td><td class="left"><a href="slides/2013-02-08-Lab.html">Data Stats</a></td></tr>
<tr><td class="left">Feb 15</td><td class="left">DM4, <a href="http://www.youtube.com/watch?v=SS27F-hYWfU">Apache Hadoop: Petabytes and Terawatts</a> (<a href="http://prezi.com/u0ukvqzpyh5p/apache-hadoop-petabytes-and-terawatts/">slides</a>); <a href="http://packages.python.org/mrjob/">mrjob docs</a> (for homework)</td><td class="left"><a href="slides/2013-02-15-Data-Warehouse.html">Data Warehouse</a> ; <a href="slides/2013-02-15-MapReduce.html">MapReduce</a></td><td class="left"><a href="slides/2013-02-15-Project.html">Project Details</a> ; <a href="slides/2013-02-15-mrjob.html">mrjob</a></td></tr>
<tr><td class="left">Feb 22</td><td class="left">DM8</td><td class="left"><a href="slides/2013-02-22-Decision-Trees.html">Decision Trees</a>; <a href="slides/2013-02-22-Bayes.html">Naive Bayes</a></td><td class="left"><a href="slides/2013-02-22-Gini.html">Gini Index</a></td></tr>
<tr><td class="left">Mar 1</td><td class="left">DM9.1-9.3, DM9.5 ; <a href="http://scott.fortmann-roe.com/docs/BiasVariance.html">Understanding the Bias-Variance Tradeoff</a></td><td class="left"><a href="slides/2013-03-01-SVM.html">SVM</a> ; <a href="slides/2013-03-01-Neural-Network.html">Neural Networks</a></td><td class="left"><a href="slides/2013-03-01-Lab-NN.html">Neural Network Back Propagation</a></td></tr>
<tr><td class="left">Mar 8</td><td class="left">DM10</td><td class="left"><a href="slides/2013-03-07-Clustering.html">Agglomerative - Clustering</a> ; <a href="slides/2013-03-07-Hierarchical.html">Hierarchical, Density - Clustering</a></td><td class="left"><a href="slides/2013-03-07-k-means.html">K-Means</a></td></tr>
<tr><td class="left">Mar 15</td><td class="left">DM11.1</td><td class="left"><a href="slides/2013-03-15-Review.html">Review</a></td><td class="left">prepare 1 cheat sheet</td></tr>
<tr><td class="left">Mar 22</td><td class="left">1 cheat sheet</td><td class="left"><b>Midterm</b></td><td class="left">-</td></tr>
<tr><td class="left">Mar 29</td><td class="left">HOLIDAY</td><td class="left">-</td><td class="left">-</td></tr>
<tr><td class="left">Apr 5</td><td class="left">DM6</td><td class="left"><a href="slides/2013-03-15-Advanced-Cluster.html">Advanced Clustering</a> ; <a href="slides/2013-04-05-Frequent-Pattern.html">Frequent Pattern</a></td><td class="left"><a href="slides/2013-04-05-AWS.html">AWS</a> ; Project Proposal Due</td></tr>
<tr><td class="left">Apr 12</td><td class="left">DM11.3; <a href="http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf">PageRank</a>; <a href="http://arxiv.org/pdf/1106.5321">Uncovering Social Network Sybils in the Wild</a></td><td class="left"><a href="slides/2013-04-12-Graphs.html">Graphs</a>; <a href="slides/2013-04-12-PageRank.html">PageRank</a></td><td class="left"><a href="slides/2013-04-12-AdjacencyRepresentations.html">Adjacency Representations</a></td></tr>
<tr><td class="left">Apr 19</td><td class="left"><a href="slides/2013-04-19-Nonlinear.pdf">Non-linear regression</a></td><td class="left">GUEST: Gene Lee, Caesars: <a href="slides/RM Pricing Strategy.ppt">Pricing Strategy</a>; <a href="slides/Campus Recruiting Deck_2012_UC Berkeley.ppt">Caesars Recruiting</a></td><td class="left"><a href="slides/2013-04-19-Elasticity.html">Price Elasticity</a></td></tr>
<tr><td class="left">Apr 26</td><td class="left">DM12; <a href="http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf">Shazam Audio Search</a></td><td class="left"><a href="slides/2013-04-26-Outliers.html">Outliers</a>; <a href="slides/2013-04-26-Multimedia.html">Images &amp; Audio</a></td><td class="left"><a href="slides/2013-04-26-Midterm-HW.html">Midterm Review</a></td></tr>
<tr><td class="left">May 3</td><td class="left"><a href="https://groups.google.com/group/gsofgs/attach/2f1cdd7a999c3ad8/embedded-plots.pdf?part=2&amp;authuser=0">Embedded Plots</a> ; <a href="http://vis.stanford.edu/files/2011-D3-InfoVis.pdf">Data-Driven Documents</a></td><td class="left"><a href="slides/2013-05-03-Visualization.html">Visualization</a> ; <a href="slides/2013-05-03-Yelp-Visualization.html">Yelp's Visualizations</a></td><td class="left"><a href="http://vogievetsky.github.io/IntroD3/">D3 Intro</a> ; <a href="slides/2013-05-03-D3.html">D3 Lab</a></td></tr>
<tr><td class="left">May 10</td><td class="left"><a href="http://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf">A Few Useful Things to Know about Machine Learning</a> ; <a href="http://www.cs.uvm.edu/~icdm/algorithms/10Algorithms-08.pdf">Top 10 Algorithms in Data Mining</a></td><td class="left"><a href="slides/2013-05-10-Real-World.html">In Real Life</a> ; Presentations</td><td class="left">May 16th: Project Papers Due</td></tr>
<tr><td class="left">May 17</td><td class="left">-</td><td class="left">Final Presentation</td><td class="left">Bye!</td></tr>
</tbody>
</table>
<script type="text/javascript" src="slides/production/org-html-slideshow.js"></script>
<a href="https://github.com/jblomo/datamining290"><img style="position: absolute; top: 0; right: 0; border: 0;" src="https://s3.amazonaws.com/github/ribbons/forkme_right_darkblue_121621.png" alt="Fork me on GitHub" /></a>
</div>
</div>
</div>
<div id="postamble">
<p class="date">Date: 2013-05-10 00:54:43 PDT</p>
<p class="author">Author: Jim Blomo</p>
<p class="creator">Org version 7.8.02 with Emacs version 23</p>
<a href="http://validator.w3.org/check?uri=referer">Validate XHTML 1.0</a>
</div>
</body>
</html>