Commit

medium final
jjone36 committed Dec 24, 2018
1 parent 5f32969 commit 45b9f97
Showing 3 changed files with 26,153 additions and 410 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
Online Retail.xlsx
241 changes: 117 additions & 124 deletions Cohort_Anaylsis_1.ipynb
@@ -1,12 +1,5 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Customer Segmentation Anaylsis"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -98,77 +91,77 @@
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>548670</td>\n",
" <td>21975</td>\n",
" <td>PACK OF 60 DINOSAUR CAKE CASES</td>\n",
" <td>1</td>\n",
" <td>2011-04-01 15:37:00</td>\n",
" <td>0.55</td>\n",
" <td>15356.0</td>\n",
" <td>542401</td>\n",
" <td>22502</td>\n",
" <td>PICNIC BASKET WICKER SMALL</td>\n",
" <td>2</td>\n",
" <td>2011-01-27 15:51:00</td>\n",
" <td>5.95</td>\n",
" <td>14541.0</td>\n",
" <td>United Kingdom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>580513</td>\n",
" <td>23321</td>\n",
" <td>SMALL WHITE HEART OF WICKER</td>\n",
" <td>542231</td>\n",
" <td>22726</td>\n",
" <td>ALARM CLOCK BAKELIKE GREEN</td>\n",
" <td>2</td>\n",
" <td>2011-12-04 13:59:00</td>\n",
" <td>1.65</td>\n",
" <td>14456.0</td>\n",
" <td>2011-01-26 13:40:00</td>\n",
" <td>3.75</td>\n",
" <td>16714.0</td>\n",
" <td>United Kingdom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>568531</td>\n",
" <td>23505</td>\n",
" <td>PLAYING CARDS I LOVE LONDON</td>\n",
" <td>3</td>\n",
" <td>2011-09-27 13:49:00</td>\n",
" <td>1.25</td>\n",
" <td>16713.0</td>\n",
" <td>556956</td>\n",
" <td>22090</td>\n",
" <td>PAPER BUNTING RETROSPOT</td>\n",
" <td>40</td>\n",
" <td>2011-06-16 09:04:00</td>\n",
" <td>2.55</td>\n",
" <td>13694.0</td>\n",
" <td>United Kingdom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>544178</td>\n",
" <td>21411</td>\n",
" <td>GINGHAM HEART DOORSTOP RED</td>\n",
" <td>3</td>\n",
" <td>2011-02-16 14:40:00</td>\n",
" <td>4.25</td>\n",
" <td>14543.0</td>\n",
" <td>573874</td>\n",
" <td>23581</td>\n",
" <td>JUMBO BAG PAISLEY PARK</td>\n",
" <td>10</td>\n",
" <td>2011-11-01 12:45:00</td>\n",
" <td>2.08</td>\n",
" <td>13868.0</td>\n",
" <td>United Kingdom</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>565381</td>\n",
" <td>22457</td>\n",
" <td>NATURAL SLATE HEART CHALKBOARD</td>\n",
" <td>6</td>\n",
" <td>2011-09-02 15:23:00</td>\n",
" <td>2.95</td>\n",
" <td>16173.0</td>\n",
" <td>580742</td>\n",
" <td>23343</td>\n",
" <td>JUMBO BAG VINTAGE CHRISTMAS</td>\n",
" <td>200</td>\n",
" <td>2011-12-06 09:30:00</td>\n",
" <td>1.75</td>\n",
" <td>13694.0</td>\n",
" <td>United Kingdom</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" InvoiceNo StockCode Description Quantity \\\n",
"0 548670 21975 PACK OF 60 DINOSAUR CAKE CASES 1 \n",
"1 580513 23321 SMALL WHITE HEART OF WICKER 2 \n",
"2 568531 23505 PLAYING CARDS I LOVE LONDON 3 \n",
"3 544178 21411 GINGHAM HEART DOORSTOP RED 3 \n",
"4 565381 22457 NATURAL SLATE HEART CHALKBOARD 6 \n",
" InvoiceNo StockCode Description Quantity \\\n",
"0 542401 22502 PICNIC BASKET WICKER SMALL 2 \n",
"1 542231 22726 ALARM CLOCK BAKELIKE GREEN 2 \n",
"2 556956 22090 PAPER BUNTING RETROSPOT 40 \n",
"3 573874 23581 JUMBO BAG PAISLEY PARK 10 \n",
"4 580742 23343 JUMBO BAG VINTAGE CHRISTMAS 200 \n",
"\n",
" InvoiceDate UnitPrice CustomerID Country \n",
"0 2011-04-01 15:37:00 0.55 15356.0 United Kingdom \n",
"1 2011-12-04 13:59:00 1.65 14456.0 United Kingdom \n",
"2 2011-09-27 13:49:00 1.25 16713.0 United Kingdom \n",
"3 2011-02-16 14:40:00 4.25 14543.0 United Kingdom \n",
"4 2011-09-02 15:23:00 2.95 16173.0 United Kingdom "
"0 2011-01-27 15:51:00 5.95 14541.0 United Kingdom \n",
"1 2011-01-26 13:40:00 3.75 16714.0 United Kingdom \n",
"2 2011-06-16 09:04:00 2.55 13694.0 United Kingdom \n",
"3 2011-11-01 12:45:00 2.08 13868.0 United Kingdom \n",
"4 2011-12-06 09:30:00 1.75 13694.0 United Kingdom "
]
},
"execution_count": 4,
@@ -179,7 +172,7 @@
"source": [
"# use a subset of full data\n",
"np.random.seed(306)\n",
"online = online.sample(frac = .2).reset_index(drop = True)\n",
"online = online.sample(frac = .3).reset_index(drop = True)\n",
"online.head()"
]
},
@@ -191,11 +184,11 @@
{
"data": {
"text/plain": [
"0 2011-04-01\n",
"1 2011-12-04\n",
"2 2011-09-27\n",
"3 2011-02-16\n",
"4 2011-09-02\n",
"0 2011-01-27\n",
"1 2011-01-26\n",
"2 2011-06-16\n",
"3 2011-11-01\n",
"4 2011-12-06\n",
"Name: InvoiceDay, dtype: datetime64[ns]"
]
},
@@ -212,16 +205,16 @@
},
{
"cell_type": "code",
"execution_count": 85,
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4116"
"4222"
]
},
"execution_count": 85,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -233,7 +226,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 8,
"metadata": {},
"outputs": [
{
@@ -272,97 +265,97 @@
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>548670</td>\n",
" <td>21975</td>\n",
" <td>PACK OF 60 DINOSAUR CAKE CASES</td>\n",
" <td>1</td>\n",
" <td>2011-04-01 15:37:00</td>\n",
" <td>0.55</td>\n",
" <td>15356.0</td>\n",
" <td>542401</td>\n",
" <td>22502</td>\n",
" <td>PICNIC BASKET WICKER SMALL</td>\n",
" <td>2</td>\n",
" <td>2011-01-27 15:51:00</td>\n",
" <td>5.95</td>\n",
" <td>14541.0</td>\n",
" <td>United Kingdom</td>\n",
" <td>2011-04-01</td>\n",
" <td>2010-12-06</td>\n",
" <td>2011-01-27</td>\n",
" <td>2011-01-24</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>580513</td>\n",
" <td>23321</td>\n",
" <td>SMALL WHITE HEART OF WICKER</td>\n",
" <td>542231</td>\n",
" <td>22726</td>\n",
" <td>ALARM CLOCK BAKELIKE GREEN</td>\n",
" <td>2</td>\n",
" <td>2011-12-04 13:59:00</td>\n",
" <td>1.65</td>\n",
" <td>14456.0</td>\n",
" <td>2011-01-26 13:40:00</td>\n",
" <td>3.75</td>\n",
" <td>16714.0</td>\n",
" <td>United Kingdom</td>\n",
" <td>2011-12-04</td>\n",
" <td>2011-07-20</td>\n",
" <td>2011-01-26</td>\n",
" <td>2011-01-26</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>568531</td>\n",
" <td>23505</td>\n",
" <td>PLAYING CARDS I LOVE LONDON</td>\n",
" <td>3</td>\n",
" <td>2011-09-27 13:49:00</td>\n",
" <td>1.25</td>\n",
" <td>16713.0</td>\n",
" <td>556956</td>\n",
" <td>22090</td>\n",
" <td>PAPER BUNTING RETROSPOT</td>\n",
" <td>40</td>\n",
" <td>2011-06-16 09:04:00</td>\n",
" <td>2.55</td>\n",
" <td>13694.0</td>\n",
" <td>United Kingdom</td>\n",
" <td>2011-09-27</td>\n",
" <td>2010-12-08</td>\n",
" <td>2011-06-16</td>\n",
" <td>2010-12-01</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>544178</td>\n",
" <td>21411</td>\n",
" <td>GINGHAM HEART DOORSTOP RED</td>\n",
" <td>3</td>\n",
" <td>2011-02-16 14:40:00</td>\n",
" <td>4.25</td>\n",
" <td>14543.0</td>\n",
" <td>573874</td>\n",
" <td>23581</td>\n",
" <td>JUMBO BAG PAISLEY PARK</td>\n",
" <td>10</td>\n",
" <td>2011-11-01 12:45:00</td>\n",
" <td>2.08</td>\n",
" <td>13868.0</td>\n",
" <td>United Kingdom</td>\n",
" <td>2011-02-16</td>\n",
" <td>2010-12-10</td>\n",
" <td>2011-11-01</td>\n",
" <td>2011-11-01</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>565381</td>\n",
" <td>22457</td>\n",
" <td>NATURAL SLATE HEART CHALKBOARD</td>\n",
" <td>6</td>\n",
" <td>2011-09-02 15:23:00</td>\n",
" <td>2.95</td>\n",
" <td>16173.0</td>\n",
" <td>580742</td>\n",
" <td>23343</td>\n",
" <td>JUMBO BAG VINTAGE CHRISTMAS</td>\n",
" <td>200</td>\n",
" <td>2011-12-06 09:30:00</td>\n",
" <td>1.75</td>\n",
" <td>13694.0</td>\n",
" <td>United Kingdom</td>\n",
" <td>2011-09-02</td>\n",
" <td>2011-09-02</td>\n",
" <td>2011-12-06</td>\n",
" <td>2010-12-01</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" InvoiceNo StockCode Description Quantity \\\n",
"0 548670 21975 PACK OF 60 DINOSAUR CAKE CASES 1 \n",
"1 580513 23321 SMALL WHITE HEART OF WICKER 2 \n",
"2 568531 23505 PLAYING CARDS I LOVE LONDON 3 \n",
"3 544178 21411 GINGHAM HEART DOORSTOP RED 3 \n",
"4 565381 22457 NATURAL SLATE HEART CHALKBOARD 6 \n",
" InvoiceNo StockCode Description Quantity \\\n",
"0 542401 22502 PICNIC BASKET WICKER SMALL 2 \n",
"1 542231 22726 ALARM CLOCK BAKELIKE GREEN 2 \n",
"2 556956 22090 PAPER BUNTING RETROSPOT 40 \n",
"3 573874 23581 JUMBO BAG PAISLEY PARK 10 \n",
"4 580742 23343 JUMBO BAG VINTAGE CHRISTMAS 200 \n",
"\n",
" InvoiceDate UnitPrice CustomerID Country InvoiceDay \\\n",
"0 2011-04-01 15:37:00 0.55 15356.0 United Kingdom 2011-04-01 \n",
"1 2011-12-04 13:59:00 1.65 14456.0 United Kingdom 2011-12-04 \n",
"2 2011-09-27 13:49:00 1.25 16713.0 United Kingdom 2011-09-27 \n",
"3 2011-02-16 14:40:00 4.25 14543.0 United Kingdom 2011-02-16 \n",
"4 2011-09-02 15:23:00 2.95 16173.0 United Kingdom 2011-09-02 \n",
"0 2011-01-27 15:51:00 5.95 14541.0 United Kingdom 2011-01-27 \n",
"1 2011-01-26 13:40:00 3.75 16714.0 United Kingdom 2011-01-26 \n",
"2 2011-06-16 09:04:00 2.55 13694.0 United Kingdom 2011-06-16 \n",
"3 2011-11-01 12:45:00 2.08 13868.0 United Kingdom 2011-11-01 \n",
"4 2011-12-06 09:30:00 1.75 13694.0 United Kingdom 2011-12-06 \n",
"\n",
" CohortDay \n",
"0 2010-12-06 \n",
"1 2011-07-20 \n",
"2 2010-12-08 \n",
"3 2010-12-10 \n",
"4 2011-09-02 "
"0 2011-01-24 \n",
"1 2011-01-26 \n",
"2 2010-12-01 \n",
"3 2011-11-01 \n",
"4 2010-12-01 "
]
},
"execution_count": 6,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -377,7 +370,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"`InvoiceDay` is the date the customr made an order and `CohortDay` is the date that each customer made the first order. Cause the `CohortDay` is made by grouping for each customer, you can understand 'cohort' as each customer. To seperate the date columns into year, month and day part by defining a funtion to extract the year, month, and day from date columns."
"`InvoiceDay` is the date the customer placed an order and `CohortDay` is the date of each customer's first order. To separate the date columns into year, month and day parts, define a function that extracts the year, month, and day from a date column."
]
},
{
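The markdown cell changed in this commit describes `InvoiceDay` and `CohortDay`, but the diff does not show the code that builds them. Below is a minimal sketch of one way to derive both columns and split them into year, month and day with pandas. The toy rows, the `get_date_parts` helper name, and the use of `dt.normalize()` are assumptions for illustration, not code taken from the notebook.

```python
import pandas as pd

# Toy stand-in for the notebook's `online` frame.
# The rows are illustrative only; the first timestamp is made up.
online = pd.DataFrame({
    "CustomerID": [13694.0, 13694.0, 14541.0, 14541.0],
    "InvoiceDate": pd.to_datetime([
        "2010-12-01 08:26:00", "2011-06-16 09:04:00",
        "2011-01-24 10:00:00", "2011-01-27 15:51:00",
    ]),
})

# InvoiceDay: the order timestamp truncated to midnight (assumed approach).
online["InvoiceDay"] = online["InvoiceDate"].dt.normalize()

# CohortDay: each customer's earliest InvoiceDay, broadcast back to every row.
online["CohortDay"] = online.groupby("CustomerID")["InvoiceDay"].transform("min")


def get_date_parts(df, column):
    """Return the year, month and day of a datetime column as three Series.

    Hypothetical helper; the notebook's own function is not visible in the diff.
    """
    dates = df[column].dt
    return dates.year, dates.month, dates.day


invoice_year, invoice_month, invoice_day = get_date_parts(online, "InvoiceDay")
cohort_year, cohort_month, cohort_day = get_date_parts(online, "CohortDay")
```

Using `transform("min")` keeps one `CohortDay` value per row, which matches the shape of the `online.head()` output in the diff, where customer 13694.0 shows the same `CohortDay` on both of its rows.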