Commit

add lx.DataArray
jpn-- committed Jan 31, 2022
1 parent d168fb2 commit 971ad15
Showing 11 changed files with 508 additions and 216 deletions.
2 changes: 2 additions & 0 deletions book/_config.yml
Original file line number Diff line number Diff line change
@@ -73,3 +73,5 @@ sphinx:
conda:
- "https://docs.conda.io/projects/conda/en/latest/"
- null
config:
bibtex_reference_style: author_year
1 change: 1 addition & 0 deletions book/_toc.yml
@@ -11,6 +11,7 @@ parts:
- file: user-guide/data-fundamentals
title: Data Fundamentals
- file: user-guide/linear-funcs
- file: bibliography

- caption: API Reference
chapters:
5 changes: 5 additions & 0 deletions book/bibliography.md
@@ -0,0 +1,5 @@
# References

```{bibliography}
:style: plain
```
4 changes: 2 additions & 2 deletions book/example/000_mtc_data.ipynb
@@ -22,7 +22,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The MTC sample dataset is the same data used in the Self Instructing Manual for discrete choice modeling:\n",
"The MTC sample dataset is the same data used in the Self Instructing Manual {cite:p}`koppelman2006self` for discrete choice modeling:\n",
"\n",
"> The San Francisco Bay Area work mode choice data set comprises 5029 home-to-work commute trips in the\n",
"> San Francisco Bay Area. The data is drawn from the San Francisco Bay Area Household Travel Survey\n",
@@ -226,4 +226,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}
175 changes: 75 additions & 100 deletions book/example/201_exville_mode_choice.ipynb
@@ -18,11 +18,9 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import numpy as np\n",
"import pandas as pd \n",
"# HIDDEN\n",
"import larch.numba as lx\n",
"from larch import P, X"
"from pytest import approx"
]
},
{
@@ -31,7 +29,11 @@
"metadata": {},
"outputs": [],
"source": [
"lx.__version__"
"import os\n",
"import numpy as np\n",
"import pandas as pd \n",
"import larch.numba as lx\n",
"from larch import P, X"
]
},
{
@@ -72,9 +74,7 @@
"The Exampville data output contains a set of files similar to what we might\n",
"find for a real travel survey: network skims, and tables of households, persons,\n",
"and tours. We'll need to connect these tables together to create a composite dataset\n",
"for mode choice model estimation.\n",
"\n",
"We can merge data from other tables using the usual pandas syntax for merging.\n"
"for mode choice model estimation, using the DataTree structure."
]
},
{
@@ -100,71 +100,20 @@
"metadata": {},
"outputs": [],
"source": [
"tour"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tour_dataset = lx.Dataset.from_idco(tour.set_index('TOURID'), alts=Mode)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tour_dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dt = lx.DataTree(tour=tour_dataset)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dt.add_dataset('hh', hh.set_index('HHID'), relationships=\"tours.HHID @ hh.HHID\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dt.add_dataset('person', pp.set_index('PERSONID'), relationships=\"tours.PERSONID @ person.PERSONID\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tour_dataset = lx.Dataset.from_idco(tour.set_index('TOURID'), alts=Mode)\n",
"od_skims = lx.Dataset.from_omx(skims)\n",
"dt.add_dataset(\n",
" 'od', od_skims, \n",
"\n",
"dt = lx.DataTree(\n",
" tour=tour_dataset,\n",
" hh=hh.set_index('HHID'),\n",
" person=pp.set_index('PERSONID'),\n",
" od=od_skims,\n",
" do=od_skims,\n",
" relationships=(\n",
" \"tours.HHID @ hh.HHID\",\n",
" \"tours.PERSONID @ person.PERSONID\",\n",
" \"hh.HOMETAZ @ od.otaz\",\n",
" \"tours.DTAZ @ od.dtaz\",\n",
" ),\n",
")\n",
"dt.add_dataset(\n",
" 'do', od_skims, \n",
" relationships=(\n",
" \"hh.HOMETAZ @ do.dtaz\",\n",
" \"tours.DTAZ @ do.otaz\",\n",
" ),\n",
@@ -281,15 +230,6 @@
"'Motor' nest."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dt_work.root_dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -379,24 +319,6 @@
"## Model Estimation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"m.loglike()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"m.data_as_loaded"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -411,7 +333,7 @@
"metadata": {},
"outputs": [],
"source": [
"# m.dataframes.choice_avail_summary()"
"m.choice_avail_summary()"
]
},
{
@@ -420,7 +342,21 @@
"metadata": {},
"outputs": [],
"source": [
"# m.dataframes.data_co.statistics()"
"# TEST\n",
"summary = m.choice_avail_summary()\n",
"assert (summary.to_markdown()) == '''\n",
"| | name | chosen | available | availability condition |\n",
"|:---------------------------|:---------|---------:|:------------|:-------------------------|\n",
"| 1 | DA | 810 | 7564 | AGE >= 16 |\n",
"| 2 | SR | 196 | 4179 | 1 |\n",
"| 3 | Walk | 72 | 7564 | WALK_TIME < 60 |\n",
"| 4 | Bike | 434 | 4199 | BIKE_TIME < 60 |\n",
"| 5 | Transit | 6862 | 7564 | TRANSIT_FARE>0 |\n",
"| 6 | Car | 268 | 7564 | |\n",
"| 7 | NonMotor | 7296 | 7564 | |\n",
"| 8 | Motor | 7564 | 7564 | |\n",
"| < Total All Alternatives > | | 6052 | | |\n",
"'''[1:-1]"
]
},
{
@@ -437,7 +373,18 @@
"metadata": {},
"outputs": [],
"source": [
"result = m.maximize_loglike(method='slsqp')"
"m.set_cap(20) # improves optimization stability\n",
"result = m.maximize_loglike()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# TEST\n",
"assert result.loglike == approx(-3493.0397298749467)"
]
},
{
@@ -474,6 +421,34 @@
"m.parameter_summary()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# TEST\n",
"assert (m.parameter_summary().data.to_markdown()) == '''\n",
"| | Value | Std Err | t Stat | Signif | Null Value |\n",
"|:------------------|--------:|----------:|---------:|:---------|-------------:|\n",
"| ASC_Bike | -0.258 | 1.34 | -0.19 | | 0 |\n",
"| ASC_SR | 1.42 | 1 | 1.42 | | 0 |\n",
"| ASC_Transit | 6.75 | 2.06 | 3.27 | ** | 0 |\n",
"| ASC_Walk | 8.62 | 1.14 | 7.57 | *** | 0 |\n",
"| Cost | -0.176 | 0.12 | -1.47 | | 0 |\n",
"| InVehTime | -0.124 | 0.0292 | -4.24 | *** | 0 |\n",
"| LogIncome:Bike | -0.197 | 0.124 | -1.59 | | 0 |\n",
"| LogIncome:SR | -0.194 | 0.135 | -1.43 | | 0 |\n",
"| LogIncome:Transit | -0.557 | 0.169 | -3.29 | *** | 0 |\n",
"| LogIncome:Walk | -0.523 | 0.1 | -5.21 | *** | 0 |\n",
"| Mu:Car | 0.259 | 0.181 | -4.1 | *** | 1 |\n",
"| Mu:Motor | 0.802 | 0.201 | -0.99 | | 1 |\n",
"| Mu:NonMotor | 0.854 | 0.112 | -1.3 | | 1 |\n",
"| NonMotorTime | -0.266 | 0.0163 | -16.29 | *** | 0 |\n",
"| OutVehTime | -0.255 | 0.0646 | -3.95 | *** | 0 |\n",
"'''[1:-1]"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -558,7 +533,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.2"
"version": "3.9.9"
},
"toc": {
"base_numbering": 1,