---
format: gfm
engine: knitr
execute:
warning: false
message: false
echo: false
---
<!-- README.md is generated by README.qmd. Please edit that file. -->
# Testing the gradient predictions of the monocentric city model in Addis Ababa
::: callout-note
This repo contains **replication code** and **data** for the paper [Beze (2024)](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4803607).
The real estate dataset is available on [Zenodo](https://zenodo.org/records/11205969). For more details, see the [data section](#housing-data) below.
> Listen to an AI-generated podcast about this project, thanks to [NotebookLM](https://notebooklm.google.com).
[](https://github-production-user-asset-6210df.s3.amazonaws.com/49317723/399504592-cd420903-d663-453b-8fad-48988c7df2c5.webm)
:::
## Requirements
::: callout-tip
The order in which the scripts should be run is provided in [script/main.sh](./script/main.sh).
:::
<details>
<summary>Expand</summary>
```{r}
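out = system(r"(python3 -c 'import json
import sys
with open("renv.lock", "r") as f:
    r_version = json.load(f)["R"]["Version"]
# major.minor of the running interpreter; slicing sys.version breaks for 3.9 and earlier
py_version = "{}.{}".format(*sys.version_info[:2])
print(r_version, py_version)')", intern = TRUE
)
# `out` is a single string of the form "<R version> <Python version>"
r_version = strsplit(out, " ")[[1]][1]
py_version = strsplit(out, " ")[[1]][2]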
```
- R `{r} r_version`
> The necessary R packages are listed in the `renv.lock` file. You can install them by running the following command in the R console:
>
> ```r
> # renv::init() # to initialize renv on the project if you don't clone the repo
> renv::restore()
> ```
- Python `{r} py_version`
> The necessary Python packages are listed in the `requirements.txt` file. You can install them with [uv](https://github.com/astral-sh/uv):
>
> ``` bash
> uv pip install -r requirements.txt
> ```
</details>
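
To check a local setup against the versions reported above, the same lookup can be done directly in Python. This is a minimal sketch, assuming `renv.lock` sits in the working directory and stores the R version under `["R"]["Version"]`, as the chunk above does. The helper names `read_r_version` and `python_major_minor` are hypothetical and not part of this repo.

```python
import json
import sys
from pathlib import Path


def read_r_version(lock_path="renv.lock"):
    """Return the R version string recorded in an renv lockfile."""
    with open(lock_path, "r", encoding="utf-8") as f:
        return json.load(f)["R"]["Version"]


def python_major_minor():
    """Return the running interpreter's version as 'major.minor'."""
    return "{}.{}".format(*sys.version_info[:2])


if __name__ == "__main__":
    # Only read the lockfile if it exists, so the script also runs outside the repo.
    if Path("renv.lock").exists():
        print("R (from renv.lock):", read_r_version())
    print("Python (current interpreter):", python_major_minor())
```

Run it from the repository root so that the relative path to `renv.lock` resolves.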
## Data
The data used in the analysis consists of two main parts: real estate data and building footprint data.
### Housing data {#housing-data}
::: callout-important
## Data availability {#data-availability}
The dataset has been published on Zenodo and can be accessed [here](https://zenodo.org/records/11205969).
:::
<details>
<summary>Variable description</summary>
```{r}
#| label: tbl-var-description
#| tbl-cap: "Variable description"
#| output: asis
library(tinytable)
description = read.csv("./data/data_description.csv")
description |>
  tt() |>
  format_tt(markdown = TRUE, escape = TRUE) |>
  theme_tt(theme = "resize", direction = "down")
```
</details>
If you want to reproduce the data using the scripts, you can follow the steps in [script/main.sh](./script/main.sh).
> If the scripts run successfully, they produce the following:
>
> - [data/housing/processed/listings_cleaned.csv](data/housing/processed/listings_cleaned.csv): a cleaned version of the scraped data from all providers, from which the primary dataset for the analysis is constructed. The raw data is available in [data/housing/raw](data/housing/raw) for the providers included in the analysis.
> - [data/housing/processed/structured/tidy](data/housing/processed/structured/tidy/): the data after missing attributes are imputed using [`Gemini Pro`](https://github.com/google-gemini/generative-ai-python).
> - [data/housing/processed/tidy/listings_cleaned_tidy\_\_geocoded.csv](data/housing/processed/tidy/listings_cleaned_tidy__geocoded.csv): the georeferenced data, with property addresses geocoded using [Google Places API](https://developers.google.com/maps/documentation/places/web-service) and [OSM Nominatim](https://nominatim.openstreetmap.org/ui/search.html).
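
As an illustration of the geocoding step described above, the sketch below builds a request URL for the public Nominatim search endpoint. It is not the code used in this repo: the function name `nominatim_search_url`, the example address, and the default country code `et` are assumptions made for illustration. The sketch only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

# Public Nominatim search endpoint (OpenStreetMap).
NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def nominatim_search_url(address, country_code="et", limit=1):
    """Build a Nominatim search URL for a free-form address.

    `country_code` restricts results to one country (assumed here: Ethiopia),
    and `limit` caps the number of returned candidates.
    """
    params = {
        "q": address,
        "format": "jsonv2",
        "countrycodes": country_code,
        "limit": limit,
    }
    return NOMINATIM_SEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical address string, not taken from the dataset.
    print(nominatim_search_url("Bole, Addis Ababa"))
```

When sending such requests, the Nominatim usage policy asks clients to identify themselves with a descriptive `User-Agent` header and to stay at or below one request per second.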
::: callout-important
During web scraping, I tried to respect each website's `robots.txt` file. See the saved files in [data/housing/robots_txt](data/housing/robots_txt/).
:::
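
A `robots.txt` check of this kind can be sketched with Python's standard library. This is only an illustration, not the scraper used in this repo: the helper `is_allowed`, the example rules, and the example URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so a saved copy can be checked offline.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration only.
EXAMPLE_ROBOTS_TXT = """User-agent: *
Disallow: /admin/
"""

if __name__ == "__main__":
    for url in ("https://example.com/listings/1", "https://example.com/admin/page"):
        print(url, is_allowed(EXAMPLE_ROBOTS_TXT, url))
```

Because the function takes the file contents as text, it can be pointed at a saved `robots.txt` copy instead of fetching the live file.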
<details>
<summary>A list of real estate providers in Addis</summary>
```{r}
#| label: tbl-real-estate-providers
#| tbl-cap: "List of online real estate providers"
#| output: asis
providers = read.csv("./data/housing/real_estate_providers.csv", header = TRUE)
# Trim stray whitespace in text columns only, so numeric columns such as `num_ads` keep their type
is_chr = vapply(providers, is.character, logical(1))
providers[is_chr] = lapply(providers[is_chr], trimws)
notes = "Notes: The number of ads is as of April 2024. Qefira shut down in June 2023."
providers |>
  transform(name = sprintf("[%s](%s)", name, url)) |>
  (\(.x) .x[, c("name", "num_ads")])() |>
  sort_by(~ list(-num_ads)) |> # sort_by() requires R >= 4.4
  tt(notes = notes) |>
  format_tt(markdown = TRUE, escape = TRUE) |>
  theme_tt(theme = "resize", direction = "down")
```
</details>
### Building footprint datasets
The building variables are extracted from two sources:
- The German Aerospace Center ([DLR](https://www.dlr.de/en)): the World Settlement Footprint [(WSF) 3D](https://geoservice.dlr.de/web/maps/eoc:wsf3d) and [WSF 2019v1](https://download.geoservice.dlr.de/WSF2019/) datasets.
- [Open Buildings](https://sites.research.google/open-buildings/) from Google.
## Citation
Please cite the paper or dataset for any use of the code or data in this repository.
```bibtex
@article{Beze_2024,
title = {Testing the Gradient Predictions of the Monocentric City Model in Addis Ababa},
ISSN = {1556-5068},
url = {https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4803607},
DOI = {10.2139/ssrn.4803607},
journal = {SSRN Electronic Journal},
publisher = {Elsevier BV},
author = {Beze, Eyayaw},
year = {2024}
}
```
```bibtex
@misc{Beze_2024_dataset,
title = {Georeferenced real estate data for Addis Ababa},
author = {Beze, Eyayaw},
year = {2024},
doi = {10.5281/ZENODO.11205969},
url = {https://zenodo.org/doi/10.5281/zenodo.11205969},
publisher = {Zenodo},
copyright = {Creative Commons Attribution 4.0 International}
}
```