Commit 0d797e1 (1 parent: b7d3c03)
Showing 13 changed files with 9,389 additions and 0 deletions.
@@ -0,0 +1,14 @@
# india-bridge

India bridge is a collection of Stata ado programs and datasets that track district changes in India between 1951 and 2011.

## Directory structure

- /ado contains `indiabridge.ado`, a Stata program that assigns district- and state-level identifiers for a given Census year.
- /data contains a bridge/crosswalk Stata .dta file and an Excel file that can be used to track these changes.
- /docs contains the references and documentation used to build india-bridge.

## Usage
Running `help indiabridge` in Stata opens the help file, which documents the syntax and gives examples of use.
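
A minimal session might look like the sketch below. It assumes the ado files are already on Stata's adopath and that the dataset in memory holds string variables named `state_name` and `district_name` (the names used in the help file's example). The output variable names are taken from the progress messages that `indiabridge` prints, so treat them as illustrative rather than guaranteed.

```stata
* one-time setup: indiabridge checks that the user-written egenmore package is installed
ssc install egenmore

* assign 2001 Census identifiers to the state and district name variables
indiabridge, y(2001) s(state_name) d(district_name)

* inspect the identifier variables named in the program's progress messages
describe iso_state_name scode_state_name ut_state_name dcode_district_name
```

The district subprogram renames the original district variable with a `_raw` suffix before cleaning it, so the unmodified names should remain available in `district_name_raw` after the call.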

Last updated: Feb 01, 2022
Large diffs are not rendered by default.
@@ -0,0 +1,107 @@
*-------------------------------------------------------------------------------
* Objective: Assign india_bridge consistent identifiers
*-------------------------------------------------------------------------------

* Input: year, state, district (year must be in YYYY entered directly or as a numeric var)
* indiabridge, y() s() d()

* define program
capture program drop indiabridge
program define indiabridge
    * syntax: state and district variables must be string, and year must be specified
    syntax [if], Year(string) State(varlist string) District(varlist string)

    * confirm if dependency -egenmore- is available
    capture findfile egenmore.sthlp, path(BASE;SITE;PERSONAL;PLUS)
    * return error if not available
    if "`r(fn)'" == "" {
        di as error "User-written package -egenmore- needs to be installed first;"
        di as error "Ensure the dependency is available by running -ssc install egenmore- before indiabridge"
        exit 498
    }

    * display progress
    di as text _dup(99) "_"
    di as text "Running india-bridge for year (`year') on state variable/s (`state') and district variable/s (`district')"
    di as text _dup(99) "_"

    * pass arguments to state and district programs individually over varlist specified

    * for each statename variable,
    di as text "Applying (`year') identification on states: `state'"

    foreach sv in `state' {

        * run state programs for the given year
        ibrstate, y(`year') s(`sv')

        * for each district variable,
        di as text "Applying (`year') identification on districts: `district'"
        foreach dv in `district' {
            * store the name of the ISO code variable created by ibrstate above
            local iso iso_`sv'
            * run district programs for the given year
            ibrdist, y(`year') d(`dv') i(`iso')
            * concatenate state & district identifiers to generate unique identifier
            * (skipped for 2011)
            if `year' != 2011 {
                qui replace dcode_`dv' = scode_`sv' + dcode_`dv'
            }

            di as text _dup(99) "-"
            di as text "State identifiers assigned in variables: iso_`sv' scode_`sv' ut_`sv'"
            di as text _dup(99) "-"
            di as text "District codes assigned in variable: dcode_`dv'"
        }
    }

    di as text _dup(99) "_"
* end indiabridge
end


*-------------------------------------------------------------------------------
* subprograms
*-------------------------------------------------------------------------------
* States
* Objective: Assign india_bridge consistent identifiers to a list of state names
*-------------------------------------------------------------------------------

* define ibrstate (india_bridge state)
capture program drop ibrstate
program define ibrstate

    * syntax: statename must be string, and year must be specified
    syntax [if], Year(string) Statevar(varlist string)

    * call state clean + assign programs
    sclean `statevar' `year'
    scode `statevar' `year'

* end ibrstate
end

*-------------------------------------------------------------------------------
* Districts
* Objective: Assign india_bridge consistent identifiers to a column of district names
*-------------------------------------------------------------------------------

* define ibrdist (india_bridge district)
capture program drop ibrdist
program define ibrdist

    * syntax: distname must be string, and year must be specified along with iso code
    syntax [if], Year(string) Distvar(varlist string) Isocode(varlist string)

    * back up original district name, and generate new sieved name (alphabetic characters only)
    qui rename `distvar' `distvar'_raw
    qui egen `distvar' = sieve(`distvar'_raw), keep(a)
    * trim spaces and standardise to lowercase
    qui replace `distvar' = ustrtrim(strlower(`distvar'))
    qui replace `distvar' = subinstr(`distvar'," ","",.)

    * call district clean + assign programs
    dclean `distvar' `isocode' `year'
    dcode `distvar' `isocode' `year'

* end ibrdist
end
@@ -0,0 +1,86 @@
{smcl}

{* *! version 17.1 01feb2022}{...}

{p2col:{bf:indiabridge}}

Census consistent district identification

{marker syntax}{...}

{title:Syntax}

{p}

{cmd:indiabridge} {it:[if]}{cmd:,} {opt y:ear(num)} {opt s:tate(varlist)} {opt d:istrict(varlist)}

{marker menu}{...}

{marker description}{...}

{title:Description}

{pstd}

{cmd:indiabridge} is a collection of ado programs to standardise variations/typos in state and district names in India from 1951 until 2011.
These names are standardised in accordance with Census reports and the administrative atlas (available in the project documentation).
After standardising names, {cmd:indiabridge} also assigns unique state codes and district codes along with the relevant (current) ISO code & union territory status.

{marker options}{...}

{title:Options}

{phang}{opt y:ear(num)} is the Census year, required in YYYY format (numeric).

{phang}{opt s:tate(var)} specifies a variable containing state names.

{phang}{opt d:istrict(var)} specifies a variable containing district names.

{marker examples}{...}

{title:Examples}
{hline}

{pstd} Assuming 2001 identification is required for state names stored in {cmd:state_name} and district names stored in {cmd:district_name} -

{phang2}{cmd:. indiabridge, y(2001) s(state_name) d(district_name)}


{hline}

{title:Maintainer}

{p 4 4 2}{bf:Vedarshi Shastry}{break}
[email protected]{break}
shastryved.github.io

{bf:To-do}
{hline}
{cmd:indiabridge} currently works with only one state variable and one district variable, for one year at a time.
I plan to add support for a year variable in YYYY format in {opt y:ear()}, and for parsing multiple state/district variables in one go.

{pstd} Suppose the variable {cmd:cenyear} contains multiple Census years (say, 1991 and 2001) -

{phang2}{cmd:. indiabridge, y(cenyear) s(statelist) d(districtlist)}

{phang2} would then read the year from {cmd:cenyear} and assign the relevant state/district identifiers to the variables in {cmd:statelist} and {cmd:districtlist}.

{hline}

{title:Acknowledgements & References}

{p 4 8 2}
{bf:Kumar, Hemanshu and Somanathan, Rohini}{break}
{it:State and District Boundary Changes in India: 1961-2001 (November 6, 2015)}{break}
http://dx.doi.org/10.2139/ssrn.2687484

Additional information on district splits and merges was derived from work previously done by the authors of this paper.

{p 4 4 2}{bf:Nicholas J. Cox, Durham University, U.K.}{break}
[email protected]

{cmd:egenmore}, maintained by Nicholas J. Cox, is a dependency for {cmd:indiabridge}.



{hline}