last edits

JuliaActuary · Jun 5, 2024 · 22d9eab
1 parent 4fa6cda
commit 22d9eab

Showing 2 changed files with 183 additions and 19 deletions.
diff --git a/README.md b/README.md
@@ -10,21 +10,17 @@ Calculate exposures.
 ## Quickstart
 
 ```julia
-using ExperienceAnalysis
-using DataFrames
-using Dates
-
 df = DataFrame(
     policy_id = 1:3,
     issue_date = [Date(2020,5,10), Date(2020,4,5), Date(2019, 3, 10)],
-    termination_date = [Date(2022, 6, 10), Date(2022, 8, 10), nothing],
+    end_date = [Date(2022, 6, 10), Date(2022, 8, 10), Date(2022,12,31)],
     status = ["claim", "lapse", "inforce"]
 )
 
 df.policy_year = exposure.(
     ExperienceAnalysis.Anniversary(Year(1)),
     df.issue_date,
-    df.termination_date,
+    df.end_date,
     df.status .== "claim"; # continued exposure
     study_start = Date(2020, 1, 1),
     study_end = Date(2022, 12, 31)
@@ -38,18 +34,19 @@ df.exposure_fraction =
 # here we end the interval at Date(2020, 12, 31), so we need to add a day to get the correct exposure fraction.
 ```
 
-policy_id | issue_date | termination_date | status | policy_year | exposure_fraction
---- | --- | --- | --- | --- | ---
-1 | 2020-05-10 | 2022-06-10 | claim | (from = Date("2020-05-10"), to = Date("2021-05-09"), policy_timestep = 1)  | 1.0
-1 | 2020-05-10 | 2022-06-10 | claim | (from = Date("2021-05-10"), to = Date("2022-05-09"), policy_timestep = 2)  | 1.0
-1 | 2020-05-10 | 2022-06-10 | claim | (from = Date("2022-05-10"), to = Date("2023-05-09"), policy_timestep = 3)  | 1.0
-2 | 2020-04-05 | 2022-08-10 | lapse | (from = Date("2020-04-05"), to = Date("2021-04-04"), policy_timestep = 1)  | 1.0
-2 | 2020-04-05 | 2022-08-10 | lapse | (from = Date("2021-04-05"), to = Date("2022-04-04"), policy_timestep = 2)  | 1.0
-2 | 2020-04-05 | 2022-08-10 | lapse | (from = Date("2022-04-05"), to = Date("2022-08-10"), policy_timestep = 3)  | 0.35
-3 | 2019-03-10 |  | inforce | (from = Date("2020-01-01"), to = Date("2020-03-09"), policy_timestep = 1)  | 0.191667
-3 | 2019-03-10 |  | inforce | (from = Date("2020-03-10"), to = Date("2021-03-09"), policy_timestep = 2)  | 1.0
-3 | 2019-03-10 |  | inforce | (from = Date("2021-03-10"), to = Date("2022-03-09"), policy_timestep = 3)  | 1.0
-3 | 2019-03-10 |  | inforce | (from = Date("2022-03-10"), to = Date("2022-12-31"), policy_timestep = 4)  | 0.808333
+| **policy\_id**<br>`Int64` | **issue\_date**<br>`Date` | **end\_date**<br>`Date` | **status**<br>`String` | **policy\_year**<br>`@NamedTuple{from::Date, to::Date, policy_timestep::Int64}`  | **exposure\_fraction**<br>`Float64` |
+|--------------------------:|--------------------------:|------------------------:|-----------------------:|---------------------------------------------------------------------------------:|------------------------------------:|
+| 1                         | 2020-05-10                | 2022-06-10              | claim                  | (from = Date("2020-05-10"), to = Date("2021-05-09"), policy\_timestep = 1)       | 1.0                                 |
+| 1                         | 2020-05-10                | 2022-06-10              | claim                  | (from = Date("2021-05-10"), to = Date("2022-05-09"), policy\_timestep = 2)       | 1.0                                 |
+| 1                         | 2020-05-10                | 2022-06-10              | claim                  | (from = Date("2022-05-10"), to = Date("2023-05-09"), policy\_timestep = 3)       | 1.0                                 |
+| 2                         | 2020-04-05                | 2022-08-10              | lapse                  | (from = Date("2020-04-05"), to = Date("2021-04-04"), policy\_timestep = 1)       | 1.0                                 |
+| 2                         | 2020-04-05                | 2022-08-10              | lapse                  | (from = Date("2021-04-05"), to = Date("2022-04-04"), policy\_timestep = 2)       | 1.0                                 |
+| 2                         | 2020-04-05                | 2022-08-10              | lapse                  | (from = Date("2022-04-05"), to = Date("2022-08-10"), policy\_timestep = 3)       | 0.35                                |
+| 3                         | 2019-03-10                | 2022-12-31              | inforce                | (from = Date("2020-01-01"), to = Date("2020-03-09"), policy\_timestep = 1)       | 0.191667                            |
+| 3                         | 2019-03-10                | 2022-12-31              | inforce                | (from = Date("2020-03-10"), to = Date("2021-03-09"), policy\_timestep = 2)       | 1.0                                 |
+| 3                         | 2019-03-10                | 2022-12-31              | inforce                | (from = Date("2021-03-10"), to = Date("2022-03-09"), policy\_timestep = 3)       | 1.0                                 |
+| 3                         | 2019-03-10                | 2022-12-31              | inforce                | (from = Date("2022-03-10"), to = Date("2022-12-31"), policy\_timestep = 4)       | 0.808333                            |
+
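The `exposure_fraction` values in the table above appear consistent with a 30/360 day count measured to `to + Day(1)`, since each interval includes its end date. The visible hunk does not show which convention the Quickstart uses, so this is an inference rather than a statement about the package. A minimal sketch under that assumption, using only `Dates`; the helper names `yearfrac_30_360` and `exposure_fraction` are hypothetical, and the end-of-month adjustments of real 30/360 conventions are omitted:

```julia
using Dates

# Simplified 30/360 year fraction: every month counts as 30 days and every
# year as 360. End-of-month adjustments are deliberately left out (assumption).
function yearfrac_30_360(from::Date, to::Date)
    days = 360 * (year(to) - year(from)) +
           30 * (month(to) - month(from)) +
           (day(to) - day(from))
    return days / 360
end

# Exposure intervals are inclusive of `to`, so one day is added before
# measuring the length of the interval.
exposure_fraction(from::Date, to::Date) = yearfrac_30_360(from, to + Day(1))

# Partial final interval of policy 2 in the table
println(exposure_fraction(Date(2022, 4, 5), Date(2022, 8, 10)))
# A full policy year (policy 1, first interval)
println(exposure_fraction(Date(2020, 5, 10), Date(2021, 5, 9)))
```

Without the extra day, a full policy year such as 2020-05-10 to 2021-05-09 would come out slightly below 1.0, which is the point of the comment at the end of the Quickstart code.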
 
 ## Discussion and Questions
 
@@ -274,4 +271,171 @@ And **not** like this:
  (from = Date("2019-03-01"), to = Date("2020-02-28"), policy_timestep = 4)
  (from = Date("2020-03-01"), to = Date("2021-02-29"), policy_timestep = 5)
 ...
+```
+
+## Example of Actual to Expected Analysis
+
+```julia
+# generate samples over a full leap cycle and 
+# show that we recover a ~100% A/E using a given
+# assumption
+
+println("------------------")
+println("generating simulated experience")
+
+using ExperienceAnalysis
+using Dates
+using DayCounts
+using Distributions
+using DataFramesMeta
+using StableRNGs
+
+rng = StableRNG(123)
+q = 1 - (0.6)^(1 / (365.25 * 4)) # daily rate such that 60% survive a four-year leap cycle (~0.12/year)
+
+# simulate n policies and when they die using the above q
+# set the end date for the study four years in, covering a whole leap cycle
+# and presume we don't know data beyond that date
+n = 1 * 10^6
+years = 4
+d_start = Date(2011, 1, 1)
+d_end = d_start + Year(years) - Day(1)
+census = map(1:n) do id
+    issue = rand(rng, d_start:Day(1):Dates.lastdayofyear(d_start))
+    death = issue + Day(rand(rng, Geometric(q)))
+    (; id, issue, death)
+
+end |> DataFrame
+
+# calculate (1, 2) grouped over pol/cal years and  (3) total actual to expected
+
+basis = [
+    ExperienceAnalysis.AnniversaryCalendar(Year(1), Year(1)),
+    ExperienceAnalysis.Calendar(Year(1)),
+    ExperienceAnalysis.Anniversary(Year(1))
+]
+
+for b in basis
+    @show b
+    cp = let cp = deepcopy(census) # copy to avoid messing with generated data
+
+        cp.exposures = exposure.(
+            b,
+            census.issue,
+            min.(d_end, census.death),
+            census.death .<= d_end;
+            study_end=d_end
+        )
+
+        cp = flatten(cp, :exposures)
+
+        # mark the interval containing the death; deaths after the cutoff match no interval
+        cp.claim = map(cp.exposures, cp.death) do e, d
+            e.from <= d <= e.to
+        end
+
+        cp.exp_days = map(cp.exposures) do e
+            length(e.from:Day(1):e.to)
+        end
+        cp.expected = @. 1 - (1 - q)^cp.exp_days
+
+        cp.cal_year = map(cp.exposures) do e
+            year(e.from)
+        end
+
+        cp.pol_year = map(cp.exposures, cp.issue) do e, i
+            y = year(e.from)
+            if monthday(e.from) < monthday(i)
+                y -= 1
+            end
+            y
+        end
+
+        cp.exp_amt = map(cp.exposures) do e
+            yearfrac(e.from, e.to + Day(1), DayCounts.ActualActualISDA())
+        end
+        cp
+
+    end
+
+    # not needed for test, but demonstrates how to do cal/pol year grouping
+    summary = map([:pol_year, :cal_year]) do grouping
+        combine(groupby(cp, (grouping))) do gdf
+            exposures = sum(gdf.exp_amt)
+            claims = sum(gdf.claim)
+            expected = sum(gdf.expected)
+
+            q̂ = claims / exposures
+            ae = claims / expected
+
+            (; claims, expected, exposures, q̂, ae)
+        end
+    end
+
+
+    @show summary
+    @show sum(cp.claim) / sum(cp.expected), sum(cp.claim), sum(cp.expected), sum(cp.exp_days), sum(cp.exp_amt)
+    @test sum(cp.claim) / sum(cp.expected) ≈ 1.0 rtol = 5e-3
+    println("---------")
+```
+
+This produces the following output, showing actual-to-expected results for three different exposure bases, each summarized by policy year and by calendar year:
+
+```
+------------------
+generating simulated experience
+b = ExperienceAnalysis.AnniversaryCalendar{Year, Year}(Year(1), Year(1))
+summary = DataFrame[4×6 DataFrame
+ Row │ pol_year  claims  expected       exposures  q̂         ae
+     │ Int64     Int64   Float64        Float64    Float64   Float64
+─────┼────────────────────────────────────────────────────────────────
+   1 │     2011  120715      1.20055e5  9.80143e5  0.123161  1.00549
+   2 │     2012  104919      1.05409e5  8.60476e5  0.121931  0.995354
+   3 │     2013   92578  92780.8        7.58434e5  0.122065  0.997814
+   4 │     2014   41861  41870.8        3.42226e5  0.12232   0.999766, 4×6 DataFrame
+ Row │ cal_year  claims  expected       exposures  q̂         ae
+     │ Int64     Int64   Float64        Float64    Float64   Float64
+─────┼────────────────────────────────────────────────────────────────
+   1 │     2011   61471  61363.9        5.01539e5  0.122565  1.00174
+   2 │     2012  112712      1.12714e5  9.18968e5  0.122651  0.999981
+   3 │     2013   98760  98928.2        8.0869e5   0.122123  0.9983
+   4 │     2014   87130  87109.4        7.12082e5  0.122359  1.00024]
+(sum(cp.claim) / sum(cp.expected), sum(cp.claim), sum(cp.expected), sum(cp.exp_days), sum(cp.exp_amt)) = (0.9998814228722663, 360073, 360115.70148553397, 1074485834, 2.941279085822292e6)
+---------
+b = ExperienceAnalysis.AnniversaryCalendar{Nothing, Year}(nothing, Year(1))
+summary = DataFrame[4×6 DataFrame
+ Row │ pol_year  claims  expected       exposures      q̂         ae
+     │ Int64     Int64   Float64        Float64        Float64   Float64
+─────┼────────────────────────────────────────────────────────────────────
+   1 │     2011  173896      1.73803e5       1.4376e6  0.120963  1.00054
+   2 │     2012   98793  98977.4        826104.0       0.119589  0.998137
+   3 │     2013   87161  87140.1        727311.0       0.11984   1.00024
+   4 │     2014     223    230.637        1925.0       0.115844  0.966888, 4×6 DataFrame
+ Row │ cal_year  claims  expected       exposures       q̂         ae
+     │ Int64     Int64   Float64        Float64         Float64   Float64
+─────┼─────────────────────────────────────────────────────────────────────
+   1 │     2011   61471  61363.9             5.01539e5  0.122565  1.00174
+   2 │     2012  112712      1.12735e5  938529.0        0.120094  0.999794
+   3 │     2013   98760  98942.2        825817.0        0.119591  0.998158
+   4 │     2014   87130  87109.7        727057.0        0.119839  1.00023]
+(sum(cp.claim) / sum(cp.expected), sum(cp.claim), sum(cp.expected), sum(cp.exp_days), sum(cp.exp_amt)) = (0.9997833363330586, 360073, 360151.03164316854, 1093362393, 2.9929420931506846e6)
+---------
+b = ExperienceAnalysis.AnniversaryCalendar{Year, Nothing}(Year(1), nothing)
+summary = DataFrame[4×6 DataFrame
+ Row │ pol_year  claims  expected       exposures       q̂         ae
+     │ Int64     Int64   Float64        Float64         Float64   Float64
+─────┼─────────────────────────────────────────────────────────────────────
+   1 │     2011  120715      1.20069e5       1.00093e6  0.120603  1.00538
+   2 │     2012  104919      1.05392e5       8.78469e5  0.119434  0.99551
+   3 │     2013   92578  92777.8        774366.0        0.119553  0.997846
+   4 │     2014   41861  43496.8             3.5624e5   0.117508  0.962393, 4×6 DataFrame
+ Row │ cal_year  claims  expected       exposures       q̂         ae
+     │ Int64     Int64   Float64        Float64         Float64   Float64
+─────┼─────────────────────────────────────────────────────────────────────
+   1 │     2011  120715      1.20069e5       1.00093e6  0.120603  1.00538
+   2 │     2012  104919      1.05392e5       8.78469e5  0.119434  0.99551
+   3 │     2013   92578  92777.8        774366.0        0.119553  0.997846
+   4 │     2014   41861  43496.8             3.5624e5   0.117508  0.962393]
+(sum(cp.claim) / sum(cp.expected), sum(cp.claim), sum(cp.expected), sum(cp.exp_days), sum(cp.exp_amt)) = (0.9954028354972483, 360073, 361735.9597133631, 1099590758, 3.01000275415076e6)
+---------
 ```
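The daily rate `q` in the example is built from a four-year survival probability, and the `expected` column applies it over each interval's length in days. The sketch below restates that arithmetic on its own, assuming nothing beyond the formulas shown in the example; the helper name `expected_claim` is hypothetical and not part of ExperienceAnalysis:

```julia
# Daily rate chosen so that 60% of lives survive one full leap cycle
# (365.25 * 4 = 1461 days), as in the example.
q = 1 - 0.6^(1 / (365.25 * 4))

# Probability of a claim within an interval of `days` days at daily rate `q`,
# mirroring `1 - (1 - q)^exp_days` in the example.
expected_claim(q, days) = 1 - (1 - q)^days

annual = expected_claim(q, 365.25)  # rate implied over an average year
cycle = expected_claim(q, 1461)     # claim probability over the whole cycle

println((; q, annual, cycle))
```

The annualized value comes out near 0.12, which is roughly the level of the `q̂` column in the output above, and the full-cycle value recovers the 40% of lives that do not survive.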
diff --git a/test/experience_machine.jl b/test/experience_machine.jl
@@ -1,5 +1,5 @@
 @testset "Generated Experience" begin
-    # generate samples over a fully leap cycle and 
+    # generate samples over a full leap cycle and 
     # ensure that we can recover a ~100% A/E using a given
     # assumption