
Thursday, June 11, 2020

Importance Of Computer Automation To Insurance Companies Finance Essay

Programming is of prime importance to actuarial science and to the actuarial profession. Complex calculations combined with routine, task-based calculations make programming a natural means of automation. In this dissertation we show how the programming language R can be used in claim models to compute aggregate claims using Poisson, binomial and negative binomial distributions. We also demonstrate how to use the MortalitySmooth package with deaths and exposure data to smooth mortality rates. An essential aspect of this method is that smoothing allows mortality to be forecast, and the forecasts can then be used to compute annuities for different countries. We illustrate these methods using the Danish dataset for aggregate claims and the Human Mortality Database (HMD, https://www.mortality.org), a collection of mortality data for various developed countries.

Chapter 1 Introduction

An insurance firm that sells insurance products attains profitability by charging premiums that exceed its overall expenses and by making wise investment decisions that maximise returns under optimal risk conditions. The premium charged depends on many underlying factors, such as the number of policyholders, the number of claims, the amount of claims, and the health condition, age and gender of the policyholder. Some of these factors, such as aggregate loss claims and human mortality rates, can adversely affect the premium calculation and hence the firm's ability to remain solvent. These factors need to be modelled using large amounts of data, many simulations and complex algorithms in order to determine and manage risk. In this dissertation we consider two important factors affecting premiums: aggregate loss claims and human mortality. We use theoretical simulations in R and the Danish data to model aggregate claims, and we use the Human Mortality Database to obtain human mortality rates and smooth them in order to price life insurance products.
In Chapter 2 we examine the concept of compound distributions in modelling aggregate claims and simulate compound distributions using R packages such as MASS and actuar. We then analyse Danish fire loss insurance data from 1980 to 1990 and fit appropriate distributions using customised, generically implemented R methods. In Chapter 3 we briefly explain graduation, generalised linear models and smoothing techniques using B-splines. We obtain deaths and exposure data from the Human Mortality Database for two selected countries, Sweden and Scotland, and smooth the mortality rates using the MortalitySmooth package. We compare mortality rates across several groupings, such as males and females within a country, or total mortality rates across Sweden and Scotland, over a given time frame, either by age or by year. In Chapter 4 we look at various life insurance and pension products widely used in the insurance industry and construct life tables and commutation functions to compute annuity values. Finally, in Chapter 5 we present concluding comments.

Chapter 2 Aggregate Claim distribution

2.1 Background

Insurance companies implement numerous techniques to evaluate the underlying risk of their assets, products and liabilities on a day-to-day basis for many purposes. These include:

- computation of premiums
- initial reserving to cover the cost of future liabilities
- maintaining solvency
- reinsurance agreements to protect against large claims

In general, the occurrence of claims is highly uncertain and affects each of the above. Modelling total claims is therefore of high importance in ascertaining risk. In this chapter we define claim distributions and aggregate claim distributions and discuss some probability distributions that fit the model.
2.2 Modelling Aggregate Claims

The dynamics of the insurance industry affect the number of claims and the amount of claims differently. For instance, expanding the insurance business would increase the number of claims proportionally but have negligible or no impact on the amount of individual claims. Conversely, cost control initiatives and technology innovations affect the amount of claims but have no effect on the number of claims. Consequently, the aggregate claim is modelled on the assumption that the number of claims and the amounts of claims are independent.

2.2.1 Compound distribution model

We define the compound distribution as follows. Let

- $S$ be the random variable denoting the total claims occurring in a fixed period of time;
- $X_i$ denote the claim amount of the $i$-th claim;
- $N$ be a non-negative, integer-valued random variable denoting the number of claims occurring in that period, independent of the claim amounts.

Further, $X_1, X_2, \ldots$ is a sequence of i.i.d. random variables with probability density function $f_X(x)$ and cumulative distribution function $F_X(x)$, with $P(X_i > 0) = 1$ for $1 \le i \le N$. Then the aggregate claims $S$ are obtained as

$$S = X_1 + X_2 + \cdots + X_N = \sum_{i=1}^{N} X_i,$$

with $S = 0$ when $N = 0$. The expectation and variance of $S$ are

$$E[S] = E[N]\,E[X], \qquad \mathrm{Var}(S) = E[N]\,\mathrm{Var}(X) + \mathrm{Var}(N)\,(E[X])^2.$$

Thus $S$, the aggregate claims, is computed using the collective risk model and follows a compound distribution (p. 86, Non-life actuarial model theory, methods and evaluation).

2.3 Compound distribution for aggregate claims

As discussed in Section 2.2.1, $S$ follows a compound distribution, where $N$, the number of claims, has the primary distribution and $X$, the amount of a claim, has the secondary distribution. In this section we describe the three main compound distributions widely used to model aggregate claims. The primary distribution of $N$ can be modelled by non-negative integer-valued distributions such as the Poisson, binomial and negative binomial. The choice of distribution varies from case to case.
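The collective risk model can be illustrated with a short simulation. The following is a minimal sketch in base R, not the SimulateAggregateClaims implementation used later in this dissertation; the Poisson rate, the gamma claim-size parameters and the helper name simulate.one are illustrative assumptions. It draws $N$, sums $N$ claim amounts, and compares the sample mean and variance of $S$ with the standard compound-distribution results $E[S] = E[N]E[X]$ and $\mathrm{Var}(S) = E[N]\mathrm{Var}(X) + \mathrm{Var}(N)(E[X])^2$.

```r
set.seed(42)                       # reproducible draws

n.samples <- 5000                  # number of simulated periods
lambda    <- 10                    # Poisson claim frequency (illustrative)
shape     <- 2                     # gamma claim-size shape (illustrative)
rate      <- 0.5                   # gamma claim-size rate (illustrative)

# One aggregate claim: draw N, then sum N i.i.d. claim amounts.
# When N = 0, rgamma(0, ...) is empty and sum() returns 0.
simulate.one <- function() {
  n <- rpois(1, lambda)
  sum(rgamma(n, shape = shape, rate = rate))
}
S <- replicate(n.samples, simulate.one())

# Theoretical moments; for a Poisson count E[N] = Var(N) = lambda
EX <- shape / rate                 # mean claim amount
VX <- shape / rate^2               # variance of claim amount
exp.mean <- lambda * EX
exp.var  <- lambda * VX + lambda * EX^2

c(observed = mean(S), expected = exp.mean)
c(observed = var(S),  expected = exp.var)
```

Swapping rpois() for rbinom() or rnbinom() gives the compound binomial and compound negative binomial cases, with exp.var adjusted to use the corresponding $\mathrm{Var}(N)$.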
2.3.1 Compound Poisson distribution

The Poisson distribution describes the occurrence of rare events. The number of accidents per person, the number of claims per insurance policy and the number of defects found in product manufacturing are some real-world examples. Here the primary distribution $N$ is Poisson with parameter $\lambda$, denoted $N \sim P(\lambda)$. The probability mass function, expectation and variance are

$$P(N = x) = \frac{e^{-\lambda}\lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots, \qquad E[N] = \lambda, \qquad \mathrm{Var}(N) = \lambda.$$

Then $S$ has a compound Poisson distribution with parameters $\lambda$ and $F_X$, denoted $S \sim CP(\lambda, F_X)$, and

$$E[S] = \lambda E[X], \qquad \mathrm{Var}(S) = \lambda E[X^2].$$

2.3.2 Compound Binomial distribution

The binomial distribution describes the number of successes in a fixed number of trials. The number of males in a company and the number of defective components in a random sample from a production process are real-world examples. The compound binomial distribution is a natural choice for modelling aggregate claims when there is an upper limit on the number of claims in a given time period. Here the primary distribution $N$ is binomial with parameters $n$ and $p$, denoted $N \sim B(n, p)$. The probability mass function, expectation and variance are

$$P(N = x) = \binom{n}{x} p^{x}(1-p)^{n-x} \quad \text{for } x = 0, 1, 2, \ldots, n, \qquad E[N] = np, \qquad \mathrm{Var}(N) = np(1-p).$$

Then $S$ has a compound binomial distribution with parameters $n$, $p$ and $F_X$, denoted $S \sim CB(n, p, F_X)$.

2.3.3 Compound Negative Binomial distribution

The compound negative binomial distribution is also used to model aggregate claims. The variance of the negative binomial distribution is greater than its mean, so it can be used in place of the Poisson distribution when the data have a variance greater than their mean. In such cases it provides a better fit to the data. Here the primary distribution $N$ is negative binomial with parameters $n$ and $p$, denoted $N \sim NB(n, p)$, with $n > 0$ and $0 < p < 1$. The probability mass function, expectation and variance are

$$P(N = x) = \frac{\Gamma(n + x)}{\Gamma(n)\,x!}\, p^{n}(1-p)^{x} \quad \text{for } x = 0, 1, 2, \ldots, \qquad E[N] = \frac{n(1-p)}{p}, \qquad \mathrm{Var}(N) = \frac{n(1-p)}{p^{2}}.$$
Then $S$ has a compound negative binomial distribution with parameters $n$, $p$ and $F_X$, denoted $S \sim CNB(n, p, F_X)$.

2.4 Secondary distributions: claim amount distributions

In Section 2.3 we defined the three compound distributions most widely used. In this section we define the distributions generally used to model the secondary distribution, that is, the claim amounts. We use positively skewed distributions. These include the Weibull distribution, which is used frequently in engineering applications. We also look at specific distributions such as the Pareto and lognormal, which are widely used to study loss distributions.

2.4.1 Pareto Distribution

The distribution is named after Vilfredo Pareto, who used it in modelling economic welfare. It is used these days to model income distribution in economics. The random variable $X$ has a Pareto distribution with parameters $\alpha$ and $\lambda$, where $\alpha > 0$ and $\lambda > 0$, denoted $X \sim \mathrm{Pareto}(\alpha, \lambda)$. The probability density function, expectation and variance are

$$f(x) = \frac{\alpha \lambda^{\alpha}}{(\lambda + x)^{\alpha + 1}} \quad \text{for } x > 0, \qquad E[X] = \frac{\lambda}{\alpha - 1} \ (\alpha > 1), \qquad \mathrm{Var}(X) = \frac{\alpha \lambda^{2}}{(\alpha - 1)^{2}(\alpha - 2)} \ (\alpha > 2).$$

2.4.2 Log normal Distribution

The random variable $X$ has a lognormal distribution with parameters $\mu$ and $\sigma^2$, where $\sigma^2 > 0$, denoted $X \sim LN(\mu, \sigma^2)$, where $\mu$ and $\sigma^2$ are the mean and variance of $\log(X)$. The lognormal distribution has a positive skew and is a very good distribution for modelling claim amounts. The probability density function, expectation and variance are

$$f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\!\left(-\frac{(\log x - \mu)^{2}}{2\sigma^{2}}\right) \quad \text{for } x > 0, \qquad E[X] = e^{\mu + \sigma^{2}/2}, \qquad \mathrm{Var}(X) = e^{2\mu + \sigma^{2}}\left(e^{\sigma^{2}} - 1\right).$$

2.4.3 Gamma distribution

The gamma distribution is very useful for modelling claim amounts. It has shape parameter $\alpha$ and scale parameter $\lambda$.
The random variable $X$ has a gamma distribution with parameters $\alpha$ and $\lambda$, where $\alpha > 0$ and $\lambda > 0$, denoted $X \sim \mathrm{Gamma}(\alpha, \lambda)$. With $\lambda$ as the scale parameter, the probability density function, expectation and variance are

$$f(x) = \frac{x^{\alpha - 1} e^{-x/\lambda}}{\Gamma(\alpha)\,\lambda^{\alpha}} \quad \text{for } x > 0, \qquad E[X] = \alpha\lambda, \qquad \mathrm{Var}(X) = \alpha\lambda^{2}.$$

2.4.4 Weibull Distribution

The Weibull distribution is an extreme value distribution; because of the form of its survival function it is widely used in modelling lifetimes. The random variable $X$ has a Weibull distribution with parameters $\alpha$ and $\lambda$, where $\alpha > 0$ and $\lambda > 0$. In the shape and scale parametrisation used by R's dweibull(), with shape $\alpha$ and scale $\lambda$, the probability density function, expectation and variance are

$$f(x) = \frac{\alpha}{\lambda}\left(\frac{x}{\lambda}\right)^{\alpha - 1} e^{-(x/\lambda)^{\alpha}} \quad \text{for } x > 0, \qquad E[X] = \lambda\,\Gamma\!\left(1 + \tfrac{1}{\alpha}\right), \qquad \mathrm{Var}(X) = \lambda^{2}\left[\Gamma\!\left(1 + \tfrac{2}{\alpha}\right) - \Gamma\!\left(1 + \tfrac{1}{\alpha}\right)^{2}\right].$$

2.5 Simulation of Aggregate claims using R

In Section 2.3 we discussed aggregate claims and the compound distributions used to model them. In this section we perform random simulation using R.

2.5.1 Simulation using R

The simulation of aggregate claims was implemented using the packages actuar and MASS. The generic R code in Programs/Aggregate_Claims_Methods.r simulates randomly generated aggregate claim samples from any compound distribution. The following R code generates simulated aggregate claim data for a compound Poisson distribution with Poisson parameter 10 and a gamma claim amount distribution with shape 1 and rate 1.

    require(actuar)
    require(MASS)
    source("Programs/Aggregate_Claims_Methods.r")
    # Compound Poisson with lambda = 10 and Gamma(shape = 1, rate = 1) claim amounts
    Sim.Sample = SimulateAggregateClaims(ClaimNo.Dist = "pois",
                                         ClaimNo.Param = list(lambda = 10),
                                         ClaimAmount.Dist = "gamma",
                                         ClaimAmount.Param = list(shape = 1, rate = 1),
                                         No.Samples = 2000)
    names(Sim.Sample)

The SimulateAggregateClaims method in Programs/Aggregate_Claims_Methods.r generates and returns the simulated aggregate samples along with the expected and observed moments. The simulated data can then be used to perform various tests, comparisons and plots.

2.5.2 Comparison of Moments

The expected and observed moments are compared to check the correctness of the simulated data.
The following R code returns the expected and observed mean and variance of the simulated data, respectively.

    Sim.Sample$Exp.Mean; Sim.Sample$Exp.Variance
    Sim.Sample$Obs.Mean; Sim.Sample$Obs.Variance

Table 2.1 Comparison of observed and expected moments for different sample sizes.

Table 2.1 shows the simulated values for different sample sizes. The observed and expected moments are similar, and the difference between them shrinks as the number of samples increases.

2.5.3 Histogram with curve fitting distributions

Histograms provide useful information on skewness, extreme points and outliers in the data, and their shape can be compared graphically with that of standard distributions. Figure 2.1 shows the histogram of the simulated data together with fitted curves for the Weibull, normal, lognormal and gamma distributions.

Figure 2.1 Histogram of simulated aggregate claims with fitted standard distribution curves.

The histogram is plotted using 50 breaks. The simulated data are then fitted using the fitdistr() function in the MASS package for the normal, lognormal, gamma and Weibull distributions. The following R code shows how fitdistr() is used to estimate the gamma parameters and plot the corresponding curve in Figure 2.1.

    # Maximum likelihood fit of a gamma distribution to the aggregate claims
    Gamma.Fit = fitdistr(Agg.Claims, "gamma")
    Shape = Gamma.Fit$estimate[1]
    Rate = Gamma.Fit$estimate[2]
    Left = min(Agg.Claims)
    Right = max(Agg.Claims)
    Seq = seq(Left, Right, by = 0.01)
    # dgamma() accepts either rate or scale, not both, so only rate is passed
    lines(Seq, dgamma(Seq, shape = Shape, rate = Rate), col = "blue")

2.5.4 Goodness of fit

A goodness-of-fit test compares the closeness of expected and observed values to decide whether it is reasonable to accept that the random sample follows a standard distribution.
It is a type of hypothesis test in which the hypotheses are defined as follows.

$H_0$: the data fit the standard distribution
$H_1$: the data do not fit the standard distribution

The chi-square test is one way to test goodness of fit. The test compares the histogram with the fitted density. The data are grouped into intervals using $k$ breaks, which are computed from quantiles. The observed frequency $O_i$ is the number of data points falling in the $i$-th interval, and the expected frequency $E_i$ is the sample size multiplied by the difference of the fitted c.d.f. across that interval. The test statistic is defined as

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},$$

where $O_i$ is the observed frequency and $E_i$ is the expected frequency for each of the $k$ intervals. For the simulation we use 100 breaks to split the data into 100 cells and use the histogram counts to group the observed values. Large values of $\chi^2$ lead to rejection of the null hypothesis. The test statistic follows a $\chi^2$ distribution with $k - p - 1$ degrees of freedom, where $p$ is the number of parameters estimated from the sample data. The p-value is computed using 1 - pchisq(), and the null hypothesis is not rejected if the p-value is greater than the significance level $\alpha$. The following R code performs the chi-square test.

    Test.ChiSq = PerformChiSquareTest(Samples.Claims = Sim.Sample$AggregateClaims,
                                      No.Samples = N.Samples)
    Test.ChiSq$DistName
    Test.ChiSq$X2Val; Test.ChiSq$pvalue
    Test.ChiSq$Est1; Test.ChiSq$Est2

Table 2.3 Chi-square and p-value for compound Poisson distribution

The highest p-value signifies the best fit of the data to a standard distribution. Here the Weibull distribution is the best fit, with parameters shape = 2.348 and scale = 11.32.

2.6 Fitting Danish Data

2.6.1 The Danish data source of information

In this section we use a statistical model and fit a compound distribution to compute aggregate claims from historical data. Fitting data to a probability distribution using R is an interesting exercise, and it is worth quoting: "All models are wrong, some models are useful."
In the previous section we explained distribution fitting, comparison of moments and goodness of fit for simulated data. The data source used here is the Danish data compiled by Copenhagen Reinsurance, which contains over 2000 fire loss claims recorded during the period 1980 to 1990. The data are adjusted for inflation to reflect 1985 values and are expressed in millions of Danish Krone (DKK). There are 2167 rows of data over 11 years. Grouping the data by year gives only 11 aggregate samples, which would be insufficient to fit and plot a distribution. The data are therefore grouped by month, giving 132 samples. The expectation and variance of the aggregate claims are 55.572 and 1440.7 respectively. Figure 2.2 shows the time series of monthly claims from 1980 to 1990; it also shows the extreme loss values and when they occurred. There is no seasonal effect in the data: a two-sample t-test comparing summer and winter data shows no significant difference, so we conclude that there is no seasonal variation.

Figure 2.2 Time series plot of Danish fire loss insurance data, month-wise, 1980-1990.

The data are plotted as a histogram and fitted using the fitdistr() function in the MASS package of R.

2.6.2 Analysis of Danish data

We take the following steps to analyse and fit the data.

1. Obtain the claim numbers and aggregate loss claim data month-wise.
2. Choose the primary distribution to be Poisson or negative binomial and use fitdistr() to obtain its parameters.
3. Assume the gamma distribution as the default loss claim distribution and use fitdistr() to obtain the shape and rate parameters.
4. Simulate 1000 samples as in Section 2.5.1, and plot the histogram along with the fitted standard distributions as described in Section 2.5.3.
5. Perform the chi-square test to identify the optimal fit and obtain the distribution parameters.
6. Finally, run another simulation using the primary distribution and the fitted secondary distribution.

2.6.3 R Implementation

The implementation in R proceeds as follows. The Danish data are assumed to follow a gamma distribution. We plot the computed aggregate claims and use fitdistr() to obtain the parameters, using the gamma or lognormal distribution. Then, using the generic R implementation discussed in Section 2.5, we simulate from the new dataset and finally fit standard distributions. The following R code reads the Danish data in Data/DanishData.txt, aggregates the claims by month and year, calculates the sample mean and variance, and plots the histogram with fitted standard distributions.

    require(MASS)
    source("Programs/Aggregate_Claims_Methods.r")
    # Read the raw claims and aggregate them month-wise
    Danish.Data = ComputeAggClaimsFromData("Data/DanishData.txt")
    Danish.Data$Agg.ClaimData = round(Danish.Data$Agg.ClaimData, digits = 0)
    # Uncomment to inspect the sample moments and the aggregated data
    #mean(Danish.Data$Agg.ClaimData)
    #var(Danish.Data$Agg.ClaimData)
    #Danish.Data$Agg.ClaimData
    #mean(Danish.Data$Agg.ClaimNos)
    #var(Danish.Data$Agg.ClaimNos)

Figure 2.3 Actual Danish fire loss data (132 samples) fitted with standard distributions.

In the initial case the primary distribution of $N$ is assumed to be negative binomial with parameters k = 25.32 and p = .6067, and the secondary distribution is assumed to be gamma with parameters shape = 3.6559 and rate = .065817. We simulate 1000 samples and obtain aggregate claim samples as in Section 2.5.1. The plot and the chi-square test values are given below. The generic function PerformChiSquareTest, discussed in Section 2.5.4, is used here to compute the values of $\chi^2$ and the p-value for the fitted distributions. The corresponding values are tabulated in Table 2.2 below.
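The chi-square procedure of Section 2.5.4 can be sketched in base R as follows. This is an illustrative sketch only, not the PerformChiSquareTest function from Programs/Aggregate_Claims_Methods.r, whose source is not shown here. The sample is synthetic, all variable names are assumptions, and method-of-moments estimates stand in for fitdistr() to keep the example free of package dependencies.

```r
set.seed(1)
x <- rgamma(1000, shape = 3, rate = 0.5)    # synthetic stand-in for aggregate claims

# Method-of-moments gamma estimates (a swap-in for the fitdistr() ML fit)
m <- mean(x); v <- var(x)
shape.hat <- m^2 / v
rate.hat  <- m / v

k <- 20                                     # number of cells
# Breaks at quantiles of the fitted distribution: cells of equal probability,
# running from 0 to Inf
breaks <- qgamma(seq(0, 1, length.out = k + 1),
                 shape = shape.hat, rate = rate.hat)

# Observed counts per cell, and expected counts n * (F(b_i) - F(b_{i-1}))
observed <- as.vector(table(cut(x, breaks = breaks, include.lowest = TRUE)))
expected <- length(x) * diff(pgamma(breaks, shape = shape.hat, rate = rate.hat))

x2      <- sum((observed - expected)^2 / expected)
df      <- k - 2 - 1                        # k - p - 1 with p = 2 estimated parameters
p.value <- 1 - pchisq(x2, df)

c(statistic = x2, df = df, p.value = p.value)
```

Repeating the same steps with the quantile and c.d.f. functions of other candidate distributions (for example qweibull() and pweibull()) gives the p-values that are compared across distributions.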
Figure 2.4 Histogram of simulated samples of Danish data fitted with standard distributions.

Figure 2.4 shows simulated samples of the Danish data for sample size 1000, together with the distribution curves fitted to the simulated data. These results suggest that the best choice of model is the gamma distribution with parameters shape = 8.446 and rate = .00931.

Chapter 3 Survival models: Graduation

In Chapter 2 we discussed aggregate claims and how they can be modelled and simulated using R. In this chapter we discuss one of the important factors with a direct impact on the occurrence of a claim: human mortality. Life insurance companies use this factor to model the risk arising from claims. We analyse the crude data in the Human Mortality Database for two countries, Scotland and Sweden, using statistical techniques. The MortalitySmooth package is used to smooth the data, with the smoothing parameter chosen by the Bayesian information criterion (BIC); we also plot the data. Finally, we compare the mortality of the two countries over time.

3.1 Introduction

Mortality data, in simple terms, are records of deaths within a defined population. The data can be broken down by variables such as sex, age, year, geographical location and species. In this section we use human data grouped by country, sex, age and year. Human mortality in urbanised nations has improved significantly over the past few centuries. This is attributed largely to improved standards of living and national health services; in recent decades there has also been tremendous improvement in health care, which has strong demographic and actuarial implications.
Here we use human mortality data to analyse mortality trends, compute life tables and price different annuity products.

3.2 Sources of Data

The Human Mortality Database (HMD) is used to extract deaths and exposure data. These data are collected from national statistical offices. In this dissertation we look at data for two countries, Sweden and Scotland, for specific ages and years. The deaths and exposure data are downloaded from the HMD; for example, the Sweden deaths are at https://www.mortality.org/hmd/SWE/STATS/Deaths_1x1.txt. The files are saved as .txt data files on the local hard disk under /Data/Conutryname_deaths.txt and /Data/Conutryname_exposures.txt respectively. In general, data availability and formats vary across countries and over time. The female and male deaths and exposure data are taken from the raw data. The total column in the data source is calculated as a weighted average based on the relative sizes of the male and female groups at a given time.

3.3 Gompertz law graduation

Benjamin Gompertz, a well-known actuary, observed that over a long stretch of the human lifetime the force of mortality increases geometrically with age. This was modelled for single years of age. The Gompertz law states that the mortality rate increases in geometric progression, so the force of mortality is

$$\mu_x = A B^{x}, \qquad A > 0,\ B > 1,$$

and a linear model is fitted by taking logs of both sides:

$$\log \mu_x = a + bx, \qquad \text{where } a = \log A \text{ and } b = \log B.$$

The Gompertz model is therefore linear on the log scale. The corresponding quadratic model is

$$\log \mu_x = a + bx + cx^{2}.$$

3.3.1 Generalized Linear models and P-Splines in smoothing data

Generalized linear models (GLMs) are an extension of linear models that allows models to be fitted to data following probability distributions such as the Poisson and binomial.
If $d_x$ is the number of deaths at age $x$ and $E_x$ is the central exposed to risk, then the maximum likelihood estimate of the force of mortality is $\hat{\mu}_x = d_x / E_x$, and in the GLM the number of deaths $D_x$ follows a Poisson distribution, $D_x \sim \mathrm{Poisson}(E_x \mu_x)$, with $\log \mu_x = a + bx$.

We use P-spline techniques to smooth the data. As mentioned above, the number of deaths in the GLM follows a Poisson distribution, and we fit a quadratic regression using exposure as the offset. Splines are piecewise polynomials, usually cubic, joined so that their second derivatives are equal at the join points; these joins are called knots. The fit uses a B-spline regression matrix. A difference penalty of linear, quadratic or cubic order is placed on the coefficients to penalise irregular behaviour in the data. This penalty enters the log-likelihood together with a smoothing parameter $\lambda$, and the penalised likelihood is maximised to obtain the smoothed data. The larger the value of $\lambda$, the smoother the function but the greater the deviance. The optimal value of $\lambda$ is therefore chosen to balance deviance against model complexity. It is selected using criteria such as the Bayesian information criterion (BIC) and Akaike's information criterion (AIC).

The MortalitySmooth package in R implements the techniques above. There are several options when smoothing with P-splines: the number of knots (ndx), the degree of the P-spline, whether linear, quadratic or cubic (bdeg), and the smoothing parameter (lambda). The MortalitySmooth methods fit a P-spline model with equally spaced B-splines along x. There are four possible methods in this package for smoothing data, with BIC the default. AIC minimisation is also available, but BIC gives better results for large values. In this dissertation we smooth the data using the default BIC option and also using a fixed lambda value.
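The Poisson GLM with an exposure offset described above can be sketched with base R's glm(). This is a minimal illustration on synthetic data, not the MortalitySmooth P-spline fit: the Gompertz parameters, the constant exposure and all variable names are assumptions made for the example. The offset is entered as log(exposure), so the linear predictor models $\log \mu_x$.

```r
set.seed(1)
age      <- 30:80
exposure <- rep(10000, length(age))          # synthetic central exposed to risk
a <- -9.5; b <- 0.09                         # assumed Gompertz parameters
deaths   <- rpois(length(age), exposure * exp(a + b * age))

# Linear (Gompertz) model: log(mu_x) = a + b x, with log(exposure) as offset
fit.lin  <- glm(deaths ~ age + offset(log(exposure)), family = poisson)

# Quadratic model: log(mu_x) = a + b x + c x^2
fit.quad <- glm(deaths ~ age + I(age^2) + offset(log(exposure)), family = poisson)

# Fitted log mortality: the link-scale prediction includes the offset,
# so log(exposure) is subtracted to recover log(mu_x)
log.mu.lin  <- predict(fit.lin,  type = "link") - log(exposure)
log.mu.quad <- predict(fit.quad, type = "link") - log(exposure)

coef(fit.lin)                                # estimates of a and b
```

A P-spline fit replaces the polynomial terms with a B-spline basis and adds the difference penalty, but the Poisson likelihood and the offset enter in the same way.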
3.4 MortalitySmooth Package in R program implementation

In this section we describe a generic R implementation that reads deaths and exposure data from the Human Mortality Database and uses the MortalitySmooth package to smooth the data with P-splines. The code below loads the MortalitySmooth package and the generic methods, loads the data, smooths it and plots the result.

    require(MortalitySmooth)
    source("Programs/Graduation_Methods.r")
    Age <- 30:80; Year <- 1959:1999
    country <- "scotland"; Sex <- "Males"
    death = LoadHMDData(country, Age, Year, "Deaths", Sex)
    exposure = LoadHMDData(country, Age, Year, "Exposures", Sex)
    FilParam.Val <- 40   # age at which the year-wise series is extracted
    Hmd.SmoothData = SmoothenHMDDataset(Age, Year, death, exposure)
    XAxis <- Year
    # Smoothed log mortality at the selected age
    YAxis <- log(fitted(Hmd.SmoothData$Smoothfit.BIC)[Age == FilParam.Val, ] / exposure[Age == FilParam.Val, ])
    # MainDesc, Xlab, Ylab and legend.loc must be set by the user beforehand
    PlotHMDDataset(XAxis, log(death[Age == FilParam.Val, ] / exposure[Age == FilParam.Val, ]),
                   MainDesc, Xlab, Ylab, legend.loc)
    DrawlineHMDDataset(XAxis, YAxis)

The MortalitySmooth package is loaded, and the generic methods that carry out the graduation smoothing are in Programs/Graduation_Methods.r. The code is described step by step below.

Step 1: Load Human Mortality data

Method name: LoadHMDData

Description: Returns a matrix object of dimension m x n, with m the number of ages and n the number of years. This object is specifically formatted for use in the Mort2Dsmooth function.

Implementation: LoadHMDData(Country, Age, Year, Type, Sex)

Arguments:

Country: Name of the country for which data are to be loaded. If the country is Denmark, Sweden, Switzerland or Japan, the selectHMDdata function of the MortalitySmooth package is called internally.
Age: Vector defining the rows of the matrix object. There must be at least one value.
Year: Vector defining the columns of the matrix object. There must be at least one value.
Type: A value specifying the type of data to be loaded from the Human Mortality Database. It can take the values Deaths or Exposures.
Sex: An optional filter value by which data are loaded into the matrix object. It can take the values Males, Females and Total. The default is Total.

Details: The method LoadHMDData in Programs/Graduation_Methods.r reads the data available in the directory Data to load deaths or exposures for the given parameters. The data can be filtered by Country, Age, Year, Type (Deaths or Exposures) and lastly by Sex.

Figure 3.1 Format of matrix objects Death and Exposure.

Figure 3.1 shows the format used in the objects Death and Exposure to store data: a matrix with ages in rows and years in columns. The MortalitySmooth package includes data for specific countries listed in the package: Denmark, Switzerland, Sweden and Japan. The data for these countries can be accessed directly through the predefined function selectHMDdata. LoadHMDData checks the value of the variable country; if it equals any of the four countries included in the MortalitySmooth package, selectHMDdata is called internally, otherwise the customised generic function is called to return the objects. The format of the returned objects is exactly the same in both cases.

Step 2: Smoothen HMD Dataset

Method name: SmoothenHMDDataset

Description: Returns a list of smoothed objects, one based on BIC and one on a fixed lambda, for input matrices of dimension m x n, with m the number of ages and n the number of years. Each returned object is of type Mort2Dsmooth, a two-dimensional P-spline smooth of the input data with the order fixed at its default. These objects are customised for mortality data only. The Smoothfit.BIC and Smoothfit.fitLAM objects are returned along with the fitBIC.Data fitted values.
Implementation: SmoothenHMDDataset(Xaxis, YAxis, ZAxis, Offset.Param)

Arguments:

Xaxis: Vector for the abscissa of the data used in the Mort2Dsmooth function of the MortalitySmooth package. Here the Age vector is the value of Xaxis.
YAxis: Vector for the ordinate of the data used in the Mort2Dsmooth function. Here the Year vector is the value of YAxis.
ZAxis: Matrix of count responses used in the Mort2Dsmooth function. Here Death is the matrix object value for ZAxis, and the dimensions of ZAxis must correspond to the lengths of Xaxis and YAxis.
Offset.Param: A matrix of known values to be included in the linear predictor when fitting the 2D data. Here exposure is the matrix object value.

Details: The method SmoothenHMDDataset in Programs/Graduation_Methods.r smooths the data based on the death and exposure objects loaded in Step 1. Age, year and death are passed as the x-axis, y-axis and z-axis respectively, with exposure as the offset parameter. These parameters are fitted internally by the Mort2Dsmooth function of the MortalitySmooth package.

Step 3: Plot the smoothed data based on user input

Method name: PlotHMDDataset

Description: Plots the smoothed object; the axes, legend and axis scales are customised automatically based on user inputs.

Implementation: PlotHMDDataset(Xaxis, YAxis, MainDesc, Xlab, Ylab, legend.loc, legend.Val, Plot.Type, Ylim)

Arguments:

Xaxis: Vector of X axis values. Here the value is Age or Year, depending on the user request.
YAxis: Vector of Y axis values. Here the value is the smoothed log mortality filtered for a particular age or year.
MainDesc: Main title describing the plot.
Xlab: X axis label.
Ylab: Y axis label.
legend.loc: A customised location for the legend. It can take the values topright and topleft.
legend.Val: A customised legend description. It can take a vector of strings.
Plot.Type: An optional value to change the plot type. The default equals the default set in plot. If the value is 1, a figure with a line is plotted.
Ylim: An optional value to set the height of the Y axis. By default it takes the maximum of the Y values.

Details: The generic method PlotHMDDataset in Programs/Graduation_Methods.r plots the smoothed fitted mortality values, with options to customise the plot based on user inputs. The generic method DrawlineHMDDataset in Programs/Graduation_Methods.r draws a line and is usually called after PlotHMDDataset.

3.5 Graphical representation of smoothed mortality data

In this section we look at graphical representations of mortality data for the selected countries, Scotland and Sweden. The generic program discussed in Section 3.4 is used to produce the plots based on customised user inputs.

Log mortality of smoothed data vs. actual fit for Sweden

Figure 3.3 Left panel: Plot of Year vs. log(Mortality) for Sweden at age 40, for years 1945 to 2005. The points represent real data, and the red and blue curves represent the smoothed fitted curves for BIC and lambda = 10000 respectively. Right panel: Plot of Age vs. log(Mortality) for Sweden in year 1995, for ages 30 to 90. The points represent real data, and the red and blue curves represent the smoothed fitted curves for BIC and lambda = 10000 respectively.

Log mortality of smoothed data vs. actual fit for Scotland

Figure 3.4 Left panel: Plot of Year vs. log(Mortality) for Scotland at age 40, for years 1945 to 2005. The points represent real data, and the red and blue curves represent the smoothed fitted curves for BIC and lambda = 10000 respectively. Right panel: Plot of Age vs. log(Mortality) for Scotland in year 1995, for ages 30 to 90. The points represent real data, and the red and blue curves represent the smoothed fitted curves for BIC and lambda = 10000 respectively.
Log mortality of females vs. males for Sweden

Figure 3.5 Left panel: Plot of year vs. log(mortality) for Sweden based on age 40 and years from 1945 to 2005. The red and blue points represent real data for males and females respectively, and the red and blue curves represent the smoothed curves fitted with BIC for males and females respectively. Right panel: Plot of age vs. log(mortality) for Sweden based on year 2000 and ages from 25 to 90. The red and blue points represent real data for males and females respectively, and the red and blue curves represent the smoothed curves fitted with BIC for males and females respectively.

Figure 3.5 shows the mortality rates for males and females in Sweden by age and by year. The left panel reveals that male mortality is higher than female mortality over the years, and that there was a sudden increase in male mortality from the mid 1960s until the late 1970s. Life expectancy in Sweden in 1960 was 71.24 for men versus 74.92 for women; in the next decade it increased to 77.06 for women but only to 72.2 for men, which explains the trend (https://www.scb.se/Pages/TableAndChart____26041.aspx). The right panel shows that male mortality is higher than female mortality for the year 1995. The male-to-female sex ratio is 1.06 at birth and decreases steadily to 1.03 for ages 15-64 and 0.79 for ages 65 and above, which is consistent with mortality in Sweden increasing more for males than for females.
(https://www.indexmundi.com/sweden/sex_ratio.html)

Log mortality of females vs. males for Scotland

Figure 3.6 Left panel: Plot of year vs. log(mortality) for Scotland based on age 40 and years from 1945 to 2005. The red and blue points represent real data for males and females respectively, and the red and blue curves represent the smoothed curves fitted with BIC for males and females respectively. Right panel: Plot of age vs. log(mortality) for Scotland based on year 2000 and ages from 25 to 90. The red and blue points represent real data for males and females respectively, and the red and blue curves represent the smoothed curves fitted with BIC for males and females respectively.

The left panel of Figure 3.6 shows a consistent dip in mortality rates at age 40, but from the mid 1950s the excess of male over female mortality increased steadily over a long period. The right panel shows that male mortality is higher than female mortality for the year 1995. The male-to-female sex ratio is 1.04 at birth and decreases steadily to 0.94 for ages 15-64 and 0.88 for ages 65 and above, which is consistent with mortality in Scotland increasing more for males than for females (https://en.wikipedia.org/wiki/Demography_of_Scotland).

Log mortality of Scotland vs. Sweden

Figure 3.7 Left panel: Plot of year vs. log(mortality) for Sweden and Scotland based on age 40 and years from 1945 to 2005. The red and blue points represent real data for Sweden and Scotland respectively, and the red and blue curves represent the smoothed curves fitted with BIC for Sweden and Scotland respectively. Right panel: Plot of age vs. log(mortality) for Sweden and Scotland based on year 2000 and ages from 25 to 90. The red and blue points represent real data for Sweden and Scotland respectively, and the red and blue curves represent the smoothed curves fitted with BIC for Sweden and Scotland respectively.
The left panel of Figure 3.7 shows that mortality rates for Scotland are higher than for Sweden. Mortality rates for Sweden have decreased consistently since the mid 1970s, whereas mortality rates for Scotland, although they decreased for a period, then started to show an upward trend. This could be attributed to changes in living conditions.

Chapter 4 Pricing life insurance products using mortality rates

In Chapter 3 we discussed the methodology used to construct mortality rates from the Human Mortality Database and to smooth them using the MortalitySmooth package. Life insurance companies then use the smoothed, graduated data to price insurance products such as annuities and life insurance. The general decline in mortality poses one of the key challenges to actuaries in planning, estimating and designing public retirement schemes and life annuities so that the business runs smoothly. The calculation of the expected present values required for pricing and reserving long-term benefits also depends on projected mortality values. Careful projection reduces the risk of future insolvency and guards against wrong projections of future cost. Actuaries therefore use life tables to analyse risk and estimate it efficiently. In this chapter we discuss the methods involved in constructing life tables and commutation functions from mortality rates. These computed values are used to price different insurance products such as annuities, term annuities, deferred annuities, life insurance, term insurance, deferred insurance and so on.

4.1 Life insurance systems and commutation functions

In this section we briefly describe some of the basic insurance products used in the insurance industry and state the respective commutation functions. Most calculations involve computing expected present values of death benefits paid by the insurer or of periodic annuity payments made until the death of the policy holder.
We therefore define the basic notation as follows.

$v^x = (1+i)^{-x}$: discounted value for x years, where the interest rate i is assumed to be 0.04.
$l_x$: expected number of survivors at age x. We can assume $l_0$ to be 100000.
$d_x = l_x - l_{x+1}$: expected number of deaths between ages x and x + 1.

The commutation functions are $D_x = v^x l_x$ and $N_x = \sum_{k \ge 0} D_{x+k}$, $C_x = v^{x+1} d_x$ and $M_x = \sum_{k \ge 0} C_{x+k}$, $S_x = \sum_{k \ge 0} N_{x+k}$ and $R_x = \sum_{k \ge 0} M_{x+k}$.

4.2 Life annuity

Whole life annuity payable in advance: Payment of 1 made at the beginning of each year while the policy holder, who took out the policy at age x, is alive. $\ddot{a}_x = N_x / D_x$.

Whole life annuity payable in arrears: Payment of 1 made at the end of each year while the policy holder, who took out the policy at age x, is alive. $a_x = \ddot{a}_x - 1 = N_{x+1} / D_x$.

Whole life annuity payable continuously: Payment made continuously at a rate of 1 per year while the policy holder, who took out the policy at age x, is alive. A common approximation is $\bar{a}_x \approx \ddot{a}_x - \tfrac{1}{2}$.

n-year temporary annuity payable in advance: Payment of 1 made at the beginning of each year while the policy holder, who took out the policy at age x, is alive, for a maximum of n years. $\ddot{a}_{x:\overline{n}|} = (N_x - N_{x+n}) / D_x$.

n-year deferred annuity payable in advance: Payment of 1 made at the beginning of each year while the policy holder, who took out the policy at age x, is alive. The first payment is made to the policy holder at age x + n. The commutation function is given by ${}_{n|}\ddot{a}_x = N_{x+n} / D_x$.

Increasing annuity: Immediate annuity due paying 1 now, 2 in the next year and so on, provided the policy holder is alive when the payment is due. The commutation function is defined as $(I\ddot{a})_x = S_x / D_x$.

4.3 Life insurance

Whole life insurance: Death benefit of 1 payable at the end of the year of death of a policy holder currently aged x, for death occurring at any time in the future. $A_x = M_x / D_x$.

n-year term insurance: Death benefit of 1 payable at the end of the year of death of a policy holder currently aged x, for death occurring within n years. $A^1_{x:\overline{n}|} = (M_x - M_{x+n}) / D_x$.

n-year pure endowment: Benefit of 1 payable at the end of n years, provided the policy holder is still alive. ${}_nE_x = D_{x+n} / D_x$.

n-year endowment: Benefit of 1 payable at the end of the year of death if the policy holder dies within n years, or at the end of n years if the policy holder is still alive at age x + n.
An n-year endowment is therefore the sum of an n-year term insurance and an n-year pure endowment: $A_{x:\overline{n}|} = A^1_{x:\overline{n}|} + {}_nE_x = (M_x - M_{x+n} + D_{x+n}) / D_x$.

Increasing whole life insurance: Benefit payable at the end of the year of death of the policy holder, where the amount paid is k + 1 if the policy holder dies between ages x + k and x + k + 1. The commutation function is defined as $(IA)_x = R_x / D_x$.

4.4 R program implementation

In this section we explain the steps applied to price insurance products.

4.4.1 Construct life tables and commutation functions

The smoothed mortality data is used to compute other life table values such as $l_x$ and $d_x$. These vectors are in turn used to construct commutation function values such as $D_x$, $N_x$, $S_x$, $M_x$ and $R_x$. Finally, annuity and life insurance products are calculated, plotted and tabulated.

CalculateCommFunctions

Method Name
CalculateCommFunctions

Description
Constructs life table values and commutation function values from mux and returns a list of commutation function variables.

Implementation
CalculateCommFunctions (mux)

Arguments
mux: Vector of smoothed mortality data.

Details
The function CalculateCommFunctions returns the computed commutation function values. The radix $l_0$ is assumed to be 100000, and the values of mux are used to compute $l_x$. These values are looped over to calculate the respective commutation function variables, which are returned as a list.

Computation and graphical representation of life insurance products

Whole life annuity

Method Name
ComputeAnnuity.Life

Description
Returns a vector containing the computed whole life annuity payable in advance. The interest rate is assumed to be 4%.

Implementation
ComputeAnnuity.Life (index, CommFunc)

Arguments
index: Length of the annuity vector.
CommFunc: List containing the values of the commutation variables required to compute annuity values.

Details
The function calculates the life annuity using the $N_x$ and $D_x$ vectors supplied as a list in the CommFunc parameter.

Figure 4.1 Plot of age vs. annuity prices for males and females based on year 2000 and ages from 20 to 90.
The red and blue curves represent the smoothed fitted curves for males and females respectively. The left panel shows the plot for Sweden and the right panel shows the plot for Scotland.

From Figure 4.1 we infer that annuity prices for males and females in Scotland are higher than those for males and females in Sweden. We relate this to the difference in mortality discussed in Section 3.5, where mortality rates in Sweden are lower than in Scotland. In general, annuity prices for males are also higher than for females in each country, which we relate to the higher male mortality rates discussed in Section 3.5.

ComputeWholeInsurance.Life

Method Name
ComputeWholeInsurance.Life

Description
Returns a vector containing the computed whole life insurance values.

Implementation
ComputeWholeInsurance.Life (index, CommFunc)

Arguments
index: Length of the annuity vector.
CommFunc: List containing the values of the commutation variables required to compute whole life insurance values.

Details
The function calculates whole life insurance using the commutation vectors supplied in CommFunc, as described in the previous section.

Figure 4.2 Plot of age vs. whole life insurance prices for males and females based on year 2000 and ages from 20 to 90. The red and blue curves represent the smoothed fitted curves for males and females respectively. The left panel shows the plot for Sweden and the right panel shows the plot for Scotland.

From Figure 4.2 we infer that whole life insurance prices increase as age increases, and from the y axis scales we can infer that whole life insurance prices in Scotland are higher than in Sweden. In general, whole life insurance is less expensive for females than for males because of the lower female mortality rates discussed in Section 3.5.

Compute Increasing WholeInsurance.Life

Method Name
ComputeIncreasingWholeInsurance.Life

Description
Returns a vector containing the computed increasing whole life insurance values.

Implementation
ComputeIncreasingWholeInsurance.Life (index, CommFunc)

Arguments
Index: Length of the annuity vector.
CommFunc: List containing the values of the commutation variables required to compute increasing whole life insurance values.

Details
The function calculates increasing whole life insurance using the commutation vectors supplied in CommFunc, as described in the previous section.

Figure 4.3 Plot of age vs. increasing whole life insurance prices for males and females based on year 2000 and ages from 20 to 90. The red and blue curves represent the smoothed fitted curves for males and females respectively. The left panel shows the plot for Sweden and the right panel shows the plot for Scotland.

From Figure 4.3 we infer that increasing whole life insurance prices rise with age until age 60 and then decrease rapidly until age 90. From the y axis scales we can infer that prices in Scotland are higher than in Sweden. In general, increasing whole life insurance is less expensive for females than for males, because of the lower female mortality rates discussed in Section 3.5, but the two converge as age approaches 90.

Compute Increasing Annuity.Life

Method Name
ComputeIncreasingAnnuity.Life

Description
Returns a vector containing the computed increasing life annuity values. The interest rate is assumed to be 4%.

Implementation
ComputeIncreasingAnnuity.Life (index, CommFunc)

Arguments
Index: Length of the annuity vector.
CommFunc: List containing the values of the commutation variables required to compute increasing annuity values.

Details
The function calculates the increasing annuity using the commutation vectors supplied in CommFunc, as described in the previous section.

Figure 4.4 Plot of age vs. increasing annuity prices for males and females based on year 2000 and ages from 20 to 90. The red and blue curves represent the smoothed fitted curves for males and females respectively. The left panel shows the plot for Sweden and the right panel shows the plot for Scotland.

From Figure 4.4 we infer that increasing annuity prices decrease as age increases. Increasing annuity prices in Scotland are also slightly higher than in Sweden.
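The four product functions of Section 4.4 all reduce to ratios of commutation vectors. The base R sketch below illustrates that calculation under stated assumptions; it is not the code in Programs/Graduation_Methods.r. The names comm_functions and life_products are hypothetical, the conversion $q_x = 1 - e^{-\mu_x}$ assumes a constant force of mortality within each year of age, the table is closed at the last age supplied, and the mortality rates are illustrative rather than HMD data.

```r
# Hypothetical helper: life table and commutation functions from a vector of
# smoothed mortality rates (one entry per consecutive age).
comm_functions <- function(mux, i = 0.04, radix = 100000) {
  n  <- length(mux)
  qx <- 1 - exp(-mux)   # assumption: constant force of mortality within each year of age
  qx[n] <- 1            # assumption: the table is closed at the last age
  lx <- radix * cumprod(c(1, 1 - qx))[seq_len(n)]  # survivors at each age
  dx <- lx * qx                                    # deaths in each year of age
  v  <- 1 / (1 + i)
  k  <- seq_len(n) - 1  # years since the first age; a common power of v cancels in all ratios
  rev_cumsum <- function(z) rev(cumsum(rev(z)))    # sum from each age to the end of the table
  Dx <- v^k * lx
  Cx <- v^(k + 1) * dx
  Nx <- rev_cumsum(Dx)
  Mx <- rev_cumsum(Cx)
  list(Dx = Dx, Nx = Nx, Sx = rev_cumsum(Nx),
       Cx = Cx, Mx = Mx, Rx = rev_cumsum(Mx))
}

# Hypothetical helper: expected present values per unit of benefit at each age.
life_products <- function(cf) {
  data.frame(
    annuity_due          = cf$Nx / cf$Dx,  # whole life annuity payable in advance
    increasing_annuity   = cf$Sx / cf$Dx,  # increasing annuity due
    whole_life_insurance = cf$Mx / cf$Dx,  # whole life insurance
    increasing_insurance = cf$Rx / cf$Dx   # increasing whole life insurance
  )
}

age <- 20:90
mux <- exp(-9.5 + 0.09 * age)   # illustrative Gompertz-type rates, not HMD data
prices <- life_products(comm_functions(mux))
head(cbind(age, round(prices, 3)))
```

With the table closed at the last age, the standard identity $A_x = 1 - d\,\ddot{a}_x$ with $d = i/(1+i)$ holds exactly and gives a quick consistency check on the computed vectors.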
In general, increasing annuity prices for females are lower than for males, but the two converge as age approaches 90.

Conclusions

In this dissertation, we set out to show how R packages such as actuar, MortalitySmooth and MASS can be used to model aggregate loss claims and human mortality. We used compound distributions to model aggregate claims with actuar, and P-spline smoothing techniques to smooth mortality data with the MortalitySmooth package. We illustrated these concepts using real data, namely the Danish data for aggregate claims and the Human Mortality Database for Scotland and Sweden, and priced life insurance products from the latter.

In Chapter 2 we presented the general background to compound distributions for modelling aggregate claims and performed a simulation using the compound Poisson distribution. Our analysis, based on a goodness-of-fit test, suggested that the Weibull distribution fits the loss claim distribution well. Finally, we analysed Danish loss insurance data from 1980 to 1990, used the negative binomial distribution for the number of claims, simulated 1000 samples using the Gamma distribution, and concluded from the histogram and the chi-square goodness-of-fit test that the Gamma distribution gave a better fit.

In Chapter 3 we briefly explained the concepts of graduation and generalised linear models. Smoothing techniques using P-splines were presented, and the smoothing parameter was chosen using the Bayesian information criterion. We obtained deaths and exposure data from the Human Mortality Database for the selected countries, Sweden and Scotland, and smoothed the mortality rates using the MortalitySmooth package in R. Graphs showing the actual data and the mortality data smoothed using the Bayesian information criterion and using smoothing parameter lambda = 10000 were presented for the selected countries.
We also compared mortality rates across different groups, such as males and females within a country, or total mortality rates across Sweden and Scotland, over a given range of ages or years. We concluded that mortality rates for Scotland are higher than for Sweden and that, in general, mortality rates for males are higher than for females.

In Chapter 4 we looked at various life insurance and pension related products widely used in the insurance industry, and constructed life tables and commutation functions to compute annuity values from the smoothed data derived with the methods discussed in Chapter 3. We compared and plotted prices for some of the insurance products and concluded that whole life annuity prices decrease as age increases and that annuity prices for males are higher than for females.