Post on 02-Jun-2018
8/10/2019 Slides Casarin Monte carlo
Contents
1 A Matlab Primer
  1.1 Programming Languages
  1.2 Fourth Generation Languages (4GPL)
  1.3 Matlab
    1.3.1 Operators
    1.3.2 Logical Operators
    1.3.3 Creating Matrices
    1.3.4 Matrix Description
    1.3.5 Other Functions
    1.3.6 Loops and If Statements
    1.3.7 Procedures
  1.4 Examples
    1.4.1 Input, Output and Graphics
    1.4.2 Ordinary Least Squares
    1.4.3 A Bayesian Linear Regression Model
  1.5 From Matlab to Scilab and R
2 Monte Carlo Integration
  2.1 Integration
  2.2 A Monte Carlo Estimator
  2.3 Asymptotic Properties
  2.4 Optimal Number of MC Samples
  2.5 Appendix - Matlab Code
3 Importance Sampling
  3.1 Importance Sampling
  3.2 Properties of the IS Estimators
  3.3 Generating Student-t Variables
Chapter 1
A Matlab Primer
Aim
Learn some basic facts in Matlab programming
Contents
1. Programming Languages
2. Fourth Generation Languages (4GPL)
3. Matlab
4. Examples
5. From Matlab to Scilab and R
1.1 Programming Languages
If you need to carry out an econometric analysis, it may be worth looking at
the following link before you start writing code:
http://www.feweb.vu.nl/econometriclinks/software.html
It links to many of the most widely used econometrics software packages and
their contributed libraries.
Below we briefly describe some of the software packages listed on the
econometriclinks webpage maintained by the Royal Economic Society:
A+, ACML, ADMB, AIMMS, ALOGIT, Alyuda, AMOS, AMPL, APL, Apophenia, Arc, AREMOS, AutoBox, Autometrics, AutoSignal
B34S, BACC, BATS, BETA, BIOGEME, BMDP, Brodgar, BUGS, BV4
BACC: Bayesian Analysis, Computation and Communication. Free, high-quality
generic software developed for different operating systems (Windows, Unix)
and different front-ends. Specific model procedures as well. Supported by
the US NSF. Developed by Bill McCausland under the supervision of John
Geweke.
BUGS: Bayesian inference Using Gibbs Sampling (MCMC: Markov Chain
Monte Carlo)
C(++), CART, Census X12, Caterpillar-SSA, CPLEX, ConfortS, CVar
DataDesk, Dataplore, Dataplot, DATAVIEW, DEA-Solver, DEMETRA, Draco, DYALOG, DYNARE
DYNARE: A Program for the Resolution and Simulation of Dynamic Models
with Forward Variables Through the Use of a Relaxation Algorithm. Computes
k-th order approximations of dynamic stochastic general equilibrium
(DSGE) models. Also allows Bayesian estimation of DSGE models.
EasyFit, EasyReg, EcoWin, ECTS, EQS, Eviews, Excel, EXPO
FAME, ForecastPro, Fortran, FreeFore, FSQP
GAMS, GARCH, GAUSS, GAUSSX, GiveWin, Gempack, GeoDa, Genstat, GLIM, GLIMMIX, GQOPT, graphpad, Gnuplot, GSL, GRETL
GAMS: Generic Algebraic Modeling System for large scale optimization
problems.
GAUSS: a programming language designed to operate with and on matrices.
It is a general purpose tool. As such, it is a long way from more
specialised econometric packages. On a spectrum which runs from the
computer language C at one end to, say, the menu-driven econometric program
EViews at the other, GAUSS is very much at the programming end.
GRETL: a cross-platform software package for econometric analysis, written
in the C programming language. It is free, open-source software.
HLM
ICRFS-Plus, ILOG, IDAMS, IMSL, INSTAT, ITSM
J, JMP, JMulti, JStatCom, JWAVE
KNITRO
MacAnova, Maple, Mendeley, MARS, Mathcad, Mathematica, MathPlayer, MathML, MathType, MATLAB, Matrixer, M@ximize, MetrixND,
MHTS, Microfit, MiKTeX, Minitab, MINOS, MIXOR, MLE, MLwiN,
Modeleasy, ModelQED, Modler, MOSEK, Mplus, Modula, MuPAD,
Mx.
MATLAB: a high-level language and more specifically a 4GPL (such as
SAS, SPSS, Stata, GAUSS) which allows matrix manipulations for numerical
computing.
NAG Mark 22 Numerical Libraries (2009), Genstat, MLP (ML estimation)
Octave, O-Matrix, Omegahat, OpenDX, Ox, OxEdit, OxGauss, OxMetrics
Octave: a high-level language, primarily intended for numerical computations.
It provides a convenient command line interface for solving linear and
nonlinear problems numerically, and for performing other numerical experiments
using a language that is mostly compatible with Matlab. It may also
be used as a batch-oriented language.
Ox: an object-oriented matrix programming language for statistics and
econometrics developed by Jurgen Doornik.
PASS, PASW, PcFiml, PcGets, PcGive, PcNaive, Python
Python: Free Open Source Dynamic object-oriented programming language
that can be used for many kinds of software development. It offers strong
support for integration with other languages and tools, and comes with
extensive standard libraries.
R, RATS, REG-X, ReSampling Stats, Rlab, Rlab+
R: a 4GPL and a free software environment for statistical computing and
graphics. It compiles and runs on a wide variety of UNIX platforms,
Windows and MacOS.
RATS: developed by Estima, RATS (Regression Analysis of Time Series) is
an econometrics and time-series analysis software package.
S+, SAS, SCA, Scilab, SciPy, SciViews, Sciword, SCP, Shazam, Sigmaplot,
SIMSTAT, SOLAS, SOL, Soritec, SpaceStat, SQlite, SPAD, Speakeasy,
IBM SPSS, SsfPack, STAMP, Stata, StatCrunch, Statgraphics, Statistica,
Stat/Transfer, StatsDirect, STL, Statview, SUDAAN, SVAR,
SYSTAT
SAS: a 4GPL which allows the user to define a sequence of operations
(statistical analysis and data management) to be performed on data.
Scilab: a free and open source 4GPL for numerical computation, similar to
Matlab.
TSM, TISEAN, TRAMO/SEATS, TSP, TVAR
TRAMO/SEATS:
UNISTAT, VassarStats, ViSta
Web Decomp, WebStat, WEKA, WinIDAMS, WINKS, Windows KWIK-STAT, XploRe, Winsolve, X-12-ARIMA, XLisp-Stat, Xtremes, X(G)PL
1.2 Fourth Generation Languages (4GPL)
Each step in the development of computer languages has aimed to reduce
both the time and the skill required to write programs.
In a 1GPL, programs are written in binary code and can access
binary digits. Writing programs in a 1GPL is a very skilled job, and
testing and debugging them is very time-consuming.
In a 2GPL, programs are written in symbolic assembly code; they
access bytes and are slightly less time-demanding.
In a 3GPL, programs are written in a high-level language (e.g.
COBOL, Pascal, C, Fortran); they can access records, and programming
requires less time and skill.
In a 4GPL, programs perform Boolean operations on (mathematical) sets
and require still less time and skill. A well-known example of a
4GPL is SQL.
Scilab, Matlab, Gauss and R, see
http://www.scilab.org/
http://www.mathworks.it/
http://www.aptech.com/
http://www.r-project.org/
are 4GPLs and have some common features. They are a long way from more
specialised econometric packages, are not menu-driven programs (such as
EViews) and are very much at the programming end. Thus all of them require
a certain degree of familiarity with programming methods and structures.
Another common feature is that they are extremely powerful for matrix
manipulation, and in this sense they are more useful for economists than the
3GPL programming languages (such as C or Fortran), where the basic data
units are all scalars. At the same time they are very flexible and allow more
expert users to interface with procedures written in other languages such
as C, C++, or Fortran.
An important feature of Scilab and R is that the source code of their
libraries is available, which is not generally the case for Matlab and Gauss.
Finally, note that Matlab, Gauss and R have many proprietary and
contributed libraries oriented to statistics and econometrics.
1.3 Matlab
1.3.1 Operators
Select a submatrix from a matrix: x( startrow : endrow, startcolumn : endcolumn )
Transposition operator: '
Matrix operators: + - * / \ ^
Element-by-element operators: .* ./ .\ .^
Concatenating operators: [leftmatrix, rightmatrix] [uppermatrix; bottommatrix]
Relational operators (applied element by element): < > == ~= >= <=
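The operators listed above can be tried on two small matrices. The following is only an illustrative sketch: the matrices x and y and all variable names are made up for the example and do not come from the course material.

```matlab
% Small matrices used only for illustration (hypothetical values)
x = [1 2 3; 4 5 6; 7 8 9];
y = [1 0; 0 1];

sub   = x(1:2, 2:3);   % submatrix: rows 1-2, columns 2-3
xt    = x';            % transposition
mprod = sub * y;       % matrix product (y is the 2 x 2 identity)
elem  = sub .* sub;    % element-by-element product
wide  = [sub, y];      % horizontal concatenation, 2 x 4
tall  = [sub; y];      % vertical concatenation, 4 x 2
mask  = (x ~= 5);      % relational operator, applied element by element
sol   = y \ [1; 2];    % left division: solves y*sol = [1; 2]
```

Note that * and \ follow the rules of linear algebra, while the dotted versions act entry by entry, so the operands of .* must have compatible sizes.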
y = ceil( x );
y = floor( x );
y = reshape( x,r,c );
Kronecker product: kron( x , y )
y = trimr( x,t,b );
1.3.6 Loops and If Statements
for i=start:step:stop;
...
end;
while logical expression;
...
end;
if logical expression 1;
...
elseif logical expression 2;
...
else;
...
end;
Example of a while loop with counter:
i=1;
while (i<=100);
...
i=i+1;
end;
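A runnable version of these templates is sketched below. The task (summing odd and even integers) and the variable names are chosen only for illustration.

```matlab
% for loop with an explicit step: sum of the odd numbers 1, 3, ..., 99
total_odd = 0;
for i = 1:2:99
    total_odd = total_odd + i;
end

% while loop with a counter and an if/else branch:
% split the integers 1..100 into even and odd sums
i = 1;
sum_even = 0;
sum_odd  = 0;
while (i <= 100)
    if mod(i, 2) == 0
        sum_even = sum_even + i;
    else
        sum_odd = sum_odd + i;
    end
    i = i + 1;
end
```

The loop condition uses the relational operator <=; a single = is an assignment and cannot be used as a test.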
end;
end;
%*************************************************
% Some Pictures...
%*************************************************
% figure(1) to have distinct graphs
% (titles and labels are set after plot, because plot resets the axes)
figure(1);
plot(xx,yy);
title('Time series data');
ylabel('Data');
xlabel('Time');
figure(2);
a=plot(xx,s,'color',[1 0 0]); %[red green blue] the rgb convention
title('Time-varying log-volatility');
axis([1 n min(s) max(s)]); % Set axis limits
figure(3);
plot(xx,d,'color',[1 0 0]); %[red green blue] the rgb convention
title('Dummy');
axis([1 n -0.1 1.1]); % Set axis limits
%*************************************************
% All charts in one picture...
%*************************************************
figure(4);
subplot(3,1,1);
plot(xx,yy);
title('Time series data');
ylabel('Data');
xlabel('Time');
subplot(3,1,2);
plot(xx,s,'color',[1 0 0]); %[red green blue] the rgb convention
title('Time-varying log-volatility');
axis([1 n min(s) max(s)]); % Set axis limits
subplot(3,1,3);
plot(xx,d,'color',[1 0 0]); %[red green blue] the rgb convention
title('Dummy');
axis([1 n -0.1 1.1]); % Set axis limits
%*************************************************
% histogram
%*************************************************
figure(5);
hist(yy,50);
%*************************************************
% Save the results in an output file
%*************************************************
fid = fopen(['C:/Dottorato/Teaching/SummerSchoolBertinoro/',...
'TutorialAntonietta/TutorialRobAnt/AllLab/MatlabCode/ChapterMatlab/OutPound.txt'], 'w');
fprintf(fid, '%5.2f\n', yy);
fclose(fid);
%*************************************************
1.4.2 Ordinary Least Squares
We learn how to use structures in Matlab.
function results=ols(y,x)
% PURPOSE: least-squares regression
%---------------------------------------------------
% USAGE: results = ols(y,x)
% where: y = dependent variable vector (nobs x 1)
%        x = independent variables matrix (nobs x nvar)
%---------------------------------------------------
% RETURNS: a structure
% results.meth = 'ols'
% results.beta = bhat
% results.tstat = t-stats
% results.yhat = yhat
% results.resid = residuals
% results.sige = e'*e/(n-k)
% results.rsqr = rsquared
% results.rbar = rbar-squared
% results.dw = Durbin-Watson Statistic
% results.nobs = nobs
% results.nvar = nvars
% results.y = y data vector
Check for the correct number of input arguments and that the number of
rows of x is equal to the number of rows of y
if (nargin ~= 2); error('Wrong # of arguments to ols');
else
[nobs nvar] = size(x); [nobs2 ndep] = size(y);
if (nobs ~= nobs2); error('x and y must have same # obs in ols');
end;
end;
k=nvar;
Evaluate all the statistics that are usually involved in an OLS estimation
results.meth = 'ols';
results.y = y;
results.nobs = nobs;
results.nvar = nvar;
%
xpxi = (x'*x)\eye(k); % inverse of x'x
results.beta = xpxi*(x'*y);
results.yhat = x*results.beta;
results.resid = y - results.yhat;
sigu = results.resid'*results.resid; % sum of squared residuals
results.sige = sigu/(nobs-nvar);
tmp = (results.sige)*(diag(xpxi));
results.tstat = results.beta./(sqrt(tmp));
ym = y - mean(y);
rsqr1 = sigu; rsqr2 = ym'*ym;
results.rsqr = 1.0 - rsqr1/rsqr2; % r-squared
rsqr1 = rsqr1/(nobs-nvar);
rsqr2 = rsqr2/(nobs-1.0);
results.rbar = 1 - (rsqr1/rsqr2); % rbar-squared
ediff = results.resid(2:nobs) - results.resid(1:nobs-1);
results.dw = (ediff'*ediff)/sigu; % durbin-watson
end;
We save the code as a function in the file ols.m and run the following
simulation example
nob=100;
x1=ones(nob,1);
x2=randn(nob,1).*((1:nob)'/10); % column vector, heteroscedastic regressor
x=[x1 x2];
sig=2;
y=x*[10; 0.9]+sig*randn(nob,1);
res=ols(y,x);
res.beta
%%
figure(1)
plot([res.yhat y]);
figure(2)
plot(res.resid);
1.4.3 A Bayesian Linear Regression Model
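Let $y \in \mathbb{R}^n$, $X \in \mathbb{R}^{n \times k}$ and $\beta \in \mathbb{R}^k$. Consider the simple regression model
$$y = X\beta + \varepsilon \qquad (1.1)$$
$$\varepsilon \sim \mathcal{N}_n(0_n, \sigma^2 I_n) \qquad (1.2)$$
with the following prior specification
$$R\beta \sim \mathcal{N}(r, T) \qquad (1.3)$$
or equivalently
$$Q\beta \sim \mathcal{N}(q, I_k) \qquad (1.4)$$
where $Q'Q = T^{-1}$ and $q = Qr$.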
0.9610 11.2966 0
0.9200 11.2740 0
Theil-Goldberger estimates
1.0037
0.9569
0.9198
We now apply the inference procedure to a financial dataset. We consider
monthly data on the short-term interest rate (the three-month Treasury Bill
rate) and on the AAA corporate bond yield in the USA. As Treasury Bills
and AAA bonds are both low-risk securities, one could expect a
relationship between their interest rates. We consider data from January
1950 to December 1999.
Let $y_i$ be the monthly change in the Treasury Bill rate and $z_i$ the monthly
change in the AAA bond rate. We fit to this set of data the heteroscedastic
model presented above with
$$y_i = \beta_1 + \beta_2 z_i + \varepsilon_i$$
which corresponds to setting $x_i = (1, z_i)'$ and $\beta = (\beta_1, \beta_2)'$
in the multivariate
regression model given above. The results of the estimation procedure are
Gibbs sampling estimates
Coefficient t-statistic t-probability
0.0053 0.7805 0.2177
0.2751 19.8628 0
Theil-Goldberger estimates
0.0057
0.2747
Figure 1.1: Actual and fitted data (top) and residuals (bottom) using the Bayesian estimates of the linear regression model.
The estimates of $\sigma^2$ are 0.0283 for the Gibbs sampler and 0.0282 for the
Theil-Goldberger procedure.
The actual and fitted data and the residuals are given in Fig. 1.1. The
plot of the residuals shows that in the second half of the sample (say after
1975) the variance is underestimated. More precisely, one should account
in the model for the time variation in the variance of the data. This calls
for heteroscedastic linear regression models (see Chapter ??) or for nonlinear
models such as stochastic volatility models (see Chapter ?? and ??).
References
Gelfand, Alan E., and A.F.M. Smith. 1990. Sampling-Based Approaches
to Calculating Marginal Densities, Journal of the American Statistical
Association, Vol. 85, pp. 398-409.
//*************************************************
// Some Pictures...
//*************************************************
// figure(1) to have distinct graphs
figure(1);
title("Time series data");
ylabel("Data");
xlabel("Time");
plot(xx,yy);
figure(2);
title("Time-varying log-volatility");
plot(xx,s,"color",[1 0 0]); //[red green blue] the rgb convention
a=gca();
a.data_bounds=[1,min(s);n,max(s)];// Set axis bounds
figure(3);
title("Dummy");
plot(xx,d,"color",[1 0 0]); //[red green blue] the rgb convention
a=gca();
a.data_bounds=[1,-0.1;n,1.1];// Set axis bounds
//*************************************************
// All charts in one picture...
//*************************************************
figure(4);
subplot(3,1,1);
title("Time series data");
ylabel("Data");
xlabel("Time");
plot(xx,yy);
subplot(3,1,2);
title("Time-varying log-volatility");
plot(xx,s,"color",[1 0 0]); //[red green blue] the rgb convention
a=gca();
a.data_bounds=[1,min(s);n,max(s)];// Set axis bounds
subplot(3,1,3);
title("Dummy");
plot(xx,d,"color",[1 0 0]); //[red green blue] the rgb convention
a=gca();
a.data_bounds=[1,-0.1;n,1.1];// Set axis bounds
//*************************************************
// histogram
//*************************************************
figure(5);
histplot(100,yy);
//*************************************************
// Save the results in an output file
//*************************************************
outfile="C:/Dottorato/Teaching/SummerSchoolBertinoro/TutorialAntonietta/"+..
"TutorialRobAnt/AllLab/MatlabCode/ChapterMatlab/OutPound.txt";
fprintfMat(outfile,yy,"%5.2f");
// attention: this overwrites the existing file
R
#*************************************************
# basics of I/O, graphical and statistical procedures
#*************************************************
# Load UK/EU exchange rate data
yy=scan(paste0("C:/Dottorato/Teaching/SummerSchoolBertinoro/TutorialAntonietta/",
"TutorialRobAnt/AllLab/MatlabCode/ChapterMatlab/pound.txt"),sep="\t",skip=0,na.strings=".")
dim(yy)=c(1006,1);
#*************************************************
n=dim(yy); # evaluate the number of rows #
n=n[1];
xx=(1:n);
#*************************************************
# for endfor if end
# (1) Evaluate sequentially the variance
# (2) Build a dummy variable, based on the value
# of the variance estimated recursively
#*************************************************
wn=10; # set the value of a variable #
s=array(0,n); # define a n-dim null vector #
d=array(0,n);
for (j in ((wn+1):n)){
s[j]=var(yy[(j-wn+1):j]);
if (s[j]>0.45){
d[j]=1;
}
}
#*************************************************
# Some Pictures...
#*************************************************
# dev.new() to have distinct graphs
dev.new();
plot(xx,yy,main="Time series data",xlab="Time",ylab="Data",type="l");
dev.new();
plot(xx,s,main="Time-varying log-volatility",xlab="Time",ylab="Data",type="l",col=rgb(1,0,0));
#[red green blue] the rgb convention
dev.new();
plot(xx,d,main="Dummy",xlab="Time",ylab="Data",type="l",col=rgb(1,0,0));
#[red green blue] the rgb convention
#*************************************************
# All charts in one picture...
#*************************************************
par(mfrow=c(3,1),pin=c(5,1.5));
plot(xx,yy,main="Time series data",xlab="Time",ylab="Data",type="l");
plot(xx,s,main="Time-varying log-volatility",xlab="Time",ylab="Data",type="l",col=rgb(1,0,0));
#[red green blue] the rgb convention
plot(xx,d,main="Dummy",xlab="Time",ylab="Data",type="l",col=rgb(1,0,0));
#[red green blue] the rgb convention
#*************************************************
# histogram
#*************************************************
dev.new();
hist(yy,50);
#*************************************************
# Save the results in an output file
# (save() writes R's binary format, not formatted text)
#*************************************************
save(yy, file = paste0("C:/Dottorato/Teaching/SummerSchoolBertinoro/TutorialAntonietta/",
"TutorialRobAnt/AllLab/MatlabCode/ChapterMatlab/OutPound.txt"));
Chapter 2
Monte Carlo Integration
Aim
Apply basic Monte Carlo principles to solve some basic integration problems. Discuss the choice of the number of samples in a Monte Carlo estimation.
Contents
1. Integration
2. A Monte Carlo Estimator
3. Asymptotic Properties
4. Optimal Number of MC Samples
5. Appendix - Matlab Code
2.1 Integration
Our aim is to approximate the integral
$$\mu(f) = \int_0^1 f(x)\,dx \qquad (2.1)$$
for the following integrand functions $f$
1. $f(x) = x$
2. $f(x) = x^2$
3. $f(x) = \cos(\pi x)$
We apply a Monte Carlo approach and rewrite the integration problem in statistical terms as follows
$$\int_0^1 f(x)\,dx = \int_{-\infty}^{+\infty} f(x)\,\mathbb{I}_{[0,1]}(x)\,dx = \mathbb{E}(f(X)) \qquad (2.2)$$
where $\mathbb{I}_A(x)$ is the indicator function, which equals 1 if $x \in A$ and 0 otherwise, and $X \sim \mathcal{U}_{[0,1]}$ is a random variable with a standard uniform distribution.
2.2 A Monte Carlo Estimator
Let $X_1, \ldots, X_n$ be a set of $n$ i.i.d. samples from a uniform distribution. The integral $\mu = \mathbb{E}(f(X))$ is approximated as follows
$$\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^{n} f(X_i) \qquad (2.3)$$
which is called a Monte Carlo estimator of $\mathbb{E}(f(X))$.
The results of the Monte Carlo estimates for different sample sizes n = 1, . . . , 50 and different integrand functions f are given in Fig. 2.1.
Find the mean and the variance of the estimator and give a Monte Carlo approximation for the expression of the variance.
where
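$$\sigma^2(f) = \mathbb{V}(f(X_1)) = \int_{-\infty}^{+\infty} (f(x) - \mu)^2\,\mathbb{I}_{[0,1]}(x)\,dx$$
For the different $f$ we find the analytical solution of the integral $\mu(f)$ (see
also the horizontal dotted lines in Fig. 2.1)
1. For $f(x) = x$
$$\mathbb{E}(f(X_1)) = \int_0^1 x\,dx = \left[\tfrac{1}{2}x^2\right]_0^1 = 1/2 \qquad (2.6)$$
2. For $f(x) = x^2$
$$\mathbb{E}(f(X_1)) = \int_0^1 x^2\,dx = \left[\tfrac{1}{3}x^3\right]_0^1 = 1/3 \qquad (2.7)$$
3. For $f(x) = \cos(\pi x)$
$$\mathbb{E}(f(X_1)) = \int_0^1 \cos(\pi x)\,dx = \left[\tfrac{1}{\pi}\sin(\pi x)\right]_0^1 = 0 \qquad (2.8)$$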
2.3 Asymptotic Properties
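Under the i.i.d. and finite variance assumptions we have
$$\hat{\mu}_n \xrightarrow[n \to \infty]{a.s.} \mu \qquad (2.9)$$
$$\sqrt{n}\,(\hat{\mu}_n - \mu) \xrightarrow[n \to \infty]{D} \mathcal{N}(0, \sigma^2(f)) \qquad (2.10)$$
For the different $f$ we have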
1. For $f(x) = x$
$$\mathbb{V}(f(X_1)) = \mathbb{E}(f(X_1)^2) - (\mathbb{E}(f(X_1)))^2 = \int_0^1 x^2\,dx - \left(\int_0^1 x\,dx\right)^2 = 1/3 - 1/4 = 1/12$$
2. For $f(x) = x^2$
$$\mathbb{V}(f(X_1)) = 1/5 - 1/9 = 4/45$$
3. For $f(x) = \cos(\pi x)$
$$\mathbb{V}(f(X_1)) = 1/2 - 0 = 1/2$$
When the variance $\mathbb{V}(f(X_1))$ is unknown one can use the Monte Carlo
estimator
$$\hat{\sigma}^2_n(f) = \frac{1}{n-1}\sum_{i=1}^{n} (f(X_i) - \hat{\mu}_n)^2 \qquad (2.11)$$
The empirical approximations of the asymptotic variances are given in Fig.
2.2.
Exercise: use the asymptotic distribution and the approximation of the
asymptotic variance to find confidence intervals at the 5% level (that is,
95% confidence intervals) for the MC estimator of $\mu$.
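As a possible starting point for the exercise, the sketch below builds an asymptotic confidence interval for the integrand f(x) = x^2. The sample size, the level alpha = 0.05 and the variable names are assumptions made for the illustration; the normal quantile is computed with the base function erfcinv so that no toolbox is needed.

```matlab
n     = 10000;                     % number of MC samples (arbitrary)
alpha = 0.05;                      % 1 - alpha is the confidence level

fx     = rand(n,1).^2;             % f(X_i) with f(x) = x^2, X_i ~ U[0,1]
mu_hat = mean(fx);                 % Monte Carlo estimate of the integral
s2_hat = var(fx);                  % variance estimate with 1/(n-1) normalisation

% standard normal quantile of order 1 - alpha/2 (about 1.96 for alpha = 0.05)
z = -sqrt(2)*erfcinv(2*(1 - alpha/2));

% interval implied by the central limit theorem
ci = mu_hat + [-1 1]*z*sqrt(s2_hat/n);
disp(ci);
```

The interval is random: over repeated runs it should contain the exact value 1/3 in roughly 95% of the cases, which can itself be checked by simulation.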
2.4 Optimal Number of MC Samples
It is possible to use the asymptotic properties of a MC estimator to find
the optimal number $n$ of samples that are necessary to reach an accuracy
level $\varepsilon$, for a given confidence level $\alpha$, in the Monte Carlo estimation of $\mu$.
Figure 2.2: Monte Carlo variance estimates $\hat{\sigma}^2_n$ (solid lines, Empirical Variance) for different sample sizes n = 1, . . . , 50 and the true value $\sigma^2$ (horizontal dotted lines, Theoretical Variance). Panels: f(x)=x, f(x)=x^2, f(x)=cos(\pi x).
The asymptotic results allow us to find $n$ such that
$$\Pr\left(|\hat{\mu}_n - \mu| \le X_\alpha\sqrt{\sigma^2(f)/n}\right) = 1 - \alpha \qquad (2.12)$$
that is, setting the half-width of the interval equal to $\varepsilon$,
$$X_\alpha = \frac{\sqrt{n}}{\sqrt{\sigma^2(f)}}\,\varepsilon \;\Rightarrow\; n = \frac{X_\alpha^2}{\varepsilon^2}\,\sigma^2(f) \qquad (2.13)$$
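where $X_\alpha = \Phi^{-1}(1 - \alpha/2)$, with $\Phi^{-1}$ the inverse cumulative distribution
function of a standard normal.
When the variance $\sigma^2(f)$ is unknown one can use the Monte Carlo
estimator $\hat{\sigma}^2_n(f)$ and then apply a similar asymptotic argument. In this case the
optimal number of simulations should satisfy the following relationship
$$\hat{\sigma}^2_n(f) \le \frac{n\,\varepsilon^2}{X_\alpha^2} \qquad (2.14)$$
One can check the condition iteratively.
1. Start with $n_1$ MC samples $X_1, \ldots, X_{n_1}$
2. If $\hat{\sigma}^2_{n_1}(f) \le n_1\varepsilon^2 / X_\alpha^2$ then stop, otherwise
3. evaluate $k_1 = \lfloor \hat{\sigma}^2_{n_1}(f)\,X_\alpha^2/\varepsilon^2 \rfloor - n_1$ and generate $k_1$ samples $X_{n_1+1}, \ldots, X_{n_1+k_1}$ ($\lfloor x \rfloor$ indicates the integer part of $x$)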
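The iterative check can be sketched as follows for f(x) = x. This is an illustration, not a full solution: the starting size n = 100, the accuracy eps_acc = 0.01 (coarser than the value used in the exercise, to keep the run short) and all variable names are assumptions, and the guard max(k,1) is added only to guarantee that every pass adds at least one sample.

```matlab
f       = @(x) x;                  % integrand
eps_acc = 0.01;                    % accuracy level (assumed value)
alpha   = 0.05;                    % confidence level parameter
z = -sqrt(2)*erfcinv(2*(1 - alpha/2));   % normal quantile X_alpha

n  = 100;                          % step 1: initial number of samples
fx = f(rand(n,1));

% step 2: stop as soon as the estimated variance satisfies condition (2.14)
while var(fx) > n*eps_acc^2/z^2
    % step 3: number of additional samples suggested by the current estimate
    k  = floor(var(fx)*z^2/eps_acc^2) - n;
    k  = max(k, 1);                % guard: always add at least one sample
    fx = [fx; f(rand(k,1))];       % generate the new samples
    n  = n + k;
end
fprintf('n = %d, estimate = %.4f\n', n, mean(fx));
```

With the exact variance 1/12 the formula in (2.13) gives a sample size of about 3200 for these settings, so the loop should stop near that value.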
Exercise: write Matlab code for computing the optimal number of samples
that are needed to estimate $\mu(f)$ for the different integrand functions $f$
given in Section 2.1 and for the accuracy level $\varepsilon = 0.001$.
2.5 Appendix - Matlab Code
% Uniform Random Number
% Monte Carlo method as an approximated integration technique
% integrate f(x) on the [0,1] interval
% solution: 1/2, 1/3, and 0
clc;
n=50;
x=rand(n,1);
gav=zeros(n,3);
gavvar=NaN(n,3);
gav(1,1)=x(1,1);
gav(1,2)=x(1,1)^2;
gav(1,3)=cos(pi*x(1,1));
for i=2:n
gav(i,1)=sum(x(1:i))/i;
gav(i,2)=sum(x(1:i).^2)/i;
gav(i,3)=sum(cos(pi*x(1:i)))/i;
gavvar(i,1)=var(x(1:i));
gavvar(i,2)=var(x(1:i).^2);
gavvar(i,3)=var(cos(pi*x(1:i)));
end
%
%
%%%%%%%%% Graphics (mean) %%%%%%%%%%
figure(1);
subplot(3,1,1);
plot(gav(:,1));
line((1:n),ones(n,1)/2,'color','red');
legend('Empirical Average','Theoretical Mean',...
'Location','NorthEastOutside');
title('f(x)=x');
%
subplot(3,1,2);
plot(gav(:,2));
line((1:n),ones(n,1)/3,'color','red');
legend('Empirical Average','Theoretical Mean',...
'Location','NorthEastOutside');
title('f(x)=x^2');
%
subplot(3,1,3);
plot(gav(:,3));
line((1:n),ones(n,1)*0,'color','red');
legend('Empirical Average','Theoretical Mean',...
'Location','NorthEastOutside');
title('f(x)=cos(\pi x)');
To export a picture to a .eps file one can use
%%%%%%%%% Export a picture %%%%%%%%%%%%%
dire='C:\Dottorato\Teaching\SummerSchoolBertinoro';
figu='\TutorialAntonietta\TutorialRobAnt\Figure\';
figname=strvcat([strcat(dire,figu,'MC1.eps')]);
print(gcf,'-depsc2',figname);
%
%%%%%%%%% Graphics (variance) %%%%%%%%%%
figure(2);
subplot(3,1,1);
plot(gavvar(:,1));
line((1:n),ones(n,1)/12,'color','red');
legend('Empirical Variance','Theoretical Variance',...
'Location','NorthEastOutside');
title('f(x)=x');
%
subplot(3,1,2);
plot(gavvar(:,2));
line((1:n),ones(n,1)*4/45,'color','red');
legend('Empirical Variance','Theoretical Variance',...
'Location','NorthEastOutside');
title('f(x)=x^2');
%
subplot(3,1,3);
plot(gavvar(:,3));
line((1:n),ones(n,1)*1/2,'color','red');
legend('Empirical Variance','Theoretical Variance',...
'Location','NorthEastOutside');
title('f(x)=cos(\pi x)');
Chapter 3
Importance Sampling
Aim
Define and apply the importance sampling method and study its properties.
Contents
1. Importance Sampling (IS)
2. Properties of the IS Estimators
3. Generating Student-t Variables
3.1 Importance Sampling
Let $\pi$ be a probability density function, $f$ a measurable function and
$$\mu = \mathbb{E}_\pi(f(X)) = \int f(x)\,\pi(x)\,dx \qquad (3.1)$$
the integral of interest.
In importance sampling (see Section 3.3 in Robert and Casella (2004)) a
distribution $g$ (called importance distribution or instrumental distribution)
is used to apply a change of measure
$$\mu = \int \frac{\pi(x)}{g(x)}\,f(x)\,g(x)\,dx \qquad (3.2)$$
The resulting integral is then evaluated numerically by using an i.i.d. sample
$X_1, \ldots, X_n$ from $g$
$$\hat{\mu}^{IS}_n = \frac{1}{n}\sum_{i=1}^{n} w(X_i)\,f(X_i) \qquad (3.3)$$
where
$$w(X_i) = \frac{\pi(X_i)}{g(X_i)}, \quad i = 1, \ldots, n$$
are called importance weights.
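A minimal sketch of the estimator (3.3) is given below. Everything in it is an illustrative assumption rather than the code used for the figures: the target is a Student-t density with nu = 12 degrees of freedom written out by hand (dens_t), the instrumental density is a standard Cauchy (dens_c), and the test function is f(x) = x^2, whose exact expectation under the target is nu/(nu-2) = 1.2.

```matlab
nu = 12;                                   % degrees of freedom of the target

% target density pi(x): Student-t with nu degrees of freedom (hand-written)
dens_t = @(x) gamma((nu+1)/2) ./ (gamma(nu/2)*sqrt(nu*pi)) ...
              .* (1 + x.^2/nu).^(-(nu+1)/2);
% instrumental density g(x): standard Cauchy C(0,1)
dens_c = @(x) 1 ./ (pi*(1 + x.^2));
% test function
f = @(x) x.^2;

n = 200000;                                % number of draws (arbitrary)
x = tan(pi*(rand(n,1) - 0.5));             % i.i.d. draws from the Cauchy
w = dens_t(x) ./ dens_c(x);                % importance weights pi(x)/g(x)

mu_is = mean(w .* f(x));                   % importance sampling estimate
fprintf('IS estimate %.4f, exact value %.4f\n', mu_is, nu/(nu-2));
```

The Cauchy has heavier tails than the target, so the weights stay bounded; the average of the weights themselves should be close to 1, which is a cheap sanity check.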
3.2 Properties of the IS Estimators
The Monte Carlo estimator $\hat{\mu}^{IS}_n$ of $\mu$ is unbiased
$$\mathbb{E}_g(\hat{\mu}^{IS}_n) = \int \frac{1}{n}\sum_{i=1}^{n} w(x_i)\,f(x_i) \prod_{i=1}^{n} g(x_i)\,dx_i = \int \frac{\pi(x_1)}{g(x_1)}\,f(x_1)\,g(x_1)\,dx_1 = \int f(x_1)\,\pi(x_1)\,dx_1$$
and converges almost surely to $\mu$, under the assumption $\mathrm{supp}\, g \supset \mathrm{supp}\, \pi$.
Nevertheless the existence of the variance and of a limiting distribution is
not guaranteed. Note that $\mathbb{V}_g(\hat{\mu}^{IS}_n) \le \mathbb{E}_g((\hat{\mu}^{IS}_n)^2)$, thus the condition
we need to check is the existence of an upper bound for the second order
Figure 3.1: Importance sampling weights for the proposal distributions $\mathcal{T}(\nu^*, 0, 1)$, $\mathcal{N}(0, \nu/(\nu-2))$ and $\mathcal{C}(0, 1)$ (panels: Student-t, Normal, Cauchy).
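where $\beta > 0$, and cumulative distribution function
$$F(x) = \frac{1}{\pi\beta}\int_{-\infty}^{x} \frac{1}{1 + ((u - \alpha)/\beta)^2}\,du = \frac{1}{2} + \frac{1}{\pi}\arctan\left(\frac{x - \alpha}{\beta}\right), \quad x \in \mathbb{R}$$
The inverse c.d.f. method can be applied in order to generate from the
Cauchy. If $X = F^{-1}(U)$, where $U \sim \mathcal{U}_{[0,1]}$, then $X \sim \mathcal{C}(\alpha, \beta)$.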
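The inverse c.d.f. method for the Cauchy can be sketched as follows. The location a = 1 and scale b = 2 are hypothetical values chosen for the example, and Finv and F are illustrative names; inverting the arctan expression for F gives the tan formula used below.

```matlab
a = 1;                                     % location parameter (hypothetical)
b = 2;                                     % scale parameter, must be positive

% inverse c.d.f.: solve u = 1/2 + atan((x-a)/b)/pi for x
Finv = @(u) a + b*tan(pi*(u - 0.5));
% c.d.f., used here only to check the inversion
F    = @(x) 0.5 + atan((x - a)/b)/pi;

u = rand(10000,1);                         % U ~ U[0,1]
x = Finv(u);                               % X = F^{-1}(U) ~ C(a,b)

fprintf('sample median %.3f (location a = %g)\n', median(x), a);
```

The Cauchy has no mean, so the sample median, not the sample average, is the natural summary to compare with the location parameter.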
From the results in Fig. 3.1 one can see that the importance weights for
the Student-t and Cauchy proposals are stable, while the importance weights
associated with the normal exhibit some large jumps. For all the functions the
results in Fig. 3.2 show that the normal proposal produces jumps in the
progressive averages (green lines) that are due to the unbounded variance of
the estimator. However, for the first function the normal proposal behaves
quite well when compared with the Cauchy and Student-t proposals. For the
second and third function the Cauchy proposal seems to converge faster than
the Student-t. In all the pictures we plotted (black lines) the approximation
obtained with an exact simulation from a Student-t with $\nu = 12$.
Exercise - Use repeated Monte Carlo experiments to find the distribution
of the estimator $\hat{\mu}^{IS}_n(f)$. Plot the 95% and 5% quantiles and the mean of the
estimator for n = 1, . . . , 50000.
The Matlab code is
%%%%%%% Importance weight for T(nustar,0,1)
function w=w1(x,nu,nustar)
w=pdf('t',x,nu)/pdf('t',x,nustar);
end
%
%%%%%%% Importance weight for N(0,nu/(nu-2))
function w=w2(x,nu)
w=pdf('t',x,nu)/pdf('normal',x,0,sqrt(nu/(nu-2)));
end
%
%%%%%%% Importance weight for C(0,1)
function w=w3(x,nu)
w=pdf('t',x,nu)/pdfcauchy(x,0,1);
end
%
clc;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Importance sampling
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
nu=12;
nustar=7;
nIS=50000;
mu1IS=zeros(nIS,4);
mu2IS=zeros(nIS,4);
mu3IS=zeros(nIS,4);
%
mu1IScum=zeros(nIS,4);
mu2IScum=zeros(nIS,4);
mu3IScum=zeros(nIS,4);
%
wIS=zeros(nIS,3);
for i=1:nIS
% Proposal 1
x1=random('t',nustar);
% Proposal 2
x2=random('normal',0,sqrt(nu/(nu-2)));
% Proposal 3
x3=tan((rand(1,1)-0.5)*pi);
%x3=random('normal',0,1)/random('normal',0,1);
% Exact
plot((1:nIS),wIS(:,3));
legend('Cauchy','Location','NorthEast');
set(gca,'FontSize',fs);
figure(2)
plot((1:nIS),mu1IScum(:,1:3));
hold on;
plot((1:nIS),mu1IScum(:,4),'-k');
hold off;
legend('Student-t','Normal','Cauchy','Exact','Location','NorthEast');
ylim([0.00001 0.00015]);
set(gca,'FontSize',fs);
figure(3)
plot((1:nIS),mu2IScum(:,1:3));
hold on;
plot((1:nIS),mu2IScum(:,4),'-k');
hold off;
legend('Student-t','Normal','Cauchy','Exact','Location','NorthEast');
ylim([1 1.4]);
set(gca,'FontSize',fs);
figure(4)
plot((1:nIS),mu3IScum(:,1:3));
hold on;
plot((1:nIS),mu3IScum(:,4),'-k');
hold off;
legend('Student-t','Normal','Cauchy','Exact','Location','NorthEast');
ylim([3 9]);
set(gca,'FontSize',fs);
This code calls the following user-defined function
%%%%%%% Cauchy probability density function
function f=pdfcauchy(x,a,b)
f=1./(pi*b*(1+((x-a)./b).^2)); % element-wise, so x may be a vector
end
%
Figure 3.2: Charts from one to three: IS for the different functions f. In each chart the IS estimators for different proposals (colored lines: Student-t, Normal, Cauchy) and the Monte Carlo estimator with exact simulation from the $\mathcal{T}(12, 0, 1)$ (black lines).
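Exercise
Importance Sampling
Consider a Student-t distribution $\mathcal{T}(\nu, \theta, \sigma^2)$ with density
$$\pi(x) = \frac{\Gamma((\nu+1)/2)}{\sigma\sqrt{\nu\pi}\,\Gamma(\nu/2)}\left(1 + \frac{(x - \theta)^2}{\nu\sigma^2}\right)^{-(\nu+1)/2}\mathbb{I}_{\mathbb{R}}(x) \qquad (3.6)$$
(inside the square root, $\pi$ denotes the constant 3.14159...). W.l.o.g. take $\theta = 0$, $\sigma = 1$ and $\nu = 12$.
Study the performance of the importance sampling estimator $\hat{\mu}^{IS}_n$ of
$$\mu = \mathbb{E}_\pi(f(X)) = \int f(x)\,\pi(x)\,dx = \int \frac{\pi(x)}{g(x)}\,f(x)\,g(x)\,dx \qquad (3.7)$$
when the following instrumental distributions, $g(x)$, are used
1. $\mathcal{T}(\nu^*, 0, 1)$ with $\nu^* < \nu$ (e.g. $\nu^* = 7$)
2. $\mathcal{N}(0, \nu/(\nu - 2))$
3. $\mathcal{C}(0, 1)$
for the following test functions
1. $f(x) = \left(\frac{\sin(x)}{x}\right)^5 \mathbb{I}_{(2.1,+\infty)}(x)$
2. $f(x) = \sqrt{\left|\frac{x}{1 - x}\right|}$
3. $f(x) = \frac{x^5}{1 + (x - 3)^2}\,\mathbb{I}_{[0,+\infty)}(x)$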
Metropolis-Hastings
Write a Metropolis-Hastings (M.-H.) algorithm to generate n = 500 random samples from
a bivariate normal distribution with independent, zero-mean components, $\mathcal{N}_2(0, I_2)$, with
covariance matrix $I_2$ and mean $0 = (0, 0)'$. Use alternatively independent
and random walk proposals with variance-covariance matrix $\sigma^2 I_2$. (Try
different values of $\sigma^2$.)
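One possible starting point for the random walk variant is sketched below. It is an assumption-laden illustration, not the intended solution: the proposal scale sigma = 1, the starting point (0,0) and all variable names are arbitrary choices, and the independent proposal is left to the reader.

```matlab
n     = 500;                       % number of samples to generate
sigma = 1;                         % proposal standard deviation (try other values)

% log of the target density N2(0, I2), up to an additive constant
logtarget = @(x) -0.5*sum(x.^2);

x    = zeros(n,2);                 % storage for the chain
cur  = [0 0];                      % starting point (arbitrary)
nacc = 0;                          % number of accepted proposals
for i = 1:n
    prop = cur + sigma*randn(1,2); % random walk proposal with covariance sigma^2*I2
    % the proposal is symmetric, so the acceptance ratio is the target ratio
    if log(rand) < logtarget(prop) - logtarget(cur)
        cur  = prop;
        nacc = nacc + 1;
    end
    x(i,:) = cur;                  % a rejected proposal repeats the current state
end
acc_rate = nacc/n;
fprintf('acceptance rate %.2f\n', acc_rate);
```

Consecutive draws of the chain are correlated, and the acceptance rate changes with sigma: very small steps are almost always accepted but explore slowly, while very large steps are rarely accepted.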