| A Simple Introduction to Optimal Designs and Related Issues
|
|
|
How do you collect data to answer your research questions? What should your
design be? This site will provide a variety of optimal designs under a specific
setting. It is instructive to first consider a simple illustration.
|
|
|
Consider the simple linear regression E(y) = a + b x, x ∈ X = [0, 90].
|
|
| The interval X is usually pre-specified and is called the
design space or design interval. This is the interval where the researcher is
permitted to choose values of x at which to observe y. Suppose you have resources to
take a sample of size n. The design question is: how should you choose the n values
of x from X at which to observe the responses y? |
|
| A popular design is to spread these n values of x uniformly
over X. So if n=10, the design is to take observations at x = 0, 10, 20, 30,
40, 50, 60, 70, 80 and 90. This is an example of a uniform design and we denote
it by D1. |
|
| Another design D2 is to take 5 observations at 0 and 5
observations at 90. This design differs from D1 in that there are 5
replications at 0 and at 90; there were no replications in D1. |
|
| Is design D2 better than D1? |
|
| The word 'better' is subjective and depends on the
objective or objectives of the study. For example, suppose your main objective
is to estimate b accurately in the above model. With uncorrelated errors of
constant variance, the variance of the least squares estimate of b is inversely
proportional to the sum of squared deviations of the x values from their mean,
so it is easy to verify that this variance is smaller under design D2 than under
design D1. For this objective, D2 is therefore better than D1. Can you find a
design with a smaller variance for the estimated b than that given by D2? If the
answer is no, then D2 is the optimal design. |
|
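The comparison of D1 and D2 for estimating b can be checked numerically. The sketch below assumes uncorrelated errors with a common variance sigma^2, so that Var(b) = sigma^2 / sum((x - mean(x))^2); the function name `slope_variance_factor` is ours and is only illustrative.

```python
import numpy as np

def slope_variance_factor(x):
    """Return Var(b_hat) / sigma^2 = 1 / sum((x - mean(x))^2) for E(y) = a + b x."""
    x = np.asarray(x, dtype=float)
    return 1.0 / np.sum((x - x.mean()) ** 2)

d1 = np.arange(0, 100, 10)          # D1: one observation at each of 0, 10, ..., 90
d2 = np.array([0] * 5 + [90] * 5)   # D2: five observations at 0 and five at 90

v1 = slope_variance_factor(d1)
v2 = slope_variance_factor(d2)
print(f"D1: Var(b)/sigma^2 = {v1:.6f}")
print(f"D2: Var(b)/sigma^2 = {v2:.6f}")
print(f"ratio D2/D1        = {v2 / v1:.4f}")
```

The sums of squared deviations are 8250 for D1 and 20250 for D2, so the slope variance under D2 is roughly 41% of that under D1.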
| Even for the same objective, there are different ways of
judging whether a design is optimal. If you want to estimate the two parameters a
and b in the above model, you may want a design that minimizes the sum
of the variances of the two estimates; when this sum is
small, both parameters are well estimated. Alternatively, you may choose to
minimize the area of the confidence ellipse for a and b. So there are at
least two optimality criteria for this simple problem. The optimal design may
or may not be the same under the two criteria. |
|
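The two criteria just described can be evaluated for D1 and D2 as a rough illustration. The sketch assumes constant error variance, so that the covariance matrix of the estimates is sigma^2 (X'X)^{-1}; the area of the confidence ellipse is proportional to the square root of the determinant of this matrix. The names `covariance_factor` and `criteria` are illustrative only.

```python
import numpy as np

def covariance_factor(x):
    """Return (X'X)^{-1}, the covariance of (a_hat, b_hat) divided by sigma^2."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # columns: intercept, slope
    return np.linalg.inv(X.T @ X)

designs = {
    "D1": np.arange(0, 100, 10),
    "D2": np.array([0] * 5 + [90] * 5),
}

criteria = {}
for name, x in designs.items():
    C = covariance_factor(x)
    criteria[name] = {
        "sum_of_variances": np.trace(C),            # sum of Var(a_hat) and Var(b_hat)
        "generalized_variance": np.linalg.det(C),   # ellipse area is proportional to its square root
    }
    print(name, criteria[name])
```

For these two particular designs both criteria happen to favor D2; this says nothing about whether D2 is optimal under either criterion among all designs on X.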
| In practice, design problems are more complicated. The
researcher may be more interested in estimating b than a. The
researcher may also be concerned about whether the model is adequate. For example, is the
assumption of constant variance of the response, or the simple specification of the
mean structure, valid? |
|
| In summary, the key challenges in design problems include:
|
|
| (i) selection of an appropriate statistical model; |
| (ii) selection of an appropriate optimality criterion or
criteria; |
| (iii) determination of the optimal design, analytically or
numerically; |
| (iv) confirmation that we have the optimal design; |
| (v) a procedure for designing studies when there are multiple
objectives; |
| (vi) comparison of the merits of competing designs. |
|
| This website provides a variety of optimal designs under a
specific setting after the user specifies (i) and (ii) from a menu of choices
available on the site. In addition, when appropriate, we provide efficiencies
of competing designs. A simplification is that we provide only optimal
approximate designs. This means that when the user specifies the value of n,
the design space X, a statistical model and an optimality criterion, we provide
the optimal number of distinct x values, their locations in X, and the proportion
of observations to be allocated to each of these points. In optimal
design terminology, these points are called the support points of the
optimal design. |
|
| The main advantages of working with approximate optimal
designs are that these designs are easier to determine and to verify. A
drawback is that large values of n are required for these designs to be
meaningful. For this reason, such optimal designs are also sometimes called
continuous or large-sample optimal designs.
|
|
| Optimal Design Programs |
|
All optimal designs constructed here are in the spirit of continuous
designs proposed by Professor Jack Kiefer; see Kiefer (1980) and the references
therein. This means that continuous designs are viewed as probability measures
on the design space. Thus these designs are fully characterized by the number
of support points, the locations x_i of the support points and the
mass w_i at each point, with each x_i inside the design
space and the non-negative masses w_i summing to unity. In practice,
the implemented design takes roughly N*w_i observations at x_i,
where N is the predetermined sample size for the problem. Usually N is determined
in advance by cost or practical considerations. To see our featured web-based
programs, go to
Optimal Design Programs.
|
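Because N*w_i is rarely an integer, the masses have to be rounded before the design can be implemented. The sketch below uses a simple largest-remainder rule; this is one possible rounding scheme chosen for illustration, not necessarily the one used by the programs on this site, and the support points and masses shown are hypothetical.

```python
import math

def round_design(weights, N):
    """Turn design masses into integer counts summing to N (largest-remainder rounding)."""
    raw = [N * w for w in weights]
    counts = [math.floor(r) for r in raw]
    leftover = N - sum(counts)
    # Give the remaining observations to the points with the largest fractional parts.
    order = sorted(range(len(weights)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts

support = [0.0, 45.0, 90.0]      # hypothetical support points x_i
weights = [0.25, 0.5, 0.25]      # hypothetical masses w_i, summing to 1
counts = round_design(weights, N=10)
print(dict(zip(support, counts)))
```

For small N the rounded design can differ noticeably from the continuous one, which is why continuous designs are most meaningful for large samples.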
|
| Each objective of the design
problem is expressed as a convex functional of the expected Fisher information
matrix, and the optimal design is found by minimizing this functional globally. An
equivalence theorem is used to verify the optimality of the design and we
provide this additional tool for selected designs constructed on this site. The
implication of an equivalence theorem is that in practice, given an optimality
criterion, we may use a graph to verify whether a design is optimal among all
designs on the given design space. If the current design is optimal among all
designs on the given design space, the graph is bounded above throughout the
design space by a common value, and it attains that value at the support
points of the design. When this property is
violated, the current design is not optimal among all designs on the design
space. Such a design, however, can still be optimal within a restricted class
of designs on the design interval. An example is a minimally
supported optimal design, which is optimal among designs whose number of support
points equals the number of parameters in the mean
function. Alternatively, a K-point optimal design is one that is optimal
among all designs with K support points defined on the design space. Details
are available in Kiefer (1980).
|
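As an illustration of the graphical check, consider D-optimality for the simple linear regression on [0, 90]. For that criterion the equivalence theorem says a design is optimal when d(x) = f(x)' M^{-1} f(x) never exceeds p, the number of parameters, with equality at the support points. The sketch below evaluates d(x) on a grid instead of drawing the graph; the choice of the D-criterion and the function names are our assumptions for the example.

```python
import numpy as np

def f(x):
    """Regression vector for E(y) = a + b x."""
    return np.array([1.0, x])

def sensitivity(x_grid, support, weights):
    """Evaluate d(x) = f(x)' M^{-1} f(x), where M = sum_i w_i f(x_i) f(x_i)'."""
    M = sum(w * np.outer(f(x), f(x)) for x, w in zip(support, weights))
    M_inv = np.linalg.inv(M)
    return np.array([f(x) @ M_inv @ f(x) for x in x_grid])

grid = np.linspace(0.0, 90.0, 181)
p = 2  # number of parameters in the mean function

d_two_point = sensitivity(grid, [0.0, 90.0], [0.5, 0.5])
d_uniform = sensitivity(grid, np.arange(0.0, 100.0, 10.0), [0.1] * 10)

print("two-point design: max d(x) =", d_two_point.max(), "(bound p =", p, ")")
print("uniform design:   max d(x) =", d_uniform.max(), "(bound p =", p, ")")
```

The equal-mass design on 0 and 90 stays at or below 2 and touches 2 at its support points, while the uniform design exceeds 2 near the ends of the interval, so it is not D-optimal among all designs on [0, 90].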
|
| Please visit the FAQ page for explanations and
clarifications; you may find answers to your questions
there. |
|
| The basic types of optimal designs available on this
site are described below, along with a brief justification for their use. |
|
 |
D-optimal designs for estimating model
parameters; these designs minimize the generalized variance of the estimated
parameters. |
 |
D_s-optimal designs for
estimating a subset of the model parameters; the justification is the same as for
D-optimal designs, but applied only to the selected parameters of interest. |
 |
L-optimal designs for estimating one or more
functions of the model parameters; this includes two popular classes of
designs: A-optimal designs for minimizing the sum of the variances of the estimated
model parameters and C-optimal designs for estimating a specific function of the
model parameters. For example, a C-optimal design is typically used to estimate
one or more percentiles in a logistic model. |
 |
Minimax optimal designs; these are designs that seek
to minimize the maximum loss in some sense. For example, the variance of an
estimated function of parameters in a nonlinear model depends on the
parameters. A minimax optimal design can be constructed to minimize the largest
possible variance over all plausible values of the unknown parameters. |
 |
Bayesian optimal designs for selected problems;
these are designs constructed assuming there is prior information on the
model parameters or the parameters of interest. The prior information
is usually incorporated into the construction of the design by specifying
a prior distribution on the model parameters or parameters of interest. |
|
|
| The parameters of interest may appear
nonlinearly in the mean response, or the optimality criterion may depend on
parameters that we wish to estimate. When this happens, a common and easy
design strategy is to find a locally optimal design, that is, a design that is
optimal for nominal (best-guess) values of the unknown parameters and therefore
depends on those values. Unless
otherwise stated, all optimal designs generated for such situations are
locally optimal. |
|
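The dependence of a locally optimal design on the nominal parameter values can be seen in a small example. The sketch assumes a two-parameter logistic model P(y = 1 | x) = 1 / (1 + exp(-(a + b x))) on [0, 90] and searches, by brute force over a grid, for the equal-mass two-point design with the largest determinant of the information matrix. The model, the nominal values and the restriction to two-point designs are all assumptions made for illustration.

```python
import numpy as np

def logistic_information(support, weights, a, b):
    """Per-observation information matrix for the logistic model at nominal (a, b)."""
    M = np.zeros((2, 2))
    for x, w in zip(support, weights):
        p = 1.0 / (1.0 + np.exp(-(a + b * x)))
        fx = np.array([1.0, x])
        M += w * p * (1.0 - p) * np.outer(fx, fx)
    return M

def best_two_point_design(a, b, candidates):
    """Grid search over equal-mass two-point designs for the largest det(M)."""
    best, best_det = None, -np.inf
    for i, x1 in enumerate(candidates):
        for x2 in candidates[i + 1:]:
            det = np.linalg.det(logistic_information([x1, x2], [0.5, 0.5], a, b))
            if det > best_det:
                best, best_det = (x1, x2), det
    return best

candidates = np.linspace(0.0, 90.0, 91)                    # x = 0, 1, ..., 90
design_a = best_two_point_design(-4.5, 0.1, candidates)    # hypothetical nominal values
design_b = best_two_point_design(-3.0, 0.1, candidates)    # a different guess for a
print("nominal a = -4.5:", design_a)
print("nominal a = -3.0:", design_b)
```

Changing the nominal value of a shifts the selected support points, so a design that is locally optimal for one guess is generally not optimal for another.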
|
It is not the purpose of this site to provide theoretical underpinnings
of each optimal design constructed on this site. In each case, we give a brief
description of the background and refer the user to publications on which the
construction of the design is based.
|
|
We recommend that the optimal design found from this site be used as a
benchmark. If the user wants to use a non-optimal design, as is frequently the
case, the user can modify the optimal design while checking that the resulting
design does not lose too much efficiency. If the user has a specific design in
mind, the user should check the efficiency of the design before implementation.
The final design should meet the user’s objectives and be cost-efficient.
Go to the
Optimal Design Programs page.
|
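An efficiency check of the kind recommended above can be sketched for the simple linear regression on [0, 90]. The example assumes the D-criterion and uses the usual definition of D-efficiency, (det M(design) / det M(reference))^(1/p), with the equal-mass design on 0 and 90 as the reference; the function names are illustrative.

```python
import numpy as np

def information_matrix(support, weights):
    """M = sum_i w_i f(x_i) f(x_i)' with f(x) = (1, x)' for E(y) = a + b x."""
    M = np.zeros((2, 2))
    for x, w in zip(support, weights):
        fx = np.array([1.0, x])
        M += w * np.outer(fx, fx)
    return M

def d_efficiency(M_candidate, M_reference):
    """(det M_candidate / det M_reference)^(1/p), with p the number of parameters."""
    p = M_candidate.shape[0]
    return (np.linalg.det(M_candidate) / np.linalg.det(M_reference)) ** (1.0 / p)

M_uniform = information_matrix(np.arange(0.0, 100.0, 10.0), [0.1] * 10)
M_two_point = information_matrix([0.0, 90.0], [0.5, 0.5])
eff = d_efficiency(M_uniform, M_two_point)
print(f"D-efficiency of the uniform design: {eff:.3f}")
```

A D-efficiency of about 0.64 means that, under this criterion, the uniform design would need roughly 1/0.64 times as many observations to match the reference design.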