LIMDEP takes the form of an econometrics studio. Analysis of a data set is done
interactively in a set of windows. Program control may be from a 'script' or in
an unstructured session of instructions and manipulations. The program is
designed to allow easy setup of data for estimation, specification of different
forms of the models, experimentation with different specifications, hypothesis
testing, analysis of data and model results and construction of special procedures
and estimators. The following sections will list in detail the set of program
features and tools supported in Version 8.0.
CROSS SECTION, PANEL DATA, and TIME SERIES:
CROSS SECTION
Most of the models and data analyses done with LIMDEP are oriented to cross
sectional, individual level data. These include:
Descriptive statistics
Descriptive graphics
Linear regression models
Models for discrete choices
Models for censoring and truncation
Models for count data
Models for stochastic frontier production and cost functions
Models for survival and duration data
Numeric, alphanumeric (string), and date series; value labels
Extensive library of operators and statistical, mathematical, date and string functions
and many more. Data sets in this context may be large. The program allows up to 3,000,000 observations and 900 variables, which should accommodate all but the most ambitious analyses.
PANEL DATA
Nearly all of the models in LIMDEP may be analyzed with special tools for panel data. This includes fixed and random effects, random parameters and latent class models for almost all nonlinear models supported by the package. There are also numerous special estimators for the linear model, such as Arellano and Bond's GMM estimator for dynamic panels and Hausman and Taylor's estimator for random effects models. No panel data operation anywhere in the program requires that the data set be balanced. Most estimators place no limit on the number of groups in the panel - the data set is already 'in the program' so it must already fit in memory. Many tools in addition to the estimation programs are also provided. For example, you can bootstrap sample groups in your panel data set, a feature we have not seen anywhere else.
TIME SERIES
LIMDEP is not best known for its time series features. Nonetheless, a range of estimators for time series modeling is provided, including autoregressive and moving average models, ARMAX models, GARCH and GARCH-in-mean models. We also provide a set of analysis tools, such as spectral density estimation, ACF and PACF, Phillips-Perron tests, the Newey-West estimator, and so on.
MODELS:
A full list of the models that can be fit with LIMDEP appears elsewhere so we'll just give a partial list here:
CROSS SECTION
Descriptive statistics including means, variance, skewness, etc.
Binary choice including probit, logit, and several other formulations
Bivariate probit
Multivariate probit
Logit models, binomial and multinomial - discrete choice models
Ordered probit and logit
Nonlinear least squares regression
Censored data (tobit)
Models for count data
Survival models
Sample selection
Stochastic frontiers.
PANEL DATA
Fixed effects (true fixed effects - the dummy variable coefficients are estimated)
Time series - cross section, variance structure models
Random effects
Random parameters
Latent class
Dynamic panel data models - Arellano-Bond-Bover estimator
Hausman and Taylor's instrumental variable estimator
TIME SERIES MODELS
·ARMAX
·GARCH and GARCH in mean
·Autoregressive and moving average regressions
USER SPECIFIED MODELS
MAXIMIZE/MINIMIZE allow you to specify the likelihood function for your own model. GMM estimation can be specified instead.
POST ESTIMATION:
Model estimation is only part of the data analysis. By 'post estimation,' we mean the manipulation of model results along with other statistics and procedures.
RECOVERABLE RESULTS
All model estimates are 'recoverable.' Coefficients and asymptotic covariance matrices are retained and integrated into the matrix algebra package. Numerical values such as log likelihoods and sums of squares are recoverable and useable, by name, for example, in testing hypotheses or computing diagnostic statistics.
SIMULATION AND PREDICTION
Estimated models may be used for simulations and for computing predictions, generalized residuals, and other functions for forecasting and specification analysis.
PROGRAMMING TOOLS FOR MANIPULATING MODEL RESULTS
Program tools are provided for manipulating results. One of the most useful is the WALD command which is used to compute nonlinear functions and asymptotic variances for nonlinear functions of model estimates. Programming the derivatives for the delta method is unnecessary.
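As a rough illustration of the kind of computation involved, the following Python sketch applies the delta method with numerical derivatives to a ratio of two coefficients. It is not LIMDEP's WALD command or its internal algorithm; the function names, the estimates and the covariance matrix are all hypothetical values chosen for the example.

```python
import math

def numerical_gradient(func, theta, step=1e-6):
    """Central-difference gradient of func evaluated at the point theta."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        h = step * max(1.0, abs(theta[j]))
        up[j] += h
        down[j] -= h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad

def delta_method(func, theta, cov):
    """Return (value, standard error) of func(theta) using the delta method."""
    g = numerical_gradient(func, theta)
    k = len(theta)
    # Asymptotic variance is the quadratic form g' V g.
    variance = sum(g[i] * cov[i][j] * g[j] for i in range(k) for j in range(k))
    return func(theta), math.sqrt(variance)

# Hypothetical coefficient estimates and covariance matrix (made up).
b = [2.0, 4.0]
V = [[0.04, 0.01],
     [0.01, 0.09]]

# Nonlinear function of the estimates: the ratio b1 / b2.
ratio, se = delta_method(lambda t: t[0] / t[1], b, V)
print(ratio, se)
```

Because the gradient is approximated numerically, only the function itself has to be written down, which is the convenience the paragraph above describes.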
ESTIMATION EXTENSIONS
Facilities are provided for bootstrap sampling from cross sections and from panels. Thus, bootstrap standard errors can easily be computed. All model results can be retained for use in other program functions. Programming two step estimators is simple.
PROGRAMMING TOOLS AND FLEXIBILITY:
LIMDEP's program features are an extensive set of fully integrated analysis tools. All of these tools interact to produce a studio, or 'laboratory' format for econometric analysis. Exploratory manipulation and estimation, including experimentation such as Monte Carlo analysis, are straightforward. These tools include:
DATA INPUT AND OUTPUT
Read and write ASCII, binary, spreadsheet, and other types of files.
TRANSFORMATIONS
Algebraic transformations, dozens of functions, sort, recode, expand and collapse categorical variables.
SAMPLE DEFINITION
Include or reject by algebraic criteria, time periods by date, bootstrapping cross sections or panel data, random sampling with random number generators.
MATRIX ALGEBRA
Full algebraic capabilities and over 75 matrix functions such as determinant, rank, characteristic roots, several types of inverses, and so on. Data and model results are combined seamlessly in this routine.
SCIENTIFIC CALCULATOR
Over 100 functions plus algebraic results.
ESTIMATION PROGRAMS
Nearly 100 different preprogrammed estimators for all types of cross section, time series, and panel data settings, including continuous data, discrete choice, count data, survival data, and censored, truncated and limited dependent variables.
USER DEFINED MODELS
MAXIMIZE/MINIMIZE allow you to define your own optimization problem, including maximum likelihood, GMM and nonlinear least squares. These also allow quadrature and simulation estimators.
PROGRAMMING TOOLS
Procedures with adjustable parameters, DO FOR, DO WHILE and DO UNTIL, EXECUTE with bootstrapping and so on allow you to write your own iterative procedures.
NUMERICAL ANALYSIS
Integration, differentiation, function plotting, computation of variances for nonlinear functions of estimates.
DOCUMENTATION:
LIMDEP 8.0 DOCUMENTATION
The LIMDEP 8.0 set of manuals, with nearly 2000 pages, contains full reference guides for the program, background econometrics, and sample applications. The LIMDEP documentation consists of two parts:
LIMDEP 8.0 Reference Guide
The Reference Guide provides all instructions for operating the program, including installation, invocation, and most of the basic setup operations that precede model estimation. These operations include reading and transforming data and setting the sample. This manual also describes the optimization procedures, how to use the matrix algebra package and scalar scientific calculator as stand alone tools and as part of LIMDEP programs, what sorts of results are produced by the program, and some of the common features of the model estimation programs, such as how to do post estimation analysis of model results. Two other components of the Reference Guide are a summary of how the model commands are documented in the Econometric Modeling Guide and a complete listing of the program diagnostics.
LIMDEP 8.0 Econometric Modeling Guide, Volume 1 and Volume 2
The 29 chapters of this guide are arranged in two volumes. These provide the econometric background, LIMDEP commands, and examples with data, commands and results. Topics are arranged by modeling framework, not by program command. There are chapters on
·Descriptive statistics
·Linear regression
·Panel data analysis
·Heteroscedasticity
·Binary choice models
·Models for count data
·Censored and truncated data
·Survival models
and many others. Each model fit by the program is fully documented. The full set of formulas for all computations is shown in this manual with the full mathematical documentation of the models. Additional chapters in this guide show how to do numerical analysis, how to program your own estimators, and provide a full listing of diagnostics.
NLOGIT 3.0 DOCUMENTATION
The NLOGIT 3.0 set of manuals consists of the three LIMDEP 8.0 manuals plus a separate reference guide for NLOGIT.
NLOGIT 3.0 Reference Guide
The roughly 200 pages of this NLOGIT Reference Guide give complete instructions for specifying and estimating discrete choice models with NLOGIT. The model simulator is also documented here.
ACCURACY:
LIMDEP uses extremely accurate computational methods throughout. The National Institute of Standards and Technology (NIST) has published a suite of about 50 benchmark tests with certified solutions for assessing program accuracy in computing descriptive statistics, analysis of variance, linear regression and nonlinear least squares regression. LIMDEP scores high marks on all of these (see B.D. McCullough's review in the March 1999 issue of the Journal of Applied Econometrics).
PROGRAM CAPABILITIES IN DETAIL
LIMDEP contains extensive sets of tools for every step in the analysis of a data set.
DATA MANAGEMENT
Data setup includes input, transformation, and setting sample specifications. Data management tools provide for input of data or internal generation with the random number generators, and other preparation of data for use in model estimation and analysis. Also sample definition is specified by selection of observations or bootstrapping random samples from within the data.
DATA INPUT AND OUTPUT
These features cover importing data into the program and exporting data from LIMDEP to other programs - the read (import) and write (export) operations. You can also transform and examine the data within the program.
Read (import data)
·ASCII, XLS, WKS, Binary, DIF, CSV
·LIMDEP is supported by DBMS/Copy and Stat/Transfer
·Merge individual and group level data in a panel data set
·Recode alphanumeric data to numeric values on input
·Observation labels in spreadsheet files
·Up to 3,000,000 observations
·Blanks and other nonnumerics automatically converted to missing values
·SAS ‘.’ automatically converted to missing values
·Merge data with command files for scripts
·Fortran formatted
·Append observations to existing variables (stack multiple files)
·Format by observations (rows) or by variables (columns) - transpose
Write (export data)
·ASCII
·CSV (can be read directly into Excel, Lotus)
·DIF
·Binary
·WKS
·LIMDEP system files can be read by Stat/Transfer and DBMS/Copy
·Matrices may be written as exported data files
Spreadsheet style data editor
Transformations: algebraic, recoding, sorting, existing or new variables
Random number generators for random sampling
List observations
Sample selection and definition: include, reject, sample
Read from and write to output window on the screen
DATA TRANSFORMATIONS
These operations are used to compute functions of your data and to create new variables. Data may either be imported into the environment or may be created internally using the random number generators.
Command Structures
·Conditional: If(...), Else, Then...
·Parentheses to any number of levels
·Implied multiplication of grouped variables and expressions
Algebraic Transformations
·Standard operators: + - * / ^ (power)
·Binary variable operators: x > 1 = 1(x > 1) likewise for <, >= etc.
·Comparison operators: x ! y = max(x,y), x ~ y = min(x,y)
·Expand categorical variable into a set of dummy variables
Functions
·Log, Exp, Abs, Sqr, Sin, Rsn (arcsin), Cos, Rcs (arccos), Tan
·Gamma, digamma, trigamma, log gamma, beta
·Sign, fix (round to nearest integer), integer part
·Box-Cox transformation and derivatives of Box-Cox
Probability Distributions
·Logistic: logit function, density, cdf
·Univariate normal: density, cdf, truncated means and variances, sample selection ‘lambda,’ inverse normal cdf, inverse cdf to pdf
·Bivariate normal: pdf, cdf, partial derivatives
·Multivariate normal: cdf
Trends and Seasonal Dummy Variables
Stratification Variables and Period Variables for Panels
Leads and Lags
Matrix Functions: Dot Products, quadratic Forms
Sample Statistical Transformations
·Means, deviations, standardized variables
·Panel data, group means and deviations
·Rowwise moments of a set of variables
·Multiple of a set of variables
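The panel data entry in the list above (group means and deviations) can be sketched in Python as follows. This is only an illustration of the transformation, not LIMDEP syntax; the function names and the small unbalanced panel are hypothetical.

```python
from collections import defaultdict

def group_means(groups, values):
    """Mean of values within each group; groups may have different sizes."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return {g: totals[g] / counts[g] for g in totals}

def group_deviations(groups, values):
    """Deviation of each observation from its own group mean."""
    means = group_means(groups, values)
    return [v - means[g] for g, v in zip(groups, values)]

# Hypothetical unbalanced panel: group 1 has 3 rows, group 2 has 2, group 3 has 1.
ids = [1, 1, 1, 2, 2, 3]
y = [2.0, 4.0, 6.0, 10.0, 14.0, 5.0]

means = group_means(ids, y)
deviations = group_deviations(ids, y)
print(means)
print(deviations)
```

The deviations from group means are the same quantities used by the linear fixed effects estimator, and nothing in the computation requires a balanced panel.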
Random Number Generation
·Continuous random variables: uniform, normal, lognormal, t, chi-squared, F, exponential, Weibull, Gumbel, gamma, beta, logistic, Cauchy, truncated standard normal
·Discrete random variables: Poisson, discrete uniform, binomial, geometric
·Halton sequences
Recode Ranges of Values or Recode Specific Values to Discrete
Sort Ascending or Descending (Carry other Variables and/or Labels)
These features are used to define the specific observations to be used in the estimation procedures or other computations that access the data. The first group is used to define the ‘current sample.’ The second group is used to ‘subsample’ from the sample, using either a random subset of the current sample or the ‘leave one out’ procedure of the jackknife.
Sampling
·Include observations based on algebraic condition
·Reject observations based on algebraic conditions
·Period - specify time interval to be in sample
·Sample - specify particular observations or ranges of observations
Bootstrap and Jackknife
·Draw specified number of observations from current sample
·Draw with replacement
·Draw specified number of groups in a panel data set
·Execute a procedure a specified number of times, drawing a new bootstrap sample with each repetition
·Regression, least absolute deviations uses bootstrapping
The EXECUTE procedure may be used to bootstrap any estimator in the program whether one of the supported procedures or one that is created by the user. The command specifies the matrix or scalar to be bootstrapped. It may be anything that the program specifies.
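The idea of re-executing a procedure on repeated bootstrap samples can be sketched in Python as below. This is an analogue for illustration only, not the EXECUTE command; the function names, the simulated rankings and the seed are assumptions made for the example.

```python
import math
import random
import statistics

def correlation(pairs):
    """Pearson correlation of paired values.

    Applied to two sets of untied ranks this is the rank correlation; in a
    bootstrap resample the same formula is applied to the resampled pairs.
    """
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def bootstrap(data, statistic, reps, rng):
    """Recompute statistic on reps samples drawn with replacement from data."""
    n = len(data)
    draws = []
    for _ in range(reps):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        draws.append(statistic(sample))
    return draws

# Hypothetical data: two rankings of 100 items, the second a noisy version of the first.
rng = random.Random(12345)
n = 100
r1 = list(range(1, n + 1))
scores = [i + rng.gauss(0.0, 20.0) for i in r1]
order = sorted(range(n), key=lambda k: scores[k])
r2 = [0] * n
for rank, k in enumerate(order, start=1):
    r2[k] = rank
pairs = list(zip(r1, r2))

estimate = correlation(pairs)
draws = bootstrap(pairs, correlation, 100, rng)
boot_se = statistics.stdev(draws)  # bootstrap standard error of the statistic
print(estimate, statistics.mean(draws), boot_se)
```

Any function of the sample could be passed in place of `correlation`, which mirrors the point that the bootstrapped quantity can be any scalar the procedure computes.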
Application
A simple application illustrates estimating a standard error for a nonparametric statistic. Suppose we have two sets of rankings of 100 items, labeled R1 and R2. The rank correlation between them is computed as
Since the statistic is distribution free, we can only approximate its asymptotic variance. The following computes 100 bootstrap samples and estimates the mean, variance, skewness and kurtosis of W and displays a histogram of the estimates.
·Continuous random variables: uniform, normal, lognormal, t, chi-squared, F, exponential, Weibull, Gumbel, gamma, beta, logistic, Cauchy, truncated standard normal
·Discrete random variables: Poisson, discrete uniform, binomial, geometric
·Several estimators use Halton draws instead (pseudo Monte Carlo draws). Users may generate Halton sequences.
Bootstrapping
·Draw specified number of observations from current sample
·Draw with replacement
·Draw specified number of groups in a panel data set
·Execute a procedure a specified number of times, drawing a new bootstrap sample with each repetition
The executed procedure may be used to bootstrap any estimator in the program whether one of the supported procedures or one that is created by the user. The command specifies the matrix or scalar to be bootstrapped. It may be anything that the program specifies.
Estimation
Randomly generated data enter the sample exactly the same as external, real data. Monte Carlo samples may be specified and used with any estimation procedure or any other part of the program that uses data.
Example: Fixed effects probit model
The following program does a Monte Carlo study of the fixed effects probit estimator. We fit the model 20 times with randomly generated data and compute descriptive statistics for the estimated parameter vectors. The underlying theory states that the estimator is biased when T is small. (The example is consistent with it - the estimator appears to be badly biased in our study with T = 5.) This program illustrates several features of LIMDEP.
/*We investigate the behavior of the fixed effects estimator
The following results emerge. The theory appears to be correct; the estimates (means) deviate substantially from the theoretical values of 1.0 and 1.0.
Weighted Least Squares and Weighted Log Likelihoods
Weights may be used with any estimator in any context.
Replication Weights
Weights are automatically scaled to sum to the number of observations in the sample. You may override this to request replications or simple unscaled weights.
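The default scaling described above can be sketched in a few lines of Python. The function names and the data are hypothetical, and the sketch only illustrates the arithmetic, not how LIMDEP implements it.

```python
def scale_weights(weights):
    """Rescale weights so that they sum to the number of observations."""
    n = len(weights)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w * n / total for w in weights]

def weighted_mean(values, weights):
    """Weighted mean; unchanged by any proportional rescaling of the weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

# Hypothetical raw weights and data.
raw = [1.0, 2.0, 3.0, 4.0]
x = [10.0, 20.0, 30.0, 40.0]

scaled = scale_weights(raw)
print(scaled, sum(scaled))
print(weighted_mean(x, raw), weighted_mean(x, scaled))
```

Scaling leaves point estimates such as the weighted mean unchanged but keeps the effective sample size equal to the actual number of observations, which matters for standard errors. Unscaled replication weights instead treat each weight as a count of repeated observations.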
Panel Data
Time invariant weights may be used to weight groups in panel data estimators.
Choice Based Sampling
Endogenous sampling weights may be specified for discrete choice models. This is coupled with a robust estimator of the asymptotic covariance matrix of the estimator. (Manski-Lerman WESML estimator.)
RANDOM NUMBER GENERATION
State of the Art Random Number Generator
LIMDEP uses L’Ecuyer’s multiple part random number generator. This generator has a period of roughly 2^132 draws. (This is more than 10^39.) This is large enough to avoid recycling in any conceivable Monte Carlo study.
·Set seed for the generator: random samples can be replicated.
·Discrete random variables: Poisson, discrete uniform, binomial, geometric
Randomness Tests
The random number generators in LIMDEP and three other programs were subjected to the ‘DIEHARD’ battery of 15 tests of randomness in McCullough’s (Journal of Applied Econometrics, 1999) study. LIMDEP’s generator passed all but one. Only one of the four programs passed all tests and the other two failed most of them.
Halton Sequences
This is a new method of computing simulation based integrals through use of nonrandom sequences. LIMDEP uses these in several places, and provides a function so users can create them for their own estimators.
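For readers unfamiliar with the construction, a Halton sequence for a prime base is obtained by reversing the base-p digits of the integers 1, 2, 3, ... around the decimal point. The Python sketch below shows the standard construction and uses it to approximate a simple integral; it is not LIMDEP's own function, and the names are hypothetical.

```python
def halton(index, base):
    """Element number index (1-based) of the Halton sequence for a prime base."""
    result = 0.0
    fraction = 1.0 / base
    i = index
    while i > 0:
        # Each base-p digit of the index is reflected to the right of the point.
        result += (i % base) * fraction
        i //= base
        fraction /= base
    return result

# First few elements in base 2: 1/2, 1/4, 3/4, 1/8
first_base2 = [halton(i, 2) for i in range(1, 5)]
print(first_base2)

# Approximate the integral of x^2 over [0, 1] (exact value 1/3)
# by averaging the integrand over 1000 Halton points.
points = [halton(i, 2) for i in range(1, 1001)]
approx = sum(p * p for p in points) / len(points)
print(approx)
```

Because the points fill the unit interval evenly rather than randomly, far fewer of them are typically needed than pseudo random draws to reach a given accuracy in a simulated integral.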
DATA DESCRIPTION AND GRAPHICS
Descriptive statistics and graphical tools provide a range of options for preliminary description of your data set. Data description includes computation of statistics as well as graphical tools such as scatter plots and time plots.
·Means (arithmetic, geometric), standard deviations, minima, maxima
·Medians, sample quantiles (deciles, quartiles)
·Covariances
·Correlations (Pearson, rank)
·Coefficient of concordance for a set of ranks
·Autocorrelations
·Canonical correlations
·Principal components
·Condition number for data matrices
Normality Test
·Skewness, kurtosis
·Normal-quantile plot
·Chi-squared test
Example:
This is a description of Mroz’s (1987) Labor Supply Data. FAMINC is family income, stratified by KIDS, which indicates whether there are children in the household (1 = no, 2 = yes).
This is a plot of generalized cost against travel time from a study of transportation mode choice.
·Line styles
·Specify ranges on axes
·Titles and labels
Histograms
·Discrete or continuous data
·Specify bins, proportions or intervals
Graphical displays
·Kernel density plot
·Matrix of scatter plots
·Line plots for time series
·Multiple time series plots
·Box and whisker plots
·Residual plots (against observation, time, or other variables)
·User specified function plots
·Plot predictions from time series models
·Normal quantile plots
·Survival and hazard function plots
MODEL ESTIMATION AND ANALYSIS
Well over 100 model formulations for continuous, discrete, limited and censored dependent variables are provided. All model frameworks provide a wide variety of different variants. For example, there are well over 20 different types of models for count data.
This section lists the packaged estimations such as regression, binary choice and so on. See the Programming list for features provided for user defined estimation programs.
After a model is estimated, you can manipulate the results, compute predictions and marginal effects, or other functions of estimated parameters and model features. A built-in routine will compute nonlinear functions of your model estimates and associated standard errors with no programming by you other than specification of the desired function.
This section provides overall descriptions and lists of program features used in the different types of data analysis. Model estimation and hypothesis testing constitute the majority of this, but the other tools are frequently used as well.
A set of programming instructions including matrix and data manipulation commands and a nonlinear optimization program allows you to specify and estimate your own model if it is not included in the catalog of supported specifications.
This section describes tools that are used to create new estimators and to analyze results produced by the model estimation programs.
Numerical analysis tools are provided for analyzing functions of a variable and functions of the data:
·Function evaluation
·Function plotting
·Compute derivatives and save as gradient vector
·Integration by quadrature
·Integration by trapezoids
·Integration by simulation
·Matrix algebra
  o Characteristic roots
  o Roots of dynamic equations
  o Characteristic vectors
  o Singular value decomposition
·Scientific calculator: dozens of functions
·Values of and asymptotic variances for nonlinear functions
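Two of the integration methods in the list above, trapezoids and simulation, can be compared on a function with a known answer. The Python sketch below integrates the standard normal density over [-1.96, 1.96]; the function names, the number of intervals, the number of draws and the seed are assumptions for the example and do not describe LIMDEP's routines.

```python
import math
import random

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate_trapezoid(func, a, b, n):
    """Composite trapezoid rule with n equal intervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

def integrate_simulation(func, a, b, draws, rng):
    """Monte Carlo integration: (b - a) times the mean of func at uniform draws."""
    total = sum(func(rng.uniform(a, b)) for _ in range(draws))
    return (b - a) * total / draws

a, b = -1.96, 1.96
exact = math.erf(b / math.sqrt(2.0))  # P(-1.96 < Z < 1.96) for a standard normal

by_trapezoid = integrate_trapezoid(normal_pdf, a, b, 200)
by_simulation = integrate_simulation(normal_pdf, a, b, 20000, random.Random(2024))
print(exact, by_trapezoid, by_simulation)
```

The trapezoid rule is very accurate for a smooth one-dimensional integrand like this one, while simulation is noisier but extends more easily to higher dimensions.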
CROSS SECTION, PANEL DATA, AND TIME SERIES FEATURES
Nearly every model supported by the program for cross section analysis may also be extended to a variety of panel data frameworks, including fixed and random effects, hierarchical (random parameters) and latent class formulations. There are also a variety of tools for time series analysis.
This section describes program features related to different types of data: cross section, panel data, and time series.
·Hausman and Taylor instrumental variable estimator for the linear panel model
Hausman and Taylor’s estimator for the random effects model overcomes the possible correlation between the independent variables and the random effects. The random effects model is formulated with the possibility that there may be time invariant independent variables.
There are four sets of variables in the model, the xs which are time varying and the fs which are not time varying, and variables subscripted ‘1’ which are uncorrelated with ui and the remainder which may be correlated with ui. A three step procedure which ends with generalized instrumental variables estimation is used for estimation.
·Arellano/Bond/Bover estimator for the dynamic model.
The Arellano/Bond/Bover estimator is for the dynamic random effects model
This extends the Hausman and Taylor estimator. Two step GMM is used to estimate the model. You have a choice of covariance structures for ui (uncorrelated, random effect, freely correlated across time) and a choice of different sets of instrumental variables for the GMM estimator (variations in the number of future and lagged values). The optimal weighting matrix is computed at the second step. This is their efficient estimator. Bhargava and Sargan’s specification test statistic is presented after estimation.
A small number of fixed effects econometric models can be estimated by conditioning the fixed effects out of the model. These, which can all be fit with LIMDEP, are
·Linear regression (use group mean deviations)
·Binomial logit (condition on sum of outcomes)
·Poisson (condition on sum of outcomes)
·Negative binomial (conditional model is free of effects)
Note that Hausman et al.’s negative binomial model is not a true fixed effects model in the fashion these are usually formulated, that is, in terms of an index function of the form a(i) + b’x(i,t). You can see this by trying to fit their model with its full set of dummy variables and an overall constant term - it works but it shouldn’t. The unconditional fixed effects model described in the section on the unconditional models is a true fixed effects model in that the effect acts to shift the mean.
LIMDEP computes conditional maximum likelihood estimates of all these models.
We define fixed effects models in terms of the density of the observed random variable and an index function,
Density of observed y(i,t) = f[y(i,t), a(i) + b’x(i,t), other parameters]
There is one dummy variable coefficient for each individual or group. Familiar treatments in the literature are the linear regression model, in which the dummy variables are removed by deviations or by first differences, or in the binary logit or Poisson regression, in which the dummy variables are conditioned out and not estimated. LIMDEP’s implementation of this model is unconditional. All the dummy variable coefficients are actually estimated, with up to 20,000 groups, as well as the other model parameters. This is a new estimation method in Version 8.0 of LIMDEP that does not appear in any other software. This allows a far wider range of models than the conditional estimator:
Compare the preceding list to the list of conditional fixed effects estimators in LIMDEP and other programs: linear regression, binary logit, Poisson, negative binomial.
Other features of the unconditional fixed effects estimator include:
·Full maximum likelihood estimation
·Automatic data check for groups of one and groups with no variation
Density of observed dependent variable is f[y(i,t)|x(i,t),u(i)],
where x(i,t) is the vector of observed covariates and u(i) is a time invariant random variable. Random effects models are fit by two step generalized least squares for the linear model or by maximizing the unconditional likelihood for nonlinear models. The unconditional likelihood function is obtained by integrating u(i) out of the conditional likelihood. LIMDEP contains random effects estimators that use the familiar techniques in the literature for the following models.
Integration is done using closed forms, Gauss-Hermite quadrature (Butler-Moffitt) or maximum simulated likelihood: pseudo random draws or Halton sequences.
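To make the quadrature option concrete, the Python sketch below integrates a normally distributed random effect out of the likelihood for one group in a probit model, using a three point Gauss-Hermite rule. It is a deliberately coarse illustration under stated assumptions: a single regressor, a standard normal u(i) scaled by sigma, and hypothetical function names and data. It is not LIMDEP's implementation, which would use more quadrature points.

```python
import math

SQRT_PI = math.sqrt(math.pi)

# Three point Gauss-Hermite rule for the weight function exp(-z*z).
NODES = [-math.sqrt(1.5), 0.0, math.sqrt(1.5)]
WEIGHTS = [SQRT_PI / 6.0, 2.0 * SQRT_PI / 3.0, SQRT_PI / 6.0]

def normal_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expect_standard_normal(func):
    """Approximate E[func(u)] for u ~ N(0, 1) by Gauss-Hermite quadrature."""
    # The change of variable u = sqrt(2) * z maps the normal density
    # onto the Gauss-Hermite weight function.
    return sum(w * func(math.sqrt(2.0) * z) for z, w in zip(NODES, WEIGHTS)) / SQRT_PI

def group_likelihood(y, x, beta, sigma):
    """Unconditional probit likelihood for one group with u(i) integrated out."""
    def conditional(u):
        value = 1.0
        for y_t, x_t in zip(y, x):
            sign = 2 * y_t - 1  # +1 when y = 1, -1 when y = 0
            value *= normal_cdf(sign * (beta * x_t + sigma * u))
        return value
    return expect_standard_normal(conditional)

# Hypothetical group with T = 3 observations.
y_i = [1, 0, 1]
x_i = [0.5, -0.2, 1.0]
print(group_likelihood(y_i, x_i, beta=1.0, sigma=0.8))
```

The sample log likelihood would be the sum over groups of the log of this quantity, maximized over beta and sigma. Replacing the quadrature nodes with pseudo random or Halton draws gives the simulation based alternative mentioned above.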
In addition to the preceding models, you can also fit a ‘random effects’ model by fitting a random parameters model which contains only a random constant term. This adds about 20 models to the list above.
The random parameters model is defined in terms of the density of the observed random variable and the structural parameters in the model:
The random effects model is a special case in which only the constant term is random. The random parameters model has been implemented in other software for the binary logit, linear regression, and Poisson regression model. LIMDEP’s implementation supports a far wider variety of models.
Model | class = familiar structure
Prob[class] = probability model
The class membership is not observed by the analyst. The estimated ‘model’ is therefore a weighted average of the underlying structures. LIMDEP’s implementation of this structure uses a multinomial logit model for the class probabilities, which allows observed covariates to influence the class probabilities. The models supported are:
Results produced include the J class specific parameter vectors for the model (you specify J), and the parameters of the probability model. The estimator is maximum likelihood. (The literature contains a few applications of this model to the linear regression, Poisson and binomial logit and probit models. The preceding extend the model into several new areas.) After estimation, you can compute an estimate of the class from which an observation is drawn (the one with the largest probability), and posterior estimates of the set of class probabilities. A suitably weighted average produces an individual (group) specific parameter vector as well. Results can also include a listing of the class probabilities. Post estimation results can also include predictions, marginal effects, etc.
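The post estimation quantities described above follow from Bayes' rule. The Python sketch below computes posterior class probabilities, the most likely class, and a probability weighted parameter for one individual. All numbers are hypothetical, a single coefficient per class is assumed for brevity, and the code illustrates the arithmetic rather than LIMDEP's routine.

```python
def posterior_class_probabilities(priors, likelihoods):
    """Posterior Prob[class | data] from prior class probabilities and
    the likelihood of the individual's data under each class."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical values for one individual in a J = 2 class model.
priors = [0.6, 0.4]          # estimated class probabilities
likelihoods = [0.02, 0.09]   # likelihood of the observed data in each class
class_betas = [1.0, 3.0]     # class specific coefficient

posterior = posterior_class_probabilities(priors, likelihoods)

# Estimated class: the one with the largest posterior probability.
best_class = max(range(len(posterior)), key=lambda j: posterior[j])

# Individual specific parameter: posterior weighted average over classes.
individual_beta = sum(p * b for p, b in zip(posterior, class_betas))
print(posterior, best_class, individual_beta)
```

Here the data favor the second class strongly enough to overturn its smaller prior probability, so the individual specific coefficient is pulled toward the second class value.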