PASW Missing Value Analysis
Build Better Models When You Estimate Missing Data
Missing data can seriously affect your results. If you ignore missing data or assume that excluding missing data is sufficient, you risk reaching invalid and
insignificant results. To ensure that you enter the data analysis stage using data that takes missing values into account, make PASW Missing Values
(formerly called SPSS Missing Values) part of your data management and preparation step.
PASW Missing Values is a critical tool for anyone concerned about data validity, including survey researchers, social scientists, data miners, and market researchers.
Uncover Missing Data Patterns
With PASW Missing Values, you can easily examine data from several different angles using one of six diagnostic reports to uncover missing data patterns. You can then estimate summary statistics and impute missing
values through regression or expectation maximization algorithms (EM algorithms). PASW Missing Values helps you to:
- Diagnose whether you have a serious missing data problem
- Replace missing values with estimates—for example, impute your missing data with the regression or EM algorithms
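To make the idea of regression-based imputation concrete, here is a minimal Python sketch. It is an illustration only, not PASW syntax and not the product's implementation: the function names, the single-predictor model, and the toy data are all assumptions. It fits a line on the complete cases and fills each missing value with its prediction, optionally adding a randomly drawn observed residual.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x on complete pairs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def regression_impute(rows, add_residual=False, seed=0):
    """rows: list of (x, y) where y may be None. Returns a completed copy.

    Hypothetical helper: with add_residual=True, a residual drawn at random
    from the complete cases is added to each predicted value.
    """
    complete = [(x, y) for x, y in rows if y is not None]
    a, b = fit_line([x for x, _ in complete], [y for _, y in complete])
    residuals = [y - (a + b * x) for x, y in complete]
    rng = random.Random(seed)
    completed = []
    for x, y in rows:
        if y is None:
            y = a + b * x
            if add_residual:
                y += rng.choice(residuals)
        completed.append((x, y))
    return completed

# Made-up data: y is missing for two cases.
rows = [(1, 2.0), (2, 4.0), (3, None), (4, 8.0), (5, None)]
completed = regression_impute(rows)
```

Adding a random element keeps the imputed values from looking artificially precise; without one, every imputed point sits exactly on the fitted line.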
Quickly and Easily Diagnose Your Missing Data
Quickly diagnose a serious missing data problem using the data patterns report, which provides a case-by-case overview of your data. This report helps
you determine the extent of missing data; it displays a snapshot of each type of missing value and any extreme values for each case.
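As a rough illustration of what a case-by-case pattern overview captures, the Python sketch below tabulates which variables are missing in each case and how often each pattern occurs. It is not the product's report: the records, variable names, and the "X"/"." notation are made up for the example.

```python
from collections import Counter

# Made-up survey records; None marks a missing value.
records = [
    {"age": 34, "income": 52000, "children": 2},
    {"age": 41, "income": None, "children": None},
    {"age": None, "income": 38000, "children": 0},
    {"age": 29, "income": None, "children": 1},
    {"age": 50, "income": None, "children": None},
]
variables = ["age", "income", "children"]

def pattern(record):
    """Return a string such as '.XX': 'X' = missing, '.' = present."""
    return "".join("X" if record[v] is None else "." for v in variables)

# How many cases share each missingness pattern.
pattern_counts = Counter(pattern(r) for r in records)

# Percent of cases missing each variable.
percent_missing = {
    v: 100.0 * sum(r[v] is None for r in records) / len(records)
    for v in variables
}

for p, count in pattern_counts.most_common():
    print(p, count)
print(percent_missing)
```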
Reach More Valid Conclusions
Replace missing values with estimates and increase the chance of receiving statistically significant results. Remove hidden bias from your data by replacing missing values with estimates to include all groups in your
analysis—even those with poor responsiveness.
Use Multiple Imputation to Replace Missing Data Values
PASW Missing Values' multiple imputation procedure will help you understand patterns of "missingness" in your dataset and enable you to replace missing
values with plausible estimates. It offers a fully automatic imputation mode that chooses the most suitable imputation method based on characteristics of your
data, while also allowing you to customize your imputation model.
Multiple imputation proceeds in three steps. First, several complete datasets are generated (typically three to five), each with a different set of replacement values.
Second, you model the individual datasets using the usual techniques, such as linear regression, to produce parameter estimates for each dataset.
Third, you obtain final parameter estimates by pooling the individual sets of parameter estimates from the second step and computing inferential statistics that take into account variation within and between imputations.
Analysis of the individual datasets and pooling of the results are supported via select existing PASW Statistics procedures such as REGRESSION. When operating on datasets with imputed values, existing procedures will
automatically produce pooled parameter estimates.
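The pooling step can be sketched with the combining rules commonly used for multiple imputation (often called Rubin's rules). This is an assumption about the general technique, not a statement of how PASW Statistics computes its pooled output; the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pool(estimates, std_errors):
    """Pool one parameter across m imputed datasets.

    estimates:  the parameter estimate from each imputed dataset
    std_errors: the matching standard error from each dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = mean(estimates)                      # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                # between-imputation variance (n - 1 divisor)
    total = within + (1 + 1 / m) * between       # total variance
    return q_bar, sqrt(total)

# Made-up regression coefficients from three imputed datasets.
pooled_est, pooled_se = pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The pooled standard error is larger than any single dataset's standard error whenever the estimates disagree, which is how the uncertainty caused by the missing data enters the final inference.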
Fill in the Blanks for Improved Data Management
Expand on PASW Statistics Base's capabilities with PASW Missing Values. Make better decisions about your data when you can fill in the blanks to create
higher-value data and build better models. PASW Missing Values, a PASW Statistics module, provides you with procedures for data management and
preparation. It also plugs easily into other PASW Statistics modules, so you can work seamlessly in the PASW Statistics environment.
PASW Missing Values has the statistics you need to fill in missing data:
- Univariate: compute count, mean, standard deviation, and standard error of mean for all cases excluding those containing missing values, count and percent of missing values, and extreme values for all variables
- Listwise: compute mean, covariance matrix, and correlation matrix for all quantitative variables for cases excluding missing values
- Pairwise: compute frequency, mean, variance, covariance matrix, and correlation matrix
- Expectation maximization (EM) algorithm
  - Estimate the means, covariance matrix, and correlation matrix of quantitative variables with missing values, assuming normal distribution, t distribution with degrees of freedom, or a mixed-normal distribution with any mixture proportion and any standard deviation ratio
  - Impute missing data and save the completed data as a file
- Regression algorithm
  - Estimate the means, covariance matrix, and correlation matrix of variables set as dependent; set number of predictor variables; set random elements as normal, t, residuals, or none
  - Impute missing data and save completed data as file
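The difference between the listwise and pairwise statistics listed above can be shown in a few lines of Python. This is a sketch under stated assumptions: listwise is taken to mean "cases complete on every variable" and pairwise "cases complete on the two variables being compared"; the helper names and data are made up.

```python
from itertools import combinations
from statistics import mean

# Made-up data; None marks a missing value.
data = {
    "x": [1.0, 2.0, None, 4.0, 5.0],
    "y": [2.0, None, 6.0, 8.0, 10.0],
    "z": [1.0, 3.0, 5.0, None, 9.0],
}

def complete_cases(data, names):
    """Indices of cases with no missing value on any variable in names."""
    n = len(next(iter(data.values())))
    return [i for i in range(n) if all(data[v][i] is not None for v in names)]

def covariance(xs, ys):
    """Sample covariance with an n - 1 divisor."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Listwise: only cases complete on all three variables are used.
listwise_rows = complete_cases(data, list(data))
listwise_means = {v: mean([data[v][i] for i in listwise_rows]) for v in data}

# Pairwise: each pair of variables uses every case complete on that pair.
pairwise = {}
for a, b in combinations(data, 2):
    rows = complete_cases(data, [a, b])
    cov = covariance([data[a][i] for i in rows], [data[b][i] for i in rows])
    pairwise[(a, b)] = (len(rows), cov)
```

Here listwise deletion keeps two of the five cases, while each pairwise covariance is based on three, which is why the two approaches can give different estimates from the same file.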
PASW Missing Values also has features that enable you to analyze patterns and manage data, including the ability to:
- Display missing data and extreme cases for all cases and all variables using the data patterns table
- Determine differences between missing and non-missing groups for a related variable with the separate variance t test table
- Assess how much missing data for one variable relates to the missing data of another variable using the percent mismatch of patterns table
- And more
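One way to read the percent mismatch idea is as the share of cases in which one variable is missing while the other is present. The Python sketch below computes that figure; the definition, function name, and data are assumptions for illustration rather than the product's exact calculation.

```python
def percent_mismatch(a, b):
    """Percent of cases where exactly one of the two values is missing."""
    mismatched = sum((x is None) != (y is None) for x, y in zip(a, b))
    return 100.0 * mismatched / len(a)

# Made-up columns; None marks a missing value.
income = [52000, None, 38000, None, None]
children = [2, None, 0, 1, None]

mismatch = percent_mismatch(income, children)
print(mismatch)
```

A low value suggests the two variables tend to be missing together; a high value suggests their missingness is largely unrelated.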
This separate variance t test table defines two groups of cases: those with data on income and those that are missing data on income. It then
tests whether these two groups differ from each other on a series of variables. The table shows that people with missing
data on income are more likely to have a non-professional occupation, more likely to be female, more likely to be married, and have a larger family than people who reported information on their family income.
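The comparison described here can be sketched as a separate variance (Welch) t test, where cases are split by whether the indicator variable is missing. The Python below is a minimal illustration with made-up numbers, not the product's output; the function name and the family-size data are assumptions.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Separate variance t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up cases: income may be missing, family size is always observed.
income = [52000, None, 38000, 41000, None, 60000, None, 45000, None]
family_size = [2, 4, 3, 2, 5, 4, 3, 3, 6]

# Split family size by whether income was reported.
present = [f for inc, f in zip(income, family_size) if inc is not None]
missing = [f for inc, f in zip(income, family_size) if inc is None]

t_stat, df = welch_t(present, missing)
```

A negative t statistic here means the group that reported income has the smaller mean family size, matching the direction of the finding described above.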