Training Evaluation


A training program is not complete until its effectiveness has been evaluated. The purpose of training evaluation is to determine whether a training program has had its intended effect. Morrow, Jarrett, and Rupinski (1997) found that five of the eighteen training programs they studied cost more than they returned in improved performance. To understand why a program falls short and to plan more effective programs in the future, every training program should include an evaluation.


There are five steps in carrying out a training evaluation:


  1. Define the criteria for evaluation


    Evaluation criteria are the standards against which training effectiveness is compared. Once the purpose of the training is known, criteria can be set to assess whether the training goals have been met.


    Training criteria can be classified into two levels: training level and performance level. Training level criteria are concerned with what trainees have learned by the end of the training, whereas performance level criteria are concerned with which of the newly learned skills have been applied on the job.


    Training level criteria can be further broken down into reaction criteria and learning criteria. Reaction criteria are about how much the trainees liked the training and how much they think they have learned. Learning criteria are concerned with what the trainees have actually learned; that is, what they can demonstrate behaviorally in terms of knowledge and skills.


    Performance level criteria can be further broken down into behavior criteria and results criteria. Behavior criteria concern changes in the trainee's behavior on the job after the training. Results criteria are concerned with whether the training has met its goals, such as reduced costs or increased productivity.


    Both training level and performance level criteria are important in evaluating training effectiveness. Training level criteria are important when the learned skill may not be used on the job immediately, such as the operation of a new system that is to be implemented six months later. Performance level criteria are also important because most training aims at influencing job performance, that is, at transferring the learned knowledge to the real job.


  2. Design the study


    The design of the study specifies how data will be collected and analyzed. The two most common designs are the pretest-posttest design and the control group design.


    Pretest-Posttest Design

    The pretest-posttest design compares trainees before (pretest) and after (posttest) training. This design focuses on measuring the change brought about by training. It can be used to assess how much the trainees learned from the training as well as how much they actually changed their behavior on the job. The pretest and posttest measures can be built into the training program. It is very common to begin a training session with an assessment of what trainees already know and to end it with another assessment of what they have learned. Assessments can also be done on the job, a certain period of time before and after the training.


    One drawback of this design is that it does not rule out other factors that could lead to an increase in knowledge or an improvement in performance. For example, job performance can improve simply because the launch of the training makes employees aware of their own performance gaps.
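
    As a rough illustration of the comparison this design makes, the sketch below computes each trainee's gain from pretest to posttest. The scores and the helper name gain_scores are hypothetical, invented only for this example. Note that a positive mean gain shows change, not its cause, which is the drawback described above.

```python
from statistics import mean, stdev

# Hypothetical assessment scores (0-100) for the same six trainees,
# listed in the same order in both lists.
pretest = [52, 60, 45, 70, 58, 63]
posttest = [68, 71, 59, 74, 70, 75]


def gain_scores(before, after):
    """Return each trainee's change from pretest to posttest."""
    if len(before) != len(after):
        raise ValueError("each trainee needs both a pretest and a posttest score")
    return [post - pre for pre, post in zip(before, after)]


gains = gain_scores(pretest, posttest)
mean_gain = mean(gains)   # average change across trainees
sd_gain = stdev(gains)    # how much the change varies between trainees

print(f"gains per trainee: {gains}")
print(f"mean gain: {mean_gain:.2f}, SD of gains: {sd_gain:.2f}")
```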


    Control Group Design

    A control group design compares two different groups of people: those who attended the training (the training group) and those who did not (the control group). Usually employees are randomly assigned to one group or the other. When assignment is random, differences between the groups in performance and knowledge can be attributed to the training program.


    This design is not always feasible because sometimes it is difficult to randomly assign employees to the two groups.
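
    Where random assignment is feasible, it can be as simple as shuffling the list of eligible employees and splitting it in half. The sketch below is one minimal way to do that; the employee IDs, the function name assign_groups, and the equal split are assumptions made for illustration.

```python
import random


def assign_groups(employee_ids, seed=None):
    """Randomly split employees into a training group and a control group.

    A fixed seed makes the assignment reproducible for auditing.
    """
    rng = random.Random(seed)
    shuffled = list(employee_ids)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical employee IDs E001 through E010.
employees = [f"E{n:03d}" for n in range(1, 11)]
training_group, control_group = assign_groups(employees, seed=42)

print("training group:", sorted(training_group))
print("control group: ", sorted(control_group))
```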


  3. Choose a measure to assess the criteria


    After the criteria have been set, a measure must be chosen to assess the criteria. Different measures can be used to assess different criteria. For example, a questionnaire can be used to measure reaction criteria. Ability or knowledge tests can be used to measure learning criteria. Observation or peer evaluation can be used to assess behavior on the job. Finally, productivity data can be used to assess cost or productivity gains.


  4. Collect data for the study


    Data should be collected using the chosen measures. However, this is sometimes difficult for several reasons. First, it is not easy to randomly choose employees to participate in the training, because organizational hurdles often stand in the way. Second, some employees may not be motivated to provide the information needed because of personal concerns such as privacy or fear of negative performance evaluations. Third, answers given by the employees may be affected by the presence of their colleagues and supervisors. Fourth, employees may feel obligated to give positive answers to please the trainers. Finally, performance may be affected simply because the employees know they are being watched. All these problems should be anticipated, and attempts should be made to minimize them, before data collection is carried out.


  5. Analyze and interpret the data


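    The data collected should be analyzed with appropriate statistics. The most commonly used statistic is the t-test, which can be applied to both designs: a paired (dependent-samples) t-test compares the same trainees' pretest and posttest scores, and an independent-samples t-test compares the training group with the control group.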
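
    The sketch below shows how the t statistic might be computed for each design using only the Python standard library. All scores are hypothetical. For the two-group comparison it uses Welch's form of the t statistic, which does not assume equal variances; that choice is an assumption of this example, not something the text prescribes. The code stops at the t statistic: turning it into a p-value requires the t distribution, from a statistics package or a printed table.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """t statistic for a pretest-posttest design (same people measured twice)."""
    if len(before) != len(after):
        raise ValueError("each trainee needs both a pretest and a posttest score")
    diffs = [post - pre for pre, post in zip(before, after)]
    # Mean difference divided by its standard error.
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def welch_t(group_a, group_b):
    """Welch's t statistic for a control group design (two separate groups)."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical scores, invented for illustration.
pretest = [52, 60, 45, 70, 58, 63]
posttest = [68, 71, 59, 74, 70, 75]       # same trainees after training
control = [55, 62, 50, 69, 60, 61]        # employees who were not trained

print(f"paired t (posttest vs pretest): {paired_t(pretest, posttest):.2f}")
print(f"Welch t (trained vs control):   {welch_t(posttest, control):.2f}")
```

    A larger absolute t value indicates a difference that is large relative to the variability in the scores; whether it is statistically significant depends on the degrees of freedom and the chosen significance level.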


    Analysis should be done at both the training level and the performance level. If the results show that the training works at both levels, the training is effective and should be continued.