HVAC Excellence Test Validation Procedures
By John Hohman
Developing the content
Test content is developed through the input of technical experts. Traditionally called
“focus groups,” these panels are now referred to as “Subject Matter Experts,” or SMEs.
To develop any national test, a minimum of five and a maximum of nine technical
experts drawn from at least three states are essential. Ideally, participation from as many
states as possible is sought.
A work session for the SMEs at the developmental stage generally consists of a “job and
task” analysis of the work done within the field of occupation that the SMEs represent.
A job and task analysis attempts to identify elements of the occupation that can be
treated as individual jobs, and then to identify the tasks that must be completed within
each job element. There may be many jobs within an occupation, and consequently
many tasks that must be done before a job is complete. This is the first step in
developing “Content Validity.”
Depending on the scope of development, SMEs may be involved in identifying other
elements of the occupation, such as tools, equipment, work environment, and conditions
of work that relate to ADA (Americans with Disabilities Act) requirements. Sometimes
this information is already sufficiently documented and doesn’t require additional study.
Developing the structure
After the job and task analysis has been completed, the same or another group of SMEs
is retained to develop the structure of the test. The structure of a test is sometimes
referred to as the “table of test specifications” and is used to determine the content and
emphasis of the test. This is the part of the test-building process that establishes
“Construct Validity.”
It is generally understood that a group of questions (items) may constitute a test of
knowledge in a given field of occupation. Developing the table of test specifications
maintains a degree of control over the content of the test. This process also satisfies the
condition of “Semantic Validity,” whereby the labels relate to the occupation being
evaluated. This control of content helps to ensure that one aspect of validity is
maintained, the second step for “Content Validity.”
The structure, or table of test specifications, identifies the number of questions/items to be
applied to a given section/category of the test. Doing so maintains the content
relationship of the test while individual questions relating to the section/category may be
changed or randomly selected from a test item bank, as sketched below.
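For illustration only, the Python sketch below shows one way a table of test
specifications might be represented and used to draw a test form from an item bank.
The category names, item counts, and field names are hypothetical assumptions, not
HVAC Excellence’s actual specifications.

    import random

    # Hypothetical table of test specifications: items required per category.
    # These categories and counts are illustrative only.
    test_specifications = {
        "Electrical Theory": 20,
        "Refrigeration Cycle": 25,
        "System Components": 15,
        "Safety": 10,
    }

    def assemble_form(item_bank, specifications, seed=None):
        """Randomly draw the specified number of items from each category,
        so content emphasis stays fixed while individual items vary."""
        rng = random.Random(seed)
        form = []
        for category, count in specifications.items():
            candidates = [item for item in item_bank if item["category"] == category]
            if len(candidates) < count:
                raise ValueError("Item bank is short in category: " + category)
            form.extend(rng.sample(candidates, count))
        return form

Because every assembled form draws the same number of items per category, the
content emphasis fixed by the table of test specifications is preserved even though the
individual items differ from form to form.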
Developing the items
Questions on a test are typically referred to as “test items.” Each item takes the form of a
multiple-choice question. The item is made up of three parts: 1) the question, called the
“stem”; 2) a single correct answer; and 3) a set of plausible but incorrect answers called
“distracters.” The number of distracters sometimes helps to elevate the difficulty or level
of the test. Typically there are four choices, one correct and three plausible distracters.
There can be up to seven choices for a given item.
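As a sketch, assuming a simple key/value representation (the field names and the
sample question are hypothetical), one such item might be stored and presented
like this:

    import random

    # One hypothetical multiple-choice item: a stem, one correct answer (the key),
    # and three plausible distracters, yielding the typical four choices.
    item = {
        "category": "Refrigeration Cycle",  # ties the item to the table of test specifications
        "stem": "Which component of the refrigeration cycle raises the refrigerant's pressure?",
        "key": "Compressor",
        "distracters": ["Evaporator", "Condenser", "Metering device"],
    }

    def presented_choices(item, rng=random):
        """Shuffle the key in among the distracters for presentation."""
        options = [item["key"], *item["distracters"]]
        rng.shuffle(options)
        return options

Storing the key separately from the distracters makes scoring trivial, while the
shuffle ensures the correct answer does not always appear in the same position.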
The same or a similar occupational group of SMEs is involved with the development of
test items. Items are generated in many different ways, guided by the “table of test
specifications.” More items are generated than the test requires: items are often
abandoned for many reasons, and additional items are needed to maintain the size or
length of a test.
Items are reviewed for “bias” toward protected groups by persons other than the SMEs.
The test developer selects individuals who have a high degree of sensitivity to biased
language to review the developed test items, in an attempt to eliminate all or most
language that may offend a protected group of people or bias an item against them.
Pilot Testing
Pilot testing is an important step in the development of a test. Pilot testing consists of
identifying individuals within an occupation that are at approximately the target level of
the test. For example, a technician level test may require selecting individuals who have
some level of experience within the occupation to pilot the test.
Pilot participants are selected by knowledgeable people within the occupation, who are
asked to select participants they feel are at the specific level for which the test is
designed. Through this selection process, an aspect of “Criterion-Related Validity” is
generated. Typically, selected participants have already been evaluated to some degree
by the person who selects them; therefore, some degree of relationship exists between
the level of the participant and the level of the test. Other criterion-related test results
might be used to validate this process for a pilot group.
The size of the pilot test group is chosen to generate sufficient data. It is determined by
the number of people within the occupation, the number within a location, and the
number of individuals who can be found to volunteer.
“Face Validity” is generated at this point in the test development process. Face Validity
refers to the recognition of the test title, test categories, and test items as being part of the
field of occupation. Individuals who pilot the test are asked to respond to or comment on
each of these parts of the test.
The pilot test also asks participants to mark items, words, or phrases that might have an
impact on protected groups, in a second attempt to eliminate bias.
Item Analysis
Item analysis is the process of technically reviewing the structure, response, and fit of a
given item and the relationship of that item to the rest of the test. Through item analysis
some levels of “Reliability” validation can be obtained. Item analysis can use any or all
of the following statistical analyses (a sketch of a few of these computations follows the
list):
Kuder-Richardson formulas (KR-20 and KR-21)
Cronbach’s coefficient alpha
Split-half reliability coefficient
Level of difficulty
Easiness scale
Coefficient of equivalence
Spearman-Brown
Standard error of measurement
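As an illustrative sketch only, not HVAC Excellence’s actual tooling, a few of these
statistics can be computed from a matrix of dichotomously scored (0/1) item responses.
The function names are assumptions; the formulas are the standard ones.

    import statistics

    def kr20(responses):
        """Kuder-Richardson Formula 20 reliability for 0/1-scored items.
        `responses` is a list of examinee records, one 0/1 entry per item."""
        k = len(responses[0])                     # number of items
        totals = [sum(person) for person in responses]
        total_variance = statistics.pvariance(totals)
        pq_sum = 0.0
        for i in range(k):
            p = sum(person[i] for person in responses) / len(responses)
            pq_sum += p * (1 - p)                 # variance of a 0/1-scored item
        return (k / (k - 1)) * (1 - pq_sum / total_variance)

    def difficulty(responses, i):
        """Level of difficulty for item i: the proportion answering correctly.
        Higher values mean easier items (the 'easiness' view of the same number)."""
        return sum(person[i] for person in responses) / len(responses)

    def standard_error_of_measurement(responses):
        """SEM: standard deviation of total scores times sqrt(1 - reliability)."""
        totals = [sum(person) for person in responses]
        return statistics.pstdev(totals) * (1 - kr20(responses)) ** 0.5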
Final Formatting
The item analysis will reveal items that are not working as expected. Those items are
either eliminated or their deficiencies repaired. The test is then formatted to the correct
total number of questions, and each section/category is reviewed for the correct number
of questions according to the table of test specifications. Any remaining spelling and
formatting problems are corrected.
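A minimal sketch of this screening step, assuming 0/1 scoring and purely illustrative
cutoffs (real programs set their own criteria), might flag items whose difficulty falls
outside acceptable bounds as candidates for repair or replacement:

    def flag_items(responses, hard_cutoff=0.20, easy_cutoff=0.90):
        """Flag items whose proportion-correct falls outside chosen bounds.
        `responses` is a list of examinee records, one 0/1 entry per item.
        Returns (item_index, proportion_correct) pairs for review."""
        n = len(responses)
        flagged = []
        for i in range(len(responses[0])):
            p = sum(person[i] for person in responses) / n
            if not hard_cutoff <= p <= easy_cutoff:
                flagged.append((i, p))
        return flagged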
Test Delivery
Delivery of the test follows the protocols required for all national tests and requires a
level of security. The delivery process consists of identifying individuals to proctor the
test who have a high degree of personal conviction and who agree to the requirements
for handling and proctoring a national test.
Continuing Analysis
Test results are monitored on a continuing basis. As score anomalies occur, whole tests
or individual items are scrutinized for problems. When a test shows a significant level of
problems, it is slated for review ahead of its scheduled review period. All tests are
reviewed on a three-year cycle.
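One simple way such monitoring could be sketched, as a hypothetical illustration rather
than HVAC Excellence’s actual criteria, is to flag any administration whose mean score
drifts well away from the historical baseline:

    import statistics

    def is_score_anomaly(historical_means, new_mean, z_cutoff=2.0):
        """Flag an administration whose mean score deviates more than
        z_cutoff standard deviations from the historical mean of means.
        The cutoff is illustrative; real review criteria will differ."""
        baseline = statistics.mean(historical_means)
        spread = statistics.pstdev(historical_means)
        if spread == 0:
            return new_mean != baseline
        return abs(new_mean - baseline) / spread > z_cutoff

An administration flagged this way would prompt scrutiny of the whole test or of
individual items, triggering the early review described above.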