A critical evaluation of the Web-based version of the Career Key
Edward M. Levinson, Heather L. Zeman, and Denise L. Ohler
The authors assessed the reliability and validity of the Web-based version of the Career Key (L. K. Jones, 1997). Ninety-nine undergraduates completed the Web-based version of the Career Key and the Self-Directed Search–Form R (J. L. Holland, 1994) in counterbalanced order and completed the Career Key a second time 2 weeks later. Test–retest reliability coefficients ranged from .75 to .86. With the exception of the Conventional scale (.47), all concurrent validity coefficients were at or above .65.
Many articles addressing the use of the Internet by counseling professionals have recently been published in professional journals (see, e.g., Hohenshil, 2000; Sampson, 2000). Attention to Web-based counseling and assessment is well-founded. Web-based assessment has many advantages over conventional methods of assessment. Using the Web for assessment purposes means that tests can be administered to groups of individuals at the same time, it saves time and money, and it reduces data entry errors (Pasveer & Ellard, 1998; Sampson, 2000). Web-based assessment also provides a medium for the creative development of instruments that use animation and audio/video technology (Sampson, 2000). Additional advantages of Web-based assessment include the following:
* adaptation of testing situations for a large number of people with disabilities who may need services
* positive responses from clients
* administration and scoring efficiency
* reduction in errors
* research opportunities (Brown, 1990; Sampson, 1983, 2000; Sampson & Pyle, 1983)
However, Web-based assessment has many disadvantages as well. The lack of counselor–client contact, the lack of personalized follow-up and follow-through, and the potential compromise of the confidentiality of client data have all been cited as disadvantages (Sampson, 2000). Oliver and Zak (1999) examined 24 no-cost career assessment Web sites and concluded that sites were easy to use but (a) provided only a moderate degree of test interpretation, (b) fit into a schema of career planning to only a very limited degree, (c) generally did not provide information about the developers of the site, (d) provided little evidence that assessment instruments contained on the site were validated for self-use, (e) provided limited confidentiality controls, and (f) contained no psychometric data or information about instrument development. Furthermore, Oliver and Zak (1999) indicated that the Webmasters or the organizations that were responsible for the sites did not readily respond to requests for this kind of information.
Counselors should thus recognize that many Web sites contain assessment instruments that are untested and have not been subjected to critical review, and that most Web-based career assessment instruments lack evidence of technical adequacy. Even paper-and-pencil instruments with adequate technical properties may have different properties when they are adapted for use on the Internet. Affective variables such as “computer phobia” and equipment variation from one testing situation to another can affect an individual’s test performance (McKee & Levinson, 1990). Computer-linked factors (e.g., time limits, presentation of items, typing ability) can change the nature of the assessment task so dramatically that the same construct is no longer measured (Hofer & Green, 1985; Moreland, 1992). The psychometric properties of the test, including its reliability, validity, and normative information, may not be comparable between the conventional test and its computerized version (McKee & Levinson, 1990; Moreland, 1992). Nonequivalence can also arise when the computerized version, unlike the conventional paper-and-pencil test, does not allow the examinee to skip around or to know the number of items being administered (Moreland, 1992). When adapting a paper-and-pencil assessment instrument for use on the Internet, test developers should consider these concerns and provide empirical evidence of the psychometric properties of the computerized version.
The National Council on Measurement in Education (1995), in its Code of Professional Responsibilities in Educational Measurement (Section 3.2), suggested that it is the professional responsibility of anyone who selects assessment instruments to “recommend and/or select assessments based on publicly available documented evidence of their technical quality and utility rather than on unsubstantiated claims or statements” (p. 4). Professionals who use computerized versions of tests must demand the same technical adequacy as they do for the conventional paper-and-pencil version of the test (McKee & Levinson, 1990).
One popular paper-and-pencil career assessment instrument that has been adapted for use on the Internet is the Career Key (CK; Jones, 1997). The 1997 paper-and-pencil version is a revision of the original instrument (Jones, 1987); it is shorter and more compact and is offered free for noncommercial use. Users (a) rate the extent to which 24 statements regarding activities, abilities, values, and self-perceptions describe them; (b) rate the extent to which 42 occupations interest or attract them; (c) sum their ratings to calculate their resemblance to each of the six Holland personality types (Holland, 1985); (d) review the jobs list to determine the two or three personality types for which they received the highest scores; and (e) select the jobs that interest them. Studies of the original version of the CK (Jones, 1983, 1987, 1989, 1990, 1993) demonstrated that its psychometric characteristics were comparable to those of other instruments of its type. Internal consistency reliabilities for the CK scales ranged from .64 to .79, and test–retest reliabilities ranged from .66 to .92. Concurrent validity was assessed by comparing students’ first-letter codes on the CK with the first-letter codes of their majors, resulting in a 40% hit rate.
The CK has been adapted for Internet use (Jones, 1997) and is available at http://www.careerkey.org/english. This site also includes links that are designed to assist users of the Web-based version of the CK in using their results for career planning. However, according to its author (L. K. Jones, personal communication, June 2000), the Web-based version of the CK has not been subjected to a test of its psychometric properties. Despite the positive characteristics of the paper-and-pencil version of the instrument, one cannot assume that the Web-based version has the same characteristics.
The purpose of our study was to assess the test–retest reliability and the concurrent validity of the Web-based version of the CK. In addition, the CK Web site was evaluated against the standards published by the National Career Development Association (NCDA) regarding the use of the Internet for career information and planning services (see Appendix A in Harris-Bowlsbey, Dikel, & Sampson, 1998).
Study participants were 99 undergraduate students who were enrolled in two sections of an educational psychology class designed specifically for non-education majors. Although the majority of the students were communications media majors, other majors represented in the class included child-family relations and psychology. A total of 68 women and 31 men participated; all were Caucasian, and 80% were between 20 and 22 years of age.
We assessed the test–retest reliability and concurrent validity of the Web-based version of the CK. According to its author (L. K. Jones, personal communication, June 2000) and information available on the CK Web site, there are currently no reliability and validity studies of this form of the instrument. To assess the concurrent validity of the Web-based version of the CK, the Self-Directed Search–Form R (SDS–R; Holland, 1994) was used as a criterion measure. The CK is based on Holland’s (1985) theory of vocational personalities and work environments, arguably the most popular, frequently used, and widely applicable theory of career choice. Internal consistency reliability of the SDS–R has been reported to be in the .80s and .90s, and test–retest reliability has been reported to be in the .70s and .80s and “moderately high,” based on studies conducted with the 1977 version of the instrument (Daniels, 1994). Validity is reported to be adequate and comparable to that of other interest inventories. According to Daniels (1994), the SDS–R is an excellent vocational instrument and, as such, was an appropriate criterion measure to use in this study.
We made a brief presentation to the students during which we described the purpose of the study, estimated the amount of time that would be needed to participate in the study, and discussed the nature of the questions asked on the CK and SDS–R. Students were told that they could earn extra credit in their classes if they participated in the study; they were also told that if they chose not to participate in the study, they could complete an alternative assignment to earn the same number of extra credit points. Students were told that if they participated in the study, they would have the opportunity to set up an appointment with the instructor to discuss their scores on both the CK and the SDS–R. Volunteers were then solicited. The same introduction to the study was provided to students in both sections of the course.
Participants completed the CK and the SDS–R in a prescribed (counterbalanced) order, one immediately after the other; a coin flip determined which instrument each section of students completed first. Participants were also instructed to complete the CK a second time 2 weeks after completing it the first time.
To assess test–retest reliability, Pearson product-moment correlation coefficients were computed between identical scales across the two administrations of the CK for the entire sample and separately for men and women. Test–retest reliability coefficients for the entire sample ranged between .75 and .86 and had a mean of .82 (Realistic, .75; Investigative, .79; Artistic, .85; Social, .86; Enterprising, .84; Conventional, .82). For men, coefficients ranged between .75 and .89 and had a mean of .81 (Realistic, .83; Investigative, .85; Artistic, .82; Social, .75; Enterprising, .75; Conventional, .89). For women, coefficients ranged between .69 and .87 and had a mean of .80 (Realistic, .69; Investigative, .77; Artistic, .85; Social, .82; Enterprising, .87; Conventional, .81). All correlations were statistically significant at the p < .001 level. Coefficients were slightly higher for men than for women on the Realistic, Investigative, and Conventional scales and slightly higher for women than for men on the Artistic, Social, and Enterprising scales.
To assess concurrent validity, Pearson product-moment correlations were computed between all six scales on each instrument, for the entire sample and separately for men and women. Table 1 presents the correlations between the six CK scales and the six SDS–R scales for the entire sample. Correlations between identical scales on the two instruments ranged from .47 to .83, and all were statistically significant at the p < .001 level. All CK scales except the Conventional scale had correlations at or above .65. Table 2 presents separate concurrent validity data for men and women. For men, correlations between identical scales on the two instruments ranged from .58 to .82; again, all CK scales except the Conventional scale had correlations at or above .65. For women, correlations between identical scales ranged from .43 to .84; with the exception of the Conventional scale, all CK scales had correlations at or above .62. Coefficients were slightly higher for men than for women on the Realistic, Investigative, Social, and Conventional scales and slightly higher for women than for men on the Artistic and Enterprising scales.
To further explore the concurrent validity of the CK, we compared the three-letter codes generated by each instrument. To do this, we conducted two analyses. In the first analysis, we generated three-letter codes for each participant on each instrument (for the CK, the first CK administration was used). We placed the scale with the highest point total in the first position, the scale with the second highest point total in the second position, and so on. In the case of ties (i.e., two scales had an equal point total), more than one three-letter code was constructed, one with each of the scales in that position. Using this procedure, the CK and SDS had the same letter in the first position in 72% of the cases, the same letter in the second position in 37% of the cases, and the same letter in the third position in 41% of the cases.
We used a second procedure to compare the three-letter codes yielded by the CK and SDS. This procedure was identical to the previously described procedure, with the following exception: We applied Holland’s “rule of 8” when constructing three-letter codes for the SDS. On the basis of the standard error of measurement, Holland suggested that any two SDS scales that do not deviate by 8 or more points should be considered equal. Thus, we constructed multiple three-letter codes for the SDS, using the rule of 8, and we compared the resulting codes with the three-letter code yielded by the CK, using the aforementioned procedure. Because no rule comparable to the rule of 8 has been proposed for the CK, we did not use one. Using this procedure to compare the two instruments, the CK and SDS had the same letter in the first position in 94% of the cases, the same letter in the second position in 85% of the cases, and the same letter in the third position in 76% of the cases. In 65% of the cases, the three-letter code(s) constructed for the CK matched one of the three-letter codes constructed for the SDS.
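The two code-comparison procedures just described, ranking the six scale totals into a three-letter code, expanding ties into multiple codes, and (for the SDS only) treating scales within 8 points of each other as tied, can be sketched in Python. This is our illustrative reconstruction under stated assumptions, not the authors' analysis code:

```python
from itertools import permutations

def three_letter_codes(scores, tolerance=1):
    """Every admissible three-letter Holland code for a set of scale totals.

    scores maps a scale letter to its point total. Two scales are treated
    as tied when their totals differ by less than `tolerance`: tolerance=1
    treats only exact ties as equal (as the study did for the CK), whereas
    tolerance=8 approximates Holland's "rule of 8" for the SDS.
    """
    codes = set()
    for code in permutations(scores, 3):
        rest = [x for x in scores if x not in code]
        # no later code letter may out-score an earlier one by `tolerance`+
        ordered = all(scores[code[j]] - scores[code[i]] < tolerance
                      for i in range(3) for j in range(i + 1, 3))
        # no excluded letter may out-score an included one by `tolerance`+
        top3 = all(scores[x] - scores[code[i]] < tolerance
                   for x in rest for i in range(3))
        if ordered and top3:
            codes.add("".join(code))
    return codes

def codes_agree(ck_scores, sds_scores, position):
    """True when some CK code and some rule-of-8 SDS code share the letter
    at the given position (0 = first, 1 = second, 2 = third)."""
    ck = three_letter_codes(ck_scores, tolerance=1)
    sds = three_letter_codes(sds_scores, tolerance=8)
    return any(c[position] == s[position] for c in ck for s in sds)
```

With tolerance=1, only exact ties produce multiple codes; with tolerance=8, the rule of 8 typically yields several admissible SDS codes, which is why agreement rates rise under the second procedure.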
The CK Web site was also evaluated against the NCDA guidelines for using the Internet to provide career information and planning services. Table 3 lists these guidelines and the extent to which, in our judgment, the Web site adheres to them. Our assessment indicates that the Web site adheres to all NCDA guidelines with the following exceptions: The counselor provides no analysis regarding whether a client’s needs can be met by Internet communication; the counselor provides no periodic monitoring (by telephone or videophone teleconferencing) of the client’s progress; and the Web site includes no statement regarding the nature of client information that is electronically stored or the length of time the data will be maintained. In addition, we are uncertain of the extent to which awareness of local conditions exists. Finally, on the basis of our review, we rated two guidelines as “in process”: “Some kinds of content have been extensively tested for online delivery” and “[the CK was] tested in computer delivery mode to assure that their psychometric properties are the same” (Harris-Bowlsbey et al., 1998, p. 53).
The results of this study are encouraging, and they generally support the reliability and validity of the CK. In particular, reliability coefficients obtained in this study for the Web-based version of the CK were generally comparable to those reported for the paper-and-pencil version of the instrument and were similar to those reported for the SDS and for other interest inventories based on Holland’s (1985) theory. Although reliability coefficients of .90 or higher are considered to be good, coefficients of .80 or higher are generally considered to be adequate (Sattler, 1990). Salvia and Ysseldyke (1998) recommended using two standards of reliability in applied settings: one for group data and one for individual data. If test scores are used for administrative purposes and are reported for groups of individuals, the minimum reliability coefficient should be .60. When a test score is used to make important decisions for an individual student (e.g., placement in a particular class or program), the minimum standard should be .90. When a test score is used to make a screening decision for an individual student (e.g., whether a client should receive further assessment), a standard of .80 is recommended. Data for all participants in our study indicated that four of the six CK scales had adequate reliability for making screening decisions, and a fifth scale (Investigative) closely approached this standard. Although one scale (Realistic, .75) did not fully meet the criteria generally considered acceptable for making screening decisions for individuals, analysis by gender suggested that the Realistic scale has adequate stability for men but not for women. Gender analyses indicated that for men, four of the six scales had stability coefficients that exceeded .80, and the remaining two (Social, Enterprising) had coefficients that approached this criterion (.75).
Similarly, four of the six scales had stability coefficients that exceeded .80 for women, and one (Investigative, .77) approached this criterion. Given these data, users of the CK can feel fairly comfortable using the instrument to screen for occupational areas that a client can explore further. However, our results do not support using the CK as the basis for determining which occupation or occupations an individual should pursue, which college major the individual should choose, or the training program in which the individual should be placed.
When evaluating criterion-related validity, one must consider the type of validity evidence provided (concurrent or predictive) and the associated validity coefficients. As Gay (2000) suggested, however, there is no “magic number” that a validity coefficient should reach; generally, the higher the validity coefficient, the more valid the instrument. Sax (1997) noted that concurrent validity coefficients generally exceed predictive validity coefficients and that, although predictive validity coefficients vary considerably, correlations of .60 or above are generally considered to be high. In light of this and on the basis of the results of our study, the criterion-related validity of the Web-based version of the CK can be considered adequate. However, users should be cautious in interpreting the Conventional scale, given its lower than desirable correlation with its corresponding scale on the SDS–R. In our study, the CK Conventional scale correlated nearly as highly with the SDS–R Enterprising scale (.46) as it did with the SDS–R Conventional scale (.47).
Furthermore, the CK Web site meets most of the NCDA standards (Harris-Bowlsbey et al., 1998) described for the use of the Internet for providing career information and planning services. Consistent with the standards, the Web site was developed with input from career counselors and other professionals, and their qualifications were clearly stated. The Web site is available through free public access points on the Internet and is being extensively tested for online delivery. There is a clear statement regarding appropriateness of client needs for receiving the Web site services. However, the Web site does not state whether it is possible for a counselor to analyze the appropriateness of meeting those needs through Internet exchange. Also, there does not seem to be an opportunity for periodic telephone or videophone monitoring of the client’s progress, as is required by the NCDA standards. However, there is evidence that individuals who staff the site identify/refer clients to qualified career counselors, if necessary.
Counselor credentials, agreed-upon goals and cost, where and how clients can report any unethical counselor behavior, and the degree of security and confidentiality on the Internet are all adequately addressed on the site. There is also a statement about the need for privacy during client-counselor communication, and the Web site alerts the client to circumstances that might indicate the need for counseling support. However, there is no indication of the kind of client information that is electronically stored or the length of time data are maintained.
As is required by the NCDA standards, the validity of the computerized administration of the CK is being assessed to ensure that its psychometric properties are adequate. The same ethical guidelines that are followed in the face-to-face mode are adhered to in the Web-based version, and confidentiality is protected. Finally, the Web site indicates that the client will be referred to a qualified career counselor if there is any evidence that the client does not understand the assessment results.
Although our review generally suggests that counselors can be fairly comfortable recommending the Web-based version of the CK to their clients, some cautions seem appropriate. First, the provincial nature of the sample severely limits generalizability of the results. Additional research with different samples (e.g., working adults, college students with different majors) and in settings other than one university is needed before we can be confident that the psychometric properties suggested by our study are applicable to all likely users of the Internet version of the CK. In particular, the limited representation of college majors in our study might have affected the results we obtained. Because the majority of participants were communications media majors, conventional types might have been underrepresented in our sample. Thus, a restricted range of scores might have depressed reliability coefficients on the Conventional scale, in particular. Although our findings generally support the reliability and validity of the Web-based version of the CK and support its use as a screening device, users of the instrument should exercise caution in using and interpreting the Conventional scale until further research is conducted.
TABLE 1
Concurrent Validity of the Web-Based Career Key Using the Self-Directed Search–Form R as a Criterion Measure: Entire Sample

                        SDS–R Scale
CK Scale             R      I      A      S      E      C
Realistic (R)       .65    .34    .26   -.10    .08    .06
Investigative (I)   .34    .70    .19   -.01    .13    .15
Artistic (A)        .21    .10    .83   -.27    .20   -.11
Social (S)         -.23   -.03   -.30    .78   -.11    .18
Enterprising (E)    .25    .13    .29   -.04    .79    .23
Conventional (C)    .22    .10   -.05    .06    .46    .47

Note. Coefficients in bold have p < .001.
TABLE 2
Concurrent Validity of the Web-Based Career Key Using the Self-Directed Search–Form R as a Criterion Measure for Men and Women

                        SDS–R Scale
CK Scale             R      I      A      S      E      C

Men
Realistic (R)       .70    .42    .47   -.19    .23    .17
Investigative (I)   .47    .81    .31    .12    .19    .31
Artistic (A)        .39    .00    .80   -.34    .20   -.18
Social (S)          .05    .10   -.06    .82    .12    .17
Enterprising (E)    .43    .18    .24    .25    .66    .34
Conventional (C)    .63    .17   -.21    .19    .62    .58

Women
Realistic (R)       .62    .28    .15    .02   -.00    .04
Investigative (I)   .27    .64    .14   -.03    .10    .10
Artistic (A)        .12    .11    .84   -.22    .18   -.07
Social (S)         -.09    .01   -.39    .71   -.13    .11
Enterprising (E)    .07    .08    .29   -.06    .82    .21
Conventional (C)    .07    .08   -.14    .01    .41    .43

Note. Coefficients in bold have p < .001.
TABLE 3
Evaluation of the Career Key Web Site Using National Career Development Association Guidelines for the Use of the Internet for the Provision of Career Information and Planning Services

Standard                                                    Is Standard Met?

1. Qualifications of developer or provider
   Developed with content input from professional
     career counselors                                      Yes
   State the qualifications                                 Yes
2. Access and understanding of the
   Aware of free public access points                       Yes
   Aware as possible of local conditions                    Unknown
3. Content of career counseling and planning services
   on the Internet
   Reviewed for appropriateness of content offered in
     this medium                                            Yes
   Some kinds of content have been extensively tested
     for online delivery                                    In process
   New service should be carefully scrutinized to
     determine whether it lends itself to the Internet      Yes
4. Appropriateness of client for receipt of services
   via the Internet
   A clear statement of their needs                         Yes
   An analysis by the counselor of whether meeting
     those needs via Internet exchange is appropriate       No
5. Appropriate support to the client
   Periodic monitoring of the client's progress via
     telephone or videophone                                No
   Identification of a qualified career counselor
     should referral become necessary                       Yes
   Appropriate discussion about face-to-face
6. Clarity of contract with the client
   The counselor's credentials                              Yes
   The agreed-upon goals                                    Yes
   The agreed-upon cost                                     Yes
   Where and how clients can report any unethical
     counselor behavior                                     Yes
   Statement about the degree of security of the
     Internet and confidentiality                           Yes
   A statement of the nature of client information
     electronically stored and the length of time
     that data will be maintained                           No
   A statement about the need for privacy when the
     client is communicating with the counselor             Yes
   Making the client aware of the typical circumstances
     where individuals need counseling support              Yes
7. Inclusion of linkages to other Web sites
   Assuring that the services to which his or her
     site is linked also meet these guidelines
8. Use of assessment
   Tested in computer delivery mode to assure that
     their psychometric properties are the same             In process
   Same ethical guidelines as in face-to-face mode          Yes
   Protect the confidentiality                              Yes
   Any evidence that the client does not understand
     the results, the counselor must refer the client
     to a qualified career counselor                        Yes
   Assessment must have been validated for self-help
     use if no counseling support                           Yes

Note. From Harris-Bowlsbey, Dikel, and Sampson (1998).
Brown, D. T. (1990). Computerized techniques in career assessment. Career Planning and Adult Development Journal, 6(4), 27-36.
Daniels, M. H. (1994). A review of the Self-Directed Search. In J. T. Kapes, M. M. Mastie, & E. A. Whitfield (Eds.), A counselor’s guide to career assessment instruments (3rd ed., pp. 208-212). Alexandria, VA: National Career Development Association.
Gay, L. R. (2000). Educational research: Competencies for analysis and application (6th ed.). Columbus, OH: Merrill.
Harris-Bowlsbey, J., Dikel, M. R., & Sampson, J. P., Jr. (1998). The Internet: A tool for career planning. Columbus, OH: National Career Development Association.
Hofer, P. J., & Green, B. F. (1985). The challenge of competence and creativity in computerized psychological testing. Journal of Consulting and Clinical Psychology, 53, 826-838.
Hohenshil, T. H. (2000). High tech counseling. Journal of Counseling & Development, 78, 365-368.
Holland, J. L. (1985). Making vocational choices: A theory of vocational personalities and work environments. Englewood Cliffs, NJ: Prentice Hall.
Holland, J. L. (1994). The Self-Directed Search–Form R. Odessa, FL: Psychological Assessment Resources.
Jones, L. K. (1983). A comparison of two self-directed career guidance instruments: Occu-Sort and Self-Directed Search. The School Counselor, 30, 204-211.
Jones, L. K. (1987). The Career Key. Raleigh, NC: Author. (Originally published by Ferguson in Chicago)
Jones, L. K. (1989). Measuring a three-dimensional construct of career indecision among college students: A revision of the Vocational Decision scale–The Career Decision Profile. Journal of Counseling Psychology, 36, 477-486.
Jones, L. K. (1990). The Career Key: An investigation of the reliability and validity of its scales and its helpfulness to college students. Measurement and Evaluation in Counseling and Development, 23, 67-76.
Jones, L. K. (1993). Two career guidance instruments: Their helpfulness to students and effect on students’ career exploration. The School Counselor, 40, 191-200.
Jones, L. K. (1997). The Career Key. Retrieved May 2002, from http://www.careerkey.org/english
McKee, L. M., & Levinson, E. M. (1990). A review of the computerized version of the Self-Directed Search. The Career Development Quarterly, 38, 325-333.
Moreland, K. L. (1992). Computer-assisted psychological assessment. In M. Zeidner & R. Most (Eds.), Psychological testing: An inside view (pp. 343-376). Palo Alto, CA: Consulting Psychologists Press.
National Council on Measurement in Education. (1995). Code of Professional Responsibilities in Educational Measurement. (Available from National Council on Measurement in Education, 1230 Seventeenth Street, NW, Washington, DC 20036-3078)
Oliver, L. W., & Zak, J. S. (1999). Career assessment on the Internet: An exploratory study. Journal of Career Assessment, 7, 323-356.
Pasveer, K. A., & Ellard, J. H. (1998). The making of a personality inventory: Help from the WWW. Behavior Research Methods, Instruments, & Computers, 30(2), 309-313.
Salvia, J., & Ysseldyke, J. E. (1998). Assessment (7th ed.). Boston: Houghton-Mifflin.
Sampson, J. P. (1983). Computer-assisted testing and assessment: Current status and implications for the future. Measurement and Evaluation in Guidance, 15, 293-299.
Sampson, J. P. (2000). Using the Internet to enhance testing in counseling. Journal of Counseling & Development, 78, 348-356.
Sampson, J. P., & Pyle, K. R. (1983). Ethical issues involved with the use of computer-assisted counseling, testing and guidance systems. The Personnel and Guidance Journal, 283-287.
Sattler, J. (1990). Assessment of children (3rd ed.). San Diego, CA: Sattler.
Sax, G. (1997). Principles of educational and psychological measurement and evaluation (4th ed.). Belmont, CA: Wadsworth.
Edward M. Levinson, Department of Education and School Psychology, Indiana University of Pennsylvania; Heather L. Zeman, Chicago Public Schools; Denise L. Ohler, Career Services and Enrollment Management, Edinboro University of Pennsylvania. Correspondence concerning this article should be addressed to Edward M. Levinson, 242 Stouffer Hall, Department of Education and School Psychology, Indiana University of Pennsylvania, Indiana, PA 15705 (e-mail: firstname.lastname@example.org).
COPYRIGHT 2002 National Career Development Association
COPYRIGHT 2002 Gale Group