Validity of the German Test Anxiety Inventory (TAI-G) in an Australian sample

Australian Journal of Psychology 2014
doi: 10.1111/ajpy.12058

Tony Mowbray,1 Kate Jacobs,1 and Christopher Boyle2
1Faculty of Education, Monash University, Melbourne, Victoria, and 2School of Education, University of New England,
Armidale, New South Wales, Australia

Abstract

Test anxiety (TA) is a prevalent issue among students that can result in deleterious consequences, such as underachievement.
However, a contemporary measure that has been validated for use with Australian students seems to be lacking. This study, therefore,
investigated the suitability of the German Test Anxiety Inventory (TAI-G) for use with Australian university students. While the
original TAI-G contains 30 items and was designed to measure four factors (worry, emotionality, interference, and lack of confidence),
differing factorial models have been supported in the literature using either the original or a shortened 17-item version of the
measure. These differing TAI-G models were tested and compared in the current study via confirmatory factor analysis using 224
Australian university students. As expected, results supported the superior fit of the 17-item four-factor model. Additionally, the
convergent validity of the measure was supported since measures of self-esteem, self-efficacy, and general anxiety were all found to
correlate significantly with the TAI-G in the hypothesised directions. Finally, the finding that all of the TAI-G subscales had acceptably
high reliabilities led to the conclusion that the 17-item TAI-G is a valid and reliable measure of TA in an Australian university
population.

Key words: German Test Anxiety Inventory, TAI-G, test anxiety, test anxiety Australia, test anxiety measurement

Correspondence: Tony Mowbray, Faculty of Education, Monash University, Building 6, Wellington Road, Clayton Vic. 3800, Australia. Email: [email protected]
Received 26 September 2013. Accepted for publication 24 March 2014.
© 2014 The Australian Psychological Society

Test anxiety (TA) comprises emotional, physiological, cognitive, and behavioural responses in examination-type situations (Zeidner, 1998). Examinations usually determine students’ professional future, making such high-stakes situations anxiety-provoking. When students experience high levels of TA, this can impact upon memory (Mowbray, 2012), which can subsequently reduce performance and well-being (Zeidner, 1998). Furthermore, TA is a worldwide phenomenon (Bodas & Ollendick, 2005) and fairly prevalent, with Knappe et al. (2011) finding that over 28% of 14- to 24-year-olds in a sample of 3,021 had fears regarding testing situations. Moreover, onset of isolated fears regarding test taking was found to rise steadily as students aged, plateauing at around 21 years, thereby highlighting the importance of exploring TA in university student populations.

Consequently, the need for instruments that accurately assess TA is paramount. Bodas and Ollendick (2005) attest to the need to take into account the prevailing psychosociocultural conditions when investigating this construct. This means that TA measures need to demonstrate validity within the intended cultural context. Validity can be described as ‘the degree to which all the accumulated evidence supports the intended interpretation of the test scores for the proposed purpose’ (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999, p. 11). The factorial structure of a commonly used TA measure (the German Test Anxiety Inventory; TAI-G) is subject to ongoing debate. Therefore, this study sought to assess the validity of the TAI-G in an Australian sample, which included testing and comparing multiple factorial models.

The TAI-G

The TAI-G is a multidimensional measure consisting of four subscales: worry, emotionality, lack of confidence, and interference. Worry refers to the cognitive manifestation of anxiety over performance, while emotionality refers to anxious emotional and autonomic reactions in relation to an examination. Development of the TAI-G included the addition of a lack of confidence subscale, defined by Hodapp (1996) as the test taker’s belief in his/her inability to perform well in an upcoming exam. The interference subscale was also added, which relates to the presence of thoughts that

interfere with on-task performance and are not a component of worry per se (e.g., being preoccupied with thoughts in general that cause distraction). Last, the TAI-G contains items referring only to an individual’s experience during the examination situation.

Cultural variants and the TAI-G

The TAI-G has been validated across a range of cultures that include Germany (Keith, Hodapp, Schermelleh-Engel, & Moosbrugger, 2003; Rohrmann, Bechtoldt, Schnell, & Hodapp, 2010), Spain (Sese, Palmer, & Perez-Pareja, 2010), Canada (Harpell & Andrews, 2012), South Africa (Ringeisen, Buchwald, & Hodapp, 2010), and America (Hodapp & Benson, 1997). However, findings from different cultural samples used in these studies may not be generalisable to Australian students given that the reported occurrence and characteristics of anxiety seem to differ as a function of culture. Cultural differences in anxiety have been observed in cognitive, affective, and behavioural components (Zeidner, 1998). Sharma and Sud (1990), for example, conducted a comparative study of TA through using a sample of 7,679 high school students from four Asian and five Euro-American countries. While TA was found to be universal, differences in the intensity and pattern of TA were found both between and within the different cultural groups. When comparing Euro-American cultures, for example, American students reported higher TA when compared with their Italian and German counterparts (p < .001), and reported greater worry, but not emotion, when compared with Turkish (p < .01) and Hungarian (p < .001) students. The authors concluded that observed differences reflected sociocultural and socioeconomic variants. These cultural differences may also be observed in studies using the TAI-G itself. Sese et al. (2010) reported deleting a poor fitting item on the 30-item TAI-G in order to obtain adequate structural validity in a Spanish sample, while in the Argentinian version, a total of two items were removed in order to obtain adequate fit (Heredia, Piemontesi, Burlan, & Hodapp, 2008).

Furthermore, out of these studies, only two have examined the 17-item TAI-G (Harpell & Andrews, 2012; Hodapp & Benson, 1997), making support for the 17-item version limited. Moreover, the participants used in one study were Canadian students from Grades 7 to 12 (Harpell & Andrews, 2012), making generalisation to university populations questionable. In contrast, participants utilised by Hodapp and Benson (1997) comprised undergraduate university students from American and German samples. However, the American sample contained a disproportionate number of graduate students, which may have restricted the range of scores observed, with authors calling for the need for replication ‘in other national or binational samples’ (Hodapp & Benson, 1997, p. 240). Given the impact of cultural variants on validity, limited number of validation studies for the 17-item TAI-G, and limitations of previous research, the question as to whether the 17-item four-factor TAI-G also demonstrates a superior fit in an Australian university population is still to be assessed.

Convergent and discriminant validity indicators of TA

Consistent negative relationships have been found between TA and measures of self-efficacy and self-esteem. Moreover, the extant literature theorises and demonstrates the TAI-G as a measure of trait TA as opposed to state TA (Keith et al., 2003). Therefore, a significantly greater relationship with trait anxiety as opposed to state anxiety may provide discriminant evidence that the TAI-G primarily measures trait TA. Replication of the direction and strength of these relationships in an Australian sample would, therefore, provide evidence of both convergent and discriminant validity of the TAI-G in Australia. While other variables exist that have been found to significantly predict TA, such as neuroticism (Chamorro-Premuzic, Ahmetoglu, & Furnham, 2008), such measures were not included in the current study due to inadequate replication of these results in comparison to constructs of self-esteem, self-efficacy, and general anxiety (Hembree, 1988). Additionally, measures of self-esteem, and particularly self-efficacy, have been used in validation studies of the TAI-G among multicultural samples (Hodapp & Benson, 1997; Keith et al., 2003; Ringeisen et al., 2010; Rohrmann et al., 2010), thereby enabling a more direct comparison between the results of the current study and previous research utilising the TAI-G.

Aims and hypotheses

The aim of this study was to establish the reliability and structural validity of the TAI-G in an Australian university student sample by comparing competing structural models of the TAI-G reported in the extant research via confirmatory factor analysis (CFA). Further, the external validity of the measure was investigated via inspection of bivariate correlations with well-known correlates of TA.

It is hypothesised that when compared with a two-factor conception of TA (Liebert & Morris, 1967), the four-factor TAI-G will be a more valid and reliable measure of TA as confirmed in other multiethnic groups. Moreover, the 17-item, four-factor model of the TAI-G discovered by Hodapp and Benson (1997), and confirmed by Harpell and Andrews (2012), is predicted to fit the data best compared with the 30-item TAI-G, as was found in their research. Moreover, it is predicted that the best fitting models will specify four first-order factors of worry, emotionality, interference, and lack of confidence, and one second-order factor (TA) that accounts for the covariation between the first-order factors (Hodapp & Benson, 1997; Keith et al., 2003). It is also predicted that self-efficacy and self-esteem will have significant negative relationships with scores on the

TAI-G. Finally, it is hypothesised that while the TAI-G will have a significant positive relationship with measures of both state and trait anxiety, the strength of the former association will be significantly less than the strength of the latter since the TAI-G was developed as a measure of trait TA (Keith et al., 2003).

METHOD

Participants

Participants were recruited via opportunistic sampling from various Melbourne universities in Australia. Participants were 224 university students, comprising 184 female (82%) and 40 male (18%) respondents, aged 18–52 years (M = 21.3, standard deviation (SD) = 4.6). The majority of respondents were born in Australia (79%), and described themselves as Australian, or a mixture of both Australian and another ethnicity, while the majority of the remaining sample reported a range of Asian ethnic identities (12%). By way of participant self-report, the sample, on average, had attended university for just under 3 years (M = 2.8, SD = 1.7), with the majority having completed either a high school diploma (60%) or undergraduate (31%) study as their highest level of education obtained. The most frequently reported courses in this sample were arts (22.3%), medicine, nursing and health sciences (20.5%), and science (18.3%).

Measures

TAI-G (Hodapp, 1996)

The TAI-G is a 30-item self-report measure of TA consisting of four subscales: worry (ten items; e.g., ‘I worry about my results’), emotionality (eight items; e.g., ‘I tremble with fear’), interference (six items; e.g., ‘I easily lose my train of thoughts’), and lack of confidence (six items; e.g., ‘I think that I will succeed’). Each item is rated on a 4-point Likert scale, ranging from 1 (almost never) to 4 (almost always). In contrast to the other subscales, the lack of confidence subscale contains positively oriented items and is subsequently reverse-scored. A total score ranging from 30 to 120 is created by summing all four subscales, with higher scores indicative of greater TA.

Reliability of the TAI-G subscales has consistently been found to be adequate, with Cronbach alphas exceeding .70 in multiple samples (.73–.92; Harpell & Andrews, 2012; Keith et al., 2003; Ringeisen et al., 2010; Sese et al., 2010).

General Self-Efficacy Scale (GSE; Schwarzer & Jerusalem, 1995)

The GSE Scale is a self-report, 10-item questionnaire that relates to an individual’s belief in his/her ability to overcome a task or cope with adversity (e.g., ‘I can always manage to solve difficult problems if I try hard enough’). Each item is rated on a 4-point Likert scale, ranging from 1 (not at all true) to 4 (exactly true). High scores indicate higher levels of self-efficacy. To capture self-efficacy in relation to examinations, a brief instruction was given, adapted from Keith et al. (2003), which stated: ‘In relation to how you feel toward your studies, please complete the following’. The structure of the GSE has been validated across 25 countries, including Japan, Peru, Spain, America, and Great Britain (N = 19,120; Scholz, Gutiérrez-Doña, Sud, & Schwarzer, 2002). The GSE has demonstrated Cronbach alphas in the range of .79–.88 (Luszczynska, Gutiérrez-Doña, & Schwarzer, 2005).

Rosenberg Self-Esteem Scale (RSES; Rosenberg, 1965)

The RSES is a self-report, 10-item questionnaire that is designed to measure both positive and negative feelings about the self (e.g., ‘On the whole, I am satisfied with myself’). Each item is rated on a 4-point Likert scale, ranging from 1 (strongly agree) to 4 (strongly disagree). Low scores are indicative of low self-esteem. The structure of the RSES has been analysed across 53 nations, including Australia, and has been found to be largely invariant, supporting the cross-cultural validity of this measure (N = 16,998; Schmitt & Allik, 2005). The RSES has demonstrated reliability, with a Cronbach’s alpha of .89 in an Australian sample (Schmitt & Allik, 2005).

State Trait Anxiety Inventory (STAI; Spielberger, Gorsuch, Lushene, Vagg, & Jacobs, 1983)

The STAI consists of both state and trait subscales containing 20 items each. The STAI state subscale measures how anxious the respondent feels in the present moment (e.g., ‘I feel tense’), while the STAI trait subscale is designed to measure how anxious the respondent generally feels (e.g., ‘I am a steady person’). Items on both subscales are measured using a 4-point Likert scale, with the STAI trait subscale ranging from 1 (almost never) to 4 (almost always), and the STAI state subscale ranging from 1 (not at all) to 4 (very much so). Higher scores indicate greater levels of anxiety. Validity of the STAI has been demonstrated through significant correlations with other anxiety measures (Spielberger et al., 1983). Average reliability coefficients calculated over 75 studies are acceptably high for both the STAI state and STAI trait subscales (.91 and .89, respectively; Barnes, Harp, & Jung, 2002).

Procedure

Ethics approval was first obtained from the Monash University Human Research Ethics Committee. Recruitment methods included advertisement via posters, social networking sites, and at the

start of lectures. Participants were required to click on a web site link or enter the link into their Internet browser taking them to the Qualtrics survey web site where the questionnaires were located. The 30-item TAI-G was administered online, along with the GSE, RSES, and STAI, with participation taking approximately 10 min. Completion of the questionnaires implied consent. Participants were required to record demographic information that included gender, ethnicity, and a question asking if students were within 2 weeks of an upcoming exam. The majority of students reported having no major exams within the next 2 weeks (76%), helping ensure the majority of students did not have elevated scores on the TAI-G due to the proximity of an impending examination. The order of presentation for the measures was randomised, with the exception of the TAI-G, which was always presented first. Participants were given the option to enter a prize draw as an incentive for participation.

RESULTS

Sample size for the current study was within recommendations of 5–10 participants per scale item (Streiner, 1994), particularly since the TAI-G items have demonstrated strong factor loadings in previous studies (above .60; Guadagnoli & Velicer, 1988). The internal consistency of each subscale and the overall TAI-G for both 17-item and 30-item models was acceptably high, with all values above .70 (Cronbach, 1951). Table 1 displays the descriptive statistics for each of the TAI-G subscales and total score.

Table 1 Descriptive statistics for the TAI-G

Scale                 No. of items   M       SD      Range    Reliability (α)   Skew   Kurtosis
TAI-G (30-item)
  Worry               10             29.70   6.35    13–40    .88               −.40   −.62
  Emotionality        8              19.92   5.75    9–32     .89               .09    −.82
  Interference        6              14.17   4.67    6–24     .90               .23    −.78
  Lack of confidence  6              16.26   3.80    6–24     .90               −.24   −.43
  Total               30             80.05   16.89   38–120   .95               −.14   −.64
TAI-G (17-item)
  Worry               4              10.93   3.14    4–16     .82               −.37   −.83
  Emotionality        5              12.17   3.11    5–20     .86               −.17   −.84
  Interference        3              2.46    3.33    3–12     .88               .26    −.72
  Lack of confidence  5              13.52   3.23    6–20     .89               −.15   −.52
  Total               17             46.47   9.90    22–68    .91               −.27   −.67

CFA

CFA was conducted on the data using the maximum likelihood estimation procedure with AMOS version 21 (IBM Corp., Armonk, NY, USA). Akaike’s information criterion (AIC) was inspected to allow for comparison of non-nested models for goodness of fit. Difference in AIC values, whereby one model has a lower AIC value than another non-nested model, indicates a superior fitting model (Kline, 2010). AIC was used to compare the first-order models with the second-order models to determine which model is preferred.

Table 2 Overall model fit indices

Model                                            χ²         df    p      CFI   SRMR   RMSEA   AIC        ΔAIC
Two factors: worry and emotionality (18 items)   412.19     134   .000   .87   .07    .10     –          –
Two factors: worry and emotionality (9 items)    105.18     19    .000   .93   .06    .12     –          –
First-order (30 items)ᵃ                          1,012.58   399   .000   .85   .09    .08     1,144.58   –
Second-order (30 items)ᵇ                         1,013.29   401   .000   .85   .09    .08     1,141.29   −3.29
First-order (17 items)ᵃ                          251.43     113   .000   .94   .06    .07     331.43     –
Second-order (17 items)ᵇ                         251.86     115   .000   .94   .06    .07     327.86     −3.57

Note. SRMR = standardized root mean square residual.
ᵃ Model with four first-order factors: worry, emotionality, lack of confidence, and interference.
ᵇ Model with second-order factor accounting for covariation between four first-order factors: worry, emotionality, lack of confidence, and interference.

Table 2 presents the results of the CFA on the different TAI-G models using the criteria outlined above. Different models of the TAI-G were tested to identify how retaining specific factors and variables impacted model fit. Initially, two-factor (emotionality and worry) models were tested in order to examine the earlier conceptualisations of TA (Liebert & Morris, 1967; Spielberger et al., 1983). One of the two-factor models incorporated the 18 items from the emotionality and worry subscales of the original TAI-G (Hodapp, 1996), and the other used the nine items retained from Hodapp and Benson’s (1997) shortened version. The four-factor models were then specified for the original 30-item

TAI-G and the shortened 17-item TAI-G. Last, a second-order structure was imposed on both four-factor models explaining the covariance between first-order primary factors, with the second-order factor labelled ‘Test Anxiety’.

Chi-square was significant for all models tested, indicating poor fit. However, chi-square tests for perfect model fit, making this statistic highly stringent; therefore, alternative fit indices were examined (Kline, 2010; Tabachnick & Fidell, 2013). The two-factor 18-item model did not meet the criteria for good fit, and while the comparative fit index (CFI) value for the two-factor nine-item model indicated adequate fit, the root mean square error of approximation (RMSEA) did not. The first-order 30-item model also failed to meet the required specification for CFI, but indicated better fit over the two-factor models as shown by the RMSEA. Upon closer inspection, lack of fit could have been due to some items loading onto more than one factor and high standardised residual covariances between some of the items, particularly items 2 (‘I think about how important the examination is for me’; z = −3.824 to 3.587), 6 (‘I worry about whether I can cope with being examined’; z = −2.038 to 3.752), and 30 (‘I have the feeling everything is really difficult for me’; z = −1.597 to 4.576). Item 6 was also seen to load onto the interference subscale as opposed to worry, and item 30 was observed to load more strongly onto the emotionality subscale than the interference subscale. In contrast, the first-order 17-item model showed acceptable model fit over all indices, with the exception of chi-square.

The covariance between factors for the 17-item first-order model ranged from .18 to .48 (p < .001), with the lack of confidence factor demonstrating the weakest relationship with the other factors of the TAI-G. However, items 17 and 18 had ambiguous factor loadings and high standardised residual covariance, potentially reducing observed fit statistics. Given the moderate covariation between subscales (with the exception of the lack of confidence subscale), it was expected that a higher order factor accounting for this covariation would produce adequate model fit (Keith et al., 2003). As Table 2 shows, adding a second-order factor improved model fit as seen by the decrease in AIC for the 30-item TAI-G (Fig. 1; ΔAIC = −3.29). However, according to Burnham and Anderson (2004), a difference of this size only borders on the first-order model being evidently less supported than the second-order model. This means the second-order model does not offer strong support for improved fit over the first-order model. Improved fit was also seen for the second-order 17-item TAI-G (ΔAIC = −3.57). Again, the small observed AIC difference offers marginal support for the second-order model over the first-order model (Burnham & Anderson, 2004). Overall, the 17-item second-order TAI-G model provided the best fit, with parameter estimates for this model presented in Fig. 2. All items demonstrated significant and strong loadings (p < .001; Tabachnick & Fidell, 2013), with the existence of a higher order construct relating to the four primary factors supported.

Correlational data

Subscale correlations ranged from .46 to .71 (p < .01) for the 30-item TAI-G and .33 to .63 for the 17-item TAI-G. The lack of confidence subscale demonstrated the weakest correlations with the remaining subscales for both versions of the TAI-G, with the strongest relationships seen among the worry and emotionality subscales. In particular, the worry and emotionality factors of the 30-item TAI-G demonstrated a strong relationship. Table 3 displays the intercorrelations of the subscales for both long and short versions of the TAI-G.

Table 3 Correlations of TAI-G subscales by model

               Worry   Emotionality   Interference   Confidence
Worry          –       .71            .56            −.46
Emotionality   .63     –              .60            −.52
Interference   .46     .49            –              −.46
Confidence     −.48    −.46           −.33           –

Note. All correlations significant at the p < .01 level (one-tailed). Above the diagonal line are values from the 30-item TAI-G; below the diagonal are from the 17-item TAI-G.

The relationships between the 17-item TAI-G and the selected correlates of TA were all significant (p < .01) and in the expected direction. Lack of confidence demonstrated the strongest relationships with self-efficacy and self-esteem, while emotionality had the strongest correlations with all measures of general anxiety. The difference between state and trait anxiety when correlated with the overall TAI-G score was significant. A t-statistic was used to test for significant difference in correlation between the TAI-G and either subscale of the STAI (Chen & Popovich, 2002). A value of t = 3.04 indicated a significantly lower correlation between the TAI-G and the STAI state subscale in relation to the TAI-G and the STAI trait subscale (p < .005). This provides some support for the contention that the TAI-G is a stronger measure of trait anxiety factors as opposed to transient state anxiety. Table 4 reports the correlations of each chosen correlate of TA with the 17-item TAI-G and TAI-G subscales.

To guide researchers and clinicians when attempting to quantify scores, Table 5 shows percentile intervals for each scale and their given score.

Table 4 Intercorrelations of the 17-item TAI-G subscales and TAI-G total with selected TA correlates

               GSE    RSES   STAIT   STAIS
Emotionality   −.33   .53    .61     .54
Worry          −.26   .45    .51     .43
Interference   −.34   .50    .57     .47
Confidence     −.49   .64    .59     .52
TAI-G total    −.45   .68    .73     .63

Note. All correlations significant at the p < .01 level (one-tailed).
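The comparison of the TAI-G total score's correlations with the STAI trait (.73) and STAI state (.63) subscales involves two correlations that share a variable, so the two coefficients are dependent. A minimal sketch of one common test for this situation, Hotelling's t with n − 3 degrees of freedom, is given below. Whether this is the exact variant described by Chen and Popovich (2002) is an assumption, and `r_state_trait` is a hypothetical placeholder because the state-trait correlation is not reported here, so the result should not be expected to reproduce t = 3.04.

```python
import math


def hotelling_t(r_xy: float, r_xz: float, r_yz: float, n: int) -> tuple[float, int]:
    """Hotelling's t for two dependent correlations that share variable x.

    r_xy, r_xz: correlations of x with y and of x with z
    r_yz: correlation between y and z
    n: sample size
    Returns (t statistic, degrees of freedom).
    """
    # Determinant of the 3 x 3 correlation matrix of x, y, and z.
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    if det <= 0:
        raise ValueError("correlations are not jointly admissible")
    t = (r_xy - r_xz) * math.sqrt(((n - 3) * (1 + r_yz)) / (2 * det))
    return t, n - 3


# TAI-G total with STAI trait (.73) and STAI state (.63), from Table 4; n = 224.
r_state_trait = 0.70  # hypothetical value: not reported in the article text
t, df = hotelling_t(0.73, 0.63, r_state_trait, 224)
print(f"t({df}) = {t:.2f}")
```

A positive t indicates that the first correlation passed in (here, TAI-G with trait anxiety) is the larger of the two; the statistic grows as the two compared measures correlate more strongly with each other.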

Figure 1 Standardised solution of the 30-item TAI-G confirmatory model consisting of the four primary factors (emotionality, worry,
interference, and confidence) and a second-order factor (test anxiety). Variances are given in brackets, factor loading located on the arrows,
and squared factor loadings located at the top right of the variables and inside the factor ovals.
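The first-order versus second-order comparisons for the models depicted in Figs 1 and 2 rest on the AIC values in Table 2. A minimal sketch of that arithmetic follows, assuming the convention AIC = χ² + 2q, where q is the number of freely estimated parameters, recovered here as p(p + 1)/2 − df for p observed items. The convention is an assumption, although it reproduces the tabled AIC values, and the function names are illustrative only.

```python
def free_parameters(n_items: int, df: int) -> int:
    """Free parameters q: unique variances and covariances minus model df."""
    return n_items * (n_items + 1) // 2 - df


def aic(chi_square: float, n_items: int, df: int) -> float:
    """AIC under the assumed convention AIC = chi-square + 2q."""
    return chi_square + 2 * free_parameters(n_items, df)


# (chi-square, df) for each four-factor model, taken from Table 2.
models = {
    30: {"first_order": (1012.58, 399), "second_order": (1013.29, 401)},
    17: {"first_order": (251.43, 113), "second_order": (251.86, 115)},
}

delta_aic = {}
for n_items, fits in models.items():
    chi_first, df_first = fits["first_order"]
    chi_second, df_second = fits["second_order"]
    first = aic(chi_first, n_items, df_first)
    second = aic(chi_second, n_items, df_second)
    # Negative differences favour the second-order model.
    delta_aic[n_items] = round(second - first, 2)
    print(f"{n_items}-item: first-order AIC = {first:.2f}, "
          f"second-order AIC = {second:.2f}, difference = {delta_aic[n_items]:.2f}")
```

The second-order models estimate two fewer parameters than their first-order counterparts while chi-square barely changes, which is why the AIC differences are negative but small.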

DISCUSSION

The current study investigated the validity of the TAI-G in an Australian university student population by examining indicators of both internal and external validity. Specifically, four-factor versions of the TAI-G were analysed, the original 30-item TAI-G (Hodapp, 1996), and the shortened, 17-item TAI-G (Hodapp & Benson, 1997). Another two versions of the TAI-G were explored, which attempted to replicate the two-factor structure of emotionality and worry as the components of TA (Liebert & Morris, 1967; Spielberger et al., 1983). Convergent and discriminant validity was also examined through correlation of the TAI-G with selected correlates of TA.

As expected, the 17-item TAI-G showed superior fit above the models tested, including the 30-item TAI-G, a result consistent with previous research (Harpell & Andrews, 2012). Further, the addition of a second-order factor to both

four-factor models of the TAI-G resulted in improved model fit, indicating that the subscales of the TAI-G are representative of the higher construct TA (Hodapp & Benson, 1997; Keith et al., 2003). However, statistics observed after the addition of a second-order factor only weakly supported the presence of a second-order factor for both models. Moreover, the 30-item TAI-G did not adequately fit the data as predicted. This is in contrast to previous research, which has found the 30-item TAI-G to provide at least an adequate fit (Harpell & Andrews, 2012; Hodapp & Benson, 1997; Keith et al., 2003; Ringeisen et al., 2010; Rohrmann et al., 2010).

Figure 2 Standardised solution of the 17-item TAI-G confirmatory model consisting of the four primary factors (emotionality, worry, interference, and confidence) and a second-order factor (test anxiety). Variances are given in brackets, factor loading located on the arrows, and squared factor loadings located at the top right of the variables and inside the factor ovals.

Table 5 Percentile scores for the 17-item TAI-G subscales and TAI-G total

               10th   25th   50th   75th   90th
Worry          8      10     12     15     16
Emotionality   9      11     14     17     19
Interference   4      5      7      9      11
Confidence     9      11     14     16     18
TAI-G total    32     40     48     54     –

Ambiguous factor loadings, that is to say, items observed to load onto more than one factor, and high standardised residual covariances for items 2, 6, and 30 appeared particularly problematic. Specifically, item 30 of the interference subscale was also found to have an ambiguous factor loading in previous studies (Hodapp, Glanzmann, & Laux, 1995; Keith et al., 2003; Sese et al., 2010). Thus, item 30 may represent a problematic item to be removed from the TAI-G.

Despite the 17-item TAI-G providing adequate fit of the data, the fit statistics may be considered just adequate by some authors, such as Hu and Bentler (1999), who consider a CFI of .95 or greater and an RMSEA of .06 or less as indicative of a close fitting model. While the 17-item TAI-G achieved statistics close to these criteria, the values fell short. This seemed to be partly due to the association between some items on the emotionality and worry subscales, particularly items 17 and 18 due to ambiguous factor loadings and high standardised residual covariance. Moreover, these cut-off criteria are ‘rules of thumb’, with strict adherence potentially resulting in higher probability of type I error, as variables such as sample size and model complexity need to

be taken into account (Worthington & Whittaker, 2006). Marsh, Hau, and Wen (2004) also caution against rigidly applying these criteria and point out that the misspecified models used to establish the cut-off criteria by Hu and Bentler (1999) were misspecified by only a small degree and were not representative of real data. In addition, the finding that smaller sample size led to increased rejection of these slightly misspecified models indicates that the 17-item TAI-G provides a good fit for the data.

Similar to previous research (Ringeisen et al., 2010), the interference and lack of confidence factors were found to have the weakest association with the remaining subscales for both versions of the TAI-G. Moreover, both factors had the lowest loadings on the secondary TA factor. With regard to interference, Hodapp and Benson (1997) reported lower factor loadings (.42–.52) than what was found in the current study, but interference did not load as strongly on TA when compared with other confirmatory studies that analysed the 30-item TAI-G (.74–.84; Keith et al., 2003; Ringeisen et al., 2010).

All subscales of the 17-item TAI-G demonstrated high internal consistency, which is consistent with previous studies (Keith et al., 2003; Ringeisen et al., 2010). Unlike previous studies, the interference subscale showed higher item means and variances (refer to Table 1; Keith et al., 2003), and relatively normal score distribution, which reflect endorsement of the items in this subscale. This may be responsible for the interference subscale demonstrating good psychometric properties in this sample for both versions of the TAI-G, whereas previous studies have found interference to be psychometrically the weakest (Hodapp & Benson, 1997; Keith et al., 2003).

With regard to the lack of confidence subscale, factor loadings did not show any improvement from the 17-item version to the 30-item version (refer to Figs 1 and 2). Moreover, the factor loading for this subscale onto the secondary TA factor was consistent with previous findings (Keith et al., 2003; Ringeisen et al., 2010). However, earlier research has found lack of confidence to be better conceptualised as separate to TA altogether (Hodapp & Benson, 1997; Keith et al., 2003). CFA models have shown improved fit when lack of confidence is placed separate to TA, as a correlate of self-efficacy under a higher order factor labelled ‘self-esteem’ (Hodapp & Benson, 1997;

in this study also contains positively coded items; thus, response tendencies may be partly responsible for the lack of confidence subscale having relatively low inter-scale correlation and the strongest association with self-efficacy.

As expected, relationships between the TAI-G and measures related to TA were significant and in the expected direction. The TAI-G demonstrated significant negative relationships with measures of self-esteem and self-efficacy. Moreover, significant positive associations were found between the TAI-G and measures of trait and state anxiety. As predicted, the TAI-G correlated significantly higher with the trait subscale on the STAI than the state subscale. This is in line with theory (Hodapp, 1996; Hodapp et al., 1995) and the findings of Keith et al. (2003), who found the TAI-G measured stable interindividual differences (trait anxiety) to a greater extent than situational specific anxiety (state anxiety). This provides convergent and discriminant validity evidence for the assertion that the TAI-G measures trait test anxiety and is less influenced by situational factors.

Limitations of this study include sampling issues, as it utilised students primarily from Monash University and a significant majority of those participants were female, with males being underrepresented in the sample. A greater number of females are enrolled at Monash University (Monash University Office of Planning and Quality, 2013), but even when taking this larger ratio into account, the sample was still unrepresentative. Furthermore, data on the nature of enrolment (i.e., internal vs external enrolment) were not taken, so while it is assumed the majority of participants were enrolled internally, the actual number cannot be quantified. Moreover, while sample size could be considered adequate, larger sample sizes of 300 or more have been recommended when conducting CFA (Tabachnick & Fidell, 2013), and therefore caution should be taken when generalising these outcomes. Future studies may expand on the current design by attempting to incorporate a more diverse university sample with a larger number of male participants. Further, constructing and examining a lack of confidence subscale that is negatively coded, thereby being consistent with the remaining subscales, will help clarify the impact item wording has on the weaker relationship observed between
Keith et al., 2003). The data reflect this trend, with lack of the lack of confidence subscale and the remaining
confidence demonstrating the strongest associations with subscales.
self-efficacy and self-esteem in relation to the remaining In conclusion, the findings of the current study are con-
subscales, in addition to the smallest inter-scale correlation sistent with previous research supporting the four-factor
for both versions of the TAI-G. conceptualisation of TA, as well as the use of the 17-item
This pattern of results, however, may be due to over the 30-item TAI-G. Furthermore, considering sample
the coding of the items in the lack of confidence subs- limitations, results partially support the 17-item TAI-G as a
cale, which are coded positively while the remaining valid and reliable scale for use in ascertaining TA in Austral-
subscales are coded negatively. The self-efficacy scale used ian university students.
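
As an illustration of the recoding and internal consistency points raised in the discussion above, the following minimal Python sketch reverse-codes a set of item responses and computes Cronbach's alpha. It is not the analysis used in this study. The function names, the 1–4 response scale, and the response values are all hypothetical and serve only to show the arithmetic.

```python
from statistics import variance


def reverse_code(score, low=1, high=4):
    """Flip a Likert response so the endpoints swap (assumes a low..high scale)."""
    return low + high - score


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items in the subscale
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed subscale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: five respondents, three positively worded items.
responses = [
    [4, 3, 4],
    [2, 2, 1],
    [3, 3, 3],
    [1, 2, 1],
    [4, 4, 3],
]

# Recode every item so that high scores point in the same direction
# as the remaining (negatively coded) subscales.
recoded = [[reverse_code(score) for score in row] for row in responses]

print(cronbach_alpha(responses))
print(cronbach_alpha(recoded))
```

Reversing every item in a subscale leaves alpha unchanged, because item and total-score variances are unaffected. Only the sign of the subscale's correlations with other measures flips. Separating the effect of item wording from this purely arithmetic recoding would still require new, negatively worded items, as suggested above.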

REFERENCES

American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Psychological Association.
Barnes, L. L. B., Harp, D., & Jung, W. S. (2002). Reliability generalization of scores on the Spielberger State–Trait Anxiety Inventory. Educational and Psychological Measurement, 62, 603–618. doi:10.1177/0013164402062004005
Bodas, J., & Ollendick, T. H. (2005). Test anxiety: A cross-cultural perspective. Clinical Child and Family Psychology Review, 8(1), 65–88. doi:10.1007/s10567-005-2342-x
Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2), 261–304. doi:10.1177/0049124104268644
Chamorro-Premuzic, T., Ahmetoglu, G., & Furnham, A. (2008). Little more than personality: Dispositional determinants of test anxiety (the Big Five, core self-evaluations, and self-assessed intelligence). Learning and Individual Differences, 18(2), 258–263. doi:10.1016/j.lindif.2007.09.002
Chen, P. Y., & Popovich, P. M. (2002). Correlation: Parametric and nonparametric measures. Newbury Park, CA: Sage Publications.
Cronbach, L. J. (1952). Further evidence on response sets and test design. Educational and Psychological Measurement, 10, 3–31. doi:10.1177/001316445001000101
Guadagnoli, E., & Velicer, W. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103(2), 265–275. doi:10.1037/0033-2909.103.2.265
Harpell, J. V., & Andrews, J. J. W. (2012). Multi-informant test anxiety assessment of adolescents. Psychology (Savannah, Ga.), 3(7), 518–524. doi:10.4236/psych.2012.37075
Hembree, R. (1988). Correlates, causes, effects, and treatment of TA. Review of Educational Research, 58(1), 47–77. doi:10.2307/1170348
Heredia, D., Piemontesi, S., Burlan, L., & Hodapp, V. (2008). Adaptación del inventario alemán de ansiedad ante los exámenes: GTAI-A [German Test Anxiety Inventory Adaptation: GTAI-A]. Evaluar, 8, 46–60.
Hodapp, V. (1996). The TAI-G: A multidimensional approach to the assessment of test anxiety. In C. Schwarzer & M. Zeidner (Eds.), Stress, anxiety, and coping in academic settings (pp. 95–130). Tübingen: Francke.
Hodapp, V., & Benson, J. (1997). The multidimensionality of test anxiety: A test of different models. Anxiety, Stress, and Coping, 10(3), 219–244. doi:10.1080/10615809708249302
Hodapp, V., Glanzmann, P., & Laux, L. (1995). Theory and measurement of test anxiety as a situation-specific trait. In C. Spielberger & D. P. Vagg (Eds.), Test anxiety: Theory, assessment, and treatment. Series in clinical and community psychology (pp. 47–58). Washington, DC: Taylor & Francis.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. doi:10.1080/10705519909540118
Keith, N., Hodapp, V., Schermelleh-Engel, K., & Moosbrugger, H. (2003). Cross sectional and longitudinal confirmatory factor models for the German Test Anxiety Inventory: A construct validation. Anxiety, Stress, and Coping, 16(3), 251–270. doi:10.1080/1061580031000095416
Kline, R. B. (2010). Principles and practice of structural equation modeling (3rd ed.). New York: Guilford Press.
Knappe, S., Beesdo-Baum, K., Fehm, L., Stein, M. B., Lieb, R., & Wittchen, H. U. (2011). Social fear and social phobia types among community youth: Differential clinical features and vulnerability factors. Journal of Psychiatric Research, 45(1), 111–120. doi:10.1016/j.jpsychires.2010.05.002
Liebert, R. M., & Morris, L. W. (1967). Cognitive and emotional components of test anxiety: A distinction and some initial data. Psychological Reports, 20(3), 975–978. doi:10.2466/pr0.1967.20.3.975647
Luszczynska, A., Gutiérrez-Doña, B., & Schwarzer, R. (2005). General self-efficacy in various domains of human functioning: Evidence from five countries. International Journal of Psychology, 40(2), 80–89. doi:10.1080/00207590444000041
Marsh, H. W., Hau, K. T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis testing approaches to setting cutoff values for fit indexes and dangers in over-generalizing Hu & Bentler’s (1999) findings. Structural Equation Modeling, 11, 320–341. doi:10.1207/s15328007sem1103_2
Monash University Office of Planning and Quality. (2013). Preliminary 2013 student profile. Retrieved from http://www.opq.monash.edu.au/us/summary/campus-profiles-2013-prelim-mar13.pdf
Mowbray, T. (2012). Working memory, test anxiety and effective interventions: A review. The Australian Educational and Developmental Psychologist, 29(2), 141–156. doi:10.1017/edp.2012.16
Ringeisen, T., Buchwald, P., & Hodapp, V. (2010). Capturing the multidimensionality of test anxiety in cross-cultural research: An English adaptation of the German Test Anxiety Inventory. Cognition, Brain, Behavior: An Interdisciplinary Journal, 14(4), 347–364.
Rohrmann, S., Bechtoldt, M., Schnell, K., & Hodapp, V. (2010). Validation of the German Test Anxiety Inventory by self-concept scales. Cognition, Brain, Behavior: An Interdisciplinary Journal, 14(4), 401–412.
Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press.
Schmitt, D. P., & Allik, J. (2005). Simultaneous administration of the Rosenberg Self-Esteem Scale in 53 nations: Exploring the universal and culture-specific features of global self-esteem. Journal of Personality and Social Psychology, 89(4), 623–642. doi:10.1037/0022-3514.89.4.623
Scholz, U., Gutiérrez-Doña, B., Sud, S., & Schwarzer, R. (2002). Is general self-efficacy a universal construct? Psychometric findings from 25 countries. European Journal of Psychological Assessment, 18(3), 242–251. doi:10.1027//1015-5759.18.3.242
Schwarzer, R., & Jerusalem, M. (1995). Generalized Self-Efficacy Scale. In J. Weinman, S. Wright, & M. Johnston (Eds.), Measures in health psychology: A user’s portfolio. Causal and control beliefs (pp. 35–37). Windsor, UK: NFER-NELSON.
Sese, A., Palmer, A., & Perez-Pareja, J. (2010). Construct validation for the German Test Anxiety Inventory Argentinean version (GTAI-A) in a Spanish population. Cognition, Brain, Behavior: An Interdisciplinary Journal, 14(4), 413–429.
Sharma, S., & Sud, A. (1990). Examination stress and test anxiety: A cross-cultural perspective. Psychology and Developing Societies, 2(2), 183–201. doi:10.1177/097133369000200203
Spielberger, C. D., Gorsuch, R. L., Lushene, R., Vagg, P. R., & Jacobs, G. A. (1983). Manual for the State-Trait Anxiety Inventory. Palo Alto, CA: Consulting Psychologists Press.
Streiner, D. L. (1994). Figuring out factors: The use and misuse of factor analysis. Canadian Journal of Psychiatry, 39(3), 135–140.
Tabachnick, B. G., & Fidell, L. S. (2013). Using Multivariate Statistics (6th ed.). Boston: Allyn & Bacon.
Worthington, R. L., & Whittaker, T. A. (2006). Scale development research: A content analysis and recommendations for best practices. The Counseling Psychologist, 34(6), 806–838. doi:10.1177/0011000006288127
Zeidner, M. (1998). Test anxiety: The state of the art. New York: Plenum Press.
