Can tourism confidence index improve tourism demand forecasts?

Purpose – The link between confidence and economic decisions has been widely covered in the economic literature, yet it is still an unexplored field in tourism. The purpose of this paper is to address this gap and to investigate the gains in forecast accuracy that can be achieved by combining the UNWTO Tourism Confidence Index (TCI) with statistical forecasts.
Design/methodology/approach – Research is conducted in a real-life setting, using UNWTO's unique data sets of tourism indicators. The UNWTO TCI is pooled with statistical forecasts using three distinct approaches. Forecast efficiency is assessed in terms of accuracy gains and capability to predict turning points in alternative scenarios, including one of the hardest crises the tourism sector has ever experienced.
Findings – Results suggest that the TCI provides meaningful indications about the sign of future growth in international tourist arrivals, and point to an improvement in forecast accuracy when the index is used in combination with statistical forecasts. Still, accuracy gains vary greatly across regions and can hardly be generalised. Findings provide meaningful directions to tourism practitioners on the opportunity cost of producing short-term forecasts using both approaches.
Practical implications – Empirical evidence suggests that a confidence index should not be collected as an input to improve forecasts. It remains a valuable instrument to supplement official statistics, over which it has the advantage of being more frequently compiled and more rapidly accessible. It is also of particular importance in predicting changes in the business climate and capturing turning points in a timely fashion, which makes it an extremely valuable input for operational and strategic decisions.
Originality/value – The use of sentiment indexes as input to forecasting is an unexplored field in the tourism literature.


Introduction
Tourism forecasting is of great value for tourism practitioners, as it provides strategic input to the decision-making process of both governmental bodies and businesses. Forecasts of the long-term development of demand are a fundamental input when formulating and implementing strategies and policies. Short-term forecasts are equally relevant to operational decisions about capacity, pricing and all other factors that can vary in the short run. As tourism is a service sector, the excess output of its providers cannot be stocked, hence short-term adjustments can contribute substantially to optimising the use of existing resources. This is best exemplified by the aviation and hotel pricing models, where adjustments are currently made on a daily basis, if not in real time.
A number of studies also report on the usefulness of qualitative approaches (for reviews, see Peng et al., 2014; Goh and Law, 2011; Witt and Witt, 1995). What still remains largely unexplored is the use of sentiment indexes as input to forecasting (Croce et al., forthcoming; Mihalic et al., 2013; Kester and Croce, 2011).
A sentiment or confidence index measures the current and prospective situation based on the evaluations of a sample of individuals. Business surveys typically collect evaluations from a fixed panel of experts across firms representing the sector of interest, while consumer surveys typically target households. In the broader economic literature, sentiment indexes have been widely employed for the purpose of forecasting. The extent to which these indicators can forecast economic activity has also been exhaustively researched, with findings suggesting that they contain information that goes beyond economic fundamentals and is suitable to improve forecasts (Eppright et al., 1998; Bodo et al., 2000). Consumer confidence indexes, for instance, have a degree of explanatory power in predicting changes in economic output (Christiansen et al., 2013; Taylor and McNabb, 2007; Easaw et al., 2005; Bram and Ludvigson, 1998; Santero and Westerlund, 1996), and good predictive power in identifying discrete turning points in the business cycle, while they prove less efficient in predicting changes in consumer expenditure, with appreciable differences across countries (Taylor and McNabb, 2007).
A confidence index for the tourism sector is measured by the World Tourism Organisation, the United Nations agency specialised in tourism (UNWTO). Since 2003, UNWTO has been surveying the level of confidence in the sector, and regularly disseminating Tourism Confidence Index (TCI) values in its "UNWTO World Tourism Barometer" publication. The Index is meant to assist governments, business leaders and other decision makers around the world in formulating tourism policies. Its usefulness was tested lately, during the 2008/2009 economic and financial crisis, when the UNWTO Tourism Confidence Index proved its ability to capture the impact on the tourism sector as the crisis unfolded. Ex-post analyses revealed a strong correlation between expert prospects and actual values, and renewed the interest in the use of a sentiment index as input to forecast tourism demand.
This study is a first attempt to examine whether the UNWTO TCI helps predict changes in tourism demand. The focus is on the intra-year development of international tourism demand, worldwide and in world regions. The UNWTO TCI is used in combination[1] with statistical forecasts, to test whether sentiment brings benefits in forecast accuracy. Contributions brought by the index, if any, are measured in terms of accuracy gains and assessed in alternative scenarios, including one of the hardest crises the tourism sector has ever experienced (Smeral, 2010). The usefulness of this indicator in predicting turning points is also reported. The outcome of this research is meant to provide operational guidance to tourism practitioners on the opportunity cost of producing short-term forecasts using alternative approaches.

Forecasting short-term tourism demand
Since the 1990s, tourism demand modelling has advanced significantly, with the adoption of up-to-date econometric methodologies, such as error correction models, time varying parameter models and structural equation models. Methods based on state-of-the-art technology have also been introduced recently, among which are genetic algorithms, fuzzy theory and neural networks (for references see Peng et al., 2014; Li et al., 2005; Petropoulos et al., 2006). However, findings are still contradictory and none of the proposed methods has proved superior in all circumstances (Wong et al., 2007; Li et al., 2006a, b; Witt and Witt, 1995).
Econometric models tend to outperform time-series methods when arrivals for a specific origin-destination pair have to be forecast, but often fail in various other areas (Smeral, 2010; Witt and Witt, 1995; Dawes et al., 1994). Whenever the task involves large aggregates of international arrivals, as in this study, auto-regressive moving average methods have proved to be the most efficient approach, as the effect of key exogenous variables, which determine demand at a country level, or the impact of special events, tend to offset each other at that level (see Smeral, 2007; Papatheodorou and Song, 2005; du Preez and Witt, 2003). Under appropriate conditions, ARIMA models can be interpreted as the univariate, hence more parsimonious, time-series representation of structural models (Holden et al., 1990), with the non-negligible benefit of low marginal costs coupled with high marginal benefits. Evidence from the tourism forecasting literature also points to auto-regressive models as the best approach to intra-year forecasts.
Further accuracy improvements can be sought through combination. Judgment and statistical methods are typically conceived as alternative approaches to produce forecasts, though their characteristics suggest treating them as complementary approaches. Statistical models have proved to be superior in detecting patterns objectively and in attributing an equal weight to past and recent observations (Makridakis et al., 1998). Judgment-based forecasts, on the other hand, incorporate "those economic and market phenomena that are known but not quantified" (Caniato et al., 2011) and tend to adjust forecasts more effectively in a timely manner. As qualitative and quantitative approaches also tend to be based on different types of predictors, theory suggests that their combination would lead to results that are more accurate than using each method alone (Bram and Ludvigson, 1998; Garner, 1991).
Forecast combination techniques acknowledge the complementarities of the two approaches, in the attempt to compensate for their deficiencies. Bates and Granger (1969) initiated this field of research, demonstrating that the combination of two unbiased model-based forecasts leads to a new, more accurate forecast. Progressively, studies on the combination of statistical forecasts with judgmental forecasts proved to produce, under appropriate conditions, more accurate results than each approach used separately (Fildes et al., 2006, 2009; Goodwin, 2002; Goodwin and Fildes, 1999), although sometimes gains are only of modest magnitude (Holden et al., 1990).
The many empirical studies available in the general forecasting literature help outline principles for a meaningful combination of forecasts. Empirical evidence suggests that combination performs best when forecasts are based on different information and on the broadest possible range of techniques (see Fildes et al., 2006, 2009; Goodwin, 2002; Flores and White, 1989 for a review of findings). Unbiased forecasts and constant covariance are good premises for the use of weighted approaches (Holden and Peel, 1986). When expert forecasts are combined, heterogeneity of respondents is another desirable characteristic (Figlewski, 1983). Combination works best when used on series of medium to low volatility. When dealing with volatile series, evidence suggests that combined forecasts tend to be consistently less distant from the actual value than individual forecasts, but hardly ever close to it (Wong et al., 2007; Shen et al., 2008).
In the tourism forecasting literature, studies on forecast combination are skewed towards quantitative models. Attempts at combining econometric models have been published, for example, by Li et al. (2006a, b). In general, the combination of forecasts produced with different methods is strongly recommended for the tourism sector (Song et al., 2009), but very few tourism studies have attempted to combine model-based and expert forecasts. Tideswell et al. (2001) combined quantitative forecasts with the results of a Delphi survey. Findings suggest that the forecast accuracy was quite high overall, for both international and domestic visitors, but judgmental forecasts based on a limited number of experts tended to be volatile. More recently, Song et al. (2011) showed that judgmental adjustments of statistical forecasts improve forecast accuracy.
This study suggests using prospects provided by members of UNWTO Panel of Tourism Experts as judgmental forecasts, to be used in combination with statistical forecasts. The rationale is to use information captured through the UNWTO survey to improve the accuracy of model-based forecasts. Accuracy improvements, if any, are also tested in two alternative scenarios, one associated with stability ("business-as-usual") and one with economic or political uncertainty ("crisis").
An original aspect of the proposed approach, compared to the existing literature on forecast combination, is the integration of statistical and judgmental forecasts, as opposed to their combination. Most frequently, the combination of qualitative and quantitative forecasts follows a sequential use of each method: experts use statistical forecasts as input to produce their judgmental forecasts; alternatively, predictive models can be extrapolated from experts' data processing approach and used in quantitative models (Armstrong, 2001). Integration implies pooling qualitative and quantitative forecasts produced separately. Such sets of data are seldom available, and are therefore the object of few studies even in the broad forecasting literature (see for instance Garner, 1991; Bram and Ludvigson, 1998). In this respect, UNWTO expert forecasts and statistical database offer a unique set of data to test the proposed approach in a real-life setting and on a worldwide scale.

Research design
The major aim of this study is to test accuracy gains, if any, brought by the UNWTO TCI when used in combination with statistical forecasts. To achieve this goal the efficiency of three combinatory approaches is tested against individual forecasts, both in terms of accuracy gains and capability to predict turning points.

Statistical forecasts
Actual values of tourism demand are approximated by UNWTO series of international tourist arrivals (ITA), a selection of data included in the UNWTO database on world tourism statistics. This database contains a variety of series for over 200 countries and territories, with data available for most countries. Monthly ITA since 1999 have been aggregated into four-month series and used to estimate model parameters for extrapolative forecasts. Each series of actual data consists of 45 observations. Figure 1 shows series of ITA, actual and forecast values, and series of the TCI for the six regional aggregates. Charts also highlight periods related to crises of various natures.
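The aggregation step described above can be illustrated with a minimal sketch. The function names and the monthly values are hypothetical, and the choice of 180 months (15 years) is an assumption made only because it yields the 45 four-month observations mentioned in the text; it does not reproduce the UNWTO data.

```python
def aggregate_four_month(monthly):
    """Sum consecutive blocks of four monthly values (e.g. Jan-Apr, May-Aug, Sep-Dec)."""
    if len(monthly) % 4 != 0:
        raise ValueError("expected a whole number of four-month periods")
    return [sum(monthly[i:i + 4]) for i in range(0, len(monthly), 4)]


def growth_rates(series):
    """Percentage change on the previous period."""
    return [100.0 * (cur - prev) / prev for prev, cur in zip(series, series[1:])]


# Made-up monthly arrivals: 15 years x 12 months = 180 values (assumption).
monthly_arrivals = [100.0 + m for m in range(180)]

periods = aggregate_four_month(monthly_arrivals)
growth = growth_rates(periods)

print(len(periods))  # 45 four-month observations
print(len(growth))   # 44 growth rates, one fewer than the number of periods
```

Growth rates computed this way are the scale on which forecasts and actual values are compared in the remainder of the paper.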
ARIMA models are used to forecast ITA. Series of ITA have been tested for unit root using the augmented Dickey-Fuller test. For each series, alternative ARIMA models have been estimated and tested using the R "forecast" package, based on the analysis of each series' autocorrelation and partial autocorrelation functions. The best candidate model has been selected on the basis of various diagnostics and evaluation criteria (see Table I[2]; further criteria, such as the results of the Dickey-Fuller test and AICc and BIC values, are available upon request).

Qualitative forecasts
Prospects of the UNWTO TCI are used as judgmental input. Prospects measure the perceived short-term development of the tourism sector on a five-step Likert scale. Prospects are collected by means of regular e-mail surveys among members of the UNWTO Panel of Tourism Experts, a heterogeneous group of tourism experts from different sectors from all over the world. Since the second quarter of 2003, when the survey started, 1,127 experts have participated in at least one of the 33 waves conducted to date, with an average of 300 participating experts per wave. The number of participating experts has regularly grown over time, but differences across regions remain noteworthy: a considerable number of experts provide, on average, estimates for the series Europe (119), Americas (72) and Asia and the Pacific (64), while the respondent bases for the series Africa (20) and Middle East (14) are typically smaller.
The TCI is derived from the ratio between the positive and negative responses collected through this survey (see UNWTO, 2015). In order to be compared with actual values, the index values have been homologised to series of ITA, the latter expressed as percentage change on the previous period. The homologation process takes the form of a linear transformation as follows:
$$\hat{P}_t = a + b P_t$$
where $P_t$ is the index value at time $t$. Intercept $a$ and slope $b$ are estimated through a linear regression between the index and the equivalent series of ITA to return estimated growth values ($\hat{P}_t$) at each time, hereafter referred to as homologised TCI. Series of homologised TCI values are used in combination with statistical forecasts.
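The regression step described above can be sketched as follows. This is a minimal illustration assuming an ordinary least squares fit of ITA growth on the index; the helper names `fit_line` and `homologise` and all the data values are hypothetical and are not taken from the UNWTO series.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def homologise(index_values, a, b):
    """Map index values onto the growth-rate scale of the arrivals series."""
    return [a + b * p for p in index_values]


# Illustrative values only (assumption): index readings and % change in arrivals.
tci = [120.0, 135.0, 90.0, 70.0, 110.0, 140.0]
ita_growth = [4.0, 5.5, 1.0, -1.0, 3.0, 6.0]

a, b = fit_line(tci, ita_growth)
homologised_tci = homologise(tci, a, b)
print([round(v, 2) for v in homologised_tci])
```

Because the transformation is linear, it changes the scale of the index but not the ordering of its values, so the directional information carried by the experts' prospects is preserved.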

Integration approaches
This paper tests the efficiency of three different approaches that can be used to integrate forecasts. One method is the pioneering variance-covariance (VARCO) approach, proposed by Bates and Granger (1969) and based on a quadratic loss function to minimise error variance. An alternative approach is the Discounted Mean Square Forecast Error (DMSFE), which assumes that recent forecasts can better help predict future values and weights them more heavily than distant ones (Diebold and Pauly, 1987; Winkler and Makridakis, 1983). A simple average combination is also computed, as a baseline for comparisons (Armstrong, 1989).
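A minimal sketch of the three approaches for two constituent forecasts is given below. It assumes textbook forms of the weights, which may differ in detail from the specification used in this study: the VARCO weight minimises the variance of the combined error using second moments taken about zero (errors assumed unbiased), and the DMSFE weight is inversely proportional to the discounted sum of squared past errors, with a discount factor `alpha` between 0 and 1. Function names and error values are hypothetical.

```python
def simple_average(f1, f2):
    """Equal-weight combination of two forecast series."""
    return [(x + y) / 2.0 for x, y in zip(f1, f2)]


def varco_weight(e1, e2):
    """Weight on forecast 1 that minimises the combined error variance.

    Assumption: errors are unbiased, so second moments are taken about zero.
    """
    n = len(e1)
    s11 = sum(x * x for x in e1) / n
    s22 = sum(y * y for y in e2) / n
    s12 = sum(x * y for x, y in zip(e1, e2)) / n
    return (s22 - s12) / (s11 + s22 - 2.0 * s12)


def dmsfe_weight(e1, e2, alpha=0.9):
    """Weight on forecast 1 from discounted squared errors (most recent error last)."""
    n = len(e1)
    d1 = sum(alpha ** (n - 1 - t) * e1[t] ** 2 for t in range(n))
    d2 = sum(alpha ** (n - 1 - t) * e2[t] ** 2 for t in range(n))
    return (1.0 / d1) / (1.0 / d1 + 1.0 / d2)


def combine(f1, f2, w1):
    """Weighted combination; the second forecast receives weight 1 - w1."""
    return [w1 * x + (1.0 - w1) * y for x, y in zip(f1, f2)]


# Illustrative past errors of a statistical forecast and of the homologised index.
e_arima = [1.0, -2.0, 1.5, -1.0]
e_tci = [0.5, 1.0, -0.5, 2.0]

w_varco = varco_weight(e_arima, e_tci)
w_dmsfe = dmsfe_weight(e_arima, e_tci, alpha=0.8)
print(round(w_varco, 3), round(w_dmsfe, 3))
print(combine([3.0], [1.0], w_dmsfe))
```

With `alpha` equal to 1 the DMSFE weight reduces to inverse mean-squared-error weighting; smaller values shift the weight towards whichever forecast has performed better recently.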

Accuracy improvements
Combining forecasts should bring gains that justify the use of a sophisticated approach. It is commonly accepted that criteria to distinguish between good and poor forecasts need to be tailored to the specific forecasting task, as each measure synthesises different aspects of forecast error series (Armstrong and Fildes, 1995). In this study, accuracy is expressed in terms of Mean Absolute Errors (MAE), a measure that provides direct and easily interpreted information about the distribution of forecast errors. The MAE has been preferred to accuracy measures based on percentages or squared errors, which are more frequently used in forecasting studies, as the latter would lead to misleading results with series of growth rates: they suffer from instability when actual values are equal or close to zero.
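The instability of percentage-based measures on growth-rate series can be seen in a small illustration. The values below are made up: each forecast misses the actual growth rate by one percentage point, but one actual value lies close to zero.

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error; undefined when an actual value is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


actual_growth = [4.0, 0.1, -3.0]    # % change; the second value is close to zero
forecast_growth = [3.0, 1.1, -2.0]  # every forecast is off by one point

print(mae(actual_growth, forecast_growth))   # about 1 percentage point
print(mape(actual_growth, forecast_growth))  # dominated by the near-zero actual value
```

The MAE reports the one-point miss directly, whereas the percentage measure is inflated by the single observation near zero, which is the situation the choice of MAE is meant to avoid.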
Accuracy gains brought by combining forecasts, if any, are assessed in terms of percentage increments compared to the best individual forecast. Accuracy gains are tested on out-of-sample values, which are further grouped into a "crisis period"[4] and a "business-as-usual" period (see Table I) [5]. The Diebold-Mariano (1995)[6] statistic is finally computed on series of absolute errors, to compare the predictive accuracy of each combinatory approach against the best performing individual forecast. This statistic is particularly suited to comparing the accuracy of model-free forecasts, such as survey-based forecasts. Furthermore, the Diebold-Mariano test accommodates a number of series characteristics, among which is the presence of serially correlated forecast errors (see Diebold and Mariano, 1995, p. 10), which characterises the benchmark of a compound forecast with one of its inputs (see Table II). Coupled with the short forecast horizons, correlation of forecast errors can lead to particularly conservative results of the DM test, with the null hypothesis being rejected too rarely. This may explain the limited number of statistically significant results, and encourages the interpretation of significant ones as solid recommendations about the validity of the corresponding forecasting approach.
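The two quantities described above can be sketched as follows. This is a simplified illustration, not the implementation used for the reported results: the loss differential is the difference in absolute errors, the long-run variance uses autocovariances up to lag h - 1, and the p-value relies on the asymptotic normal approximation, which is rough for short series. Function names and error values are hypothetical.

```python
import math


def accuracy_gain(mae_combined, mae_best_individual):
    """Percentage change in MAE relative to the best individual forecast.

    A negative value means the combined forecast has the lower MAE.
    """
    return 100.0 * (mae_combined - mae_best_individual) / mae_best_individual


def diebold_mariano(e1, e2, h=1):
    """Diebold-Mariano statistic on absolute errors, with a two-sided p-value.

    Assumption: normal approximation, no small-sample correction.
    """
    d = [abs(x) - abs(y) for x, y in zip(e1, e2)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    stat = d_bar / math.sqrt(long_run_var / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Illustrative errors: forecast 2 is usually closer to the actual values.
errors_individual = [1.0, 2.0, 1.5, 2.5]
errors_combined = [0.5, 1.0, 1.5, 1.0]

stat, p_value = diebold_mariano(errors_individual, errors_combined)
print(round(stat, 3), round(p_value, 4))
print(accuracy_gain(0.7, 1.0))  # close to -30: the MAE is 30 per cent lower
```

A positive statistic indicates that the first forecast has the larger average absolute error. For horizons above one the estimated long-run variance can turn out non-positive in short samples, in which case the statistic cannot be computed with this simple estimator.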

Turning points
Patterns of movement in series of ITA present a sequence of downturn and upturn regimes. A turning point is defined as the point when the regime shifts. Series co-movements are assessed using the concordance statistic originally proposed by Harding and Pagan (1999). This simple non-parametric statistic measures the proportion of times two series are in the same state. Series of actual values and forecasts are transformed into binary indicator series ($S_{i,t}$ and $S_{j,t}$), where a value of 0 corresponds to a contraction (i.e. growth rate is 0 or lower) and a value of 1 corresponds to an expansion. The degree of concordance between transformed series is computed as follows:
$$C_{ij} = \frac{1}{T}\left[\sum_{t=1}^{T} S_{i,t} S_{j,t} + \sum_{t=1}^{T} (1 - S_{i,t})(1 - S_{j,t})\right]$$
where $T$ is the sample size. The higher the concordance value, the closer the patterns of the actual and forecast value series are.
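The concordance computation described above can be illustrated with a short sketch. The function names and growth values are hypothetical; the state definition follows the text, with growth of zero or lower counted as a contraction.

```python
def to_state(growth):
    """1 for an expansion (growth above zero), 0 for a contraction (zero or lower)."""
    return [1 if g > 0 else 0 for g in growth]


def concordance(s_i, s_j):
    """Share of periods in which two binary state series coincide."""
    t = len(s_i)
    both_up = sum(a * b for a, b in zip(s_i, s_j))
    both_down = sum((1 - a) * (1 - b) for a, b in zip(s_i, s_j))
    return (both_up + both_down) / t


# Illustrative growth rates (% change on the previous period).
actual = [2.0, -1.0, 0.0, 3.0]
forecast = [1.0, 0.5, -0.2, 2.0]

print(concordance(to_state(actual), to_state(forecast)))  # 0.75
```

In this example the two series share the same state in three of the four periods; the forecast misses only the contraction in the second period.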

Hypotheses
Based on the methodology described above, benefits in forecast accuracy, which can be achieved by combining the homologised TCI and ARIMA forecasts, are tested based on the following set of hypotheses.
For each series, for the whole set of data and for each of the two scenarios (crisis and business-as-usual), it can be expected that:
H1a. The predictive accuracy of a simple average combinatory approach is higher than that of the most accurate individual forecast.
H1b. The predictive accuracy of a VARCO combinatory approach is higher than that of the most accurate individual forecast.
H1c. The predictive accuracy of a DMSFE combinatory approach is higher than that of the most accurate individual forecast.
The following hypothesis concerning individual forecasting approaches is also tested for each series:
H2. The predictive accuracy of the homologised TCI is higher than that of the corresponding statistical model.

Empirical results
The TCI and ARIMA forecasts have been regressed [7] against the corresponding series of ITA, to assess their predictive power. Results (Table I) suggest that, for most series, the homologised TCI explains a larger part of variation in ITA than statistical forecasts. This is consistent with previous research findings, showing that the main value of judgmental forecasts is in predicting directional change. In both cases, the predictive power of forecasts is considerably reduced for the two series marked by high volatility and a small base of experts, namely Africa and the Middle East. Still, for the latter region the TCI returns a more accurate forecast than any other individual or combined forecast (Table III).
Overall results confirm that forecast combination leads to consistent gains in forecast accuracy, which can be generalised only under specific conditions. In four of the six series under examination, at least one of the combinatory approaches returns forecasts that are more accurate than the best individual forecast. Combination is particularly appropriate for the series Africa, where both constituent forecasts perform poorly. For ARIMA forecasts, this can be partly explained by the high volatility of ITA to the region, frequently caused by geo-political unrest. The recent food riots, the uprisings linked to the Arab Spring and the outbreak of the Syrian conflict are the most recent examples of events that negatively affected tourism flows in Africa. The low number of African experts participating in the UNWTO survey may instead explain the underperformance of the TCI. In such a situation, combination successfully pools complementary information provided by each of the constituent forecasts, as each of the combinatory approaches returns more accurate results than its inputs. The Diebold-Mariano test points to a significantly more accurate forecast when the VARCO approach is used, suggesting that controlling for error variance is relevant under such circumstances.
The largest, though not significant, accuracy gains are achieved for the series Asia Pacific (−30 per cent) and Europe (−22 per cent). Compared to the best performing constituent forecast, DMSFE combination brings the MAE down by nearly 1 percentage point, which is an appreciable result. Both series are marked by a comparatively stable growth pattern, which is altered by the impact of large events such as the 2008/2009 economic crisis for the period under examination. DMSFE weights correctly calibrate the contribution of each input according to the operating environment, meaning that the TCI input, which is more sensitive to changes, is assigned a higher weight during periods of instability, while statistical forecasts, which are better at extrapolating long-term trends, are comparatively more relevant in periods of stability. α values suggest that recent values are more relevant to produce accurate forecasts for the series Europe, while a longer memory is crucial to determine appropriate weights for the series Asia and the Pacific. This can be explained by the magnitude of the impact of crises, as opposed to the frequency of their occurrences: the Asia and the Pacific region is more vulnerable to crises than Europe, due to the relatively recent development of its tourism sector and its comparatively higher dependency on visitors from outside the region.
Combination performs poorly for the highest level of aggregation, the World, as well as for the Middle East region. For the former, results hint that a large group of experts best captures the intra-year variations that characterise ITA flows, being in possession of recent information that can alter the values that would be expected at a given time of the year. Results for the Middle East region may be better explained by series characteristics, as well as the concentration of international tourism flows in a few destinations. As with Africa, the Middle East region is marked by a comparatively recent development of tourism, high dependency on extra-regional source markets and frequent unrest, resulting in highly volatile international tourism flows. Volatility, coupled with a comparatively poor availability of comparable statistics, leads to sub-optimal conditions for the use of extrapolative forecasting methods in both regions. The number of Middle Eastern experts contributing to the UNWTO survey is the lowest of all world regions, yet they return the most accurate forecasts among those under examination. This may be explained by the fact that, unlike Africa, some 70 per cent of international tourism to the Middle East is directed to only three countries, namely Saudi Arabia, UAE and Egypt. In this setting, collecting opinions from experts from the largest receiving countries in the region seems to be sufficient to produce better forecasts than each alternative approach.
In line with previous findings (see for instance Croushore, 2005; Howrey, 2001; Bram and Ludvigson, 1998), forecast combination also brings benefits in terms of directional change accuracy. Compared to statistical forecasts, concordance statistics of combined forecasts are at least as good as individual inputs, if not better. Still, they hardly outperform the homologised TCI in terms of concordance statistics. The only exception is the series Europe, for which a simple average combination returns a pattern closer to that of the actual value series than any other forecast.
Results also confirm that forecast combination tends to work best for series of medium to low volatility. In periods of high volatility (Table IV), associated with crises of various natures, a combinatory approach brings accuracy gains to only three out of five series. For Europe, the region where the impact of the crisis lasted longest, a method weighting recent forecasts more heavily than distant ones (DMSFE) correctly assigns increasingly larger weights to the TCI, an indicator that rapidly adjusts to directional changes. This method improves forecast accuracy by 13 percentage points. In Asia Pacific, the region that recovered fastest from the crisis, a method considering the historical performance of the series (VARCO) proves appropriate, and leads to a MAE which is nearly half that of the best individual forecast. Still, in a crisis scenario, the hypothesis of equal accuracy cannot be rejected for any of the combinatory approaches.
The benefit brought by combinatory approaches is best demonstrated in a "business-as-usual" scenario, with series moving more regularly around trend values. Different combination approaches bring generous improvements in MAE values to three out of five series (Table V). For the series Americas, the predictive accuracy of a simple average is significantly higher, in a statistical sense, than that of each constituent forecast. This suggests that the homologised TCI can efficiently adjust patterns extrapolated with statistical models when small-sized events occur.

Predictive accuracy of individual forecasts
The Diebold-Mariano test is also used to identify statistically significant differences in the predictive accuracy of each individual forecast.
Overall, the homologised TCI is typically more accurate than ARIMA forecasts, but the hypothesis of equal accuracy can be rejected only for the most volatile series, the Middle East (p < 0.1).
During the global financial and economic crisis, the index proves significantly more accurate than the ARIMA model for the series Americas, where the model, extrapolating patterns from previous crises of shorter duration, changes sign too early. On the other hand, in a "business-as-usual" scenario, the index is frequently outperformed by its model-based counterpart. Results for the region Europe are particularly noteworthy, as the ARIMA model returns significantly more accurate forecasts than the index, despite the large number and variety of European experts participating in the UNWTO survey.

Caveats and conclusions
The link between confidence and economic decisions has been widely covered in the economic literature, yet it is still an unexplored field in tourism. This study addresses this gap, and demonstrates that a TCI can provide meaningful indications about the sign of future growth in ITA, and that it can also contribute significantly to improving forecast accuracy on specific occasions.
Empirical results show that the UNWTO TCI captures well the changes in tourism demand generated by external shocks, but also by short-term, systematic factors. The UNWTO TCI proves efficient in identifying turning points and, combined with statistical forecasts, it also contributes to increasing the number of correctly signed observations. Directional change is an important aspect of tourism forecasting research (Liu, 1988; Kim and Moosa, 2005), with high practical value, as tourism practitioners are keen to know the timing of change in tourism growth. Limited research has been conducted on forecasting directional change and turning points so far. Improving this aspect can contribute to the effectiveness of both business planning in the private sector and macroeconomic policy making in the public sector.
The capability to predict variations in international tourism demand flows is particularly relevant to assist policy formulation at the moment of a shock, as well demonstrated by the 2008/2009 economic and financial crisis. As the crisis unfolded, a number of challenges arose and increased uncertainty as to the depth and extent of its impact on tourism. Most advanced economies experienced a sharp decline in their economic activities, coupled with rising unemployment (a major source of uncertainty) and concern about recovery opportunities. In Europe, fears of a sovereign debt crisis progressively spread among investors. In this climate, leading indicators' capability to predict future developments of international tourism demand weakened, even over a short-term horizon. The crisis forced major economic institutions to constantly revise their short-term forecasts to keep pace with events, with a trickle-down effect on model-based tourism forecasts. In this scenario, the UNWTO TCI offered effective support to organisations and policy makers, as it estimated the upcoming impacts of the crisis independently from other economic inputs. Ex-post analyses revealed a stronger resilience of tourism compared to other sectors of the economy, a factor that could not have been captured by model-based forecasts.
Empirical evidence suggests that a sentiment index is a valuable instrument to supplement official statistics, over which it has the advantage of being more frequently compiled and more rapidly accessible. Forecast combination proves particularly appropriate for regions where statistical information is scarce or hardly comparable, as for some African and Middle Eastern countries. This factor is also relevant to smaller geographical aggregates, such as regions or cities, where the availability of future-oriented indicators is typically scarce. Expert forecasting is a cost-effective method to obtain indications about the future evolution of a phenomenon, even beyond the short-term horizon. Due to the comparatively low cost of collecting expert opinions, panels of experts could be more widely adopted to compensate for the scarcity of tourism indicators (Croce et al., forthcoming).
Results also suggest that the combination of a sentiment index with quantitative data can be a cost-effective solution to deal with volatile series. Modelling the stochastic component of volatility in ARIMA models implies the development of "asymmetric" volatility models, as negative shocks impact more on tourism demand than positive shocks would. ARIMA-volatility methods tend to sub-perform as stand-alone forecasts, and are best used in combination with other types of forecasts (Coshall, 2009). The approach proposed in this paper may offer a simpler and similarly efficient solution to embed volatility in extrapolative models, whose implementation does not require particular statistical skills.
This study confirms that a sentiment index can efficiently capture the sector's dynamics, and bring these changes into a combined forecast. Yet, the lack of significant results for most series is certainly a major limitation of this study. This can be partly justified by the conceptual discrepancy between the index and actual series: while ITA measure inbound travel flows, the TCI captures changes in the overall tourism sector, including domestic demand flows. This index therefore returns a measurement of the overall business climate rather than just its demand component, which may explain part of the error magnitude.
Used in combination with statistical forecasts, the UNWTO TCI tends to improve forecast accuracy, but results vary greatly across regions and can hardly be generalised. A combination approach is to be preferred when both constituent forecasts perform poorly, as is the case for Africa. The lack of significant results suggests that the combination of forecasts produced separately should be preferred if the goal of the analysis is to avoid the risk of selecting the worst performing model, in line with previous findings (Song et al., 2009).
Recommendations on which combinatory approach to select are also difficult to draw, as performance tends to vary across series and scenarios. Results advise against the use of a simple average combination, especially with volatile series. Weighted approaches, and especially the DMSFE approach, bring appreciable accuracy gains in most cases, although never to a statistically significant level. The VARCO approach seems to perform best with events whose impact is limited in time.
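To make the difference between the three combination schemes concrete, the following is a minimal sketch of how the weights for two competing forecasts could be derived from their past errors. It is an illustration under stated assumptions, not the implementation used in this study: the function names are hypothetical, α is assumed to be the DMSFE discount factor applied to older errors, the DMSFE variant ignores the error covariance, and the VARCO weights follow the standard minimum-variance formula for two forecasts with zero-mean errors.

```python
def simple_average(forecast_a, forecast_b):
    """Equal-weight combination of two forecasts."""
    return 0.5 * forecast_a + 0.5 * forecast_b


def dmsfe_weights(errors_a, errors_b, alpha=1.0):
    """Weights inversely proportional to the discounted sum of squared errors.

    Errors are ordered from oldest to newest. `alpha` (assumed here to be
    the discount factor, 0 < alpha <= 1) down-weights older errors. The
    covariance between the two error series is ignored.
    """
    n = len(errors_a)

    def discounted_sse(errors):
        # The most recent error gets weight alpha**0 = 1.
        return sum(alpha ** (n - 1 - t) * e ** 2 for t, e in enumerate(errors))

    inv_a = 1.0 / discounted_sse(errors_a)
    inv_b = 1.0 / discounted_sse(errors_b)
    w_a = inv_a / (inv_a + inv_b)
    return w_a, 1.0 - w_a


def varco_weights(errors_a, errors_b):
    """Minimum-variance weights using error variances and their covariance.

    Assumes zero-mean forecast errors, so second moments are computed
    around zero.
    """
    n = len(errors_a)
    var_a = sum(e * e for e in errors_a) / n
    var_b = sum(e * e for e in errors_b) / n
    cov = sum(a * b for a, b in zip(errors_a, errors_b)) / n
    w_a = (var_b - cov) / (var_a + var_b - 2.0 * cov)
    return w_a, 1.0 - w_a


def combine(forecast_a, forecast_b, w_a):
    """Weighted combination; the second weight is 1 - w_a."""
    return w_a * forecast_a + (1.0 - w_a) * forecast_b
```

With uncorrelated error series and α = 1 the two weighting schemes coincide, whereas correlated errors make the VARCO weights diverge from the DMSFE weights and can push them outside the [0, 1] interval.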
This study is to be seen as a preliminary step in the assessment of the TCI's predictive power. This paper focuses on four-month prospects as a substitute for judgmental forecasts on international tourism volumes. Further research is needed to assess the index's predictive power against different indicators of tourism performance, on both the demand and the supply side, to better understand its value in tourism forecasting. Research based on full-year prospects may instead be directed to assess the statistical significance of the TCI against other predictors typically used in model-based forecasts, such as GDP and cost of travel. Improvements brought by the use of changes in the confidence index, as opposed to levels, could also be tested.

Notes
1. The terms "combination", "integration" and "aggregation" refer to the process of synthesising different forecast values into a single value. "Combination" is the term occurring most frequently in tourism studies. In the general literature, when the aggregated forecast results from a statistical process, the use of the term "aggregation" seems to be preferred, while "integration" seems to be preferred when qualitative and quantitative forecasts are produced separately, and then pooled into a combined forecast. In the reviewed literature, the three terms seem to be frequently used as synonyms, and the same applies to this paper.
3. This variant ignores the effect of the covariance on weights, which is instead considered in the VARCO approach. This variant of the DMSFE has been selected to avoid the risk of obtaining identical results to a VARCO approach for α = 1.
4. For each series, the crisis period starts with the first period after a turning point that precedes one (or more) negative peak(s) and ends at the turning point associated with the first positive peak after the crisis. This definition entails anomalous growth values, both positive and negative, which can be attributed to the impact of that crisis. On an operational level, this definition grants a sufficient number of observations to evaluate accuracy gains.
5. For the Middle East series, accuracy gains are tested only for the crisis period due to the instability of actual values during the period for which prospects are available.
6. The DM test has been chosen as a measure of significance due to the non-zero mean and serially correlated nature of forecast error series. Empirical applications of the test suggest that on small samples the test can have the wrong size and reject the null hypothesis too often. For this reason, confidence levels start at 0.1.
7. Cubic models best fit series of prospects and arrivals, with the exception of the series World and Americas, which are modelled with a linear and quadratic regression, respectively.
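As an illustration of the significance test referred to in Note 6, the sketch below computes a Diebold-Mariano (DM) statistic for two series of forecast errors. It is a minimal example under stated assumptions and not the procedure applied in this study: squared-error loss is assumed, the function name and the `horizon` parameter are hypothetical, the p-value relies on the asymptotic normal distribution, and no small-sample correction is applied.

```python
from math import sqrt
from statistics import NormalDist, fmean


def diebold_mariano(errors_a, errors_b, horizon=1):
    """Return (DM statistic, two-sided p-value) under squared-error loss.

    A positive statistic means forecast A has the larger average loss.
    The p-value uses the asymptotic standard normal distribution
    (an assumption of this sketch; no small-sample correction).
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("error series must have the same length")

    # Loss differential between the two forecasts at each period.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = fmean(d)

    def autocov(lag):
        # Sample autocovariance of the loss differential at a given lag.
        return sum(
            (d[t] - d_bar) * (d[t - lag] - d_bar) for t in range(lag, n)
        ) / n

    # Long-run variance: autocovariances up to lag horizon - 1.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, horizon))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")

    stat = d_bar / sqrt(long_run_var / n)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
    return stat, p_value
```

On short series such as those used here, the normal approximation tends to over-reject, so p-values from a sketch like this should be read with caution.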
About the author
Valeria Croce has worked as a Research and Development Manager at the European Travel Commission (ETC) since 2012. In her current position she is responsible for devising and implementing the ETC research programme, which comprises trends watch activities, studies of quantitative and qualitative nature and the dissemination of results to governmental organisations and the public at large. Market intelligence has been central to her education and professional experience over the past 12 years. As an Analyst in the public and private sector, she has gained substantial knowledge about tourism and operational experience with data analysis. As a Researcher and Lecturer, she has refined her knowledge of tourism statistics and quantitative methods of analysis. Through collaboration with international organisations, among which are UNWTO, institutions and expert groups, she has gained solid experience with tourism policy making and management. Valeria Croce can be contacted at: valeria.croce@icloud.com