Use of Log-Linear Models in Forecasting Structural Changes in Finnish Non-Industrial Private Forest Ownership

This paper presents how log-linear models can be used for modelling and forecasting structural changes in Finnish non-industrial private forest ownership. Two cross-sectional data sets, collected by means of mail questionnaires in two separate surveys in 1975 and 1990, were employed. A total of six non-industrial private forest holding and ownership attributes are forecast, focusing on the earlier pace of structural change. The results show that the pace of change in the forecast attributes appears to be slower than it would be if derived from extrapolation of the earlier trends. The results of the study can be applied to forest policy and forestry extension planning by providing a more realistic outlook of the future structure of non-industrial private forest ownership.


1 Introduction
In most European countries and in the USA, a major part of the timber supply comes from non-industrial private forests (NIPF). In these countries, a problem faced by public forest policy makers is how to take into consideration the structural change of NIPF ownership in such a way that the goals of forest policy are met with regard to the national role of non-industrial private forestry (Riihinen 1986). NIPF ownership is typically distributed among hundreds of thousands of individuals, and there seems to be a trend towards an increasingly varied ownership structure due to the social restructuring taking place in these countries. Thus, a realistic outlook of the future structure of NIPF ownership is of crucial importance. It is then the task of forest policy makers to employ instruments that blend in with the development trends (Järveläinen 1986, Vehkamäki 1986).
Most research about NIPF owners has concerned the silvicultural or timber sales behaviour of forest owners (e.g. Kuuluvainen et al. 1996), whereas little work has addressed the question of structural changes of ownership (Reunala 1974, Ihalainen 1990, Baughman 1996). Although there are a few structural investigations, they have largely concerned past changes in NIPF ownership. Less attention has been paid to problems of forecasting structural change.
Earlier forecasts have been made either by examining the factors affecting changes in the ownership structure or by merely examining ownership changes. The first approach involves determining the change in a single ownership characteristic by means of explanatory models, so that it is impossible to analyse several structural characteristics simultaneously. Alig (1986) and Plantinga et al. (1990), for instance, analysed changes in the area of forest land by owner group in order to determine the external factors affecting them.
The second approach makes no distinction between several analysed variables. In Sweden, Eriksson (1990) forecast changes in several NIPF ownership characteristics based on scenarios of external elements prepared by official authorities. In Finland, with the exception of preliminary results presented by Ripatti and Järveläinen (1997), forecasts of NIPF ownership characteristics carried out so far have been based on linear trends derived from the absolute pace of change observed in the past (Järveläinen 1988, Järveläinen and Torvelainen 1993, Ripatti 1994).
The problem with the latter approach is that the forecast variables have been treated one by one, so that it has not been possible to take into consideration the effects of other variables. Addressing the effects of the interactions of variables on forecasts is the principal aim of the present study. The objective of the study is to analyse multi-dimensional interactions of forest holding and ownership characteristics by using log-linear models and to forecast their development to the year 2020. Forecasts will be made for both the number of owners or holdings and the area of forest land.
A total of six NIPF holding and ownership attributes are forecast, focusing on the earlier pace of structural change. These attributes were chosen because their changes have been demonstrated to originate from general social change in countries where NIPF ownership dominates (Grayson 1993). The farmer-non-farmer distinction, for instance, is the most used attribute describing the structure of NIPF ownership in these countries (e.g., Birch 1994). In addition, several studies have demonstrated that NIPF owners' forestry behaviour is associated with ownership characteristics (e.g., Binkley 1981, Dennis 1989, Kuuluvainen and Salo 1991). Forest holding and ownership characteristics are also connected with the values and objectives of forest ownership (Karppinen 2000). The paper is structured as follows. A description of the data and log-linear modelling is presented in the next section, followed by an outline of the models constructed. The estimation and forecast results are then presented and discussed.

The Data
Two cross-sectional data sets were employed. Data were collected in conjunction with two separate surveys by means of mail questionnaires in 1975 and 1990, concerning Finnish NIPF holdings with at least 5 ha of forest land in the whole country. The number of responses obtained in 1975 was 2897, while 2101 responses were received in 1990 (Ripatti and Järveläinen 1997).
However, the number of forest owners is greater than the number of forest holdings because, e.g., about 40 per cent of the holdings are owned jointly by spouses and/or children. A common practice in studying forest owners is to derive the ownership characteristics of the holding from the person who is responsible for looking after the holding (Ripatti 1999, p. 4).
Being repeated over time with the same or nearly the same variables, the data enabled the creation of sequences of measures in discrete time. Thus, it was possible to include a time trend in the analysis. Although the data did not concern the same holdings, i.e. they did not constitute so-called panel data, they provided a relatively accessible way of incorporating a time dimension into the material, and thereby of drawing conclusions concerning the trends.
The forecast attributes were treated in a dichotomous form and their distributions are presented in Table 1. Because the cross-tabulation of six dichotomous variables produces 64 cells, only marginal distributions are presented. The number of attributes which could have been used to describe structural change is in principle large, but six dichotomous attributes were used because this facilitates the treatment of the data and the interpretation of the results (Payne et al. 1994).
The data were analysed by means of the BMDP statistical software (BMDP 1990). In the computations, the observations were weighted because the areal cluster sampling method employed included probabilities that depended on the total area of the holding (Karppinen and Hänninen 1990, pp. 23-24).
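The cross-tabulation and weighting described above can be illustrated with a small sketch. It is not the BMDP procedure used in the study: the records and weights below are hypothetical, and the 0/1 coding of the six variables (named here after the labels OCCUP, AGE, SEX, TYPOWN, RESID and FOR used in the results) is an assumption made for illustration only.

```python
from itertools import product

# Variable labels as used in the results section; the 0/1 coding is assumed.
VARIABLES = ("OCCUP", "AGE", "SEX", "TYPOWN", "RESID", "FOR")


def weighted_cell_table(records, weights):
    """Sum the observation weights into the 2**6 = 64 cells of the table."""
    table = {cell: 0.0 for cell in product((0, 1), repeat=len(VARIABLES))}
    for record, weight in zip(records, weights):
        cell = tuple(record[name] for name in VARIABLES)
        table[cell] += weight
    return table


def marginal_percentages(table):
    """Weighted percentage of observations at level 1 of each variable."""
    total = sum(table.values())
    return {
        name: 100.0 * sum(f for cell, f in table.items() if cell[i] == 1) / total
        for i, name in enumerate(VARIABLES)
    }


# Hypothetical records and weights, not data from the surveys.
records = [
    {"OCCUP": 1, "AGE": 0, "SEX": 0, "TYPOWN": 0, "RESID": 1, "FOR": 1},
    {"OCCUP": 0, "AGE": 1, "SEX": 0, "TYPOWN": 0, "RESID": 0, "FOR": 0},
    {"OCCUP": 1, "AGE": 1, "SEX": 1, "TYPOWN": 1, "RESID": 1, "FOR": 1},
    {"OCCUP": 0, "AGE": 0, "SEX": 0, "TYPOWN": 0, "RESID": 0, "FOR": 1},
]
weights = [1.0, 2.0, 0.5, 0.5]

table = weighted_cell_table(records, weights)
marginals = marginal_percentages(table)
```

Only the marginal percentages are reported in this sketch, mirroring the choice to present marginal distributions rather than all 64 cells.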

Construction of Trends by Log-Linear Models
For analysing the relationships of dichotomous variables (Evans et al. 1991), log-linear modelling was chosen because it is the most appropriate method for forecasting NIPF ownership attributes simultaneously. The main advantage of log-linear modelling is that it takes into consideration all inter-connections of the variables used.
Log-linear models also permit variables to be measured continuously with the Weibull distribution as shown by, e.g., McFadden (1974). Furthermore, the structure of log-linear models is symmetrical, which suits situations where no clear response variable exists (DeMaris 1992, p. 7).
The starting point of log-linear modelling is a cell frequency table, which shows how the frequency in each cell depends on the combination of other variables. The log-linear model represents the logarithm of the expected cell frequency as a linear combination of the main effects and interactions. As in the case of the logistic regression model, the log-linear model is operationalized in terms of odds ratios. The odds are defined as π / (1 - π), which is the ratio of the probability of an event occurring (π) to that of it not occurring (1 - π). With odds at 2:1, the event is twice as likely to occur as not. Thus, the odds ratio, denoted by ψ (in equation 1), is the ratio of the odds of an event in class π(1) occurring rather than π(0) to the corresponding odds of the event not occurring.
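The definitions of odds and odds ratio given above can be written out as a minimal sketch. The function names are hypothetical, and the probabilities in the second example are illustrative values rather than figures from the study.

```python
def odds(p: float) -> float:
    """Odds of an event with probability p, i.e. p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in the interval [0, 1)")
    return p / (1.0 - p)


def odds_ratio(p1: float, p0: float) -> float:
    """Ratio of the odds in class 1 to the odds in class 0."""
    return odds(p1) / odds(p0)


# An event with probability 2/3 has odds of 2:1 (twice as likely to occur as not).
two_to_one = odds(2 / 3)

# Illustrative probabilities only: odds of 1.0 against odds of 0.25.
example_ratio = odds_ratio(0.5, 0.2)
```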
The model that takes into consideration all the potential interdependencies in the present data is referred to as a saturated model, and it can be specified by an equation in which M_bcdefg is the expected frequency in cell (bcdefg), with occupation (b), age (c), sex (d), type of ownership (e), residence on holding (f), and size of forest holding (g). The terms on the right-hand side of the equation represent the parameters to be estimated. The constant term represents the fitted log-frequency for the cell where all six variables are on the first level. However, the construction of saturated models was not the primary aim because, in theory, the data could indicate a pattern of relationship without using up all of the available degrees of freedom. If a saturated model is estimated, the estimated frequencies correspond exactly to the observed frequencies, i.e., the estimates merely describe the observed data (Evans et al. 1991, p. 105).
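For orientation, a saturated log-linear model for six variables is conventionally written in the form sketched below. The notation is an assumption based on common textbook usage: the constant θ, the parameters λ and the superscript labels B to G (matching the subscripts b to g) may differ from the symbols used in the study's own equation.

```latex
\ln M_{bcdefg} = \theta
  + \lambda^{B}_{b} + \lambda^{C}_{c} + \lambda^{D}_{d}
  + \lambda^{E}_{e} + \lambda^{F}_{f} + \lambda^{G}_{g}
  + \lambda^{BC}_{bc} + \lambda^{BD}_{bd} + \dots + \lambda^{FG}_{fg}
  + \lambda^{BCD}_{bcd} + \dots
  + \lambda^{BCDEFG}_{bcdefg}
```

With six dichotomous variables this form has 1 + 6 + 15 + 20 + 15 + 6 + 1 = 64 free parameters, one for each of the 64 cells, which is why a saturated model reproduces the observed frequencies exactly.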
The forms of the models were determined with reference to the usual chi-square tests, which indicated the degree of independence of the variables. The so-called k-factor table shows the highest order included in a model (Brown 1990, p. 283). When the tests indicate no statistical significance at the 5 per cent level, the model need not include any interactions of order k + 1. However, because they test multiple interactions at once, the above tests may hide the marginal effects of one or more interactions among the more numerous non-significant interactions. Therefore, cell deviations between expected and observed values were also estimated (Haberman 1972). Deviations exceeding the 5 per cent risk level were not permitted.
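The kind of chi-square test referred to above can be illustrated on the simplest case, a single two-way table tested for independence. This is a sketch under stated assumptions, not the k-factor procedure of BMDP: the counts are hypothetical, and both the Pearson statistic and the likelihood-ratio statistic are shown because the results section refers to a likelihood chi-square.

```python
import math


def expected_under_independence(table):
    """Expected counts of a two-way table if rows and columns are independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_under_independence(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def likelihood_ratio_chi_square(table):
    """2 * sum of observed * ln(observed / expected); empty cells contribute 0."""
    expected = expected_under_independence(table)
    return 2.0 * sum(
        o * math.log(o / e)
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
        if o > 0
    )


# Hypothetical counts, not data from the surveys.
observed = [[30, 20], [20, 30]]
x2 = pearson_chi_square(observed)
g2 = likelihood_ratio_chi_square(observed)

# A 2 x 2 table has one degree of freedom; the 5 per cent critical value is about 3.841.
significant_at_5_per_cent = x2 > 3.841 and g2 > 3.841
```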

Forecasting Procedure
Even though the odds ratios are the most important parameters in log-linear modelling, they were not used in the forecasting procedure employed here. Instead, the odds themselves were used because they have a fundamental role to play in forming the relative odds. Once the models had been estimated, the forecasting procedure was begun by comparing the odds of structurally identical models from the years 1975 and 1990 in order to obtain relative odds. Thereafter, the forecasting procedure was completed using the odds and relative odds. The relative odds were calculated as $\mathrm{RO} = \mathrm{odds}_{1990} / \mathrm{odds}_{1975}$, where RO is the relative odds of an arbitrary NIPF ownership characteristic derived from structurally identical models of 1975 and 1990. However, it should be noted here that even when using complex models, i.e., models including multi-dimensional interactions, only the odds of the first level effects (k-factor order) were used in forecasting because the aim of the study is to forecast trends of separate attributes, not their combined effects.
Following the calculation of the relative odds, forecasts were made concerning the selected attributes for the years 2005 and 2020. This was done by using the odds of 1990 and the relative odds. As a starting point, the odds for 1990 were used. In order to obtain odds for 2005, the odds for 1990 were multiplied by the relative odds. Similarly, the odds for 2020 were obtained by taking the odds for 1990 and multiplying them by the squared relative odds. Finally, the percentage distributions were approximated from the predicted odds for 2005 and 2020.
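The forecasting steps described above can be sketched in a few lines. The function names are hypothetical. The inputs are the rounded figures quoted elsewhere in the paper for non-farmer owners (a 52 per cent share in 1990 and relative odds of 1.4), so the output is only an approximation of the tabulated forecasts.

```python
def share_to_odds(share: float) -> float:
    """Convert a proportion into odds."""
    return share / (1.0 - share)


def odds_to_share(odds: float) -> float:
    """Convert odds back into a proportion."""
    return odds / (1.0 + odds)


def forecast_share(share_1990: float, relative_odds: float, periods: int) -> float:
    """Multiply the 1990 odds by the relative odds once per 15-year period."""
    forecast_odds = share_to_odds(share_1990) * relative_odds ** periods
    return odds_to_share(forecast_odds)


# Rounded values quoted in the text for non-farmer owners.
share_1990 = 0.52
relative_odds = 1.4

share_2005 = forecast_share(share_1990, relative_odds, periods=1)  # one period ahead
share_2020 = forecast_share(share_1990, relative_odds, periods=2)  # squared relative odds
```

With these rounded inputs the 2020 share comes out at roughly 68 per cent, in line with the forecast for non-farmer owners quoted in the Discussion.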

Model Constructions
The results of the k-factor tables for 1975 and 1990 are presented in Table 2. The results for the number of owners or holdings suggest that, since the tests on line 5 were not significant, it may not be necessary to include any sixth-order effects in the models. However, due to the large deviations between the expected and observed values in the fifth-order effect models, the log-linear algorithm did not converge beyond the fourth-order effects. Therefore, the pairs of log-linear models describing the owners or holdings were constructed with effects up to the fourth order. The k-factor table also suggests that the results concerning the area of forest land on line 3 were not significant, and therefore it may not be necessary to include any four-way interactions in the models. Thus, the pairs of log-linear models describing the area of forest land were constructed with interactions up to the third order.
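The size of such models can be checked with a short count. The sketch assumes hierarchical models that contain a constant and every effect up to a given order among six dichotomous variables, each effect contributing one free parameter; the function names are hypothetical.

```python
from math import comb


def n_parameters(n_vars: int, max_order: int) -> int:
    """Constant plus all effects up to max_order among dichotomous variables."""
    return sum(comb(n_vars, k) for k in range(max_order + 1))


def residual_df(n_vars: int, max_order: int) -> int:
    """Cells in the table minus the number of free parameters."""
    return 2 ** n_vars - n_parameters(n_vars, max_order)


df_fourth_order = residual_df(6, 4)  # effects up to the fourth order
df_third_order = residual_df(6, 3)   # effects up to the third order
df_saturated = residual_df(6, 6)     # saturated model
```

Under these assumptions a model with effects up to the fourth order leaves 64 - 57 = 7 degrees of freedom, which agrees with the seven degrees of freedom reported for the owners or holdings models.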

Estimation Results of Log-Linear Models
The results for the number of owners or holdings derived from the model pairs of 1975 and 1990 are presented in Table 3. Both models had a good likelihood-ratio chi-square with seven degrees of freedom, indicating that, in terms of overall statistical significance, the models fit the observed data well. The relative odds of the owner's occupation (OCCUP) were the highest of the tested model pairs. This indicates that the odds of non-farmer ownership rather than farmer ownership were 1.4 times higher in 1990 than in 1975.
The second highest relative odds of the model pairs were those of the owner's sex (SEX). The odds of the forest owner being female rather than male were 1.4 times higher in 1990 than in 1975.
The relative odds of the forest owner's age (AGE) suggest that the odds of a forest owner being at least 60 years old rather than less than 60 years old were 1.2 times higher in 1990 than in 1975. Further, the odds for a holding to be both jointly owned (TYPOWN) and with residence elsewhere (RESID) rather than family-owned and with permanent residence were 1.2 times greater in 1990 than in 1975. Conversely, the relative odds of FOR, i.e., of the forest holding being smaller than 20 ha rather than larger, were almost one.
With the exception of the variable FOR, the results for the area of forest land were similar to the results for the number of owners or holdings. The results of FOR with respect to the area of forest land differed from the results for the owners or holdings in two important respects. First, the lambdas for FOR were highly significant both in 1975 and in 1990. Second, the relative odds for FOR indicate that the odds of a holding being smaller than 20 ha rather than larger were slightly higher in 1990 than in 1975. However, the likelihood-ratio chi-square indicates that the goodness of fit of the 1975 model was not very high, though all six pairs of variables were statistically significant at least at the 5 per cent level.

Structural Changes in NIPF Ownership until 2020
Following the estimation of the log-linear model pairs, the next step was to use the relative odds to forecast the trends of each characteristic describing NIPF ownership from 1990 to 2005 and even to 2020. The number of years for which reliable forecasting can be made depends on how long the forces encapsulated in the relative odds can be expected to stay unchanged (Buckwell and Shucksmith 1979, p. 137). In the present study, the relative odds were based on a cycle of fifteen years. However, assuming that the forces behind the relative odds will not change, it might be interesting to predict 30 years ahead by using unaltered relative odds. The results of the forecast are presented in Table 4. In general, with the exception of the holdings less than 20 ha in size and owners of at least 60 years of age, the attributes under study increased very rapidly during 1975-1990, but thereafter they begin to decrease. It can also be seen that the forecast trends for the owners or holdings and for the area of forest land were quite similar, but the pace of change was slower for the latter.
The results suggest that the relative odds are an important element in the forecasts. For example, the proportion of non-residents was 29 per cent in 1975, and it was forecast to increase to as much as 51 per cent in 2020. On the other hand, the proportion of forest owners of at least 60 years of age was over one third in 1975, and it was forecast to increase to 46 per cent in 2020. As was mentioned earlier, these forecasts depend on the continuation of past rates of structural change in NIPF ownership as a whole. Besides, the greater the timespan of the forecasts, the more uncertain are their results.

Discussion
This paper has shown how log-linear models can be used for simultaneously modelling structural changes in six dichotomous NIPF ownership attributes. The approach adopted here has used the technique to examine complex models related to substantive structural inter-connections among the forecast attributes. In most studies of the structural change of NIPF ownership, multi-dimensional interactions are ignored. Therefore, log-linear models represent a major improvement over traditional forecasting techniques concerning the structure of NIPF ownership.
The results show that the pace of change in the forecast variables appears to be slower than it would be if derived from extrapolation of the earlier trends. For instance, while the actual proportion of non-farmer forest owners increased from 37 per cent to 52 per cent during the period 1975-90, their proportion was forecast to reach 68 per cent in 2020. On the other hand, the preliminary results derived from the monitoring system for Finnish non-industrial private forestry show that the actual increase in the proportion of non-farmers is faster than forecast (Ripatti et al. 2000). However, it should be pointed out that a number of exogenous factors may alter the current trends in the structural change of NIPF ownership in the future; for example, the number of farms or the age structure of owners may change in unpredicted ways due to changes in EU policies.
Log-linear models are specifically designed to take into account the categorical nature of the variables, and they enable the formulation and testing of relationships among variables. One of their particular strengths is that they allow control of the variations observed over time in the marginal distributions of variables so that changes in relationships can be assessed net of these variations (Payne et al. 1994). Besides, log-linear models provide a powerful scheme for analyzing multi-way tables.
Some problems may, nevertheless, occur in log-linear modelling. These include the sensitivity of the significance tests to sample size, the choice of measures of fit, the effect of a clustered sample design on inferences, and the presence of cells with very small or zero frequencies (Agresti 1990). From the viewpoint of the present study, the problem is associated with the use of dichotomous variables. They do not necessarily permit certain important effects of the variations to be measured (Evans et al. 1991, pp. 102-103). For instance, the use of a farmer and non-farmer dichotomy does not enable the measurement of the theoretically important effects of variations in occupational status, especially within the increasing non-farmer category. Besides, in the present study, not all variables were statistically significant even though the models fit the observed data well.
Finally, it should be noted that the first-order effects of the two identical six-way log-linear models were used as the means of forecasting the trends in NIPF ownership. It would be interesting to incorporate the time variable into a single seven-way model instead of using two identical six-way models. Further investigation should also focus on the implications of the structural changes in NIPF ownership over time. More emphasis should also be placed on factors underlying the structural changes in non-industrial private forestry.

Table 1. Variables used in the study, and their definitions and percentage distributions in 1975 and 1990.

Table 2. The results of the k-factor marginals. Results concerning area of forest land are shown in parentheses.
* Significant at 5 per cent level in 1975 and 1990. 1) Deviates at 5 per cent risk.

Table 3. The estimated results of the log-linear models. Results concerning area of forest land are shown in parentheses.

Table 4. The relative odds, observed odds of 1990, and calculated odds and approximated percentage distributions for years 2005 and 2020. Results concerning area of forest land are shown in parentheses.