Linear Prediction Application for Modelling the Relationships between a Large Number of Stand Characteristics of Norway Spruce Stands

The aim was to produce models for a large number of stand characteristics of Norway spruce dominated stands. A total of 227 national forest inventory based permanent stand plots, dominated by Norway spruce (Picea abies), were used in modelling eight stand variables as a function of the stand mean biological age and site characteristics. The basic models were able to characterize the average development of the modelled stand variables, but resulted in a relatively high RMSE. Basal area (G) and stem number (N) were the least accurately predicted, with an RMSE of 34–41%, while the RMSE of the mean diameter and height characteristics varied between 16% and 20%. The expectations and error variances of the basic models were calibrated with known stand variables using linear prediction theory. The best linear unbiased predictor (BLUP) with a single stand variable used for calibration proved to be ineffective for unknown G and N, but relatively effective for the unknown mean characteristics. However, calibration with one sum and one mean characteristic proved to be effective, and additional calibration variables enhanced the precision only marginally. The BLUP method provided a flexible approach for characterizing the relationships between a large number of stand variables, thus enabling multiple uses of these models because they were not fixed to a specific inventory system.


Introduction
Typically, the need for modelling individual stand characteristics, or relationships between such characteristics, arises from the fact that different variables are assessed or that changes occur in inventory practices over time. In order to avoid laborious measurements in practical forest management planning inventory work, the description of stand structure is simplified to cover a number of mean and sum characteristics. In such cases, the breast height diameter (dbh) distribution has to be predicted in order to be able to use Finnish simulation systems such as MELA and MOTTI. Both are forestry models designed for decision support systems. MELA (see DemoMELA 2004) was designed in the 1970s for regional and national timber production analysis, while MOTTI (see MOTTI-ohjelmisto s.a.) was published in 2005 for stand level analysis. Both of these models need distribution models for the initial structure of the stands because they are based on tree-level modelling. In Finland, the typical variables characterizing dbh and height distributions are either i) basal area ha⁻¹ (G) and the basal area-median tree, including measured dbh (d_gM) and height (h_gM), or ii) number of trees ha⁻¹ (N) and mean height (H) with the corresponding mean diameter (D), if H > 1.3 m (see Solmun… 1997, Kuvioittainen… 1998). The latter variables are typically assessed in young stands up to the first thinning stage using fixed area samples, and the former in more advanced stands using relascope sampling. Stand total age (T) is determined from the visually assessed median tree by counting the number of branch whorls or by boring to the pith at breast height. In the latter case, the number of years elapsed until the height of 1.3 m is reached has to be added. The average age additions can be obtained from a table by species, site types and degree days (Solmun… 1997). Sometimes stem number is required in addition to basal area and basal area-weighted variables (Kuvioittainen… 1998). Dominant age (T_dom) and dominant height (H_dom) are also included in the same instructions.
Nowadays, each of the above-mentioned stand characteristics is visually assessed on a species-specific basis if their dimensions seem different. Thus, stand total basal area or stem number and the tree species proportions can be calculated from the given characteristics. Otherwise, the mean characteristics are common for all the species, and the given sum characteristics are stand total basal area or stem number. In this case, the species present are characterized on the basis of their proportions of basal area or stem number, depending on which one is assessed (Solmun… 1997).
Predicting dbh distributions for young stands is problematic because the existing models are based on basal area and basal area-weighted characteristics (e.g. Päivinen 1980, Kilkki et al. 1989, Maltamo 1997). The more recent distribution models aim to enhance accuracy by using stem number as an independent variable in addition to d_gM and G (Siipilehto 1999, Kangas and Maltamo 2000). The only distribution model for a seedling stand has been developed for planted Norway spruce, utilizing mean height and stem number as input variables for the height distribution (Valkonen 1997). Alternative basal area-dbh distribution models are applied in the MELA and MOTTI simulation systems, and some of them include stem number as an input variable. Thinning rules may call for the relationship between basal area and stem number. Stem number based thinning originally arose from the assumption that it is easier for a machine driver to control the number of stems left growing after thinning than the basal area (Niemistö 1992). The MELA simulation system applies stem number based first thinnings, but basal area based later thinnings (Siitonen et al. 2001). Thinning directions for Metsähallitus included rules based on basal area and, alternatively, on stem number (Hokajärvi 1997).
The relationships between some stand characteristics can be mathematically defined. For example, the quadratic mean diameter ($D_q$) can be defined directly from stand basal area and stem number as $D_q = \sqrt{G / (kN)}$, in which $k = \pi / (2 \cdot 100)^2$. Thus, basal area is given by $G = D_q^2 kN$ and stem number by $N = G / (D_q^2 k)$. Because of the above relationship, D_q has been used in forestry applications. Unfortunately, D_q is not assessed in Finnish field work because it is not an easily defined mean unless both stem number and basal area are known. Sometimes the above equation for G or N has been applied as a 'trivial transformation' with D or d_gM instead of D_q, resulting in bias in basal area or stem number, respectively. More recently, models for G and d_gM using D and N as independent variables were developed for the MELA simulation system in order to avoid such bias (Nissinen 2002).
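The relationship between quadratic mean diameter, basal area and stem number can be illustrated with a short Python sketch. The function names and the stand values (20 m² ha⁻¹, 1000 stems ha⁻¹, arithmetic mean diameter 14.5 cm) are hypothetical and chosen only for illustration; the last line shows how substituting an arithmetic mean diameter for the quadratic mean biases the computed basal area downwards.

```python
import math

# Basal area (m^2) of a single tree with a diameter of 1 cm.
K = math.pi / (2 * 100) ** 2


def quadratic_mean_diameter(g, n):
    """Quadratic mean diameter (cm) from basal area g (m^2/ha) and stem number n (1/ha)."""
    return math.sqrt(g / (K * n))


def basal_area(d_q, n):
    """Basal area (m^2/ha) from quadratic mean diameter (cm) and stem number (1/ha)."""
    return d_q ** 2 * K * n


def stem_number(g, d_q):
    """Stem number (1/ha) from basal area (m^2/ha) and quadratic mean diameter (cm)."""
    return g / (d_q ** 2 * K)


# Hypothetical stand, for illustration only.
g, n = 20.0, 1000.0
d_q = quadratic_mean_diameter(g, n)

# 'Trivial transformation' with an assumed arithmetic mean diameter (14.5 cm)
# in place of the quadratic mean: the resulting basal area is too small.
g_from_arithmetic_mean = basal_area(14.5, n)
print(round(d_q, 2), round(g_from_arithmetic_mean, 2))
```

Because the arithmetic mean diameter is never larger than the quadratic mean, the shortcut systematically underestimates G (or overestimates N when solving for stem number).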
In Norway, Eid (2001) computed D_q directly from N and G when he developed models for quadratic mean diameter and stem number. Models were adapted to available variables obtained from relascope, photo and visual inventories, while the earlier models published by Naesset (1995, 1996), which included crown closure, were applicable in photo inventories. Models adapted to field inventories by Andreassen (1988) have become out of date for current inventories. Existing Finnish models (Nuutinen 1986, Niemistö 1992, Nissinen 2002) and inventory practices called for new and supplementary models. Because the above-mentioned mathematical relationships could not be directly utilized with the Finnish forest management planning data, it did not seem worthwhile to formulate relationships between the stand variable in question and other stand variables, as this would have tied the models to a specific inventory system. Instead, all stand characteristics have a logical dependence on stand age, which is also an essential variable for forest management planning.
The aim of this study was to develop prediction models for a number of stand characteristics. We selected Norway spruce (Picea abies (L.) Karst.) dominated stands to present the method, but similar models are needed for Scots pine and birch dominated stands as well. The stand characteristics included were basal area and number of stems per hectare, mean and dominant diameter and the respective heights, as well as the diameter and height of the basal area-median tree. The models were intended to have two important properties: 1) prediction should be available with a limited number of known characteristics (e.g. stand age and some site properties), and 2) it should be possible to calibrate the prediction with an arbitrary set of the above-mentioned variables whenever they are known.

Modelling Data
The data were obtained from national forest inventory (NFI) based permanent sample plots in advanced stands (H_dom ≥ 5 m) and seedling stands (H_dom < 5 m). Norway spruce dominated stands were selected as the modelling data. The 7th NFI-based permanent sample plots mainly consisted of a cluster of three circular plots within a stand. The total number of tallied trees was about 100–120 per stand. The smallest trees were not measured if they were considered unsuitable for growing (e.g. had inadequate growing space). A smaller radius was applied within each circular plot in order to select one-third of the sampled area for height (and other more detailed) measurements (see Gustavsen et al. 1988). For the purpose of the present study, each cluster of three plots was combined in order to obtain reliable stand characteristics. Näslund's (1936) height curve was fitted by stand and by tree species. The height characteristics (H, h_gM, H_dom) were obtained from the fitted height curve corresponding to the respective diameters D, d_gM, and D_dom. The first measurement of advanced stands and the second measurement of seedling stands were used as cross-sectional modelling data. The numbers of Norway spruce dominated advanced and seedling stands were 206 and 21, respectively. Data were restricted to stands with D > 0.5 cm and G > 0.5 m² ha⁻¹, which resulted in the omission of four of the original observations. The means and variations of the most important variables of this study are given in Table 1.
The most typical site for Norway spruce was Myrtillus type (MT) mesic heath (see Cajander 1925). Better sites included four grove Oxalis-Maianthemum (OMaT) and 95 grovelike Oxalis-Myrtillus type (OMT) stands, while the less fertile site included only four stands of Vaccinium type (VT) sub-xeric heath. Although these site types are typical in Southern Finland, these abbreviations are used throughout this paper to represent the corresponding sites in the whole country.

Model Construction and Fitting
The structure of the models was assumed to be multiplicative (e.g. Nuutinen 1986, Eid 2001). The models were fitted using multiple linear regression after logarithmic transformations. The following eight dependent variables were fitted simultaneously: 1) total basal area (G, m² ha⁻¹), 2) total stem number (N, ha⁻¹), 3) arithmetic mean diameter (D_s, cm), 4) basal area median diameter (d_gMs, cm), 5) dominant diameter (D_doms, cm), 6) mean height (H, m), 7) basal area-median height (h_gM, m), and 8) dominant height (H_dom, m). The basic models represented the average development of each stand variable over the stand total (biological) age. The common structure of the candidate model (Eq. 1) was defined in terms of the following quantities:
T = total age (yrs); the candidate power k of T was either −0.5 or −1
DD = degree days (i.e. average annual sum of the mean temperatures above +5 °C)
"Origin" = a dummy (value either 0 or 1) for artificial regeneration methods
"Site" = dummy variables associated with a certain site (j), defined as forest types by Cajander (1925), and supplementary site characteristics such as stoniness and paludification
a_0–a_i = estimated parameters of the model
ε = the random error
The models for the sum characteristics (ln(N), ln(G)) were formulated for the stand totals. Norway spruce specific characteristics were determined on the basis of the species proportion, P (0 < P ≤ 1), such that the logarithm of P was added to the model (e.g. ln(G_spruce) = ln(G_total) + ln(P_Gspruce)). This approach ensured that i) the model corresponded more or less to the 'potential' of the site, ii) useless variation and errors caused by mixed species were avoided, and iii) the species-specific random error of the logarithmic model could be directly calculated for calibration purposes.
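The species-proportion step described above is simple enough to sketch in Python. This is only an illustration: the function name and the numeric values (a total basal area of 30 m² ha⁻¹ and a spruce proportion of 0.8) are hypothetical, not taken from the modelling data.

```python
import math


def species_log_value(ln_total, proportion):
    """Logarithm of a species-specific sum characteristic.

    ln_total   : model prediction for the stand total on the log scale
    proportion : species proportion P, required to satisfy 0 < P <= 1
    """
    if not 0.0 < proportion <= 1.0:
        raise ValueError("species proportion must satisfy 0 < P <= 1")
    return ln_total + math.log(proportion)


# Hypothetical example: stand total G = 30 m^2/ha, spruce proportion 0.8.
ln_g_total = math.log(30.0)
ln_g_spruce = species_log_value(ln_g_total, 0.8)
g_spruce = math.exp(ln_g_spruce)
print(round(g_spruce, 1))
```

Adding ln(P) on the log scale is equivalent to multiplying the stand total by P on the original scale, which is why the random error of the logarithmic model carries over unchanged to the species-specific value.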
The breast height mean characteristics were transformed to stump level (D_s, d_gMs, D_doms) using the equation d_s = 1.25 × dbh + 2.0 according to Laasasenaho (1975). This was done in order to ensure continuous development of the mean diameters in relation to stand age and mean heights. However, the basal area was not transformed to stump height because, as it is a sum characteristic, the analytic form could not be derived from the stump height mean diameter. Additionally, the respective transformation with an intercept and a multiplier would have caused bias after back transformation in mixed stands. The origin of a stand was assumed to affect the development of stand characteristics. In order to avoid overestimating the effect of artificial regeneration, it was assumed to enhance particularly the early development of a stand but to diminish with increasing age (e.g. due to thinnings). Therefore, the origin dummy was divided either by stand age or by its square root, depending on which gave the better fit in terms of RMSE and residual variation against stand age. Dummy variables were also used for characterizing significant differences caused by different site types. Stoniness and paludification were also taken into account because of their retarding effects on growth and yield.
Models were multivariate, i.e., several models were fitted simultaneously (systems of equations) to the same dataset. Due to the correlation between different models, fitting the individual models separately would not be an efficient estimation approach (Zellner 1962). The error structure of the ordinary least squares (OLS) fit was utilized in re-estimating the parameters of the correlated response variables using the seemingly unrelated regression (SUR) method. In our case re-estimation was possible because there were some independent variables that were not common to all the models (Zellner 1962). Estimation was made using the SYSLIN procedure in the SAS package (SAS/ETS User's Guide 1993), which gives the estimated cross-model error variance-covariance structure as an output.
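The two-step idea behind SUR can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the SAS SYSLIN procedure used in the study: the function name `sur_fit`, the simulated data and all coefficient values are hypothetical, and the residual covariance is divided by n without any degrees-of-freedom correction.

```python
import numpy as np


def sur_fit(x_list, y):
    """Two-step feasible GLS for a system of m equations.

    x_list : list of m design matrices, each of shape (n, p_i)
    y      : responses of shape (n, m), one column per equation
    Returns the per-equation coefficient vectors and the m x m
    cross-equation residual covariance estimated from the OLS step.
    """
    n, m = y.shape
    # Step 1: equation-by-equation OLS and the cross-equation residual covariance.
    ols = [np.linalg.lstsq(x, y[:, i], rcond=None)[0] for i, x in enumerate(x_list)]
    resid = np.column_stack([y[:, i] - x @ b for i, (x, b) in enumerate(zip(x_list, ols))])
    sigma = resid.T @ resid / n
    sigma_inv = np.linalg.inv(sigma)
    # Step 2: GLS normal equations for the stacked system, built block by block.
    offsets = np.concatenate([[0], np.cumsum([x.shape[1] for x in x_list])])
    lhs = np.zeros((offsets[-1], offsets[-1]))
    rhs = np.zeros(offsets[-1])
    for i, xi in enumerate(x_list):
        rows = slice(offsets[i], offsets[i + 1])
        for j, xj in enumerate(x_list):
            cols = slice(offsets[j], offsets[j + 1])
            lhs[rows, cols] = sigma_inv[i, j] * (xi.T @ xj)
            rhs[rows] += sigma_inv[i, j] * (xi.T @ y[:, j])
    beta = np.linalg.solve(lhs, rhs)
    coefs = [beta[offsets[i]:offsets[i + 1]] for i in range(m)]
    return coefs, sigma


# Simulated two-equation system with correlated errors (hypothetical values).
rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X_a = np.column_stack([np.ones(n), x1])        # first equation: one regressor
X_b = np.column_stack([np.ones(n), x1, x2])    # second equation: an extra regressor
err = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)
Y = np.column_stack([X_a @ [1.0, 2.0] + err[:, 0],
                     X_b @ [0.5, -1.0, 0.3] + err[:, 1]])
coefs, sigma = sur_fit([X_a, X_b], Y)
```

When every equation has exactly the same regressors, the second step reproduces the OLS estimates, which is why the re-estimation only helps when some independent variables are not common to all the models.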

Model Application and Validation
The error terms of statistical models are random variables. The cross-model error variance-covariance matrix is valuable when calibrating the expected value using linear prediction theory (e.g. Lappi 1991). The best linear unbiased predictor (BLUP) for variable x_1 is

$$\hat{x}_1 = \mu_1 + \sigma_{12} \Sigma_{22}^{-1} (x_2 - \mu_2) \qquad (2)$$

where x_1 is a scalar (dependent unknown variable), x_2 is a vector (known stand variables), μ_1 and μ_2 are the corresponding expected values given by the basic models, σ_12 is a row vector (covariances between the unknown dependent variable and the known variables), and Σ_22 is the variance-covariance matrix of x_2 (between known variables). The variance of the calibrated dependent variable x_1 is

$$\operatorname{var}(\hat{x}_1) = \sigma_{11} - \sigma_{12} \Sigma_{22}^{-1} \sigma_{12}' \qquad (3)$$

where σ_11 is a scalar (initial variance of the residual error of the dependent variable), and σ_12' is the transpose of the row vector σ_12. When assuming the residual errors of the logarithmic models to be multinormally distributed, half of the error variance (s_ε² / 2) had to be added to the intercept in order to avoid bias when transforming back to the original scale. Thus, in the applied method, the variance (Eq. 3) was recalculated whenever the calibrating variables used for prediction (Eq. 2) changed.
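The calibration step can be sketched in Python with NumPy. The sketch assumes the standard linear-prediction form, in which the calibrated value is the basic-model expectation plus σ_12 Σ_22⁻¹ times the deviation of the known variables from their expectations, and the calibrated variance is σ_11 minus σ_12 Σ_22⁻¹ σ_12'. The function names and all numeric values below are hypothetical and are not taken from Table 3.

```python
import numpy as np


def blup(mu1, mu2, x2, s11, s12, S22):
    """Calibrate one unknown log-scale stand variable with known ones.

    mu1 : expected value of the unknown variable (basic model)
    mu2 : expected values of the known variables (basic models)
    x2  : observed values of the known variables
    s11 : error variance of the unknown variable
    s12 : covariances between the unknown and the known variables
    S22 : variance-covariance matrix of the known variables
    Returns the calibrated prediction and its variance.
    """
    s12 = np.asarray(s12, dtype=float)
    weights = np.linalg.solve(np.asarray(S22, dtype=float), s12)  # Sigma22^-1 sigma12'
    deviation = np.asarray(x2, dtype=float) - np.asarray(mu2, dtype=float)
    prediction = mu1 + weights @ deviation
    variance = s11 - s12 @ weights
    return prediction, variance


def back_transform(ln_prediction, variance):
    """Back-transform from the log scale, adding half of the error variance."""
    return np.exp(ln_prediction + variance / 2.0)


# Hypothetical example with a single calibrating variable (all values assumed).
mu1, mu2 = np.log(20.0), [np.log(15.0)]
x2 = [np.log(18.0)]          # the known variable is larger than expected
pred, var = blup(mu1, mu2, x2, s11=0.04, s12=[0.03], S22=[[0.04]])
value = back_transform(pred, var)
```

Because the variance passed to `back_transform` is the calibrated one, the bias correction changes automatically whenever a different set of calibrating variables is used.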
When developing the structure of the models, the RMSE of the fitted model was examined, as well as the residual variation of each model with respect to its expected value, temperature sum, and stand age. In addition, model behaviour was checked visually in order to ensure logical behaviour of the models when using a wide range of independent variables, especially stand age. The effect of calibration was examined with respect to RMSE, i.e. the square root of Eq. (3). The models could be calibrated with any combination of the presented variables. However, in order to restrict the combinations to the most relevant ones, the examples of the model application given here focused on calibrating basal area weighted characteristics (d_gM, h_gM and G) with arithmetic stand characteristics (D, H and N), and vice versa. Finally, the biases were calculated in the original scale after back-transformation. Some examples are given to show changes in residual errors resulting from calibration.

Estimated Models
In order to benefit from SUR estimation, some of the independent variables should not be common to all the models. The SUR method proved to be advantageous: for example, some of the independent variables became significant although they were insignificant (or only marginally significant, 0.05 < P < 0.1) in the original OLS fit. These variables were degree days in the model for G, and stoniness in the models for D and d_gM. Using mainly the same variables and transformations in the models ensured logical behaviour of the models. The curves for diameters or for the corresponding heights did not cross each other (i.e. D < d_gM < D_dom and H < h_gM < H_dom). If we focus on model behaviour at the age when height (H, h_gM) crossed breast height, 1.3 m (i.e. the threshold for dbh), then the corresponding diameter (D, d_gM) received a reasonable value of 0.6–0.7 cm, while G was 0.2–0.5 m² ha⁻¹ and N varied between 2500 and 4000 ha⁻¹.
The most typical site for Norway spruce is MT mesic heath (Cajander 1925), and it therefore formed the basic level of the models. Dummy variables were formed for grove and grovelike sites (denoted as OMT+) and for sub-xeric heath (VT). The effect of site fertility on the stem number or basal area was clear and could be explained on the basis of the differences in growth potential with respect to the site classes. Thus, increasing fertility resulted in more rapid development and greater basal area (Fig. 1a), but a lower number of stems ha⁻¹ at the given age (Fig. 1b). The number of stems was about 26% lower on OMT+ compared to the basic level of MT. The VT site did not significantly differ from the MT site as regards stem number, while the basal area was only 57% of that on MT (Table 2, Fig. 1a).
Site type had an obvious effect on the stand dimensions. Thus, on OMT the diameters (D, d_gM, D_dom) and heights (H, h_gM, H_dom) were about 20–30% higher than on MT, while on the VT site these diameters were about 20% lower and the heights about 25% lower (Table 2). Fig. 1d shows that h_gM on the grovelike OMT site was even higher than H_dom on the mesic MT site when the age of the stand was above 40 yrs. The growth of spruce is known to be relatively poor on sub-xeric VT sites, as can be seen in Fig. 1d. For example, the dominant height (H_dom) on the VT site was about the same as the arithmetic mean height (H) on the OMT site, while H_dom on OMT was 54% higher than on VT.
The structure of the data was such that planting only occurred among the rather young stands (mean age 28 yrs and max age 79 yrs), while the oldest stands (up to 168 yrs) were naturally regenerated. Thus, the risk of overestimating the response of artificial regeneration methods among older stands (beyond the data) would have been obvious if a plain dummy had been used. Instead, the effect of planting was modelled as a function of age. This effect was more obvious on the arithmetic means than on the basal area weighted means (Table 2, Fig. 1c). Finally, artificial regeneration did not show any significant effect on the dominant tree characteristics. In 20-year-old stands, planting enhanced the diameters (the increase in D was 1.2 cm and in d_gM 0.4 cm). Later on the planting effect diminished such that in a 100-year-old stand the increment was 0.9 cm and 0.24 cm in D and d_gM, respectively. The respective differences in H and h_gM were 101 cm and 37 cm at the younger stage and 85 cm and 31 cm at the older stage.
The considerably higher arithmetic means in planted stands reflected the lower proportion of small trees compared to naturally regenerated stands. In conclusion, the diameter and height distributions became more peaked as a result of artificial regeneration. However, the differences evened out with increasing age, and the logical explanation was the commonly used thinning method, selection from below.
Finally, exceptional stoniness and signs of paludification were taken into account. These factors typically reduce the actual yield from the potential yield of the site in question. The estimated retarding effect of stoniness and paludification on most of the stand variables was between 6% and 24%. An exceptional increase in the number of stems was probably due to enhanced germination on the moist surface of the ground (see Fig. 1b).

Calibrating the Model
The covariances and correlation coefficients of the cross-model error, as well as the error variances of the models, are given in Table 3. The correlation scatter plot showed close to linear dependence between the residual errors of the models (Fig. 2). Stem number was negatively correlated with the other stand variables, except basal area, and the correlation coefficients were generally the lowest (absolute values of r between 0.22 and 0.58). The second lowest correlations were found between basal area and the other stand variables (r = 0.31–0.55). Both of these sum characteristics typically vary greatly due to differences in management activities. However, additional variation due to mixed tree species was prevented by modelling the stand total basal area and stem number.
Naturally, the individual mean characteristics were highly correlated. The highest correlation was found between h_gM and H_dom (r = 0.93), while the highest correlations between the diameters were those between d_gM and D_dom (r = 0.89) and between D and d_gM (r = 0.80) (Table 3). BLUP was flexible because any of the known stand variables could be used for calibrating the estimate of the dependent variable, the stand age being the only stand variable required. The calibration efficiency of the individual stand variables in terms of RMSE is given in Table 4. The initial RMSE (16–41%) was substantially diminished after model calibration. When calibrated with the three stand variables (N, D and H), RMSE diminished from 34% to 18% for G and from 18–19% to 11% for d_gM and h_gM. Respectively, RMSE for N decreased from 41% to 24% and those for D and H from 19–21% to 11% when calibrating with d_gM, h_gM and G. It was obvious that calibrating the sum variables (G, N) with only one stand characteristic was not efficient; RMSE decreased by only 3–6 percentage units. On the other hand, calibration with two variables (a sum and a mean characteristic) resulted in almost the same accuracy as calibration with an additional third variable. Nevertheless, the accuracy of the calibrated mean was clearly improved even when only one calibrating mean characteristic was used; at its best, calibration with only d_gM decreased the RMSE of h_gM from 18.7% to 8.5%, and that of D_dom from 16.4% to only 7.5% (Table 4). A more relevant example from the forest management planning standpoint, calibrating d_gM with D, resulted in an RMSE of 11%.
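The single-variable results can be checked with a short calculation. With one calibrating variable, the calibrated variance of linear prediction theory reduces to the initial variance multiplied by (1 − r²), where r is the correlation between the residual errors. The worked example below applies this to D_dom calibrated with d_gM (r = 0.89, initial RMSE 16.4%), treating the relative RMSE as proportional to the standard deviation of the log-scale error, which is an approximation assumed here for illustration.

```latex
\operatorname{var}(\hat{x}_1)
  = \sigma_{11} - \frac{\sigma_{12}^{2}}{\sigma_{22}}
  = \sigma_{11}\,(1 - r^{2}),
\qquad
\mathrm{RMSE}_{\text{calibrated}}
  \approx 16.4\,\% \times \sqrt{1 - 0.89^{2}}
  = 16.4\,\% \times \sqrt{0.2079}
  \approx 7.5\,\%
```

The result agrees with the value of 7.5% quoted above for D_dom, and the same expression shows why the weakly correlated sum characteristics gain little from a single calibrating variable.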
The dominant diameter and dominant height were also calibrated either with arithmetic (RMSE of 11%) or basal area weighted characteristics (RMSE of 6%) (Table 4). In the present study, D_dom and H_dom were not applied for calibration because they are not the most typically recorded stand variables in the commonly used forest management planning systems. However, their calibration potential was briefly checked. In conclusion, calibrating additionally with D_dom and H_dom improved the precision only for the basal area weighted variables; RMSE% for G, d_gM and h_gM decreased to 16.2, 6.5 and 5.3%, respectively. However, the accuracy of the arithmetic variables was hardly changed. Examples of the relative error variation in N and G of the basic and calibrated models are given in Fig. 3. In addition, the effect of calibration on D and d_gM is shown in Fig. 4. The expected values of the basic models are unbiased in the modelling data. However, the logarithmic transformation caused a slight bias at the original scale because the applied bias correction (s_ε² / 2) was based on a normal distribution, but the distributions of the dependent variables were more or less skewed. BLUP models with three calibrating variables (N, D, H) were less than 0.10% biased for the basal area weighted means, and 0.12% biased for dominant tree characteristics, when the bias was calculated from the original scale after back transformation. In addition, basal area was overestimated by 2.3%. When calibrating with G, d_gM and h_gM, a 1.9% underestimation was found in N, while 0.2% and 0.3% bias was found in D and H, respectively. In this case D_dom and H_dom were unbiased. The considerably higher bias in the sum characteristics partly resulted from the bias in the species proportion, because the proportion of G did not coincide with the proportion of N. Additional models for the species-specific relationships between these proportions might help reduce the biases in G and N.

Discussion
The models presented in this paper were fitted together in order to study error covariances and re-estimate the parameters using the SUR method. Some of the model parameters were not statistically significant in the OLS fit, but became significant in the SUR re-estimation, thus giving a more compatible family of models. The estimated parameters for the independent variables were well in line between the models. A specific factor had a logical effect on a number of stand variables. For example, stoniness or paludification had a relatively constant effect on the mean characteristics, while the effect of temperature sum or a specific site type was the highest on the arithmetic means and the lowest on the dominant tree characteristics. Thus, differences in site fertility had a relatively greater effect on the lower part of the dbh or height distributions. Artificial regeneration clearly affected the lower part of the distribution by decreasing the range of diameters (e.g. Uuttera and Maltamo 1995, Siipilehto 2002).
The main advantage of the applied approach was in the flexible calibration procedure (e.g. Lappi 1991). The only important independent variables were the total age of the stand and the average temperature sum corresponding to the location of the stand. Additional dummy variables described the origin and site factors. Formulation of the prediction equations was relatively similar to the procedure used by Nuutinen (1986) for h_gM, while she modelled d_gM recursively by applying h_gM as its predictor. The relationships between the stand characteristics resulted in relatively high and linear correlations between the residual errors of the models.
BLUP is theoretically based on the linearly correlated random errors between the models. The calibrating effect is related to the error covariance, but inversely related to the error variance of the calibrating variable. Thus, for example, if the known D is larger than its expected value, then the unknown d_gM is also expected to be larger than its expected value due to the positive correlation between their random errors. All the measured stand variables could be used for calibration in relation to their prediction error using linear prediction theory. However, the most effective variables for calibration are those with a relatively high covariance with the unknown variable and a simultaneously low error variance.
A typical combination of variables currently assessed in Finnish forest management planning field work is either i) N, D, and H or ii) G, d_gM, and h_gM. These combinations were used when examining the efficiency of the calibration. Generally, the larger the number of known stand variables available, the better the result given by BLUP, at least when the available stand characteristics were accurately determined. It was also found that calibration with only one variable was not effective for the sum characteristics G and N. Quite naturally, if we know only the mean dbh, for example, then we do not have very much information about the corresponding stem number. However, if we also know the basal area, then the variation in the reasonable stem number is greatly reduced.
In the present study, dominant diameter and dominant height were the most accurately predicted stand characteristics of the basic models. Planting, which proved to affect the lower dimensions (D, H, d_gM, h_gM), did not produce any significant variation in the dominant tree characteristics (D_dom, H_dom). Of the variables in question, the dominant tree characteristics are also generally the least related to the thinning intensity. This is most probably the reason why they did not prove very useful as calibrating variables. Also, they were highly correlated with the basal area-weighted characteristics. Thus, when d_gM and h_gM were used for calibration, D_dom and H_dom could not provide any meaningful additional information. They did, however, improve the precision in the basal area-weighted variables because they were less correlated with the arithmetic means.
The basic models presented in this paper were not the most accurate ones, but they did provide a reasonable base for the stand variables by stand age. In the study of Eid (2001), the RMSE of the models was 0.11 and 0.22 when H and G were available, together with age, for predicting D_q and N, respectively, and 0.16 and 0.36 without these variables. The present models, with stand age as the only independent variable describing the stocking, resulted in an RMSE of 0.16–0.20 in the mean characteristics and of 0.34–0.41 in the sum characteristics. When Nissinen (2002) modelled basal area G directly with T, N and D, the accuracy was naturally better (RMSE 0.103). However, when the prediction of G (RMSE 0.35) was calibrated with N and D, the BLUP model gave a relatively satisfactory RMSE of 0.18.
In the study of Nuutinen (1986), the RMSE of d_gM was 0.14 and that of h_gM 0.17 for Norway spruce on mineral soils in the whole of Finland. Nissinen (2002) reported a slightly worse result for d_gM when using D and T as predictors (RMSE 0.17). Thus, the accuracy of the above-mentioned h_gM was relatively comparable with that of the present basic model (the structure of the model was also about the same), while d_gM using h_gM (Nuutinen 1986) or D (Nissinen 2002) as a predictor was more accurate than the present basic model (0.184), but slightly less accurate than the calibrated BLUP models; calibrating with D (or D and N together) resulted in an RMSE of 0.11 in d_gM and 0.13 in h_gM. (In fact, if h_gM could be used for predicting or calibrating d_gM, as in the model by Nuutinen (1986), then the resulting RMSE for d_gM would be only 0.084.) Finally, applying dominant height as an additional calibrating variable resulted in an RMSE of 0.075 and 0.056 for d_gM and h_gM, respectively. The results are not fully comparable because the modelling data were not the same. Nissinen (2002) restricted the corresponding data (NFI based permanent sample plots) to T < 70 yrs, G > 1.0 m² ha⁻¹ and 5 cm < D < 15 cm, while Nuutinen (1986) applied 7th NFI data that consisted of much older stands on Norway spruce dominated mineral soils (mean age 77 yrs). Heikkinen (2002), Ojansuu et al. (2002) and Haara and Korhonen (2004) recently reported considerable measurement errors in the different stand variables. Of the commonly assessed stand variables, the most inaccurate were stem number (50–80% RMSE) and basal area (about 20–40% RMSE), while the most accurate of the variables in question were h_gM (8–17% RMSE) and d_gM (11–18% RMSE). However, considerably more accurate stand characteristics have been obtained when using laser scanning data: according to Suvanto et al. (2005) and Naesset (2002), the RMSE in stem number and basal area was 18–35% and 8–21%, respectively. The errors in stand characteristics have an effect on forest scenarios and decision making (e.g. Eid 2000, Ojansuu et al. 2002). In the present study, the variances were used as they were estimated from the data, and therefore the presented calibration efficiency relates to accurately defined stand characteristics. If one wants to emphasize differences in the accuracy between stand variables, then average measurement errors could be added to the diagonal variances, resulting in a diminishing calibration effect. This might have some advantages if the input data consist of uncertain (e.g. visually assessed) variables. However, this topic was not included in this study. Nevertheless, it was briefly checked, and we found that adding measurement errors to the diagonal variances, while the available stand characteristics were correctly defined, resulted in a considerably less accurate calibration and, in some cases, also in irrelevant combinations of predicted variables.
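The diminishing calibration effect described above can be illustrated with a small NumPy sketch. It assumes that measurement error variances are simply added to the diagonal of the variance-covariance matrix of the known variables before the calibrated variance is computed. The function name and all numeric values are hypothetical and do not come from the study.

```python
import numpy as np


def calibrated_variance(s11, s12, S22, meas_var=None):
    """Calibrated error variance of the unknown variable.

    s11      : initial error variance of the unknown variable
    s12      : covariances between the unknown and the known variables
    S22      : variance-covariance matrix of the known variables
    meas_var : optional measurement error variances of the known variables,
               added to the diagonal of S22 (an assumption of this sketch)
    """
    S22 = np.array(S22, dtype=float)
    s12 = np.asarray(s12, dtype=float)
    if meas_var is not None:
        S22 = S22 + np.diag(meas_var)
    return s11 - s12 @ np.linalg.solve(S22, s12)


# Hypothetical values: one calibrating variable, with and without measurement error.
exact = calibrated_variance(0.04, [0.03], [[0.04]])
noisy = calibrated_variance(0.04, [0.03], [[0.04]], meas_var=[0.04])
print(round(float(exact), 5), round(float(noisy), 5))
```

Inflating the diagonal lowers the weight given to the known variable, so the calibrated variance moves back towards the initial variance of the basic model.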
Stand age was used as an independent variable for all the models because it is essential in forest management planning: it is needed for site index determination and for future simulations. According to Poso (1983), Eid (1992), Eid and Nersten (1996), and Haara and Korhonen (2004), the standard error of stand age determined in field inventories has been 16–33%. In Finland, Haara and Korhonen (2004) found a smaller standard deviation in age in pine dominated stands (22%) than in spruce (27%) or birch dominated stands (33%). We performed a sensitivity analysis, in which we assumed alternatively a 25% under- or overestimation in stand age when predicting the stand characteristics of the modelling data. The resulting biases in the predicted stand variables of the basic models were 11–22%. The effect of erroneous age diminished substantially when the model could be calibrated with accurate stand characteristics. After model calibration with three stand characteristics, the bias in the remaining variables was highest in stem number, 4–5%.
Generally, the bias was about 2–3% for the mean characteristics. At its best, the basal area was only 0.2% biased when N, D, and H were used for calibration while stand age was underestimated by 25%, whereas the dominant tree characteristics were less than 0.3% biased when G, d_gM and h_gM were used for calibration, regardless of a 25% under- or overestimation of age.
The models presented in this study are applicable in simulation systems for completing a given set of variables if none or only some of the variables are measured. In the latter case, the models are calibrated by applying linear prediction theory. In practice, the corresponding models for Norway spruce, Scots pine, and silver and downy birch are already included in the Finnish MOTTI simulation system. In addition to providing information about missing stand variables, they are used for generating the user-defined initial stand structure (see MOTTI-ohjelmisto s.a.). They are especially helpful when generating stands of a given age without any prior knowledge of the values of the stand variables, and especially of a reasonable combination of them, e.g. with respect to stem number and basal area variation.

Fig. 3. Relative prediction errors (%) with respect to expected value of the basic models (o) for N and G, and after calibrating the models with three stand characteristics (●).

Fig. 4. Relative prediction errors (%) with respect to expected value of the basic models (o) for D and d_gM, and after calibrating the models with three stand characteristics (●).

Table 1. Stand variables of the modelling data including Norway spruce dominated stands.
a) Proportion of Norway spruce out of the stand total basal area
b) Proportion of Norway spruce out of the stand total number of stems

Table 3. Cross-model error covariances above the diagonal, variances on the diagonal (in bold) and correlation coefficients below the diagonal (in italics) for Norway spruce.