Reliability of the predicted stand structure for clear-cut stands using optional methods: airborne laser scanning-based methods, smartphone-based forest inventory application Trestima and pre-harvest measurement tool EMO

Accurate timber assortment information is required before cuttings to optimize wood allocation and logging activities. Timber assortments can be derived from diameter-height distribution that is most often predicted from the stand characteristics provided by forest inventory. The aim of this study was to assess and compare the accuracy of three different pre-harvest inventory methods in predicting the structure of mainly Scots pine-dominated, clear-cut stands. The investigated methods were an area-based approach (ABA) based on airborne laser scanning data, the smartphone-based forest inventory Trestima app and the more conventional pre-harvest inventory method called EMO. The estimates of diameter-height distributions based on each method were compared to accurate tree taper data measured and registered by the harvester’s measurement systems during the final cut. According to our results, grid-level ABA and Trestima were generally the most accurate methods for predicting diameter-height distribution. ABA provides predictions for systematic 16 m × 16 m grids from which stand-wise characteristics are aggregated. In order to enable multimodal stand-wise distributions, distributions must be predicted for each grid cell and then aggregated for the stand level, instead of predicting a distribution from the aggregated stand-level characteristics. Trestima required a sufficient sample for reliable results. EMO provided accurate results for the dominating Scots pine but, it could not capture minor admixtures. ABA seemed rather trustworthy in predicting stand characteristics and diameter distribution of standing trees prior to harvesting. Therefore, if up-to-date ABA information is available, only limited benefits can be obtained from stand-specific inventory using Trestima or EMO in mature pine or spruce-dominated forests.


Introduction
In the prevailing Nordic harvesting system, trees are cut to final lengths in the forest.Efficient manufacturing of both existing and new wood and fibre-based products can be considerably enhanced by improved recognition and optimization of harvesting, transportation and product manufacturing procedures.To make wood supply systems more efficient, the physical manufacturing of each product should start in the forest, not at industrial sites.The potential economic gain from initiating more advanced production processes at the beginning of the supply chain may well be 10-40% of the value of the end products before retailing (Uusitalo et al. 2008).With better fitness and integration of the forestry procedures into the industrial production chains, higher process efficiency and increased profitability can be realized.Consequently, considerably lower emissions and lower energy consumption could be achieved as well.
The current harvesting and transportation processes used in the Nordic countries were recently thoroughly described and modelled (Nurminen et al. 2006;Nurminen and Heinonen 2007;Niemistö et al. 2012).Moreover, costs can be assigned by the activity-based costing technique for each log (product) cut by a harvester, extracted to the roadside by a forwarder and transported by a timber truck to the mill (Nurminen et al. 2009).It is also possible to make accurate estimations about the costs and revenues of producing a final product from a single log.Recent findings showed that cost structure for different type of logs, in terms of quantity and quality, may differ considerably in each manufacturing process (Korpunen et al. 2010;Korpunen et al. 2013).With the help of the process models described above it is possible to calculate optimal wood allocation solutions -what products are cut and from which stand.However, reliability of these calculations is highly dependent on the quality of the information available from the stand prior to cutting.
In Finland, the earliest method for obtaining timber assortment information prior to cutting was the PMP system, which was based on measuring the diameter at breast height (dbh) of every tree (PMP-ohje 1982).The method provided accurate estimates at the stand level, but the system was found to be too laborious and expensive.After that, the common practice for obtaining information on timber assortments prior to cutting was based on stand-wise field inventory (SWFI).In SWFI, every stand has been visited by a forest planner, and stand characteristics have been derived from relascope measurements and/or visual assessment.The main purpose of the SWFI has been collecting information for forest planning and thus it has been conducted on an approximate 10-year interval.If the SWFI-based information is outdated for the area of interest, a similar inventory would have been implemented for stands to be harvested.
During the past decade in forest mapping and monitoring applications, the possibility of acquiring spatially accurate, 3D remote-sensing information by means of airborne laser scanning (ALS) has been a major opportunity (Holopainen et al. 2014).An ALS-based forest inventory method in which low density (~0.5 pulses per m 2 ) ALS data are used to generalize field-measured stand characteristics over an entire inventory area has replaced SWFI.Compared to SWFI, the ALS-based inventory method, also known as area-based approach (ABA), has provided a comparable or even more precise estimation of stand characteristics as well as cost savings (Suvanto et al. 2005;Haara and Korhonen;Uuttera et al. 2006;Järnstedt et al. 2012;Holopainen et al. 2010).Although the total timber volume is obtained with relatively high accuracy, information about size-distribution, timber assortments or the number of trees has reliability no better than SWFI (Naesset et al. 2004;Holopainen et al. 2010).Currently, pre-harvest information is obtained from ABA, but reliability of the information is inspected in the field and additional measurements are carried out if required (Suomen metsäkeskus 2014).
To achieve further cost-saving, ABA should be linked with the planning systems of the wood supply chain.If accuracy of the ABA is not satisfactory, additional pre-harvest information should be collected by cost-efficient means in the field.The possible alternatives include, for example the EMO method (Uusitalo and Kivinen 2000) as well as the smartphone-based app and inventory method, Trestima (Rouvinen 2014), which are based on subjective cruising along the stand combined with sample measurements or sample photos, respectively.
The timber assortment information can be calculated from species-specific dbh distributions.In the PMP system dbh distributions were obtained by measuring the dbh of every tree and tree heights were predicted using a stand-specific height curve fitted for the sample trees with measured dbh and height.In ABA, Trestima and EMO inventories, stand characteristics, such as species-specific number of stems (N, ha -1 ), basal-area (G, m 2 ha -1 ), basal-area weighted mean dbh (DG, cm), Lorey's height (HG, m) and stem volume (m 3 ha -1 ) are obtained.
Stand-level information is converted into tree-level information through size distribution modelling, which means selecting the distribution function and the distribution modelling approach (Cao 2004;Siipilehto 2011).Diameter distributions are presented as either unweighted with respect to tree frequency (i.e., dbh-frequency distribution) or weighted with respect to tree basal area (i.e., basal area-dbh distribution) (Gove and Patil 1998).Weighting affects the shape of the distribution (Gove and Patil 1998;Siipilehto 1999) and it has been commonly used in Finland for beta, Weibull or Johnson's SB distribution (Päivinen 1980;Kilkki et al. 1989;Maltamo et al. 1995;Siipilehto 1999).In the parameter recovery method, the input mean (e.g.DG) and sum characteristics (G and N) are compatible with those of the solved distribution.Thus, when basal area and number of stems are compatible, the used mean dbh (e.g.arithmetic vs. basal area-weighted mean) or weighting of dbh distribution has only marginal meaning (Siipilehto and Mehtätalo 2013).Finally, tree height is predicted for the known dbh sampled from the predicted distribution.One option is a bivariate dbh-height distribution (Schreuder and Hafley 1977;Siipilehto 2000;Wang and Rennols 2007) but typically height is predicted using a commonly known height curve (see Mehtätalo et al. 2015 for e.g.Chapman-Richards, Curtis, Korf, Näslund, Prodan, Schumacher curve).The predicted height curve is used for the expected tree height, or tree height is randomized utilizing the standard deviation of the residual error (Mehtätalo 2005;Siipilehto 2000;Siipilehto and Kangas 2015).The reliability of the predicted stand characteristics followed by size distributions determines the quality of the pre-harvest information (Holopainen et al. 2010).
The aim of this study was to assess and compare the accuracy of three pre-harvest inventory methods in predicting a dbh-height distribution in boreal forest stands marked for clear-cutting.The investigated methods were ABA, Trestima and EMO inventories.Trestima and EMO inventories require stand-specific cruising whereas ABA does not require stand-specific field measurements.The predicted dbh and height distributions based on these methods were compared to accurate tree taper data measured and registered by a harvester's measurement system.

Study area and measurement of ground-truth data
The study material comprised seven stands from southern Finland, situated in the municipalities of Hämeenlinna, Orimattila and Myrskylä (Table 1).The sites were grove-like (OMT), mesic (MT) and dryish (VT) heath (Cajander 1925).The forests were dominated by Scots pine (Pinus sylvestris L.) or Norway spruce (Picea abies (L.) Karst.) and were roughly 90 to 120 years in age.Most stands also contain a minor birch admixture (Betula pendula Roth or B. pubescens Ehrh.).Four stands were owned by the HAMK University of Applied Sciences and the remaining three stands were privately owned.All stands had updated forest plans based on the SWFI method.The stands were marked with coloured ribbons to allow easy detection of the boundaries during harvesting.Stand boundaries were then digitized using ArcGIS software.
The study stands were clear-cut during autumn 2014 with three harvesters: two manufactured by John Deere and one by Komatsu.While processing, tree taper data for each commercial tree is registered in the STM-format (StandForD 1997), in which tree diameter is registered to an accuracy of 1 mm every 10 cm from the height of 1.3 m from the stump (10 cm) to the final cutting point.Cubing of trees was calculated by summing the stem profile in 10 cm intervals.Species-specific functions by Laasasenaho (1982) were applied for cubing of the first 1.3 metres that harvester does not register.The missing top of a tree was estimated with the models by Varjo (1995) having the root mean square error (RMSE) of 73 cm (27%), 69 cm (21%) and 107 cm (22%) for Scots pine, Norway spruce and birch, respectively.Varjo's models are relevant when assuming the minimum of the timber length to be 1.5 m and the last cut diameter 5-10 cm.The last condition was not always fulfilled and thus, a restriction for total height was needed in order to avoid unrealistic tree heights.The maximum tree height was limited by the simple function hmax = 5√dbh resulting in, e.g., hmax of 31.6 m, when dbh is 40 cm and hmax of 35.4 m with dbh of 50 cm.Without restriction the known tallest trees in Finland were exceeded in some cases due to the last cut diameter being around 15 cm.Species-specific thresholds for the minimum commercial dbh were 7, 8 and 6 cm for Scots pine, Norway spruce and birch, respectively.The stand characteristics based on harvester measurements are shown in tables 1 and 2. Table 2 includes all the stand characteristics that are used for validating different inventory methods.They are given as species-specific and as stand total characteristics (Table 2).

General workflow
When predicting species-specific stand characteristics and tree lists for marked stands, best practices of the ALS-based forest inventory were followed (White et al. 2013).In the used approach, sample plots were distributed over the study area based on ALS data.Then tree-by-tree measurements were collected from the sample plots.After that, ALS metrics were calculated from the respective areas and predictive models were developed between the field measured stand characteristics and ALS metrics.Finally, stand characteristics were predicted for a systematic grid with a resolution corresponding to plot size.These steps are described in more detail in the following subsections.
It should be noted that sample plots used in ABA located around the studied stands, not within the stands.

Sample plot inventory for ABA
The field data consisted of individual tree measures for 364 plots around the Hämeenlinna test site (Yu et al. 2015), each with a size of 16 m × 16 m.Sampling of the field plots was based on prestratification of existing airborne laser scanning data to distribute plots over various stand height and density classes.The plots were georeferenced, i.e., spatially located, to a map coordinate system using GNSS and total station measurements with an expected accuracy better than 10 cm.Field measurements were collected during the summer 2014 from May to August.The following variables were measured for trees with a dbh larger than 5 cm: tree species, dbh, and height.Stem volumes were calculated according to models developed by Laasasenaho (1982) using tree species, dbh, and height as input variables.Plot-level estimates were obtained by summing the tree-level data.An average of 23 trees was measured per field plot.Mean, standard deviation and the range of these characteristics are shown in Table 3.

Prediction of species-specific stand characteristics
Species-specific N, G, DG, HG and volume were predicted by means of ALS metrics using the nearest-neighbour (NN) approach.Stand characteristics measured in the field were used as target observations, and plot-specific metrics derived from ALS were used as predictors.The random forest approach (RF, Breiman 2001) was applied in the NN search.In the RF method, several regression trees are generated by drawing a replacement from two-thirds of the data for training and one-third for testing for each tree.The samples that are not included in training are called out-of-bag samples, and they can act as a testing set in the approach.The measure of nearness in RF is defined based on the observational probability of ending up in the same terminal node in the classification.The R statistical computing environment and yaImpute library (Crookston and Finley 2008) were applied in the RF predictions.In the present study, 1200 regression trees were generated, and the square root of the number of predictor variables was picked randomly at the nodes of each regression tree.The number of neighbours was set to one to keep the original variance in the data (Hudak et al. 2008).
Prior to the final modelling, RF was used to reduce the number of predictor variables.The aim of the variable reduction was to build up parsimonious models that are capable of accurate predictions.A step-wise looping procedure was used to iterate RF, discarding the least important of the candidate variables at each iteration, based on variable importance.The used predictors in the final imputations were penetration of the laser pulses, mean, standard deviation and maximum laser height as well as percentiles (30%, 50%, 70% and 90%) calculated from point height distribution.To improve accuracy of the species-specific estimates, the marked stands were divided into three strata according to existing stand register information.The first stratum included Scots pine-dominated stands, the second stratum Norway spruce-dominated stands and the third stratum Scots pine-dominated stands with Norway spruce mixture.The first stratum comprised 185 sample plots as the second and third strata had 104 and 26 sample plots, respectively.The final imputations were carried out for each stratum separately.
The study stands appeared as geo-referenced polygons while the results of the ALS inventory are input as 16 m × 16 m grid cells.Match of each grid in relation to stand boundaries were analysed with ArcGIS software.The proportion of the pixels (1.3 m 2 ) within each 16 m × 16 m cell that locate inside the stand polygon was calculated.The lidar metrics from imperfect grid cells were weighted with this proportion 1) when the stand characteristics for each stand were summed from the grid level predictions and 2) when the number of trees-based sample size was calculated for the boundary grids.

Trestima app for standwise forest inventory
The Trestima app for stand-wise forest inventory is based on image analyses of photos taken with a smartphone.Photos taken with the Trestima app are transferred to the Trestima cloud service (Rouvinen 2014;Trestima 2015).Trestima's basal area calculation is fundamentally based on the principles of the Bitterlich (1984) relascope.Trunk widths and heights, as well as tree species from each sample, are measured in the cloud service.
The angle count sampling principle in the app uses a dynamic basal area factor where a single tree leads to a response of basal area of 0.6 to 1.4 m 2 ha -1 (Vastaranta et al. 2015).The total basal area is then obtained by multiplying the number of trees by their respective basal area factor.Species-specific basal area can be calculated in a similar way but by tree species.The Trestima app calculates mean error of basal area after each basal area shot and thus actively instructs the user to take an additional image or not.Trestima inventory is based on freely chosen cruising along the stand.Estimation of standing volumes and timber assortments requires some information about tree heights.Trestima allows the user to input median tree dbh and tree heights manually, but these can also be measured by shooting extra diameter and tree height photos.Once the sample tree is selected, a yardstick is attached to the surface of the trunk and special tree diameter and tree height photos are taken separately.It is recommended to measure at least 2-3 median trees for each tree species from the whole stand (Trestima 2015).
The Trestima application was utilized differently from the application's general guidelines.At the inventory phase, a fixed number of ten basal area shots were taken from each stand.In addition, more than thirty trees per each stand were photographed for tree dbh and tree height for a forthcoming tree quality study.In order to modify the sampling to resemble the typical Trestima method, the following modifications were made.At the office, four sample trees around the estimated mean dbh (D, cm) were chosen for estimating the mean height (H, m).Finally, H was calculated from the fitted stand-specific trendline, i.e. h = a + b×dbh by inputting D to the equation.Thus, the Trestima app determined species-specific basal area, mean dbh, quadratic mean dbh (Dq, cm), mean height and the number of stems in the cloud service.

A pre-harvest measurement tool EMO for predicting stand structure
The EMO method employs combined nearest trees/basal area sampling and modelling applications to create estimation of forest composition (Uusitalo 1997;Uusitalo and Kivinen 1998;Uusitalo and Kivinen 2000).In this method the measurer walks along a subjectively chosen route inside a forest stand and distribute certain number of sampling points evenly within the stand.For each sampling point a certain number of the nearest trees are selected as sample trees.Breast height diameter is measured from every sample tree.From Scots pines, dead branch height is measured as well.In order to calibrate the height and crown height models, some sample trees need to be measured also for these variables.Beside the nearest trees' selection, the basal area of the stand is determined by using a relascope.
Within the study stands, the six nearest trees were measured for dbh from five different sampling points (30 sample trees per stand).At each sampling point one relascope measurement was carried out for basal area.In addition, one tree for each species at each sampling point was measured for tree height.

Weibull in ABA and Trestima
Theoretic dbh distributions were solved using parameter recovery for the two-parameter Weibull function (Siipilehto and Mehtätalo 2013).The needed second moment (Dq 2 ) was obtained from the predicted basal area and number of stems as G/qN (q = π/200 2 ).Predicted basal area-weighted mean dbh (DG) was used as an input variable with ABA while the first moment (arithmetic mean dbh, D) was used with Trestima.There is basically no difference in the recovered Weibull distributions solved from D or DG as far as these input mean characteristics are as accurate (Siipilehto and Mehtätalo 2013).Further, Weibull dbh-frequency distribution can easily be transformed to a basal area-weighted distribution (see Gove and Patil 1998;Siipilehto and Mehtätalo 2013).Throughout this study, dbh-frequency distribution was recovered.
ABA provided predicted stand characteristics for 16 m × 16 m grid cells.Thus, there were two possibilities for estimating stand-specific dbh distribution based on ABA: 1) By predicting dbh distribution for each grid cell and summing those (ABA grid) and 2) averaging stand characteristics predicted for the grid cells and then predicting dbh distribution from the average of the mean characteristics (ABA stand).Both approaches were tested.Because the number of neighbours was set to one in the ABA, quite many grid cells within a stand had equal imputed stand characteristics.This means that using dbh-distribution models deterministically (i.e.systematic sampling and expected tree height) would provide identical trees/dbh distribution for these grids.(Similarly, if the measured trees are used as such, they are identical whenever the same plot is used).This seems unrealistic and inefficient from a natural perspective.Thus, we applied random sampling from a predicted dbh distribution for each grid.The interpreted number of trees per hectare (N) by ABA was divided by 39 (one tree in a grid represents 39 trees ha -1 ) in order to get the sample size for a grid.We sampled trees from the predicted cumulative probability distribution by randomizing the probability (P) from the uniform 0-1 distribution.The cumulative Weibull distribution function is F(dbh) = 1 -exp{-(dbh/b) c } and the tree dbh is solved as dbh = b{-ln(1-P)} (1/c) where b and c are the scale and shape parameters (Bailey and Dell 1973).
In order to test the significance of the number of images on the accuracy of the Trestima method, two alternatives were formed: 1) dbh-frequency distribution predicted from average stand characteristics interpreted from all ten images by the Trestima application (Trestima 10) and 2) using only half of the images in the analysis (Trestima 5).(Each image analysed in Trestima app has fixed expense).The Trestima interface includes a function to disable or enable usage of the image in calculations.Disabling every second basal area image in chronological order was regarded as a suitable way to reduce the number of images from ten to five.
Randomly sampled tree dbh from the recovered Weibull distribution was used with the ABA grid method while tree frequencies in 1-centimeter intervals were calculated from the recovered dbhfrequency distributions for further analyses using ABA stand, Trestima 5 and Trestima 10 methods.

Kernel in EMO
In the EMO inventory method, the dbh distribution of the stand is calculated from the nearest trees' selections and the basal area of the stand is calculated from the relascope measures with a basal area factor of 1. Diameter distribution is then constructed by the nonparametric estimation methodology employing the Epanechikov function as the kernel estimator (Uusitalo and Kivinen 1998).The dbh distributions of each tree species are proportioned to the real scale according to the mean of the relascope measures and sum of each tree species' basal area calculated from sample trees.
The original distribution in 2-cm classes, constructed by the EMO software, were transformed to 1-cm classes to enable similar comparison with ABA and Trestima methods.

Tree height
Tree height predictions are based on Näslund's height curve.The parameters of the Näslund's height curve were predicted with stand characteristics using models by Siipilehto and Kangas (2015).These models provide predictions for Scots pine, Norway spruce, and broadleaves (mainly birch species).The models also include predictions for the standard deviation of the residual error in the form of a variance function.Random deviation was added to the predicted tree height resulting in realistic variation in tree dimensions as well as a reasonable marginal distribution of tree heights (see Siipilehto 2000).When the 1-cm dbh class was used, the number of random tree heights were generated according to the rounded predicted frequency per dbh-class, when it was less than or equal to 5 trees ha -1 .However, a maximum of five random heights per dbhclass were regarded adequate, no matter how high the predicted frequency was.The average total number of generated trees was approximately 220 and ranged from 85 to 415 trees per stands in ABA stand, Trestima and EMO methods.Note that predicted trees with a dbh below commercial limit were deleted.
We focused on the predicted stand characteristics needed for recovering the dbh distribution and finally analysed the accuracy in the volume estimates between the alternative methods.The ground-truth data was the harvester measured data by tree species (Cut trees).Stand characteristics such as N, G, DG and HG, total stem volume, log and pulp wood volumes were calculated species-specific as well as stand totals and they were compared to harvester measured data.Based on differences between the ground-truth and evaluated methods, bias and RMSE was calculated.
The Kolmogorov-Smirnov one-sided goodness-of-fit test (KS) was applied for the predicted dbh distributions using the alpha 0.1 risk level.Alpha 0.1 risk level was chosen due to small data set.Predicted distributions were compared with the ground-truth distribution of the cut trees (tree species pooled) in each clear-cut stand.When the number of sampled trees (n) exceeded 100, the approximate limit value was calculated as √(-ln(alpha/2)/2n) (Sokal and Rohlf 1981).In practice, we calculated the KS-quotient as the actual KS value divided by the limit KS value (Tham 1988).Thus, when the KS-quotient was below one, the predicted distribution passed the test and vice versa; a KS-quotient above one meant a rejected case.Generally, the KS-quotient can be used for ranking different methods regardless of the sample sizes (Tham 1988;Siipilehto 2000).The KS test measures the maximum difference between compared distributions.Error-Index (EI) by Reynolds et al. (1988), instead, measures the differences along the whole distribution.EI is the sum (or weighted sum) of the absolute values of the differences in the real or relative frequencies by dbh classes.In this study, the differences were calculated between the relative frequencies in 1-cm dbh classes without weighting.
In order to rank the applied methods, we counted the numbers of the best and worst cases by the fit criterions: bias and RMSE in stand characteristics and KS and EI goodness-of-fit tests.

Accuracy in generated stand structure
Biases and RMSEs for stand (total) and for each species (pine, spruce and broadleaves) separately are given in Table 4 and 5.Both Trestima alternatives were the least biased in the predicted stand sum characteristics, N and G (Table 4).ABA grid provided generally the most accurate estimates for dimensions (DG and HG) for individual species.In addition, ABA grid resulted in the highest bias only in one case while, e.g., the ABA stand provided the highest bias in five cases (individual species or stand totals) In general, the most biased stand characteristics typically resulted from use of the EMO method (6 times out of 16 cases).However, EMO was the least biased in terms of HG for the whole data and for Scots pine.
If we look at the stand totals, the largest biases were found in the number of stems, namely a 251 ha -1 (46%) underestimation by EMO method and a 181 ha -1 (33%) overestimation using the ABA stand method (Table 4).The worst result in total basal area was the underestimation by 27% using the EMO method.Scots pine was typically the dominant species and its characteristics were generally most accurately inventoried.With Norway spruce and birch (including all broadleaved species) admixtures, results varied more.For example, if the applied method did not capture the admixture at all, the resulting error was 100%.EMO did not capture birch admixture in stand no. 1 and spruce admixture in stand no. 2, while EMO, Trestima 5 and Trestima 10 did not capture birch admixtures in stand no. 4 and finally EMO and Trestima 5 in stand no. 6.Both ABA methods instead detected all the tree species admixtures from all of the stands.Generally, the RMSEs were lowest in determining the mean characteristics.RMSE for the stand DG varied between 8-16% and for stand HG between 8-18% (Table 5).For total basal area the RMSEs varied between 25-40%.Total number of stems was the least accurately predicted characteristic with RMSEs varying between 34-58%.In terms of RMSE, EMO was the best performing method for Scots pine (Table 5).The smallest RMSE for Norway spruce and broadleaves admixtures was most frequently provided by Trestima 10 and ABA grid methods.Even if the ABA stand method provided the smallest bias in stand basal area, it was never among the best methods, when comparing species-specific biases or RMSEs (Tables 4 and 5).This indicates a high variation among stand predictions using the ABA stand method.Quite naturally, the results were generally close to those of the ABA grid method.Also, the results by Trestima 5 and Trestima 10 were close to each other but systematically utilizing all 10 images provided a slightly smaller RMSE.

Reliability in volume characteristics
The ABA grid performed the best in terms of bias in stand total stem volume and log volume while Trestima 5 provided the least biased estimates for stand total pulpwood volume (Table 6).The least biased species-specific results were split among ABA and Trestima options.The largest bias was most often found in the EMO results (9 out of 12 cases) and all were underestimations for Norway spruce and broadleaves admixtures as well as for stand totals (Table 6).
ABA grid provided the lowest RMSEs in stand total and assortment volumes (18-37%) and the EMO method the lowest (18-21%) RMSEs for Scots pine (Table 7).Trestima 10 was the most accurate inventory method for Norway spruce and broadleaves except for broadleaved species pulpwood volume, which was most accurately estimated by the ABA stand method.ABA grid and Trestima 10 methods did not result in any worst result in terms of RMSE (Table 7) but simultaneously, their variants, ABA stand and Trestima 5, resulted in the highest RMSE in two and four cases out of 12 cases, respectively.The EMO method generated six the highest RMSE (Table 7).

Goodness-of-fit test
The goodness-of-fit results varied substantially between inventory methods.According to the Kolmogorov-Smirnov (KS) test the EMO application resulted in only one rejected case, ABA grid and Trestima 10 resulted in two, while ABA stand resulted in five rejected cases and finally, Trestima 5 resulted in a total of six rejection cases out of seven possible (stands) (Table 8).The ABA grid method was rejected mainly because of overestimating the understory (spruce) cohort.The number of the smallest trees was too high for stands 1 and 3 (see example in Fig. 1 for stand 1).However, sometimes the phenomenon of a predicted, peak in small dimensions, fitted the ground-truth data surprisingly well (see Fig. 2 and 3 for stands 5 and 7, respectively).The smallest Error-Index (EI) was most often found with the EMO method (Table 8).In addition, the ABA grid and Trestima 10 provided twice the best EI.Trestima 5 provided the least accurate dbh distributions according to goodness-of-fit statistics (8 times the worst KS or EI test result).Note that in terms of goodness-of-fit statistics, generally the best method was EMO and Trestima 10, which both requires a visit to the target stand.In stand no. 1, EMO provided visually the best fit and was the only one that passed the KS test (Fig. 1).Trestima 10 followed slightly better the observed distribution than Trestima 5, especially when focusing on the largest dbh classes (Fig. 1).In stand no. 5, Trestima 5 showed excess kurtosis but, when all ten images were utilized in Trestima 10, the predicted distribution was one  with the best fit (Fig. 2).The ABA grid reproduced the peaks and hollows of the observed distribution but the ABA stand resulted in a biased location for the peak for the smallest dbh classes and thus, generated the worst KS test values.Stand no.7 had a decreasing, almost inverse J-shaped, dbh distribution.ABA grid and ABA stand were best at capturing the type of distribution (Fig. 3).The rejected case by Trestima 5 overestimated the proportion of the largest dbh classes but quite evenly distributed trees by Trestima 10 passed the KS test.Stand no.6 was the easiest case to predict as each method passed the KS test.However, there were clear differences between predicted distributions.The observed dbh distribution was almost symmetric but with a slight bimodality due to the high proportion of the smallest dbh classes (Fig. 4).In terms of visual interpretation and according to KS and EI goodness-of-fit tests, Trestima 10 was the best.In this case, the peak by ABA grid methods did not coincide well with the peak of cut trees.The EMO generated distribution had an overabundance of trees with a DBH greater than 35 cm.

Ranking of the applied methods
Ranking revealed two superior methods: Trestima 10 and ABA grid (Table 9).The strength of these methods is a superior goodness-of-fit of a predicted dbh distribution.Goodness of the distribution fit also resulted in good performance in standard stand and assortment characteristics, namely 9 and 7 times the least bias as well as 8 and 11 times the smallest RMSE.
The advantage of the Trestima 10 method was especially clear in the predicted number of stems, basal area and volume of the Norway spruce and broadleaves admixtures (see Table 3  and 4).The Trestima application, however, proved to be quite sensitive to the number of photos in the analysis.Applying 5 images often resulted in the worst test criterion (Table 9).Still, Trestima 5 had a total of 7 times the smallest bias.Note that a sample of 5 images was sufficient to reach 10% standard error in basal area in only two stands (Supplementary file 1, available at http://dx.doi.org/10.14214/sf.1568).
Also the EMO application proved sensitive to a small number of sample trees.Namely the basal area of a stand is divided into tree species according to measured sample trees (six nearest trees per sample plot).EMO provided the best test criterion16 times (out of 70), but simultane- ously, it frequently provided the worst criterion (30 times).EMO performed well with a dominating Scots pine stratum, but typically generated the worst test criteria for Norway spruce and birch admixtures (see Tables 4 to 7).

Discussion
The aim of this study was to assess and compare the accuracy of three pre-harvest inventory methods in predicting the dbh-height distribution of clear-cut stands.Of the applied methods, Trestima and EMO inventories require stand-specific samples but ABA does not require any stand-specific field measurements for prediction.The estimates of dbh and height distributions based on these methods were compared to accurate tree taper data measured and registered by the harvester measurement systems during the final cut.Finally, the accuracy of the prediction method is dependent on the generated individual trees -how alike or unlike they are compared with the observed trees and how alike they are distributed.The example given for the ABA grid method for stand 3 showed a remarkable similarity between modelled and cut tree dimensions (see Supplementary file 2, available at http://dx.doi.org/10.14214/sf.1568).
In ABA, we used RF to select NN.In our study, RF was chosen based upon the quality of the results in previous studies and desirable statistical characteristics (i.e., the capability to predict multiple-response variables simultaneously, to use a large number of predictors without the problem of overfitting and to evaluate the accuracy with built-in functionality).The use of RF in NN estimation of forest variables has been common (Hudak et al. 2008;Yu et al. 2011;Vastaranta et al. 2013Vastaranta et al. , 2014)).For example, Hudak et al. (2008) demonstrated that the RF method is more robust and flexible for forest-variable prediction when compared to other NN distance measures, such as Euclidian distance, Mahalanobis distance or canonical-correlation analysis.In our NN predictions, the number of neighbours was set to one to keep the original variance in the data (Hudak et al. 2008).ABA results were generally in line or slightly worse than in other ABA studies carried out in the same study area (Holopainen et al. 2010;Yu et al. 2010;Vastaranta et al. 2013).Vastaranta et al. (2013) predicted G, DG, HG, and total volume with RMSEs of 17.8%, 19.1%, 7.8% and 17.9%, respectively.Especially the differences in the accuracy of basal area and total volume were obvious compared to our study and they could be explained partly from 5% and 12% underestimations using the ABA grid method, while those of Vastaranta et al. (2013) were only 0.3-0.4%.On the other hand, DG by the ABA grid method was even better (RMSE 11.0%) in the present study.It should be pointed out that our results were validated in limited mature forest conditions (7 mature stands) whereas a whole range of forest conditions (500 sample plots; D from 10 to 60 cm) were used in Vastaranta et al. (2013).It is typical in remote sensing-based inventories that stand characteristics in densely stocked stands are underestimated (Vastaranta et al. 2013).Holopainen et al. (2010) used ABA for predicting dbh distribution and timber assortments.Their results were validated with harvester-measured tree data from 12 clear-cutting sites.With ABA they obtained RMSEs of 79.2%, 33.6% and 78.6%, for pine, spruce and birch saw logs, respectively.The respective RMSEs in our study were 42.2%, 106.2% and 111.2%.Thus, accuracy of the dominating tree species in our study was 42.2% (pine) as it was 33.6% (spruce) in study by Holopainen et al. (2010).The species composition varied more in our validation stands compared to the study by Holopainen et al. (2010), which limits an interpretation.Based on previous studies, it can be deduced that the accuracy of the predicted timber assortments is much lower than prediction accuracy of total volume (Peuhkurinen et al. 2008) that is often used when validating quality of the inventory.In general, low prediction accuracy of the species-specific stand characteristics is seen as weakness of the ABA (White et al. 2013).Accuracy of the species-specific stand characteristics correlates highly with the accuracy of the predicted timber assortments.
Note that Weibull distribution in some ALS studies in Finland (Maltamo et al. 2006;Peuhkurinen et al. 2007;Packalén and Maltamo 2008;Holopainen et al. 2010) has been based on an old model by either Mykkänen (1986) or Kilkki et al. (1989).These Weibull models are not capable in predicting the smallest dbh classes resulting in serious underestimation in N. Nevertheless, as basal area-dbh distribution models they were relevant for predicting total and log wood volume (Siipilehto 1999;Maltamo et al. 2006).Inaccuracy in smallest diameters is due to modelling data -NFI based sample plots including trees selected with basal area factor 2. More accurate dbh distribution models exist (Kangas et al. 2007;Siipilehto et al. 2007) and here, we applied a parameter recovery method (Peuhkurinen et al. 2011).The advantage of parameter recovery is that it does not involve any additional errors in the used stand characteristics (Mehtätalo et al. 2007;Siipilehto and Mehtätalo 2013).Thus, stand characteristics interpreted for each grid could be fully utilized for solving the Weibull distribution.However, due to small 16 m × 16 m grid area we preferred random sampling when selecting dbh from the solved distribution.
It should be taken into account that Trestima and EMO applications required observations (photos or sample trees) from the target stands that are not needed in ABA.On the other hand, ABA is typically carried out at 10-year intervals and for large-areas at once (inventoried areas can vary from 50 000 ha to 1 million ha with the same scanning parameters, field plots, etc.).The current price of the ALS data is around 0.3€ per ha whereas the price of the field inventory is few euros per ha.In Finland, forest owners can access forest resource information collected using ABA for free.In our study, we had ABA information just before the cutting, which is not always possible in practice.Trestima and EMO provide flexible options for collecting stand-specific information just before cutting, but the additional costs is an issue.The Trestima app was superior to EMO in generated stand characteristics for admixtures, whereas EMO was superior for Scots pine.Trestima seemed quite sensitive to the number of images, but users can follow the results and standard error in basal area in real time.If sequential sampling is presumed, the risk of having a sample that is inadequate for target accuracy can be sharply reduced.Some clear cut stands appeared quite heterogeneous and even with 10 images the 10% convergence in the standard error of basal area was not achieved.Further, if only five images were included into the analysis, basal area convergence was met only twice (Suppl.file 1).The EMO method generalized relatively small samples to characterize stand structure using kernel smoothing.In addition, EMO divides the species-specific characteristics based on the measured sample trees.Nevertheless, EMO was obviously the best when it comes to volume characteristics of Scots pine.Unfortunately, EMO performed poorly for minor admixtures.
The small samples of the EMO method translated into difficulties with capturing the shape of the dbh distribution.This was shown in some of the largest error-index values by EMO method.
Trestima and ALS methods were the most reliable in predicting stand characteristics and dbh distribution of standing trees prior to harvesting.Both methods had their strengths and weaknesses.When ABA was applied at grid level, it provided at its best very accurate dbh distributions that were able to follow bimodal-or even multimodal-shaped distributions.This is a very useful feature in stands having heterogeneous structure.On the other hand, all remote sensing methods have difficulties in two-or multistoried stands or situations in which trees have a clustered spatial structure (Uuttera et al. 1998;Mehtätalo 2006;Peuhkurinen et al. 2007).
The Trestima method provides the most accurate stand characteristics for the whole stand but the applied Weibull smoothing lacks the ability to produce bimodal-or multimodal-shaped species-specific distributions.The EMO method had originally been developed for producing accurate diameter-height-quality distributions of mature Scots pine forests and comprises features that inevitably provide biased results for admixtures.The method could be improved by executing relascope sampling for each tree species separately and by measuring the dbh of each tree within the relascope sample.
From the compared methods ABA is the least expensive, the Trestima method is the second least expensive and the EMO method the most expensive.The stand-specific cruising takes from half an hour to one hour in a typical Finnish stand of a size ranging from 1-3 hectares.With the EMO method additional time has to be reserved for data input.It is difficult to determine how accurate the predictions of standing trees need to be after all.What is the threshold value of accuracy for making decisions resulting in more profitable wood allocation and conversion solutions?The final judgment of the accuracy required has to be based on analytical comparisons of costs and revenues related to these decisions (Eid et al. 2004;Mäkinen et al. 2012).
Alternatives for kNN area-based ALS approach is to estimate regression models for required stand characteristics and/or for dbh distribution directly from ALS canopy metrics (Naesset 2002;Gobakken and Naesset 2004;Suvanto et al. 2005;Uuttera et al. 2006;Maltamo et al. 2007;Magnussen et al. 2012) or single tree detection (Peuhkurinen et al. 2007;Kankare et al. 2015).Also, one option is to produce dbh distributions directly from the measured trees of the field plots (Packalén and Maltamo 2008;Peuhkurinen et al. 2008).In the latter case the same trees will be multiplied, whenever the same plot describes the grid.In the Trestima method, empirical dbh distributions produced from the interpreted trees included in the relascope samples are also available.Those species-specific distributions can be bi-or multimodal.However, we believe that smoothing the small sample of interpret trees improved the results.At least, the species-specific distributions of the ground truth data resembled more like unimodal, although often extremely skewed for Norway spruce.Trestima methods included an average of 40 and 76 interpret trees from 5 and 10 images, respectively.This amount corresponded to 7% and 13% of the total number of cut trees per stand (see Suppl.file 1).Kopakka (2015) found that Trestima underestimates individual tree dbh and thus, its empirical dbh distribution also underestimated mean dbh.Accordingly, in this study DG was systematically underestimated with Trestima methods resulting in underestimation in basal area, too.Vastaranta et al. (2015) instead found slight overestimation (0.8-1.4%) in the dbh of the individual median tree for pine and broadleaves but 3.1% underestimation for spruce, while tree height and stand basal area were underestimated using Trestima.Trestima is under continuous development work.Thus, we do not know if the version of the processing software was the same in these studies.
The study material may be regarded as rather homogenous in terms of stand maturity and stand size.The study stands were all mature Scots pine-or Norway spruce-dominated forests with a stand size varying from 0.7 to 2.0 ha.The stand sizes may be regarded as rather typical for southern Finland.The pre-harvest inventory methods (Trestima and EMO) tested and compared here comprise a system, where sampling and measurements are spatially distributed evenly within the whole stand.In the ALS grid method, the diameter distributions are constructed by summing the individual diameter distributions of each 16 m × 16 m grid within the stand instead.In this sense the best methods, the Trestima and the ABA grid methods, probably produce similarly good results also in larger stands.Actually, if we look, say, at RMSE of the total volume, it seemed to decrease along with increased stand size.This was probably due to a decreased proportion of the most uncertain boundary grids.

Fig. 1 .
Fig. 1.Example of the breast height diameter (dbh) distributions for stand no. 1. EMO provided the best fit for total dbh distribution (all species together) according to Kolmogorov-Smirnov (KS) and Trestima 10 according to Error-Index (EI) goodness-of-fit tests.

Fig. 2 .
Fig. 2. Example of the breast height diameter (dbh) distributions for stand no. 5. ABA grid, Trestima 10 and EMO methods passed the Kolmogorov-Smirnov goodness-of-fit test while EMO had the best test values.Trestima 5 resulted in a too peaked dbh distribution.

Fig. 3 .
Fig. 3.Example of the breast height diameter (dbh) distributions for stand no. 7. ABA stand and ABA grid methods performed the best and second best according to Kolmogorov-Smirnov (KS) but according to Error-Index (EI), the ABA grid was slightly better.Trestima 5 and EMO did not pass the KS test and they provided the highest EI values.

Fig. 4 .
Fig. 4. Example of the breast height diameter (dbh) distributions for stand no. 6. Trestima 10 fitted the best according to Kolmogorov-Smirnov (KS) and Error-Index (EI) goodness-of-fit tests.The second best EI and KS test values were found using Trestima 5 and EMO, respectively.All the predicted dbh distributions passed the KS test.

Table 1 .
General characteristics of the study stands.Volumes of trees are calculated from the harvester data.

Table 2 .
The average ground-truth stand characteristics calculated from harvester data (Cut trees).
Silva Fennica vol.50no.3 article id 1568 • Siipilehto et al. • Reliability of the predicted stand structure for…

Table 4 .
Absolute and relative bias in stand characteristics by the optional methods.The best methods by tree species and stand totals are highlighted in bold and the worst are shown in italics.
N denotes number of stems, G basal area, DG basal area-weighted mean diameter, HG Lorey's height.

Table 5 .
Absolute and relative RMSE in stand characteristics by methods.The best methods by tree species and stand totals are highlighted in bold and the worst are shown in italics.
N denotes number of stems, G basal area, DG basal area-weighted mean diameter, HG Lorey's height.

Table 6 .
Absolute and relative bias in volume characteristics by methods.The best methods by tree species and stand totals are highlighted in bold and the worst are shown in italics.Silva Fennica vol.50 no.3 article id 1568 • Siipilehto et al. • Reliability of the predicted stand structure for…

Table 7 .
Absolute and relative RMSE in volume characteristics by methods.The best methods by tree species or stand totals are highlighted in bold and the worst are shown in italics.

Table 8 .
Goodness of fit tests usingKolmogorov-Smirnov (KS-quotient, Tham 1998)andError-Index (Reynolds et al.  1988).The smaller the value, the better the fit is.KS-quotient < 1 means passed case.The best test values by stands are highlighted in bold and the worst in italics.

Table 9 .
Ranking the methods by the number of the best and the number of the worst cases among analysed criteria: Bias and RMSE for number of stems, basal area, basal area-weighted mean diameter, Lorey's height, total volume, log and pulp wood volume by tree species and stand totals, as well as Kolmogorov-Smirnov (KS) and Error-Index (EI) tests.(Total of 70 criterion).The best performed method by each criteria is highlighted in bold.Silva Fennica vol.50no.3 article id 1568 • Siipilehto et al. • Reliability of the predicted stand structure for…