Estimation of forest biomass by means of genetic algorithm-based optimization of airborne laser scanning and digital aerial photograph features

Information on forest biomass is required for several purposes, including estimation of forest bioenergy resources and forest carbon stocks. Airborne laser scanning is today considered the most accurate remote sensing method for forest inventory. The three-dimensional nature of laser scanning data enables estimation of the volumes of the tree canopies. The dimensions of the tree canopies show high correlation with the amount of forest biomass. Optical aerial photographs are often used to complement laser data, for improved distinction between tree species. The paper reports on a study testing the estimation of forest biomass variables in two study areas in Southern Finland. The biomass variables were derived on the basis of tree-level field measurements, with biomass models used for pine, spruce, and birch. The sample-plot-level biomass components were derived on the basis of tree-level data and used as reference data for airborne-laser- and aerial-photograph-based estimation. Results were slightly better for total biomass (RMSE 22.5% and 23.6% for the two study areas) than for total volume (RMSE 23.4% and 26.1%). Species-specific estimation errors were large in general but varied between the study areas, because of differences in their forest structures.


Introduction
Accurate estimates of forest biomass are required for several purposes. Demand for forest biomass for bioenergy consumption is increasing markedly, supported by, for example, requirements imposed by EU directives for promotion of the use of energy from renewable sources (e.g., Steierer 2010). Industrial bioenergy users need up-to-date information in cartographic form on the location and quantity of forest biomass resources. Forests also have a key role in the carbon cycle, and forest biomass is a significant carbon sink. Estimates of forest carbon stocks and predictions of their change are required for implementation of effective climate policy.
Silva Fennica vol. 47 no. 1 article id 902 • Tuominen & Haapanen • Estimation of forest biomass…
For a long time, remote sensing and earth observation techniques have been applied to produce mapped estimates of forest parameters such as the volume of growing stock. Typical remote sensing data sources that have been utilised are optical satellite images and aerial photographs. A forest inventory technique based on satellite imagery has been applied for estimation of forest biomass as well (e.g., Tucker et al. 1983; Roy and Ravan 1996; Tuominen et al. 2010; Nichol and Sarker 2011). Variables related to the quantity of aboveground forest biomass are well suited to remote-sensing-based estimation, since the canopy reflectance is correlated with the entire volume of forest biomass. This also holds true for active remote sensing techniques, such as airborne laser scanning (ALS), wherein the laser pulses reflected from the forest crown layer are used in prediction of forest characteristics.
In Finland, a new-generation forest inventory system has been introduced for forest management planning. The new-generation system replaces the traditional visual inventory by stand. Estimation of forest variables in this new-generation forest inventory method is based on interpretation of ALS data and digital aerial photographs, with application of field measurements from sample plots as reference data. Typical sources of remote sensing data used in the system are low-density ALS data (typically 0.5-2 pulses/m²) and digital aerial photographs with a spatial resolution of 0.25-0.5 m, typically including the following spectral channels: blue (B), green (G), red (R), and near-infrared (NIR).
In the new forest inventory method, the inventory units are square-shaped elements of a systematic grid (i.e., grid cells). For the selection of the field plots, the inventory area is typically stratified on the basis of earlier stand inventory data, and the field plots are allocated to these strata in order to obtain representative reference data covering all major types of forest for the interpretation of forest variables.
Currently, ALS is considered to be the most suitable remote sensing method for estimation of stand-level forest variables (Naesset 2002; Naesset 2004; Maltamo et al. 2006). Compared with optical remote sensing data sources, ALS data are particularly well suited to estimation of forest attributes related to physical dimensions of tree canopies, because ALS produces three-dimensional (3D) information on forest canopies. On the other hand, ALS data are not as well suited to estimation of tree species proportions or dominance with the applied point density (e.g., Törmä 2000). Therefore, optical imagery usually is needed to complement the ALS data. Aerial photographs have been widely used in forest inventory, and their affordability and availability are generally good (e.g., Tuominen and Pekkarinen 2005; Maltamo et al. 2006).
The combination of aerial photographs and ALS data makes it possible to derive a very large number of remote sensing features describing the characteristics of a field plot or a stand. Consequently, the dimensionality of the remote sensing feature space increases greatly. It is, in general, computationally infeasible to use all possible remote sensing features when processing large inventory areas. Also, when the dimensionality grows, the data become sparse in relation to the dimensions (Hinneburg et al. 2000). This causes problems for estimators based on distance or proximity in the feature space ('the curse of dimensionality'; e.g., Beyer et al. 1999). Therefore, the number of remote sensing features must be reduced in a way that produces an appropriate subset of features for the estimation procedure, considering their usefulness in predicting the forest attributes as well as their mutual correlation. Many types of feature selection algorithms have been applied for this purpose (e.g., Siedlecki and Sklansky 1989; Pudil et al. 1994; Jain and Zongker 1997; Kudo and Sklansky 1998; Kudo and Sklansky 2000). In remote-sensing-aided forest inventory applications, the methods used include correlation analysis (Tuominen and Pekkarinen 2005; Breidenbach et al. 2010), canonical analysis (Packalén et al. 2012), stepwise selection using various criteria and proceeding either forwards by adding features, backwards by eliminating them, or by combining these operations (Tuominen and Pekkarinen 2005; Maltamo et al. 2006; Packalén and Maltamo 2007; Haapanen and Tuominen 2008; Hudak et al. 2008; Packalén et al. 2009; Latifi et al. 2010; Breidenbach et al. 2010; Packalén et al. 2012), simulated annealing (Packalén et al. 2012), and genetic algorithms (e.g., Van Coillie et al. 2005; Haapanen and Tuominen 2008; Latifi et al. 2010).
The current plan of operation for the new-generation forest inventory system is to cover the area of the private forests of Finland in approximately 10 years with aerial photographs and ALS data. In the current inventory system, mainly growing stock variables related to stem dimensions and volume will be estimated. However, it would be relatively straightforward to also include variables related to forest biomass in the inventory system, by means of existing biomass models. The estimation of forest biomass by means of ALS and aerial imagery or other optical remote sensing data has been examined in several studies (e.g., Lefsky et al. 2005; van Aardt et al. 2006; Popescu 2007; Naesset and Gobakken 2008; Kotamaa et al. 2010; Hauglin et al. 2012).
The objective of this study was to test ALS- and aerial-photograph-based estimation of forest biomass. We focus on feature selection, as it is an important aspect in cases where a multitude of potential image features is available. Two study areas were included in the analyses because, in earlier studies, we have seen the estimation success for individual tree species differ depending on the forest composition of the study area. In addition, the optimal set of features often differs between study areas.

Study areas and field data
The laser-scanning- and aerial-image-based estimation was tested in two study areas. Study area 1 was in the municipality of Lammi, in Southern Finland (approximately 61°19´N and 25°11´E). The area covered approximately 1800 ha of state-owned forest. The field data employed in this study encompassed 263 fixed-radius (9.77 m) circular plots that were measured in 2007. From each plot, all living tally trees with a breast-height diameter of at least 50 mm were measured. The total number of trees measured in this study area was 8027. For each tally tree, the following variables were recorded: location, tree species, crown layer, diameter at breast height, height, and height of living crown. The plots were located with Trimble's GEOXM 2005 Global Positioning System (GPS) device, and the locations were processed with local base station data, with an average error of approximately 0.6 m. The forests in the study area were dominated by coniferous tree species, mainly Scots pine (Pinus sylvestris) and Norway spruce (Picea abies). Of the deciduous species, birches (Betula pendula and B. pubescens) were most common. Other, mainly non-dominant tree species present in the study area were aspen (Populus tremula), grey alder (Alnus incana), rowan (Sorbus aucuparia), contorta pine (Pinus contorta), larches (Larix spp.), and firs (Abies spp.).
Study area 2 was located in Eastern Finland, in the municipalities of Kuopio and Karttula (approximately 62°55´N and 27°12´E), covering approximately 36 700 ha of mainly privately owned forest. The field data employed in this study covered 504 fixed-radius (9.00 m) sample plots measured in 2009. In each plot, all living tally trees with a breast-height diameter of at least 50 mm were measured. The total number of tally trees measured in this area was 14 657. The following variables were measured for each tally tree: species, crown layer, tree class, and diameter at breast height. Height and age were measured from sample trees used for each tree species and crown layer. The tally trees' heights were estimated via a model by Eerikäinen (2009) developed for generalisation of sample trees' characteristics to tally trees. The model is based on locally calibrated species-specific dependence between stem diameter and height. The field plots were located with a high-precision GPS device, and the locations were processed with local base station data. The accuracy of the field plot positioning was specified to be 1 m. The forests in the study area were dominated by coniferous tree species, and Norway spruce was the dominant tree species, with approximately half of the total volume, followed by Scots pine. The deciduous species, making up less than a quarter of the volume, were mainly birches (Betula pendula and B. pubescens). The proportion of other species was insignificant.
In order to cover all types of forest, the study areas were stratified on the basis of earlier stand inventory data and the field sample plots were assigned to these strata. The statistics of the study areas, from the sample plot measurements, are presented in Tables 1 and 2. In both areas, Scots pine and Norway spruce were treated as separate classes and other species as one class. In study area 1, this group then included a variety of species ('other species'), but in study area 2 only deciduous species. The locations of the study areas and the field sample plot layouts within them are presented in Fig. 1. Fig. 2 shows the distribution of total aboveground (ABVG) biomass in both areas.

Remote sensing data
In study area 1, the remote sensing data consisted of orthorectified colour-infrared digital aerial photographs from 2006 (containing near-infrared, red, and green bands) with a ground resolution of 0.5 m and ALS data from 2006 acquired from a flying altitude of 1900 m with an average density of 0.88 pulses/m². The aerial photographs were combined into an image mosaic covering the entire study area. In study area 2, digital aerial photographs were acquired in 2008 from an altitude of 5600 m with an overlap of 60%. The images were orthorectified with 0.5 m ground resolution and featured blue, green, red, and near-infrared bands. ALS data were acquired in 2008, from a flying altitude of 2000 m. The density of the ALS data was 0.6 pulses/m².
In addition to the ALS point data, the ALS height and intensity data (of the first or only pulses) were interpolated to a raster format with a resolution similar to that of the aerial photographs. The output pixel values were calculated as inverse-distance-weighted (with a power of 2) averages of the two nearest ALS points. These raster images were used for computation of the textural features (see Subsection 2.4).
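The interpolation rule described above can be illustrated with a minimal Python sketch. It is not the software used in the study; the function name `idw_two_nearest`, the handling of a point lying exactly on the pixel centre, and the sample coordinates are assumptions made only for illustration.

```python
import math


def idw_two_nearest(points, values, pixel, power=2.0):
    """Inverse-distance-weighted average of the two ALS points nearest to a pixel centre.

    points: (x, y) coordinates of ALS returns (hypothetical input layout)
    values: height or intensity of each return
    pixel:  (x, y) coordinate of the output pixel centre
    """
    # Pair every point's distance with its value and keep the two closest.
    nearest = sorted((math.dist(p, pixel), v) for p, v in zip(points, values))[:2]
    # Assumption: a return lying exactly on the pixel centre supplies the value directly.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


# Hypothetical example: returns at 1 m and 2 m from the pixel centre.
example_points = [(0.0, 0.0), (3.0, 0.0), (10.0, 10.0)]
example_heights = [10.0, 20.0, 30.0]
example_value = idw_two_nearest(example_points, example_heights, (1.0, 0.0))
```

With a power of 2, the nearer return in this example receives four times the weight of the farther one, so the interpolated value stays close to the nearest observation.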

Estimation of biomass components
Sample-plot-level biomass quantities were estimated for the tally trees on the basis of the tree measurements. The main interest was in the total ABVG biomass; in addition, branch and needle/foliage biomasses were computed. Scots pine models were applied for all pines (Pinus sylvestris and P. contorta). Norway spruce models were applied for Norway spruce, larches, and firs (Picea abies, Larix spp., and Abies spp.). Birch models were applied for all deciduous tree species.
The following models were applied for estimation of biomass components in study area 1:
- Stem volume: volume models for Scots pine, Norway spruce, and birch by Laasasenaho (1982) utilising height and diameter as independent variables.
- Branch biomass: multivariate models for Scots pine and Norway spruce by Repola (2009) and multivariate model for birch by Repola (2008) based on diameter, height, and length of the living crown.
- Needle/foliage biomass: multivariate models for Scots pine and Norway spruce by Repola (2009) based on diameter, height, and length of the living crown and multivariate model for birch by Repola (2008) based on the diameter and crown ratio.
- Total ABVG biomass: multivariate models for Scots pine and Norway spruce by Repola (2009) based on diameter, height, and length of the living crown and multivariate model for birch by Repola (2008) based on diameter and height.
In study area 2, where the variable 'length of the living crown' was not measured in the field, the biomass estimation differed from that in study area 1 for a number of biomass components. The following models were applied in estimation of those biomass components in study area 2:
- Stem volume: volume models for Scots pine, Norway spruce, and birch by Laasasenaho (1982) utilising height and diameter as independent variables.
- Branch biomass: multivariate models for Scots pine and Norway spruce by Repola (2009) and multivariate model for birch by Repola (2008) based on diameter and height.
- Needle/foliage biomass: multivariate models for Scots pine and Norway spruce by Repola (2009) based on diameter and height, and multivariate model for birch by Repola (2008) based on diameter.
- Total ABVG biomass: multivariate models for Scots pine and Norway spruce by Repola (2009) based on diameter and height.
The sample-plot-level biomass quantities were calculated by summing of tree-level biomasses and conversion to tons per hectare.
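As a rough illustration of this aggregation step, the following Python sketch sums tree-level biomasses on a fixed-radius circular plot and scales the sum to tons per hectare. The function name and the assumption that the tree-level model outputs are expressed in kilograms are ours, and the example tree values are hypothetical.

```python
import math


def plot_biomass_t_per_ha(tree_biomasses_kg, plot_radius_m):
    """Sum tree-level biomasses (assumed to be in kg) and convert to t/ha."""
    plot_area_ha = math.pi * plot_radius_m ** 2 / 10_000.0  # m2 -> ha
    total_t = sum(tree_biomasses_kg) / 1000.0               # kg -> t
    return total_t / plot_area_ha


# Hypothetical plot with three tally trees on a 9.77 m radius plot (about 300 m2).
example_t_per_ha = plot_biomass_t_per_ha([100.0, 200.0, 300.0], 9.77)
```

A plot radius of 9.77 m corresponds to an area of roughly 300 m², so each tree on such a plot represents about 33 trees per hectare; with a 9.00 m radius (about 254 m²) the expansion factor is about 39.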

Extraction of laser and aerial photograph features
A multitude of features can be derived from the registered ALS point altitudes, starting from the simplest ones, such as mean and maximum, which describe the stand height. Variables related to the range, deviation, and distribution of point altitudes within the forest estimation units describe the canopy structure, and they have been found suitable for area-based (i.e., stand-level) estimation at the applied ALS point density (e.g., Naesset 2002; Naesset 2004). In applications using higher point densities and aiming at individual tree estimation, other features can also be used, such as alpha shape metrics that describe the shape of individual canopies (e.g., Vauhkonen et al. 2012).
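The kinds of area-based height metrics mentioned above can be sketched as follows. This is only an illustrative subset: the feature names, the nearest-rank percentile definition, and the 2 m canopy threshold are assumptions and do not reproduce the exact feature set of the study.

```python
import statistics


def als_height_features(heights_m, canopy_threshold_m=2.0):
    """Simple plot-level statistics of ALS point heights above ground."""
    hs = sorted(heights_m)
    n = len(hs)

    def percentile(p):
        # Nearest-rank percentile with integer ceiling division (p is an integer 1..100).
        rank = max(1, -(-p * n // 100))
        return hs[rank - 1]

    return {
        "h_mean": statistics.fmean(hs),
        "h_max": hs[-1],
        "h_std": statistics.pstdev(hs),
        "h_p50": percentile(50),
        "h_p90": percentile(90),
        # Share of returns above the (assumed) canopy threshold.
        "canopy_ratio": sum(h > canopy_threshold_m for h in hs) / n,
    }


# Hypothetical heights (m) of ten returns within one sample plot.
example_features = als_height_features(
    [0.0, 1.0, 5.0, 10.0, 12.0, 15.0, 18.0, 20.0, 21.0, 22.0]
)
```

The mean and maximum describe stand height, whereas the percentiles, the standard deviation, and the share of canopy returns describe how the returns are distributed vertically.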
Spectral averages and standard deviations of aerial photograph bands within the field plots vary according to canopy closure and tree species, although they are also dependent on other factors (such as sensor aperture and location in the image) that weaken the correlation (e.g., Tuominen and Pekkarinen 2004). Textural features of both remotely sensed data types describe the horizontal structure of the canopy, which, in turn, is affected by the tree species, canopy size, and closure. The following statistical and textural features were extracted from the aerial photographs and ALS data for each sample plot area or from a square window centred on the sample plot:
In study area 1, the Normalized Difference Vegetation Index and near-infrared/red channel transformations were also derived from the aerial photographs, and the features of sets 1-3 were computed for them as above. This was expected to aid in the discrimination of biomass and volume by species. These features were, however, rarely selected for the final feature sets in the feature selection procedure (see Subsection 2.5), so the process was not repeated for study area 2. Estimation with laser features only was tested in both areas as well.
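The two channel transformations mentioned for study area 1 can be written out as a short Python sketch. The standard definition NDVI = (NIR − R) / (NIR + R) is used; the function names, the treatment of zero denominators, and the plot-level averaging are assumptions for illustration only.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index of one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total  # assumption: 0 for empty pixels


def nir_red_ratio(nir, red):
    """Near-infrared/red channel ratio of one pixel."""
    return float("inf") if red == 0 else nir / red


def mean_ndvi(pixels):
    """Plot-level average of pixelwise NDVI; pixels is a list of (nir, red) pairs."""
    return sum(ndvi(nir, red) for nir, red in pixels) / len(pixels)


# Hypothetical pixel values within one plot window.
example_mean_ndvi = mean_ndvi([(200.0, 50.0), (100.0, 100.0)])
```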
For the estimation of forest biomass attributes, all features were standardised to a mean of 0 and a standard deviation of 1. This was done because the original features had diverse scales of variation. Estimation methods based on proximity or distance in the feature space, such as k-means clustering or the k-nearest-neighbour method applied in this study, are sensitive to these scales: without standardisation, variables with wide variation would have greater weight in the estimation, regardless of their correlation with the estimated forest attributes.
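A minimal sketch of this standardisation for a single feature column is given below. The function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, as the text does not specify it.

```python
import statistics


def standardise(column):
    """Rescale one feature column to mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)  # assumption: population standard deviation
    if sd == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / sd for x in column]


# Hypothetical feature values on three plots.
example_z = standardise([2.0, 4.0, 6.0])
```

After this step, a feature measured in metres and one measured in digital numbers contribute to the Euclidean distance on the same scale.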

Estimation and feature selection methods
The k-nearest-neighbour (k-nn) method was used for estimating the forest variables (e.g., Kilkki and Päivinen 1987; Tomppo 1991; Tokola et al. 1996). In remote-sensing-based forest inventory applications, the k-nn method has distinct advantages compared to, e.g., regression modelling. In practical inventories the number of inventory variables is often high. In k-nn, all inventory variables can be estimated simultaneously, and k-nn is also likely to retain the original covariance structure of the inventory variables. When regression analysis is used, in contrast, the variables must be estimated separately or in groups, which entails a laborious estimation procedure and may lead to estimates that are not necessarily compatible and whose covariance structure differs from that of the original field variables (e.g., Tomppo and Halme 2004; Wilson et al. 2012).
In the k-nn method, Euclidean distances between the sample plots were calculated in the n-dimensional feature space, where n represents the number of features extracted. The estimates for the tested variables were calculated as weighted averages of the variables of the k nearest sample plots (Eq. 1). Weighting by inverse squared Euclidean distance in the feature space was applied (Eq. 2) to reduce the bias of the estimates (Altman 1992).
The value of k was set to 5, a compromise considered suitable for a study such as this one, which does not explore the utility of the k-nn method as such.
$$\hat{y} = \frac{\sum_{i=1}^{k} w_i y_i}{\sum_{i=1}^{k} w_i} \qquad (1)$$

$$w_i = \frac{1}{d_i^{2}} \qquad (2)$$

where
$\hat{y}$ = estimate for variable y
$y_i$ = measured value of variable y on the ith nearest neighbour plot
$d_i$ = Euclidean distance (in the feature space) to the ith nearest neighbour plot
k = number of nearest neighbours

The estimated variables were total volume of growing stock; volumes of Scots pine, Norway spruce, and deciduous/other species; total ABVG biomass and biomasses by species; and total biomass of branches + foliage and these biomasses by species. Later in the text, the combined branch and foliage biomass is referred to as 'branch biomass'. The accuracy of the estimates was calculated via leave-one-out cross-validation through comparison of the estimated forest variable values with the measured values (ground truth) of the field plots. The accuracy of the estimates was measured in terms of the relative root mean square error (RMSE) and bias (see Eqs. 3-6).

In order to select an appropriate set of features for the estimation task, automatic feature selection was carried out through a simple genetic algorithm (GA) presented by Goldberg (1989) and implemented in the GAlib C++ library (Wall 1996). The method has been successfully used for problems of similar types (e.g., Haapanen and Tuominen 2008; Tuominen and Haapanen 2011). The GA process starts by generating an initial population of strings (chromosomes or genomes), which consist of separate features (genes). The strings evolve over a user-defined number of iterations (generations). This evolution includes the following operations: selecting strings for mating by applying a user-defined objective criterion (the better the string, the more copies in the mating pool), allowing the strings in the mating pool to swap parts (crossover), causing random noise (mutations) in the offspring (children), and passing the resulting strings to the next generation. Here the starting population was 300 strings, which were developed over 30 generations. The probability of crossover was 80% and that of a mutation 1%. The objective was to minimise the RMSE of total biomass (or stem volume, depending on the case). When tree species biomasses or volumes were taken into account in the feature selection criteria, we created an artificial univariate response variable which was a weighted combination of species-specific relative RMSEs. The following weights were applied: total 55%, Scots pine 15%, Norway spruce 15%, deciduous/other species 15%. As the result depends on the randomly generated initial population, three 30-generation runs were completed. The expected proportion of activated features to all features is 50%, which means that with a large initial number of features, finding a good combination with only a few features is unlikely. Therefore, the features found in the best run were used as the base for a new selection step. In all, 3-4 of these GA steps were taken before the RMSEs stopped falling (see Fig. 3 in 'Results').
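The k-nn estimator with inverse squared distance weighting and the leave-one-out accuracy assessment described in this section can be sketched in Python as follows. This is an illustrative sketch, not the implementation used in the study. In particular, the exact forms of the RMSE and bias formulas are not reproduced in the text above, so the division by n and the residual sign (observed minus estimated) are assumptions, as are the function names and the handling of zero distances.

```python
import math


def knn_estimate(target, ref_features, ref_values, k=5):
    """Weighted k-nn estimate with weights 1 / d^2 in the feature space."""
    nearest = sorted(
        (math.dist(target, f), y) for f, y in zip(ref_features, ref_values)
    )[:k]
    # Assumption: reference plots at zero distance determine the estimate alone.
    exact = [y for d, y in nearest if d == 0.0]
    if exact:
        return sum(exact) / len(exact)
    weights = [1.0 / d ** 2 for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def leave_one_out(features, values, k=5):
    """Estimate every plot from all the other plots."""
    estimates = []
    for i in range(len(values)):
        ref_f = features[:i] + features[i + 1:]
        ref_y = values[:i] + values[i + 1:]
        estimates.append(knn_estimate(features[i], ref_f, ref_y, k))
    return estimates


def relative_rmse_and_bias(observed, estimated):
    """Relative RMSE and bias in per cent of the observed mean (assumed forms)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - e for o, e in zip(observed, estimated)]
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    bias = sum(residuals) / n
    return 100.0 * rmse / mean_obs, 100.0 * bias / mean_obs
```

Calling `leave_one_out` on the standardised features and a plot-level variable, and then `relative_rmse_and_bias` on the observed and estimated values, gives accuracy figures of the kind reported in the results tables.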

The number and properties of the selected features
Features selected by the GA when both laser and aerial photograph features were available are listed in Appendix 1, and Table 3 summarises the number of selected features in different cases. Fewer features were needed for the estimation of total amounts (biomass or volume) than when the success of species-specific estimation was also considered in the feature selection. In study area 2, there were more ALS features than aerial photograph features available for the first step of feature selection (52% of features), but in study area 1 the situation was the reverse (46% of the features were laser-based). After feature selection, ALS features dominated in both areas. Of the ALS feature types, those describing the vertical distribution of observations (bullet point 4 in section 2.4) dominated. Some ALS intensity features were selected in most cases. The proportion of aerial photograph features was greater when species-specific RMSEs were employed in the GA. In study area 1, nearly all aerial photograph features selected were various textural features based on grey level co-occurrence matrices (Haralick et al. 1973). In study area 2, spectral averages of grey values and various standard deviations were also included in the optimised feature sets. It was expected that the number of features needed would be greater in the case in which only laser features were given as the starting population, but that was not the case; there was no clear pattern (see Table 4). In some cases, more Haralick textural features or standard deviation-based features (bullet points 2 and 3 in section 2.4) were selected when only ALS features were available, but no clear pattern could be detected there, either. However, total amounts again required fewer features than did species-specific amounts.
The effect of reducing the number of features step by step with the genetic algorithm is demonstrated in Fig. 3. The objective was to minimise the RMSE of total biomass of study area 1. The best feature combination found from among the three 30-generation repetitions was used as input for a new round of repetitions. At the starting point, all laser and aerial photograph features were available. The generation in which the lowest RMSE of the repetition was found varied between 5 and 29. The figure shows that during the consecutive steps:
- The RMSEs become generally lower
- The difference between the lowest RMSE of the first generation and the lowest RMSE found in the full course of the repetition diminishes
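The GA-based feature selection can be illustrated with a simplified Python sketch. It is not the GAlib implementation used in the study: fitness-proportional selection, single-point crossover, carrying the best string forward unchanged, and the function names are all assumptions chosen to mirror the description (binary strings with each feature active with probability 0.5, crossover probability 0.8, mutation probability 0.01, leave-one-out k-nn RMSE as the criterion).

```python
import math
import random


def loo_rmse(features, values, mask, k=5):
    """Leave-one-out RMSE of a 1/d^2-weighted k-nn estimate using the masked features."""
    cols = [j for j, on in enumerate(mask) if on]
    if not cols:
        return float("inf")  # an empty feature set cannot be evaluated
    rows = [[row[j] for j in cols] for row in features]
    sq_err = 0.0
    for i, (x, y) in enumerate(zip(rows, values)):
        near = sorted(
            (math.dist(x, r), v)
            for j, (r, v) in enumerate(zip(rows, values)) if j != i
        )[:k]
        w = [1.0 / max(d, 1e-12) ** 2 for d, _ in near]  # guard against d = 0
        est = sum(wi * v for wi, (_, v) in zip(w, near)) / sum(w)
        sq_err += (y - est) ** 2
    return math.sqrt(sq_err / len(values))


def ga_select(features, values, pop_size=300, generations=30,
              p_cross=0.8, p_mut=0.01, seed=0):
    """Return (best_mask, best_rmse, best-so-far RMSE per generation)."""
    rng = random.Random(seed)
    n = len(features[0])
    # Each gene (feature) is active with probability 0.5 in the initial strings.
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    best_mask, best_score, history = None, float("inf"), []
    for _ in range(generations):
        scores = [loo_rmse(features, values, m) for m in pop]
        gen_best = min(range(len(pop)), key=scores.__getitem__)
        if best_mask is None or scores[gen_best] < best_score:
            best_mask, best_score = pop[gen_best][:], scores[gen_best]
        history.append(best_score)
        # Lower RMSE -> higher fitness -> more copies in the mating pool.
        fitness = [1.0 / (1e-9 + s) + 1e-12 for s in scores]
        pool = rng.choices(pop, weights=fitness, k=len(pop))
        children = []
        for a, b in zip(pool[0::2], pool[1::2]):
            a, b = a[:], b[:]
            if n > 1 and rng.random() < p_cross:
                cut = rng.randrange(1, n)  # single-point crossover
                a[cut:], b[cut:] = b[cut:], a[cut:]
            children += [a, b]
        mutated = [[(not g) if rng.random() < p_mut else g for g in c]
                   for c in children[:pop_size - 1]]
        pop = [best_mask[:]] + mutated  # assumption: best string survives unchanged
    return best_mask, best_score, history
```

The stepwise procedure described above would correspond to running `ga_select` a few times with different seeds, keeping only the feature columns flagged in the best mask, and repeating the runs on that reduced feature set until the RMSE no longer falls.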

Accuracies in the estimation of biomass and volume
Results were slightly better for total biomass than for total volume (see Tables 5 and 6). Species-specific estimation errors were large. In study area 1, the species had fairly even proportions in terms of the volumes and biomasses, which was reflected in the relatively uniform RMSE percentages. Taking the tree-species-specific estimates into account in the feature selection process reduced their error, but not much. Fig. 4 presents the residual biomasses (observed − estimated) in study area 1. Total biomasses have large errors in the case of the largest biomasses, whereas species-specific biomasses have large errors along the whole biomass distribution. In study area 2, the dominance of Norway spruce is reflected in its clearly lower RMSE compared with the other species and in the small effect of considering the species-specific errors in the feature selection. For Scots pine and other (deciduous) species, the residuals are large along the whole biomass distribution, as in study area 1 (Fig. 5). Branch biomass results were similar to total ABVG biomass results. In the case of Norway spruce, and in study area 2 also Scots pine, slightly lower relative RMSEs were obtained for branch biomass than for total ABVG biomass. Biases of total amounts were small in relation to the mean values. In study area 1, species-specific biases were also low, but in study area 2, the volume of other species and the biomasses of all species had some notable bias. When feature selection was based solely on total amounts, the biases were generally larger than when the tree-species-specific estimates were taken into account. In estimation with laser features only, slightly lower RMSEs were obtained for total ABVG biomass and total volume in study area 1 than with the use of both types of features. There the GA was not able to eliminate unnecessary aerial photograph features. In study area 2, combined feature sets produced lower RMSEs. In species-specific estimation, the aerial photograph features were useful in both areas (see Table 7).

Discussion
Airborne laser scanning was found to be a suitable method for biomass estimation and can also be used without supporting aerial photography. The accuracy of the aboveground biomass estimates was at a similar or slightly lower level than that achieved by, e.g., Kotamaa et al. (2010) in similar conditions, although in our study area 1 the number of field reference observations was lower, and in study area 2 the distribution of field variables was wider than in the study material used by Kotamaa et al. Concerning the nature of features selected for the estimation of biomasses or volumes, the following clear patterns were detected:
1. The ALS-based features dominated the final feature combinations.
2. Fewer features were needed for the estimation of total amounts as compared with species-level estimation.
3. Aerial photograph features were always useful when species-level estimation accuracy was of interest.
4. Of the aerial photograph features, the Haralick textural features especially were selected.
Despite the benefit of having aerial photograph-based features available when estimating species-level biomasses or volumes, the results with ALS-based features only were not far behind. Contrary to expectations, estimation with purely ALS-based features did not require a larger number of features, nor was the nature of selected ALS features different, compared with the cases when both remotely sensed data types were available.
For ABVG biomasses, lower relative RMSEs were obtained than for stem volumes. This seems to confirm the idea that laser features describe the canopy biomass better than they do the stem dimensions. It is noteworthy that we used stem volumes and biomasses obtained using models based on measurements of few tree variables. Furthermore, canopy width, a parameter appearing in more accurate biomass models, was not measured in the field in study area 2, and the height was measured from sample trees only, with the rest of the heights being based on modelling. One could ask: 'Shouldn't the results be more or less similar, as the same tree measurements were used both in biomass and in volume models?' However, the biomasses and volumes are not completely linearly correlated on the stands, as different tree species and tree development stages have different biomasses in comparison with stem volumes (see Fig. 6). The unexplained variation of the biomass models that were applied in this study (Repola et al. 2007) is significantly higher than that seen under the volume models (Laasasenaho 1982). From Laasasenaho's work (ibid.), the relative prediction error for Scots pine volume was 7-8%, whereas it was approximately 30-40% for, e.g., the foliage biomass (Repola et al. 2007). Some of the tally trees in both study areas considerably exceeded the diameter range of the sample trees used in the construction of the biomass models (Repola 2008; Repola 2009), but these account for a minor amount of the total tree mass (at approximately 100 trees in each study area).
In this study, we used biomass models that were published for the main tree species present in the study areas. In many cases, there are no models readily available for converting tree measurements to biomass weights, in which case it is necessary to harvest biomass samples from sample trees for laboratory analyses. If there is great variation in tree species, large samples covering different biomass fractions may be required for modelling forest biomass content (e.g., Nichol and Sarker 2011).
Despite the differences between the study areas, the relative estimation errors for total amounts of ABVG biomass and volume were very similar. Differences were observed in species-specific errors, where the errors were many times those seen for the total estimates. In both study areas, the dominant tree species showed the lowest estimation errors, as there is a larger number of suitable field plots in the nearest-neighbour estimation and thus finding a stand consisting of a 'correct' tree species is more likely. The relative error of the tree species biomass or volume was taken into account with a weight of 15% in the feature selection. Even with this low weight, a species obtaining very high relative RMSEs (typically one with a low percentage of the total volume) will probably have an unreasonably great impact on the features selected.
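The effect of the weighting can be made concrete with a small Python sketch of the weighted criterion (total 55%, each species group 15%). The relative RMSE values below are hypothetical and chosen only to illustrate the arithmetic; they are not results of this study, and the function names are our own.

```python
# Weights of the combined feature selection criterion as given in the text.
WEIGHTS = {"total": 0.55, "pine": 0.15, "spruce": 0.15, "other": 0.15}


def weighted_criterion(relative_rmse):
    """Weighted combination of relative RMSEs (per cent) minimised in the selection."""
    return sum(WEIGHTS[name] * relative_rmse[name] for name in WEIGHTS)


def contributions(relative_rmse):
    """Share of the criterion value that each component accounts for."""
    score = weighted_criterion(relative_rmse)
    return {name: WEIGHTS[name] * relative_rmse[name] / score for name in WEIGHTS}


# Hypothetical relative RMSEs (%): a minor species group with a very large error.
example_rmse = {"total": 23.0, "pine": 60.0, "spruce": 50.0, "other": 150.0}
example_score = weighted_criterion(example_rmse)
example_shares = contributions(example_rmse)
```

In this hypothetical case the minor species group contributes 22.5 of the 51.65 points of the criterion, which is more than the 12.65 points contributed by the total estimate despite the latter's 55% weight.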
In this study, the estimations were carried out at the sample plot level. Another option would be the estimation of biomass components for individual trees. The individual-tree approach typically requires higher laser pulse density than this area-based approach does, making the inventory procedure more costly. Furthermore, even with a considerably higher pulse density, not all trees would be recognised by an airborne sensor, because of tree canopies intermingling with each other or being hidden beneath larger trees.

Conclusions
The results of this study indicate that ALS in combination with aerial photography is well suited to the estimation of the variables related to the ABVG wood biomass. Since the same data are already in operational use for estimation of stem-related forest variables, the operational forest inventory could be similarly extended to ABVG biomass, through application of the same methodology. The resulting biomass maps should contribute greatly to the utilisation of forest biomass as a renewable energy source, as well as to the knowledge of forest carbon storage.

Fig. 1. The location of study areas 1 and 2 within Finland, and the spatial distribution of the field plots within the study areas.
Symbols used in Eqs. 3-6: $y_i$ = measured value of variable y on plot i; $\hat{y}_i$ = estimated value of variable y on plot i; $\bar{y}$ = mean of the observed values; n = number of plots.

Fig. 3. Lowering of the RMSEs during the steps taken in the genetic algorithm process. The objective was to minimise the RMSE of total ABVG biomass of study area 1. In each step, three 30-generation repetitions are made, after which the best feature combination is used as the starting point for the next step.

Fig. 4. Residuals of biomasses in study area 1. Species-specific biomasses were taken into account in the feature selection process.

Fig. 5. Residuals of biomasses in study area 2. Species-specific biomasses were taken into account in the feature selection process.

Fig. 6. Relationship of plot-level stem volumes with aboveground biomasses in the study areas.

Table 1. Forest statistics of study area 1 (average, maximum and standard deviation of values for the 263 sample plots used).

Table 3. Number of features selected in each case when both laser and aerial photograph features were available. Percentage of aerial photograph features is in parentheses.

Table 4. Number of features selected in each case when only laser features were available.

Table 5. Relative stem volume RMSEs and biases (% of mean values) for study areas 1 and 2, where the forest variables used in the objective criterion in the feature selection process were total volume and volumes of Scots pine, Norway spruce, and other tree species (values obtained with minimisation of only the RMSE of total volume in the feature selection process are in brackets).

Table 6. Relative ABVG biomass RMSEs and biases (% of mean values) for study areas 1 and 2, where both aerial photograph and laser features were used in the estimation and the forest variables used in the objective criterion in the feature selection process were total biomass and biomass figures for Scots pine, Norway spruce, and other tree species (values obtained with minimisation of only the RMSE of total biomass in the feature selection process are in brackets).

Table 7. Relative stem volume and ABVG biomass RMSEs and biases (% of mean values) for study areas 1 and 2. In this case, only laser features were used in the estimation, and the forest variables used in the objective criteria in the feature selection process were total volume or biomass and volume (biomass) figures for Scots pine, Norway spruce, and other tree species (values obtained with minimisation of only the RMSE of total volume or biomass in the feature selection process are in brackets).