Mapping Biomass Variables with a Multi-Source Forest Inventory Technique

Map form information on forest biomass is required for estimating bioenergy potentials and monitoring carbon stocks. In Finland, the growing stock of forests is monitored using multisource forest inventory, where variables are estimated in the form of thematic maps and area statistics by combining information of field measurements, satellite images and other digital map data. In this study, we used the multi-source forest inventory methodology for estimating forest biomass characteristics. The biomass variables were estimated for national forest inventory field plots on the basis of measured tree variables. The plot-level biomass estimates were used as reference data for satellite image interpretation. The estimates produced by satellite image interpretation were tested by cross-validation. The results indicate that the method for producing biomass maps on the basis of biomass models and satellite image interpretation is operationally feasible. Furthermore, the accuracy of the estimates of biomass variables is similar or even higher than that of traditional growing stock volume estimates. The technique presented here can be applied, for example, in estimating biomass resources or in the inventory of greenhouse gases.


Introduction
Adequate biomass resource assessments are needed for decision-making on the bioenergy options and climate policy.Industrial bioenergy users are interested in the spatio-temporal availability of the resources, while climate change research is focusing on the estimation of the net change of the carbon stocks in ecosystems.Both of these interest groups need precise up-to-date estimates of biomass and predictions for its temporal change.Adequate systems for monitoring carbon stocks of forests are also required for the implementation of the Reducing Emissions from Deforestation and Forest Degradation in Developing Countries (REDD) initiative.A pivotal requirement for the REDD initiative is a costeffective biomass mapping method that is also operationally feasible at the national level.
In Finland, the country-wide estimates of forest biomass based on the national forest inventory (NFI) data are reported on an annual basis to the United Nations Framework Convention on Climate Change (UNFCCC) following the given guidelines (IPCC 2003).The NFI-based biomass estimation methods have improved recently due to methodological changes, i.e. transition from conversion factors (Karjalainen andKellomäki 1996, Lehtonen et al. 2004) to national tree-level biomass equations (Repola et al. 2007).In bioenergy research a common technique to estimate biomass and energy potential of forested areas is to link roots and harvesting residues to stem volumes assessed in forest inventories.This method produces coarse estimates at the municipal level, but is insufficient when spatially explicit bioenergy potential is needed.
Forest inventory research has used earth observation data for the stratification of inventory areas for more efficient sampling, and for producing estimates of forest resources in the form of area statistics and thematic maps.These maps have been produced using multi-source forest inventory technique by combining data from field sample plots, digital maps and satellite images (e.g.Tomppo 1993).In order to obtain field measured biomass components at the plot or standlevel, which are combined with the satellite-image information, previous research has used biomass expansion factors (BEFs) and simple biomass equations to convert stem volume estimates to biomass estimates (Baccini et al. 2004, Muukkonen andHeiskanen 2005).The disadvantage of the biomass expansion factors is that they are estimated at the national-level and are therefore insensitive to the structural variation of forest stands.Due to the insensitiveness of BEFs to the form variation of the stand diameter distributions or within-stand variation in tree crown dimensions, stands with similar stem volume estimates will get similar biomass estimates, although the true biomass values depend on the size distribution and crown dimensions of individual trees.
The aim of the present study was to demonstrate an advanced biomass mapping method based on the state-of-the-art ground truth reference data, and to test the accuracy of biomass estimates by cross-validation.Recently developed mixed models were used for generalizing sample tree heights and crown ratios over tally trees of the 10th national forest inventory (NFI10) data (measured in 2004-06).The data from the NFI10 were used as an input to tree-level biomass equations, which made it possible to obtain biomass estimates for each NFI plot.Plot-level biomass estimates were used in combination with satellite images and digital maps to produce thematic maps for the forest biomass over the study area.

The Study Area and the NFI Data Set
The study area was the Forestry Centre of Central Finland (Keski-Suomen Metsäkeskus).The Forestry Centres are regional administrative units, whose tasks are, among others, supervision of forest laws and the guidance of forest owners.The Forestry Centre of Central Finland is located approximately between 61°25´-63°35´ N and 24°00´-26°45´ E. The study area lies within the southern boreal vegetation zone except the northwestern part of the area which belongs to the middle boreal zone (Ahti et al. 1968).
We used tree-level data from the NFI to predict tree-wise components of biomass (foliage, branches, stem, aboveground and belowground biomass).The NFI field plots located on non-forest land, such as agricultural or built-up areas, were excluded from the data.Furthermore, only plots where the centre compartment covered no less than 90% of the plot's area were included into the analysis, i.e. plots divided into smaller compartment units were excluded.As a result, the data used in the analysis comprised 1645 relascope sample plots on forest land and poorly productive forest land.
The field data were measured between 2004 and 2006.Diameter at breast height and tree species were measured for each relascope-sampled tally tree, whereas every seventh tree was treated as a sample tree.The variables measured for sample trees included, for instance, the total tree height and height to the living crown.The map illustrating the location of the study area and the field sample layout within the study area is presented in Fig. 1.
The main tree species in the study area were Scots pine (Pinus sylvestris), Norway spruce (Picea abies) and birch (Betula pendula and B. pubescens).The remaining species were assigned to the following groups in order to reflect the biomass equations available: Scots pine (Pinus contorta, Larix sp.; 46%), Norway spruce (Juniperus communis and other spruce species; 31%) and birch (Populus tremula, Alnus incana, Alnus glutinosa, Sorbus aucuparia, Salix caprea and other broad-leaved species; 23%).Trees with a diameter at breast height of less than 1 cm were removed.The characteristics of the field material are presented in Table 1.

Satellite Imagery
Five Landsat 5 TM image scenes were used in this study for covering the Central Finland study area.They were as follows: -path 190, row 16-18, date July 17 2006 -path 191, row 15-16,   The images were georeferenced to the Finnish uniform coordinate system (yhtenäiskoordinaatistojärjestelmä, YKJ, in Finnish) on the basis of topographic maps, and the pixel size was re sampled to 25 m.In georeferencing the images, a second-degree polynomial model was applied aiming at maximum value of 0.5 pixel for the combined geometric error of pixel rows and columns.Land use classes other than forestry land were masked out using digital map data.The digital numbers of satellite channels 1 to 5 and 7 were used in the estimation procedure.

Predicting Heights and Crown Ratios for Tally Trees
In the case of the Finnish National Forest Inventory (NFI) data, diameter at breast height is obtained for every relascope-sampled tree on each plot, while other tree characteristics are measured only for a subsample (every 7th tally tree) (Tomppo 2006).Total tree heights and heights to the living crown base were, however, needed for all trees in order to increase the accuracy of biomass predictions (Repola et al. 2007).
The generalization of sample tree characteristics over the tally trees in the NFI dataset was implemented using a multivariate parametric predictor for total height and crown ratio, i.e. the ratio of live crown length to tree height (Eerikäinen 2009).Due to the hierarchical correlation structure of the data (clusters -forest stands -sample trees) the basic assumption of uncorrelated error terms did not hold, and therefore a mixed modeling technique with mixed-effects models of the linear type was employed.Thus, it also became possible to localize the generalization models of sample tree characteristics in accordance with linear prediction theory (Searle 1971(Searle , 1987;;Henderson 1975;Lappi 1986Lappi , 1991)).
The fixed parts of the linear mixed-effects models in the simultaneous system consisted of both tree and stand-level independent variables such as diameter at breast height, indicators for tree species-specific effects, measurable stand characteristics and site quality indicators.The random parts of the models included parameters for cluster and stand effects along with components for tree species-specific effects.The Best Linear Unbiased Predictor (BLUP) -based localization of these models is applicable if either of the two sample tree characteristics is available for at least one sample tree with known (measured) independent variables (see Lappi 1991).

The Biomass Equations
In order to obtain biomass estimates for the relascope sampled trees of the NFI data we used equations by Repola et al. (2007); models based on crown components including diameter at breast height, height and living crown length were applied as independent variables to determine foliage and branch biomasses.Stem biomasses, by contrast, were predicted using the stem wood density models by Repola et al. (2007), containing diameter at breast height, tree age, temperature sum as independent variables, and stem volume functions by Laasasenaho (1982) for pine, spruce and birch based on height and diameter at breast height.The total aboveground biomass was obtained by combining the model-derived biomass estimates for foliage, branches and stem.
The belowground biomass for spruce and birch was computed by combining the biomasses of stump and roots thicker than 1 cm.However, for estimating the belowground biomass of Scots pine, the functions by Petersson and Ståhl (2006) were used.Finally, individual biomasses of trees were summed up at the sample plot-level.For this purpose inclusion probability of tally-and sample trees was estimated on the basis of the applied relascope factor, which is 2 for the NFI set of data from Central Finland, and the distance of the tree from the plot centre.
In this study, the volume and biomass variables (applied model in parentheses) calculated at the plot-level were as follows: 1) stem volumes (Laasasenaho 1982) for the groups of Scots Pine, Norway Spruce and Birch, respectively, and for all trees; 2) biomass of stem (Laasasenaho 1982& Repola et al. 2007) for the groups of Scots Pine, Norway Spruce and Birch group, respectively, and for all trees; 3) biomass of living branches (Repola et al. 2007) for the groups of Scots Pine, Norway Spruce and Birch, respectively, and for all trees; 4) biomass of needles (Repola et al. 2007) for the groups of Scots Pine, Norway Spruce and Birch, respectively, and for all trees; 5) total aboveground biomass (Repola et al. 2007) for the groups of Scots Pine, Norway Spruce and Birch, respectively, and for all trees; and 6) total belowground biomass for Scots Pine group (Petersson and Ståhl 2006), Norway Spruce group (Repola et al. 2007) and Birch group (Repola et al. 2007), and for all trees.

k-Nearest Neighbour Estimation Method
The k nearest neighbour (k-nn) method (e.g.Kilkki and Päivinen 1987, Muinonen and Tokola 1990, Tomppo 1990) was applied for producing estimates of biomass variables in the form of thematic maps covering the area of Central Finland.The k-nn method is based on the assumption that NFI sample plots having similar biomass characteristics also have similar satellite image spectral features, i.e. are located near each other in the n-dimensional image feature space, where n represents the number of satellite image channels.The k nearest neighbours were determined by the Euclidean distances between the observations in the satellite image feature space.Since the satellite reflectance of a stand depends on the season and the atmospheric and illumination conditions and is, therefore, image scene-specific, the k-nn selection was limited to the sample plots located in the same satellite image as the target pixel.The generated biomass map was a mosaic of the k-nn estimations of individual satellite image scenes.
The satellite image channels were weighted on the basis of cross-validating the volume estimates of the tree species groups (pine, spruce, birch and other broadleaved), and the same weights were applied to the biomass variables.The parameters were optimized for minimizing the RMSE values and biases of stem volumes by different tree species.In optimizing the estimation parameters, Norway spruce and Scots pine were weighted more than deciduous tree species because of their better representativeness in the study area.Initially, all the field plots covered by the satellite images were utilized when computing the parameters for the k-nn estimation procedure, such as the value of k or the weights of the satellite image channels.
The estimates of biomass variables were calculated as weighted averages of the measured biomass variables of the k nearest neighbors (i.e.NFI plots).The k nearest neighbors were weighted by their inverse squared Euclidean distances in the feature space (Eq.1).This method reduces the bias of the estimates (e.g.Altman 1992).The value of k varied depending on the number of NFI plots covered by a satellite: k=3 was applied with satellite images covering the smallest total number of NFI plots, whereas k=5 was applied to the remaining satellite images.

ˆ( )
where ŷ = estimate for variable y y i = measured value of variable y in the ith nearest neighbor plot d i = Euclidean distance to the ith nearest neighbor plot k = number of nearest neighbors Several studies have shown that when a large number of field plots are available for k-nn estimation, increasing the number of nearest neighbours (value of k) generally improves the accuracy of the (plot-level) estimates (e.g.Tokola et al. 1996, Nilsson 1997, Franco-Lopez et al. 2001).On the other hand, the value of k is a trade-off between the accuracy of the estimates and the variation in the original field material that is retained in the estimates.The greater the value of k, the more averaging occurs in the estimates.Thus, for example Franco-Lopez et al. (2001) have suggested k = 1 for the purpose of map production in order to retain the full variation of the field data in the estimates.

Cross-validating the k-nn Estimates
The biomass estimates were tested using a leaveone-out cross-validation technique, in which each sample plot in turn was left out from the set of reference plots and estimated with the aid of the remaining plots.The estimates of the field plots were compared with the corresponding values of the plots that were modelled on the basis of field measurements, thus producing a measure of their accuracy as root mean square errors, RMSE (Eq.2), as well as their biases (Eq.3).

Results
The cross-validation results for the biomass variable estimates indicate that the methodology for producing satellite image-based biomass maps is well suited for the task.The estimation accuracy of biomass variables was generally similar or even better than what typically is achieved with traditional growing stock related stand variables.
The results are presented separately for peatlands and mineral soils, because the peatland mask was applied in the classification and, therefore, the estimation of peatlands was carried out utilizing different set of field plots than with mineral soils.The cross-validation results are presented for satellite images 188/15-17 and 190/14-16 acquired in September 2, 2005 and August 31, 2005, respectively.These two images covered almost all of the sample plots in the study area, 1555 sample plots out of the total of 1645, and part of those were covered by both images.The remaining images were used for covering the cloud areas and gaps at the edges of the study area.The image 188/15-17 covered 968 field plots on mineral soil sites and 198 field plots on peatland sites.Image 190/14-16 had 715 field plots on mineral soil sites and 223 field plots on peatland sites.For these satellite images the number of k used in the estimation was 5 The cross validation results of stem volume, aboveground and belowground biomass components were obtained for all tree species and by species, respectively.The relative RMSE values obtained for total volumes and biomasses were of the same magnitude.However, lower relative RMSE values were obtained for the branch and foliage biomasses compared to the stem volumes.
The results demonstrate very clearly that it is possible to estimate biomasses as precisely as stem volumes by satellite image based estimation.The RMSE values for different tree species groups were very high; in many cases the relative RMSE was over 100%.The estimation results are presented for two main satellite images of the research area in Tables 2a and 2b.
Thematic maps for the estimated aboveground and belowground forest biomasses (tons/ha) are illustrated in Figs.2a and 2b, respectively.

Discussion
When validation results of this study were compared to the results reported in other studies with similar data sources, the absolute and relative RMSE values obtained for the volume estimates by tree species were of the same magnitude (e.g.Katila and Tomppo 2001).Moreover, the relative RMSEs for stem volume and biomass were also of the same magnitude in this study.However, the relative RMSEs obtained for the branch and foliage biomasses were lower than that of volume.
In the case of boreal forests, the digital numbers of the satellite image channel values are mainly based on the reflectance from: 1) the sun-illuminated parts of the tree crowns, 2) the shadowed sections of tree crowns, and 3) the ground (i.e.gaps in the crown canopy cover).The reflectance of a forest stand is highly dependent on its structural properties, such as the size, shape, volume and density of the tree crowns, size and angle of the leaves or needles, and the canopy closure (e.g.Asner 1998, Rautiainen. et al. 2004, Lang et al. 2007).Furthermore, the reflectance is also affected by the illumination conditions and the sun zenith angle; for example, in boreal areas where the solar zenith angles are typically low, the forest undergrowth is often shadowed by larger trees (e.g.Chen and Leblanc 1997).Thus, stands with similar amounts of biomass or growing stock volumes often have different reflectance characteristics because of the aforementioned factors.
A high correlation between the measured canopy size, the stand density characteristics and the spectral characteristics of the satellite images could explain why the biomass variables, especially the biomass in foliage and needles, seem to be estimated more accurately than stem volumes.On the other hand, there is no direct interdependence between the stem volume variables and the canopy size characteristics, and thereby neither between the stem volume variables and the satellite image pixel values.Thus, the better estimation results obtained for the aboveground biomass variables were a logical outcome.
The belowground biomass variables, on the contrary, have very little direct influence on the reflectance of a certain forest stand in a satellite image.Thus the successful estimation of belowground biomass mainly depends on the performance of the biomass models and the correlation between the biomass variables and the measured tree size characteristics.Furthermore, in the satellite image interpretation phase, the estimation success depends on the correlation between the biomass variables and the crown canopy characteristics.
The thematic maps rely on the NFI measurements and also on the empirical biomass models.The unexplained variation of the biomass models (Repola et al. 2007) is higher compared to that of the volume models (Laasasenaho 1982).This, in turn, leads to the fact that there is more averaging in the biomass estimates than in the stem volume estimates.According to Laasasenaho (1982), the relative prediction error for Scots pine volume was 7-8%, while it was approximately 30-40% for the foliage biomass when estimated using a similar methodology (see Repola et al. 2007).
However, the biomass data of this study were available only for the field plots within the area of the Forestry Centre of Central Finland, and therefore the results of the study cannot be deemed to be completely representative for larger geographical areas.
The estimation results for the k-nn -derived biomass components on mineral soils indicate higher precision for the groups of pine and spruce (Tables 2a and 2b) when compared to those of the birch group.In the peatland set of data, however, the reliability figures for the spruce group showed highest uncertainty in the k-nn -based biomass estimates.Even if the reliability characteristics do not unambiguously reveal the superiority of estimates obtained for the two groups of coniferous species, it is possible that the seasonal changes in foliage biomass may have affected, especially, the reliability of biomass estimates for the birch group (Kobayashi et al. 2007, Rautiainen et al. 2009).In this case, however, the seasonal factors affecting the results should not be overestimated since the satellite images were acquired between 5th July  and 2nd September, meaning that the optical imagery originated from the main growing season when the foliage of deciduous trees has attained its physiological maturity with high assimilation capacity (Rautiainen et al. 2009).Moreover, as noted in chapter 2.5, the k-nn estimation was implemented separately for each image scene.It is therefore concluded that possible errors in the biomass estimates for the birch group caused by the seasonal variation in foliage reflectance are only minor and do not violate the generality and relevance of the results.
The results of this study support the assumption that canopy dimensions are well correlated with the amount of biomass.Thus, the plot-level biomass estimates can be used as input data when the k-nn -based multi-source forest inventory techniques are applied to the estimation of biomasses.The map form biomass estimates, in turn, should reflect other stand-level attributes such as tree size distribution, competition and canopy structure.
Spatially explicit biomass maps provide solid background information for estimating both the technical and economic potential of bioenergy.The improved information about the location and quantity of forest bioenergy is valuable due to the notable variation in the earlier estimates obtained for these resources.These maps also provide valuable information when optimal locations for energy plants are explored.In the case of the greenhouse gas inventory and the research on climate change, these maps can be used, for instance, for deriving estimates for gross primary production and soil carbon change.
The approach presented here was based on the Landsat images and prediction models for treelevel characteristics, such as height, crown ratio, stem volume and biomass.This methodology could also be applicable to the tropics to meet the monitoring demands related to deforestation and the REDD initiative related monitoring demands.
The preparation of digital, wall-to-wall maps for the forest biomass requires remote sensing imagery, additional biomass models and geographically representative ground reference data.The costs caused by inventories in the field are high, whereas other costs related to map production are relatively low.It is also important to bear in mind while judging the general applicability of the present approach that the field inventory procedures applicable to the boreal region may not be technically appropriate in other conditions.In that case the model-based estimation procedure used in this study that was used to obtain biomasses by plot needs to be carefully verified.Future studies should also search for new and more accurate methods for the tree species detection.In addition, the applicability of the present technique for the estimation of tree and biomass increment should be studied in more detail in the future.

Fig. 1 .
Fig. 1.The location of the study area (Forestry Centre of Central Finland, left), and field sample plots and land use in the study area (right).

Table 1 .
The average and maximum values of forest variables for the field plots in the study area.

Table 2a .
The estimation results for patched images 188/15-17.Units for the volumes and biomasses are m 3 /ha and tons/ha, respectively.

Table 2b .
The estimation results for patched images 190/14-16.Volume units are m 3 /ha and biomass units tons/ ha.