Modeling the Relationships among Internal Defect Features and External Appalachian Hardwood Log Defect Indicators

As a hardwood tree grows and develops, surface defects such as branch stubs and wounds are overgrown. Evidence of these defects remains on the log surface for decades and in many instances for the life of the tree. As the tree grows, the defect is encapsulated or grown over by new wood. During this process the appearance of the defect in the tree's bark changes. The defect becomes flatter and its dimensions change. This progressive change in appearance is predictable, permitting the size and location of the internal defect to be reliably estimated. This paper concerns the development and analysis of models for the prediction of internal features. With the advent of surface scanning and external detection systems, the prediction of internal features promises to significantly improve the quality, yield, and value of sawn wood products.


Introduction
Accurate information about the size, shape, and location of internal hardwood log defects is the key to dramatically improving the value and quality of sawn lumber. Historically, the research effort examining the relationships among external indicators and internal defect features has been sporadic. This is surprising given that external log defects provide clues about internal defect features. Given the advent of computerized log scanning systems able to detect defects on the log surface (Thomas et al. 2006), there is more interest in developing reliable internal defect prediction models. The implementation of these models in conjunction with scanning and detection software promises to provide internal defect information using simple laser surface scanning equipment.
Schultz (1961) studied German beech and found that the ratio of the bark distortion width to distortion length equaled the ratio of the stem diameter when the branch stub was encapsulated to the current stem diameter. However, for species with heavier, irregular bark, like hard maple, Schultz found that it was difficult to judge the clear area above the defect using this method. Similarly, Shigo and Larson (1969) discovered that for many hardwoods the ratio of defect height to width indicates the depth of the defect with respect to the radius of the stem at the defect (Fig. 1).
Hyvärinen (1976) used sugar maple (Acer saccharum) defect data collected by Marden and Stayton (1970) to examine the relationships of external indicators to grain orientation and defect encapsulation depth. The sugar maple defect data were collected from 44 trees obtained from three sites in upper Michigan. Hyvärinen found strong correlations among encapsulation depth and bark distortion width, length, and rise using linear regression methods. The best simple correlation (r = 0.66) was with diameter inside bark (DIB), which had a 0.66-inch standard error of estimate. The final model, fitted by stepwise multiple linear regression, showed a strong correlation (r = 0.74) between encapsulation depth and the combination of DIB and distortion length.
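The ratio rule attributed to Schultz above can be written as a short calculation. The sketch below is illustrative only: the function names are hypothetical, the input values in the usage note are made up, and the conversion from diameters to a clear-wood thickness assumes concentric growth, which the source does not state.

```python
def diameter_at_encapsulation(distortion_width, distortion_length, current_diameter):
    """Stem diameter when the branch stub was overgrown, using the ratio rule:
    width / length = diameter at encapsulation / current diameter."""
    if distortion_length <= 0:
        raise ValueError("distortion_length must be positive")
    return current_diameter * distortion_width / distortion_length


def clear_wood_thickness(distortion_width, distortion_length, current_diameter):
    """Radial thickness of wood laid down since encapsulation.

    Assumption (not from the source): growth is concentric, so the clear
    layer is half the difference between the two diameters.
    """
    d_enc = diameter_at_encapsulation(
        distortion_width, distortion_length, current_diameter
    )
    return (current_diameter - d_enc) / 2.0
```

For a hypothetical distortion 60 mm wide and 120 mm long on a 400 mm stem, the rule gives a 200 mm stem at encapsulation and, under the concentric-growth assumption, about 100 mm of clear wood over the defect.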
Perhaps the most in-depth examination of the external/internal defect relationship was conducted in Canada on a softwood species. Lemieux et al. (2001) conducted a similar study using 21 black spruce trees (Picea mariana) collected from a natural stand 75 km north of Quebec City. Three trees, each with three logs, were selected, from which a total of 249 knot defects were dissected and their data recorded. It is interesting to note that the researchers found better correlations between external indicators and internal features on the middle and bottom logs than on the upper logs. Strong correlations (r > 0.89) were found among the length and width of internal defect zones and external features such as branch stub diameter and length. The researchers modeled the defects using three distinct zones corresponding to how the penetration angle changes over time in black spruce. The penetration angle is the angle at which a line through the center of the defect intersects the log surface. The Lemieux et al. study examined only branches that had been neither dropped nor pruned, thus preventing an examination of encapsulation depth.
The largest known study examined correlations among external defect size and type populations and their impact on the grade or quality of the lumber sawn from those logs. This study examined thousands of logs and several hardwood species. The results from this study were used to develop the hardwood tree and log grading rules (Rast et al. 1973). The correlation of external indications to internal defects was limited to relating log grade to lumber grade yield (Hanks 1976). Although useful, this study only grossly considered the relationships among external and internal defects.
X-ray/CT (computed tomography) has been the traditional research path to determining internal defect structures in recent years (Chang 1992, Thawornwong et al. 2003). This research has typically used medical CT scanners modified for industrial use. However, CT scanning technology remains expensive, and its data acquisition is slower than mill speeds. Further, the technol-
This research takes an alternative approach to CT scanning for determining internal log defect characteristics. Like the earlier studies by Lemieux and Hyvärinen, this research seeks to estimate internal defect characteristics based on external indicator measurements. The scope of this study was larger than the earlier ones in that more samples were collected and the model sought to predict more internal features for a greater variety of defect types.
In an earlier study the relationships among the most commonly occurring yellow-poplar defects and their internal features were examined (Thomas 2008). Using a series of multiple linear regression analyses, that study quantified the relationship among external defect indicators and internal defect characteristics for yellow-poplar (Liriodendron tulipifera) logs. Based on these results, models were developed to predict internal features (penetration angle, penetration depth, midpoint cross-section diameters, and encapsulation depth) using visible external features (log diameter; indicator width, length, and rise). Good correlations and small prediction errors were observed with sound (sawn), overgrown, and unsound knot defects. For less severe defects such as adventitious buds/clusters and distortion type defects, weaker correlations were observed, but the magnitude of the prediction errors was small and acceptable (Thomas 2008).
However, that research dealt with a single species and did not address clustered knot defects due to an insufficient number of samples. Clustered knot defects in general are larger and are associated with more severe internal defect features. This paper determines whether data from multiple clustered (unsound, sound, and overgrown) knot defect types can be pooled and used to develop accurate internal prediction models. In addition, this paper examines defects in red oak (Quercus rubra), one of the most important commercial hardwood species in the Appalachian forest.

Sample Collection
Red oak and yellow-poplar defect samples were collected from three sites in the central Appalachian forest in North America (Fig. 2). Fig. 2 also shows the native ranges of red oak and yellow-poplar (Little 1971). Site 1 was located in northern West Virginia at an elevation of approximately 700 m. Site 2 was located in southern West Virginia at an approximate elevation of 790 m. Site 3 was located between these sites at an elevation of 975 m. Site 2 was the southernmost and also had the lowest elevation. The winters at this site are significantly milder than those of sites 1 and 3. Further, site 2 was on an abandoned farm with very good soils. Site 1 was the northernmost and was on very rocky soil. Site 3 was located approximately between sites 1 and 2 and was at the highest elevation. Table 1 lists the number and types of defects collected from each site by species.
A species sample from a site consisted of a random selection of 33 trees. The goal was to collect four defects of each type from each tree whenever possible. For example, if there were eight sound knots on a tree, every second sound knot was selected. Not all trees had four defects of each type. In other cases, selecting one defect would prevent another from being selected due to defect overlap. Whenever this occurred, the defect that was least common on the tree was given preference and collected, and a different occurrence of the other defect was collected.
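The within-tree selection rule described above (four defects of a type per tree, taking every second knot when eight are present) resembles systematic sampling. The following is a minimal sketch under assumptions: the function name is hypothetical, and both the choice of which knot in each pair is taken and the handling of counts that are not multiples of four are guesses, since the source gives only the eight-knot example.

```python
def select_defects(defects, target=4):
    """Pick up to `target` defects spread evenly along an ordered list.

    Assumption: defects are ordered by position on the stem, and the
    last item of each group of `step` defects is the one collected.
    """
    n = len(defects)
    if n <= target:
        # Not enough defects of this type: collect all of them.
        return list(defects)
    step = n // target
    # With n == 8 and target == 4, step is 2, so every second knot is taken.
    return [defects[i] for i in range(step - 1, n, step)][:target]
```

The overlap rule in the text (preferring the least common defect type when two candidates overlap) is not modeled here.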

Sample Processing
All defects were identified using the methodology defined by Carpenter et al. (1989). Once a defect was identified, the section containing it was cut from the log. The average length of the defect sections was approximately 300 mm. The sample was discarded if it was found during slicing that part of the interior defect was not completely contained in the section. An alignment groove was cut into the top of the section in a line between the indicator and the pith. This provided a smooth area for ring counts and a reference point for collection and position of the defect cross-sections. An identifying tag was stapled to the section with the surface indicator, and all sections were photographed (Fig. 3).

Statistical Methods
Within each species the defect data from different sites were combined and grouped by defect type.
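A series of chi-square tests was used to test for outliers within each defect type grouping (Komsta 2006). Any data identified as an outlier were examined and corrected if in error. The R statistical analysis program was used to perform stepwise multiple linear regression analyses on the grouped defect data (R Development Core Team 2006). Within each species/defect group, the data were randomly partitioned into two sets. The first set contained approximately two-thirds of the data and was used for model development. The second set held the remaining one-third of the data and was used for model testing and validation. The model testing samples were used to analyze the model's predictive capabilities. The regression equations generated with the model development samples were used to predict internal feature measurements. The correlation coefficient r, the mean absolute error (MAE), and the significance level of the correlation were determined for each defect type and feature combination (Table 2).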
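As an illustration of the outlier screening step, the sketch below tests the single most extreme value in a sample against a chi-square distribution with one degree of freedom. It assumes the test takes the form (x - mean)² / variance with the sample variance standing in for a known population variance; the source cites Komsta (2006) but does not give the formula, so this form and the function name are assumptions. The measurements in the usage note are invented.

```python
import math
import statistics


def chi_square_outlier(values):
    """Return (suspect value, statistic, p-value) for the most extreme point."""
    mean = statistics.fmean(values)
    variance = statistics.variance(values)  # sample variance (n - 1 denominator)
    # The suspect is the observation farthest from the sample mean.
    suspect = max(values, key=lambda v: abs(v - mean))
    stat = (suspect - mean) ** 2 / variance
    # Upper tail of a chi-square with 1 df: P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return suspect, stat, p_value
```

For example, `chi_square_outlier([10, 11, 9, 10, 12, 10, 11, 40])` singles out the value 40. In the study, flagged values were examined and corrected if in error rather than dropped automatically.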
The independent variables used were surface indicator width (SWID), length (SLEN), rise (SRISE), and log diameter at the defect (DIB). These variables were selected because they are easily measured when the log is inspected or laser scanned (Thomas et al. 2006). AREA (SWID * SLEN), SLEN², SWID², VOLUME (SWID * SLEN * SRISE), and all possible interaction term combinations with DIB also were tested for any correlation with internal feature measurements. The dependent variables representing the internal features for which prediction models are sought were encapsulation depth (CLEAR), penetration depth (DEPTH), penetration angle (RAKE), and cross-section width and length at the penetration depth midpoint (HWID and HLEN). Using these five variables, an approximate internal structure of a log defect can be determined (Fig. 4).
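The candidate predictor set described above can be assembled from the four measured variables. The sketch below is one plausible reading: it treats "all possible interaction term combinations with DIB" as the product of each non-DIB term with DIB, which is an assumption, and the function name and the `NAME:DIB` key convention are hypothetical.

```python
def candidate_predictors(swid, slen, srise, dib):
    """Build the candidate regression terms for one defect sample."""
    terms = {
        "SWID": swid,
        "SLEN": slen,
        "SRISE": srise,
        "DIB": dib,
        "AREA": swid * slen,
        "SLEN2": slen ** 2,
        "SWID2": swid ** 2,
        "VOLUME": swid * slen * srise,
    }
    # Assumed reading of the interaction terms: every non-DIB term times DIB.
    for name, value in list(terms.items()):
        if name != "DIB":
            terms[f"{name}:DIB"] = value * dib
    return terms
```

With this reading, each sample yields 15 candidate terms, from which a stepwise procedure would retain only those that improve the fit.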

Results
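The severe defect types (overgrown, sound, and unsound knots and knot clusters) have the greatest impact on log value and utility. For analyses and discussion of the less severe defect types, refer to earlier work by Thomas (2008). Correlation results for model development and testing for severe defects are presented in Table 2. The table is broken down into two main sections, red oak and yellow-poplar. Within each section, the results for model development and testing are listed by defect type. The significance level used to evaluate all correlations, for both development and testing, was α < 0.01. The mean absolute error (MAE) column reports the mean of the absolute value of the residual errors for the fitted equations. MAE indicates the +/- range that can be expected using the fitted equation to predict internal features.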
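The evaluation statistics used in this study (a random two-thirds/one-third partition, the mean absolute error, and the correlation coefficient between observed and predicted values) can be sketched as follows. This is a minimal illustration, not the analysis code used in the study, which was run in R; the function names, the fixed seed, and the rounding rule for the partition size are assumptions.

```python
import math
import random


def split_development_testing(samples, seed=0):
    """Randomly partition samples into about 2/3 development and 1/3 testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_dev = round(len(shuffled) * 2 / 3)  # assumed rounding rule
    return shuffled[:n_dev], shuffled[n_dev:]


def mean_absolute_error(observed, predicted):
    """Mean of the absolute residuals between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

With this rounding rule, a group of 55 samples leaves 18 for testing, which matches the count the Sound Knots section reports for red oak.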

Overgrown Knots
The numbers of overgrown knot defect samples for the two species were about the same: 159 red oak samples compared to 163 yellow-poplar samples. All model development and model testing correlations were significant (α < 0.01). Overall, the model development results of both species were similar. Correlations (multiple adjusted R²) among external indicator measurements for HWID, HLEN, and RAKE features differed little among species (Table 2). However, the correlation of DEPTH to external features was much stronger with yellow-poplar than with red oak, with R² values of 0.76 versus 0.45. In addition, with model development the mean absolute error (MAE) was smaller for all variables with yellow-poplar.
The model testing correlation coefficients (R) were much stronger with yellow-poplar for the HWID, HLEN, and DEPTH features. However, the correlation of RAKE with external indicator measurements was stronger with red oak, R = 0.60 versus R = 0.46. The MAE was smaller with yellow-poplar for most variables: HLEN, RAKE, and DEPTH. Overall, however, the differences in the mean absolute prediction errors were smaller than those observed with model development.
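The model development results are reported as adjusted multiple R² values. The source does not state the formula; the sketch below uses the standard definition, which penalizes R² for the number of predictors retained in the model. The function name and the numbers in the usage note are illustrative only.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Standard adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / dof
```

For instance, a raw R² of 0.5 from 21 samples and 4 predictors adjusts down to 0.375. The penalty grows as the sample shrinks, which is one reason small defect groups benefit from pooling.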

Sound Knots
All yellow-poplar model development and testing correlations were significant. In addition, all red oak model development correlations also were significant. However, only correlations with HLEN for red oak were significant with the model testing data set. The main cause of this is that there were only 55 red oak sound knot samples, compared to 72 yellow-poplar samples. When the red oak sample was divided into model testing and development sets, only 18 samples were available for model testing. In response to the sample limitation for red oak, the data for sound, unsound, and overgrown knots were grouped together. The analysis of the grouped data is discussed in the All Knots Grouped section.
Strong correlations for all internal features were found with the yellow-poplar defect data. The model development adjusted multiple R² values varied from 0.63 to 0.73. In addition, the mean absolute error values ranged from a low of 7.9 mm with HWID to a high of 14.5 mm for HLEN. Model testing results for yellow-poplar sound knots showed similarly strong correlations, with R values ranging from a low of 0.71 with RAKE to a high of 0.87 with HLEN. The MAE values for the predicted feature measurements ranged from 14.2 mm with DEPTH to 20.8 mm for HLEN. RAKE had a mean absolute error of 13.8 degrees.

Unsound Knots
Overall, some of the weakest correlations were observed with the unsound knot samples. There are two reasons for this. First, by definition an unsound knot contains rot, the extent of which is not predictable based on external features. Second, there were fewer unsound knots, 61 red oak and 39 yellow-poplar samples, than overgrown and sound knots. Thus, the number of samples for model testing with both species was low. Predictably, in the model testing results only one variable of each species was significantly correlated with external features. In an attempt to develop and test a model for predicting unsound knots, the unsound knot samples were grouped with the sound and overgrown knot samples. The grouped knot data are analyzed in the next section.
Despite the difficulties described above, the model development correlations (multiple adjusted R²) for both species were all significant (α < 0.01). With the exception of RAKE (adjusted multiple R² = 0.36), all correlations for yellow-poplar unsound knots ranged from 0.65 to 0.70. RAKE also had the lowest correlation with red oak (adjusted multiple R² = 0.54). Correlation coefficients (adjusted multiple R²) for the other features ranged from 0.63 to 0.87.

All Knots Grouped
Grouping the samples from the overgrown, sound, and unsound knots generated large sample sizes for both species: 275 red oak and 274 yellow-poplar samples. All model development and model testing correlations for both species were significant (α < 0.01). Comparisons between results from the grouped knot samples and the individual results varied by species and knot type.

Yellow-Poplar
For yellow-poplar model development, correlations with the grouped knot data were slightly weaker than those discovered with the pure sound and unsound knot data sets. However, correlations with the grouped knot data were slightly stronger than those discovered with the single overgrown knot data set (Table 2). Further, the mean absolute error measurements of the grouped data were often comparable to the smallest MAE value of the best pure data sample analysis, regardless of knot type. Thus, grouping the data had little overall impact on the accuracy of the predicted internal features during model development.
With the model-testing subset, most correlations with the prediction equations were stronger with the grouped knot data than with most of the pure overgrown and unsound knot samples. For all sound knot internal features, better correlations were realized using the pure sound knot defect data. In addition, the MAE values of the grouped knot data for most internal values were less than or approximately the same as those observed with the single overgrown, sound, and unsound knot data sets.

Red Oak
The correlations of external feature measurements to internal feature sizes for the grouped knot data were weaker than those discovered with the pure sound and unsound knot data sets during model development. Similarly, the grouped knot data also had higher MAE values than most of those observed with fitting of the single overgrown, sound, and unsound knot data sets (Table 2). However, with the model-testing subset, most correlations with the prediction equations were stronger with the grouped knot data than with most of the single knot sample sets. The only exceptions were the slight advantages of the overgrown knot RAKE (R = 0.60 versus R = 0.55) and DEPTH (R = 0.62 versus R = 0.60) features. Further, the MAE values for the predicted internal features in most cases were smaller with the grouped data set. The MAE values for HLEN and HWID were 10.67 mm and 23.62 mm, respectively. The reason for the large differences in performance when comparing the model testing and model development results of the pure knot samples with the grouped sample is sample size. The limited number of pure knot samples prevents an accurate pure model from being developed.

All Knot Clusters Grouped
Cluster knot defects are composed of two or more overgrown, sound, or unsound knots whose surface indicators are merged or intertwined. In nature, these defects occur less frequently than other types of knot defects. The small number of overgrown, sound, and unsound knot cluster samples required grouping the clustered knot samples for well-founded statistical analysis. The red oak clustered knot sample consists of a total of 85 samples. However, there are only 23 yellow-poplar knot cluster defect samples. Even with grouping, the number of yellow-poplar knot cluster samples is too small for the development of reliable prediction models. Thus, knot cluster model analysis is limited to red oak defects.
The model development correlations (adjusted multiple R²) of HWID, HLEN, and DEPTH with surface features were significant, while those with RAKE were not (α < 0.01) (Table 2). The cluster knot adjusted multiple R² values were comparable to those of the other red oak knots. However, the MAE values for the cluster knots were higher, ranging from 16.8 mm to 29.2 mm. Model testing revealed that the correlations with DEPTH and the testing set were not significant.

Discussion
Although this study examined log defects from different sites in the central Appalachian Mountains, it is believed that these are representative throughout the natural range of the two species. Carpenter's (1950) examination of surface indicators found that although the frequency and occurrence of defects within a given species vary by region, in general the same indicator will be found with its defect in the underlying wood. For example, as growth rate varies from site to site and region to region, the defect encapsulation rate will differ also. However, the rate at which the encapsulation occurs and the degree to which the defect is encapsulated will be indicated in the defect's bark pattern. Thus, the faster the diameter growth, the faster the defect is encapsulated, and the faster the bark distortion pattern changes.
Overall, the correlations among external and internal defect features were stronger with yellow-poplar. Further, grouping all yellow-poplar knot samples together resulted in stronger correlations for most internal features. Grouping all red oak knot samples together resulted in slightly weaker correlations in model development, but stronger correlations were observed when testing the model. All correlations for yellow-poplar and red oak were significant with the grouped knot data. However, more yellow-poplar clustered knot samples are needed, as there were not enough samples for analysis after grouping. Grouping the red oak and yellow-poplar clustered knot samples together also did not yield significant correlations among internal and external features. Grouping the red oak cluster knot samples yielded significant correlations in most cases. The success of grouping knot types together indicates that knot classification in automated scanning and detection systems may not be critical if grouped internal models are used.
The size, type, and location of defects on hardwood lumber dictate the grade and value of each board. Optimized log breakdown must consider internal defect features and how they will impact the value of the boards produced. Thus, a key problem in optimizing the sawing of hardwood logs is acquiring internal defect data. The development of models to predict internal defect features based on external indicators is one solution. In addition, recent advances in image processing have produced methods capable of accurately detecting moderate to severe defects with 12 mm or more of surface rise using laser range data (Thomas et al. 2006). Further development of the prediction models and scanning methods and a joint confirmation study are planned.
analysis program was used to perform stepwise multiple linear regression analyses on the grouped defect data (R Development Core Team 2006). Within each species/defect group, the data was randomly partitioned into two sets. The first set contained approximately two-thirds of the data and was used for model development. The second set held the remaining one-third of the data and was used for model testing and validation. The model testing samples were used to analyze the model's predictive capabilities. The regression equations generated with the model development samples were used to predict internal feature measurements. The correlation coefficient r, the mean absolute error (MAE), and the significance level of the correlation were determined for each defect type and feature combination (Table

Fig. 3. External indicator and series of internal defect sections for a heavy distortion defect from a red oak log.

Table 1. Numbers of defect samples collected by species, site, and type.
Thomas. Modeling the Relationships among Internal Defect Features and External Appalachian Hardwood Log Defect Indicators.
Correlation results for model development and testing for severe defects are presented in Table 2. The table is broken down into two main sections, red oak and yellow-poplar. Within each section, the results for model development and testing are listed by defect type. The significance level used to evaluate all correlations, development, and testing was α < 0.01. The mean absolute error (MAE)

Table 2. Correlations among external defect measurements and internal features.
knot adjusted multiple R² values were comparable to those of the other red oak knots. However, the MAE values for the cluster knots were higher, ranging from 16.8 mm to 29.2 mm. Model testing revealed that the correlations with DEPTH and the testing set were not significant.