Smooth Height / Age Curves from Stem Analysis with Linear Programming

Stem analysis data defines a range of possible heights for each age. A smooth stem/age curve is obtained with linear programming (LP) when the sum of the absolute second differences of heights is minimized subject to constraints obtained from the stem analysis. The method is analogous to cubic splines. A LP problem can include additional constraints that are based on the assumption that the crosscut is randomly located within the annual height increment. The method produces smoother height curves than Issa method which is utilizing second order differences of ring counts. It was found using simulated data that the method provides better results than earlier methods for short bolts if height growth is sufficiently regular.


Introduction
In a stem analysis, stems are cut at different positions and rings are counted from the crosscuts.Ring widths are often also measured.Each ring disappearing in a bolt corresponds to a hidden tip in the bolt.The unknown locations of the hidden tips determine the height/age curve of the stem.Several authors have presented methods for estimating the height/age curves from stem analysis data, e.g., Carmean (1972) and Lenhart (1972).A review in a unified formalism and test results were given by Dyer and Bailey (1987).More recently Fabbio et al. (1994) presented an Issa method that used second differences of ring counts to get smoother height/age curves.The previous methods used stem analysis data only locally, and hence the estimated heights and height increments were not smooth over crosscuts.All methods have assumed that all height increments that are totally hidden in a bolt are of equal length.The methods are based on different assumptions how crosscuts occur within an increment that is divided into two bolts.
The method proposed in this paper will produce smoother height/age curves than the Issa method using simultaneously data from all crosscuts.The motivation for striving for smoothness is the same as why smooth regression functions are used.Even if it is known that there are fluctuations in height/age curves it is better to use smooth curves if the measurements do not indicate how the curves fluctuate.The smoothness is not a goal as such but a means for removing artificial fluctuations from the estimated curves.An estimated curve can also be too smooth if it removes nonlinear trends in addition to random fluctuations.
The definition of smoothness is taken from the theory of interpolating cubic splines (see, e.g., DeBoor 1978).Interpolating cubic splines are curves that pass through known points and minimize ′′ ∫ f x dx a b ( ) .In this paper smooth curves are obtained by minimizing the sum of absolute values of second differences of heights (note that in the Issa method second differences of ring counts are used).Because a height/age curve is not assumed to describe the within-year growth pattern, the second order difference is an appropriate measure of smoothness and not just a numerical approximation to the second derivative.
Constraints for the height/age curve, corresponding to known points in splines, are obtained from the stem analysis.The resulting problem is a standard linear programming problem.
As the resulting curve is the smoothest curve consistent with the data, it may be too smooth.Additional constraints that make the curve less smooth are obtained by assuming that the crosscut occurs randomly within the annual height increment.This sounds similar to but is not equivalent to the assumption behind the method of Carmean (1972).
The proposed method is illustrated using several bolt lengths in simulated data.The results are compared with the Issa method of Fabbio et al. (1994) and with Carmean's method that Dyer and Bailey (1987) found to produce most consistent results before Issa.
The basic ideas of the paper are simple.The implementation of the ideas in the linear programming framework is also straightforward but rather tedious.To serve readers not interested in technical details, the symbols are presented in Appendix 1, and linear programming details in other appendices.Carmean's and Issa methods are also presented in Appendix 1.

The Smoothest Possible Curve
The second order difference is a natural measure of the smoothness of a where D t + and D t -are nonnegative variables.When the sum of (3) is minimized in LP, then in the solution either D t + = D t and D t -= 0 or D t -= -D t , and D t + = 0.The complete linear programming problem producing the smoothest possible curve is shown in Appendix 2.

Crosscuts in the Middle of Annual Increment
The solution for the LP problem in Appendix 2 may be too smooth (too linear) so we may try Lappi Smooth Height/Age Curves from Stem Analysis with Linear Programming to append more constraints that would make the curve more realistic and less smooth.Assume for the moment that all bolts contain hidden tips.If there are no systematic interaction between the locations of crosscuts and height growth, then the location of a crosscut within a height increment behaves as a uniform random number.This implies that the expected location is in the middle of the height increment.Thus we might add to the basic LP problem additional constraints requiring that each crosscut is in the middle of height increment, or crosscuts are on the average in the middle of the height increments for any stem section.
The smoothness of the height curve becomes more interesting when the bolts are short.With short bolts there can be bolts without hidden tips, or equivalently an annual height increment can include whole bolts.In that case the constraint that each crosscut occurs in the middle of the height increment is not logical.The proper condition is then that the middle point of a bolt segment occurring within a height increment is in the middle of the height increment.Appendix 3 shows how to implement these constraints.Note that Carmean's method is also based on the concept that crosscut is in the middle of height increment.But this is implemented so that the crosscut occurs at a point where the distance from the previous hidden tip is half of the constant increment occurring in the previous bolt, and the distance form the next hidden tip is the half of the increment in the next bolt.In Carmean's method 'in the middle' refers to time, and in the linear programming method 'in the middle' refers to distance.If crosscuts are independent of the height growth, the interpretation given in this paper is theoretically sounder.

A Closer Look at the Solutions
The proposed problems can be directly solved with any linear programming software.If there are more than two hidden tips in any bolt, a further analysis will provide us smaller LP problems, alternative solutions and further insight.Let us assume, using a case illustrated in Fig. 1, that there are at least three hidden tips in bolt with endpoints at h 2 and h 3 .Let t 2 , and t 3 be the ages of the first and last hidden tips in the bolt (t 3 > t 2 + 1).For given H(t 2 ) and H(t 3 ) , all nondecreasing heights H(t), t 2 < t < t 3 satisfy the constraints of the LP problems proposed above.It can be shown that an optimal solution is such that H(t) for t 2 < t < t 3 is on the line connecting H(t 2 ) and, (the second differences are zero for intermediate points).That solution will be called the standard solution, and it is provided by standard LP algorithms which will keep variables zero (nonbasic) as long as the solution cannot be improved by making them nonzero (basic).The standard solution is convex in The three curves shown are examples of possible estimated height/age curves consistent with these constraints.For ages between t 2 and t 3 all such estimated heights that the overall curve between t 1 and t 4 is convex yield height/age curves having equal sum of second order differences.The solid line shows the highest such curve, provided by standard linear programming solution.The lowest curve (dashed line) is obtained by continuing the lines in [t 1 ,t 2 ] and in [t 3 ,t 4 ] to the point where they cross.The middle line (dotted line) is obtained by minimizing the sum of absolute third differences.
and the solution is concave if ≤ in ( 4) is replaced by ≥.
If the standard solution is neither convex nor concave in [t 1 , t 4 ] then the standard solution is unique, that is for other feasible solutions the objective function would increase.If the standard solution is convex or concave, then the standard solution is not unique.More specifically, if the standard solution is convex, then the properties of difference operators imply that for any general convex solution such that the value of objective function is the same, and the contribution of this section is equal to the total change of the first difference, i.e., Similarly, if the standard solution is concave, the objective function has the same value for any general concave solution and the contribution to the objective function is the negative of Eq. ( 6).
There are two possibilities to utilize the above properties.First, instead of using the standard solution we can use other solutions producing the same sum of the absolute values of the second differences.As indicated in Fig. 1, one solution is to continue the line through H(t 1 ) and H(t 2 ), and the line through H(t 3 ) and H(t 4 ) to the point where these lines intersects.For convex sections, this curve is the smallest feasible curve having the same value for the objective function, and for concave sections this is the largest curve.Note that the standard solution provides the other extreme feasible solution.The average between these extreme solutions can be taken to be a compromise solution.Another compromise solution can be obtained by putting a small penalty to the sum of the absolute values of third differences.This will force the second differences to change in smaller step than in the standard solution or in the solution obtained by continuing the lines.Appendix 4 describes how this can be implemented.
Second, we can get a smaller LP problem utilizing the fact that the standard solution is a solution to the original problem.We can reformulate the problem by considering only ages t such that there is a crosscut just below or just above t.This formulation is given in Appendix 5.It is possible to implement the constraints that force the crosscuts to occur on the average in the middle of height increments also in this formulation.

Test Results
The Carmean, Issa, and LP methods were compared using simulated data as no real data with accurate height measurements were available.Using simulated data it can be demonstrated how the relations between different methods are dependent on the smoothness of height growth and on the bolt length.Height curves are smooth if there is little random variation in height growth or the autocorrelation is large.High autocorrelation and large variance lead to curves that are locally smooth but have irregular fluctuations.
It was assumed that the expected height curve is the following form of the Chapman-Richards function: where the asymptote parameter a was assumed to be 20 m, b was 0.1 and c was 2.
It was further assumed that height increments have a multiplicative error term: where e has log-normal distribution with expected value 1, i.e. and smooth growth is illustrated in Fig. 2. Table 1 shows the test results when s gets values 0, 0.1 and 0.3, and the autocorrelation r between e t and e t-1 is 0.2 or 0.4.The results are averages over 6000 trees generated using the assumed random fluctuations (except only one curve was used when s = 0).The LP method was applied without constraints for the crosscuts, and with one overall constraint, and having three constraints for different (overlapping) parts of stem.For each case, the results were computed both with and without the additional penalty for third differences.Tables 2 and 3 collect information from Table 1 so that differences between methods are easier to notice.The principal differences between methods were:

Second Differences
The true second differences are larger the larger is the error deviation s.For given s, the larger is the autocorrelation, the smaller are the second differences.For the estimation methods, the sum of absolute second differences is naturally smallest for the LP method and largest for Carmean's method.For each method the second differences are smaller for the long bolts.

Bias in Predicted Heights
The bias in the estimated heights (estimated-true) is generally very small.Linear programming gave the largest bias (almost 10 cm) when the height growth is completely regular (s = 0) and the bolt is 200 cm.This is probably because results were computed just for one single curve.There are no differences that could be easily interpreted.If the bias would be computed separately for the convex and concave parts of the heigh/age curves, linear programming methods might have some patterns in the bias for heights.

Standard Deviation of the Prediction Errors for the Height
For short bolts and regular growth, linear programming provides the best results with respect to predicting heights.When the bolt size and irregularity of height growth start to increase,  7) when there is no random variation in growth.The dashed lines show the height increments when the height/age curves were estimated from stem analysis data when the bolt length was 50 cm using Carmean's, Issa and LP methods (subfigures).Using bolt length 100 cm results look similar but there are only two peaks in the Carmean and Issa methods.
Table 1.The height/age curves were predicted with the Carmean's method (Carm), with the Issa method and with the linear programming (LP) method using different bolt lengths.Six variants of the LP method were used.Variant 'LP1' is the basic method described in Appendix 2 (LP1).Variant 'LP1b' the basic method together with the additional penalty for third differences (see Appendix 4).Variant 'LP2' contained additional constraint forcing the average of crosscuts to be the middle the height increments (see Appendix 3).Variant 'LP2b' included the penalty for third differences.In variant 'LP3' the constraint for average location of crosscuts was done for ranges [0,0.4H], [0.3H,0.7H]and [0.6H,H]where H is the total height of the stem.Variant 'LP3b' contained again the penalty for the third differences.Six thousand 'true' curves were simulated using Eqs.( 7)-( 9) and with different values for standard deviation s and for the autocorrelation r (when s = 0, only one curve was used).D denotes the average absolute second difference both for the 'true' curve and for the predicted curves, 'hbias' denotes the average value for predicted height-true height, 'hsd' denotes the standard deviation of predicted height-true height, and 'isd' denoted the standard deviation of predicted annual height increment -true increment (as total height is known, mean of increment errors is always zero).
the Issa method provides the smallest standard deviations for height predictions.For long bolts, Carmean's method provides smallest standard deviations.

Standard Deviations of the Prediction Errors for Height Increment
With respect to height increment prediction, the relations between methods are similar with respect to height prediction.But the differences are more pronounced.For instance, for the bolt length 25 cm and s = 0, the error variance of Carmean's method for the height is 2.8 times the error variance of the LP method.But for the height increment, the error variance of Carmean's method is 13 times the error variance of the LP method.The LP method remains also optimal for longer bolts or for more irregular height growth.The Issa method is optimal for longer bolts or for more irregular height growth than with respect to the standard deviation of height predictions.

Relations between Different Versions of the LP Method
There were two modification of the LP method that were tested.First, there were one or three additional constraints that forced the average crosscut to be in the middle of annual growth segment either for the whole stem or in three different segments, respectively.Second, there was a modification that minimized absolute values of the third differences.
The differences between different versions of the LP method were small.The differences were logical in the sense that constraints decreasing the smoothness improved the results a little for long bolts and variable height growth.

Conclusions
The method presented in this paper extends the method of Fabbio et al. (1994) into its logical extreme by forcing the estimated height/age curve to be as smooth as possible by considering all second differences of heights simultaneously.The smoothness idea implemented in the method has certain intuitive appeal but if it is useful or not in practical estimation is dependent how stem analysis data are collected and for what purpose.If stem analysis is done accurately, i.e., using short bolts then it may be in balance with the overall measurement cost to estimate the height/age curves also with this more complicated method.What bolt length is sufficiently short for making the LP method good is dependent on the smoothness of the height growth itself.So tests with empirical, very accurately measured data would be needed before practical recommendations can be done.The presented method works better for annual height increments than for the accumulated height.This agrees well with the result of Fabbio et al (1994) that the Issa method compares well with Carmean's method especially with respect to the annual increment.That annual height increment is accurately measured (estimated) can be important in studies where the annual increment is related to annual diameter increment or annual variation of climatic variables.From a statistical point of view, the height/age measurements obtained from stem analysis are always problematic as the measurements contain measurement errors which do not behave according to any simple stochastic model.

Appendix 1. Symbols, and Carmean's and Issa methods
The symbols used for the stem analysis data are the same as in Dyer and Bailey (1987)

Carmean's (1972) method
In Carmean's method the locations of hidden tips are computed according to the formulation of Dyer and Bailey (1987) as follows: For hidden tips located in the first bolt this must be modified as (Fabbio et al. 1994) ( ) A and for hidden tips located in the last bolt as (Newberry 1991) .

Issa method
In the Issa method (Fabbio et al 1994) the second (central) difference of ring counts is first defined for ith crosscut as The age of ith crosscut is calculated as research articles Thereafter the heights of hidden tips are computed as Taking into account that t ij = nr i + j we note that if R i = 0 Eq. ( A7) is equivalent to (A1).

Appendix 2. LP Problem for Smoothest Possible Curve
A LP formulation for the problem described in section 'The Smoothest Possible Curve' is: subject to the following constraints:

.., ( ) B
Constraints of form (B2) define the second order difference variables D t + and D t -in terms of the heights H t .Constraints (B3) require that the height over time is nondecreasing.Constraints (B4) and (B5) express the knowledge of the initial and total heights.This information could be also substituted directly into constraints (B2) and (B3) for t = 0, and t = n -1.Constraints (B6) and (B7) express the information provided by the crosscuts.Term e in (B6) is a small positive constant used in accordance to the assumption that a ring ending exactly at the crosscut will be noticed.This way we can also directly tell whether the obtained estimate for H t

Appendix 3. Crosscuts in the Middle of Annual Increment
The crosscut at h i occurs in the middle of the annual increment if If both the bolt below h i and the bolt above h i contains at least one hidden tip, we could force crosscut i to be in the middle of the annual increment by appending constraint (C1) to the constraints (B2)-(B8).The crosscut can be forced to occur, in the average, in the middle of annual increment for a stem segment, if we take the average of (C1), or equivalently, by summing up both sides: where L s is the first crosscut in the segment s and U s is the last.Different segments can overlap.The constraint (C2) is well defined also in case some bolts do not contain any hidden tips.However, in that case H t + H t+1 for the same t appears several times in (C2), and such t may get too much weight (when computing coefficient of H t , all occurrences are counted).

Fig
Fig. 1.Illustration of possible stem analysis data.The last hidden tip in bolt [h 1 ,h 2 ] corresponds to the unknown height at age t 1 , H(t 1 ), the height of the first hidden tip in bolt [h 2 ,h 3 ] is H(t 2 ), the height of the last hidden tip is H(t 3 ) and the first hidden tip in bolt [h 3 ,h 4 ] is H(t 4 ).The stem analysis thus implies that H(t 1 ) < h 2 ≤ H(t 2 ) < …. < H(t 3 ) < h 3 ≤ H(t 4 ).The three curves shown are examples of possible estimated height/age curves consistent with these constraints.For ages between t 2 and t 3 all such estimated heights that the overall curve between t 1 and t 4 is convex yield height/age curves having equal sum of second order differences.The solid line shows the highest such curve, provided by standard linear programming solution.The lowest curve (dashed line) is obtained by continuing the lines in [t 1 ,t 2 ] and in [t 3 ,t 4 ] to the point where they cross.The middle line (dotted line) is obtained by minimizing the sum of absolute third differences.

Fig. 2 .
Fig. 2. The solid line shows the annual height incrementaccording to (7) when there is no random variation in growth.The dashed lines show the height increments when the height/age curves were estimated from stem analysis data when the bolt length was 50 cm using Carmean's, Issa and LP methods (subfigures).Using bolt length 100 cm results look similar but there are only two peaks in the Carmean and Issa methods.

Table 2 .
The relative MSE (hbias 2 + hsd 2 ) of predicted heights for different bolt lengths, estimation methods and height curve parameters, results obtained from Table1.For each bolt length and height curve set, the method with smallest MSE is scaled into one.The LP method is the method 'LP1b' in Table1.Best method is written in bold.
As can be seen from the definition of k i , it is assumed that if H t = h i then the ring of year t is noticed at the crosscut i.
is at the lower or upper end of the feasible range, and there will be no ambiguity if the estimated height/age curve is turned back to rings data.Note that H i0 and H i1 are for each crosscut i among variables H 0 ,..., H n .If there are bolts without hidden tips, then some of constraints are redundant (e.g. they can include constraints H 22 ≥ 10 and H 22 ≥ 11, so that constraint H 22 ≥ 10 could be removed).Nonnegativity constraints (B8) guarantee, for the given objective function, that for each t either D t + or D t -will be |D t |.Nonnegativity of heights H t is already guaranteed by (B3) and (B4).Constraints (B3) are redundant for such values of t that H t and H t+1 are in different bolts (increasing heights are guaranteed by constraints B6 and B7).The problem is an ordinary linear programming problem that can be easily solved.