Introduction Part II and Solving Equations
1 Numerical Solution to Quadratic Equations
Recall from last lecture that we wanted to find a numerical solution to a
quadratic equation like

One obvious attempt at a solution would use the familiar quadratic formula:

But, while it is trivial for a computer to perform additions, subtractions,
multiplications, and
even divisions, it is unclear what algorithm we could use to calculate the
square root. Even if we
have such an algorithm, we have to ask if using the quadratic formula is the
most efficient way
to solve this equation. As we shall show now, we can use the square root
algorithm we proposed
in the last class to solve general quadratic equations, making the use of the
quadratic formula (2)
unnecessary (and, in fact, inefficient).
2 Finding Square Roots and Solving Quadratic Equations
2.1 Finding Square Roots
As we discussed last time, there is a simple scheme for approximating square
roots to any given
precision. More formally, we can find a nonnegative solution to the quadratic
equation:

using an iterative method that allows us to control the precision of the
solution. The idea is that
we find a function that, given an approximation of the solution as input (xold),
outputs a more
precise approximation (xnew). If we use this function iteratively by recycling
the values it produces
on each iteration, we will arrive at better and better solutions. We can
continue this process until
we reach any level of precision we want.
Recall that the equation we used was:

An interesting property of this algorithm is that the precision of the output
(xnew) doubles the
precision of the input (xold). As we shall see later, this speed of convergence
on the actual solution
will be an important factor in evaluating numeric algorithms.
2.2 Solving Quadratic Equations
Now that we have a scheme for solving a restricted kind of quadratic
equation, can we use the
scheme to solve our original problem? The answer is yes. To solve equations of
the form

We simply need to add another term to the denominator of the formula:

We can use this new formula iteratively to arrive at numerical solutions of
the quadratic equation
that are arbitrarily precise. Each iteration will take only 5 operations (2
multiplications, 2 additions,
and a division).
So, it should now be clear that algorithms that are good for finding
numerical solutions are
often quite different from the methods we use to find analytic solutions.
3 Calculating Definite Integrals
Another good example of the difference between numerical computation and
analytic solutions is
in calculating definite integrals. Recall that the definition of the definite
integral of a continuous
(and we will assume positive) function f(t) over the interval [a, b],
f(t)dt
is the area under the
curve, as shown in Figure 1.

Figure 1: The Definite Integral of f(t) over [a, b]
What is the best way to calculate the definite integral? Our experience in
math classes suggests
that we might use the anti-derivative, that is a function F′ = f:

But this brings up two important questions:
1. How do we determine an anti-derivative function?
2. How difficult is it to evaluate the anti-derivative function at a and b?
The answer to question 1 may seem simple - we all learned to calculate
anti-derivatives in
Calculus class. It is important to remember that there are often situations
where we do not have
a simple rule for determining the anti-derivative.
There are, likewise, many examples of anti-derivative functions that are more
difficult to evaluate
than the original function, for example:

So is there a method for calculating the definite integral that uses only
information we already
know about the function? The answer is yes, we can use the definition of the
definite integral to
approximate its value, as shown in Figure 2. We start by splitting up the
interval into subintervals
of equal size h, then estimating the definite integral over each subinterval,
then adding them all
together. If we split the interval [a, b] into N subintervals of equal width
we can calculate
the definite integral as:


Figure 2: Approximating the Definite Integral Using Subintervals
We have reduced the problem to estimating the definite integral over smaller
subintervals. Here
are two possible strategies for estimating the smaller definite integrals:
Rectangle Rule

In the rectangle rule, we calculate the area of the rectangle with width h
and height determined
by evaluating the function at the left endpoint of the interval.
Midpoint Rule

In the midpoint rule, we calculate the area of the rectangle with width h and
height determined
by evaluating the function at the midpoint of the interval

Figure 3: The Rectangle and Midpoint Rules
These two methods are depicted in Figure 3. Using these methods, we can
calculate the definite
integral by simply evaluating the original function at definite points, which we
should always be able
to do (otherwise, we don’t know what the function is). We can control the
precision in each case
by changing the size of the subintervals h - the smaller the subinterval, the
greater the precision.
The next question is, how can we decide which of these algorithms to use?
Intuitively, it seems
that the midpoint rule would be a better choice, but how can we quantify this
intuition?
3.1 Evaluating Numeric Algorithms
In general, we evaluate numeric algorithms using 3 major criteria:
Cost - The amount of time (usually calculated as the number of operations)
the computation will
take
Speed - How quickly the algorithm approaches our desired precision. If we
define the error as the
difference between the actual answer and the answer returned by the numeric
algorithm,
speed is often measured as the rate at which the error approaches 0.
Robustness - How the correctness of the algorithm is affected by different
types of functions and different
types of inputs. The algorithm for calculating square roots turns out to be
extremely robust,
it will work for any square root, even if we give it a terrible initial value.
3.2 Evaluation of Algorithms for Definite Integrals
Applying these criteria to the midpoint and rectangle rules:
Cost Each algorithm performs a single function evaluation to estimate
the definite integral over a
subinterval, so the cost is identical. The total cost over the entire
computation grows linearly
with the number of intervals N in each case, meaning that there is some fixed
constant c1, not
dependent on the number of intervals, such that the total cost is c1N = O(N). For
example,
if the number of intervals doubles, the cost will double as well.
Speed Clearly as the size of each subinterval h gets smaller, the
error gets smaller as well. Is there
any difference between the speed of the rectangle and midpoint rules?
Rectangle Rule -For the rectangle rule, the error approaches 0 in direct
proportion to the
rate at which h approaches 0. In other words, the error
for some fixed constant
c1, or
.
For example, halving the size of h will halve the amount of error.
Midpoint Rule -For the midpoint rule, the error approaches 0 in
proportion to the rate at
which the square of h approaches 0. That is the error
for some fixed constant
c2, or
.
For example, halving the size of h will divide the error by 4.
Robustness It turns out that for functions that are reasonably
well-behaved, these methods are
quite robust. There do exist situations, however, in which it is possible to
evaluate a function
at the left endpoint of an interval, and not at the midpoint. It is an
interesting exercise to
try to come up with such a function.
Through this evaluation, we have shown that, in general, the midpoint rule is
a far better choice
than the rectangle rule for approximating the definite integral, since, for the
same cost, it will give
us much more precision.
4 Loss of Significance
One final way that numerical computation is different than the way you have
done mathematics to
this point is the potential loss of significance. Consider two real numbers
stored with 10 digits of
precision beyond the decimal point:
π = 3.1415926535
b = 3.1415926571
If we subtract π from b, we will get the answer 0.0000000036, which has only
two significant
digits, so that we have lost most of the precision of the two original numbers.
One might try to
solve this problem by increasing the precision of the original numbers, but this
is not a solution. For
any finite precision, numbers that are close enough together will be
indistinguishable. An example
of the effect of loss of significance occurs in calculating derivatives.
Example 4.1 (Calculating Derivatives). Given the function f(t) = sin t, what
is f′(2)?
We all know from calculus that the derivative of sin t is cos t, so one could
calculate cos(2) in
this case. However, just as with calculating definite integrals, it is not
always possible, or efficient
to evaluate the derivative of a function.
Instead, recall that the value of the derivative of a function f at a fixed
point t0 is the slope
of the line tangent to f at t0. Recall, also, that we can approximate this slope
by calculating the
slope of a secant line that intersects f at t0 and at a point very close to t0,
say t0 + h, as shown in
Figure 4. We learned in calculus, that as h → 0, the slope of the secant line
approaches the slope
of the tangent line, or:

Just as with the definite integral, we could approximate the derivative at t0
by performing
this calculation for small values of h. However, notice that as h gets very
small, the numerator
of Equation 12 will become 0, due to loss of significance, so that we will
erroneously calculate all
derivatives as 0.
Preventing loss of significance in our calculations will be an important part
of this course.

Figure 4: Tangent and Secant Lines on f(t) = sin t
5 Solving Equations
Now that we have some idea of what algorithms for computation are, we will
discuss algorithms
for performing one of the most basic numerical tasks, solving equations in one
variable.

Figure 5: Graphical Solution for x^3 = sin x
Consider the following equation, with solution depicted in Figure 5.

There is no analytic solution to this equation. Instead, we will present an
iterative method for
finding a numerical solution.
5.1 Iterative Method
As in our solution to finding square roots, we would like to find a function
g, such that if we input
an initial guess at a solution, g will output a better approximation of the
actual solution. More
formally, we would like to define a sequence
with
such that

converges to the solution. We can continue calculating values xi until we reach
two values with

Clearly, one property of g is that g(r) = r, where r is the real solution to
Equation 13. A
transformation of Equation 13 (e.g. x = sin x − x^3 + x) will satisfy this
property, the trick is to
choose the transformation that produces a converging series of values
