Calculus as we know it today was developed in the latter half of the seventeenth century by two mathematicians, Gottfried Leibniz and Isaac Newton. There are two main branches of calculus: differential calculus and integral calculus. Differential calculus determines the rate of change of a quantity; integral calculus finds the quantity when its rate of change is known. The “functions” we work with are defined by a formula.

It is useful to know that differentiation and integration are inverse operations. That means that if you have a function (equation), then getting the derivative and getting the integral can be thought of as going in opposite directions. Look at the box on the left side of page 25 of the formulae and tables booklet. On top, f(x) represents the equation or function you start with and f′(x) represents its derivative, so the formulas on the right of this box are the derivatives of the corresponding formulas on the left. But because differentiation and integration are inverse operations, you could take the formulas on the right as your starting equations, and then the ones on the left are their integrals. I know I said you do not need integration for ordinary level, but where this might be useful is that if you look at the boxes for integration on page 26 and take the formulas on the right-hand side as your starting functions, then those on the left are their derivatives.
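As a rough numerical illustration of the "opposite directions" idea, the sketch below starts from 2x (the gradient of y = x², the standard result worked out from first principles later in these notes) and adds up its values over the interval from 0 to 3. The function names, the interval and the number of steps are all assumptions made for this illustration, and the midpoint sum is only one simple way to estimate an integral.

```python
def derivative_of_square(x):
    # Gradient of y = x^2, i.e. 2x.
    return 2 * x


def integrate(f, a, b, steps=1000):
    """Midpoint-rule estimate of the integral of f from a to b."""
    width = (b - a) / steps
    # Add up the areas of thin rectangles, each measured at its midpoint.
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


area = integrate(derivative_of_square, 0.0, 3.0)
print(area)  # very close to 9, which is 3^2 - 0^2
```

Integrating the derivative brings us back to the change in the original function x² between 0 and 3, which is what "inverse operations" means in practice.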

The mathematics of limits underlies all of calculus. Limits enable you to zoom in on the graph of a curve — further and further — until it (for practical purposes) becomes straight. Once it’s straight, you can analyse the curve with regular-old algebra and geometry.

No need to be scared of calculus or of calculus symbols. They are really very simple once you know how to think about them and know what they represent. For example, you will often see the symbol *d*, or perhaps *dx* or Δx, in a formula. Well, *d* simply means a small amount of something. So *dx* simply means a small amount of whatever x represents. Don’t try to multiply the two (*d* and *x*); they are not meant for that. Just think of *dx* as a small amount of x, period. The symbol *dx* is called a differential.

The symbol dy/dx means the instantaneous change in y divided by the instantaneous change in x. The slope of a line or curve is measured by the change in y divided by the change in x. So between two points on a curve, the y-value of the second minus the y-value of the first, all divided by the x-value of the second minus the x-value of the first, will give you the slope of the straight line between those two points. But we want the slope at a point, which poses some problems. How can there be any change at one point? Well, there can’t, really, but what we can do is find the change between two points and then bring them closer and closer together. We can determine through algebra that as you make the distance between them smaller and smaller, the change in y over the change in x gets closer and closer to some definite ratio, which is the “limit” as the distance between them “approaches” but never actually reaches zero. Thus dy/dx is that limiting ratio, which is effectively the slope at one point.
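To see this "closer and closer" behaviour with actual numbers, here is a minimal Python sketch. The curve y = x², the starting point x = 3 and the names `slope_between` and `curve` are all assumptions chosen for illustration; any smooth curve would do.

```python
def slope_between(f, x1, x2):
    """Slope of the straight line through (x1, f(x1)) and (x2, f(x2))."""
    return (f(x2) - f(x1)) / (x2 - x1)


def curve(x):
    # Hypothetical example curve, y = x squared.
    return x ** 2


x1 = 3.0
# Move the second point closer and closer to the first.
for gap in (1.0, 0.1, 0.01, 0.001):
    x2 = x1 + gap
    print(f"gap = {gap:<6} slope = {slope_between(curve, x1, x2):.4f}")
```

As the gap shrinks, the printed slopes settle towards 6, which is the slope of this curve at x = 3. The gap is never set to zero, because that would mean dividing by zero.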

If you are asked to find the derivative from first principles, then you use this theory to work it out. Otherwise we accept that someone has already figured out how to do differentiation and has developed rules or formulae to get the required answer. This is called doing differentiation by rule, and we use the product, quotient or chain rule as required, depending on the “shape” of the equation we are working on.

# Differentiation From First Principles

### Gradient of a Curve

A curve does not have a constant gradient. At any point on a curve, the gradient is equal to the gradient of the tangent at that point (a tangent to a curve is a line touching the curve at one point only). For example, the gradient of the below curve at A is equal to the gradient of the tangent at A (in green).

An approximation to the gradient at any point can be found by drawing a chord. A chord joins together two points on a curve. The closer together these two points are, the closer one gets to the actual gradient of the graph at the point in question.

Therefore in the above diagram, AB and AC are chords. The gradient at A is closer to the gradient of AB than AC, since the chord AB is shorter. Every time one makes the chord shorter, the gradient of the chord gets closer and closer to the gradient of the curve at A. Eventually, when the chord becomes so short that it is a tangent, the gradient of the graph will equal the gradient of this tangent.

### The Derivative

We can use algebra to find out what the gradient of this tangent will be.

A is any point, (x, y). To find the gradient at A, we need to find the gradient of the tangent at A. Let B be a point which is just a little further along the graph. The gradient of the chord AB is approximately the gradient of the curve at A. If the horizontal distance between A and B is called δx (“delta x”) and the vertical distance between A and B is called δy (“delta y”), the coordinates of B are (x + δx, y + δy).

From coordinate geometry we know that the gradient of a straight line joining two points $(x_1, y_1)$ and $(x_2, y_2)$ is:

$$\frac{y_2 - y_1}{x_2 - x_1}$$

In this case, the two points are (x, y) and (x + δx, y + δy). So substituting these values into the formula, the gradient of the chord is:

$$\frac{(y + \delta y) - y}{(x + \delta x) - x} = \frac{\delta y}{\delta x}$$

(pronounced “delta y by delta x”).

This is the gradient of the chord. The gradient of the curve is what the gradient of the chord becomes as the chord shrinks to no length, i.e. when it becomes a tangent. This happens as the horizontal distance between A and B approaches zero.

The gradient of the curve is therefore:

$$\lim_{\delta x \to 0} \frac{\delta y}{\delta x}$$

This basically means that the gradient is the value that δy/δx approaches as δx approaches or “tends to” (→) zero.

We can rewrite the coordinates (x, y) as (x, f(x)) and the coordinates (x + δx, y + δy) as (x + δx, f(x + δx)), since y is a function of x (y = f(x)).

So the gradient of the curve is:

$$\lim_{\delta x \to 0} \frac{(y + \delta y) - y}{(x + \delta x) - x}$$

Since y = f(x) and y + δy = f(x + δx), the gradient is:

$$\lim_{\delta x \to 0} \frac{f(x + \delta x) - f(x)}{\delta x}$$

This is denoted by dy/dx (“dee y by dee x”). dy/dx is known as the derivative of y with respect to x.

So, in summary,

$$\frac{dy}{dx} = \lim_{\delta x \to 0} \frac{f(x + \delta x) - f(x)}{\delta x}$$
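The summary formula can also be tried out numerically. The sketch below is only an illustration under stated assumptions: the function y = x³ (whose gradient is known to be 3x²), the point x = 2 and the names `difference_quotient` and `cube` are hypothetical choices, and the variable `dx` stands for the small step in x, which is kept non-zero throughout.

```python
def difference_quotient(f, x, dx):
    """Gradient of the chord from (x, f(x)) to (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx


def cube(x):
    # Hypothetical example function, y = x cubed.
    return x ** 3


x = 2.0
# Try steps on both sides of x; dx must never be exactly zero.
for dx in (0.1, -0.1, 0.001, -0.001):
    estimate = difference_quotient(cube, x, dx)
    print(f"dx = {dx:>7}: gradient estimate = {estimate:.4f}")
```

Whether the step is taken to the right or to the left of x, the estimates close in on 12, which agrees with 3x² at x = 2. A computer can only get close to the limit; the algebra in the example that follows gives it exactly.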

#### Example

Find the formula for the gradient of the graph y = x² .

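$$\begin{aligned}
\frac{dy}{dx} &= \lim_{\delta x \to 0} \frac{(x + \delta x)^2 - x^2}{\delta x} \\
&= \lim_{\delta x \to 0} \frac{x^2 + 2x\,\delta x + (\delta x)^2 - x^2}{\delta x} \\
&= \lim_{\delta x \to 0} \frac{2x\,\delta x + (\delta x)^2}{\delta x}
\end{aligned}$$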

The small step in x in the denominator cancels with one factor of it in each term of the numerator. Therefore

$$\frac{dy}{dx} = \lim_{\delta x \to 0} (2x + \delta x)$$

As the step tends to zero, only the 2x is left, so dy/dx = 2x.

Therefore the gradient of y = x² is 2x.

For example, at the point (2, 4), the gradient is 2x = 2 × 2 = 4.
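The cancelling step in this example can be checked with exact fractions, so that no rounding gets in the way. This is a minimal sketch: the name `chord_gradient` and the particular step sizes are assumptions for illustration, and `dx` again stands for the small step in x.

```python
from fractions import Fraction


def chord_gradient(x, dx):
    """Exact gradient of the chord on y = x^2 from x to x + dx."""
    return ((x + dx) ** 2 - x ** 2) / dx


x = Fraction(2)
for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    # After cancelling, the chord gradient should be exactly 2x + dx.
    assert chord_gradient(x, dx) == 2 * x + dx
    print(f"dx = {dx}: chord gradient = {chord_gradient(x, dx)}")
```

Each chord gradient is exactly 4 plus the step, so as the step shrinks the values approach the gradient of 4 found at the point (2, 4).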