Baptiste Ravina

PhD student - high energy physics
CERN/ATLAS/SUSY/TOP

Here’s the solution:

# Exercise 3
# Fit Data using a linear regression T^2(L)
dl = 2e-3
dt = 0.005
x = length
y = time**2

slope, intercept, r_value, p_value, std_err = linregress(x,y)
g = 4*pi**2/slope
dg = 4*pi**2*(std_err/slope**2)

print 'Slope, Error, Intercept, R^2:'
print slope, std_err, intercept, r_value**2
print 'g=',g,'+/-',dg,'m/s^2'

# Exercise 3
# Fit Data using a linear regression T(sqrt(L))
dl = 2e-3
dt = 0.005
x = sqrt(length)
y = time

slope, intercept, r_value, p_value, std_err = linregress(x,y)
g = 4*pi**2/slope**2
dg = 8*pi**2*(std_err/slope**3)

print 'Slope, Error, Intercept, R^2:'
print slope, std_err, intercept, r_value**2
print 'g=',g,'+/-',dg,'m/s^2'


So what’s going on?

We’ve provided two snippets of code, corresponding to the two ways of linearising the equation $T=2\pi\sqrt{L/g}$; both have the same structure. We first define dl and dt, the errors on length and time respectively, which you should already know how to do. More interestingly, why are we defining new variables for (the square root of) length and time (squared)? There is no compelling reason for it beyond convenience: we could just as well have written, e.g., sqrt(length) everywhere instead of x. The lesson here is that we’re physicists, not computer scientists, and that we should always aim to strike a balance between readability and efficiency when writing code (although writing out x=sqrt(length) is not a big deal in terms of computing power…).

As we saw in the previous exercise, linregress is a function that takes two sets (arrays) of measurements as arguments and returns the slope of the regression line, its intercept, the correlation coefficient, the p-value and the standard error on the slope. At this stage, there are two common mistakes: assuming that slope is $g$, and assuming that std_err is the error on $g$. The former can be avoided with a second or two of reflection, and a good hard look at the equation at hand, $T=2\pi\sqrt{L/g}$. The latter is unfortunately a bit less obvious, and we expand on it below.
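To make the two mistakes concrete, here is a minimal sketch that does not rely on SciPy. The helper `fit_line` is hypothetical: it uses the textbook least-squares formulas for the slope and its standard error, which we assume correspond to the `slope` and `std_err` that linregress reports. The data are made-up numbers for illustration, not real measurements.

```python
from math import pi, sqrt

def fit_line(xs, ys):
    """Least-squares fit of y = m*x + c; returns (slope, intercept, std_err)."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, with n - 2 degrees of freedom
    ssr = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    std_err = sqrt(ssr / (n - 2) / sxx)
    return slope, intercept, std_err

# Made-up numbers for illustration only (not real measurements)
length = [0.2, 0.4, 0.6, 0.8, 1.0]             # m
time_squared = [0.81, 1.60, 2.43, 3.21, 4.04]  # s^2

slope, intercept, std_err = fit_line(length, time_squared)

# The slope is 4*pi^2/g, not g itself, so g has to be computed from it,
# and the error on the slope has to be propagated to g in the same way.
g = 4 * pi ** 2 / slope
dg = 4 * pi ** 2 * std_err / slope ** 2

print('slope = %.3f +/- %.3f s^2/m' % (slope, std_err))
print('g     = %.2f +/- %.2f m/s^2' % (g, dg))
```

Note that the slope carries units of s^2/m, which is already a hint that it cannot be $g$.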

Note: see here for an excellent explanation of the features of the print statement.

Calculating the error

Starting with the expression

$$T = 2\pi\sqrt{\frac{L}{g}}$$

we can define a linear fit in the form $y=mx+c$ using 2 different approaches:

1. Setting $x=L$ and $y=T^2$, the slope is then $m=4\pi^2/g$.
2. Setting $x=\sqrt{L}$ and $y=T$, the slope is then $m=2\pi/\sqrt{g}$.
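As a quick sanity check of the two slopes, the sketch below generates noise-free periods from an assumed value of $g$ and recovers it both ways. Because the points lie exactly on a line, the slope is taken from just two points rather than from a fit. The names `g_true`, `L_a`, `L_b` and `period` are hypothetical and only serve the illustration.

```python
from math import pi, sqrt

g_true = 9.81          # assumed value of g in m/s^2, used only to generate data
L_a, L_b = 0.5, 1.0    # two hypothetical pendulum lengths in m

def period(L):
    """Period of a simple pendulum, T = 2*pi*sqrt(L/g)."""
    return 2 * pi * sqrt(L / g_true)

T_a, T_b = period(L_a), period(L_b)

# Approach 1: x = L, y = T^2, so the slope is m = 4*pi^2/g
m1 = (T_b ** 2 - T_a ** 2) / (L_b - L_a)
g1 = 4 * pi ** 2 / m1

# Approach 2: x = sqrt(L), y = T, so the slope is m = 2*pi/sqrt(g)
m2 = (T_b - T_a) / (sqrt(L_b) - sqrt(L_a))
g2 = 4 * pi ** 2 / m2 ** 2

print('Approach 1: slope = %.4f, g = %.4f m/s^2' % (m1, g1))
print('Approach 2: slope = %.4f, g = %.4f m/s^2' % (m2, g2))
```

Both approaches return the value of $g$ that was put in, and the slope of approach 1 is the square of the slope of approach 2.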

Approach 1

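Setting $x=L$ and $y=T^2$ results in a value of:

$$g = \frac{4\pi^2}{m}$$

The error $\Delta g$ can be calculated as:

$$\Delta g = \left|\frac{\partial g}{\partial m}\right| \Delta m$$

using the error on the fitted slope, $\Delta m$, and the magnitude of the partial derivative:

$$\left|\frac{\partial g}{\partial m}\right| = \frac{4\pi^2}{m^2}$$

Hence we can calculate the final error using:

$$\Delta g = \frac{4\pi^2}{m^2}\,\Delta m$$

which in the example above corresponds to the line dg = 4*pi**2*(std_err/slope**2).

This can also be derived using the quadrature approach, which adds relative errors in quadrature (here there is only one term):

$$\frac{\Delta g}{g} = \sqrt{\left(\frac{\Delta m}{m}\right)^2} = \frac{\Delta m}{m}$$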

Approach 2

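Setting $x=\sqrt{L}$ and $y=T$ results in a value of:

$$g = \frac{4\pi^2}{m^2}$$

The error $\Delta g$ can be calculated as:

$$\Delta g = \left|\frac{\partial g}{\partial m}\right| \Delta m$$

using the error on the fitted slope, $\Delta m$, and the magnitude of the partial derivative:

$$\left|\frac{\partial g}{\partial m}\right| = \frac{8\pi^2}{m^3}$$

Hence we can calculate the final error using:

$$\Delta g = \frac{8\pi^2}{m^3}\,\Delta m$$

which in the example above corresponds to the line dg = 8*pi**2*(std_err/slope**3).

This can also be derived using the quadrature approach, which adds relative errors in quadrature (here there is only one term, counted with the power of the slope):

$$\frac{\Delta g}{g} = \sqrt{\left(2\,\frac{\Delta m}{m}\right)^2} = 2\,\frac{\Delta m}{m}$$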

Difference

Even though the relative error on the slope is the same in both cases $(3\%)$, the resulting relative errors on $g$ differ by a factor of 2. This is because $g\propto\frac{1}{m^2}$ in approach 2: the power of 2 on the slope doubles the relative error on $g$.
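The factor of 2 can be checked numerically with a short sketch. The two functions below simply wrap the formulas for $g$ and $\Delta g$ used in the two code snippets. The function names and the slope values are hypothetical, chosen so that both approaches describe the same $g$; only the 3% relative error is taken from the text.

```python
from math import pi

def g_from_T2_vs_L(m, dm):
    """Approach 1: g = 4*pi^2/m and dg = 4*pi^2*dm/m^2."""
    g = 4 * pi ** 2 / m
    dg = 4 * pi ** 2 * dm / m ** 2
    return g, dg

def g_from_T_vs_sqrtL(m, dm):
    """Approach 2: g = 4*pi^2/m^2 and dg = 8*pi^2*dm/m^3."""
    g = 4 * pi ** 2 / m ** 2
    dg = 8 * pi ** 2 * dm / m ** 3
    return g, dg

rel = 0.03   # 3% relative error on the slope, as quoted in the text
m1 = 4.0     # hypothetical slope of T^2 against L, in s^2/m
m2 = 2.0     # hypothetical slope of T against sqrt(L), in s/sqrt(m)

g1, dg1 = g_from_T2_vs_L(m1, rel * m1)
g2, dg2 = g_from_T_vs_sqrtL(m2, rel * m2)

print('Approach 1: relative error on g = %.3f' % (dg1 / g1))
print('Approach 2: relative error on g = %.3f' % (dg2 / g2))
```

With the same relative error on the slope, the relative error on $g$ is equal to it in approach 1 and twice as large in approach 2.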

Solutions to exercise 4