Distributed Coding at sweetiQ
It’s been a long time since I really wrote anything on this blog, so it is time for a little update.
Just over a year ago I joined another startup called sweetiQ. It’s a step up from the other startups that I’ve worked for in that it has actually gone live and it is actually making money. It’s also interesting because this is the longest amount of time I’ve spent working for a single company.
The system uses RabbitMQ to drive a distributed computing cluster that essentially implements the Actor model of computation, except that rather than working with threads within a single machine we’re using multiple processes on multiple machines (100+). Each process has a set of endpoints that listen on different queues, and RabbitMQ manages dispatching messages to the different worker nodes.
One of my major contributions is what I’m calling a “distributed job control system.” There are many steps in each “job” that the system must do, and each step of the job may be handled by a different process across different machines. As a particular computing job becomes more complex, several problems arise:
- managing the dependency structure between various components of each job. In a simple sequential system you can do A, then B, then C; the dependency structure is very simple: if x depends on y, do x after y. In a distributed system B may not depend on A, so you can do them in parallel, but C depends on the output of both A and B so the system cannot start C until both A and B are done. It is not possible to just have the node handling C wait for a response from both A and B, because the messages from A and B may not be delivered to the same node that can handle C. Even more complex, in certain cases C may not need the computation from A but at other times it does – for example if we’re aggregating social data from different networks but the user hasn’t linked their Twitter account, we don’t need to fetch data from Twitter.
- the possibility of failure increases – sometimes a node will lose its database connection. Sometimes a node will die (example: we use Amazon spot instances that can shut down at any time). Sometimes a node that attempts to fetch something on the Internet may fail for whatever reason (the API version of Twitter’s fail whale is something that has happened relatively frequently). In these cases the process needs some sort of elegant failure handling – often the solution is just to try again later. Other times we need to send a message to the user informing them that they need to take some sort of action (if a user changes their Facebook password, it will invalidate the stored OAuth token).
- need to rate-limit certain jobs – some jobs can have many sub-components done in parallel, and we have encountered problems when sending out too many messages at once. The first and most obvious problem we hit was that the database would choke; however, once we learned how to use MongoDB properly this became a non-issue (having scaled both MySQL and MongoDB databases, I can tell you I am fully sold on MongoDB at this point). The bigger issue was process starvation: at peak times jobs begin spawning at an enormous rate, and if we keep sending parallel messages for stage 1, the computation nodes spend most of their time processing stage 1 messages while later stages wait (Rabbit has no concept of priority). The control system needs to detect starvation and alleviate it, which it can do in a variety of ways.
- recursive job types – our data is hierarchical. A user can categorize components of what they “own” in groups and sub-groups, and may want to be able to break down aggregated information at each level. Each job may need to perform itself separately on a child node, which may in turn need to perform itself on its own children, etc. Having some sort of way to easily specify recursive sub-jobs saves a ton of time.
What I ended up building is a system that takes a high-level description of a distributed job and controls that job across the different nodes. Rather than having each endpoint communicate directly with one another, they communicate with the central controller that tracks their progress and makes decisions on what control points to activate. The huge benefit of this is that it is a separation of concerns: the endpoints can focus on doing their particular computation well, while the “bigger picture” concerns like starvation and node failure can be handled by the control system.
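To make the idea concrete, here is a minimal Python sketch of the dependency-tracking part of such a controller. It is not the sweetiQ code (which is not public): the JobController class, its method names, and the dispatch callable are all hypothetical, and the real system also has to deal with retries, starvation and sub-jobs, which this ignores.

```python
class JobController:
    """Tracks finished steps and dispatches any step whose dependencies are done.

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self, dependencies, dispatch):
        # dependencies: step name -> names of the steps it must wait for
        self.dependencies = {step: set(deps) for step, deps in dependencies.items()}
        self.dispatch = dispatch   # callable that would publish the step to a queue
        self.done = set()
        self.started = set()

    def start(self):
        self._dispatch_ready()

    def step_finished(self, step):
        # Workers report back to the controller instead of to each other.
        self.done.add(step)
        self._dispatch_ready()

    def _dispatch_ready(self):
        for step, deps in self.dependencies.items():
            if step not in self.started and deps <= self.done:
                self.started.add(step)
                self.dispatch(step)

    def is_complete(self):
        return self.done == set(self.dependencies)


sent = []   # stand-in for messages published to the queue
controller = JobController({"A": [], "B": [], "C": ["A", "B"]}, sent.append)
controller.start()              # A and B are dispatched in parallel
controller.step_finished("A")   # C still has to wait for B
controller.step_finished("B")   # now C is dispatched
```

Because the controller is the only component that knows the dependency graph, it does not matter which worker node handled A or B: C is sent out only once both have reported in.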
The system can handle recursive job structures and in fact it can handle any type of sub-job: any job can trigger any other job as a child, including itself. This makes it trivially easy to run one component of a job so that you don’t need to go through everything in order to do what needs to be done. It also allows you to remain DRY: you can abstract out certain commonly-used components and use them as a “library” of sorts to compose more complex jobs.
The code is not currently available; however, we are trying to figure out the legal implications of open-sourcing the software. Ideally we’ll figure all this out in the near future, and I’ll be happy to release it for everyone to play with.
Shameless company promotion: If this type of work interests you, send me a message. Like most dev shops, we’re always happy to bring in smart folks.
Vim Relative Line Numbers
A little trick I learned the other day at Montreal.rb: relative line numbering (can be enabled with :set rnu). The current line is numbered zero, and the left-hand column shows each line’s distance from your cursor. This way you can easily do commands like d5k to delete the current line and the 5 lines above it, or d5j to delete the current line and the 5 lines below it.
Asynchronous Function Looping in C#
It is often the case in code that you have to do several things in a sequence since each computation is dependent on the one(s) before it:
// ...
// stuff 1
// ...
// stuff 2
// ...
// stuff 3
// ... etc.
Good software techniques will tell you that you should break some of these up into methods:
stuff1(); stuff2(); stuff3();
If it gets big, you can even put it all in a collection and iterate (we’re starting to get into weird coding now, I don’t think anybody would actually do this):
var collection = new List<Action>() { stuff1, stuff2, stuff3 };
foreach (var func in collection){
    func();
}
Now the part where this would actually be useful. What if some of these functions could potentially be asynchronous? That is, they depend on some value that may not be readily available – maybe user input, maybe some data from a network, etc. Blocking is not usually a great option – a modal dialog demands that the user pays attention to it even if there is something more important somewhere else. It would be better if this computation could “pause” and then resume later on when we get what we need. In some languages including Scheme and Ruby, you can accomplish this using a construct called callcc:
var collection = new List<Action<Action>>() { stuff1, stuff2, stuff3 };
foreach (var func in collection){
    // pseudo-code warning
    call_cc(func);
}
Here, call_cc() will call func and pass in a function which will start executing right after the call_cc() call: it is a continuation of the loop. When func is done (or when it receives the response it wants), it can call this function to nicely continue executing the loop.
Unfortunately, C# 4.0 and lower do not support anything like callcc. C# 5.0 will support the await and async keywords which will accomplish exactly what we want, but for the time being we’ll have to make do with what we have. How can we do that without callcc?
Let’s give it a shot using a recursive function:
void AsyncForeach(IEnumerator<Action<Action>> iter){
    if (iter.MoveNext()){
        iter.Current( () => {
            AsyncForeach(iter);
        });
    }
}
void OtherFunc(){
    // ...
    var collection = new List<Action<Action>>() { stuff1, stuff2, stuff3 };
    AsyncForeach(collection.GetEnumerator());
}
This would require every function in collection adhere to the Action<Action> delegate and when it is done, it will need to call the continuation manually in order to resume the computation. This is a bit annoying, and it’s why all the BeginConnect, BeginSend, etc. in System.Net require an AsyncCallback to call when they are done. The new async and await keywords will be extremely useful to accomplish our task since everything is called automatically:
// inside a method marked async; each stuffN must return a Task to be awaitable
var collection = new List<Func<Task>>() { stuff1, stuff2, stuff3 };
foreach (var func in collection){
    // func doesn't even need to call anything to
    // keep this thing going!
    await func();
}
It is useful to learn this approach, though. Say we want to halt the loop prematurely from within one of the functions. In that case, the function could simply not call the continuation. That would end our recursion, causing us to break out of our loop – the equivalent of the break keyword. In order to do that with the await keyword we’d have to have some sort of exception handling system, or return type, etc.
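For comparison, here is roughly what the "return type" option looks like in a language that already has async and await. This is a Python sketch using asyncio as a stand-in for the C# 5.0 keywords; run_steps, demo and the stuffN coroutines are hypothetical names, and the convention that a step returns False to stop the loop is just one possible choice.

```python
import asyncio


async def run_steps(steps):
    """Await each step in order; stop early if a step returns False."""
    for step in steps:
        keep_going = await step()
        if not keep_going:
            return False   # the equivalent of `break`
    return True


async def demo():
    log = []

    async def stuff1():
        await asyncio.sleep(0)   # stand-in for waiting on input or the network
        log.append("stuff1")
        return True

    async def stuff2():
        log.append("stuff2")
        return False             # ask the loop to stop here

    async def stuff3():
        log.append("stuff3")     # never reached in this demo
        return True

    finished = await run_steps([stuff1, stuff2, stuff3])
    return finished, log
```

Running asyncio.run(demo()) returns (False, ["stuff1", "stuff2"]): the loop body stays simple, but every step now has to agree on what its return value means, which is the extra machinery the continuation version gets for free.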
We could go even further and implement something similar to Python’s for...else construct, which lets a loop react to whether it was cut short: the else block runs only if the loop finishes without break being called:
for i in range(10):
    if i == 12:
        break
else:
    # this is executed: the loop never hits break
    print("should run")
for i in range(10):
    if i == 5:
        break
else:
    # this is not executed: the loop was cut short by break
    print("should not run")
We can do this by adding failure “continuations” to our functions:
void AsyncForeach(IEnumerator<Action<Action, Action>> iter,
                  Action failure){
    if (iter.MoveNext()){
        iter.Current( () => {
            AsyncForeach(iter, failure);
        }, failure);
    }
}
void OtherFunc(){
    // ...
    var collection =
        new List<Action<Action, Action>>() { stuff1, stuff2, stuff3 };
    AsyncForeach(collection.GetEnumerator(), () => {
        // handle failure
    });
}
In this case the functions stuff1, stuff2, etc. will call the first function if they should continue looping, or call the second one in case of failure.
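The same pattern is easy to try out in a scripting language. Below is a small Python sketch of the two-continuation loop; async_foreach mirrors the C# AsyncForeach above, while the pending list and the log are hypothetical helpers added only to show a step pausing and being resumed later.

```python
def async_foreach(iterator, failure):
    """Run the next step; it resumes the loop by calling its first argument."""
    try:
        step = next(iterator)
    except StopIteration:
        return   # nothing left to run
    step(lambda: async_foreach(iterator, failure), failure)


log = []
pending = []   # continuations parked until their answer arrives


def stuff1(proceed, fail):
    log.append("stuff1")
    proceed()                 # finished right away, keep looping


def stuff2(proceed, fail):
    log.append("stuff2 waiting")
    pending.append(proceed)   # pause: nothing runs until proceed is called


def stuff3(proceed, fail):
    log.append("stuff3")
    fail()                    # abort the loop and run the failure handler


async_foreach(iter([stuff1, stuff2, stuff3]), lambda: log.append("failed"))
paused_log = list(log)   # ["stuff1", "stuff2 waiting"]: the loop is paused
pending.pop()()          # the answer arrived: resume where we left off
```

After the last line, log also contains "stuff3" and "failed". Nothing blocks while stuff2 is waiting; the rest of the loop simply lives inside the stored continuation.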
There’s one final tweak to all of this. At the moment there are two problems with AsyncForeach: it depends on the type of the list we’re iterating over (IEnumerator<Action<Action, Action>>), and it does not close over any variables that we may need for the loop. Can we do this using a closure?
In fact, we can:
var collection =
    new List<Action<Action, Action>>() { stuff1, stuff2, stuff3 };
// declare looper early so that it closes over itself
Action looper = null;
var iter = collection.GetEnumerator();
looper = () => {
    if (iter.MoveNext()){
        iter.Current(looper, () => {
            // handle failure
        });
    }
};
// don't forget to start the loop
looper();
Since this isn’t so DRY, we can top it all off with a function that returns a function:
Action GetAsyncForeach<T>(IEnumerable<T> collection,
                          Action<T> body){
    var iter = collection.GetEnumerator();
    return () => {
        if (iter.MoveNext()){
            body(iter.Current);
        }
    };
}
void OtherFunc(){
    // ...
    var collection =
        new List<Action<Action, Action>>() { stuff1, stuff2, stuff3 };
    Action checker = null;
    checker = GetAsyncForeach(collection, (current) => {
        current(checker, () => {
            // handle failure
        });
    });
    checker(); // start the loop
}
We now have a DRY, re-usable component for implementing an asynchronous foreach loop in our code. It’s not the most elegant approach, but it works really well and we don’t need that much extra boilerplate to get this done (if C# supported a let rec keyword, we could make it even shorter!).
This is a useful method of looping through some asynchronous tasks that you may have to do. I found myself needing this sort of thing when calling ShowDialog would lock the entire GUI while the system waited for the user to input something, however sometimes the user would have to attend to something else before responding to the dialog. Since later actions in the loop depended on the result of the dialog box, a more asynchronous method was necessary.
Ultimately, this is why I believe that all programmers should have some experience with functional programming; this is a technique that would be obvious to a programmer in Lisp or OCaml but might be a bit trickier to someone who just has OO experience. Having functional programming know-how in your toolbelt will make you a better C# programmer.
Non-deterministic Programming – Amb
I’ve been very slowly plowing through SICP and I’ve recently read through their chapter on non-deterministic programming. When you program this way your variables no longer have just one value, they can take on all of their possible values until stated otherwise. An example:
x = amb(1, 2, 3, 4)
In this case x is all of 1, 2, 3, and 4. If you try to print out x it will print out 1 because the act of printing it temporarily forces a value, but otherwise you can treat the variable as though it had all of those values.
You can then force certain subsets of the values with assertions:
assert x.odd?
In this case x would become just 1 and 3. If you then added a final assertion that x > 2 you would force a single value and x would be 3. If you instead added the assertion x > 3 then x would have no values: an exception would be thrown saying that x is basically “impossible”.
This is useful when you are searching for something. Suppose you’re trying to find numbers that satisfy Pythagoras’ theorem:
a = amb(*1..10)
b = amb(*1..10)
c = amb(*1..10)
assert a**2 + b**2 == c**2
puts a, b, c
next_value
puts a, b, c
This code would print out 3, 4, and 5 on the first output. The next_value function tells the amb system to find another solution for the set of variables that satisfies the assertions we specified; with the usual backtracking order (most recent choice first) the second output would be 4, 3, and 5, the same triangle with a and b swapped, and 6, 8, and 10 would turn up after that. If nothing else is found, an exception is thrown.
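If you want to play with the search without an amb implementation, a lazy generator gets you most of the way there. The Python sketch below is not real amb (there are no continuations and no assert that rewinds the program); it is plain generate-and-test, with the hypothetical solutions function standing in for the amb variables plus the assertion, and each call to next standing in for next_value.

```python
from itertools import product


def solutions(limit=10):
    """Yield (a, b, c) with a^2 + b^2 == c^2, varying c fastest, then b, then a."""
    values = range(1, limit + 1)
    for a, b, c in product(values, repeat=3):
        if a**2 + b**2 == c**2:
            yield a, b, c


search = solutions()
first = next(search)   # (3, 4, 5), like the first `puts a, b, c`
```

Calling next(search) again asks for the following solution, and a StopIteration exception plays the role of the "nothing else found" error.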
Even more interesting, there is a library in Ruby called amb that implements this stuff. Unfortunately it doesn’t work quite like the above example: you can only get either the first set of values that fit, or all of them:
require 'rubygems'
require 'amb'
include Amb::Operator
a = amb(*1..10)
b = amb(*1..10)
c = amb(*1..10)
# calling amb with no arguments causes it
# to backtrack until the criteria is met
amb unless a*a + b*b == c*c
# prints out the first match
puts "#{a}, #{b}, #{c}"
# prints out every match and then crashes
amb
puts "#{a}, #{b}, #{c}"
This is a pretty cool way to program, and it would be interesting to know when it might be practical to use. I tried it with a couple of problems on Project Euler but unfortunately since the backtracking method that amb uses isn’t always the most efficient approach it would choke a bit. Perhaps if this gem gets some attention and some love, it might end up as something a bit more performant!
The Brilliant Design of Magic Ink
I’ve been plowing through Bret Victor’s Magic Ink essay and I’ve noticed a set of very interesting UI elements that I really enjoy and hope to add to my style of writing (it might even be worth it to add these features to WordPress’ publishing code).
These are:
- Anchors at every paragraph: the essay is long. When I sent it to some of the fellow programmers at work, the first thing they did was complain about how long it was. Since you don’t usually read a 50 page essay in one sitting, you need a way to bookmark not only the page you are on (useless since everything is on one page) but where you are on that page. Victor takes advantage of HTML’s ability to set anchor points within a page to anchor every paragraph, so that when you don’t want to read any more you can just click on the hash next to the paragraph (only visible when you mouse over the paragraph) and it will update your address bar to anchor to that paragraph. This is not exactly fancy modern technology; the entire feature just takes advantage of named anchors and the fact that when you click on one the browser doesn’t reload the page. Extremely useful for long essays, yet I can’t recall seeing this anywhere else.
- Footnotes/endnotes are actually sidenotes. It’s a bit annoying when you read an essay online that has footnotes/endnotes with a star or a cross or something and you have to scroll all the way to the bottom to see those notes (some posts make it easier by providing a link, but as vi users know, moving your hand to the mouse is a pain). It takes effort, and there is a disconnect between what you are reading and navigating to wherever the note is. By putting the note on the side next to the paragraph, it is much easier to just move your eyes to look at the note rather than scrolling or clicking. This comes at the cost of screen real estate; however, if you make the actual content of your essay thinner you don’t lose a whole lot, since there is still a flow when you’re reading line-to-line.
I really like these UI tweaks because they are so incredibly simple, yet somehow are not very common practice. If ever I write something that is long and actually decent enough to slog through I hope to remember these little features to help people better read my stuff.
Thoughts on Stanford’s Online Courses
Last year I enrolled in a couple of the extremely popular Stanford online classes: Intro to AI and Machine Learning. While many other people who took the courses also wrote on the topic I feel like none of the descriptions quite matched how I feel about the courses, so I think I will talk about them a little bit here.
First, the good things:
- The connection – I feel like in a course like this you have a much closer connection to the professor than you do in recorded videos like the MIT OpenCourseWare lectures or even a real classroom (with the exception of smaller class sizes) since the professor is talking directly to the camera or the camera is filming some sort of drawing surface. The professors were all extremely passionate about their topics and talked about them with so much enthusiasm (especially Sebastian Thrun) that I couldn’t help but feel more enthusiastic about the topics as well – although I didn’t really need the help, I found the courses very interesting! Despite the fact that in reality the video method of teaching is extremely impersonal, it seems much more personal because it is like a one-on-one environment where the professor is talking just to you instead of an entire classroom. Most of the time this makes you feel like you have to pay attention! Compare this to a class of 100+ students where if you fall asleep, zone out, or start doodling it’s not likely the professor will notice or care.
- The flexibility – Picture scenario one: you’re sitting in your living room in a comfy chair covered in a warm blanket sipping delicious beer. You’re watching the online videos on your laptop while occasionally pausing the video to go get a snack, do some chores, or check on dinner.
Now picture scenario two: you’re sitting in a hard plastic chair in a windowless room dimly lit by the projection of PowerPoint slides onto the wall (you’ve all had a class like that, don’t lie) at 8 in the morning. Some guy is chewing loudly in the seat behind you while someone, somewhere is rapidly tapping their pen against the desk.
Which environment would you prefer? While these are two extreme examples, personally I would prefer the former.
- The technology – I found the websites modern and very easy to navigate, with no clutter distracting you from what you were looking for. The two main hiccups that I noticed with the technology:
- With the AI class the videos wouldn’t always skip ahead to the next video or to the quiz – probably because I was watching the videos under Linux. The first time it happened it took a little while to realize that you could click on the question mark to access the quiz.
- With the ML class the video would appear on a “pop-up” div inside the window and the background would fade out, but if you clicked on the background it would close the video. This was annoying when you flip to another window but then click on the browser to re-focus it and end up killing the video.
These aren’t huge problems, just little bugs or UI quirks that are expected with new software. I really enjoyed the submission system for the ML class where you could submit your assignments directly from within Octave by typing submit. I’ve had some programming classes do this type of thing and it really makes it easy to get your work done.
Some things I found so-so:
- The content – I’m going to be a little harsh. The content was very interesting for both courses, however I found myself disappointed with the depth of the knowledge. It seemed like we were only scratching the surface of the topics without going into the detail or rigour that I would expect from a full university course. Then again, this is an introductory course and it has been a long time since I took an introductory course to anything. Maybe I’m just biased.
Inevitably, the downsides:
- The assignments – I’m a person that learns by doing. I can see the things in the lecture and understand them fully, but everything tends to go in one ear and right out the other. When it comes to applying the concepts learned in the lecture I need to do it myself before the material really sinks in. In the AI class I found that the only assignment/quiz that I really connected with was the optional assignment for a simple decryption problem using NLP and statistics that was presented at the very end of the course. The rest of the assignments were just quizzes and while you do learn something by having to answer questions, I felt that the learning wasn’t quite as deep on these topics as it could have been if we were actually supposed to program something.
With the ML class they actually did have programming assignments in addition to quizzes; however, with those I found that most of the time all you really had to do was translate the equations written in the assignment description into Octave and it would work. You weren’t really learning and understanding what was going on, you were essentially just copying and pasting. I feel like I would have learned a lot more if the assignments had a bit more challenge to them.
- Lack of interaction – Some of my favourite classes involved a good deal of interaction with the professor and/or the other members of the class (seminar or lab classes come to mind). In these types of classes the interaction component is a huge benefit to learning compared to the StackOverflow-style Q&A forums that they had for the AI class. Because of the online nature of the courses, there isn’t really a solution to this problem that I’m aware of.
I will most certainly continue to spend free time taking these online classes. Unfortunately, there are too many courses that I would like to take and not enough time to take them (I had this same problem back when I was in university). One thing that would be great to see is a Khan Academy style approach where you can take the class whenever you want instead of just during the semesters that they are offered.
In the very unlikely chance that any of the Stanford profs read this, thanks! I really have enjoyed the classes and will continue taking them for a long time yet.
Dynamic Pictures
At CUSEC 2012 a programmer/designer by the name of Bret Victor gave one of the most interesting presentations at the conference (he received a standing ovation for it) on the importance of having visual connections between the code you write and what that code does.
Among other things, he showed a very interesting prototype where when you modify some JavaScript code it will execute the code in real-time and show you exactly what will happen when you make those changes. On top of that, it provides a number of tools for “tweaking” constants: click on an integer and a slider will pop up above it that will modify that integer. As you move the slider the number will change, updating the visual display at the same time. Press the ‘.’ after an object and it will give you an autocomplete list, but unlike every other autocomplete system it will actually show you what will happen if you call this method. Select drawRect and you will see a little black rectangle appear. Select arc and a little black circle will appear.
The demos worked amazingly well; however, he admitted afterwards that they weren’t based on a working program, just on proof-of-concept demos that don’t actually work outside of his presentation. That inspired me to attempt building some of this stuff to make it easier for people to build “dynamic pictures” – that is, pictures that change over time and respond to what the user might want to do. This is a non-trivial task because it not only involves processing the JavaScript code live, but also determining what might be “good” values in each case. For example, if you were adjusting numbers in context.lineTo() you might want to go between 0 and the width/height of the canvas, but if you were in context.arc() you would want to be adjusting angles. It would involve some kind of annotations on functions to determine what the valid ranges of values are for each argument.
You can check out the basic prototype here (warning: it is very basic at the moment) and see the code on GitHub here. When you enter code in the right-hand panel it will execute the code, and any canvas drawing done will appear in the left-hand side.
World Peak Oil
In my last post I wrote a bit about Canadian oil production vs. our proven reserves and gave some projections on how long our reserves might last. Our government says that our oil reserves will last 200 years, however I showed that that is subject to the fragile assumption that there will be no increase in production during that time. The post further shows that over the last 30 years Canadian oil production has averaged about 2.65% growth per year, so using that as a baseline for future oil growth the length of time until Canadian reserves run dry is about 70 years.
This post will blow all that out of the water and show that 70 years is much longer than a more realistic estimate.
I picked up some data from the CIA World Factbook on global oil consumption and proven reserves. From this we can see that world consumption is about 36.75 billion barrels of oil per year, and the total proven reserves is about 1.48 trillion barrels. If we do the simple calculation of just dividing reserves by consumption we get just over 40 years. This means that if oil consumption does not grow any more, we will run out of all the oil we have proven to exist in 2051. This seems like a long time away, however it is within one human lifetime: I will turn 65 in 2051.
That is assuming that there will be no growth in consumption during those 40 years. Looking at this page the average growth rate in oil consumption is 1.18% per year (using a geometric mean since this is a growth rate). If we predict that oil consumption will continue to rise at this rate, we will run out of our reserves a bit sooner: just under 33 years from now, sometime in 2044. This is less than half of our prediction from the Canadian example, and one sixth of the Canadian government’s prediction of how long our oil supplies will last.
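The arithmetic behind both figures fits in a few lines. The Python sketch below is not the R code from the post; it assumes consumption grows continuously at rate g, so cumulative use after T years is P(e^(gT) - 1)/g, and solves that for T. The function name years_until_depleted is made up for the example.

```python
import math


def years_until_depleted(reserves, annual_use, growth=0.0):
    """Years until cumulative use equals reserves, with use growing continuously."""
    if growth == 0:
        return reserves / annual_use
    # Solve reserves = annual_use * (exp(growth * T) - 1) / growth for T.
    return math.log(1 + growth * reserves / annual_use) / growth


# Figures quoted above: 1.48 trillion barrels of reserves,
# 36.75 billion barrels consumed per year.
world_flat = years_until_depleted(1.48e12, 36.75e9)             # about 40.3 years
world_growing = years_until_depleted(1.48e12, 36.75e9, 0.0118)  # about 32.9 years
```

Under these assumptions the flat case gives just over 40 years and the 1.18% growth case just under 33, matching the numbers in the text.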
How is it that the amount of time here is so much smaller than in the Canadian case? Why is it that the world oil reserves will somehow be drained sooner than the Canadian ones, a logical impossibility? The main culprit here is our assumption that Canadian oil production will continue to grow at 2.65% and not something higher. If you look at global oil production by country you can see that four countries (Russia, the USA, China and Iran) all have much higher production than Canada but all have lower proven reserves; they will run out of their oil far sooner than we will. When that happens, consumers around the world who want oil will have to turn to somebody else. Will our 2.65% per year increase in production be able to satisfy those consumers once other sources start running out?
As in the last post, I am stressing that this is not driven by ideology. There is no liberal or conservative bias here; it is just numbers from reasonably good sources and simple, widely known mathematics. You can repeat this analysis yourself and see that you will come to the same numbers. The R code for this one is available here; you’ll need the data that I listed above, and you’ll need to convert the CIA files from fixed-width format to CSV format.
Canadian Peak Oil
One thing that interested me back in my university days was peak oil – the idea that at some point our ability to produce oil will peak and then decline. It seemed very logical to me that given a finite resource and an ever-increasing demand for it, we will eventually run out. Given the drastic consequences that could follow if we ran out of this resource that seems to be used everywhere, it seemed like a good idea to be concerned about this problem and try to figure out how to at least minimize the damage should this disaster scenario come sometime soon.
Today I’m marginally wiser than I was then, so I like to verify for myself whether the things I hear through the media are true or not. I decided to look up some of the details about this whole peak oil thing and wonder, “when would this whole thing start going down?”
I looked on Natural Resources Canada’s web page to see some information about oil production. I found this tidbit which says, “Canada’s oil reserves are sufficient to meet demand for the next 200 years at current rates of production.” Nice! I suppose that means I don’t have to care, and let the future generations figure out how they are going to deal with this problem.
Right?
Well, maybe not. The first thing we want to do is verify that this number is actually correct. After all, while we know that the government would never lie to us, it’s always a smart idea to figure things out for yourself in case somebody over there made a mistake.
To figure out our rate of extraction, this FAQ tells us that Canada currently produces about 2.5 million barrels of crude oil per day (consistent with some data that is mentioned below). To figure out how much oil Canada has proved to have, a 2010 report by the US Energy Information Administration tells us that Canada currently has about 178 billion barrels of proven oil reserves that can be extracted, including the oil sands. If you take this number and divide it by the amount of oil we produce each year (2.5 million barrels per day × 365 days ≈ 0.91 billion barrels), we get approximately 195 years of producing oil; the number given by the government website is more or less correct.
Does this mean all is well for the next 200 years? Well, not really. See, the bit that breaks this whole analysis is right in the quote taken from the government’s FAQ: “at current rates of production.” The 200 year figure assumes that Canadian production of oil will not grow at all over the next 200 years. Intuitively that doesn’t seem right, but let’s be scientific about this. Maybe it is in fact true that oil production is constant.
Doing a bit of a search with my trusty sidekicks Google and Wikipedia, I found some data here on the website of the Canadian Association of Petroleum Producers that shows the yearly oil production in Canada from 1971 to 2010. Rather than describe it to you, here’s a picture:
If the government’s assumption were true, this graph would be a flat line at 2.5. As you can see it is definitely not; in fact, it seems to be increasing at a fairly quick rate!
What happens to our numbers if we start increasing oil production each year? Let’s find out.
With a linear model, we get that it is increasing by about 0.0336 Mbbl, or 33 000 barrels of oil per year. If this trend continues, it means that the reserves will run out in…194 years. So one year less than if we assumed it would be flat.
Of course, typically in economics it is assumed that economies grow exponentially, not linearly. Since both population and productivity due to technology seem to increase at exponential rates, it is a fairly safe assumption to make that production of anything increases at an exponential rate provided nothing stops it.
When we use an exponential model, we can calculate that the oil production is growing at about 1.68% per year. If we extrapolate this out, it turns out our oil supply will run out after…87 years. That’s a bit shorter than the last time frame! Even scarier for the people living in that time, we will use up the first half of our current amount in about 60 years, with the second half being used up in a short 27 year time frame! So in short, if production keeps up as it has been, then our grandchildren (assuming you’re my age, I’m 25) will see the day when the known oil deposits of Canada will run dry.
In fact, there’s another problem with this analysis. The data set we’re using includes the 1970s, with a massive oil boom followed by a collapse in the mid-70s that lasted until about 1980. If we re-run the exponential model after dropping the 1970s (so we are left with 30 years of data) the rate of increase goes up to 2.56% with an R² of 0.98 (meaning a near perfect fit). With that rate of increase, our oil will run out in 70 years. This is a lot closer than the 200 years given by NR Canada.
I tried to leave out any speculation from this post and just use numbers and math based on what has been happening and from those numbers, make a simple extrapolation to see what would happen should the existing trends continue. The growth rate of 2.56% per year may change in the future (up or down), which obviously would change the results of this simple analysis.
To add my own little bit of speculation, from what I’ve seen in the news I think that this rate of production will actually increase over time given the attitudes of the current Canadian government – as more projects such as the Keystone pipeline get completed it will be even cheaper to transport oil into the United States and other countries from Canada. An economics 101 class will tell you that as the efficiency of supply increases, the amount of production will increase: as Canadians get better at shipping the oil to other places, more oil will be produced to ship.
As always with my statistics-oriented posts, you can see the code here. In the code you can see I did a linear model with the 1970s excluded but didn’t talk about it here; that is because the exponential model fits better with both the data and economic theory.
Rob's Blog


