I Can Predict Earthquakes Too!
Appendix Section 2 - Comparison of EQDB to Stan Deyo's 'Forecasts'
copyright © 2006-2009, Brian Vanderkolk
Overview

The primary driving force for me to develop the Earthquake Dart Board was as a rebuttal to the earthquake forecasts of Stan Deyo. In this appendix I cover the method by which I determined the criteria for my predictions based on an analysis of historical maps made by Stan, and an analysis of Stan's success rate for the same period as my first 90 days of predictions.

Determining my prediction criteria

In order to do a proper comparison, I needed the characteristics of my predictions to be as similar as possible to Stan's. I needed a similar number of predictions per day, for the prediction windows to be the same length, and for the size of the prediction circles to be similar. However, most of these parameters varied from day to day on Stan's maps, so I decided to go by a long-term average.

To determine the average characteristics of Stan's forecast maps, I downloaded 209 days' worth of maps - those from 5-1-05 through 10-22-05. This date range is actually more than 209 days, but there were several days where Stan did not publish any maps, or they simply were not archived.

Time Window

The length of all my prediction windows is the same as the length of Stan's - 5 days. This is as stated on each of Stan's maps.

Number of predictions per day

To determine how many predictions per day I should make, I simply determined the average number of circles per day that were on Stan's maps. This actually was not as simple as just counting circles. Although most of Stan's areas were circles or ellipses, several of the areas were formed from compound shapes. That is, two or more circles overlapped. Due to the method Stan used to draw the circles, only the outline of the compound shape is shown. I decided that such areas would not be counted as one, but rather as the sum total of the individual circles making the final shape.
Now, it was simply a matter of tallying up the total number of circles per map, adding it all together, and dividing by the number of maps. The actual number of circles per map ranged from 0 to as many as 29. The average was about 11.85 circles per day. Therefore, I decided that my maps would have exactly 11 predictions per day, every day, thus being similar to, but actually less than, the historical long-term average of Stan's maps.

Size of prediction circles

The size of Stan's circles varied quite a bit. Some were quite small, only a few hundred kilometers across. On occasion there were behemoths well over 4000 kilometers across. However, most circles were about the same size, typically around 2000 km in diameter, or 1000 km in radius. Therefore I chose all my circles to be exactly 1000 km in radius.

Determining hits

Determining which quakes were hits or not should be straightforward. Either it is a hit or it is not. Unfortunately, this is difficult to determine for Stan's maps. At first glance, one would expect that for the quake to be a hit, it should fall within the boundary of the mapped circles. However, Stan seems to employ a technique which I have called "close enough".

For example, on October 8th, 2005, a magnitude 7.6 quake struck the Hindu-Kush region of northern Pakistan. Stan claimed this quake as a hit. Since Stan's maps are good for five days, it was easy to examine previous maps to see if there was indeed a circle covering the region. Unfortunately, I could find none. The closest circle, which was posted two days prior, would have to have nearly doubled in size in order for the quake's epicenter to be inside of it.
I contacted Stan via email to express my concern over this claimed hit. Stan's reply included the following statements:
However, I chose to avoid such ambiguous criteria and decided to clearly state in the rules for my predictions that the boundary is defined as exactly 1000 km distance from the center of my circles, which are given as latitude and longitude coordinates, and the formula for determining the distance between the prediction and the epicenter is given as well. That way there is no room for any sort of "close enough" strategies. This "close enough" strategy plays a significant role in comparing my results for the first 90 days of my predictions to Stan's results for the same time period. But more on that in the comparison section below.

Magnitude criteria

Most of Stan's claimed hits were for the larger quakes. Since he makes reference to the USGS recent quakes website, and the quakes listed there are greater than magnitude 4 globally, I decided to make my quake predictions for any quake of magnitude 4 and above. Stan has also claimed hits for quakes as small as magnitude 2.8, but I stayed with the criterion of M4.

Checking Stan's Results

Stan does not make a habit of publishing the results of his own predictions, rather just making claims for significant quakes as they occur. Therefore it was up to me to determine the accuracy of his claims. It was not an easy task to do so. Unlike my predictions, which were clearly defined, all I had to go on for Stan's predictions were the maps themselves. To avoid as much subjective judgment as possible, which could lead to questions about the fairness of my analysis of Stan's maps, I decided to automate as much of the process as possible.

First was dealing with the problem of "close enough". I have no idea what goes on in Stan's mind. I cannot determine objectively what Stan would consider a "close enough" hit or not. Therefore, I decided to eliminate the problem entirely. This also helps to balance the comparison of my results to his.
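The 1000 km boundary rule and the M4 threshold described above lend themselves to a short sketch. The text says a distance formula is given in the prediction rules but does not reproduce it here, so the haversine formula below is a stand-in, not necessarily the one actually used. The Earth radius value, the function names, and the choice to count a quake exactly on the boundary as a hit are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value, not from the text)
HIT_RADIUS_KM = 1000.0    # circle radius stated in the text
MIN_MAGNITUDE = 4.0       # magnitude threshold stated in the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def is_hit(center, epicenter, magnitude):
    """True when the quake is M4 or larger and no more than 1000 km from
    the circle center. Counting the boundary itself is an assumption."""
    distance = great_circle_km(center[0], center[1], epicenter[0], epicenter[1])
    return magnitude >= MIN_MAGNITUDE and distance <= HIT_RADIUS_KM
```

With a rule like this, a quake is either inside the stated radius or it is not, so no judgment call is involved. The five-day time window would need a separate check, which this sketch leaves out.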
I do not claim any hits for quakes outside my circles; therefore I will not consider any quakes outside of his circles. Automated checking of hits by a computer program is a logical way to eliminate any guessing on my part. It also makes the process go much faster.

To begin with, I needed a way to tell the computer where Stan's circles were. Unlike my method, where I state a latitude and longitude, all I had were some circles drawn on a map in Photoshop. Since image editing software was used to draw the circles, it was easy for me to use such software to create an index map for the computer to use. It is obvious that Stan used a shape tool to draw the circles and ellipses on his maps. I used the same tool, but made sure the circles were solid, and not just outlines. I also did not use any anti-aliasing to make soft edges. I made my circles just large enough to completely cover Stan's circles. I did this on a separate editing layer so that I could then black out the background, leaving only two colors on the map - one representing areas that were outside of any predictions, and another representing areas that were in or on one of Stan's prediction circles.
Now that I had my index map, all the computer had to do was select a map, go through the list of earthquakes from the NEIC, compute the position on the map for each quake's epicenter, and determine the color at that quake's location. If the color was black, it was a miss. If it was green, it was counted as a hit. The computer automatically kept track of all the data and reported the results. This took care of determining the number of candidate quakes and the number of hits for Stan's maps for the time period in question. This gives the hit ratio.

Determining the successful prediction ratio still required a little work on my part. The program was designed to take the index map and generate a copy of that map with circles plotted where the hits were. It was then a simple task for me to view each of these maps to determine the number of predictions that had hits. I had also counted the number of predictions made, and the ratio of these two gives the prediction success ratio.
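The index-map lookup described above might be sketched as follows. This is an illustration only, not the original program. It assumes an equirectangular world map (the projection of Stan's maps is not stated in the text), stores the map as a plain grid of color codes rather than an image file, and uses hypothetical names throughout. Selecting which quakes fall inside a given map's five-day window is left out.

```python
# Hypothetical color codes for the two-color index map.
BLACK = 0  # outside every prediction circle
GREEN = 1  # in or on a prediction circle


def latlon_to_pixel(lat, lon, width, height):
    """Convert degrees to (column, row), ASSUMING an equirectangular map
    spanning longitude -180..180 and latitude 90..-90."""
    col = int((lon + 180.0) / 360.0 * width)
    row = int((90.0 - lat) / 180.0 * height)
    # Clamp the far edges (lon = 180, lat = -90) onto the last pixel.
    return min(col, width - 1), min(row, height - 1)


def hit_ratio(index_map, quakes):
    """Return (hits, candidates, ratio) for a list of (lat, lon) epicenters."""
    height, width = len(index_map), len(index_map[0])
    hits = 0
    for lat, lon in quakes:
        col, row = latlon_to_pixel(lat, lon, width, height)
        if index_map[row][col] == GREEN:
            hits += 1
    ratio = hits / len(quakes) if quakes else 0.0
    return hits, len(quakes), ratio


# Tiny 8x4 stand-in for an index map, with one green "prediction" pixel.
demo_map = [[BLACK] * 8 for _ in range(4)]
demo_map[1][5] = GREEN

# One made-up epicenter lands on the green pixel, the other on black.
demo_result = hit_ratio(demo_map, [(20.0, 60.0), (-30.0, -100.0)])
```

Because the lookup reads a single pixel, a quake either lands on a filled area or it does not, which is what removes the "close enough" judgment from the comparison.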
My next concern is with the time window. It's a given that the time window is five days long. It says so right on his maps. I presume there is no fudge factor involved here. That is, if a quake occurs six days later, it is not counted as a hit by Stan even if it falls within one of his circles. I've never noticed this occurring. However, I have no clue when his time windows start. I clearly define my windows down to the second. I only state the time to the minute, as the seconds are assumed to be zero.

On Stan's website, he states that the data on which he bases his maps "generally becomes available by 3:30pm Mountain." Further, he states that his "analysis should be uploaded by 6pm daily." I presume he means Mountain time there as well. So, generally, his maps are posted sometime between 3:30pm and 6:00pm Mountain time. I don't know when his maps are actually uploaded, so admittedly I am making a generalized assumption, but for this analysis I decided to assume that all his maps start at 6:00pm Mountain time, which during the time frame involved is 0100 UT the next day. All my maps are marked in UT, and the analysis software works with UT time and Julian days. Conversion is a simple matter.

The Results

I'll just let the numbers speak for themselves. Here are the results:
Summary

Stan Deyo claims a high success rate for his forecasts. I don't know how he can claim this level of success when he doesn't even check for the hits himself! All he does is make a point of the few significant quakes that he happens to hit after checking places like the NEIC near-real-time quake list. As he says in an email:
But does Stan care? Probably not. He still gets all the attention he wants by making his warnings and getting himself heard on late night talk radio shows. I think the following lines from a couple of emails sum up his attitude:
Analysis Details

This is the detailed analysis for each of Stan's 88 maps during the first 90 days of my EQDB project. There were two days on which he did not publish maps.