Wednesday, October 1, 2014

Analysis of short-centered PCA

There has been a lot of to and fro (see recent posts) about short-centered PCA - MBH style (background). My general view here is that there is an effect on the first principal component, but that merely moves to a particular alignment of basis vectors, and doesn't affect any reasonable reconstruction. And it all happened so long ago...

But still it's an interesting analytic problem, so I did some analysis. It particularly concerns the role of persistence, or autocorrelation. Increased autocorrelation tends to show more effect, because for a given number of proxies, it diminishes the effect of noise relative to the underlying short-centering operator (graphs below). But that underlying pattern is interesting, because it is much less dependent on autocorrelation. And the next PCs in succession are interesting, as I wrote about here.

Tuesday, September 30, 2014

What Steve McIntyre won't show you - now

but he did once. It's what the end effect is of all these interminable claims about mining for hockey sticks etc. What it actually does to a reconstruction.

That was shown in MM05EE, their 2005 paper titled "The M&M critique of the MBH98 Northern Hemisphere climate index: update and implications". I've shown that plot in an appendix here, and I'll show it again below. But for here, I'll show the plot with the MBH decentered and M&M centered superimposed.



Update - this graph changed slightly, as Brandon noted. It comes from an earlier post comparing what happens in 1400-1450 with and without Gaspe cedars. The first version posted was with Gaspe, which corresponds to MBH, and so is more appropriate, in my view. The second agrees, as you can see, with M&M. Since the difference doesn't affect my point, I'll let that stand. The change was not intended - I was widening the smoothed curves for visibility. The first version is here.
The agreement is very good between, say, 1800 and 1980. These are the years when all the alleged mined hockey sticks should be showing up. They aren't. There is a discrepancy in the earlier years, which DeepClimate explains here. When Wahl and Ammann corrected various differences between the M&M emulation and MBH99 (in their case), this early period discrepancy disappeared. But anyway, for now what is important is that both centered and uncentered agree very well in the period where decentering is supposed to be mining for hockey sticks. Steve Mc has said very little about this recon since it was published, so much so that when Wegman was pressed (properly) by Rep Stupak at Congress on why he hadn't shown the results of a corrected calculation, he had no good answer, and didn't refer to M&M2005, even though he was supposed to be familiar with the code (which he had) which did it. I think this has become a very inconvenient graph.

I described earlier how apparent PC1 alignments have an effect on reconstruction that disappears rapidly as more than one PC is used. I'll give below a geometric explanation for this, which explains the irrelevance of MM05 fig 2, which resurfaced as Wegman's fig 4.2.

Sunday, September 28, 2014

More ClimateBall at Climate Audit

Steve McIntyre has a new post up at ClimateAudit. It's called "What Nick Stokes won't show you". It's a continuation of the smokescreen about demanding the unselected PC1s be shown with orientation favorable to a hockey-stick interpretation, using a hockey stick index (HSI), rather than as his program produces them. Again pretending that it's about Wegman aligning the orientation, rather than selecting the top 1% by HSI without disclosure.

He's made some pretty outrageous claims about people here. I'll quote: "Some ClimateBallers, including commenters at Stokes’ blog, are now making the fabricated claim that MM05 results were not based on the 10,000 simulations reported in Figure 2, but on a cherry-picked subset of the top percentile."

I wrote a substantive response to this, soon after the post appeared. It went into moderation - I've posted the text here. About four hours later it disappeared from the queue; I don't know what is happening there. Steve says he'll look in the morning. (It's here).

I reran the code to get some quantitative HSI numbers for the various cases, matching those described in detail here, and pictured here. It's a new run, not exactly matching. Here are the numbers, matching the cases described. For unselected sets, it's the mean absolute value HSI:

                                   Decentered PCA (MBH98)   Centered PCA
Selected 100 out of 10000 by HSI   1.98                     1.60
Not selected                       1.61                     0.65

The mean of 1.61 for unselected decentered (MBH) matches the mid-range figure in Steve's post. The difference between 1.98 and 1.61 made by selection may not seem so great, but these are like t-values. And it shows in the fact that centered but selected (1.60) has almost the same mean HSI as decentered unselected (1.61). The undisclosed selection is about as effective in creating HS appearance as decentering.

Emphasising the compression of the t-like HSI scale, centered unselected, which shouldn't have any HSI effect, and don't seem to, still show a mean (absolute) HSI of 0.65.
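For anyone who wants to see how a table of this kind is put together, here is a minimal sketch. It is not the MM05 code that produced my numbers. It assumes the HSI is the calibration-period mean minus the full-period mean, divided by the full-period standard deviation, and it uses plain AR(1) noise with parameter choices of my own (rho of 0.9, 50 proxies, 581 years, a 79-year calibration period, and only 200 runs), so its output will not match the table. All function names are hypothetical, and ranking by absolute HSI is a simplification.

```python
import numpy as np

def ar1_proxies(rng, n_years, n_proxies, rho):
    """Simulate independent AR(1) series, one proxy per column."""
    x = np.zeros((n_years, n_proxies))
    x[0] = rng.standard_normal(n_proxies)
    for t in range(1, n_years):
        x[t] = rho * x[t - 1] + rng.standard_normal(n_proxies)
    return x

def pc1(x, calib, short_centered):
    """PC1 as a time series. Short centering subtracts the mean of the
    last `calib` rows only (the MBH rescaling step is omitted here)."""
    ref = x[-calib:] if short_centered else x
    u, s, vt = np.linalg.svd(x - ref.mean(axis=0), full_matrices=False)
    return u[:, 0]

def hsi(series, calib):
    """Assumed HSI: calibration mean minus full mean, in full-period sd units."""
    return (series[-calib:].mean() - series.mean()) / series.std()

def mean_abs_hsi_table(n_runs=200, n_years=581, n_proxies=50,
                       calib=79, rho=0.9, keep=0.01, seed=0):
    """Mean |HSI| of PC1, for all runs and for the top `keep` fraction."""
    rng = np.random.default_rng(seed)
    values = {"decentered": [], "centered": []}
    for _ in range(n_runs):
        x = ar1_proxies(rng, n_years, n_proxies, rho)
        values["decentered"].append(abs(hsi(pc1(x, calib, True), calib)))
        values["centered"].append(abs(hsi(pc1(x, calib, False), calib)))
    n_keep = max(1, int(keep * n_runs))
    table = {}
    for name, vals in values.items():
        ordered = np.sort(np.array(vals))
        table[name] = {"selected": float(ordered[-n_keep:].mean()),
                       "not_selected": float(ordered.mean())}
    return table

table = mean_abs_hsi_table()
for name, row in table.items():
    print(name, round(row["selected"], 2), round(row["not_selected"], 2))
```

The point of laying it out this way is that decentering and selection are two separate switches, so each can be turned on or off independently, which is what the four cells of the table correspond to.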

Anyway, below the jump I'll show various plots relevant to Brandon's contention that they should be oriented.

Friday, September 26, 2014

There's more to life than PC1

There's PC2, PC3, ...

Recent interest in PCA and paleo has got me doing some stuff I should have done a while ago. I think it is bad that Steve McIntyre and Wegman have been able to maintain focus on just the first component PC1, leading people to think they are talking about reconstructions. They aren't, and that's why, whenever someone actually looks, the tendency of Mann's decentered PCA to make PC1 the repository of HS-like behaviour has little effect on recons. I'll show why.

Steve's post showed Fig 9.2 from the NAS report as an example of an upright PC1. That's got me playing with the NAS code that generated it. It's an elegant code, and easily adapted to show more eigenvalues, and do a reconstruction. So I did.

Mann pointed out many years ago that M&M had used too few PCs in their recon. Tamino explained that PCA simply created a different basis, aligned to some extent with real effects, which may be physical. But there is conservation involved - if HS behaviour is collected in PC1, then it is depleted in PC2, PC3 etc, and in the recon, it averages out.

And it does. For the NAS example, I'll show how the other PCs do have complementary behaviour, and since the HS effect of decentering isn't physically real, but drawn from other PCs, it doesn't last when you use more PCs, as they do.
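The conservation argument can be checked numerically with a toy example. This is a sketch under my own assumptions, not the NAS code: the "reconstruction" here is just a least-squares fit of a stand-in target on an intercept plus the first k PCs, calibrated on the last `calib` years and applied to the whole period. The proxy matrix, target, sizes, and function names are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_years, n_proxies, calib = 200, 8, 40

# Hypothetical proxies (random walks) and a stand-in calibration target
proxies = rng.standard_normal((n_years, n_proxies)).cumsum(axis=0)
target = rng.standard_normal(calib)

def pc_series(x, short_centered):
    """PC time series; short centering uses the mean of the last `calib` rows."""
    ref = x[-calib:] if short_centered else x
    u, s, vt = np.linalg.svd(x - ref.mean(axis=0), full_matrices=False)
    return u * s

def recon(pcs, k):
    """Fit target on intercept + first k PCs over the calibration rows,
    then apply the fitted coefficients to every year."""
    design = np.column_stack([np.ones(n_years), pcs[:, :k]])
    coef, *_ = np.linalg.lstsq(design[-calib:], target, rcond=None)
    return design @ coef

decentered = pc_series(proxies, True)
centered = pc_series(proxies, False)

# Largest difference between the two reconstructions, for k = 1 .. n_proxies
gap = [float(np.abs(recon(decentered, k) - recon(centered, k)).max())
       for k in range(1, n_proxies + 1)]
print(gap)
```

With every PC retained, the two sets of PCs plus an intercept span the same space, because the decentered matrix differs from the centered one only by a constant added to each column. So the last entry of `gap` is zero to rounding error, whatever decentering did to PC1.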

Thursday, September 25, 2014

ClimateBall at Climate Audit

There's a post at Climate Audit on Kevin O'Neill's comments exposing aspects of the Wegman report. I would like to respond there, but am currently not able to. All my comments go to spam, and at CA, they don't re-emerge.

I'll say a little about this situation. It affects my interaction with all Wordpress blogs. Last month I was temporarily banned at WUWT, in circumstances I describe here. The mechanism is that I was designated a spammer, and my comments went to spam. After a week or so, I tried commenting again, but same result. This apparently was picked up by Akismet, and my comments at CA started going into moderation, then into spam. Same at other Wordpress blogs.

I can comment using my Twitter ID, but CA does not allow that. WUWT nominally does, but my comment was removed because Twitter substitutes my Twitter address for the email address. So I'm out there too.

Anyway, back to CA. Back in 2010, DeepClimate noted some strange features of the Wegman report. There was much plagiarism, but also the statistics had some very odd features. One concerned the trumpeted claim that Mann's algorithm would create hockey sticks out of red noise input. Wegman showed a dozen profiles generated by red noise. He said in the caption to Fig 4.4:

"One of the most compelling illustrations that McIntyre and McKitrick have produced is created by feeding red noise [AR(1) with parameter = 0.2] into the MBH algorithm. The AR(1) process is a stationary process meaning that it should not exhibit any long-term trend. The MBH98 algorithm found ‘hockey stick’ trend in each of the independent replications."

As DC found, what they had actually done, using M&M's code, was to do 10000 runs with red noise input, select the top 100 by hockey stick index, and then select randomly from that 100. I described the consequences of this here. I showed, inter alia, that selecting that way gave hockey sticks whether you used Mann's off-centre PCA or centered PCA.

Brandon Shollenberger responded by trying to move the goalposts. The selection by HS index used by Wegman had the incidental effect of orienting the profiles. That's how DC noticed it; the profiles, even if Mann's algorithm did what Wegman claimed, should have given up and down shapes. Brandon demanded that I should, having removed the artificial selection, somehow tamper with the results to regenerate the uniformity of sign, even though many had no HS shape to base such a reorientation on. And so we see a pea-moving; it's now supposed to be all about how Wegman shifted the signs. It isn't; it's all about how HS's were artificially selected. More recent stuff here.

So now Steve McIntyre at CA is taking the same line. Bloggers are complaining about sign selection: "While I’ve started with O’Neill’s allegation of deception and “real fraud” related to sign selection,...". No, sign selection is the telltale giveaway. The issue is hockey-stick selection. 100 out of 10000, by HS index.

Update. It seems that if I disown my WP id, and change my name slightly, I advance at CA from the spam bin to the moderation queue (probably as a first time commenter). That can be a long wait too, but we'll see.
Update. In comments, Rachel from "Engineering Happiness" made a helpful suggestion about contacting Akismet. I followed advice, and someone emailed me. Not solved yet, but we're working on it. Thanks, Rachel.

Monday, September 22, 2014

Mesh peel

I've long been interested in what can be done with irregular triangular meshes. And lately I've been using them a lot, in apps like this. If you have a set of data on the Earth with no special pattern, like temp measurements, then the best way for both analysis and graphics is to create an irregular mesh of triangles joining the points. Here is an HTML 5 version which allows you to display the mesh.


In earlier times in finite elements, problems on meshes were often solved with direct solvers, which greatly benefit from a banded matrix. This depends on node numbering. On a regular grid, you'd number by rows; in an irregular mesh something of that can be achieved by an advancing front, which passes through every node, and numbering is by order encountered.
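As a rough illustration of numbering by an advancing front, here is a sketch that uses a plain breadth-first sweep as a stand-in. It is not the peel method described in this post. The test mesh is a made-up strip of ten triangles with deliberately scrambled node labels, the mesh is assumed to be connected, and the function names are hypothetical.

```python
from collections import deque

def front_numbering(triangles, start):
    """Number nodes in the order a breadth-first front reaches them."""
    neighbours = {}
    for tri in triangles:
        for a in tri:
            neighbours.setdefault(a, set()).update(n for n in tri if n != a)
    number = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in sorted(neighbours[node]):
            if nb not in number:
                number[nb] = len(number)
                queue.append(nb)
    return number

def bandwidth(triangles, number):
    """Largest gap in node number across any edge of the mesh."""
    return max(abs(number[a] - number[b])
               for tri in triangles for a in tri for b in tri)

def scrambled_strip():
    """Two rows of six nodes, triangulated, with labels permuted."""
    label = lambda k: (5 * k) % 12      # 5 and 12 are coprime: a permutation
    tris = []
    for c in range(5):
        a, b, d, e = 2 * c, 2 * c + 1, 2 * c + 2, 2 * c + 3
        tris.append((label(a), label(d), label(e)))
        tris.append((label(a), label(e), label(b)))
    return tris

tris = scrambled_strip()
old_bw = bandwidth(tris, {n: n for n in range(12)})
new_number = front_numbering(tris, start=0)
new_bw = bandwidth(tris, new_number)
print(old_bw, new_bw)
```

Nodes reached at the same stage of the sweep get adjacent numbers, and an edge can only join nodes from the same or neighbouring stages, which is what keeps the matrix band narrow.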

I have found two current requirements for a front. One is in a shorter way of defining the mesh, for web transmission. And the other is in WebGL. For the latter, I've developed a more easily visualized method, which, as you might guess, leads to pretty pictures. And yes, WebGL. It describes the mesh as a peel. The rind is one triangle thick - each triangle has two nodes on one side, and one on the other. Well, mostly. Details below.
Update: I found that the original mesh had some incorrectly oriented triangles. The method assumes consistent orientation, so it's rather surprising that the algorithm ran to completion. Anyway, it looks much more regular now.

Tuesday, September 16, 2014

August GISS Temp up by 0.18°C

GISS has posted its August estimate for global temperature anomaly. It rose from 0.52°C in July to 0.70°C in August. TempLS rose by 0.1°C, which I commented was in line with the rise in SST. GISS once again is jumpier, and like TempLS is back to the high levels of April/May.

The comparison maps are below the jump.