Dan-Erik

En jägmästare från Piteå
It is currently 21 Oct 2014, 02:31

All times are UTC + 1 hour [ DST ]




Post new topic Reply to topic  [ 1 post ] 
Author Message
 Post subject: Constructing correlated data (95% CI for a model fit curve)
PostPosted: 11 Mar 2010, 13:35 
Offline
Site Admin

Joined: 18 Oct 2007, 09:51
Posts: 55
Location: Umeå, Sweden
I have a small sample of data from smolt (juvenile Atlantic salmon) passing through a hydroelectric power plant. The data is a list of fish-size and 0/1 for death/survival. Data looks like this:
Code:
smolt2005 =
  Map[{#[[1]], #[[2]]/10} &, {{1, 220}, {1, 220}, {1, 195}, {1,
     175}, {1, 220}, {1, 175}, {1, 205}, {1, 180}, {1, 235}, {1,
     200}, {1, 195}, {1, 195}, {1, 195}, {1, 215}, {1, 220}, {1,
     195}, {1, 190}, {1, 185}, {1, 215}, {1, 215}, {1, 190}, {1,
     210}, {1, 200}, {1, 215}, {1, 205}, {1, 175}, {1, 225}, {1,
     180}, {1, 220}, {1, 220}, {1, 234}, {1, 177}, {1, 207}, {1,
     242}, {1, 228}, {1, 225}, {1, 216}, {1, 239}, {1, 197}, {1,
     200}, {1, 194}, {1, 194}, {1, 182}, {1, 232}, {1, 240}, {1,
     203}, {1, 178}, {1, 188}, {1, 216}, {1, 245}, {1, 261}, {1,
     230}, {1, 206}, {1, 260}, {1, 166}, {1, 167}, {1, 202}, {1,
     204}, {1, 225}, {1, 233}, {1, 175}, {1, 195}, {1, 197}, {0,
     230}, {0, 235}, {0, 218}, {0, 230}, {0, 240}, {0, 185}, {0,
     240}, {0, 260}, {0, 248}, {0, 262}, {0, 197}, {0, 235}, {0,
     195}, {0, 192}, {0, 222}}];

So I want to get a model fit for this data and plot it. Then I want to plot the confidence interval for the model. So I started out by doing this:
Code:
logmf = LogitModelFit[smolt2005[[All, {2, 1}]], x, x]
logmf["ParameterConfidenceIntervalTable"]
logmf["CovarianceMatrix"]
logmf["CorrelationMatrix"]

Followed by applying the Confidence Interval Table to functions and plot what I thought to be confidence intervals:
Code:
c1[x_] := 1/(1 + E^(-3.24531 + 0.0913045 x));
c2[x_] := 1/(1 + E^(-15.261 + 0.631024 x));
Plot[{1 - logmf[x], 1 - c1[x], 1 - c2[x]}, {x, 17, 30}]

As you can see (if you are running this in a notebook), those aren't confidence interval functions at all. So what to do? With a bit of help from Kjell Leonardsson at SLU and http://www.markdiamond.com.au/2009/07/c ... ated-data/ I found a way to randomize a large matrix and then force it to be correlated with my original data so I can then use that large matrix to get the 97.5 and 2.5 Quantiles for a 95 % confidence limit.
Code:
(* Create a matrix with uncorrelated random numbers *)
rlen = 10000;
tmp = Table[{RandomReal[], RandomReal[]}, {rlen}];
* Introduce correlation on the uncorrelated matrix *)
cormat = tmp.CholeskyDecomposition[logmf["CovarianceMatrix"]];
(* Unfortunately, there is a rescaling taking place that needs to be \
countered *)
Covariance[3.49*cormat]

Now, this rescaling that is done last needs to be done manually. If you find a good way to automate this rescaling process, I would be glad to know about it. So the number 3.49 may need some adjusting depending on what random numbers you are getting. The output of this Covariance[] should be as close as possible to the logmf["CovarianceMatrix"] above.
Code:
(* Apply the rescaling to the forced correlated matrix *)
cormat2 = cormat*3.49;
Mean[cormat2]
(* Adjust mean *)
cormat3 =
  Map[{#[[1]] + (Covariance[cormat2][[1, 1]] -
        Mean[cormat2][[1]]), #[[
       2]] + (Covariance[cormat2][[1, 2]] + Mean[cormat2][[2]])} &,
   cormat2];
(* Covariance should now match fairly well *)
Mean[cormat3]
Covariance[cormat3]
logmf["CovarianceMatrix"]

If your output here isn't matching up very well, go back and adjust the 3.49-value above, then re-run the last few lines of code.
Code:
logmf["BestFit"]
flist = Table[
   1/(1 + E^(-a - b x)) /. {a -> cormat3[[i, 1]],
     b -> cormat3[[i, 2]]}, {i, rlen}];
flist[[1 ;; 10]]

This should give you an opportunity to compare the original LogitModelFit function with the new set of functions that will be used to plot confidence intervals.
Code:
fCI = Table[{x, 1 - Quantile[flist, 0.025], 1 - Median[flist],
    1 - Quantile[flist, 0.975]}, {x, 15, 30, 1}];
ListLinePlot[{fCI[[All, {1, 2}]], fCI[[All, {1, 3}]],
  fCI[[All, {1, 4}]]}, PlotRange -> All]

If you did everything right... You should be looking at a plot with a close representation of the original curve and then two curves defining 95% confidence intervals for the model fit. If it doesn't look like the first plot you made, you need to make some adjustments to the 3.49-value to rescale the data to a better fit.

_________________
Ga Ryu - the bowing dragon


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 1 post ] 

All times are UTC + 1 hour [ DST ]


Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
cron
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group  
Design By Poker Bandits