Janett Göhring on 22 Apr 2013
Accepted Answer: Teja Muppirala
Hello!
I have a somewhat embarrassing question, but my colleagues and I have not been able to figure it out for several days. We have a mental block ^^ So I would appreciate help!
I have a pdf of my data called pdfxcor (598x1), which resembles a normal distribution when I plot it against an x-axis giving the molecular weight of my data (called pixelweight (598x1)).
plot(pixelweight,pdfxcor)
This is the plot: http://imageshack.us/a/img27/1694/ploti.png
boxplot(pdfxcor)
The same data as boxplot: http://imageshack.us/a/img812/7528/boxplot.png
I want to display the distribution as a boxplot placed at the correct molecular weight.
Thanks for your patience! :)
Jette
Accepted Answer
Teja Muppirala on 23 Apr 2013
How about something like this? Generate the CDF from your data as Tom suggested, invert it, use the inverted CDF to generate a bunch of samples that follow your distribution exactly, and send those to BOXPLOT:
%% Just making some data that resembles yours
x = linspace(1000,12000,598);
P = normpdf(x,5800,1800);
figure, plot(x,P), title('PDF');
%% Generate the CDF
C = cumsum(P);
C = C/C(end); % Normalize so the CDF ends at 1
figure, plot(x,C), title('CDF');
%% Sample linearly along the inverse CDF to get a bunch of points
% that have the same distribution as your data
BigNumber = 100000;
p = interp1(C,x,linspace(C(1),C(end),BigNumber));
figure, hist(p,100); % Confirm p indeed has your distribution
figure, h = boxplot(p);
delete(findobj(h,'tag','Outliers')) % Hide the outliers
Tom Lane on 23 Apr 2013
I like this idea, but here's a simpler version:
p = ((1:100)/101)'; % prob value to invert
x = [norminv(p,10,2), norminv(p,15,3)]; % invert two distributions
boxplot(x)
Janett Göhring on 23 Apr 2013
Edited: Janett Göhring on 23 Apr 2013
Hi Teja and Tom,
Both are really nice solutions, but I still run into one problem with my data.
This is the cdf of my data: http://imageshack.us/a/img835/5130/cdfc.png
The histogram: http://imageshack.us/a/img41/1253/histc.png
And the resulting boxplot: http://imageshack.us/a/img138/1534/newboxplot.png
Strangely, the histogram doesn't resemble the pdf plotted against pixelweigth. I get the same result with the inverted distribution.
Thanks for your help!
Tom Lane on 23 Apr 2013
It looks like your distribution is not symmetric. The normal distribution is symmetric, so it would not resemble the histogram in that respect.
Janett Göhring on 23 Apr 2013
Hi Tom,
The curve was calculated via a Gaussian fit and is symmetric. The x-axis, though, is based on data that were fitted with nlinfit and follow an exponential decay (see modelFun below). So after the correction, the x-axis is no longer linear. That's why it is so important to plot the pdf against pixelweigth; otherwise the distribution is no longer symmetric.
modelFun = @(p,x) p(1)*exp(p(2)*x);
In between, I calculate start parameters for the fit, which are not important for the example.
Next, I fit the pixel position and the Molecular weight of the DNA standard.
p = nlinfit(positionOfStandard, MWOfStandard, modelFun, paramEstsLin(:,1));
pixelrange is just the y-extent of my image in pixels, here 1:598.
pixelweigth = p(1)*exp(p(2)*pixelrange);
After many corrections to the original data, I fit a Gaussian to it and calculate the curve, mean, and sigma.
cf3 = fit(pixelweigth',data','gauss1');
pdfxcor = cf3(pixelweigth)
After that I need a representation of the normally distributed data along this specialized x-axis (pixelweigth), but not as a curve. I was asked to display it as a boxplot, and since it is a normal distribution, I thought it must be possible. But MATLAB's boxplot doesn't offer an option to specify a different axis.
Thanks for the help! Much appreciated :)
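One possible explanation for the asymmetric histogram, offered as a guess rather than a confirmed diagnosis: cumsum treats every sample as equally wide, but the points in pixelweigth are not equally spaced. If pdfxcor is a density with respect to molecular weight, integrating against the axis with cumtrapz takes the spacing into account. A minimal sketch, where the calibration constants, mu, and sigma are made-up values for illustration only:

```matlab
% Hypothetical calibration and Gaussian parameters, for illustration only
pixelrange  = 1:598;
pixelweigth = 12000 * exp(-0.004 * pixelrange);   % non-uniform, decreasing axis
mu = 5800;  sigma = 800;
pdfxcor = exp(-0.5*((pixelweigth - mu)/sigma).^2);

% Sort the axis into increasing order so the CDF runs from 0 to 1
[xs, order] = sort(pixelweigth);
Ps = pdfxcor(order);

% Integrate with respect to molecular weight, not sample index
C = cumtrapz(xs, Ps);
C = C / C(end);

% Drop repeated CDF values so the inverse interpolation is well defined
[Cu, iu] = unique(C);
xu = xs(iu);
q = interp1(Cu, xu, [0.01 0.25 0.5 0.75 0.99]);   % selected quantiles

% Equally spaced probabilities give samples that follow the distribution
p = interp1(Cu, xu, linspace(Cu(1), Cu(end), 10000));
figure, h = boxplot(p);
delete(findobj(h,'tag','Outliers'))   % hide the outliers, as in the answer
```

The boxplot's vertical axis is then in molecular-weight units, so the box sits at the weights where the fitted curve has its quartiles.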
More Answers (1)
Tom Lane on 22 Apr 2013
The boxplot shows the median, lower quartile, and upper quartile. You may be able to calculate these for your pdf. For example, if you have the pdf as a numeric vector, you might compute cumsum on the vector, then divide by the last value to impose the correct probability normalization, then interpolate.
The boxplot also shows a notion of the range of the data, and sometimes outliers. These are harder to extend to a pdf. You could decide that you want to compute the 1% and 99% points as in the previous paragraph, and use those to represent the end points of the range. You could decide not to show outliers.
Plotting these as lines or points will be relatively simple. It would be more of a challenge to plot them in exactly the way that the boxplot function does.
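The steps described above can be sketched roughly as follows. This is only an illustration under assumptions: the pdf here is a made-up Gaussian on an evenly spaced grid, the 1% and 99% points stand in for the whiskers, and the box is drawn by hand with rectangle and line rather than with the boxplot function.

```matlab
% Hypothetical pdf sampled on an evenly spaced, increasing x grid
x = linspace(1000, 12000, 598);
P = exp(-0.5*((x - 5800)/1800).^2);

C = cumsum(P);           % running sum of the pdf
C = C / C(end);          % normalise so the CDF ends at 1

% Interpolate x at the probabilities of interest
probs = [0.01 0.25 0.5 0.75 0.99];
q = interp1(C, x, probs);

% Draw a box (quartiles), a median line, and whiskers (1% and 99% points)
figure, hold on
rectangle('Position', [0.75, q(2), 0.5, q(4) - q(2)]);
line([0.75 1.25], [q(3) q(3)], 'Color', 'r');
line([1 1], [q(1) q(2)]);
line([1 1], [q(4) q(5)]);
hold off
xlim([0 2]), ylabel('Molecular weight')
```

Because interp1 is called with the CDF as the sample points and x as the values, it evaluates the inverse CDF directly; no random sampling is involved.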
Janett Göhring on 23 Apr 2013
Hello Tom, thanks for your answer! Can you explain how to interpolate in this case?
I came up with two solutions for my problem, but I don't like either of them.
a) I fit a Gaussian to my original data to obtain the pdf, mean, and sigma. Then I draw 1 million samples with randn, using the mean and sigma as parameters. This creates a normal distribution based on my fit, which can be plotted via boxplot. Since I have already fitted my original data with a Gaussian, I am not very interested in the outliers. I was just asked to represent the normal distribution as a boxplot for easier comparison of the mean and range of the data. I would feel much better if I didn't have to sample a new distribution, and of course it takes ages to compute.
b) I calculate the mean and the quartiles of the pdf and extract the respective positions from pixelweigth. Then I draw a bar plot (colored for the upper quartile and white for the lower quartile) with error bars. I couldn't make this work, since the pdf is only normally distributed when it is plotted against pixelweigth.
I'm a bit stuck there ^^ Thanks for your help! Jette
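For option a), sampling may not be needed at all: once the mean and sigma of a normal distribution are known, its quartiles follow in closed form. A minimal sketch, assuming hypothetical values for mu and sigma. If they come from a 'gauss1' fit, note that this library model is a1*exp(-((x-b1)/c1)^2), so sigma corresponds to c1/sqrt(2) rather than c1. The inverse error function erfinv is used here so that no toolbox is required.

```matlab
% Hypothetical fit results; with a 'gauss1' fit object cf3 these would be
% mu = cf3.b1 and sigma = cf3.c1/sqrt(2)
mu = 5800;
sigma = 800;

% Quantile function of the normal distribution via the inverse error function
normq = @(p) mu + sigma*sqrt(2)*erfinv(2*p - 1);

q = normq([0.25 0.5 0.75]);   % lower quartile, median, upper quartile
fprintf('Q1 = %.1f, median = %.1f, Q3 = %.1f\n', q);
```

The quartiles land at roughly mu ± 0.6745*sigma, which are the positions a boxplot of a very large normal sample would converge to.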