Archive for the ‘Cross-Cultural’ Category.
Oct 27th, 2008| 09:24 am | Posted by hlee
The notion of missing data differs between the two communities. I tend to think that missing data carry as much information as observed data. Astronomers… I’m not sure how they think, but my impression so far is that when one attribute/variable of an object/observation/informant is missing, all other attributes of that object become useless, because the object is dropped from the scientific data analysis or model evaluation process. For example, it is hard to find any discussion of imputation in astronomical publications, or any statistical justification of how missing data are treated with respect to inference strategies. Instead, astronomers talk about incompleteness within different variables. To make this vague argument concrete, consider a catalog of multiple magnitudes. To draw a color-magnitude diagram, one needs both color and magnitude. If one attribute is missing, the star will not appear in the color-magnitude diagram, and any inference from that diagram will not include that star. Nonetheless, one will try to understand how different proportions of stars are observed according to different colors and magnitudes. Continue reading ‘missing data’ »
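As a minimal sketch of the color-magnitude example, the snippet below drops every star with a missing magnitude before building the diagram, then tabulates which band caused each loss. The six-star catalog, the band names B and V, and the use of NaN as the missing-value marker are all assumptions made for illustration, not data from any real survey.

```python
import numpy as np

# Hypothetical toy catalog: B and V magnitudes; NaN marks a missing measurement.
B = np.array([15.2, 16.8, np.nan, 18.1, 17.4, np.nan])
V = np.array([14.6, 16.1, 15.9, np.nan, 16.9, np.nan])

# Complete-case (listwise) selection: a star enters the color-magnitude
# diagram only if both magnitudes are observed.
complete = ~np.isnan(B) & ~np.isnan(V)
color = B[complete] - V[complete]      # B - V color
magnitude = V[complete]

# Missingness pattern: how many stars are lost, and through which band.
n_total = B.size
n_plotted = int(complete.sum())
missing_B_only = int((np.isnan(B) & ~np.isnan(V)).sum())
missing_V_only = int((~np.isnan(B) & np.isnan(V)).sum())
missing_both = int((np.isnan(B) & np.isnan(V)).sum())

print(f"{n_plotted} of {n_total} stars appear in the diagram")
print(f"missing B only: {missing_B_only}, V only: {missing_V_only}, both: {missing_both}")
```

Two of the dropped stars still carry one observed magnitude each, which is exactly the information that a complete-case diagram throws away.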
Tags:
bootstrap,
catalog,
Efron,
estimator,
ignorable,
imputation,
incompleteness,
Little,
MAR,
MCAR,
missing data,
nonparametric,
Rubin,
Schafer,
survey Category:
Astro,
Cross-Cultural,
Data Processing,
Stat |
2 Comments
Oct 10th, 2008| 01:09 pm | Posted by hlee
I do not like to be serious. Papers…papers…papers. Taking a break from papers that bridge the two fields, allow me to talk about something relevant to the cultural difference between astronomers and statisticians. I hope this generates a series of comments. Continue reading ‘Off the line’ »
Oct 9th, 2008| 04:28 pm | Posted by hlee
Even without signal processing courses, the following equation should be awfully familiar to astronomers who work with photometry and data handling:
$$c_k=\int_\Lambda l(\lambda) r(\lambda) f_k(\lambda) \alpha(\lambda) d\lambda +n_k$$
The terms are, in order, camera response (c_k), light source (l), spectral radiance by l (r), filter (f_k), sensitivity (α), and noise (n_k), where Λ indicates the range of the spectrum in which the camera is sensitive.
Or simplified to $$c_k=\int_\Lambda \phi_k (\lambda) r(\lambda) d\lambda +n_k$$
where φ_k combines the illuminant and the spectral sensitivity of the k-th channel, and goes by the name augmented spectral sensitivity. Well, we can skip spectral radiance r, though. Unfortunately, in astronomical photometry the sensitivity α has multiple layers and is not a simple closed-form function of λ.
Or, in matrix form over all channels, $$c=\Theta r +n$$
Inverting Θ, that is, finding a reconstruction operator such that r=inv(Θ)c, leads to spectral reconstruction, although Θ is, in general, not a square matrix, so inv(Θ) has to be understood as a generalized inverse. Otherwise, one approaches the problem through indirect reconstruction. Continue reading ‘[tutorial] multispectral imaging, a case study’ »
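A rough sketch of the direct reconstruction in the matrix form above, assuming the Moore-Penrose pseudo-inverse stands in for inv(Θ) because Θ is not square. The channel count, the wavelength grid, the Gaussian-shaped rows of Θ, the test spectrum, and the noise level are all made up for illustration; this is one possible reconstruction operator, not necessarily the one used in the tutorial.

```python
import numpy as np

rng = np.random.default_rng(0)

n_channels, n_wavelengths = 6, 31   # assumed sizes: K channels, sampled spectrum
wavelengths = np.linspace(400.0, 700.0, n_wavelengths)  # nm, illustrative grid

# Hypothetical Theta: each row is a Gaussian-shaped augmented sensitivity phi_k.
centers = np.linspace(420.0, 680.0, n_channels)
Theta = np.exp(-0.5 * ((wavelengths[None, :] - centers[:, None]) / 30.0) ** 2)

# Hypothetical smooth spectrum r and noisy camera responses c = Theta r + n.
r_true = 0.5 + 0.3 * np.sin(wavelengths / 60.0)
c = Theta @ r_true + rng.normal(scale=1e-3, size=n_channels)

# Theta is 6 x 31, so it has no ordinary inverse; the Moore-Penrose
# pseudo-inverse gives the minimum-norm least-squares reconstruction.
r_hat = np.linalg.pinv(Theta) @ c

# The reconstruction reproduces the measurements but not necessarily r_true,
# because many spectra map to the same six responses.
print("residual in camera space:", np.linalg.norm(Theta @ r_hat - c))
print("error in spectrum space: ", np.linalg.norm(r_hat - r_true))
```

With fewer channels than wavelength samples the problem is underdetermined, so the pseudo-inverse picks only one of many spectra consistent with the data; prior knowledge about r is what the indirect approaches bring in.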
Tags:
matrix,
Mona Lisa,
multispectral,
noise,
signal processing,
signal processing magazine,
Tutorial Category:
Algorithms,
arXiv,
Cross-Cultural,
Data Processing,
Fitting,
Imaging,
Methods,
Quotes,
Spectral,
Stat,
Uncertainty |
2 Comments
Oct 8th, 2008| 07:55 pm | Posted by hlee
All of a sudden, partly owing to a thought-provoking talk about visualization by Felice Frankel at IIC, I recollected a book, The Grammar of Graphics by Leland Wilkinson (2nd ed.; I read part of the 1st ed. several years ago and found it of little use, because there seemed to be no link to visualizing data from astronomy). Continue reading ‘[Book] The Grammar of Graphics’ »
Oct 8th, 2008| 01:31 am | Posted by hlee
In order to understand a learning procedure statistically it is necessary to identify two important aspects: its structural model and its error model. The former is most important since it determines the function space of the approximator, thereby characterizing the class of functions or hypothesis that can be accurately approximated with it. The error model specifies the distribution of random departures of sampled data from the structural model.
Continue reading ‘A Quote on Model’ »
Tags:
error model,
Friedman,
Hastie,
model,
structural model,
Tibshirani Category:
Astro,
Cross-Cultural,
Jargon,
Methods,
Quotes,
Stat |
1 Comment
Oct 1st, 2008| 04:16 pm | Posted by hlee
People with experience would speak very differently, and more wisely, about what I’m going to discuss now. This post only combines two small cross sections, one from a branch of each of two trees, astronomy and statistics. Continue reading ‘survey and design of experiments’ »
Tags:
213,
AAS,
Alanna Connors,
catalog,
census,
detection,
experimental design,
Long Beach,
special session,
SPS,
survey Category:
Astro,
CHASC,
Cross-Cultural,
Data Processing,
Jargon,
Methods,
Misc,
News,
Stat |
3 Comments
Sep 26th, 2008| 11:49 pm | Posted by hlee
In my personal opinion, the history of astronomy is more interesting than the history of statistics. This may change tomorrow. The Harvard statistics department (chair: Xiao-Li Meng) is organizing a symposium titled
Quintessential Contributions:
Celebrating Major Birthdays of Statistical Ideas and Their Inventors
When: Saturday, September 27, 2008, 9:45 AM – 5:00 PM
Where: Radcliffe Gymnasium, 18 Mason Street, Cambridge, MA
Continue reading ‘Quintessential Contributions’ »
Tags:
Gosset,
Harvard,
history,
S.M.Stigler,
student t,
symposium Category:
Bayesian,
Cross-Cultural,
Frequentist,
News,
Quotes,
Stat |
1 Comment
Sep 18th, 2008| 07:48 pm | Posted by hlee
Another conclusion I have drawn from reading preprints listed in arxiv/astro-ph is that astronomers tend to confuse classification with clustering and to mix up the methodologies. They tend to think that any algorithm from classification or cluster analysis serves their purpose, since both kinds of algorithm, no matter what, look like a black box. I mean a black box as in a neural network, which is one of the classification algorithms. Continue reading ‘Classification and Clustering’ »
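To make the distinction concrete, here is a small sketch contrasting the two tasks on the same toy data. The nearest-centroid classifier and the plain k-means loop are stand-ins chosen for brevity, and the two synthetic groups, their positions, and the initialization are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-feature data: two well-separated groups of objects.
group_a = rng.normal(loc=[0.0, 0.0], scale=0.2, size=(50, 2))
group_b = rng.normal(loc=[3.0, 3.0], scale=0.2, size=(50, 2))
X = np.vstack([group_a, group_b])
labels = np.array([0] * 50 + [1] * 50)   # known classes (training labels)


def nearest_centroid(points, centroids):
    """Index of the closest centroid for each point."""
    dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)


# Classification (supervised): centroids are learned FROM the labels,
# and new objects are assigned to one of the predefined classes.
class_centroids = np.array([X[labels == k].mean(axis=0) for k in (0, 1)])
new_objects = np.array([[0.2, -0.1], [2.8, 3.1]])
predicted_class = nearest_centroid(new_objects, class_centroids)

# Clustering (unsupervised): k-means never sees the labels; it only
# groups objects by similarity, and the group numbering is arbitrary.
centroids = X[[0, -1]].copy()            # simple initialization: first and last point
for _ in range(20):
    assignment = nearest_centroid(X, centroids)
    centroids = np.array([X[assignment == k].mean(axis=0) for k in (0, 1)])

print("predicted classes:", predicted_class)
print("cluster sizes:", np.bincount(assignment, minlength=2))
```

The classifier answers "which known class does this object belong to?", while the clustering step answers "what groups exist in the data?"; the second question has no training labels and no guarantee that the groups match any physical class.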
Tags:
black box,
book,
catalog,
Classification,
clustering,
haste,
outliers,
R,
Robert Serfling,
semi-supervised learning,
survey Category:
Algorithms,
arXiv,
Astro,
Bad AstroStat,
Cross-Cultural,
Data Processing,
Frequentist,
Jargon,
Methods,
Stat |
Comment
Sep 17th, 2008| 02:11 pm | Posted by hlee
I’ve been joking about the astronomers’ fashion of writing Markov chain Monte Carlo (MCMC). In astronomical journals, MCMC is frequently written out as Monte Carlo Markov Chain. I was curious about the history of this new creation. Overall, I thought it would be worth learning more about the history of MCMC, and this paper was up in arxiv: Continue reading ‘A History of Markov Chain Monte Carlo’ »
Tags:
BUGS,
data augmentation,
EM,
Gibbs sampling,
Hasting,
history,
Metropolis,
reversible jump,
simulated annealing Category:
Algorithms,
arXiv,
Bad AstroStat,
Bayesian,
Cross-Cultural,
Data Processing,
Imaging,
MC,
MCMC,
Methods,
Quotes,
Stat |
2 Comments
Sep 16th, 2008| 03:20 pm | Posted by hlee
A nice book by Christopher Bishop.
While I was reading abstracts and papers from astro-ph, I saw many applications of algorithms from pattern recognition and machine learning (PRML). The frequency will increase as large-scale survey projects multiply, so recommending a good textbook or reference in the field seems timely. Continue reading ‘[Book] pattern recognition and machine learning’ »
Tags:
Bishop,
catalog,
machine learning,
pattern recognition,
PCML,
SPS,
survey Category:
Algorithms,
Astro,
Cross-Cultural,
Data Processing,
Jargon |
Comment
Sep 12th, 2008| 11:30 pm | Posted by hlee
To claim that results are statistically powerful, astronomers rely heavily on eyeballing techniques (these need an apprenticeship to acquire, but look subjective to me without such training). In some cases, I know of actual statistical tests that could support or refute those claims. Hence, I believe astronomers are well aware of those statistical tests. I guess they are afraid that those statistics may reject their claims or are not powerful enough in numeric metrics. Instead, they spend their effort on making graphics more appealing. Continue reading ‘appealing eyes == powerful method’ »
Sep 10th, 2008| 10:15 am | Posted by hlee
Physicists believe that the Gaussian law has been proved in mathematics while mathematicians think that it was experimentally established in physics — Henri Poincare
Continue reading ‘Why Gaussianity?’ »
Tags:
CLT,
Gaussianity,
Henry Poincare,
IEEE,
normal,
signal processing,
signal processing magazine,
Why Category:
arXiv,
Cross-Cultural,
Data Processing,
Fitting,
Frequentist,
Methods,
Physics,
Quotes,
Stat,
Uncertainty |
Comment
Sep 5th, 2008| 08:46 pm | Posted by hlee
The problem with data analysis is of course that it is a performing art. It is not something you easily write a paper on; rather, it is something you do. And so it is difficult to publish.
quoted from this conversation Continue reading ‘A Conversation with Peter Huber’ »
Tags:
art,
Babilonian astronomy,
computers,
computing,
computing history,
conversation,
FFT,
history,
Peter Huber,
project pursuit,
robust statistics,
robustness Category:
Algorithms,
arXiv,
Cross-Cultural,
Data Processing,
Jargon,
Languages,
Quotes |
Comment
Sep 5th, 2008| 08:28 pm | Posted by hlee
My greatest concern was what to call it. I thought of calling it “information”, but the word was overly used, so I decided to call it “uncertainty”. When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, “You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage.”
Continue reading ‘An anecdote on entrophy’ »
Tags:
anecdote,
Entropy,
epistemology,
Information,
Information theory,
Shannon,
von Neumann,
Wikiquote Category:
Cross-Cultural,
Jargon,
Quotes,
Uncertainty |
Comment
Sep 2nd, 2008| 08:13 pm | Posted by hlee
The whole story can be found on page 8 of the IMS Bulletin, Vol. 37, Issue 7 (click for the pdf file). Continue reading ‘Irksome’ »