November 14, 2017

I am very excited to announce the release of Petpy v1.0! Petpy is a simple-to-use wrapper for the Petfinder API written in Python. The Petfinder API enables users...

September 12, 2017

I am excited to announce the release of mathpy 0.3.0! This release adds a ton of Excel UDFs including many new statistical and number-theoretic functions, several random number generators and...

September 7, 2017

Combined linear congruential generators, as the name implies, are a type of PRNG (pseudorandom number generator) that combines two or more LCGs (linear congruential generators). The combination of two or...

August 31, 2017

Multiplicative congruential generators, also known as Lehmer random number generators, are a type of linear congruential generator for generating pseudorandom numbers in [latex]U(0, 1)[/latex]. The multiplicative congruential generator, often abbreviated...

August 30, 2017

My Python library, mathpy, a collection of mathematical and statistical functions with Excel integration, has a new release! Version 0.2.0 introduces a ton of additional mathematical and statistical functions have...

August 24, 2017

A linear congruential generator (LCG) is a class of pseudorandom number generator (PRNG) algorithms used for generating sequences of random-like numbers. The generation of random numbers plays a large role...
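
As a rough illustration of the idea in this teaser, here is a minimal LCG sketch in Python. It uses the standard recurrence X_{n+1} = (a * X_n + c) mod m. The function name `lcg` and the default constants (the widely quoted Numerical Recipes values) are assumptions chosen for the example and are not taken from the post itself.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield pseudorandom floats in [0, 1) using X_{n+1} = (a * X_n + c) mod m.

    The default a, c, m are illustrative constants, not the post's choice.
    """
    x = seed
    while True:
        x = (a * x + c) % m   # advance the integer state
        yield x / m           # scale the state into [0, 1)


# Draw five values from a generator seeded with 42
gen = lcg(42)
sample = [next(gen) for _ in range(5)]
```

The same seed always reproduces the same sequence, which is what makes the output pseudorandom rather than random.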

August 17, 2017

Simpson’s rule is another closed Newton-Cotes formula for approximating integrals over an interval with equally spaced nodes. Unlike the trapezoidal rule, which employs straight lines to approximate a definite integral,...
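
A small sketch of composite Simpson's rule may help make the description concrete. The function name `simpson` and the default of 100 subintervals are assumptions for illustration, not the post's implementation.

```python
import math


def simpson(f, a, b, n=100):
    """Approximate the integral of f over [a, b] with composite Simpson's rule.

    n is the number of equally spaced subintervals and must be even.
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior nodes alternate weights 4 (odd index) and 2 (even index)
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Integral of sin(x) over [0, pi]; the exact value is 2
approx = simpson(math.sin, 0.0, math.pi)
```

Because the rule fits parabolas through the nodes, it integrates polynomials up to degree three exactly, up to floating-point rounding.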

August 10, 2017

The trapezoidal rule is another of the closed Newton-Cotes formulas for approximating the definite integral of a function. The trapezoidal rule is so named due to the area approximated under the...
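
For readers who want to see the rule in action, the following is a minimal sketch of the composite trapezoidal rule. The function name `trapezoid` and its default of 100 subintervals are assumptions for the example.

```python
def trapezoid(f, a, b, n=100):
    """Approximate the integral of f over [a, b] using n equal-width trapezoids."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    h = (b - a) / n
    # Endpoints get weight 1/2, interior nodes get weight 1
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Integral of x^2 over [0, 1]; the exact value is 1/3
approx_area = trapezoid(lambda x: x**2, 0.0, 1.0)
```

The approximation is exact for straight lines and improves as `n` grows for smooth curves.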

August 3, 2017

Numerical differentiation is a method of approximating the derivative of a function [latex]f[/latex] at a particular value [latex]x[/latex]. Often, particularly in physics and engineering, a function may be too complicated to...

July 27, 2017

The divided differences method is a numerical procedure for interpolating a polynomial given a set of points. Unlike Neville’s method, which is used to approximate the value of an interpolating...

July 19, 2017

Neville’s method evaluates a polynomial that passes through a given set of [latex]x[/latex] and [latex]y[/latex] points for a particular [latex]x[/latex] value using the Newton polynomial form. Neville’s method is similar...

July 13, 2017

Polynomial interpolation is the method of determining a polynomial that fits a set of given points. There are several approaches to polynomial interpolation, of which one of the most well...

July 6, 2017

Ordered and Unordered Pairs A pair set is a set with two members, for example, [latex]\{2, 3\}[/latex], which can also be thought of as an unordered pair, in that [latex]\{2, 3\}...

June 29, 2017

The set operations, union and intersection, the relative complement [latex]-[/latex] and the inclusion relation (subsets) [latex]\subseteq[/latex] are known as the algebra of sets. The algebra of sets can be used...

June 22, 2017

The union and intersection set operations were introduced in a previous post using two sets, [latex]a[/latex] and [latex]b[/latex]. These set operations can be generalized to accept any number of sets. Arbitrary...

June 15, 2017

The set operations of unions and intersections should ring a bell for those who’ve worked with relational databases and Venn Diagrams. The ‘union’ of two sets [latex]A[/latex] and [latex]B[/latex]...

June 8, 2017

Sets define a ‘collection’ of objects, or things typically referred to as ‘elements’ or ‘members.’ The concept of sets arises naturally when dealing with any collection of objects, whether it...

April 13, 2017

The more common approach to QR decomposition is employing Householder reflections rather than utilizing Gram-Schmidt. In practice, the Gram-Schmidt procedure is not recommended as it can lead to cancellation that...

March 23, 2017

QR decomposition is another technique for decomposing a matrix into a form that is easier to work with in further applications. The QR decomposition technique decomposes a square or rectangular...

March 9, 2017

Hierarchical clustering is a widely used and popular tool in statistics and data mining for grouping data into ‘clusters’ that expose similarities or dissimilarities in the data. There are many...

March 3, 2017

The iterated principal factor method is an extension of the principal factor method that seeks improved estimates of the communality. As seen in the previous post on the principal factor...

February 23, 2017

As discussed in a previous post on the principal component method of factor analysis, the [latex]\hat{\Psi}[/latex] term in the estimated covariance matrix [latex]S[/latex], [latex]S = \hat{\Lambda} \hat{\Lambda}' + \hat{\Psi}[/latex], was...

February 16, 2017

In the first post on factor analysis, we examined computing the estimated covariance matrix [latex]S[/latex] of the rootstock data and proceeded to find two factors that fit most of the...

February 9, 2017

Factor analysis is a controversial technique that represents the variables of a dataset [latex]y_1, y_2, \cdots, y_p[/latex] as linearly related to random, unobservable variables called factors, denoted [latex]f_1, f_2, \cdots,...

January 26, 2017

Image compression with principal component analysis is a frequently occurring application of the dimension reduction technique. Recall from a previous post that employed singular value decomposition to compress an image,...

January 19, 2017

Often, it is not helpful or informative to only look at all the variables in a dataset for correlations or covariances. A preferable approach is to derive new variables from...

January 12, 2017

Quadratic discriminant analysis for classification is a modification of linear discriminant analysis that does not assume equal covariance matrices amongst the groups [latex](\Sigma_1, \Sigma_2, \cdots, \Sigma_k)[/latex]. Similar to LDA for...

January 5, 2017

Similar to the two-group linear discriminant analysis for classification case, LDA for classification into several groups seeks to find the mean vector that the new observation [latex]y[/latex] is closest to...

December 29, 2016

As mentioned in the post on classification with linear discriminant analysis, LDA assumes the groups in question have equal covariance matrices [latex](\Sigma_1 = \Sigma_2 = \cdots = \Sigma_k)[/latex]. Therefore, often...

December 23, 2016

Classification with linear discriminant analysis is a common approach to predicting class membership of observations. A previous post explored the descriptive aspect of linear discriminant analysis with data collected on...

December 15, 2016

Discriminant analysis is also applicable in the case of more than two groups. In the first post on discriminant analysis, there was only one linear discriminant function as the number...

December 8, 2016

Multiple tests of significance can be employed when performing MANOVA. The most well known and widely used MANOVA test statistics are Wilks’ [latex]\Lambda[/latex], Pillai, Lawley-Hotelling, and Roy’s test. Unlike ANOVA...

December 1, 2016

MANOVA, or Multivariate Analysis of Variance, is an extension of Analysis of Variance (ANOVA) to several dependent variables. The approach to MANOVA is similar to ANOVA in many regards and...

November 17, 2016

The term ‘discriminant analysis’ is often used interchangeably to represent two different objectives. These objectives of discriminant analysis are: Description of group separation. Linear combinations of variables, known as discriminant functions,...

November 10, 2016

As mentioned in a previous post, image compression with singular value decomposition is a frequently occurring application of the method. The image is treated as a matrix of pixels with...

November 3, 2016

Following from a previous post on the Cholesky decomposition of a matrix, I wanted to explore another often used decomposition method known as Singular Value Decomposition, also called SVD. SVD...

October 21, 2016

Eigenvalues and eigenvectors prominently appear in many statistical and other computational fields that require transformations of linear systems or are interested in the evolution of systems from an initial point....

October 12, 2016

Although comparatively straightforward in nature, the matrix trace has many properties related to other matrix operations and often appears in statistical methods such as maximum likelihood estimation of the covariance...

October 6, 2016

Cholesky decomposition, also known as Cholesky factorization, is a method of decomposing a positive-definite matrix. A positive-definite matrix is defined as a symmetric matrix where for all possible vectors [latex]x[/latex],...

September 28, 2016

The bisection method is another approach to finding the root of a continuous function [latex]f(x)[/latex] on an interval [latex][a, b][/latex]. The method takes advantage of a corollary of the intermediate...
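
The interval-halving idea behind the method can be sketched as follows. This is a minimal illustration that assumes [latex]f[/latex] is continuous and changes sign on [latex][a, b][/latex]; the function name `bisection` and the tolerance defaults are assumptions, not the post's code.

```python
def bisection(f, a, b, tol=1e-8, max_iter=200):
    """Find a root of a continuous f on [a, b], assuming f(a) and f(b) differ in sign."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        mid = (a + b) / 2
        fm = f(mid)
        # Stop on an exact root or once the bracket is small enough
        if fm == 0 or (b - a) / 2 < tol:
            return mid
        if fa * fm < 0:
            b = mid            # sign change lies in the left half
        else:
            a, fa = mid, fm    # sign change lies in the right half
    return (a + b) / 2


# Root of x^2 - 2 on [1, 2], which is the square root of 2
root = bisection(lambda x: x**2 - 2, 1.0, 2.0)
```

Each iteration halves the bracket, so the error bound shrinks by a factor of two per step regardless of the shape of the function.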

September 14, 2016

The secant method for finding roots of nonlinear equations is a common and popular variation of the Newton-Raphson method that was in use for several millennia before the invention of...

September 8, 2016

The Newton-Raphson method is an approach for finding the roots of nonlinear equations and is one of the most common root-finding algorithms due to its relative simplicity and speed. The...
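
A minimal sketch of the Newton-Raphson iteration, x_{n+1} = x_n - f(x_n) / f'(x_n), is shown below. It assumes the derivative is supplied by the caller; the function name `newton_raphson` and the tolerance defaults are assumptions for the example.

```python
def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n) / f'(x_n) starting from x0."""
    x = x0
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0:
            raise ZeroDivisionError("derivative is zero; cannot take a Newton step")
        x_new = x - f(x) / slope
        # Stop when successive iterates agree to within tol
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")


# Root of x^2 - 2 starting from x0 = 1, which is the square root of 2
sqrt_two = newton_raphson(lambda x: x**2 - 2, lambda x: 2 * x, 1.0)
```

Convergence is fast near a simple root but is not guaranteed from a poor starting point, which is why the sketch caps the number of iterations.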

August 31, 2016

R and SQL make excellent complements for analyzing data due to their respective strengths. The sqldf package provides an interface for working with SQL in R by querying data from...

August 24, 2016

In a previous post on multiple regression with two predictor variables, the effect of the number of products and the distance traveled on total delivery time was examined in the...

August 18, 2016

Inverses of Numbers and Matrices The inverse of a number is its reciprocal. For example, the inverse of 8 is [latex]\frac{1}{8}[/latex], the inverse of 20 is [latex]\frac{1}{20}[/latex] and so on. Therefore,...

August 11, 2016

Multiple regression is a widely utilized method due to its relatively straightforward nature and power of fitting linear relationships. The concepts explored in a previous post on simple regression apply...

August 3, 2016

July 27, 2016

The linear regression models examined so far have always included a constant that represents the point the regression line crosses the y-axis, called the intercept. However, there are some cases...

July 19, 2016

In a previous example, linear regression was examined through the simple regression setting, i.e., one independent variable. Fitting a linear model allows one to answer questions such as: What is the...

July 13, 2016

Linear regression is a widely used technique to model the association between a dependent variable and one or more independent variables. In the Simple Linear Regression setting, which is what...

July 7, 2016

The Games-Howell post-hoc test is another nonparametric approach to compare combinations of groups or treatments. Although rather similar to Tukey’s test in its formulation, the Games-Howell test does not assume...

June 28, 2016

In a previous example, linear correlation was examined with Pearson’s [latex]r[/latex]. The cars dataset that was examined exhibited a strong linear relationship, and thus Pearson’s correlation was a good candidate...

June 16, 2016

Introduction to Correlation Often of interest in analyzing data is measuring the strength of association between two variables. This allows the analyst to answer such questions as “Does X predict Y?”...

May 31, 2016

In a previous example, ANOVA (Analysis of Variance) was performed to test a hypothesis concerning more than two groups. Although ANOVA is a powerful and useful parametric approach to analyzing...

May 24, 2016

The Kruskal-Wallis test extends the Mann-Whitney-Wilcoxon Rank Sum test to more than two groups. The test is nonparametric, like the Mann-Whitney test, and as such does not assume the...

May 17, 2016

ANOVA, or Analysis of Variance, is a commonly used approach to testing a hypothesis when dealing with two or more groups. One-way ANOVA, which is what will be explored in...

May 13, 2016

In previous examples, hypothesis testing with two independent samples drawn from normally distributed populations was explored. Often, however, data is not normally distributed, which causes the t-test to output incorrect...

May 10, 2016

Introduction Estimating with confidence intervals is another form of hypothesis testing that is often preferred over standard hypothesis testing such as what was explored in the previous post. A primary reason...

May 4, 2016

Introduction to Hypothesis Testing Classical hypothesis testing is concerned with testing two statements, the null and alternative hypotheses. The null hypothesis is believed to be true while the alternative hypothesis is...

September 10, 2015

In this example, we'll build classification decision trees to analyze whether a particular individual will have an affair based on demographics and other data. Getting Started Start by loading...

August 25, 2015

July 29, 2015

The IPython Notebook is a useful tool for creating reproducible research and sharing work with other users. The notebook's ability to combine code, text, plots, images, math and even web...

July 28, 2015

Nbviewer is a wonderful way to share IPython Notebooks as it allows users to enter a URL of the notebook location, or a GitHub repo link, and statically displays the...

July 23, 2015

In the previous post, a new IPython Notebook theme was created with custom CSS. In this post, we will explore how to export notebooks using the nbconvert tool with custom...

July 21, 2015

The IPython Notebook is a powerful tool for doing quick and reproducible research and analysis by combining code, text, math, plots and even web pages into a single environment. It's...

July 14, 2015

July 13, 2015

AWS, or Amazon Web Services, is a cloud-computing platform with a variety of different products and services, ranging from data storage to computing and machine learning. Some of the more popular...

April 24, 2015

Introduction In this post, we will learn more about using logistic regression to classify and predict categorical values. An introduction to classification and logistic regression will be discussed in order to...

April 16, 2015

April 11, 2015

April 11, 2015

April 11, 2015

Python can be a chore to install and test on Windows systems for new users due to the need to deal with system paths and other settings that can get...

March 15, 2015

Hello! Today I am going to walk you through an introduction to the ARIMA model and its components, as well as a brief explanation of the Box-Jenkins method of how...

July 13, 2014

Linear regression models find relationships between a dependent variable, often designated y, and one or more independent variables, often denoted x. Linear regression has two primary functions and has a...

June 29, 2014

Expanding on my previous post about xlwings, I wanted to see if I could create a method in Excel to perform linear regression using statsmodels, a Python package for statistical...

June 12, 2014

I recently discovered xlwings, a package that allows you to work interactively with Excel and Python. You can read and write Excel from an IPython notebook, and it works perfectly...

May 23, 2014

During my free time at work, I like to work on Excel and other projects that help the team. I've recently fallen in love with GitHub and decided to start...

May 11, 2013

Hey, everyone! Today I'll walk through how to non-destructively find unique values in your data. I say 'non-destructively' because I notice it is common for people looking for unique values...

May 11, 2013

What up Excel-party people! Coming at you with something that I find pretty useful and applies to a large number of situations, Parsing Words in Text Strings and Cells. Originally, the...

May 10, 2013

Today I'm going to introduce you to the awesome function that is SUMPRODUCT. You can tell I'm excited as this is one of my go-to functions due to its...

May 10, 2013

Alrighty, imagine a situation where you have a set of data but aren't too familiar with it. Maybe it was handed off or whatever, and you need to figure it...

May 10, 2013

Everyone knows VLOOKUP is an excellent formula and is useful in a wide variety of cases. However, there are some limitations to the function. The biggest one is VLOOKUP can...