$$ \newcommand{\st}{\text{ s.t. }} \newcommand{\and}{\text{ and }} \DeclareMathOperator*{\argmin}{arg\,min} \DeclareMathOperator*{\argmax}{arg\,max} \newcommand{\R}{\mathbb{R}} \newcommand{\N}{\mathbb{N}} \newcommand{\O}{\mathcal{O}} \newcommand{\dist}{\text{dist}} \newcommand{\vec}[1]{\mathbf{#1}} \newcommand{\diag}{\mathrm{diag}} \newcommand{\d}{\mathrm{d}} \newcommand{\L}{\mathcal{L}} \newcommand{\Tr}{\mathrm{\mathbf{Tr}}} \newcommand{\E}{\mathbb{E}} \newcommand{\Var}{\mathrm{Var}} \newcommand{\Cov}{\mathrm{Cov}} \newcommand{\indep}{\perp \!\!\! \perp} \newcommand{\KL}[2]{\mathrm{KL}(#1 \parallel #2)} \newcommand{\W}{\mathbf{W}} % Wasserstein distance \newcommand{\SW}{\mathbf{SW}} % Sliced-Wasserstein distance $$

Mathematical Morphology

In mathematical morphology, the basic structure of an image is a complete lattice. Definition: Complete lattice A complete lattice is a set $K$ equipped with an order relation $\leq$ that satisfies: Reflexivity: $\forall x \in K$, $x \leq x$; Antisymmetry: $\forall x, y \in K$, $x \leq y$ and $y \leq x$ implies $x = y$; Transitivity: $\forall x, y, z \in K$, $x \leq y$ and $y \leq z$ implies $x \leq z$; Completeness: every subset of $K$ has a supremum and an infimum; for two elements $x, y \in K$ these are denoted $x \lor y$ and $x \land y$ respectively. We will focus on the Boolean lattice and the function lattice. ...
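As a concrete illustration (not from the post): the subsets of a ground set, ordered by inclusion, form a complete lattice where the supremum is union and the infimum is intersection. A minimal Python sketch:

```python
def supremum(sets):
    """Join (least upper bound) of a family of sets: their union."""
    out = set()
    for s in sets:
        out |= s
    return out

def infimum(sets):
    """Meet (greatest lower bound) of a family of sets: their intersection."""
    sets = list(sets)
    out = set(sets[0])
    for s in sets[1:]:
        out &= s
    return out

x, y = {1, 2}, {2, 3}
assert supremum([x, y]) == {1, 2, 3}  # x ∨ y
assert infimum([x, y]) == {2}         # x ∧ y
```

Inclusion is reflexive, antisymmetric, and transitive, and any family of subsets has a union and an intersection, matching the four axioms above.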

October 21, 2024 · 5 min

Hypothesis Testing

We present different methods to test data against hypotheses. Statistical test We consider the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$). We are interested in rejecting or not the null hypothesis. Definition: Null hypothesis The null hypothesis $H_0$ is that considered true in the absence of data (default choice). Here, let $\delta$ denote the decision function used to reject or not the null hypothesis. $$ \delta(x) = \begin{cases} 0 & \text{do not reject $H_0$} \\ 1 & \text{reject $H_0$ in favor of $H_1$} \end{cases} $$ Definition: Error types The Type-I error rate is the rate of false positives: $\alpha = \mathbb{P}(\delta(x) = 1 \mid H_0)$. The Type-II error rate is the rate of false negatives: $\beta = \mathbb{P}(\delta(x) = 0 \mid H_1)$. Example If the question is “Is there a danger?”, the null hypothesis is the absence of any danger. A type-I error corresponds to a false alarm, while a type-II error corresponds to the non-detection of the danger. Info: Neyman-Pearson principle To carry out a hypothesis test, we first set $\alpha$ (test level) and, then, we try to minimize $\beta$ as much as possible. The power of the test is $1 - \beta$. Definition: $p$-value The $p$-value of a sample is the probability, under the null hypothesis, of observing a result at least as extreme as the one actually observed. Example Consider a test of level $\alpha = 5\%$. The null hypothesis is rejected whenever the observed sample has a $p$-value below $\alpha$. If the $p$-value is $1\%$, there is only a $1\%$ probability of observing a sample at least this extreme under the null hypothesis. So, the null hypothesis is rejected with high confidence. Parametric model In parametric models, the hypotheses form a subset of the parameters: ...
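The decision function $\delta$ and the $p$-value can be illustrated with a one-sided binomial test. This is a sketch, not taken from the post: $H_0$ is "the coin is fair", and the $p$-value is the probability under $H_0$ of seeing at least as many heads as observed.

```python
from math import comb

def binom_p_value(k, n, p0=0.5):
    """One-sided p-value: P(at least k successes out of n) under H0: p = p0."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

def delta(k, n, alpha=0.05):
    """Decision function: 1 = reject H0 in favor of H1, 0 = do not reject."""
    return 1 if binom_p_value(k, n) <= alpha else 0

# 60 heads out of 100 flips of a supposedly fair coin:
assert delta(60, 100) == 1   # p-value ≈ 0.028 < 0.05: reject H0
assert delta(55, 100) == 0   # p-value ≈ 0.184: do not reject H0
```

Setting $\alpha$ first and only then deciding, as above, is exactly the Neyman-Pearson recipe: the level is fixed before looking at the data.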

October 15, 2024 · 7 min

Alternative classification techniques

K-Nearest Neighbors classifier The idea is to represent each record in the data set as a point in $\mathbb{R}^n$. Then, to predict the class of a new point $x$, find the $k$ training points that are nearest to $x$. The majority class of these $k$ points is the predicted class of $x$. To run this algorithm, we need to choose a distance function and a value for $k$. ...
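The procedure above can be sketched in a few lines of Python (a minimal illustration, assuming Euclidean distance; the data and names are made up):

```python
import math
from collections import Counter

def knn_predict(train, x, k=3, dist=math.dist):
    """Predict the class of x as the majority class among its k nearest
    training points. `train` is a list of (point, label) pairs."""
    neighbors = sorted(train, key=lambda pl: dist(pl[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
assert knn_predict(train, (0.5, 0.5), k=3) == "A"
assert knn_predict(train, (5.5, 5.5), k=3) == "B"
```

Both choices mentioned in the text appear as parameters: `dist` (here Euclidean via `math.dist`) and `k`.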

October 9, 2024 · 4 min

k-cores and Densest Subgraph

Definition: Induced subgraph A graph $H = (V_H, E_H)$ is an induced subgraph of $G = (V_G, E_G)$ if $V_H \subseteq V_G$ and $E_H = \{(u, v) \in E_G : u, v \in V_H\}$, i.e. $H$ contains exactly the edges of $G$ between vertices of $V_H$. We write $\delta_G(v)$ for the number of edges incident to $v$ in $G$. Definition: $k$-core Given a graph $G$ and $k \geq 0$, a subgraph $H$ of $G$ is a $k$-core if: ...
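The definition is truncated here; in the standard formulation, a $k$-core is an induced subgraph in which every vertex has degree at least $k$. Under that assumption, the maximal $k$-core can be computed by the classic peeling algorithm, sketched below (illustrative code, not from the post):

```python
from collections import defaultdict

def k_core(edges, k):
    """Vertex set of the maximal k-core: repeatedly remove vertices whose
    degree within the remaining induced subgraph is below k."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            if len(adj[v]) < k:
                for u in adj[v]:
                    if u in adj:
                        adj[u].discard(v)  # removing v lowers its neighbors' degrees
                del adj[v]
                changed = True
    return set(adj)

# A triangle (a 2-core) with a pendant vertex attached:
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
assert k_core(edges, 2) == {1, 2, 3}  # the pendant vertex 4 is peeled off
```

Removing a low-degree vertex can push its neighbors below $k$, which is why the peeling must be repeated until no vertex changes.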

October 9, 2024 · 3 min

Bayesian Statistics

Tip: Beta distribution The Beta distribution is a continuous probability distribution defined on the interval $[0, 1]$. It has two parameters: $\theta = (\alpha, \beta)$. Then, we have: $$ \begin{align*} p_\theta(x) &= \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)} \\ \mathbb{E}[X] &= \frac{\alpha}{\alpha + \beta} \\ \mathrm{Var}(X) &= \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}, \end{align*} $$ where $B(\alpha, \beta)$ is the Beta function, defined as: ...
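The density, mean, and variance above translate directly into code. A small sketch (not from the post) using the identity $B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha + \beta)$:

```python
from math import gamma

def beta_pdf(x, a, b):
    """Density of Beta(a, b): x^(a-1) (1-x)^(b-1) / B(a, b)."""
    B = gamma(a) * gamma(b) / gamma(a + b)  # Beta function via Gamma
    return x**(a - 1) * (1 - x)**(b - 1) / B

def beta_mean(a, b):
    return a / (a + b)

def beta_var(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))

# Beta(2, 2) is symmetric around 1/2:
assert abs(beta_mean(2, 2) - 0.5) < 1e-12
assert abs(beta_var(2, 2) - 0.05) < 1e-12    # 4 / (16 * 5)
assert abs(beta_pdf(0.5, 2, 2) - 1.5) < 1e-12  # B(2, 2) = 1/6
```

For $\alpha = \beta = 1$ the density is constant, recovering the uniform distribution on $[0, 1]$.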

October 7, 2024 · 4 min