Source: http://onmyphd.com/?p=mosfet.subthreshold.model

- MOSFET model
- Circuit theory

In the MOSFET model, it was assumed that current flows through the MOSFET channel only when $V_{GS}>V_{TH}$. In reality, **current flows even when $V_{GS}$ is below the threshold voltage**, but it is orders of magnitude weaker than the current in strong inversion. The inversion layer that forms in strong inversion is barely present in this case, which is why this regime is also called **weak inversion**.

Under weak inversion, the relation between current and gate-source voltage becomes exponential, according to the following equation for an NMOS:
$$\begin{equation}I_{DS} = I_S \frac{W}{L}e^{\frac{V_{GB}}{nV_T}}\left(e^{-\frac{V_{SB}}{V_T}} - e^{-\frac{V_{DB}}{V_T}} + \frac{V_{DS}}{V_A}\right)\label{eq:subthresholdeq1}\end{equation}$$
$I_S$ is a characteristic current that sets the scale of the current that leaks through the transistor, $W$ and $L$ are the transistor's width and length, $n$ is the subthreshold slope factor (typically between 1 and 1.5), which defines the effect that the gate voltage has on the drain current, and $V_T=kT/q$ is the thermal voltage. $V_A$ is the Early voltage, a fictitious voltage that represents the influence that $V_{DS}$ has on the drain-source current.
For a PMOS, the expression is similar, with the signs in the exponentials reversed:
$$I_{DS} = I_S \frac{W}{L}e^{-\frac{V_{GB}}{nV_T}}\left(e^{\frac{V_{SB}}{V_T}} - e^{\frac{V_{DB}}{V_T}} + \frac{V_{SD}}{V_A}\right ).$$
Continuing with the NMOS and considering the usual case that the source is at the same potential as the bulk ($V_S=V_B$), $\eqref{eq:subthresholdeq1}$ simplifies to
$$\begin{equation}I_{DS} = I_S \frac{W}{L}e^{\frac{V_{GS}}{nV_T}}\left(1 - e^{-\frac{V_{DS}}{V_T}} + \frac{V_{DS}}{V_A}\right)\label{eq:subthresholdeq2}\end{equation}.$$
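The simplified NMOS expression can be evaluated numerically to get a feel for its exponential behaviour. The following is a minimal sketch, not a calibrated device model: the function names and every default value ($I_S$ = 1 fA, $W/L$ = 10, $n$ = 1.3, $V_A$ = 20 V, $T$ = 300 K) are illustrative assumptions that do not come from any particular process.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temp_k=300.0):
    """Thermal voltage V_T = kT/q in volts."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON


def subthreshold_current(v_gs, v_ds, i_s=1e-15, w_over_l=10.0,
                         n=1.3, v_a=20.0, temp_k=300.0):
    """NMOS weak-inversion drain current with source tied to bulk.

    All default parameter values are hypothetical, chosen only for illustration.
    """
    v_t = thermal_voltage(temp_k)
    gate_term = math.exp(v_gs / (n * v_t))
    drain_term = 1.0 - math.exp(-v_ds / v_t) + v_ds / v_a
    return i_s * w_over_l * gate_term * drain_term


# Sweep V_DS at a fixed V_GS to see the current level off after a few V_T.
for v_ds in (0.0, 0.026, 0.052, 0.104, 0.3):
    i_ds = subthreshold_current(v_gs=0.2, v_ds=v_ds)
    print(f"V_DS = {v_ds:.3f} V -> I_DS = {i_ds:.3e} A")
```

Because the gate term is a pure exponential, the ratio of the currents at two gate voltages depends only on their difference and on $nV_T$, not on $I_S$ or $W/L$.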

As in strong inversion, in weak inversion there is a value of $V_{DS}$ beyond which $V_{DS}$ has only a weak influence on the drain current. This point was previously called the saturation point. In weak inversion, the influence of $V_{DS}$ becomes small when the term $e^{-\frac{V_{DS}}{V_T}}$ is close to 0. That happens when $V_{DS} > 4V_T$: with $V_T \approx 26$ mV at room temperature, the saturation point is at about 104 mV. This is a big difference between weak and strong inversion: the saturation point does not depend on the gate voltage or on any device parameters, and it is rather low. Beyond that point, the influence of $V_{DS}$ on $I_{DS}$ is proportional to $V_A^{-1}$.
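
As a quick numerical check of the $4V_T$ rule (ignoring the $V_{DS}/V_A$ term), the exponential term at that point has decayed to
$$\left. e^{-\frac{V_{DS}}{V_T}} \right|_{V_{DS}=4V_T} = e^{-4} \approx 0.018, \qquad 1 - e^{-4} \approx 0.98,$$
so the current is already within about 2% of its saturated value. The factor 4 is a convention rather than a sharp boundary: at $V_{DS}=3V_T$ the term is $e^{-3}\approx 0.050$, and at $V_{DS}=5V_T$ it is $e^{-5}\approx 0.0067$.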

The small-signal model in subthreshold is the same as in strong inversion, although the parameters are calculated differently.

The parameters are as follows:
$$g_m = \frac{\partial I_{DS}}{\partial V_G} = \frac{I_{DS}}{nV_T}$$
$$g_{mb} = \frac{\partial I_{DS}}{\partial V_B} = \frac{(n-1)I_{DS}}{nV_T}$$
$$\frac{1}{r_o} = g_o = \frac{\partial I_{DS}}{\partial V_{DS}} = \frac{I_{DS}}{V_A}$$
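
The three expressions above can be evaluated directly for a given bias point. The sketch below assumes a hypothetical bias current of 100 nA, $n$ = 1.3, $V_A$ = 20 V and $T$ = 300 K; these numbers and the function name `small_signal_params` are illustrative and not taken from a real device.

```python
K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def small_signal_params(i_ds, n, v_a, temp_k=300.0):
    """Weak-inversion small-signal parameters from the bias current.

    Returns conductances in siemens and r_o in ohms.
    """
    v_t = K_BOLTZMANN * temp_k / Q_ELECTRON
    g_m = i_ds / (n * v_t)                # gate transconductance
    g_mb = (n - 1.0) * i_ds / (n * v_t)   # bulk transconductance
    g_o = i_ds / v_a                      # output conductance
    return {"g_m": g_m, "g_mb": g_mb, "g_o": g_o, "r_o": 1.0 / g_o}


# Hypothetical bias point, for illustration only.
params = small_signal_params(i_ds=100e-9, n=1.3, v_a=20.0)
for name, value in params.items():
    print(f"{name} = {value:.3e}")
```

Two things follow from the formulas: $g_{mb}/g_m = n-1$ regardless of the bias current, and $g_m/I_{DS} = 1/(nV_T)$ is fixed by $n$ and temperature alone.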
Extrinsic capacitances such as overlap capacitances $C_{ov}$, depletion junction capacitances $C_{sb}$ and $C_{db}$ and oxide capacitance $C_{ox}$ are independent of the drain current and remain equal to the strong inversion case.
Intrinsic capacitances depend on the bias condition, because they represent a fraction of the gate oxide capacitance $WLC_{ox}$.
The capacitances from gate to source/drain are only defined by the overlap capacitances, since there is no channel.
$$C_{gs} = WC_{ov}$$
$$C_{gd} = WC_{ov}$$
On the other hand, $C_{gb}$ increases because there is no channel to shield the gate from the substrate (the channel is "floating" and there is a capacitive path between gate and bulk defined by $C_{ox}$ and $C_{dep}$):
$$C_{gb} = \frac{n-1}{n}WLC_{ox} = \kappa WLC_{ox}$$
$$C_{ds} = \text{negligible}$$
Therefore, **$C_{gb}$ becomes the dominant capacitance in subthreshold operation**.

Because the dominant capacitance in weak inversion is $C_{gb}$ rather than $C_{gs}$, the unity-gain frequency is $$f_T = \frac{g_m}{2\pi C_{gb}}.$$
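
To see what this implies in numbers, the sketch below combines $g_m = I_{DS}/(nV_T)$ with $C_{gb} = \frac{n-1}{n}WLC_{ox}$ from above. The device dimensions, the oxide capacitance per unit area and the bias current are hypothetical values picked only to make the example concrete, and the overlap capacitances are ignored.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def unity_gain_frequency(i_ds, n, width, length, c_ox, temp_k=300.0):
    """Weak-inversion f_T = g_m / (2*pi*C_gb), in Hz.

    width and length are in metres, c_ox in F/m^2. Requires n > 1,
    otherwise C_gb from the formula above would be zero.
    """
    v_t = K_BOLTZMANN * temp_k / Q_ELECTRON
    g_m = i_ds / (n * v_t)
    c_gb = (n - 1.0) / n * width * length * c_ox
    return g_m / (2.0 * math.pi * c_gb)


# Hypothetical device: I_DS = 100 nA, n = 1.3, W = 10 um, L = 1 um,
# C_ox = 5e-3 F/m^2 (that is, 5 fF per square micrometre).
f_t = unity_gain_frequency(100e-9, 1.3, 10e-6, 1e-6, 5e-3)
print(f"f_T ~ {f_t / 1e6:.1f} MHz")
```

Since $g_m$ is proportional to $I_{DS}$ while $C_{gb}$ does not depend on it, $f_T$ in this model scales linearly with the bias current.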

The subthreshold slope $S$ is an important parameter for switches because it determines the ratio between the on and off currents. To get high on/off current ratios, we want a low subthreshold slope, so that the same change in $V_{GS}$ changes the drain current by more decades.

Returning to $\eqref{eq:subthresholdeq2}$, let's find out how much $V_{GS}$ needs to decrease in order to reduce the drain current by one decade:
$$S = \frac{\partial V_{GS}}{\partial \log I_{DS}}$$
$$\frac{1}{S} = \frac{\partial \log \left[ I_S \frac{W}{L}e^{\frac{V_{GS}}{nV_T}}\left(1 - e^{-\frac{V_{DS}}{V_T}}\right) \right]}{\partial V_{GS}}$$
$$\frac{1}{S} = \frac{\partial \log e^{\frac{V_{GS}}{nV_T}}}{\partial V_{GS}}$$
$$\frac{1}{S} = \frac{\partial \frac{\ln e^{\frac{V_{GS}}{nV_T}}}{\ln 10} }{\partial V_{GS}}$$
$$\frac{1}{S} = \frac{\partial \frac{V_{GS}}{nV_T} }{\partial V_{GS}\ln 10}$$
$$\frac{1}{S} = \frac{1}{nV_T\ln 10}$$
$$S = nV_T\ln 10$$
For $V_T \approx 26$ mV (room temperature) and the ideal case of $n = 1$, the subthreshold slope of a MOSFET is about 60 mV/decade. In practice, however, slopes are around 70 to 80 mV/decade. Furthermore, as temperature increases, so do $V_T$ and the subthreshold slope.
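
The result $S = nV_T\ln 10$ is easy to tabulate against temperature. The sketch below does so for the ideal $n = 1$ and for $n = 1.3$; the latter is an assumed, illustrative value, as are the chosen temperatures and the function name.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def subthreshold_slope(n, temp_k=300.0):
    """Subthreshold slope S = n * V_T * ln(10), in volts per decade."""
    v_t = K_BOLTZMANN * temp_k / Q_ELECTRON
    return n * v_t * math.log(10.0)


# Compare the ideal case with an assumed n = 1.3 over a few temperatures.
for temp_k in (250.0, 300.0, 350.0, 400.0):
    s_ideal = subthreshold_slope(1.0, temp_k) * 1e3
    s_assumed = subthreshold_slope(1.3, temp_k) * 1e3
    print(f"T = {temp_k:.0f} K: S(n=1) = {s_ideal:.1f} mV/dec, "
          f"S(n=1.3) = {s_assumed:.1f} mV/dec")
```

The slope is linear in both $n$ and absolute temperature, so a device that switches sharply at room temperature turns off less abruptly when hot.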