Source: http://onmyphd.com/?p=channel.length.modulation

Pinch-off

It was previously said that a MOS transistor pinches off its channel when the drain-source voltage reaches the overdrive voltage ($V_{DS} = V_{GS} - V_{TH}$). But what happens as we keep increasing the drain voltage? At the onset of pinch-off, the channel pinches off at the drain, where $V_{GD} = V_{TH}$. More generally, pinch-off occurs at the point in the channel where the potential difference between the gate and that point equals $V_{TH}$. If $V_{DS}$ keeps increasing, the pinch-off point must move to a position $x$ within the channel, somewhere between the source and the drain.
$$V_{Gx} = V_{TH}$$
$$V_G - V_x = V_{TH}$$
$$V_x = V_G - V_{TH}$$
$$V_{xS} = V_{GS} - V_{TH}$$
and
$$V_{DS} > V_{GS} - V_{TH}.$$
The excess voltage $V_{DS} - (V_{GS} - V_{TH})$ drops between the pinch-off point $x$ and the drain. The distance $\Delta L$ between these two points is a depletion region, so the conducting channel is shortened to $L - \Delta L$. Since the channel length is being modulated by $V_{DS}$, we call this effect **channel-length modulation**.
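As a minimal numeric sketch of the relations above, the following splits $V_{DS}$ into the part that drops along the channel ($V_{xS}$) and the part that drops across the depletion region. The helper name `pinch_off_split` and the bias values are assumptions chosen only for illustration.

```python
def pinch_off_split(v_gs, v_th, v_ds):
    """Return (V_xS, V_Dx) for a transistor biased at or beyond pinch-off.

    V_xS: voltage from the pinch-off point x to the source.
    V_Dx: voltage dropped across the depletion region of length Delta L.
    """
    v_ov = v_gs - v_th  # overdrive voltage V_GS - V_TH
    if v_ds < v_ov:
        raise ValueError("no pinch-off: V_DS < V_GS - V_TH")
    v_xs = v_ov         # V_xS = V_GS - V_TH, independent of V_DS
    v_dx = v_ds - v_xs  # the remainder of V_DS drops between x and the drain
    return v_xs, v_dx


# Assumed illustration values: V_GS = 1.5 V, V_TH = 0.5 V, V_DS = 3.0 V
v_xs, v_dx = pinch_off_split(1.5, 0.5, 3.0)
print(f"V_xS = {v_xs:.2f} V, V_Dx = {v_dx:.2f} V")
```

Raising $V_{DS}$ further leaves `v_xs` unchanged and only increases `v_dx`, which is the voltage that widens the depletion region.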

Channel-length modulation

To see the effect of channel length modulation on transistor operation, we replace the length of the transistor by the **effective length $L - \Delta L$** in the $I_{DS}$ equation in saturation:
$$I_{DS} = \frac{1}{2}\mu C_{ox}\frac{W}{L - \Delta L}(V_{GS} - V_{TH})^2$$
$$I_{DS} = \frac{1}{2}\mu C_{ox}\frac{W}{L} \frac{1}{1 - \Delta L/L}(V_{GS} - V_{TH})^2.$$
Assuming that $\Delta L \ll L$, we can linearize $\frac{1}{1 - \Delta L/L}$ around $\Delta L/L = 0$ to get $\frac{1}{1 - \Delta L/L} \approx 1 + \frac{\Delta L}{L}$. Then:
$$\begin{equation}I_{DS} = \frac{1}{2}\mu C_{ox}\frac{W}{L} \left(1 + \frac{\Delta L}{L}\right)(V_{GS} - V_{TH})^2.\label{eq:a}\end{equation}$$
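The accuracy of the linearization $\frac{1}{1 - \Delta L/L} \approx 1 + \frac{\Delta L}{L}$ can be checked numerically. The sketch below compares the exact and linearized factors for a few assumed values of $\Delta L/L$. The function names are hypothetical.

```python
def exact_factor(r):
    """Exact factor 1 / (1 - r), with r = Delta L / L."""
    return 1.0 / (1.0 - r)


def linear_factor(r):
    """First-order approximation 1 + r around r = 0."""
    return 1.0 + r


# Assumed sample ratios Delta L / L, from clearly small to not so small
for r in (0.01, 0.05, 0.1, 0.3):
    exact = exact_factor(r)
    approx = linear_factor(r)
    rel_err = (exact - approx) / exact  # algebraically equal to r**2
    print(f"dL/L = {r:.2f}: exact = {exact:.4f}, "
          f"linear = {approx:.4f}, relative error = {rel_err:.2%}")
```

The relative error works out to $(\Delta L/L)^2$, so the approximation is good as long as $\Delta L \ll L$ and degrades quickly for short channels.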
Next we assume that $\Delta L$ grows linearly with $V_{DS}$, with a proportionality factor $\lambda'$ that is a process parameter (the linear dependence is only approximately true):
$$\Delta L = \lambda' V_{DS}.$$
But usually, a parameter $\lambda = \lambda'/L$ is used instead, so:
$$\begin{equation}\frac{\Delta L}{L} = \frac{\lambda' V_{DS}}{L} = \lambda V_{DS}\label{eq:b}\end{equation}.$$
We can substitute $\eqref{eq:b}$ into $\eqref{eq:a}$ to get
$$I_{DS} = \frac{1}{2}\mu C_{ox}\frac{W}{L}(V_{GS} - V_{TH})^2(1 + \lambda V_{DS}).$$
By the above equation it is easy to see that increasing $V_{DS}$ increases $I_{DS}$. The following plots show the $I_{DS}-V_{DS}$ curve with channel-length modulation.

Now the plateaus are not flat, but have some slope. The meaning and value of the slope are discussed in the small-signal model.
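The sloped plateaus can be reproduced with a short sketch of the saturation equation with the $(1 + \lambda V_{DS})$ term. All parameter values here ($\mu C_{ox}$, $W/L$, $V_{TH}$, $\lambda$) are assumed illustration numbers, not data for any real process, and `ids_saturation` is a hypothetical helper name.

```python
def ids_saturation(v_gs, v_ds, *, k=200e-6, w_over_l=10.0, v_th=0.5, lam=0.1):
    """Saturation drain current with channel-length modulation.

    k        : mu * C_ox in A/V^2 (assumed value)
    w_over_l : W / L (assumed value)
    v_th     : threshold voltage in V (assumed value)
    lam      : lambda in 1/V (assumed value)
    """
    v_ov = v_gs - v_th  # overdrive voltage
    if v_ov <= 0 or v_ds < v_ov:
        raise ValueError("equation is only valid in saturation")
    return 0.5 * k * w_over_l * v_ov ** 2 * (1.0 + lam * v_ds)


v_gs = 1.5
for v_ds in (1.0, 2.0, 3.0):
    i_ds = ids_saturation(v_gs, v_ds)
    print(f"V_DS = {v_ds:.1f} V -> I_DS = {i_ds * 1e3:.3f} mA")

# Slope of the plateau between two saturation points
slope = (ids_saturation(v_gs, 3.0) - ids_saturation(v_gs, 1.0)) / (3.0 - 1.0)
# Current without channel-length modulation (lambda = 0)
i_d0 = ids_saturation(v_gs, 1.0, lam=0.0)
print(f"slope = {slope * 1e3:.3f} mA/V, lambda * I_D0 = {0.1 * i_d0 * 1e3:.3f} mA/V")
```

Because the equation is linear in $V_{DS}$, the slope of each plateau equals $\lambda$ times the current that would flow without channel-length modulation, so curves at larger $V_{GS}$ are steeper.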