Usually - but not always - we assume that in the context of classical mechanics, the $n$ particles all live on the same $d$-dimensional spatial Riemannian manifold $M$ representing the physical space in which the particles move around. (Usually $d \leq 3$ in the "real world".) In this case, the full configuration space $C$ is just the $n$-fold Cartesian product
$$C = M^n := \prod_{k=1}^n M$$
(strictly speaking, equipped with the product topology). The configuration space $C$ is then parameterized by $N = nd$ coordinates $q_I$, $I = 1,\dots, N$, which naturally "factorize" into $I = (k,i)$, where $k = 1,\dots,n$ labels the particle and $i = 1,\dots, d$ labels the coordinate in the spatial manifold $M$. (Technical note: if the particles are indistinguishable, then we additionally need to quotient the configuration space by the action of particle interchange. But in the context of classical - as opposed to quantum - mechanics, we usually assume that particles are distinguishable.)
With all particles on the same manifold $M$, the kinetic energy functional is
$$K[\{q_k(t)\}] = \sum_k \frac{1}{2} m_k\, g_{ij}(q_k)\, \dot{q}_k^i \dot{q}_k^j,$$ where $m_k > 0$ is the mass of the $k$th particle, $q_k(t)$ is a point in the spatial manifold $M$ parametrized by time $t$, $g_{ij}$ is the metric tensor for $M$, $q_k^i$ denotes the $i$th coordinate of the point $q_k$, and we sum over $i$ and $j$.
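As a rough numerical illustration of this formula, here is a minimal sketch that assumes $M$ is the Euclidean plane in polar coordinates $(r, \theta)$, so that $g = \mathrm{diag}(1, r^2)$. The function names `polar_metric` and `kinetic_energy` and the sample masses, positions, and velocities are made up for the example.

```python
import numpy as np

def polar_metric(q):
    """Metric g_ij of the Euclidean plane in polar coordinates q = (r, theta)."""
    r, _theta = q
    return np.array([[1.0, 0.0],
                     [0.0, r**2]])

def kinetic_energy(masses, positions, velocities, metric):
    """K = sum_k (1/2) m_k g_ij(q_k) qdot_k^i qdot_k^j, all particles on one manifold."""
    total = 0.0
    for m, q, qdot in zip(masses, positions, velocities):
        g = metric(q)                      # d x d metric evaluated at particle k's position
        total += 0.5 * m * (qdot @ g @ qdot)
    return total

# Two particles (illustrative values only).
masses = np.array([1.0, 2.0])
positions = np.array([[1.0, 0.0],          # (r, theta) of particle 1
                      [2.0, np.pi / 2]])   # (r, theta) of particle 2
velocities = np.array([[0.5, 1.0],         # (rdot, thetadot) of particle 1
                       [0.0, 3.0]])        # (rdot, thetadot) of particle 2

K = kinetic_energy(masses, positions, velocities, polar_metric)
print(K)
```

Each particle contributes its own quadratic form in its own velocity, with the shared metric evaluated at that particle's position.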
But occasionally, it's useful to allow the $n$ different particles to live on different spatial manifolds $M_k$, $k = 1, \dots, n$, which can have different dimensions $d_k$. (Usually, these are different submanifolds of physical space onto which the different particles are confined.) In this case, the story is basically the same, but the notation gets a bit more complicated. The full configuration space is now the Cartesian product
$$C = \prod_{k=1}^n M_k$$
(again, endowed with the product topology), with dimension $N = \sum_k d_k$. The kinetic energy functional is now
$$K[\{q_k(t)\}] = \sum_k \frac{1}{2} m_k\, g^{(k)}_{{i_k}{j_k}}(q_k)\, \dot{q}_k^{i_k} \dot{q}_k^{j_k}.$$
The notation is the same as before, except now $g^{(k)}$ denotes the metric tensor for the manifold $M_k$ and $i_k, j_k = 1, \dots, d_k$ denote the coordinates of $M_k$ and are summed over.
In either case, we can combine the multi-index $(k, i_k)$ into a single index $I = 1, \dots, N$ on the whole configuration space and write
$$K[\{q_k(t)\}] = \frac{1}{2} M_{IJ}\, \dot{q}_I \dot{q}_J,$$
where we sum over the configuration-space coordinates $I, J = 1, \dots, N$. The symmetric "mass matrix"
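$$M_{IJ} = \delta_{kl}\, m_k\, g^{(k)}_{{i_k}{j_l}}(q_k), \qquad I = (k, i_k),\quad J = (l, j_l),$$
defines, at each point $q$ of $C$, a symmetric bilinear form on velocities, i.e. a map $T_q C \times T_q C \to \mathbb{R}$. $M_{IJ}$ is only nonzero if the indices $I$ and $J$ have the same particle subindex ($k = l$), so it doesn’t matter whether we use the $k$ from index $I$ or $J$.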
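To make the block structure concrete, here is a small sketch that assembles the mass matrix for two particles on different manifolds and checks that the compact expression reproduces the particle-by-particle sum. The setup is an assumption chosen for illustration: particle 1 on a circle of radius 2 ($d_1 = 1$), particle 2 on a unit sphere at polar angle $\theta = \pi/2$ ($d_2 = 2$). The helper name `mass_matrix` is hypothetical.

```python
import numpy as np

def mass_matrix(masses, metrics):
    """Assemble M as the direct sum of the blocks m_k * g^(k)."""
    dims = [g.shape[0] for g in metrics]
    N = sum(dims)
    M = np.zeros((N, N))
    offset = 0
    for m, g, d in zip(masses, metrics, dims):
        # Block k occupies rows/columns offset .. offset + d_k - 1.
        M[offset:offset + d, offset:offset + d] = m * g
        offset += d
    return M

theta = np.pi / 2
metrics = [
    np.array([[4.0]]),                      # circle of radius 2: g = R^2
    np.diag([1.0, np.sin(theta) ** 2]),     # unit sphere: g = diag(1, sin^2 theta)
]
masses = [3.0, 0.5]

M = mass_matrix(masses, metrics)

# Velocities in the flattened index I = 1..N, here N = 1 + 2 = 3.
qdot = np.array([1.0, 2.0, -1.0])
K_compact = 0.5 * qdot @ M @ qdot

# Same quantity computed block by block, one particle at a time.
per_particle = [qdot[:1], qdot[1:]]
K_blocks = sum(0.5 * m * (v @ g @ v)
               for m, g, v in zip(masses, metrics, per_particle))

print(M)
print(K_compact, K_blocks)
```

The off-diagonal blocks of `M` stay zero because no term in the kinetic energy couples the velocities of different particles.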
Adopting this compact notation is sometimes useful, but doing so hides the fact that the mass matrix is block diagonal and can be written as the direct sum of $n$ different smaller (symmetric positive-definite) matrices $m_k\, g^{(k)}$. So the mass matrix $M$ only has
$$\sum_{k=1}^n \frac{1}{2} d_k (d_k + 1) = \frac{1}{2} \left(\sum_{k=1}^n d_k^2 + N \right)$$
independent components at each point of $C$, which for $n \geq 2$ is fewer than the
$$\frac{1}{2} N (N+1) = \frac{1}{2} \left(\sum_{k,l=1}^n d_k d_l + N \right)$$
independent components that we would have for a general symmetric $N \times N$ matrix.
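As a quick check of this counting, subtracting the two expressions shows that the shortfall is exactly the number of entries in the vanishing off-diagonal blocks:
$$\frac{1}{2} N (N+1) - \sum_{k=1}^n \frac{1}{2} d_k (d_k + 1) = \frac{1}{2} \sum_{k \neq l} d_k d_l = \sum_{k < l} d_k d_l.$$
For instance (an illustrative choice, not taken from the discussion above), for two particles with $d_1 = 1$ and $d_2 = 2$ we have $N = 3$: the block-diagonal count is $1 + 3 = 4$, the general symmetric count is $6$, and the difference is $d_1 d_2 = 2$. For $n$ particles all moving in $d = 3$ dimensions, the counts are $6n$ versus $\frac{1}{2}\, 3n (3n + 1)$, e.g. $12$ versus $21$ for $n = 2$.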