MATHEMATICAL MODELLING OF TECHNOLOGICAL PROCESSES AND SYSTEMS
Topology as a lens for semantic organization in transformer embeddings
This paper examines the geometric structure of sentence embeddings through the lens of persistent homology. The goal is to determine whether semantic similarity produces distinctive topological patterns in a controlled setting. To isolate semantic effects, a single sentence template was combined with different target words, yielding two point clouds in a transformer embedding space: one derived from semantically similar words and one from dissimilar words. A Vietoris–Rips filtration was applied to both clouds, and the resulting persistence diagrams were summarized by average lifetime, persistence entropy of the birth–death intervals, and the area under the Betti curve. The results show a coherent difference across homological dimensions: similar words generate stable connected components with low variability, while dissimilar words produce a richer set of cycle features that persist across a broader range of scales. These findings indicate that persistent homology can capture multi-scale structural differences in embedding spaces that are invisible to standard distance-based comparisons. Although the experiment is intentionally simple, it highlights the potential of topological methods for studying how semantic structure is distributed across the scales of a neural embedding space.
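A minimal sketch of the dimension-0 part of this pipeline (my own illustration, not the authors' code): for H0, the Vietoris–Rips persistence diagram can be read directly off the minimum spanning tree of the pairwise-distance graph, since every component is born at scale 0 and dies at the MST edge weight where it merges into an older one. This keeps the example NumPy-only; the function names and the synthetic point clouds below are hypothetical stand-ins, and a full analysis across dimensions would use a dedicated TDA library.

```python
import numpy as np

def h0_persistence(points):
    """Finite death times of the H0 Vietoris-Rips persistence diagram.
    All components are born at scale 0; a component dies when it merges
    with an older one, which happens exactly at the minimum-spanning-tree
    edge weights (Kruskal with union-find). One component never dies and
    is omitted, as is standard for the summaries used here."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(w)  # a component dies at this merge scale
    return np.array(deaths)

def summaries(deaths):
    """Average lifetime, persistence entropy, and area under the Betti-0
    curve (integral of the component count up to the last merge)."""
    life = deaths  # H0 births are all 0, so lifetime = death time
    avg = life.mean()
    p = life / life.sum()
    entropy = -(p * np.log(p)).sum()
    # Betti-0 curve: starts at n components, drops by one at each death.
    t = np.concatenate([[0.0], np.sort(deaths)])
    betti = np.arange(len(deaths) + 1, 0, -1)  # n, n-1, ..., 1
    area = np.sum(betti[:-1] * np.diff(t))
    return avg, entropy, area

# Example: a tight cluster vs. widely spread points, synthetic stand-ins
# for the "similar" and "dissimilar" embedding clouds.
rng = np.random.default_rng(0)
tight = rng.normal(0.0, 0.1, size=(20, 5))
spread = rng.normal(0.0, 2.0, size=(20, 5))
for name, cloud in [("tight", tight), ("spread", spread)]:
    avg, ent, area = summaries(h0_persistence(cloud))
    print(name, round(avg, 3), round(ent, 3), round(area, 3))
```

The H0 summaries capture the "stable connected components" side of the result; the cycle features in higher dimensions require a full Vietoris–Rips reduction (e.g., via GUDHI or Ripser), which is beyond this sketch.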