pyclugen
pyclugen is a Python package for generating multidimensional clusters. Each
cluster is supported by a line segment, whose position, orientation and length
guide where the respective points are placed. The clugen() function is provided
for this purpose, along with a number of auxiliary functions that clugen() uses
internally in a modular fashion. Users can swap these auxiliary functions for
their own customized versions, fine-tuning their cluster generation strategies,
or even use them as the basis for their own generation algorithms.
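The idea of a cluster "supported by a line segment" can be illustrated with a minimal, standard-library-only sketch. This is not pyclugen's implementation or API: the names line_cluster, normal_proj and uniform_proj are hypothetical, the example is restricted to 2D, and the choice of sigma = length / 6 is an assumption made only for illustration. It also shows, in simplified form, how a helper function might be swapped for a custom one.

```python
import math
import random


def normal_proj(length, n, rng):
    """Hypothetical helper: positions along the line, concentrated near its centre.

    Assumes sigma = length / 6 so that most positions fall within the segment.
    """
    return [rng.gauss(0.0, length / 6) for _ in range(n)]


def uniform_proj(length, n, rng):
    """Hypothetical helper: positions spread uniformly over the segment."""
    return [rng.uniform(-length / 2, length / 2) for _ in range(n)]


def line_cluster(center, angle, length, lateral_disp, n, rng,
                 proj_dist_fn=normal_proj):
    """Generate n 2D points around a support line segment (illustrative only).

    center       -- (x, y) midpoint of the supporting segment
    angle        -- orientation of the segment, in radians
    length       -- length of the segment
    lateral_disp -- standard deviation of the offset perpendicular to the line
    proj_dist_fn -- swappable strategy placing point projections on the line
    """
    dx, dy = math.cos(angle), math.sin(angle)  # unit vector along the line
    nx, ny = -dy, dx                           # unit vector normal to the line
    points = []
    for t in proj_dist_fn(length, n, rng):
        off = rng.gauss(0.0, lateral_disp)     # perpendicular jitter
        points.append((center[0] + t * dx + off * nx,
                       center[1] + t * dy + off * ny))
    return points


rng = random.Random(42)  # seeded for reproducibility
pts_norm = line_cluster((0.0, 0.0), 0.0, 20.0, 2.0, 200, rng)
# Swap the default projection strategy for a custom one
pts_unif = line_cluster((50.0, 10.0), math.pi / 4, 20.0, 2.0, 200, rng,
                        proj_dist_fn=uniform_proj)
```

Passing a different proj_dist_fn changes how points spread along the line without touching the rest of the routine, which is the kind of modularity the paragraph above describes.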
Installation
Install from PyPI:
pip install pyclugen
Or directly from GitHub:
pip install git+https://github.com/clugen/pyclugen.git
Quick start
from pyclugen import clugen
import matplotlib.pyplot as plt

# 2D example: 4 clusters, 400 points in total
out2 = clugen(2, 4, 400, [1, 0], 0.4, [50, 10], 20, 1, 2)
plt.scatter(out2.points[:, 0], out2.points[:, 1], c=out2.clusters)
plt.show()

# 3D example: 5 clusters, 10000 points in total
out3 = clugen(3, 5, 10000, [0.5, 0.5, 0.5], 0.2, [10, 10, 10], 10, 1, 2)
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(out3.points[:, 0], out3.points[:, 1], out3.points[:, 2], c=out3.clusters)
plt.show()
Further reading
The clugen algorithm and its several implementations are detailed in the following reference (please cite it if you use this software):
- Fachada, N. & de Andrade, D. (2023). Generating multidimensional clusters with support lines. Knowledge-Based Systems, 277, 110836. https://doi.org/10.1016/j.knosys.2023.110836 (arXiv preprint)