C++ - How to create n-dimensional test data for cluster analysis?


I'm working on a C++ implementation of k-means and therefore need n-dimensional test data. To begin with, 2D points are sufficient, since they can be visualized in a 2D image, but I'd prefer a general approach that supports n dimensions.

There was an answer here on Stack Overflow that proposed concatenating sequential vectors of random numbers with different offsets and spreads, but I'm not sure how to create those without including a third-party library.

Below is the method declaration I have so far; it contains the parameters that should vary. It can be changed if necessary, with the exception of data, which needs to be a pointer type since I'm using OpenCL.

auto populatetestdata(float** data, uint8_t dimension, uint8_t clusters, uint32_t elements) -> void; 

Another problem that came to mind is the efficient detection/avoidance of collisions when generating random numbers. Couldn't that be a performance bottleneck, e.g. if one is generating 100k numbers in a domain of 1M values, i.e. if the ratio between the count of generated numbers and the size of the number space isn't small enough?


Question: How can I efficiently create n-dimensional test data for cluster analysis? What concepts do I need to follow?

It's possible to use the C++11 (or Boost) random facilities to create the clusters, but it's a bit of work.

  1. std::normal_distribution can generate univariate normal samples with 0 mean (its default parameters are mean 0 and standard deviation 1).

  2. Using 1., you can sample a normal vector (just create an n-dimensional vector of such samples).

  3. If you take a vector n from 2. and output A n + b, you've moved the center to b and reshaped the distribution by A. (In particular, for 2 and 3 dimensions it's easy to build A as a rotation matrix.) So, repeatedly sampling as in 2. and performing this transformation gives you a sample centered at b.

  4. Choose k pairs of A, b, and generate your k clusters.
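The four steps above can be sketched roughly as follows, using only the standard library. This is a minimal sketch under assumptions, not a definitive implementation: the names ClusterSpec, sampleStandardNormal, affine and generateClusters are hypothetical, A is assumed to be stored row-major, and the points are written to one flat buffer rather than through the float** of the declaration in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical description of one cluster: an n x n matrix A (row-major)
// and a center vector b of length n.
struct ClusterSpec {
    std::vector<float> A; // n * n entries, row-major
    std::vector<float> b; // n entries
};

// Step 2: draw an n-dimensional vector of independent N(0, 1) samples.
std::vector<float> sampleStandardNormal(std::size_t n, std::mt19937& rng) {
    std::normal_distribution<float> dist(0.0f, 1.0f);
    std::vector<float> v(n);
    for (auto& x : v) x = dist(rng);
    return v;
}

// Step 3: compute A * v + b for one cluster.
std::vector<float> affine(const ClusterSpec& c, const std::vector<float>& v) {
    const std::size_t n = c.b.size();
    assert(c.A.size() == n * n && v.size() == n);
    std::vector<float> out(c.b); // start from b, then add A * v
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            out[i] += c.A[i * n + j] * v[j];
    return out;
}

// Step 4: generate pointsPerCluster points for each (A, b) pair.
// Point p occupies the range [p * n, (p + 1) * n) of the returned buffer.
std::vector<float> generateClusters(const std::vector<ClusterSpec>& specs,
                                    std::size_t n,
                                    std::size_t pointsPerCluster,
                                    std::mt19937& rng) {
    std::vector<float> data;
    data.reserve(specs.size() * pointsPerCluster * n);
    for (const auto& spec : specs) {
        assert(spec.b.size() == n);
        for (std::size_t p = 0; p < pointsPerCluster; ++p) {
            const auto point = affine(spec, sampleStandardNormal(n, rng));
            data.insert(data.end(), point.begin(), point.end());
        }
    }
    return data;
}
```

Seeding the std::mt19937 with a fixed value makes the test data reproducible. The flat layout is a design choice for this sketch: a single contiguous float buffer is usually easy to hand to OpenCL, but the function could equally fill a caller-provided pointer.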


Notes

  • You can generate different clustering scenarios using different types of A matrices. E.g., if A is a non-length-preserving matrix multiplied by a rotation matrix, you can get "paraboloid" clusters (it's actually interesting to make them wider along the vectors connecting the centers).

  • You can either hardcode the "center" vectors b, or generate them using a distribution like the one used for the sampled vectors above (though perhaps a uniform one in this case).
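As a rough illustration of both notes in two dimensions, the sketch below builds A as a rotation multiplied by an axis-aligned scaling, and draws the centers b from a uniform distribution. The function names makeRotatedScaling and randomCenters, the row-major layout, and the parameters lo and hi are assumptions made for this example only.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Row-major 2 x 2 matrix A = R(theta) * diag(sx, sy): the sample is first
// stretched by sx and sy along the axes, then rotated by theta (radians).
// With sx != sy the matrix does not preserve lengths, so the cluster
// comes out elongated.
std::array<float, 4> makeRotatedScaling(float theta, float sx, float sy) {
    const float c = std::cos(theta);
    const float s = std::sin(theta);
    return {c * sx, -s * sy,
            s * sx,  c * sy};
}

// Draw k centers of dimension n, each coordinate uniform between lo and hi.
std::vector<std::vector<float>> randomCenters(std::size_t k, std::size_t n,
                                              float lo, float hi,
                                              std::mt19937& rng) {
    assert(lo < hi);
    std::uniform_real_distribution<float> dist(lo, hi);
    std::vector<std::vector<float>> centers(k, std::vector<float>(n));
    for (auto& b : centers)
        for (auto& x : b) x = dist(rng);
    return centers;
}
```

To make a cluster wider along the line towards another center, one option is to pick theta as the angle of the vector between the two centers (for instance via std::atan2) and choose sx larger than sy.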

