Clustering data with categorical variables in Python
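The line above names only the task, not a method, so what follows is a minimal sketch of one possible approach and not a prescribed one. It assumes a k-modes-style procedure: records are compared by counting mismatched attributes (simple matching, or Hamming, distance), and each cluster is summarised by the per-column most frequent value (its mode) instead of a mean. The function names `hamming`, `column_modes`, and `k_modes`, the `init` parameter, and the toy records are all illustrative assumptions.

```python
import random
from collections import Counter


def hamming(a, b):
    """Count the positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Return the most frequent value in each column of `rows`."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, k, init=None, max_iter=20, seed=0):
    """Cluster tuples of categorical values; returns (labels, modes)."""
    rng = random.Random(seed)
    if init is not None:
        modes = list(init)  # caller-supplied starting modes
    else:
        # Start from k distinct records so no two initial modes coincide.
        modes = rng.sample(sorted(set(records)), k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(r, modes[j])) for r in records
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each mode; an empty cluster keeps its old mode.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical toy data: (colour, size, shape) for six items.
records = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "flat"),
    ("green", "large", "square"),
]

labels, modes = k_modes(records, k=2, init=[records[0], records[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(modes)   # [('red', 'small', 'round'), ('blue', 'large', 'square')]
```

Categories are compared only for equality here, so no numeric encoding is needed. The result depends on the starting modes; without `init`, the sketch picks them at random from the distinct records, and running it with several seeds and comparing the outcomes is a reasonable precaution.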