BHI                 package:clValid                 R Documentation

_B_i_o_l_o_g_i_c_a_l _H_o_m_o_g_e_n_e_i_t_y _I_n_d_e_x

_D_e_s_c_r_i_p_t_i_o_n:

     Calculates the biological homogeneity index (BHI) for a given
     statistical clustering partition and biological annotation.

_U_s_a_g_e:

     BHI(statClust, annotation, names = NULL, category = "all")

_A_r_g_u_m_e_n_t_s:

statClust: An integer vector indicating the statistical cluster
          partitioning

annotation: Either a character string naming the Bioconductor
          annotation package for mapping genes to GO categories, or a
          list with the names of the functional classes and the
          observations belonging to each class

   names: A vector of labels to associate with the 'genes', to be used
          in conjunction with the Bioconductor annotation package.  Not
          needed if 'annotation' is a list providing the functional
          classes. 

category: Indicates the GO categories to use for biological
          validation.  Can be one of "BP" (biological process), "MF"
          (molecular function), "CC" (cellular component), or "all".

_D_e_t_a_i_l_s:

     The BHI measures how homogeneous the clusters are biologically. 
     The measure checks whether genes placed in the same statistical
     cluster also belong to the same functional classes.  The BHI is in
     the range [0,1], with larger values corresponding to more
     biologically homogeneous clusters. For details see the package
     vignette.
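
     The idea can be illustrated in a few lines of base R.  The
     following is a minimal sketch, not the package's implementation:
     it assumes the definition in Datta and Datta (2006), in which each
     cluster contributes the fraction of ordered pairs of annotated
     genes that share at least one functional class, and the BHI is the
     average of these fractions over clusters.  The name 'bhi_sketch',
     the toy gene labels, and the choice to score a cluster with fewer
     than two annotated genes as 0 are all assumptions made for
     illustration.

```r
# Hypothetical helper, for illustration only (not part of clValid).
# cluster: named integer vector of cluster labels (names = gene IDs)
# classes: named list; each element holds the gene IDs in one functional class
bhi_sketch <- function(cluster, classes) {
  genes <- names(cluster)
  # Logical membership matrix: rows = genes, columns = functional classes
  member <- sapply(classes, function(g) genes %in% g)
  member <- matrix(member, nrow = length(genes),
                   dimnames = list(genes, names(classes)))
  annotated <- rowSums(member) > 0
  per_cluster <- sapply(sort(unique(cluster)), function(k) {
    idx <- which(cluster == k & annotated)
    n_k <- length(idx)
    # Assumption: a cluster with fewer than two annotated genes scores 0
    if (n_k < 2) return(0)
    m <- member[idx, , drop = FALSE] * 1
    shared <- (m %*% t(m)) > 0   # TRUE where genes i and j share a class
    diag(shared) <- FALSE        # exclude pairs of a gene with itself
    sum(shared) / (n_k * (n_k - 1))
  })
  mean(per_cluster)
}

# Toy data: cluster 1 mixes two classes, cluster 2 is homogeneous
cluster <- c(g1 = 1, g2 = 1, g3 = 1, g4 = 2, g5 = 2)
classes <- list(A = c("g1", "g2"), B = c("g3", "g4", "g5"))
bhi_sketch(cluster, classes)
```

     In the toy data, cluster 1 has one matching pair out of three
     (score 1/3) and cluster 2 is fully homogeneous (score 1), so the
     sketch returns their average, 2/3.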

_V_a_l_u_e:

     Returns the BHI measure as a numeric value.

_N_o_t_e:

     The main function for cluster validation is 'clValid', and users
     should call this function directly if possible.
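
     As a hedged illustration of that route, the sketch below passes a
     list of functional classes to 'clValid' with
     'validation = "biological"', which reports the BHI alongside the
     BSI.  It assumes the clValid package and its 'mouse' data set are
     installed; the subset of 25 genes and the range of cluster numbers
     are arbitrary choices, and nothing is run if the package is
     missing.

```r
# Sketch only: runs the biological validation through clValid() itself.
if (requireNamespace("clValid", quietly = TRUE)) {
  library(clValid)
  data(mouse)
  express <- mouse[1:25, c("M1", "M2", "M3", "NC1", "NC2", "NC3")]
  rownames(express) <- mouse$ID[1:25]

  # Functional classes as a named list, dropping uninformative labels
  fc <- tapply(rownames(express), mouse$FC[1:25], c)
  fc <- fc[!names(fc) %in% c("EST", "Unknown")]

  # Biological validation (BHI and BSI) for 2 to 4 hierarchical clusters
  bio <- clValid(express, nClust = 2:4, clMethods = "hierarchical",
                 validation = "biological", annotation = fc)
  summary(bio)
}
```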

_A_u_t_h_o_r(_s):

     Guy Brock, Vasyl Pihur, Susmita Datta, Somnath Datta

_R_e_f_e_r_e_n_c_e_s:

     Datta, S. and Datta, S. (2006). Methods for evaluating clustering
     algorithms for gene expression data using a reference set of
     functional classes. BMC Bioinformatics 7:397.

_S_e_e _A_l_s_o:

     For a description of the function 'clValid' see 'clValid'.

     For a description of the class 'clValid' and all available methods
     see 'clValidObj' or 'clValid-class'.

     For additional help on the other validation measures see
     'connectivity', 'dunn', 'stability', and 'BSI'.

_E_x_a_m_p_l_e_s:

     data(mouse)
     express <- mouse[1:25,c("M1","M2","M3","NC1","NC2","NC3")]
     rownames(express) <- mouse$ID[1:25]
     ## hierarchical clustering
     Dist <- dist(express,method="euclidean")
     clusterObj <- hclust(Dist, method="average")
     nc <- 4 ## number of clusters
     cluster <- cutree(clusterObj,nc)

     ## first way - functional classes predetermined
     fc <- tapply(rownames(express), mouse$FC[1:25], c)
     ## drop uninformative classes; %in% is safe even if a label is absent
     fc <- fc[!names(fc) %in% c("EST","Unknown")]
     BHI(cluster, fc)

     ## second way - using Bioconductor
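     ## the old "GO" and "moe430a" packages were superseded by the
     ## Bioconductor packages "GO.db" and "moe430a.db"
     if(require("Biobase") && require("annotate") && require("GO.db") &&
        require("moe430a.db")) {
       BHI(cluster, annotation="moe430a.db", names=rownames(express), category="all")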
     }

