Retrieves features (usually genes) and their alignment (loadings) with the factors. Allows for the selection of features whose alignments are high relative to other features. Useful for functional interpretation of factors.

Usage

# S4 method for class 'FactorisedExperiment'
getAlignedFeatures(
  object,
  loading_threshold = 0.5,
  proportional_threshold = 0.01,
  feature_id_col = "rownames",
  format = "list",
  center_loadings = FALSE
)

Arguments

object

A FactorisedExperiment object.

loading_threshold

A value between 0 and 1 giving the threshold as a proportion of the maximal loading. A value of 0.5 (default) means that features will be selected if their factor alignment (derived from the loadings slot) is at least 50% of that of the most highly aligned feature.

proportional_threshold

A value between 0 and 1 indicating the maximal proportion of features to be returned. A value of 0.01 (default) means that a maximum of 1% of the input features (usually genes) will be returned for each factor. These will be the features with the highest loadings for that factor.

feature_id_col

The column in rowData(object) that will be used as a feature ID. Setting this to "rownames" (default) instead uses rownames(object).

format

A string specifying the format in which to return the results, or a function. See the Value section below.

center_loadings

If TRUE, loadings will be centered column-wise to have a mean of 0.
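
To illustrate how loading_threshold and proportional_threshold might interact, here is a minimal base R sketch for a single factor. It is not the package's implementation: select_aligned is a hypothetical helper, and it assumes that features are ranked by absolute loading (as the negative values in the example output suggest) and that a feature must satisfy both criteria to be selected.

```r
# Hypothetical helper, not part of ReducedExperiment.
# Assumes selection is based on absolute loadings and that both
# thresholds must be satisfied for a feature to be kept.
select_aligned <- function(loadings, loading_threshold = 0.5,
                           proportional_threshold = 0.01) {
    abs_loadings <- abs(loadings)

    # Criterion 1: at least `loading_threshold` times the largest absolute loading
    passes_loading <- abs_loadings >= loading_threshold * max(abs_loadings)

    # Criterion 2: within the top `proportional_threshold` fraction of features
    max_features <- floor(proportional_threshold * length(loadings))
    ranks <- rank(-abs_loadings, ties.method = "first")
    passes_proportion <- ranks <= max_features

    keep <- passes_loading & passes_proportion
    selected <- names(loadings)[keep]

    # Return the most strongly aligned features first
    selected[order(-abs_loadings[keep])]
}

# Toy loadings for one factor across eight features
loadings <- c(
    gene_a = -2.4, gene_b = 0.3, gene_c = 2.1, gene_d = -1.0,
    gene_e = 0.1, gene_f = 1.3, gene_g = -0.2, gene_h = 0.5
)

# Three features pass the 50% loading threshold (gene_a, gene_c, gene_f),
# but the proportional threshold caps the result at 25% of 8 = 2 features
selected <- select_aligned(loadings, loading_threshold = 0.5,
                           proportional_threshold = 0.25)

# A stricter loading threshold leaves only the top feature
strict <- select_aligned(loadings, loading_threshold = 0.9,
                         proportional_threshold = 0.25)
```

In this sketch the proportional threshold acts as a cap on the number of features, while the loading threshold can reduce the selection further when few features are strongly aligned.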

Value

If the format argument is "list", then a list will be returned with an entry for each factor, each containing a vector of input features. Otherwise, if format is "data.frame", a data.frame is returned with a row for each gene-factor combination. The format argument can also be a function to be applied to the output data.frame before returning the results.
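
As a sketch of the function option, the snippet below defines a hypothetical formatter, add_direction, and applies it to a small mock data.frame. The mock only mimics the component, feature and value columns shown in the example output further down; it is an assumption that a function passed as format receives a data.frame of this shape.

```r
# Mock of the data.frame output, using the column names and values
# shown in the Examples section (assumed shape, built by hand here)
aligned_df <- data.frame(
    component = c("factor_1", "factor_1", "factor_2"),
    feature = c("feature_59", "feature_32", "feature_62"),
    value = c(-2.308496, -2.115002, 3.525867)
)

# Hypothetical formatter: label each feature by the sign of its loading
add_direction <- function(df) {
    df$direction <- ifelse(df$value > 0, "positive", "negative")
    df
}

formatted <- add_direction(aligned_df)

# With the package loaded, the same function could be supplied directly:
# getAlignedFeatures(fe, format = add_direction)
```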

Author

Jack Gisby

Examples

# Get a random matrix with rnorm, with 100 rows (features)
# and 20 columns (observations)
X <- ReducedExperiment:::.makeRandomData(100, 20, "feature", "obs")

# Estimate 5 factors based on the data matrix
fe <- estimateFactors(X, nc = 5)

# Get the genes highly aligned with each factor as a list
aligned_features <- getAlignedFeatures(fe, proportional_threshold = 0.03)
aligned_features
#> $factor_1
#> [1] "feature_59" "feature_32" "feature_68"
#> 
#> $factor_2
#> [1] "feature_62" "feature_99" "feature_9" 
#> 
#> $factor_3
#> [1] "feature_1"  "feature_45" "feature_36"
#> 
#> $factor_4
#> [1] "feature_32" "feature_40" "feature_31"
#> 
#> $factor_5
#> [1] "feature_6"  "feature_30" "feature_98"
#> 

# Can also view as a data.frame
head(getAlignedFeatures(fe, format = "data.frame", proportional_threshold = 0.03))
#>            component    feature     value loadings_centered loading_threshold
#> feature_59  factor_1 feature_59 -2.308496             FALSE               0.5
#> feature_32  factor_1 feature_32 -2.115002             FALSE               0.5
#> feature_68  factor_1 feature_68  2.080829             FALSE               0.5
#> feature_62  factor_2 feature_62  3.525867             FALSE               0.5
#> feature_99  factor_2 feature_99  2.947636             FALSE               0.5
#> feature_9   factor_2  feature_9 -2.829348             FALSE               0.5
#>            proportional_threshold
#> feature_59                   0.03
#> feature_32                   0.03
#> feature_68                   0.03
#> feature_62                   0.03
#> feature_99                   0.03
#> feature_9                    0.03