Perform SVD — FastPCA • FastPCA

This function performs either randomized SVD or exact SVD to compute the singular value decomposition of large matrices.

FastPCA(
  input_r_matrix,
  k = 100,
  p = 10,
  q_iter = 2,
  exact = FALSE,
  backend = c("r", "rtorch", "pytorch", "irlba"),
  device = c("CPU", "GPU"),
  cores = 4,
  ...
)

Arguments

input_r_matrix: A numeric R matrix. It's assumed that rows are observations (e.g., samples) and columns are features (e.g., genes). The function will transpose this for the PyTorch or Tinygrad SVD based on the original Python script's logic.
k: Integer. The number of singular values/vectors to compute.
p: Integer. Oversampling parameter (default: 10).
q_iter: Integer. Number of power iterations (default: 2).
exact: Logical. Whether to compute the exact SVD instead of the randomized SVD. Only works with the 'pytorch' backend.
backend: Character. Which backend to use: one of 'r', 'rtorch', 'pytorch', or 'irlba'. Tinygrad is not implemented, pending maturation of tinygrad. See Details for information about backends.
device: Character. Which device to use, either "CPU" or "GPU".
cores: Integer. Number of CPU cores to use with the backend.
...: Other parameters to pass to irlba when backend is either 'r' or 'irlba'.
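
To make the roles of k, p, and q_iter concrete, the following is a minimal base-R sketch of a generic randomized SVD (random range finder with power iterations). It is an illustration under stated assumptions, not FastPCA's internal code: the function name rsvd_sketch and its return names (u, d, v, following base::svd) are hypothetical, and the real backends may differ in detail.

```r
# Hypothetical helper for illustration only; not part of FastPCA.
# A: numeric matrix (m x n); k: target rank; p: oversampling; q_iter: power iterations.
rsvd_sketch <- function(A, k = 5, p = 10, q_iter = 2) {
  l <- min(k + p, nrow(A), ncol(A))                 # sketch width: rank plus oversampling
  Omega <- matrix(rnorm(ncol(A) * l), ncol(A), l)   # Gaussian test matrix (n x l)
  Q <- qr.Q(qr(A %*% Omega))                        # orthonormal basis for the sampled range
  for (i in seq_len(q_iter)) {
    # Power iteration: sharpens the basis when singular values decay slowly.
    Q <- qr.Q(qr(crossprod(A, Q)))                  # t(A) %*% Q, re-orthonormalized
    Q <- qr.Q(qr(A %*% Q))
  }
  B <- crossprod(Q, A)                              # small l x n projected matrix
  s <- svd(B)                                       # exact SVD of the small matrix
  k <- min(k, l)
  idx <- seq_len(k)
  list(
    u = Q %*% s$u[, idx, drop = FALSE],             # m x k left singular vectors
    d = s$d[idx],                                   # k singular values
    v = s$v[, idx, drop = FALSE]                    # n x k right singular vectors
  )
}

# Usage on a small features-by-samples matrix, compared with base::svd.
set.seed(123)
X <- matrix(runif(20 * 100, min = 1, max = 100), nrow = 20, ncol = 100)
res <- rsvd_sketch(t(X), k = 5)
exact <- svd(t(X), nu = 5, nv = 5)
print(cbind(randomized = res$d, exact = exact$d[1:5]))
```

In this sketch, a larger p widens the random sketch and a larger q_iter adds passes over the matrix; both trade extra computation for a closer approximation of the leading k singular values.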

Value

A list containing:

U: The left singular vectors (R matrix). Dimensions: Features x k.
S: The singular values (R numeric vector). Length: k.
Vh: The transpose of the right singular vectors (R matrix). Dimensions: Samples x k.

All results are moved to CPU by the Python script and returned as R objects.

Details

Depending on the backend chosen, the session may need to be reset with rstudioapi::restartSession(). This is mainly due to conflicts between underlying system-level variables used by 'rtorch' and 'pytorch': once one is used in a session, the other will fail. In testing, using 'rtorch' and then starting the conda environment resulted in the environment being loaded but the Python libraries/modules not being available. Unless another backend is absolutely needed, or for testing, it is best to stick with 'rtorch'.
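
One way to handle the restart described above is a small guard that only calls rstudioapi::restartSession() when the code is actually running inside RStudio. This is a sketch, not something FastPCA provides; the helper name restart_before_backend_switch is hypothetical.

```r
# Hypothetical helper for illustration only; not part of FastPCA.
# Restarts the R session when running inside RStudio; otherwise it only
# prints a reminder and returns FALSE invisibly.
restart_before_backend_switch <- function() {
  in_rstudio <- requireNamespace("rstudioapi", quietly = TRUE) &&
    rstudioapi::isAvailable()
  if (in_rstudio) {
    rstudioapi::restartSession()
    return(invisible(TRUE))
  }
  message("Not running in RStudio: restart R manually before switching backends.")
  invisible(FALSE)
}
```

Calling restart_before_backend_switch() between a run with backend = "rtorch" and a run with backend = "pytorch" gives each backend a fresh session.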

Examples

if (FALSE) { # \dontrun{

  # Create a sample R matrix (e.g., 20 samples, 100 features)
  # Ensure values are positive for log transform.
  set.seed(123)
  test_data <- matrix(runif(20 * 100, min=1, max=100), nrow = 20, ncol = 100)
  colnames(test_data) <- paste0("Feature", 1:ncol(test_data))
  rownames(test_data) <- paste0("Sample", 1:nrow(test_data))

  print(paste("Original R matrix dimensions:", paste(dim(test_data), collapse = "x")))

  # Perform Randomized SVD
  svd_results <- FastPCA::FastPCA(test_data, k = 5)
} # }