Internal Main Function for HISEA Run — run_hisea

Runs a complete HISEA stock composition analysis, simulation, or bootstrap.

This function estimates stock composition using a selectable classification method, including LDA, Random Forest, SVM, XGBoost, k-Nearest Neighbors, neural networks, Naive Bayes, and others.

Usage

run_hisea_all(
  type = "ANALYSIS",
  np,
  nv,
  seed_val = 123456,
  var_cols_std = NULL,
  var_cols_mix = NULL,
  stock_col = NULL,
  nsamps = 1000,
  Nmix = 100,
  actual = NULL,
  baseline_input = NULL,
  mix_input = NULL,
  export_csv = FALSE,
  output_dir = ".",
  verbose = FALSE,
  method_class = "LDA",
  stocks_names = NULL,
  resample_baseline = FALSE,
  resampled_baseline_sizes = NULL,
  phi_method = c("standard", "cv"),
  mclust_model_names = NULL,
  mclust_perform_cv = TRUE,
  ...
)

Arguments

type

Character. Type of run to perform (analysis, simulation, or bootstrap). Default: "ANALYSIS"

np

Integer. Number of populations/stocks. Required; the usage above defines no default.

nv

Integer. Number of variables/loci. Required; the usage above defines no default.

seed_val

Integer. Random seed for reproducibility. Default: 123456

var_cols_std

Character vector of column names for baseline variables. Default: NULL

var_cols_mix

Character vector of column names for mixture variables. Default: NULL

stock_col

Character. Name of the stock column in the baseline data. Default: NULL

nsamps

Integer. Number of bootstrap samples. Default: 1000

Nmix

Integer. Size of mixture sample. Default: 100

actual

Numeric vector. True composition proportions, one per stock (for example c(0.5, 0.5) for two stocks). Default: NULL

baseline_input

Data frame or file path for baseline data. Default: NULL

mix_input

Data frame or file path for mixture data. Default: NULL

export_csv

Logical. Whether to export results to CSV. Default: FALSE

output_dir

Character. Output directory path. Default: "." (the current working directory)

verbose

Logical. Whether to print progress messages. Default: FALSE

method_class

Character. Classification method, e.g. "LDA", "RF", "SVM", "XGB", "KNN", "ANN", or "NB". Default: "LDA"

stocks_names

Character vector. Names of stocks. Default: NULL

resample_baseline

Logical. Whether to resample baseline data. Default: FALSE

resampled_baseline_sizes

Integer vector. Sizes for resampled baseline. Default: NULL

phi_method

Character. Method for phi calculation, one of "standard" or "cv". Default: "standard"

mclust_model_names

Character vector. Model names for mclust. Default: NULL

mclust_perform_cv

Logical. Whether to perform cross-validation for mclust. Default: TRUE

...

Additional arguments passed to the underlying classifier. Certain classifiers support hyperparameter tuning via these arguments:

Random Forest (RF): ntree, mtry, nodesize, maxnodes
- Example: run_hisea_all(..., method_class="RF", ntree=5000, mtry=3)
Support Vector Machine (SVM): cost, gamma, kernel
- Example: run_hisea_all(..., method_class="SVM", cost=10, kernel="radial")
XGBoost (XGB): nrounds, max_depth, eta, subsample, colsample_bytree
- Example: run_hisea_all(..., method_class="XGB", nrounds=100, max_depth=4)
k-Nearest Neighbors (KNN): k
- Example: run_hisea_all(..., method_class="KNN", k=5)
Artificial Neural Network (ANN): size, decay, maxit
- Example: run_hisea_all(..., method_class="ANN", size=10, decay=0.01)
Naive Bayes (NB): laplace
- Example: run_hisea_all(..., method_class="NB", laplace=1)

⚠ Ensure that the names of arguments match those expected by the classifier. Unrecognized arguments may be ignored or raise an error.
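
Because a misspelled or misplaced tuning argument may be silently ignored, it can help to check the names before calling the function. The following is a minimal base R sketch, not part of the package: `tunable_args` and `unknown_tuning_args()` are hypothetical names, and the lookup table simply copies the lists above. The real classifiers may accept further arguments.

```r
# Hypothetical lookup of tunable arguments per method_class,
# copied from the lists in this help page (not read from the package).
tunable_args <- list(
  RF  = c("ntree", "mtry", "nodesize", "maxnodes"),
  SVM = c("cost", "gamma", "kernel"),
  XGB = c("nrounds", "max_depth", "eta", "subsample", "colsample_bytree"),
  KNN = "k",
  ANN = c("size", "decay", "maxit"),
  NB  = "laplace"
)

# Hypothetical helper: return the names supplied through `...`
# that are not listed for the chosen method.
unknown_tuning_args <- function(method_class, ...) {
  supplied <- names(list(...))
  setdiff(supplied, tunable_args[[method_class]])
}

unknown_tuning_args("RF", ntree = 5000, mtry = 3)   # character(0)
unknown_tuning_args("SVM", cost = 10, ntree = 500)  # "ntree"
```

An empty result means every supplied name appears in the list for that method; any returned name is worth checking against the classifier's own documentation.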

Value

List containing:

estimates: Matrix of stock composition estimates
mean: Mean estimates across bootstrap samples
sd: Standard deviations of estimates
performance: Performance metrics if applicable
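
The relationship between these components can be illustrated with a mock object. This sketch assumes that `estimates` holds one row per bootstrap sample and one column per stock, which this page does not state explicitly; the numbers are invented for illustration and do not come from a real run.

```r
# Mock estimates matrix: 4 replicates (rows) x 2 stocks (columns).
# The layout is an assumption; check the returned object with str().
estimates <- matrix(
  c(0.60, 0.40,
    0.55, 0.45,
    0.65, 0.35,
    0.60, 0.40),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("StockA", "StockB"))
)

# Per-stock mean and standard deviation across replicates,
# analogous to the `mean` and `sd` components described above.
est_mean <- colMeans(estimates)
est_sd   <- apply(estimates, 2, sd)

print(est_mean)
print(est_sd)
```

On a real result, `str(result)` shows the actual dimensions and names before any such summary is recomputed.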

Examples

if (FALSE) { # \dontrun{
# Basic analysis with LDA
result <- run_hisea_all(
  np = 2,
  nv = 3,
  method_class = "LDA",
  baseline_input = "my_baseline.txt",
  mix_input = "my_mixture.txt"
)
} # }
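
Since `baseline_input` and `mix_input` also accept data frames, a run can be prepared entirely in memory. The sketch below builds simulated inputs in base R; the column names (`stock`, `v1` to `v3`), stock labels, and sample sizes are hypothetical, and the call itself is wrapped in `if (FALSE)` because it needs the package that provides `run_hisea_all()`. The argument combination is an assumption based on the usage above, not a tested recipe.

```r
set.seed(123456)

# Hypothetical baseline: 2 stocks, 3 measured variables, 50 individuals each.
baseline <- data.frame(
  stock = rep(c("StockA", "StockB"), each = 50),
  v1 = c(rnorm(50, mean = 0), rnorm(50, mean = 2)),
  v2 = c(rnorm(50, mean = 1), rnorm(50, mean = 3)),
  v3 = c(rnorm(50, mean = 5), rnorm(50, mean = 4))
)

# Hypothetical mixture of unknown origin: same variables, no stock column.
mixture <- data.frame(
  v1 = rnorm(100, mean = 1),
  v2 = rnorm(100, mean = 2),
  v3 = rnorm(100, mean = 4.5)
)

# The variable columns must exist in both inputs.
var_cols <- c("v1", "v2", "v3")
stopifnot(
  all(var_cols %in% names(baseline)),
  all(var_cols %in% names(mixture))
)

if (FALSE) {  # requires the package that provides run_hisea_all()
  result <- run_hisea_all(
    type = "ANALYSIS",
    np = 2,
    nv = 3,
    baseline_input = baseline,
    mix_input = mixture,
    var_cols_std = var_cols,
    var_cols_mix = var_cols,
    stock_col = "stock",
    stocks_names = c("StockA", "StockB"),
    method_class = "LDA"
  )
}
```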