Tree-based models.

Classes
class	GBModel
	A gradient-boosting model for any child of the TreeBooster class.
class	TreeBooster
	A decision tree used in various gradient-boosting models as a weak learner.
class	TreeBoosterNode
	A node in a TreeBooster, used for gathering and storing information about the decision-making process.
struct	Histogram
	Holds the total gradients and hessians for all bins.
struct	DataPartition
	A data partition for the set of samples a tree node has to work with during the tree building process.
struct	Split
	Holds data associated with the decision making process in a TreeBoosterNode.
class	XGTreeBooster
	A tree booster modeled after Chen & Guestrin's XGBoost tree booster.

Typedefs
using	json = ::nlohmann::json
using	SubsampleFunction = ::std::function< void(size_t *, size_t, size_t, size_t, ::CNum::DataStructs::Matrix<double>) >
using	SplitValuePair = std::pair<double, double>
using	DataMatrix = std::variant< CNum::DataStructs::Matrix<int>, CNum::DataStructs::Matrix<double> >

Enumerations
enum	SplitAlg { GREEDY , HIST }
	The algorithm used for finding splits during tree building.
enum	split_dir { LEFT , RIGHT }
	Signifies the direction of a node resulting from a split, relative to its parent.

Variables
SubsampleFunction	default_subsample
constexpr int	N_BINS = 256
	Number of bins used in the Tree models.

Detailed Description

Tree-based models.

Typedef Documentation

◆ DataMatrix

using CNum::Model::Tree::DataMatrix = std::variant< CNum::DataStructs::Matrix<int>, CNum::DataStructs::Matrix<double> >

◆ json

using CNum::Model::Tree::json = ::nlohmann::json

◆ SplitValuePair

using CNum::Model::Tree::SplitValuePair = std::pair<double, double>

◆ SubsampleFunction

using CNum::Model::Tree::SubsampleFunction = ::std::function< void(size_t *, size_t, size_t, size_t, ::CNum::DataStructs::Matrix<double>) >

Enumeration Type Documentation

◆ split_dir

enum CNum::Model::Tree::split_dir

Signifies the direction of a node resulting from a split, relative to its parent.

Enumerator
LEFT
RIGHT

◆ SplitAlg

enum CNum::Model::Tree::SplitAlg

The algorithm used for finding splits during tree building.

GREEDY is the exact greedy method (available in 0.3.0); HIST is the histogram method.

Enumerator
GREEDY
HIST
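
As a rough illustration of the histogram method, the sketch below scans per-bin gradient and hessian totals and picks the boundary with the highest gain, using the structure-score term G^2 / (H + lambda) from the XGBoost objective. The names BinTotals, BestSplit, leaf_score, and best_hist_split are hypothetical and are not part of CNum; the complexity penalty gamma and any minimum-child-weight checks are left out, and the real Histogram and Split structs may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-bin totals; CNum's Histogram struct may be laid out differently.
struct BinTotals {
  double grad = 0.0;
  double hess = 0.0;
};

// Hypothetical result type; CNum's Split struct may hold other fields.
struct BestSplit {
  std::size_t bin = 0;  // samples whose bin index is <= bin go left
  double gain = 0.0;    // stays 0.0 when no boundary beats the unsplit node
};

// Structure-score term G^2 / (H + lambda) from the XGBoost objective.
inline double leaf_score(double g, double h, double lambda) {
  return (g * g) / (h + lambda);
}

// Scan bin boundaries left to right while keeping running left-side totals.
inline BestSplit best_hist_split(const std::vector<BinTotals> &hist, double lambda) {
  assert(lambda > 0.0);
  double g_total = 0.0;
  double h_total = 0.0;
  for (const auto &b : hist) {
    g_total += b.grad;
    h_total += b.hess;
  }

  const double parent = leaf_score(g_total, h_total, lambda);
  BestSplit best;
  double g_left = 0.0;
  double h_left = 0.0;
  for (std::size_t i = 0; i + 1 < hist.size(); ++i) {
    g_left += hist[i].grad;
    h_left += hist[i].hess;
    const double gain =
        0.5 * (leaf_score(g_left, h_left, lambda) +
               leaf_score(g_total - g_left, h_total - h_left, lambda) - parent);
    if (gain > best.gain) {
      best.bin = i;
      best.gain = gain;
    }
  }
  return best;
}
```

The exact greedy method would evaluate the same gain at every distinct feature value instead of at bin boundaries, which is why the histogram method needs at most one candidate per bin.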

Variable Documentation

◆ default_subsample

SubsampleFunction CNum::Model::Tree::default_subsample

inline

Initial value:

= [] (size_t *pos_ptr,
      size_t low,
      size_t high,
      size_t n_samples,
      const ::CNum::DataStructs::Matrix<double> y) -> void {
  if (low == 0 && high == n_samples) {
    ::std::iota(pos_ptr, pos_ptr + n_samples, low);
  } else {
    ::CNum::Utils::Rand::generate_n_unique_rand_in_range<size_t>(low, high - 1, pos_ptr, n_samples, 1);
  }
}
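
The behaviour of the initial value can be approximated without the CNum headers. The sketch below is a stand-in, not the library's code: subsample_sketch is a hypothetical name, std::sample replaces generate_n_unique_rand_in_range (whose exact semantics, including its final argument, are not documented on this page), the unused y parameter is dropped, and the random engine is passed in explicitly. It assumes the helper draws n_samples distinct indices from the inclusive range low to high - 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for default_subsample without the CNum dependencies.
// Writes n_samples distinct indices from [low, high) into pos_ptr.
inline void subsample_sketch(std::size_t *pos_ptr, std::size_t low,
                             std::size_t high, std::size_t n_samples,
                             std::mt19937 &rng) {
  assert(low <= high && n_samples <= high - low);

  // Fast path: the range covers every sample, so use all of them in order.
  if (low == 0 && high == n_samples) {
    std::iota(pos_ptr, pos_ptr + n_samples, low);
    return;
  }

  // Otherwise draw n_samples distinct indices from the candidate range.
  std::vector<std::size_t> pool(high - low);
  std::iota(pool.begin(), pool.end(), low);
  std::sample(pool.begin(), pool.end(), pos_ptr, n_samples, rng);
}
```

A custom SubsampleFunction would follow the same contract: fill the first n_samples slots of pos_ptr with row positions taken from the given range.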

◆ N_BINS

int CNum::Model::Tree::N_BINS = 256

constexpr

Number of bins used in the Tree models.

The value 256 was chosen for vgather optimizations. With 256 bins, a bin index fits in one byte, so more gradients and hessians can be gathered in parallel by bin index when searching for the best split.
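
To make the one-byte claim concrete, the following sketch stores bin indices as std::uint8_t and accumulates gradients and hessians per bin. It is scalar code for illustration only: HistSketch and build_hist are hypothetical names, the actual Histogram layout and the vectorized gather are not shown on this page, and N_BINS is redeclared locally so the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Redeclared locally for the sketch; mirrors CNum::Model::Tree::N_BINS.
constexpr int N_BINS = 256;

// 256 bins means indices 0..255, exactly the range of an unsigned byte.
static_assert(N_BINS - 1 == std::numeric_limits<std::uint8_t>::max(),
              "every bin index fits in one byte");

// Hypothetical layout: one running total per bin for gradients and hessians.
struct HistSketch {
  std::vector<double> grad = std::vector<double>(N_BINS, 0.0);
  std::vector<double> hess = std::vector<double>(N_BINS, 0.0);
};

// Accumulate per-sample gradients and hessians into their bins.
inline HistSketch build_hist(const std::vector<std::uint8_t> &bins,
                             const std::vector<double> &g,
                             const std::vector<double> &h) {
  assert(bins.size() == g.size() && g.size() == h.size());
  HistSketch out;
  for (std::size_t i = 0; i < bins.size(); ++i) {
    out.grad[bins[i]] += g[i];  // a uint8_t index can never exceed N_BINS - 1
    out.hess[bins[i]] += h[i];
  }
  return out;
}
```

Because each index occupies a single byte, a SIMD load of a given width brings in more indices at once than it would with wider integer types, which is the property the gather optimization relies on.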
