nodes
Class(es) for building/connecting graphs.
- class graphnet.models.data_representation.graphs.nodes.nodes.NodeDefinition(*args, **kwargs)
- Bases: Model
- Base class for graph building.
- Construct Detector.
- Parameters:
- args (Any) 
- kwargs (Any) 
 
- Return type:
- object 
 - forward(x)
- Construct nodes from raw node features. - Parameters:
- x ( - tensor) – standardized node features with shape [num_pulses, d], where d is the number of node features.
- node_feature_names – list of names for each column in x.
 
- Returns:
- a graph without edges 
- Return type:
- graph 
 
 - property nb_outputs: int
- Return number of output features. - This is the default, but it may be overridden by specific inheriting classes.
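The relationship between forward and nb_outputs can be illustrated with a small sketch in plain Python. It does not use graphnet or PyTorch: ToyNodeDefinition and ToyChargeOnly are hypothetical names, rows are plain lists instead of tensors, and the default output width (one output column per input column) is an assumption made for illustration only.

```python
from typing import List

Rows = List[List[float]]  # one row per pulse, one column per feature


class ToyNodeDefinition:
    """Hypothetical stand-in for a node definition (not graphnet's class)."""

    def __init__(self, input_feature_names: List[str]) -> None:
        self.input_feature_names = list(input_feature_names)

    @property
    def nb_outputs(self) -> int:
        # Assumed default: one output column per input column.
        return len(self.input_feature_names)

    def forward(self, x: Rows) -> Rows:
        # Identity mapping: every pulse becomes one node; no edges are created.
        return [list(row) for row in x]


class ToyChargeOnly(ToyNodeDefinition):
    """Subclass that changes the node features, so it overrides nb_outputs."""

    @property
    def nb_outputs(self) -> int:
        return 1

    def forward(self, x: Rows) -> Rows:
        charge_idx = self.input_feature_names.index("charge")
        return [[row[charge_idx]] for row in x]


names = ["dom_x", "dom_y", "dom_z", "dom_time", "charge"]
pulses = [[0.0, 0.0, -100.0, 10.0, 1.5], [0.0, 0.0, -100.0, 12.0, 0.5]]
base_nodes = ToyNodeDefinition(names).forward(pulses)
charge_nodes = ToyChargeOnly(names).forward(pulses)
```

A subclass that changes the number of columns it emits should keep nb_outputs in step with forward, since downstream layers presumably size their inputs from that property.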
 
- class graphnet.models.data_representation.graphs.nodes.nodes.NodesAsPulses(*args, **kwargs)
- Bases: NodeDefinition
- Represent each measured pulse of Cherenkov radiation as a node.
- Construct Detector.
- Parameters:
- args (Any) 
- kwargs (Any) 
 
- Return type:
- object 
 
- class graphnet.models.data_representation.graphs.nodes.nodes.PercentileClusters(*args, **kwargs)
- Bases: NodeDefinition
- Represent nodes as clusters with percentile summary node features.
- If cluster_on is set to the xyz coordinates of DOMs, e.g. cluster_on = [‘dom_x’, ‘dom_y’, ‘dom_z’], each node will be a unique DOM and the pulse information (charge, time) is summarized using percentiles.
- Construct PercentileClusters.
- Parameters:
- cluster_on ( - List[- str]) – Names of features to create clusters from.
- percentiles ( - List[- int]) – List of percentiles. E.g. [10, 50, 90].
- add_counts ( - bool, default:- True) – If True, number of duplicates is added to output array.
- input_feature_names ( - Optional[- List[- str]], default:- None) – (Optional) column names for input features.
- args (Any) 
- kwargs (Any) 
 
- Return type:
- object 
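The clustering idea described for PercentileClusters can be sketched in plain Python as follows. This is an illustration under stated assumptions, not graphnet's implementation: percentile_clusters and percentile are hypothetical helpers, rows are plain lists rather than tensors, percentiles use linear interpolation, and the count column holds the raw number of pulses per cluster.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile q (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_clusters(
    rows: List[List[float]],
    feature_names: List[str],
    cluster_on: List[str],
    percentiles: List[int],
    add_counts: bool = True,
) -> List[List[float]]:
    """Group pulses by the cluster_on columns and summarize the rest."""
    cluster_idx = [feature_names.index(name) for name in cluster_on]
    summary_idx = [i for i in range(len(feature_names)) if i not in cluster_idx]

    groups: Dict[Tuple[float, ...], List[List[float]]] = defaultdict(list)
    for row in rows:
        groups[tuple(row[i] for i in cluster_idx)].append(row)

    nodes = []
    for key, members in groups.items():
        node = list(key)  # cluster coordinates come first
        for i in summary_idx:
            column = [member[i] for member in members]
            node.extend(percentile(column, q) for q in percentiles)
        if add_counts:
            node.append(float(len(members)))  # raw count (assumption)
        nodes.append(node)
    return nodes


names = ["dom_x", "dom_y", "dom_z", "dom_time", "charge"]
pulses = [
    [0.0, 0.0, -100.0, 10.0, 1.0],
    [0.0, 0.0, -100.0, 20.0, 2.0],
    [0.0, 0.0, -117.0, 15.0, 0.5],
    [0.0, 0.0, -100.0, 30.0, 3.0],
]
nodes = percentile_clusters(pulses, names, ["dom_x", "dom_y", "dom_z"], [0, 50, 100])
```

With three percentiles and two summarized columns, each node in this sketch has 3 + 2 × 3 + 1 = 10 columns, so the output width depends on both percentiles and add_counts.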
 
- class graphnet.models.data_representation.graphs.nodes.nodes.NodeAsDOMTimeSeries(*args, **kwargs)
- Bases: NodeDefinition
- Represent each node as a DOM with time and charge time series data.
- Construct NodeAsDOMTimeSeries.
- Parameters:
- keys ( - List[- str], default:- ['dom_x', 'dom_y', 'dom_z', 'dom_time', 'charge']) – Names of features in the data (in order).
- id_columns ( - List[- str], default:- ['dom_x', 'dom_y', 'dom_z']) – List of columns that uniquely identify a DOM.
- time_column ( - str, default:- 'dom_time') – Name of time column.
- charge_column ( - str, default:- 'charge') – Name of charge column.
- max_activations ( - Optional[- int], default:- None) – Maximum number of activations to include in the time series.
- args (Any) 
- kwargs (Any) 
 
- Return type:
- object 
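A rough picture of what "a DOM with time and charge time series" means is given by the sketch below. It is written in plain Python and is not graphnet's implementation: dom_time_series is a hypothetical helper, and the choice to sort each series by time and keep the earliest max_activations entries is an assumption, since the description above does not say which activations are kept.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

DomKey = Tuple[float, ...]


def dom_time_series(
    rows: List[List[float]],
    keys: List[str],
    id_columns: List[str],
    time_column: str = "dom_time",
    charge_column: str = "charge",
    max_activations: Optional[int] = None,
) -> Dict[DomKey, Dict[str, List[float]]]:
    """Collect per-DOM time and charge series from a flat pulse list."""
    id_idx = [keys.index(name) for name in id_columns]
    t_idx = keys.index(time_column)
    q_idx = keys.index(charge_column)

    per_dom: Dict[DomKey, List[Tuple[float, float]]] = defaultdict(list)
    for row in rows:
        dom = tuple(row[i] for i in id_idx)
        per_dom[dom].append((row[t_idx], row[q_idx]))

    series = {}
    for dom, hits in per_dom.items():
        hits = sorted(hits, key=lambda hit: hit[0])  # order by time
        if max_activations is not None:
            hits = hits[:max_activations]  # keep earliest (assumption)
        series[dom] = {
            "time": [t for t, _ in hits],
            "charge": [q for _, q in hits],
        }
    return series


keys = ["dom_x", "dom_y", "dom_z", "dom_time", "charge"]
pulses = [
    [0.0, 0.0, -100.0, 30.0, 0.7],
    [0.0, 0.0, -100.0, 10.0, 1.2],
    [50.0, 0.0, -100.0, 20.0, 0.4],
    [0.0, 0.0, -100.0, 20.0, 0.9],
]
series = dom_time_series(
    pulses, keys, ["dom_x", "dom_y", "dom_z"], max_activations=2
)
```

Here the DOM at (0, 0, -100) ends up with the series time = [10, 20] and charge = [1.2, 0.9]; the third, latest activation is dropped by max_activations=2.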
 
- class graphnet.models.data_representation.graphs.nodes.nodes.IceMixNodes(*args, **kwargs)
- Bases: NodeDefinition
- Calculate ice properties and perform random sampling.
- Ice properties are calculated based on the z-coordinate of the pulse. For each event, random sampling is performed to keep the number of pulses at or below a maximum if n_pulses is over the limit.
- Construct IceMixNodes.
- Parameters:
- input_feature_names ( - Optional[- List[- str]], default:- None) – Column names for input features. Minimum required features are z coordinate and hlc column names.
- max_pulses ( - int, default:- 768) – Maximum number of pulses to keep in the event.
- z_name ( - str, default:- 'dom_z') – Name of the z-coordinate column.
- hlc_name ( - Optional[- str], default:- 'hlc') – Name of the Hard Local Coincidence Check column.
- add_ice_properties ( - bool, default:- True) – If True, scattering and absorption length of ice in IceCube are added to the feature set based on z coordinate.
- ice_args ( - Dict[- str,- Optional[- float]], default:- {'z_offset': None, 'z_scaling': None}) – Offset and scaling of the z coordinate in the Detector, to be able to make similar conversion in the ice data.
- sample_pulses ( - bool, default:- True) – Enable sampling random pulses. If True and the event is longer than max_pulses, its pulses will be randomly sampled. If False, only the first max_pulses pulses will be selected.
- args (Any) 
- kwargs (Any) 
 
- Return type:
- object 
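The interaction between max_pulses and sample_pulses can be sketched in plain Python as below. This is a simplified illustration, not graphnet's implementation: select_pulses is a hypothetical helper, the sampling is uniform without replacement, the sampled pulses are returned in their original order (an assumption), and neither the hlc column nor the ice properties are modelled.

```python
import random
from typing import List, Optional


def select_pulses(
    rows: List[List[float]],
    max_pulses: int = 768,
    sample_pulses: bool = True,
    rng: Optional[random.Random] = None,
) -> List[List[float]]:
    """Cap an event at max_pulses pulses."""
    if len(rows) <= max_pulses:
        # Short events are passed through unchanged.
        return list(rows)
    if not sample_pulses:
        # Deterministic truncation: keep the first max_pulses pulses.
        return list(rows[:max_pulses])
    rng = rng or random.Random()
    # Uniform sampling without replacement, kept in original order.
    keep = sorted(rng.sample(range(len(rows)), max_pulses))
    return [rows[i] for i in keep]


event = [[float(i)] for i in range(10)]
truncated = select_pulses(event, max_pulses=4, sample_pulses=False)
sampled = select_pulses(event, max_pulses=4, rng=random.Random(0))
untouched = select_pulses(event, max_pulses=768)
```

Passing a seeded random.Random makes the sampling reproducible in this sketch, which can be convenient when comparing runs.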
 
- class graphnet.models.data_representation.graphs.nodes.nodes.ClusterSummaryFeatures(*args, **kwargs)
- Bases: NodeDefinition
- Represent pulse maps as clusters with summary features.
- If cluster_on is set to the xyz coordinates of optical modules, e.g. cluster_on = [‘dom_x’, ‘dom_y’, ‘dom_z’], each node will be a unique optical module and the pulse information (e.g. charge, time) is summarized. NOTE: Developed to be used with features [dom_x, dom_y, dom_z, charge, time].
- Possible features per cluster:
- total charge
- feature name: total_charge 
 
- charge accumulated after <X> time units
- feature name: charge_after_<X>ns 
 
- time of first hit in the optical module
- feature name: time_of_first_hit 
 
- time spread per optical module
- feature name: time_spread 
 
- time std per optical module
- feature name: time_std 
 
- time taken to collect <X> percent of total charge per cluster
- feature name: time_after_charge_pct<X> 
 
- number of pulses per cluster
- feature name: counts 
 
 - For more details on some of the features, see Theo Glauch's thesis (chapter 5.3): https://mediatum.ub.tum.de/node?id=1584755
- Construct ClusterSummaryFeatures.
- Parameters:
- cluster_on ( - List[- str]) – Names of features to create clusters from.
- input_feature_names ( - List[- str]) – Column names for input features.
- charge_label ( - str, default:- 'charge') – Name of the charge column.
- time_label ( - str, default:- 'dom_time') – Name of the time column.
- total_charge ( - bool, default:- True) – If True, calculates total charge as feature.
- charge_after_t ( - List[- int], default:- [10, 50, 100]) – List of times at which the accumulated charge is calculated as a feature.
- time_of_first_hit ( - bool, default:- True) – If True, time of first hit is added as a feature.
- time_spread ( - bool, default:- True) – If True, time spread is added as a feature.
- time_std ( - bool, default:- True) – If True, time std is added as a feature.
- time_after_charge_pct ( - List[- int], default:- [1, 3, 5, 11, 15, 20, 50, 80]) – List of percentiles to calculate time after charge.
- charge_standardization ( - Union[- float,- str], default:- 'log') – Either a float or ‘log’. If a float, the features are multiplied by this factor. If ‘log’, the features are transformed to log10 scale.
- time_standardization ( - float, default:- 0.001) – Standardization factor for features with a time
- order_in_time ( - bool, default:- True) – If True, clusters are ordered in time. If your data is already ordered in time, you can set this to False to avoid a potential overhead. NOTE: Should only be set to False if you are sure that the input data is already ordered in time. Will lead to incorrect results otherwise.
 
- add_counts ( - bool, default:- False) – If True, log10 of the counts per cluster is added as a feature.
- args (Any) 
- kwargs (Any) 
 
- Return type:
- object 
 - NOTE: Make sure that either the input data is not already standardized or that the charge_standardization and time_standardization parameters are set to 1 to avoid a double standardization.
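To make the per-cluster features and the standardization note more concrete, the sketch below computes them for a single cluster in plain Python. It is not graphnet's implementation. cluster_summary is a hypothetical helper, and the exact definitions are assumptions: charge_after_<X>ns is taken as the charge collected within X time units of the first hit, time_spread as last minus first hit time, time_std as the population standard deviation of hit times, and time_after_charge_pct<X> as the delay from the first hit until X percent of the total charge has been collected. Charges are assumed to be positive so that log10 is defined.

```python
import math
from typing import Dict, Sequence, Union


def cluster_summary(
    times: Sequence[float],
    charges: Sequence[float],
    charge_after_t: Sequence[int] = (10, 50, 100),
    time_after_charge_pct: Sequence[int] = (50,),
    charge_standardization: Union[float, str] = "log",
    time_standardization: float = 1e-3,
) -> Dict[str, float]:
    """Summarize the pulses of one cluster (definitions are assumptions)."""
    hits = sorted(zip(times, charges))  # order the pulses in time
    t = [hit[0] for hit in hits]
    t0 = t[0]
    total = sum(hit[1] for hit in hits)

    def std_charge(value: float) -> float:
        if charge_standardization == "log":
            return math.log10(value)
        return value * charge_standardization

    features = {"total_charge": std_charge(total)}
    for window in charge_after_t:
        collected = sum(q for ti, q in hits if ti - t0 <= window)
        features[f"charge_after_{window}ns"] = std_charge(collected)

    features["time_of_first_hit"] = t0 * time_standardization
    features["time_spread"] = (t[-1] - t0) * time_standardization
    mean_t = sum(t) / len(t)
    variance = sum((ti - mean_t) ** 2 for ti in t) / len(t)
    features["time_std"] = math.sqrt(variance) * time_standardization

    for pct in time_after_charge_pct:
        threshold = total * pct / 100.0
        running = 0.0
        for ti, q in hits:
            running += q
            if running >= threshold:
                delay = ti - t0
                break
        else:
            delay = t[-1] - t0  # guard against float round-off
        features[f"time_after_charge_pct{pct}"] = delay * time_standardization

    features["counts"] = float(len(hits))
    return features


times = [130.0, 10.0, 20.0]
charges = [7.0, 1.0, 2.0]
standardized = cluster_summary(times, charges)
raw = cluster_summary(
    times, charges, charge_standardization=1.0, time_standardization=1.0
)
```

The second call mirrors the note above: with both standardization factors set to 1, the features stay in the units of the input, which is the setting to use in this sketch when the input has already been standardized upstream.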