deeptrack.sources.base Module#
Utility classes for data sources.
This module provides a set of utility classes designed for managing and manipulating data sources.
These tools are primarily used in scenarios where data needs to be dynamically manipulated, filtered, or combined for feature generation in machine learning pipelines.
Key Features#
Node Hierarchy
Extends DeepTrackNode with utilities for creating nested nodes and accessing structured data.
Dynamic Data Access
Retrieve data items as callable objects, supporting custom callbacks and dependency tracking.
Randomized Splitting
Splits a data source into non-overlapping subsets of user-specified lengths.
Module Structure#
Classes:
SourceDeepTrackNode: Creates child nodes when accessing attributes.
SourceItem: Dict-like object that calls a list of callbacks when called.
Source: Represents one or more sources of data.
Product: Represents the product of a source with one or more other sources.
When accessed, it returns a DeepTrack object that can be passed as properties to features.
Subset: Represents the subset of a Source.
Sources: Represents multiple sources as a single access point.
Used when one of multiple sources can be passed to a feature.
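To make the Product entry concrete, here is a minimal plain-Python sketch, not DeepTrack's implementation. It assumes a source can be modeled as a dict of equal-length lists and that "product" means the Cartesian product of the rows; the helper names `rows` and `product_rows` are hypothetical.

```python
from itertools import product


def rows(columns):
    """Turn a dict of equal-length lists into a list of row dicts."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


def product_rows(left, right):
    """Pair every row of `left` with every row of `right` (hypothetical helper)."""
    return [{**a, **b} for a, b in product(rows(left), rows(right))]


# Two rows on each side give 2 x 2 = 4 combined rows.
combined = product_rows({"a": [1, 2]}, {"b": [3, 4]})
```

Each combined row carries the keys of both inputs, which is the property that lets a single item feed several feature properties at once.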
Functions:
random_split(source, lengths, generator)
def random_split(
    source: Source,
    lengths: List[Union[int, float]],
    generator: np.random.Generator = np.random.default_rng(),
) -> List[Subset]:
Randomly split source into non-overlapping new sources of given lengths.
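The idea of a random, non-overlapping split can be sketched on plain indices. This is an illustration using only the standard library, not the body of `random_split`; it handles integer lengths only, and the name `split_indices` and its `seed` parameter are assumptions made for the example.

```python
import random


def split_indices(n_items, lengths, seed=None):
    """Shuffle range(n_items) and cut it into consecutive, disjoint chunks."""
    if sum(lengths) != n_items:
        raise ValueError("lengths must sum to the number of items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # one shuffle shared by all chunks
    chunks, start = [], 0
    for length in lengths:
        chunks.append(indices[start:start + length])
        start += length
    return chunks


# Split ten items into subsets of 8 and 2 indices.
train, val = split_indices(10, [8, 2], seed=0)
```

Because every chunk is a slice of the same shuffled list, no index can appear in two subsets.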
Examples#
Call a list of callbacks:
>>> from deeptrack.sources import Source
>>> source = Source(a=[1, 2], b=[3, 4])
>>> @source.on_activate
... def callback(item):
...     print(item)
>>> source[0]()
Equivalent to:
>>> SourceItem({'a': 1, 'b': 3})
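The "dict-like object that calls a list of callbacks when called" pattern can be sketched without DeepTrack. The class below is a hypothetical stand-in, not the library's `SourceItem`; its constructor signature is an assumption made for the example.

```python
class CallbackItem(dict):
    """Hypothetical dict-like item that runs its callbacks when called."""

    def __init__(self, callbacks, **values):
        super().__init__(**values)
        self._callbacks = list(callbacks)

    def __call__(self):
        # Hand the item itself to every registered callback, then return it.
        for callback in self._callbacks:
            callback(self)
        return self


seen = []
item = CallbackItem([seen.append], a=1, b=3)
result = item()  # runs seen.append(item)
```

Returning the item from `__call__` keeps it usable as an ordinary mapping after the callbacks have run.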
Create a node that creates child nodes when attributes are accessed:
>>> from deeptrack.sources import SourceDeepTrackNode
>>> node = SourceDeepTrackNode(lambda: {"a": 1, "b": 2})
>>> child = node.a
>>> child()
1
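A rough sketch of how attribute access can create child nodes, assuming the parent wraps a zero-argument callable that returns a dict. `LazyNode` is a hypothetical name, and the sketch omits the dependency tracking that a real computation-graph node would need.

```python
class LazyNode:
    """Hypothetical node wrapping a zero-argument callable."""

    def __init__(self, action):
        self._action = action

    def __call__(self):
        return self._action()

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        # The child evaluates the parent lazily and picks out one key.
        return LazyNode(lambda: self()[name])


node = LazyNode(lambda: {"a": 1, "b": 2})
child = node.a
value = child()
```

The child holds no data of its own; it evaluates the parent each time it is called, so it always reflects the parent's current value.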
Join multiple sources into a single access point:
>>> import deeptrack as dt
>>> from deeptrack.sources import Source, Sources
>>> source1 = Source(a=[1, 2], b=[3, 4])
>>> source2 = Source(a=[5, 6], b=[7, 8])
>>> joined_source = Sources(source1, source2)
>>> feature_a = dt.Value(joined_source.a)
>>> feature_b = dt.Value(joined_source.b)
>>> sum_feature = feature_a + feature_b
>>> sum_feature(source1[0])
4
>>> sum_feature(source2[0])
12
Functions#
random_split(source, lengths, generator): Randomly split source into non-overlapping new sources of given lengths.
Classes#
DeepTrackNode: Object corresponding to a node in a computation graph.
Product: Class that represents the product of a source with one or more sources.
Source: A class that represents one or more sources of data.
SourceDeepTrackNode: A node that creates child nodes when attributes are accessed.
SourceItem: A dict-like object that calls a list of callbacks when called.
Sources: Joins multiple sources into a single access point.