torcharrow.functional¶
Velox Core Functions¶
Velox core functions are included in torcharrow.functional.
Here is an example usage of Velox string function lpad:
>>> import torcharrow as ta
>>> from torcharrow import functional
>>> col = ta.column(["abc", "x", "yz"])
# Velox's lpad function: https://facebookincubator.github.io/velox/functions/string.html#lpad
>>> functional.lpad(col, 5, "123")
0 '12abc'
1 '1231x'
2 '123yz'
dtype: String(nullable=True), length: 3, null_count: 0, device: cpu
Here is another example usage of Velox array function array_except:
>>> col1 = ta.column([[1, 2, 3], [1, 2, 3], [1, 2, 2], [1, 2, 2]])
>>> col2 = ta.column([[4, 5, 6], [1, 2], [1, 1, 2], [1, 3, 4]])
# Velox's array_except function: https://facebookincubator.github.io/velox/functions/array.html#array_except
>>> functional.array_except(col1, col2)
0 [1, 2, 3]
1 [3]
2 []
3 [2]
dtype: List(Int64(nullable=True), nullable=True), length: 4, null_count: 0
Text Operations¶
Append or prepend a list of tokens/indices to a column. |
Recommendation Operations¶
Apply bucketization for input feature. |
|
Apply hashing to an index, or a list of indicies. |
|
Returns the first x values of the head of the input column |
|
Returns 1.0 if the two input columns overlap, otherwise 0.0 |
|
Returns the number of overlaps between two lists of ids |
|
If there are items that overlap between input_ids and matching_ids contribute the maximum number of instances of overlapped ids to the max count. |
|
Return the jaccard_similarity between input_ids and matching_ids. |
|
Return the cosine between the vector defined by input_ids weighted by input_id_scores and the vector defined by matching_ids weighted by matching_id_scores |
|
Return the sum of all the scores in matching_id_scores that has a corresponding id in matching_ids that is also in input_ids. |
|
Return the min among of all the scores in matching_id_scores that has a corresponding id in matching_ids that is also in input_ids. |
|
Return the min among of all the scores in matching_id_scores that has a corresponding id in matching_ids that is also in input_ids. |
High-level Operations¶
Return the column data scaled to range [0,1]. |