torcharrow.functional

Velox Core Functions

Velox core functions are included in torcharrow.functional.

Here is an example that uses the Velox string function lpad:

>>> import torcharrow as ta
>>> from torcharrow import functional
>>> col = ta.column(["abc", "x", "yz"])
# Velox's lpad function: https://facebookincubator.github.io/velox/functions/string.html#lpad
>>> functional.lpad(col, 5, "123")
0  '12abc'
1  '1231x'
2  '123yz'
dtype: String(nullable=True), length: 3, null_count: 0, device: cpu
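
The padding rule behind this output can be sketched in plain Python. This is only an illustration of the semantics, not Velox's or TorchArrow's implementation; the helper name lpad_reference is hypothetical, and truncating inputs longer than the requested size is an assumption based on Presto-style lpad behavior.

```python
def lpad_reference(text: str, size: int, pad: str) -> str:
    """Left-pad `text` to `size` characters by cycling through `pad`.

    Hypothetical helper for illustration only; not part of torcharrow.
    """
    if not pad:
        raise ValueError("pad must be a non-empty string")
    if len(text) >= size:
        # Assumption: strings that are already long enough are truncated to `size`.
        return text[:size]
    missing = size - len(text)
    # Repeat the pad string enough times, then cut it to the exact length needed.
    padding = (pad * (missing // len(pad) + 1))[:missing]
    return padding + text


rows = [lpad_reference(s, 5, "123") for s in ["abc", "x", "yz"]]
print(rows)  # ['12abc', '1231x', '123yz']
```

The pad string is consumed from its start for every row, which is why "x" becomes '1231x' rather than '3123x'.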

Here is another example that uses the Velox array function array_except:

>>> col1 = ta.column([[1, 2, 3], [1, 2, 3], [1, 2, 2], [1, 2, 2]])
>>> col2 = ta.column([[4, 5, 6], [1, 2], [1, 1, 2], [1, 3, 4]])
# Velox's array_except function: https://facebookincubator.github.io/velox/functions/array.html#array_except
>>> functional.array_except(col1, col2)
0  [1, 2, 3]
1  [3]
2  []
3  [2]
dtype: List(Int64(nullable=True), nullable=True), length: 4, null_count: 0
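
The row-wise behavior shown above (elements of the first array that are absent from the second, without duplicates) can be approximated in plain Python. This is a sketch of the semantics only; array_except_reference is a hypothetical name, and preserving first-occurrence order is an assumption made for readability.

```python
def array_except_reference(left, right):
    """Return the distinct elements of `left` that do not appear in `right`.

    Hypothetical helper for illustration only; not part of torcharrow.
    """
    exclude = set(right)
    seen = set()
    result = []
    for item in left:
        # Keep an element once, and only if it is absent from `right`.
        if item not in exclude and item not in seen:
            seen.add(item)
            result.append(item)
    return result


col1 = [[1, 2, 3], [1, 2, 3], [1, 2, 2], [1, 2, 2]]
col2 = [[4, 5, 6], [1, 2], [1, 1, 2], [1, 3, 4]]
rows = [array_except_reference(a, b) for a, b in zip(col1, col2)]
print(rows)  # [[1, 2, 3], [3], [], [2]]
```

The last row shows the deduplication: [1, 2, 2] minus [1, 3, 4] yields [2], not [2, 2].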

Text Operations

add_tokens

Append or prepend a list of tokens/indices to a column.

Recommendation Operations

bucketize

Apply bucketization to the input feature.
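
As a rough illustration of what bucketization means, the sketch below maps each value to the index of the bucket it falls into, given a sorted list of borders. It is a plain-Python sketch, not the library's implementation; the name bucketize_reference is hypothetical, and placing a value equal to a border into the lower bucket is an assumption.

```python
from bisect import bisect_left


def bucketize_reference(values, borders):
    """Map each value to a bucket index in 0..len(borders).

    Hypothetical helper for illustration only. `borders` must be sorted.
    Assumption: a value equal to a border belongs to the lower bucket.
    """
    return [bisect_left(borders, v) for v in values]


buckets = bucketize_reference([1, 2, 3, 5, 8, 10, 11], [2, 5, 10])
print(buckets)  # [0, 0, 1, 1, 2, 2, 3]
```

Three borders produce four buckets, so a dense feature is turned into a small set of integer ids.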

sigrid_hash

Apply hashing to an index, or a list of indices.

firstx

Returns the first x values from the head of the input column.

has_id_overlap

Returns 1.0 if the two input columns overlap, otherwise 0.0.
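
For a single pair of id lists, the overlap check can be sketched as a set intersection test. This is an illustration only; has_id_overlap_reference is a hypothetical name, and treating "overlap" as "at least one id in common" is an assumption.

```python
def has_id_overlap_reference(input_ids, matching_ids):
    """Return 1.0 if the two id lists share at least one id, else 0.0.

    Hypothetical helper for illustration only; not part of torcharrow.
    """
    return 0.0 if set(input_ids).isdisjoint(matching_ids) else 1.0


print(has_id_overlap_reference([1, 2, 3], [3, 4]))  # 1.0
print(has_id_overlap_reference([1, 2, 3], [4, 5]))  # 0.0
```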

id_overlap_count

Returns the number of overlaps between two lists of ids.

get_max_count

For each id that appears in both input_ids and matching_ids, contribute the maximum number of instances of that id to the max count.
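
One possible reading of this description, sketched in plain Python: for every id present in both lists, take the larger of its two occurrence counts and add it to the total. This interpretation is an assumption, and max_count_reference is a hypothetical name, not the library's implementation.

```python
from collections import Counter


def max_count_reference(input_ids, matching_ids):
    """Sum, over ids present in both lists, the larger occurrence count.

    Hypothetical helper for illustration only; the exact semantics of
    get_max_count are assumed here, not confirmed.
    """
    input_counts = Counter(input_ids)
    matching_counts = Counter(matching_ids)
    overlap = input_counts.keys() & matching_counts.keys()
    return sum(max(input_counts[i], matching_counts[i]) for i in overlap)


# id 1 appears 2 and 1 times, id 2 appears 1 and 3 times: 2 + 3 = 5
print(max_count_reference([1, 1, 2, 3], [1, 2, 2, 2, 4]))  # 5
```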

get_jaccard_similarity

Return the Jaccard similarity between input_ids and matching_ids.
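
The Jaccard similarity of two sets is the size of their intersection divided by the size of their union. A minimal plain-Python sketch follows; jaccard_reference is a hypothetical name, and both ignoring duplicate ids and returning 0.0 for two empty lists are assumptions.

```python
def jaccard_reference(input_ids, matching_ids):
    """Return |A ∩ B| / |A ∪ B| for the two id lists viewed as sets.

    Hypothetical helper for illustration only; not part of torcharrow.
    """
    a, b = set(input_ids), set(matching_ids)
    union = a | b
    if not union:
        # Assumption: two empty lists have similarity 0.0.
        return 0.0
    return len(a & b) / len(union)


# Intersection {2, 3}, union {1, 2, 3, 4}
print(jaccard_reference([1, 2, 3], [2, 3, 4]))  # 0.5
```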

get_cosine_similarity

Return the cosine similarity between the vector defined by input_ids weighted by input_id_scores and the vector defined by matching_ids weighted by matching_id_scores.
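
To make the "vector defined by ids weighted by scores" idea concrete, the sketch below treats each id as a dimension and each score as the coordinate along it, then computes the usual cosine. It is an illustration under assumptions: cosine_reference is a hypothetical name, a repeated id keeps its last score, and a zero-length vector yields 0.0.

```python
import math


def cosine_reference(input_ids, input_id_scores, matching_ids, matching_id_scores):
    """Cosine between two sparse vectors given as (ids, scores) pairs.

    Hypothetical helper for illustration only; not part of torcharrow.
    """
    # Assumption: if an id repeats, the last score for that id wins.
    a = dict(zip(input_ids, input_id_scores))
    b = dict(zip(matching_ids, matching_id_scores))
    # Only ids present in both vectors contribute to the dot product.
    dot = sum(score * b[i] for i, score in a.items() if i in b)
    norm_a = math.sqrt(sum(s * s for s in a.values()))
    norm_b = math.sqrt(sum(s * s for s in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: similarity with a zero vector is defined as 0.0.
        return 0.0
    return dot / (norm_a * norm_b)


# Vectors (3, 4) and (1, 0) over ids 1 and 2: 3 / (5 * 1)
print(cosine_reference([1, 2], [3.0, 4.0], [1], [1.0]))  # 0.6
```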

get_score_sum

Return the sum of all the scores in matching_id_scores that have a corresponding id in matching_ids that is also in input_ids.

get_score_min

Return the minimum of all the scores in matching_id_scores that have a corresponding id in matching_ids that is also in input_ids.

get_score_max

Return the maximum of all the scores in matching_id_scores that have a corresponding id in matching_ids that is also in input_ids.
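
get_score_sum, get_score_min, and get_score_max share one filtering step: keep the scores whose id in matching_ids also occurs in input_ids, then aggregate. The plain-Python sketch below illustrates that step; overlapping_scores is a hypothetical name, get_score_max is assumed to take the maximum as its name suggests, and the result for an empty overlap is left unspecified here.

```python
def overlapping_scores(input_ids, matching_ids, matching_id_scores):
    """Return the scores whose matching id also appears in `input_ids`.

    Hypothetical helper for illustration only; not part of torcharrow.
    """
    wanted = set(input_ids)
    return [s for i, s in zip(matching_ids, matching_id_scores) if i in wanted]


# ids 2 and 3 overlap, so the score 9.0 attached to id 4 is ignored
scores = overlapping_scores([1, 2, 3], [2, 3, 4], [0.5, 1.5, 9.0])
print(scores)       # [0.5, 1.5]
print(sum(scores))  # 2.0
print(min(scores))  # 0.5
print(max(scores))  # 1.5
```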

High-level Operations

scale_to_0_1

Return the column data scaled to the range [0, 1].
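
Scaling to [0, 1] is commonly done with min-max normalization, (x - min) / (max - min). The sketch below assumes that formula; scale_to_0_1_reference is a hypothetical name, and mapping a constant column to all zeros is an assumption, not documented behavior.

```python
def scale_to_0_1_reference(values):
    """Min-max scale a non-empty list of numbers into [0, 1].

    Hypothetical helper for illustration only; not part of torcharrow.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column maps to all zeros (avoids dividing by zero).
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


scaled = scale_to_0_1_reference([1, 2, 3, 4, 5])
print(scaled)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```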
