torcharrow.Column
A torcharrow.Column is a one-dimensional, torch.Tensor-like data structure containing elements of a single data type. It also supports non-numeric types such as string, list, and struct.
Data types
TorchArrow defines the following data types for columns. They live in the module torcharrow.dtypes (abbreviated as dt in the table below):
Data type | dtype |
---|---|
32-bit floating point | |
64-bit floating point | |
8-bit signed integer | |
16-bit signed integer | |
32-bit signed integer | |
64-bit signed integer | |
Boolean | |
String | |
List | |
Struct | |
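The four signed integer types in the table differ only in the range of values they can hold: an n-bit signed integer covers -2^(n-1) through 2^(n-1) - 1. The plain-Python sketch below makes those ranges concrete and picks the narrowest listed width that fits a set of values. It does not call TorchArrow. The helper names `signed_range` and `narrowest_signed_width` are hypothetical, and whether TorchArrow does any such narrowing itself is not stated here.

```python
from typing import Iterable, Optional, Tuple

# Widths of the signed integer types listed in the table above.
SIGNED_WIDTHS = (8, 16, 32, 64)


def signed_range(bits: int) -> Tuple[int, int]:
    """Inclusive (min, max) of a two's-complement signed integer."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def narrowest_signed_width(values: Iterable[Optional[int]]) -> Optional[int]:
    """Smallest listed width holding every non-null value, or None if none fits."""
    present = [v for v in values if v is not None]  # nulls do not constrain the width
    for bits in SIGNED_WIDTHS:
        low, high = signed_range(bits)
        if all(low <= v <= high for v in present):
            return bits
    return None


print(signed_range(8))                         # (-128, 127)
print(narrowest_signed_width([1, None, 127]))  # 8
print(narrowest_signed_width([1, None, 128]))  # 16
print(narrowest_signed_width([2**40]))         # 64
```

A value outside the 64-bit range makes the helper return `None`, which marks the point where none of the listed integer types would be able to hold the data.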
Column class reference
- class torcharrow.Column
- Column.dtype
The data type of a torcharrow.Column.
- Column.device
The device on which a torcharrow.Column is or will be allocated.
- Column.length
Return the number of rows, including null values.
- Column.null_count
Return the number of null values.
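The difference between the two counts is easy to miss: the length counts every row, while the null count covers only the missing ones. A minimal plain-Python model follows, with `None` standing in for a null row. The function names are hypothetical stand-ins, not TorchArrow calls.

```python
from typing import List, Optional


def length(rows: List[Optional[int]]) -> int:
    """Number of rows, counting null (None) entries."""
    return len(rows)


def null_count(rows: List[Optional[int]]) -> int:
    """Number of rows whose value is null (None)."""
    return sum(1 for value in rows if value is None)


rows = [1, None, 3, None, 5]
print(length(rows))                      # 5
print(null_count(rows))                  # 2
print(length(rows) - null_count(rows))   # 3 non-null rows
```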

- Return the first n rows.
- Return the last n rows.
- Cast the Column to the given dtype.
- (EXPERIMENTAL API) Return whether the data at index i is valid, i.e., non-null.
- Return the column/dataframe with values appended.
- Check whether each element in the column is contained in values.
- Return whether all non-null elements are True.
- Return whether any non-null element is True.
- Map rows according to the input correspondence.
- Select rows where the predicate is True.
- Map rows to lists of rows according to the input correspondence; dtype is required if the result type differs from the item type.
- Like map(), but invoke the callable on mini-batches of rows at a time.
- Fill null values using the specified method.
- Return a column/frame with rows removed where a row has any or all nulls.
- (EXPERIMENTAL API) Remove duplicate values from the row/frame, keeping the first, the last, or none.
- Convert self to an Arrow array.
- Convert to PyTorch containers (Tensor, PackedList, PackedMap, etc.).
- Convert to a plain Python container (list of scalars or containers).
- Convert self to a pandas Series.
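Several of the row-level operations described above can be modeled on a plain Python list, with `None` standing in for null. This is only a sketch of the described semantics, not TorchArrow code: the function names are hypothetical, and details such as how nulls reach a predicate or mapping function are assumptions avoided here by dropping nulls first.

```python
from typing import Callable, Iterable, List, Optional

Rows = List[Optional[int]]


def head(rows: Rows, n: int) -> Rows:
    """First n rows."""
    return rows[:n]


def tail(rows: Rows, n: int) -> Rows:
    """Last n rows (empty when n is not positive)."""
    return rows[-n:] if n > 0 else []


def drop_null(rows: Rows) -> List[int]:
    """Rows with null entries removed."""
    return [v for v in rows if v is not None]


def isin(rows: List[int], values: Iterable[int]) -> List[bool]:
    """For each element, whether it is contained in values."""
    allowed = set(values)
    return [v in allowed for v in rows]


def filter_rows(rows: List[int], predicate: Callable[[int], bool]) -> List[int]:
    """Rows for which the predicate is True."""
    return [v for v in rows if predicate(v)]


def flatmap(rows: List[int], fn: Callable[[int], List[int]]) -> List[int]:
    """Map each row to a list of rows and concatenate the results."""
    return [out for v in rows for out in fn(v)]


rows = [1, 2, None, 4]
dense = drop_null(rows)

print(head(rows, 2))                                  # [1, 2]
print(tail(rows, 2))                                  # [None, 4]
print(dense)                                          # [1, 2, 4]
print(isin(dense, [2, 4]))                            # [False, True, True]
print(filter_rows(dense, lambda v: v % 2 == 0))       # [2, 4]
print(flatmap(head(dense, 2), lambda v: [v, v * 10])) # [1, 10, 2, 20]
```

Note that the flat-mapped result has more rows than its input, which is what separates it from an ordinary one-to-one map.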
NumericalColumn class reference
- class torcharrow.NumericalColumn

- Return the absolute value of each element of the series.
- Round each value upward to the smallest integral value.
- Round each value downward to the largest integral value.
- Round each value in the data to the given number of decimals.
- Return a new column with the natural logarithm of the elements.
- Generate descriptive statistics.
- Return the minimum of the non-null values.
- Return the maximum of the non-null values.
- Return the sum of the non-null values.
- Return the mean of the non-null values.
- Return the standard deviation of the data.
- Return the median of the values in the data.
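The aggregates above are described as operating on the non-null values only, while the rounding operations work element by element. The sketch below models both behaviors in plain Python, using `None` for null. It is an illustration under assumptions rather than TorchArrow code: in particular, passing nulls through unchanged in the element-wise operations is assumed, not taken from this reference.

```python
import math
from typing import Callable, List, Optional

values: List[Optional[float]] = [1.5, None, -2.5, 4.0]

# Aggregates skip nulls: reduce over the non-null values only.
present = [v for v in values if v is not None]
minimum = min(present)
maximum = max(present)
total = sum(present)
mean = total / len(present)


def elementwise(fn: Callable[[float], float], rows: List[Optional[float]]) -> list:
    """Apply fn to each non-null row; nulls pass through (an assumption)."""
    return [None if v is None else fn(v) for v in rows]


print(minimum, maximum, total, mean)    # -2.5 4.0 3.0 1.0
print(elementwise(abs, values))         # [1.5, None, 2.5, 4.0]
print(elementwise(math.ceil, values))   # [2, None, -2, 4]
print(elementwise(math.floor, values))  # [1, None, -3, 4]
```

The mean divides by the number of non-null rows (3), not by the column length (4). `min` and `max` in this sketch raise `ValueError` on a column with no non-null values; how the library handles that case is not covered here.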
StringColumn class reference
- class torcharrow.StringColumn

- Compute the length of each element in the Column.
- Slice substrings from each element in the Column.
- Split strings around the given separator/delimiter.
- Remove leading and trailing whitespace.
- Return True if the string is an alphabetic string, False otherwise.
- Return True if all the characters are numeric, False otherwise.
- Return True if all characters in the string are alphanumeric (either letters or numbers), False otherwise.
- Return True if all characters in the string are numeric, False otherwise.
- Return True if the string contains only decimal digits (from 0 to 9), False otherwise.
- Return True if all characters in the string are whitespace, False otherwise.
- Return True if the non-empty string is in lower case, False otherwise.
- Return True if the non-empty string is in upper case, False otherwise.
- Return True if each word of the string starts with an upper case letter, False otherwise.
- Convert strings in the Column to lowercase.
- Convert strings in the Column to uppercase.
- Test if the beginning of each string element matches a pattern.
- Test if the end of each string element matches a pattern.
- Count occurrences of the pattern in each string of the column.
- Return the lowest index in each string in the Column.
- Replace each occurrence of the pattern in the Column.
- Determine if each string matches a regular expression.
- Test for each item if the pattern is contained within a string; returns a boolean.
- Find for each item all occurrences of the pattern (see re.findall()).
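Many of these descriptions mirror Python's built-in `str` methods and the `re` module, applied once per element. The sketch below shows that element-wise pattern on a plain list and then contrasts the three "numeric" predicates, which are easy to confuse. It uses only Python's own `str` and `re` behavior. Whether TorchArrow matches it exactly (for non-ASCII input or null rows, for example) is an assumption, and `per_element` is a hypothetical helper.

```python
import re
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def per_element(fn: Callable[[str], T], rows: List[Optional[str]]) -> List[Optional[T]]:
    """Apply fn to each non-null string; nulls pass through (an assumption)."""
    return [None if s is None else fn(s) for s in rows]


rows = ["Hello World", "abc123", None, "  42  "]

lengths = per_element(len, rows)
stripped = per_element(str.strip, rows)
titled = per_element(str.istitle, rows)
has_digit = per_element(lambda s: re.search(r"\d", s) is not None, rows)
digit_runs = per_element(lambda s: re.findall(r"\d+", s), rows)

print(lengths)     # [11, 6, None, 6]
print(stripped)    # ['Hello World', 'abc123', None, '42']
print(titled)      # [True, False, None, False]
print(has_digit)   # [False, True, None, True]
print(digit_runs)  # [[], ['123'], None, ['42']]

# (isdecimal, isdigit, isnumeric) for: "42", superscript two, one-half sign, "4.2"
samples = ["42", "\u00b2", "\u00bd", "4.2"]
numeric_table = {s: (s.isdecimal(), s.isdigit(), s.isnumeric()) for s in samples}
print(numeric_table["42"])      # (True, True, True)
print(numeric_table["\u00b2"])  # (False, True, True)
print(numeric_table["\u00bd"])  # (False, False, True)
print(numeric_table["4.2"])     # (False, False, False)
```

In Python the three predicates nest: every decimal string is a digit string, and every digit string is numeric, but not the other way around. A string with a decimal point fails all three.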
ListColumn class reference
- class torcharrow.ListColumn

- Compute the length of each element in the Column.
- Slice a sublist from each element in the Column.
- (EXPERIMENTAL API) Vectorizing map.
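The length and slice operations act on each list element of the column rather than on the column as a whole. A plain-Python model is sketched below, again with `None` for a null row. The function names and the `start`/`stop` parameters are hypothetical, and the null pass-through is an assumption.

```python
from typing import List, Optional

ListRows = List[Optional[List[int]]]


def list_lengths(rows: ListRows) -> List[Optional[int]]:
    """Length of each list element; null rows stay null (an assumption)."""
    return [None if item is None else len(item) for item in rows]


def list_slice(rows: ListRows, start: int, stop: int) -> ListRows:
    """Sublist item[start:stop] of each list element; null rows stay null."""
    return [None if item is None else item[start:stop] for item in rows]


rows = [[1, 2, 3], [4], [], None]
print(list_lengths(rows))      # [3, 1, 0, None]
print(list_slice(rows, 0, 2))  # [[1, 2], [4], [], None]
```

An empty list and a null row are different things in this model: the first has length 0, the second has no length at all.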