tsv-utils is a set of command-line utilities for manipulating large tabular data files, developed by eBay and written in the D programming language. The tools handle the numeric and text data files commonly found in machine learning, data mining, and similar environments. They offer filtering, sampling, statistics, joins, and more, and are especially useful when working with large data sets. They run faster than other tools providing similar functionality, often by significant margins.

The tools are installed in CSC's computing environment. To make them available, load the module with the command: module load tsv-utils
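
As a rough illustration of the filtering and statistics functions mentioned above, the following sketch builds a tiny sample file and runs two of the tools on it. The file name sample.tsv and its contents are made up for this example. The sketch assumes that the module has been loaded so that tsv-filter and tsv-summarize are on the PATH; if they are not, it only prints a reminder.

```shell
# On CSC, first make the tools available with: module load tsv-utils
# (kept as a comment so the sketch also runs where no module system exists)

# Create a small, made-up TSV file with a header line (hypothetical data).
printf 'name\tscore\nalice\t90\nbob\t55\ncarol\t72\n' > sample.tsv

if command -v tsv-filter >/dev/null 2>&1; then
    # Keep rows whose second field is greater than 60.
    # -H tells the tool that the first line is a header.
    tsv-filter -H --gt 2:60 sample.tsv

    # Compute the mean of the second field over all data rows.
    tsv-summarize -H --mean 2 sample.tsv
else
    echo "tsv-utils not found; load the module first" >&2
fi
```

Fields are referred to by number here. Each tool prints its full list of options with --help.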

