What is HTDP?
HTDP is short for High-Throughput Tabular Data Processor. This Java application is intended to facilitate data exploration and reduction tasks in large text files. The software has been optimized for microarray and deep parallel sequencing data, but it accepts any character-delimited tabular data set. In the latter case the first row of the data set should be designated as the header row and should contain the names of the corresponding columns. HTDP can also import, process and convert Variant Call Format (VCF) files, versions 4.0, 4.1 and 4.2. HTDP provides quick filtering functionality and can process data consisting of single or multiple input files. Files in different supported formats can be processed at the same time. The processed data can be exported as a tab-delimited file.

Features
  • works on any character-delimited column data (e.g. BED, GFF, GTF, WIG, VCF)
  • merge tabular data files vertically and horizontally
  • merge any tabular data files using genome location or common column(s)
  • unlimited filtering and data reduction capabilities
  • files that differ in format and content can be analyzed at the same time

System requirements
  • Minimum JRE: 1.6.0 (tested with version 1.6.0ga5)


HTDP on an eComStation machine with Open JDK


Installing HTDP
Download HTDP.tar.gz. Make a new directory with the name "htdp" on your drive. Open the file "HTDP.tar.gz" and you get "HTDP.tar". Open this in turn and you get the directory "HTDP". Copy its contents to the newly created directory "htdp". You can also copy the directory "HTDP" to your drive as it is, but I don't like directory names in capitals.

There is a manual too; you may need it. You can download s1_user_manual_20171229.pdf.

The command file used
HTDP works with Open JDK on OS/2 and eCS. I use a command file "htdp.cmd" with the following contents:
@echo off
set BEGINLIBPATH=[drive: java]\JAVA160ga5\bin
set path=[drive: java]\JAVA160ga5\bin
set CLASSPATH= 
[drive: htdp]
cd [drive: htdp]\htdp
java -Xmx800m -Duser.home=[drive: htdp]\htdp -jar htdp.jar 2>htdp-bugs.txt
I use two separate folders (directories), one for Java and one for HTDP. The references used in the cmd file:

  • [drive: java] = drive with Java
  • [drive: htdp] = drive with HTDP

should be replaced with real drive letters. Save the file under the name "htdp.cmd", or use the file from the distribution, and copy it to the "htdp" folder (directory). If your paths are different, adjust them according to your needs.

Create a new program object. Specify the path and file name: "[drive: htdp]\htdp\htdp.cmd". On the tab page Session, check the boxes "OS/2 window", "Running as an icon" and "Close Window to end program". On the tab page General you can enter the name "HTDP". There is an OS/2 icon in the distribution below.

Parameters / options explained
  • The statement "-Duser.home=[drive: htdp]\htdp" ensures that HTDP saves all necessary files (if any) in its own directory instead of in the home directory.
  • The specification "-Xmx800m" comes from the manual and the website. It sets the maximum amount of memory (heap) that Java may use, here 800 MB. With big data files you can raise it as far as the limits of your system allow. For OS/2-based systems this means a maximum source file of around 170 MB (depending on the type of file), which takes around 3.5 GB of memory. The program itself seems to have no problem with even bigger files.
  • The addition "2>htdp-bugs.txt" ensures that error messages are saved in the file "htdp-bugs.txt". The 2 in "2>" is not a typo: it redirects the error output (stderr). The file stays empty on my system.
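
If you want to check that the "-Xmx" and "-Duser.home" options really reach Java, a small test class can print both values. The following is only a sketch and not part of HTDP; the class name "CheckJvmSettings" and its method names are made up for this example. It uses only standard Java calls that are available in Java 1.6.

```java
// Hypothetical helper, not part of HTDP: prints the settings that the
// command file passes to Java, so that they can be checked.
public class CheckJvmSettings {

    // Maximum amount of memory Java will try to use, in megabytes.
    // This value is controlled by the -Xmx option.
    public static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Value of the user.home system property.
    // This value is overridden by -Duser.home=...
    public static String userHome() {
        return System.getProperty("user.home");
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapMegabytes());
        System.out.println("user.home    : " + userHome());
    }
}
```

Compile it with "javac CheckJvmSettings.java" (this needs a JDK, not only a JRE) and run it with the same options as in the command file, for example "java -Xmx800m -Duser.home=[drive: htdp]\htdp CheckJvmSettings". The reported heap size can be somewhat lower than the value given with "-Xmx", because Java may not count all of its internal memory areas.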

Running the program
The first thing I do when I try a new program is check whether the menu works and whether help is available. Asking for help gave the error that the file "HTDP.jaruser_manual.pdf" could not be found. So I downloaded the manual and renamed it, but even then I got the same error. At that time I had not yet included the memory specification. I also had problems with the size of the screen layout, but I had not read the manual. The picture shows part of an export from the database in Data Crow. The file from Data Crow was read directly, without any editing or translation.

Download
The file htdp-ecs.zip contains the above command files (all drive letters are set to C:) and an OS/2 HTDP icon.

revision January 7, 2018