SETI Markup Language

SML Introduction:

The SETI community needs and deserves a modern method of storing and sharing the data it collects. The data must be stored in a context-free environment so that it can be analyzed by others without access to the original tools used to create it. The methods used to create the data must be based on an agreed standard so that others can understand its organization and use its content directly from the data itself. This is the implementation of those goals.

The organization will be based on XML and will be collected into a language specific to SETI.

The Name Of This Language is:

SML - The SETI Markup Language

Language Organization:

The blocks on the right represent a portion of SML. You may view the complete schema on the interactive Schema Page.

During normal operation of a SETI station, Hits will be generated. A Hit is the detection of a signal that passes the various tests the system runs against it to filter local noise from possible ET signals. Each Hit generates a file that contains a precise and detailed identification of that Hit: when it occurred, where the receiver was tuned, how the antenna was pointing, and so on. This Hit file is an instance of SML and is composed as described below.

Each Hit entity has two main sections: the Title Block, for later identification, and the Equipment Setup, so that the data collected can be analyzed properly. Each of these two elements is decomposed further. Details are contained in the HIT schema on the Schema page. Please remember that not all the data is required. For example, if you don't have a Band Pass Filter (most don't), you are not required to enter any data for it. Many elements are required, however, so that the data makes sense. For example, unless we know where on Earth you were, which direction your antenna was pointed, and what time it was when you heard the first ET contact, you will forever be a footnote in SETI history as the one who was almost first.

The SML that is used to define a Hit is a mix of automatically generated and manually entered data. The goal is to be as flexible as possible for your Argus station. For example, the Creation Date in the Title Block could be entered manually or put into the data set automatically by the SETI station.

Title Block

The Title Block presents details of the hit. Elements of the Title Block are:

  1. Target - The constellation and star pointed to, along with the Constellation, Target, Scan and Hit IDs in the database.

  2. Location Coordinate System - The Argus station must be located for the data to make sense later on.

  3. Creation Date - The local Date that the file was created.

  4. Creation Time - Local time of creation.

  5. Operator Name - Who was at the controls when the file was created.

  6. Observer Name - People looking over your shoulder at the time the file was created.

  7. SML Version - Shows which version of SML was used when the file was created.
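
As a rough illustration of how a station might fill in part of the Title Block automatically, here is a minimal Python sketch. The tag names (TitleBlock, CreationDate, CreationTime, OperatorName, ObserverName, SMLVersion) and the sample values are assumptions made for this example only; the real names and structure are whatever the SML Schema defines. The Target and Location elements are left out because their internal structure is not described here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def build_title_block(operator, observers, sml_version, created=None):
    """Return a <TitleBlock> element. All tag names here are hypothetical."""
    # Creation Date and Creation Time are local, as described in the list above.
    created = created or datetime.now()
    block = ET.Element("TitleBlock")
    ET.SubElement(block, "CreationDate").text = created.strftime("%Y-%m-%d")
    ET.SubElement(block, "CreationTime").text = created.strftime("%H:%M:%S")
    ET.SubElement(block, "OperatorName").text = operator
    # One element per person looking over the operator's shoulder.
    for name in observers:
        ET.SubElement(block, "ObserverName").text = name
    ET.SubElement(block, "SMLVersion").text = sml_version
    return block


# Sample values are invented; a fixed timestamp keeps the output repeatable.
block = build_title_block("A. Operator", ["B. Observer"], "1.0",
                          created=datetime(2003, 5, 17, 21, 4, 30))
print(ET.tostring(block, encoding="unicode"))
```

Passing no created value lets the station stamp the file itself, while passing one corresponds to manual entry.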

Equipment Setup

To simplify the process there can only be one equipment setup per file of data. Elements of that setup are:


Antenna - Includes a description of the mount, the shape of the antenna, and the feed system. This allows beam width and gain calculations to be made automatically. Child nodes are:

  1. Pointing Coordinates - Just where the antenna was pointing when the Hit occurred.

  2. Mount - What system was in use to position the antenna (Birdbath, Az/El, drift, etc.).

  3. LNA - The gain in dB of the LNA and its noise factor.

  4. L1 Insertion Loss - The loss in signal between the LNA and the Band Pass Filter. On my system it's small because they are co-located, but it might be different on your system. This loss, and the others, are needed for system noise temperature calculations.

  5. Band Pass Filter - The lower and upper edges of the filter (its lower and upper 3 dB points) and its insertion loss.

  6. L2 Insertion Loss - Loss between the BPF and the RF Amplifier.

  7. RF Amp - Characteristics of this amplifier if present in the system

  8. L3 Insertion Loss - Between the RF amp and the receiver.
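
The remark above that the insertion losses are needed for system noise temperature calculations can be made concrete with the standard cascade (Friis) formula, in which each stage's noise temperature is divided by the total gain ahead of it. The Python sketch below is only an illustration under stated assumptions: noise is given as a noise figure in dB, passive losses sit at a physical temperature of 290 K, and every number in the example chain is invented. It covers the chain from the LNA onward and does not include antenna or sky noise.

```python
T0 = 290.0  # reference temperature in kelvin used to define noise figure


def db_to_linear(db):
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (db / 10.0)


def amplifier(gain_db, noise_figure_db):
    """Return (linear gain, noise temperature in K) for an amplifier."""
    return db_to_linear(gain_db), T0 * (db_to_linear(noise_figure_db) - 1.0)


def loss(loss_db, physical_temp_k=T0):
    """Return (linear gain, noise temperature in K) for a passive lossy stage."""
    ratio = db_to_linear(loss_db)
    return 1.0 / ratio, physical_temp_k * (ratio - 1.0)


def cascade_noise_temperature(stages):
    """Friis formula: divide each stage's noise by the gain that precedes it."""
    total_temp = 0.0
    gain_so_far = 1.0
    for gain, temp in stages:
        total_temp += temp / gain_so_far
        gain_so_far *= gain
    return total_temp


# Hypothetical chain in the order of the list above; all values are made up.
chain = [
    amplifier(20.0, 0.5),   # LNA
    loss(0.2),              # L1 insertion loss
    loss(1.0),              # Band Pass Filter insertion loss
    loss(0.5),              # insertion loss between BPF and RF amp
    amplifier(15.0, 3.0),   # RF amp
    loss(2.0),              # L3 insertion loss
    amplifier(0.0, 10.0),   # receiver, modeled as a noisy unity-gain stage
]
print("Chain noise temperature (K): %.1f" % cascade_noise_temperature(chain))
```

Because the LNA comes first, its gain reduces the contribution of every later loss, which is why a small L1 matters less than the LNA's own noise.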


Receiver - Mode of operation (SSB, FM, AM, etc.), manufacturer, model and the bandwidth in use. Child nodes are:

  1. Frequency - What frequency the receiver was tuned to, and the frequency of the Hit itself derived from the waterfall bin number.

  2. Spectrum Analyzer - Defines how the spectrum analyzer was set up when the Hit was detected.

  3. Mode - What receiving mode was in use at the time (AM, FM Wide, CW, etc).

  4. Analog Filter - If you use an analog filter to flatten the IF response, as I do, a simple description goes here.

  5. A-to-D Converter - Most Argus systems use a Sound Blaster on a PC. This element allows the entry of the manufacturer and model, along with the setup: sampling rate, encoding scheme (8 or 16 bit), recording mode (Mono or Stereo) and the channel (left or right) in use.

  6. Weighting Window - A signal sampled for a limited amount of time may exhibit a distorted Fourier spectrum. In order to minimize this distortion, the signal may be multiplied by a weighting function which reduces the signal towards zero at both ends of the sampling window. This element simply indicates which window, if any, was in use. Window types could be Triangle, Hanning, Gaussian, Hamming, Blackman or other.

  7. DSP - Digital Signal Processing in use. Most Argus systems use an FFT that creates a set number of bins. This element is used to capture the bin width in Hertz. It also defines the type of data collected from the FFT, which could be Magnitude, the Power Spectrum, the Real Spectrum, the Imaginary Spectrum, the Phase Spectrum or the RMS Spectrum.

  8. Filter - If digital filters were used they are defined here.

  9. Time Keeping - The data collected must be time tagged for future analysis. This element allows you to define the type of time keeping system in use. Most Argus stations use the PC clock, but it may be synchronized for better accuracy.
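
Several of the receiver elements above are tied together by simple arithmetic: the FFT bin width follows from the A-to-D sampling rate and the FFT size, the Hit frequency follows from the tuned frequency and the waterfall bin number, and a weighting window is just a set of multipliers applied to each block of samples. The Python sketch below illustrates these relationships. It assumes upper sideband reception, so that the audio offset of a bin adds to the tuned frequency, and all function names and sample numbers are made up for this example rather than taken from the SML Schema.

```python
import math


def bin_width_hz(sample_rate_hz, fft_size):
    """Width of one FFT bin in hertz."""
    return sample_rate_hz / fft_size


def hann_window(n):
    """Hanning weights, which fall to zero at both ends of the sample block."""
    if n < 2:
        return [1.0] * n
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def apply_window(samples, window):
    """Multiply each sample by its weight before the FFT is taken."""
    return [s * w for s, w in zip(samples, window)]


def hit_frequency_hz(tuned_hz, bin_number, sample_rate_hz, fft_size):
    """Frequency of a Hit from its bin number, assuming upper sideband."""
    return tuned_hz + bin_number * bin_width_hz(sample_rate_hz, fft_size)


# Invented example: 8000 samples per second, 4096-point FFT, Hit in bin 512.
width = bin_width_hz(8000, 4096)
hit = hit_frequency_hz(1420405000.0, 512, 8000, 4096)
print("Bin width (Hz):", width)
print("Hit frequency (Hz):", hit)
print("Windowed block:", apply_window([1.0] * 5, hann_window(5)))
```

With lower sideband reception the offset would be subtracted instead, which is one reason the Mode element needs to be recorded alongside the Frequency.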

The above list is only an outline of the SML data organization. The SETI Markup Language gives the parser the rules that a well formed SML document must follow. The details of SML are kept in the SML Schema and are subject to continuous update and adjustment as new parameters are added and the definitions of existing ones change.
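
To give a feel for what checking a Hit file against such rules involves, here is a small Python sketch that parses a document and reports which required elements are missing. It is not schema validation: the standard library parser only checks that the XML is well formed, and a real check would use an XSD-aware validator with the SML Schema. The element names, the list of required paths and the sample document are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def missing_elements(xml_text, required_paths):
    """Return the required paths not found under the document root.

    ET.fromstring raises xml.etree.ElementTree.ParseError if the text
    is not well formed XML.
    """
    root = ET.fromstring(xml_text)
    return [path for path in required_paths if root.find(path) is None]


# Hypothetical Hit file and required elements; real names come from the schema.
sample = """<Hit>
  <TitleBlock>
    <CreationDate>2003-05-17</CreationDate>
  </TitleBlock>
  <EquipmentSetup/>
</Hit>"""

required = [
    "TitleBlock/CreationDate",
    "TitleBlock/CreationTime",
    "EquipmentSetup",
]
print(missing_elements(sample, required))
```

Optional elements, such as the Band Pass Filter, would simply be left out of the required list.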

Download instructions:

  1. Download by clicking the SETI_SML.XSD file. Your browser may open and display the file. In that case simply hit File | Save As... to save it on your computer.

  2. Use a simple editor, like Notepad, to modify the third line of the Test1 file to point to where you put the schema file. I like to use XML Spy as the schema and XML development tool. It can be downloaded from Altova.