Tutorial


SGML, HTML and XML - What Are They?

HTML and XML look alike on the page, but they differ in intent: HTML describes how data should look, while XML describes what the data is.

  •   SGML is the father of the markup languages.  It was created by Dr. Charles Goldfarb in the mid-1980s as a way to standardize the identification and collection of data.  Many people will tell you that SGML is not a language itself but a way to create languages.  Those people want to scare you - ignore them.  It's not that complex.  Just think of it as a language.  You also don't have to learn it to understand what follows.

  •   HTML is the markup system used on the World Wide Web.  This page is marked up in HTML; if you would like to see an example, choose View | Source in Internet Explorer and you will see the code that creates what you are reading now.  HTML is used to define the Style of the data.  In HTML you define the style of the text with tags like <FONT>, <I> for italics, <B> for bold, and <U> for underline.  You can also define the Structure of the document with tags like <Hn> for headings and <P> for paragraphs.  HTML also allows you to describe some of the Content of the data with tags like <TITLE> and <CODE>.

  •   XML is used to define the Content of the data by allowing itself to be extended to fit the circumstances.  This is exactly why it is valuable to SETI.

The ability to define the content is exactly what is needed for SETI data collection.  We need a way to save data and later review it or share it "without regard to its look".  We need a way to compare data from Argus stations collected at different times and on different types of equipment.  Look is unimportant when the data is generated and analyzed strictly by computer.
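To make the style-versus-content distinction concrete, here is a minimal Python sketch. The reading, the element names and the attribute names are all invented for illustration; they are not taken from the SML schema. It marks up the same reading once with style tags and once with content tags, then shows that only the content version lets a program ask for a value by name.

```python
import xml.etree.ElementTree as ET

# Hypothetical reading marked up for appearance (HTML-style tags).
html_fragment = "<p><b>1420.405</b> MHz at <i>-152</i> dBm</p>"

# The same reading marked up for content. These element and attribute
# names are invented for illustration; they are not the SML schema.
xml_fragment = (
    "<reading>"
    "<frequency units='MHz'>1420.405</frequency>"
    "<amplitude units='dBm'>-152</amplitude>"
    "</reading>"
)

def tag_names(fragment):
    """Return the tag names in a well-formed fragment, in document order."""
    return [element.tag for element in ET.fromstring(fragment).iter()]

def read_value(fragment, name):
    """Return the numeric text of the first child element called `name`."""
    return float(ET.fromstring(fragment).findtext(name))

print(tag_names(html_fragment))  # the tags say how the text looks
print(tag_names(xml_fragment))   # the tags say what the data is
print(read_value(xml_fragment, "frequency"))
```

The style tags give a program nothing to search on, whereas the content tags name each value, which is the property the rest of this page relies on.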

XML is a subset of SGML.  When a new language is defined, in either SGML or XML, its definition is recorded in a Document Type Definition (DTD).  For example, HTML, the standard language used to create the web page you are viewing, is described in a DTD.  The HTML DTD can be viewed at the World Wide Web Consortium home page.  When revisions to HTML are made, they are made to the DTD.

XML has most of the features of SGML with some of the more cumbersome structures removed.  This makes it perfect for our needs.  One of the differences is that XML languages, like the SETI Markup Language (SML), are usually defined in a schema rather than a DTD.  The schema itself is written in XML, so there is only one language to learn.  The SML schema is available to download.

It turns out that an astronomical markup language has already been defined and committed to a DTD.  This language is the Astronomical Markup Language (AML).  AML has been examined and forms the core of our SETI DTD.

SML 

Advantages of Starting With XML in the creation of SML

  •   New: This is cutting-edge technology.  I want to be part of that.  I do this as a hobby, not because I have to, so why not learn something brand new?

  •   Object Oriented: The ability to nest XML elements means that a data set can be built as an object.  One observation object could contain frequency objects, amplitude objects, time objects, etc.

  •   Existing DTD for a model: AML and AIML exist and can be used as a starting point for the SETI Markup Language.

  •   Compatible with current browsers: The most recent versions of Netscape (version 4.7) and Internet Explorer (version 5.0 and above) will display an XML (and therefore SML) document directly in limited form.

  •   XML can be embedded in a standard web document: An "island of XML" can be put into an HTML document, where it can be accessed in various ways by the user's browser.  This way SML can be viewed by the casual visitor to a web site with no specialized tools.

  •   Netscape and Microsoft are committed to XML: Both companies' latest versions incorporate at least some of the features of XML.  Both companies are expected to make XML the basis for future products.  SML will benefit from this joint work because the tools will be maintained for us and the specification will be professionally managed.

  •   Resulting Data is Searchable: This is the key to XML and thereby SML.  Each data point will have an element name that can be used to place the data in a database.
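The nesting and searchability points above can be sketched in a few lines of Python. Everything in the sample document is hypothetical: the element names, the attribute names and the values are made up for illustration and are not the SML schema. The sketch treats one observation as an object containing sample objects, stores every named data point in an in-memory SQLite table, and then searches by element name.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical observation; element names and values are illustrative only.
document = """
<observation station="CM87XI">
  <sample>
    <time>2000-01-01T00:00:00Z</time>
    <frequency units="MHz">1420.405</frequency>
    <amplitude units="dBm">-152.0</amplitude>
  </sample>
  <sample>
    <time>2000-01-01T00:00:10Z</time>
    <frequency units="MHz">1420.406</frequency>
    <amplitude units="dBm">-149.5</amplitude>
  </sample>
</observation>
"""

def load_observation(xml_text, connection):
    """Store each child of every <sample> as (station, sample, name, value)."""
    root = ET.fromstring(xml_text)
    connection.execute(
        "CREATE TABLE IF NOT EXISTS data "
        "(station TEXT, sample INTEGER, name TEXT, value TEXT)"
    )
    for index, sample in enumerate(root.findall("sample")):
        for field in sample:
            connection.execute(
                "INSERT INTO data VALUES (?, ?, ?, ?)",
                (root.get("station"), index, field.tag, field.text),
            )

connection = sqlite3.connect(":memory:")
load_observation(document, connection)

# Search by element name: every amplitude, in sample order.
rows = connection.execute(
    "SELECT sample, value FROM data WHERE name = ? ORDER BY sample",
    ("amplitude",),
).fetchall()
print(rows)
```

Because the table is keyed on the element name rather than on any layout detail, a new kind of measurement would need only a new element, not a new loader.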

If designed correctly, SML data could be used for many purposes.  For example, the system noise temperature could be computed directly from the data.  Two searches from different Argus stations could be combined and normalized into a single new view.

 

One of the items defined in SML is the Argus station's location.  The material below describes the method of location identification used by Argus.

Grid Square (Maidenhead Grid) Extension

From Art Lange W6RXQ - Argus Station CM87XI

Here's an example of how the size of the grid (micro grid) decreases as the Maidenhead precision increases:
CM87XI42LF16
The grid sizes (latitude X longitude) at each level of precision are:
CM = 10 X 20 degrees
CM87 = 1 X 2 degrees = approx. 69 X 138 miles at the equator (1 degree = 60 nautical miles at the equator)
CM87XI = 2.5 X 5 minutes = approx. 2.87 X 5.75 miles at the equator (extra letters divide by 24)
CM87XI42 = 0.25 X 0.5 minutes = approx. 0.287 X 0.575 miles at the equator (extra numbers divide by 10)
CM87XI42LF = 0.01 X 0.02 minutes = approx. 0.0115 X 0.0230 miles = 60.6 X 121.2 feet at the equator (extra letters divide by 25)
CM87XI42LF16 = 0.001 X 0.002 minutes = approx. 6 X 12 feet at the equator (extra numbers divide by 10)
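The sizes listed above follow from repeatedly dividing the whole globe (180 degrees of latitude by 360 degrees of longitude) by the divisor for each pair. A minimal Python sketch of that arithmetic follows. The function and constant names are my own, and the conversion assumes one arc minute at the equator equals one nautical mile of 6076.12 feet, so the feet figures come out slightly larger than the rounded ones in the list, which are based on 69 miles per degree.

```python
# Divisors applied at each successive pair, as described in the text:
# field letters (18), square digits (10), subsquare letters (24),
# then the extension alternates digits (10) and letters (25).
DIVISORS = [18, 10, 24, 10, 25, 10]

# Assumption: one arc minute at the equator is one nautical mile.
FEET_PER_NAUTICAL_MILE = 6076.12

def cell_size_minutes(pairs):
    """Return (latitude, longitude) cell size in arc minutes after `pairs` pairs."""
    lat_minutes = 180.0 * 60.0
    lon_minutes = 360.0 * 60.0
    for divisor in DIVISORS[:pairs]:
        lat_minutes /= divisor
        lon_minutes /= divisor
    return lat_minutes, lon_minutes

locator = "CM87XI42LF16"
for pairs in range(1, len(locator) // 2 + 1):
    lat_min, lon_min = cell_size_minutes(pairs)
    print(
        locator[: 2 * pairs],
        f"{lat_min:g} x {lon_min:g} minutes,",
        f"{lat_min * FEET_PER_NAUTICAL_MILE:.1f} x "
        f"{lon_min * FEET_PER_NAUTICAL_MILE:.1f} feet",
    )
```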

For those interested in correcting the size of the grid square for different latitudes, multiply the longitude (east-west) dimension by the cosine of the latitude. For example, at 37 deg north, a 1 minute difference in longitude = cos(37 deg) X 1 nautical mile = approx. 0.799 nm.
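The cosine correction is a one-line calculation. The Python sketch below uses function names of my own choosing and assumes a spherical Earth, which is the same approximation the rule of thumb makes.

```python
import math

def longitude_minute_nm(latitude_deg):
    """Approximate east-west length, in nautical miles, of 1 minute of longitude."""
    return math.cos(math.radians(latitude_deg))

def corrected_width_feet(equator_width_feet, latitude_deg):
    """Shrink an east-west width measured at the equator to the given latitude."""
    return equator_width_feet * math.cos(math.radians(latitude_deg))

print(round(longitude_minute_nm(37.0), 3))          # 0.799
print(round(corrected_width_feet(12.0, 37.0), 1))   # 9.6
```

The second line applies the same factor to the 12 foot east-west width of a full-precision grid, which shrinks to roughly 9.6 feet at 37 deg north. The north-south dimension needs no correction.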
Note: It is important that everyone use the same divisors for the letters: divide by 25 for the 3rd and subsequent letter pairs. (In Maidenhead the first letter pair divides by 18 and the second divides by 24.  The extensions divide by 25, which keeps the grid sizes exact in decimal minutes, e.g. 0.25 / 25 = 0.01.)
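With the divisors fixed as in the note above, an extended designator can be decoded mechanically. The following Python sketch is one way to do it, not an official Argus routine; the function name is hypothetical. It relies on the standard Maidenhead convention that the first character of each pair counts eastward from 180 deg west and the second counts northward from 90 deg south, and it returns the south-west corner of the grid.

```python
from string import ascii_uppercase

# Divisors for successive pairs: 18, 10, 24 are standard Maidenhead;
# the trailing 10, 25, 10 are the extension described in the text.
DIVISORS = [18, 10, 24, 10, 25, 10]

def locator_to_lat_lon(locator):
    """Return the south-west corner (latitude, longitude) in degrees."""
    locator = locator.upper()
    if len(locator) % 2 or not 2 <= len(locator) <= 2 * len(DIVISORS):
        raise ValueError("locator must have 2 to 12 characters, in pairs")
    lat_size, lon_size = 180.0, 360.0
    lat, lon = -90.0, -180.0
    for i, divisor in enumerate(DIVISORS[: len(locator) // 2]):
        lon_char, lat_char = locator[2 * i], locator[2 * i + 1]
        # Pairs alternate between letters and digits.
        alphabet = ascii_uppercase if i % 2 == 0 else "0123456789"
        lon_index = alphabet.find(lon_char)
        lat_index = alphabet.find(lat_char)
        if not (0 <= lon_index < divisor and 0 <= lat_index < divisor):
            raise ValueError(f"invalid pair {lon_char}{lat_char} at position {2 * i}")
        lat_size /= divisor
        lon_size /= divisor
        lon += lon_index * lon_size
        lat += lat_index * lat_size
    return lat, lon

print(locator_to_lat_lon("CM87"))
print(locator_to_lat_lon("CM87XI42LF16"))
```

Negative longitudes are west of Greenwich. Because every step only adds a non-negative offset, a longer designator always lies inside the grid named by its shorter prefix.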
The beauty of the extended Maidenhead designators is the efficiency of the representation. Representing the same precision in decimal degrees requires many more characters. For example, 3 feet = approx. 0.00001 degrees. Thus to represent my 12 character full precision grid (CM87XI42LF16) to 6 feet precision, it takes 21 characters in decimal degrees (37.383345 N 122.033456