Data Service App Markup Language (DSAML)

DSAML is a meta-language based on XML that lets you define rules to extract data from a data source.
You can use DSAML to extract data from:
  • A page or website
  • A text file
  • An XML file

Syntax Example

<!-- Define the root node: select every "li" in a "ul" inside the "div" whose "id" attribute equals "myid" -->
<select xpath="//div[@id='myid']/ul/li">

    <!-- Get the content of the "a" element inside each "li" -->
    <field name="linkText" xpath="a" />

    <!-- Get the "href" attribute of the "a" element inside each "li" -->
    <field name="linkHref" xpath="a" attribute="href" />
</select>
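
To make the rules above concrete, here is a small, hypothetical page fragment they could be applied to (the markup and link targets are invented for illustration). Going by the comments in the example, each "li" should yield one record, with linkText taken from the content of "a" and linkHref taken from its "href" attribute. The exact output format is not described in this document.

```xml
<!-- Hypothetical input page fragment; not taken from a real site -->
<div id="myid">
    <ul>
        <!-- Should yield: linkText = "Home", linkHref = "/home" -->
        <li><a href="/home">Home</a></li>
        <!-- Should yield: linkText = "Contact", linkHref = "/contact" -->
        <li><a href="/contact">Contact</a></li>
    </ul>
</div>
```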
        

For XPath syntax, see the XPath expression examples later in this document.

Select element
The <select /> element is the main command; use it to define the root of your data.
Attribute Value

name

Sets the name of the result in the output data.

xpath

XPath query used to extract data; see the XPath expression examples.
Use the special placeholder "{..}" for dynamic values or extended field results (coming soon).

bridge

Unique code (for example "443A6835-82FB-48EA-B20C-41E3CEE0435B") of another Bridge. This attribute includes all data from the specified Bridge.

  Tips & Tricks: You can nest one <select /> inside another, and you can use several <select /> elements at the root to get more than one result, each with different information.
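
The tip above can be sketched as follows. This is a minimal, untested sketch: the page structure, the result names and the field names are all hypothetical, and it assumes that a nested <select /> is evaluated relative to each node matched by its parent, in the same way as a <field />.

```xml
<!-- First root-level select: one record per category block (hypothetical page structure) -->
<select name="categories" xpath="//div[@class='category']">
    <field name="categoryName" xpath="h2" />

    <!-- Nested select: assumed to run inside each "div" matched above -->
    <select name="products" xpath="ul/li">
        <field name="productName" xpath="a" />
        <field name="productUrl" xpath="a" attribute="href" />
    </select>
</select>

<!-- Second root-level select: a separate result with different information -->
<select name="footerLinks" xpath="//div[@id='footer']/ul/li">
    <field name="linkText" xpath="a" />
    <field name="linkUrl" xpath="a" attribute="href" />
</select>
```
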
Field element
The <field /> element is the key tag for extracting data from the current context of the enclosing <select> element.
Base attribute Value

name

Name of the field; this name is used in the returned data structure.

xpath

XPath query used to extract data; see the XPath expression examples.
Use the special placeholder "{..}" for dynamic values or extended field results (coming soon).

validation

Validates the field value returned in the result. If you do not declare a validation rule, the record is always added to the result. See "Validation details" for more information.

attribute

Specifies the name of the attribute whose value you want to extract. If not declared, the content (text or HTML) of the selected item (defined in the xpath attribute) is used.

Operator attribute Value

replace

The text to find and replace.

replaceto

The replacement text.

trim

Trims the field value. Possible values: all|start|end.

case

Changes the case of the field value. Possible values: upper|lower|capitalize.

words

Gets the word at the given index. Example: value="alpha beta gamma" with words="2" gives the new value "beta".

position

Gets the index position of the value in a list. Example: value="alpha" with position="beta|alpha|gamma" gives the new value 2.

separator

The separator used by processing functions such as words or position.

substring

Extracts a substring from the field value. Example:
substring="1,4"
extracts the string from char 1 to char 4.
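
A hedged sketch of the operator attributes in use. The table layout, result name and field names are hypothetical. Each field uses a single operator (or an operator together with its companion attribute) because this document does not say in which order several operators on one field are applied.

```xml
<select name="rows" xpath="//table[@id='prices']/tr">
    <!-- Trim the cell text (documented options: all|start|end) -->
    <field name="product" xpath="td[1]" trim="all" />

    <!-- Convert the cell text to upper case -->
    <field name="code" xpath="td[2]" case="upper" />

    <!-- Find "," in the value and replace it with "." -->
    <field name="price" xpath="td[3]" replace="," replaceto="." />

    <!-- Take the second word; for "alpha beta gamma" the table above gives "beta" -->
    <field name="secondWord" xpath="td[4]" words="2" />

    <!-- Same idea, split on ";" (assumes separator overrides the default delimiter) -->
    <field name="secondItem" xpath="td[5]" words="2" separator=";" />

    <!-- Map the value to its index in the list; "alpha" gives 2 per the table above -->
    <field name="rank" xpath="td[6]" position="beta|alpha|gamma" />

    <!-- Keep char 1 to char 4 of the value -->
    <field name="prefix" xpath="td[7]" substring="1,4" />
</select>
```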

Validation details
  How to use validation: <field ... validation="rule or function" />. If the validation result is true, all fields in the record are added to the result.
  IMPORTANT! You can use validation on one or more fields. If all validation results are true, the record is added to the result; if even one is false, the record is not added.
Validation rule Description

empty

Check if value is empty.

not empty

Check if value is NOT empty.

is number

Check if value is a number.

is not number

Check if value is not a number.

Validation function Description

equals(string)

Check if value is equal to the parameter.

notequals(string)

Check if value is NOT equal to the parameter.

contains(string)

Check if value contains the parameter.

notcontains(string)

Check if value does NOT contain the parameter.

len(number)

Check if length of value is equal to parameter.

greaterthan(number)

Check if length of value is greater than parameter.

lessthan(number)

Check if length of value is less than parameter.

  Tips & Tricks: Prefix a field name with "#", as in <field name="#fieldname" />, to define the field as a variable; such a field is not shown in the result data.
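
A minimal sketch of validation, with a hypothetical table and hypothetical field names. It assumes that a function argument is written unquoted inside the parentheses, as in the function table above, and that a "#" field is validated like any other field. This document does not show a complete call, so check both assumptions against your DSAML engine.

```xml
<select name="rows" xpath="//table[@id='items']/tr">
    <!-- Rule: drop the record when this cell is empty (for example, a header row) -->
    <field name="title" xpath="td[1]/a" validation="not empty" />

    <!-- Function: keep the record only when the link contains "example" -->
    <field name="url" xpath="td[1]/a" attribute="href" validation="contains(example)" />

    <!-- "#" variable: assumed to take part in validation, but not shown in the result -->
    <field name="#quantity" xpath="td[2]" validation="is number" />
</select>
```

Because every validation must be true for a record to be added, a row in this sketch is kept only when all three checks pass.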

Some examples of how to use XPath syntax

Expression Refers to

./author

All <author> elements within the current context. Note that this is equivalent to the expression in the next row.

author

All <author> elements within the current context.

first.name

All <first.name> elements within the current context.

/bookstore

The document element (<bookstore>) of this document.

//author

All <author> elements in the document.

book[/bookstore/@specialty=@style]

All <book> elements whose style attribute value is equal to the specialty attribute value of the <bookstore> element at the root of the document.

author/first-name

All <first-name> elements that are children of an <author> element.

bookstore//title

All <title> elements one or more levels deep in the <bookstore> element (arbitrary descendants). Note that this is different from the expression in the next row.

bookstore/*/title

All <title> elements that are grandchildren of <bookstore> elements.

bookstore//book/excerpt//emph

All <emph> elements anywhere inside <excerpt> children of <book> elements, anywhere inside the <bookstore> element.

.//title

All <title> elements one or more levels deep in the current context. Note that this situation is essentially the only one in which the period notation is required.

author/*

All elements that are the children of <author> elements.

book/*/last-name

All <last-name> elements that are grandchildren of <book> elements.

*/*

All grandchildren elements of the current context.

*[@specialty]

All elements with the specialty attribute.

@style

The style attribute of the current context.

price/@exchange

The exchange attribute on <price> elements within the current context.

price/@exchange/total

Returns an empty node set, because attributes do not contain element children. This expression is allowed by the XML Path Language (XPath) grammar, but is not strictly valid.

book[@style]

All <book> elements with style attributes, of the current context.

book/@style

The style attribute for all <book> elements of the current context.

@*

All attributes of the current element context.

./first-name

All <first-name> elements in the current context node. Note that this is equivalent to the expression in the next row.

first-name

All <first-name> elements in the current context node.

author[1]

The first <author> element in the current context node.

author[first-name][3]

The third <author> element that has a <first-name> child.

my:book

The <book> element from the my namespace.

my:*

All elements from the my namespace.

@my:*

All attributes from the my namespace (this does not include unqualified attributes on elements from the my namespace).
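
The expressions above are easier to read against a concrete document. The following bookstore document is hypothetical: its element names follow the table, but the values, the structure and the namespace URI are invented, so treat it as a reading aid rather than the document the table was written for.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample; "urn:example:my" is a placeholder namespace URI -->
<bookstore specialty="novel" xmlns:my="urn:example:my">
    <!-- Matched by book[@style] and by book[/bookstore/@specialty=@style] -->
    <book style="novel">
        <title>First Title</title>
        <author>
            <first-name>Ann</first-name>
            <last-name>Reader</last-name>
        </author>
        <!-- With this "book" as context, price/@exchange selects the attribute below -->
        <price exchange="0.7">12.50</price>
        <excerpt>
            <!-- Reached by bookstore//book/excerpt//emph -->
            <p>Some <emph>emphasized</emph> text.</p>
        </excerpt>
    </book>
    <!-- No style attribute, so book[@style] does not match this one -->
    <book>
        <title>Second Title</title>
        <author>
            <first-name>Ben</first-name>
            <last-name>Writer</last-name>
        </author>
    </book>
    <!-- Matched by my:book and my:*; its my:style attribute is matched by @my:* -->
    <my:book my:style="guide">
        <title>Third Title</title>
    </my:book>
</bookstore>
```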

Real example

This is a real example of using DSAML to extract information from a web page.
The data is extracted from this web page: https://meta.wikimedia.org/wiki/List_of_Wikipedias
DSAML Query
<select xpath="//table[@class='sortable']/tr">
      <field name="language" xpath="td[2]/a" />
      <field name="languageLocal" xpath="td[3]/a" />
      <field name="languageurl" xpath="http:{td[3]/a}" attribute="href" />
      <field name="wikiurl" xpath="http:{td[4]/a}" attribute="href" />
      <field name="wiki" xpath="td[4]/a" />
      <field name="wikiapi" xpath="http://{td[4]/a}.wikipedia.org/w/api.php" />
      <field name="articles" xpath="td[5]" />
      <field name="activeUsers" xpath="td[10]/a" />
</select>

Copyright © RB-Soft 2015