Define clue extraction rules

Tools used: MetaStudio, a data schema definition tool.

When the operator switches to the Clue Editor work board, an Info clue is already present; it was created automatically for the property product page in the previous step. Change the Target Theme of this clue to Product_mic_en, as shown in the following figure.


Figure 1

If the commodity catalog is paginated, an in-thread clue must be defined so that the pages can be turned. Take the following steps:

  1. Click the newClue button to create a new clue;
  2. Tick the in-thread checkbox;
  3. Select the Marker radio button;
  4. Enable reverse selection;
  5. In the embedded Web browser window, click "1", the first number in the pagination line at the bottom of the HTML page, to locate DOM node No. 2033;
  6. In the DOM Tree Viewer, start from node 2033 and move up the tree to find a suitable ancestor node, No. 2029 in this case. The ancestor should enclose the pagination information and nothing else;
  7. In the DOM Tree Viewer, right-click and choose Clue Mapping>>Clue Mapping>>s_clue_1 from the pop-up menu. As a result, Node: 2029 appears at the left of the last line of the Clue Operation group box;
  8. In the embedded Web browser window, click the word "Next" in the pagination line at the bottom of the HTML page to locate DOM node No. 2076;
  9. Expand the sub-tree below node 2076 and select the #text node No. 2079;
  10. In the DOM Tree Viewer, right-click and choose Clue Mapping>>Marker Mapping from the pop-up menu. As a result, the string Next is filled into the Marker Value edit box.
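The idea behind the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how MetaStudio works internally: the page fragment, the element ids and the helper names (`pagination_block`, `next_page_url`) are all hypothetical. The sketch finds the smallest element that encloses both the first page number and the "Next" marker (the role played by node 2029 above), then looks for the marker text inside that element to decide which link to follow.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified catalog page; a real page is much larger.
PAGE = """
<html><body>
  <div id="products"><p>Item A</p><p>Item B</p></div>
  <div id="pager">
    <span>1</span>
    <a href="list?page=2">2</a>
    <a href="list?page=2">Next</a>
  </div>
</body></html>
""".strip()


def parent_map(root):
    """Map every element to its parent (ElementTree keeps no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}


def ancestors(node, parents):
    """Return the chain [node, parent, grandparent, ...] up to the root."""
    chain = [node]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


def find_by_text(root, text):
    """Return the first element whose own text equals `text`, else None."""
    for el in root.iter():
        if (el.text or "").strip() == text:
            return el
    return None


def pagination_block(root, first_label="1", marker="Next"):
    """Smallest element enclosing both the first page number and the marker."""
    parents = parent_map(root)
    first = find_by_text(root, first_label)
    nxt = find_by_text(root, marker)
    if first is None or nxt is None:
        return None
    marker_chain = ancestors(nxt, parents)
    # Walk upward from the first page number; the first shared node is the
    # tightest container around the pagination line.
    for candidate in ancestors(first, parents):
        if candidate in marker_chain:
            return candidate
    return None


def next_page_url(root, marker="Next"):
    """Follow the marker inside the pagination block; None means no next page."""
    block = pagination_block(root, marker=marker)
    if block is None:
        return None
    for link in block.iter("a"):
        if (link.text or "").strip() == marker:
            return link.get("href")
    return None


root = ET.fromstring(PAGE)
print(pagination_block(root).get("id"))
print(next_page_url(root))
```

Restricting the search to the enclosing block is the reason for choosing an ancestor that wraps only the pagination line: a stray "Next" elsewhere on the page cannot then be mistaken for the page-turning link. When the marker is absent, as on the last page of a catalog, the sketch returns `None` and the page turning stops.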

Click the viewClue button to preview the SCE file, shown in Figure 2.


Figure 2