Problem with DC_Xml2ObjectTree


Auge_Ohr · #11 Post by **Auge_Ohr** » Wed Jul 27, 2016 2:10 pm

rdonnay wrote: These functions are part of the Foundation Subscription of Xbase++ 2.0.

ok, THX
still using v1.9.355

Piotr D · #12 Post by **Piotr D** » Thu Jul 28, 2016 9:19 am

Roger,
you have not quite understood my problem. This XML file is used to exchange data between different accounting systems and Fiscal Control. Regardless of the manufacturer and database system, all accounting programs (in Poland) must be able to generate XML files for Fiscal Control that conform to this XML schema. As I wrote in another post, Internet Explorer can't display files larger than 10 MB (it hangs) because IE internally converts the XML document to an HTML-style view.
In my program, before sending the XML file to Fiscal Control, I want to display its data in a user-friendly form, for example in tables. For this purpose I need to read the XML file.
The second reason is data exchange between systems based on different database systems, for which XML is a good fit.
How can I read the content of XML nodes from such large XML files?

Regards
Piotr

#13 Post by **rdonnay** » Thu Jul 28, 2016 10:02 am

This would require using a different technique to parse the XML data.
We would drop the "hierarchical" method of an XML DOM and DC_Xml2ObjectTree() and instead use the "callback" method of Alaska's XML parser.

The callback would fill an array (or database) with the XML data for browsing.
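
A minimal sketch of that idea, assuming the ASXML calls used in the DbeLoad.Prg listing further down in this post (XMLDocOpenFile(), XMLDocSetAction(), XMLDocProcess(), XMLGetTag()). The names XmlToArray(), CollectTag() and saRows are hypothetical, and the header name xml.ch is an assumption:

```xbase
#include "xml.ch"    // assumption: header defining XMLTAG_CONTENT and XML_PROCESS_CONTINUE

STATIC saRows := {}  // hypothetical file-wide buffer filled by the callback

/* Collect the content of every tag matching cPath into an array */
FUNCTION XmlToArray(cFileName, cPath)

LOCAL nXMLDoc, nActions

saRows  := {}
nXMLDoc := XMLDocOpenFile(cFileName)

nActions := XMLDocSetAction(nXMLDoc, cPath,;
                            {|n,c,a,h|CollectTag(n,c,a,h)})

IF nActions != 0
  XMLDocProcess(nXMLDoc)
  XMLDocResetAction(nXMLDoc)
ENDIF
XMLDocClose(nXMLDoc)

RETURN saRows

/* Callback: append one {tag name, content} row per matching tag */
STATIC FUNCTION CollectTag(cTag,cContent,aMember,nH)

LOCAL aCH

XMLGetTag(nH,@aCH)
AAdd(saRows, {cTag, aCH[XMLTAG_CONTENT]})

RETURN (XML_PROCESS_CONTINUE)
```

A call such as XmlToArray("DbeLoad.Xml", "//DatabaseEngines/load") would return the collected rows, which could then be passed to a browse or written to a database.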

I will see if I can write you a program that uses your XML data.

Here is some information I wrote about this for the XML session I will be giving at the next Devcon.

How to use the Xbase++ XML Parser

The Alaska Software XML (ASXML) implementation provides an extremely fast, lightweight XML processor. To give you an idea of its performance: it can process approximately 250,000 XML tags per second on a Dell Inspiron laptop with a 433 MHz CPU.

The major design goals for the XML processor were simple usage, speed, and a callback processing architecture. Callback processing simplifies the development of XML readers tremendously. Unlike Microsoft's XML implementation, where the parser creates a complete tree structure representing an entire XML document, a callback parser allows you to register your own functions to process specific tags in a specific hierarchy.

For example, with a hierarchical parser you have to "walk through" the tree structure until you reach the node/tag of interest. The following pseudo code shows the steps necessary to extract the contents of the <load> tag from an XML file using a hierarchical parser:

Code: Select all

oRoot   := getRootTag()
oParent := oRoot:getChild( "DatabaseEngines" )
aChild  := oParent:getChildren( "load" )
FOR i := 1 TO Len( aChild )
  oChild := aChild[i]
  // do whatever you want to do with the <load> tag
NEXT i

With a hierarchical parser you would extract the start, or root, tag and iterate through the tree until the desired tag is reached. A callback processing parser takes a different approach. It expects the user to register a function for a specific group of tags, or nodes, and passes each tag of interest, along with its associated data, to that function. In other words, the user associates an action with a tag; the parser then traverses the tree and executes every callback function registered for the tags it encounters. The following example demonstrates this technique and shows how easy it is to process the <load> tag using a callback processing approach.

Code: Select all

registerFunction( "//DatabaseEngines/load", "myFunction" )
processDocument()

FUNCTION myFunction(oTag)
  // do whatever you want to do with the <load> tag
RETURN(.T.)

The program code for the callback processing approach is not only much easier to read, it is also easier to maintain if the XML tag definition changes. We just register a function to be called for a specific node in the XML document structure ("//DatabaseEngines/load") and then implement the callback function to process the tag we are interested in. If the structure of an XML configuration file changes, there is no need to adapt the PRG source code to reflect the changes, because the code is executed on a "per tag" basis and is independent of the physical structure of the XML document.

Introduction to Hierarchical Parsing

Callback parsing has its advantages as described above, but there are also many advantages to hierarchical parsing. Look at the sample program named DbeLoad.Prg. This sample demonstrates the reading of an XML file that is used as a configuration for loading DBEs in an application. In this particular sample the callback technique actually uses twice as many lines of code as the hierarchical technique. Also, the callback technique requires that the <load> tags in the XML are positioned above the <build> tags to ensure that the DBEs are loaded first; otherwise a runtime error will occur. The hierarchical technique does not have this requirement.

The source file named DCXML.PRG contains the source for a function and a class that use the Xbase++ parser to create an XML DOM (Document Object Model), which is a hierarchical tree of the entire XML document as a set of node objects.

DC_Xml2ObjectTree() creates a DC_XmlNode() tree by parsing out an XML file or an XML stream. The DC_XmlNode() class contains methods for finding a child node, set of child nodes, reading attributes and content, and also rendering the object tree as text, array or XML stream. DC_XmlNode() can be used to create child nodes or a complete XML DOM for generating properly formatted XML.

Code: Select all

//DbeLoad.Xml

<?xml version="1.0"?>
<Configuration>
  <DatabaseEngines>
    <load>DBFDBE</load>
    <load>NTXDBE</load>
    <load>CDXDBE</load>
    <build name="DBFNTX">
      <data>DBFDBE</data>
      <order>NTXDBE</order>
    </build>
    <build name="DBFCDX">
      <data>DBFDBE</data>
      <order>CDXDBE</order>
    </build>
  </DatabaseEngines>
</Configuration>

//DbeLoad.Prg

FUNCTION ProcessConfig1(cFileName)  // Callback processing

LOCAL nXMLDoc,nActions

nXMLDoc := XMLDocOpenFile(cFileName)

nActions := XMLDocSetAction(nXMLDoc, "//DatabaseEngines/load",;
                            {|n,c,a,ch|handleLoad(n,c,a,ch)})

nActions := XMLDocSetAction(nXMLDoc, "//DatabaseEngines/build",;
                            {|n,c,a,ch|handleBuild(n,c,a,ch)})

IF nActions != 0
  XMLDocProcess(nXMLDoc)
  XMLDocResetAction(nXMLDoc)
ENDIF
XMLDocClose(nXMLDoc)

RETURN nil

/* Callback function to handle the DbeLoad action */
STATIC FUNCTION handleLoad(cTag,cContent,aMember,nH)

LOCAL aCH

XMLGetTag(nH,@aCH)
DbeLoad(aCH[XMLTAG_CONTENT],.F.)

RETURN (XML_PROCESS_CONTINUE)

/* Callback function to handle DbeBuild action */
STATIC FUNCTION handleBuild(cTag,cContent,aMember,nH)

LOCAL aCH :={}, nHChild,cName,cOrder,cData

cName := XMLGetAttribute(nH,"name")

nHChild := XMLGetChild(nH,"data")
XMLGetTag(nHChild,@aCH)
cData := aCH[XMLTAG_CONTENT]

nHChild := XMLGetChild(nH,"order")
XMLGetTag(nHChild,@aCH)
cOrder:= aCH[XMLTAG_CONTENT]

DbeBuild(cName,cData,cOrder)

RETURN (XML_PROCESS_CONTINUE)

* ==============

FUNCTION ProcessConfig2(cFileName) // Hierarchical processing

LOCAL oRootNode, aNodes, oNode, i, cName, cData, cOrder

oRootNode := DC_Xml2ObjectTree(cFileName)

aNodes := oRootNode:findNode({'DatabaseEngines','load'},.t.,.t.)

FOR i := 1 TO Len(aNodes)
  DbeLoad(aNodes[i]:content)
NEXT

aNodes := oRootNode:findNode({'DatabaseEngines','build'},.t.,.t.)

FOR i := 1 TO Len(aNodes)
  cName := aNodes[i]:getAttr("name")
  cData := aNodes[i]:findNode('data'):content
  cOrder := aNodes[i]:findNode('order'):content
  DbeBuild(cName,cData,cOrder)
NEXT

RETURN nil
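
The ordering requirement of the callback version (<load> tags ahead of <build> tags) could probably be relaxed by letting the build callback only record what it finds and calling DbeBuild() once the document has been processed. This is a sketch under the same assumptions as the listing above, not tested code. ProcessConfig3(), queueBuild() and saBuilds are hypothetical names. handleLoad() is the STATIC callback from DbeLoad.Prg, so this would have to live in the same PRG, with the STATIC declaration placed above the first function:

```xbase
STATIC saBuilds := {}   // hypothetical queue of {name, data, order} entries

FUNCTION ProcessConfig3(cFileName)  // Callback processing, builds deferred

LOCAL nXMLDoc, nActions, i

saBuilds := {}
nXMLDoc  := XMLDocOpenFile(cFileName)

nActions := XMLDocSetAction(nXMLDoc, "//DatabaseEngines/load",;
                            {|n,c,a,ch|handleLoad(n,c,a,ch)})

nActions := XMLDocSetAction(nXMLDoc, "//DatabaseEngines/build",;
                            {|n,c,a,ch|queueBuild(n,c,a,ch)})

IF nActions != 0
  XMLDocProcess(nXMLDoc)
  XMLDocResetAction(nXMLDoc)
ENDIF
XMLDocClose(nXMLDoc)

/* Every <load> tag has been handled by now, so the builds can run */
FOR i := 1 TO Len(saBuilds)
  DbeBuild(saBuilds[i][1], saBuilds[i][2], saBuilds[i][3])
NEXT

RETURN nil

/* Callback: remember the build definition instead of executing it */
STATIC FUNCTION queueBuild(cTag,cContent,aMember,nH)

LOCAL aCH := {}, nHChild, cName, cData, cOrder

cName := XMLGetAttribute(nH,"name")

nHChild := XMLGetChild(nH,"data")
XMLGetTag(nHChild,@aCH)
cData := aCH[XMLTAG_CONTENT]

nHChild := XMLGetChild(nH,"order")
XMLGetTag(nHChild,@aCH)
cOrder := aCH[XMLTAG_CONTENT]

AAdd(saBuilds, {cName,cData,cOrder})

RETURN (XML_PROCESS_CONTINUE)
```

The trade-off is a few more lines and a file-wide array in exchange for a configuration file whose tags can appear in any order.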

Piotr D · #14 Post by **Piotr D** » Thu Jul 28, 2016 10:57 am

Roger,
thanks for your quick answer. I now understand the options for reading XML files better. Your DC_Xml2ObjectTree() was very convenient for me, but now I need to study the ASXML library and the non-hierarchical technique more closely. Thanks for your rich explanation.

Piotr

#15 Post by **rdonnay** » Thu Jul 28, 2016 11:24 am

It appears that your XML contains data records of different types:

ZOiS
Dziennik
KontoZapis

You would probably need to create 3 databases of those names and then populate the fields from the tag data.

After looking at this some more, I'm not sure that using callbacks is a good idea.
Instead, you may want to look at the DumpTag() function of _DCXML.PRG.

You could modify this code to write to a database or an array instead of creating a DC_XmlNode() object for each tag.

This gives me some ideas.
I am working on a DC_Xml2Array() function that should be much faster than DC_Xml2ObjectTree().

#16 Post by **rdonnay** » Thu Jul 28, 2016 12:57 pm

I had hoped that I could improve on loading of large XML files but now it appears that is nearly impossible.

The Alaska XML parser first traverses the entire XML document to test for structure errors and to load the document into memory. It takes several minutes before the XML processing can even begin.
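
To see where the time goes for a particular file, the open step can be timed on its own. This is a rough sketch that assumes the up-front pass happens inside XMLDocOpenFile(); TimeXmlOpen() is a hypothetical helper name:

```xbase
/* Report how long the open step takes for a given file */
FUNCTION TimeXmlOpen(cFileName)

LOCAL nStart, nXMLDoc, nSecs

nStart  := Seconds()                  // seconds since midnight
nXMLDoc := XMLDocOpenFile(cFileName)
nSecs   := Seconds() - nStart         // not meaningful across midnight

? "XMLDocOpenFile():", nSecs, "seconds"

XMLDocClose(nXMLDoc)

RETURN nSecs
```

If most of the time is spent here, changing the callbacks will not help, which matches the conclusion above.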

It looks like your XML may require a different kind of parser altogether.

You may want to search the internet for one.

c-tec · #17 Post by **c-tec** » Fri Jul 29, 2016 12:35 am

Hello,
I think Chilkat XML can handle it, and it's free (ActiveX).
regards
Rudolf

#18 Post by **rdonnay** » Fri Jul 29, 2016 6:11 am

Chilkat makes good products. I use it for SFTP.

Piotr D · #19 Post by **Piotr D** » Fri Jul 29, 2016 8:53 am

Hi Rudolf,
thanks for your suggestion. I will try this.

Piotr

#20 Post by **rdonnay** » Fri Jul 29, 2016 9:11 am

I just looked at the docs.
I think I can still help you without needing Chilkat but I need to do some experiments with CDATA and Base64.

bb.donnay-software.com
