WARNING! This is release-early, release-often code.
So expect bugs.  Use at your own risk!
Read http://www.gnu.org/licenses/lgpl.html

This is a plain vanilla version of xml2sql, called xml2sql-v.

I wanted to make something more nifty and complete.
However, all I got around to releasing was this.
It's quite useful, though.  This program needs expat.


What it does:

It transforms any well-formed XML file into SQL INSERT statements,
so that you can postprocess the XML data via SQL.
However, the XML file is neither validated
nor checked to contain anything reasonable,
so you can insert complete garbage into the database.
The INSERT statements are encoded in UTF-8.

There are three little helpers:
latin1-utf8	transforms Latin1 characters into UTF-8
entityfix	transforms HTML entities into XML entities
utf8-latin1	transforms UTF-8 characters into Latin1
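
The two charset helpers can be pictured roughly as follows.  This is only
a minimal Python sketch of the idea, not the helpers' actual code: the
function names are made up, and how the real helpers treat input that is
already UTF-8 or characters that do not exist in Latin1 is an assumption
(here unmappable characters become "?").

```python
def latin1_to_utf8(data: bytes) -> bytes:
    # Interpret every input byte as one Latin1 character, re-encode as UTF-8.
    return data.decode("latin-1").encode("utf-8")


def utf8_to_latin1(data: bytes) -> bytes:
    # Assumption: characters outside Latin1 are replaced by "?".
    return data.decode("utf-8").encode("latin-1", errors="replace")
```

For example, the single Latin1 byte 0xE4 (a-umlaut) becomes the two UTF-8
bytes 0xC3 0xA4, and utf8_to_latin1 turns them back into 0xE4.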


Usage:

xml2sql-v [-a] "id" [tableprefix]

-a		Alternate output format: only ' is transformed into '',
		INSERT statements can span multiple lines,
		and NUL characters are not escaped properly.
tableprefix	defaults to t_xmltosql_
"id"		is copied literally into the INSERT statements.
		This way you can supply more than one field for c_id.
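
To illustrate what "copied literally" and the -a quoting mean, here is a
small Python sketch.  It is not the program's code, and the exact layout
of the generated statements (no column list, NULL for missing values) is
an assumption; quote_alt and ent_insert are hypothetical names.  The point
is that the id is pasted in unchanged, so a string id has to bring its own
SQL quotes.

```python
def quote_alt(value):
    # -a style quoting as described above: only ' is doubled.
    if value is None:
        return "NULL"  # assumption: missing values are written as NULL
    return "'" + value.replace("'", "''") + "'"


def ent_insert(id_literal, nr, depth, ent, tag, val, prefix="t_xmltosql_"):
    # id_literal is pasted in unchanged, so it may hold several
    # comma-separated SQL values if the table has that many id columns.
    return "INSERT INTO %sent VALUES (%s, %d, %d, %d, %s, %s);" % (
        prefix, id_literal, nr, depth, ent, quote_alt(tag), quote_alt(val))
```

Called as ent_insert("'test'", 3, 2, 1, "two", "it's"), the sketch yields
a statement whose last value is 'it''s'.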


How it works:

It basically needs some tables like this (MySQL syntax):

create table t_xmltosql_ent
(
c_id	char(8),
c_nr	integer,
c_depth	integer,
c_ent	integer,
c_tag	char(30),
c_val	text,

primary key (c_id, c_nr)
);

create table t_xmltosql_att
(
c_id	char(8),
c_nr	integer,
c_ent	integer,
c_att	char(30),
c_val	text,

primary key (c_id, c_nr)
);


xml2sql-v "test"
transforms the following XML file

<data>
  <entry>
    <one att1="a1" att2=""/>
    <two>text</two>
  </entry>
</data>

into INSERT statements that basically fill the tables as follows:

t_xmltosql_ent
c_id	c_nr	c_depth	c_ent	c_tag	c_val
"test"	0	0	0	"data"	""
"test"	1	1	0	"entry"	""
"test"	2	2	1	"one"	null
"test"	3	2	1	"two"	"text"

t_xmltosql_att
c_id	c_nr	c_ent	c_att	c_val
"test"	0	2	"att1"	"a1"
"test"	1	2	"att2"	""
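
The numbering in the two tables above can be reproduced with a few lines
of Python on top of expat.  This is a sketch to show the idea, not the
program's real code: xml_to_rows is a made-up name, and the rule for c_val
(null if the element had no character data at all, otherwise the text with
surrounding whitespace stripped) is an assumption that merely matches the
example.

```python
import xml.parsers.expat


def xml_to_rows(xml_text, row_id):
    ent_rows = []  # [c_id, c_nr, c_depth, c_ent, c_tag, c_val]
    att_rows = []  # (c_id, c_nr, c_ent, c_att, c_val)
    stack = []     # c_nr of the currently open elements

    def start(tag, attrs):
        nr = len(ent_rows)
        parent = stack[-1] if stack else 0  # the root points at itself
        ent_rows.append([row_id, nr, len(stack), parent, tag, None])
        # ordered_attributes gives a flat [name, value, name, value, ...] list
        for i in range(0, len(attrs), 2):
            att_rows.append((row_id, len(att_rows), nr, attrs[i], attrs[i + 1]))
        stack.append(nr)

    def chars(data):
        if stack:
            row = ent_rows[stack[-1]]
            row[5] = (row[5] or "") + data

    def end(tag):
        row = ent_rows[stack.pop()]
        if row[5] is not None:
            row[5] = row[5].strip()  # assumption, see above

    parser = xml.parsers.expat.ParserCreate()
    parser.ordered_attributes = True
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return [tuple(r) for r in ent_rows], att_rows
```

In t_xmltosql_ent the column c_ent holds the c_nr of the parent element,
while in t_xmltosql_att it holds the c_nr of the element that carries the
attribute.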

The output consists of (unordered!) INSERT statements for the database.
It is safe against common problems such as ' or \n in the text if you use MySQL mode.
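
Once the rows are in the database, the postprocessing is plain SQL.  The
following Python sketch uses an in-memory SQLite database as a stand-in
(an assumption made only to keep the example self-contained; the table
definitions above are MySQL syntax) and loads the example rows by hand.
The function name is made up.

```python
import sqlite3


def attributes_with_tags():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table t_xmltosql_ent (c_id char(8), c_nr integer,
            c_depth integer, c_ent integer, c_tag char(30), c_val text,
            primary key (c_id, c_nr));
        create table t_xmltosql_att (c_id char(8), c_nr integer,
            c_ent integer, c_att char(30), c_val text,
            primary key (c_id, c_nr));
    """)
    conn.executemany("insert into t_xmltosql_ent values (?,?,?,?,?,?)", [
        ("test", 0, 0, 0, "data", ""),
        ("test", 1, 1, 0, "entry", ""),
        ("test", 2, 2, 1, "one", None),
        ("test", 3, 2, 1, "two", "text"),
    ])
    conn.executemany("insert into t_xmltosql_att values (?,?,?,?,?)", [
        ("test", 0, 2, "att1", "a1"),
        ("test", 1, 2, "att2", ""),
    ])
    # Attach each attribute to the tag of the element that carries it.
    rows = conn.execute("""
        select e.c_tag, a.c_att, a.c_val
        from t_xmltosql_att a
        join t_xmltosql_ent e on e.c_id = a.c_id and e.c_nr = a.c_ent
        order by a.c_nr
    """).fetchall()
    conn.close()
    return rows
```

The join key is (c_id, c_ent) on the attribute side against (c_id, c_nr)
on the element side, so several XML files can live in the same tables as
long as their ids differ.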


Examples:

You have an XML file which claims to be UTF-8,
but the sender did a half-hearted job:
- Some known Latin1 characters were transformed into their HTML entities, but
  entities not covered by the XML standard are not declared in the XML file.
- Some "forgotten" Latin1 characters are not transformed at all.
Additionally, you want to insert the output directly as Latin1
into a database which is not MySQL:

a="filename.xml"
cat "$a" |
latin1-utf8 |
entityfix |
xml2sql-v -a "'$a'" |
utf8-latin1 |
query {database-args}
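
What the entityfix step has to achieve can be sketched in Python like
this.  It is an illustration only, not the helper's code: fix_entities is
a made-up name, and rewriting named HTML entities as numeric character
references is an assumption about how "XML entities" are produced.  The
five entities predefined by XML and unknown names are left alone.

```python
import re
from html.entities import name2codepoint

# The only named entities an XML parser knows without a declaration.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def fix_entities(text):
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave untouched
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)
```

With this, "M&uuml;ller &amp; S&ouml;hne" becomes
"M&#252;ller &amp; S&#246;hne", which expat can parse without a DTD.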
