| About CSV2XML and XML2CSV Utilities

CSV2XML and XML2CSV are Java-based command-line utilities that convert XML files to CSV format and vice versa. CSV2XML takes a CSV file as input and produces an XML file in which each record from the CSV file is converted to an XML record. XML2CSV takes an XML file as input and produces a CSV file in which each record in the XML file is converted to a CSV record.

The converters are flexible enough to accommodate a wide range of file structures. This flexibility is achieved through an XML configuration file in which you specify various aspects of the source and output file formats. The converters also accept command-line arguments that further customize aspects of the conversion.

License

The utilities herein are distributed under the terms of the GNU General Public License (GPL).

Contributors Welcome!

Feature requests, bug reports, and volunteer coders are all welcome.

Download the latest release here.

The converters are Java applications. They must be installed on a machine running Sun's JRE 1.3 or later.

The converters use XML configuration files to specify conversion rules. The configuration files must adhere to their respective DTDs: csvtoxml.dtd and xmltocsv.dtd.

Converting files from one format to another involves the following steps:
- Getting source file (xml or csv).
- Writing the XML configuration file.
- Defining additional conversion information, such as conversion rules for special characters and the delimiter for the CSV file.
- Running the batch file or shell script.

CSV2XML

The input for CSV2XML is any CSV file with the following characteristics:
- Null fields are represented as empty (,,).
- Non-null fields are double-quoted.
- Empty non-null fields are represented as strings of length 0 ("").
- Double quotes within a field are doubled (as in "This "" is a double quote.").
- Control characters (such as tabs and new lines) are preserved.

The configuration file defines which CSV fields are converted into which XML nodes and attribute values. (See the Samples section for examples of the configuration files.)
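
The CSV input conventions listed above can be illustrated with a small parser. This is only a sketch in Python, not the converter's own Java code: the function name parse_csv_record is hypothetical, and it assumes the whole record (including any embedded new lines) is passed in as a single string. Malformed input, such as text following a closing quote, is not handled.

```python
def parse_csv_record(record, delimiter=","):
    """Split one CSV record into fields; None marks a null field."""
    fields = []
    i, n = 0, len(record)
    while True:
        if i < n and record[i] == '"':
            # Quoted (non-null) field: read up to the closing quote.
            i += 1
            chars = []
            while i < n:
                if record[i] == '"':
                    if i + 1 < n and record[i + 1] == '"':
                        chars.append('"')  # doubled quote is a literal quote
                        i += 2
                        continue
                    i += 1  # closing quote
                    break
                chars.append(record[i])  # tabs and new lines are kept as-is
                i += 1
            fields.append("".join(chars))
        else:
            # Unquoted field: empty means null; anything else is kept raw.
            start = i
            while i < n and record[i] != delimiter:
                i += 1
            raw = record[start:i]
            fields.append(raw if raw else None)
        if i < n and record[i] == delimiter:
            i += 1
            continue
        break
    return fields
```

The distinction between None (null, written as ,,) and "" (empty non-null, written as "") is what the -null command-line parameter acts on.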

The most difficult and error-prone part of the conversion process is writing the configuration file. It defines the structure of the output XML file and the values of its nodes and attributes. The configuration file must conform to csvtoxml.dtd, which specifies the following elements.

Element Name | Description |
Element | Defines an XML element. The element must have the name attribute and may have the value attribute. The value attribute can be a fixed value or can be taken from the CSV file. To take the value from the CSV file, use the format csv:column_name, which tells the converter which CSV column to use for the XML element's value. Elements can have aliases. Elements can specify namespaces to use within the document. Example: <element name="FirstName" value="csv:fname"> takes the value from the CSV file. <element name="FirstName" value="Jones"> uses "Jones" as the value for all FirstName elements in the XML file. |
Attribute | Defines an XML attribute. Exactly the same rules apply to attributes as to elements. Example: <attribute name="stateOfMind" value="csv:mind"/> |
Alias | Defines an alternative column name in the source CSV file to be used for the same element or attribute in the output XML. In the example below, the value for the FirstName element can come from any of three different columns. If none of the columns is present, the value "none" is output. Example: <element name="FirstName" value="csv:fname"> <alias value="csv:f_name"/> <alias value="csv:first_name"/> <alias value="none"/> </element> |

CSV2XML supports the following command-line parameters. (See Samples for examples of various command-line options in use.)

Parameter | Description |
-config config_file | Specifies the configuration file to be used for conversion. |
-src csv_source_file | Identifies the source CSV file to be converted. |
-out out_xml_file | Defines the output XML file. This is the result of the conversion. |
-null replacement_value | Optional parameter to replace null fields in the source CSV file. Default behavior is to convert null fields to empty strings (""). |
-tab replacement_value | Optional parameter to replace \t. Default behavior is to keep tabs intact. |
-specialchar replacement_value | Optional parameter to replace special characters. Default behavior is to convert them to spaces. Special characters are those outside the following ranges: x9, xA, xD, x20-xD7FF, xE000-xFFFD, x10000-x10FFFF. |
-newline replacement_value | Optional parameter to replace \n and \r. Default behavior is to keep these characters intact. |
-profile | Optional parameter to turn profiling on. The following profile information is provided: File Size: input file size in bytes; Number of Records: number of records in the input file; Time to process file: time taken for conversion, in the format d:h:m:s:m |

To run the converter, execute the batch file or shell script for your environment.
- Windows run.bat [options]
- UNIX run.sh [options]
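
The -specialchar rule from the parameter table can be expressed as a range check. The ranges given there are the same as the character ranges allowed by the XML 1.0 specification. The following Python sketch is an illustration only, not the converter's implementation; the function names is_xml_char and replace_special_chars are hypothetical.

```python
def is_xml_char(ch):
    """True if ch falls inside the ranges listed for -specialchar."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def replace_special_chars(text, replacement=" "):
    """Replace every character outside the allowed ranges.

    The default replacement is a space, mirroring the documented
    default behavior of the -specialchar parameter.
    """
    return "".join(ch if is_xml_char(ch) else replacement for ch in text)
```

For example, a NUL byte or a vertical tab (x0B) in a field would be replaced, while a tab (x9) would be left for the -tab parameter to handle.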

NOTE: Please remember to escape and/or double-quote special/reserved characters when using them as parameters in the batch or shell script.

Namespace Support

The CSV2XML converter supports XML namespaces. The configuration file declares all XML namespaces used in the output document. See the sample below.

Config.xml

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE csvtoxml SYSTEM ".\csvtoxml.dtd">
<csvtoxml xmlns="uri:csv-to-xml-v1">
  <element name="Profiles" xmlns:a="uri:foo/a" xmlns:b="http://foo/b">
    <element name="Profile">
      <attribute name="stateOfMind" value="csv:mind"/>
      <element name="FirstName" value="csv:fname">
        <alias value="csv:f_name" />
        <alias value="csv:first_name"/>
        <alias value="none" />
      </element>
      <element name="LastName" value="csv:lname">
        <attribute name="a:state" value="csv:state">
          <alias value="csv:s_state" />
          <alias value="none" />
        </attribute>
      </element>
      <element name="b:Email" value="csv:email" />
    </element>
  </element>
</csvtoxml>

Output.xml

<?xml version="1.0" encoding="UTF-8" ?>
<Profiles xmlns:a="uri:foo/a" xmlns:b="http://foo/b">
  <Profile stateOfMind="confused">
    <FirstName>Sam</FirstName>
    <LastName a:state="ca">Smith</LastName>
    <b:Email>ssmith@hotmail.com</b:Email>
  </Profile>
  <Profile stateOfMind="enlighten">
    <FirstName>Joe</FirstName>
    <LastName a:state="va">Mon'dave</LastName>
    <b:Email>joemondave@yahoo.com</b:Email>
  </Profile>
  <Profile stateOfMind="absent">
    <FirstName />
    <LastName a:state="wi">LaMonde</LastName>
    <b:Email>cheese@infoseek.com</b:Email>
  </Profile>
</Profiles>

XML2CSV

The input for XML2CSV is any valid XML file in which all nodes and attributes are required. In other words, XML fragments (records) must be topologically identical. If a node or attribute is present in a fragment, it must be present in all fragments throughout the file.
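
One way to check this requirement before running the converter is to compare the structure of every record. The Python sketch below is an illustration only and is not part of the utilities: the function names shape and records_are_uniform are hypothetical, and it assumes that "topologically identical" means the same element names, the same attribute names, and the same child elements in the same order.

```python
import xml.etree.ElementTree as ET


def shape(elem):
    """Describe an element's structure, ignoring text and attribute values."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib)),        # attribute names only
        tuple(shape(child) for child in elem),
    )


def records_are_uniform(xml_text, record_tag):
    """True if every record_tag element in the document has the same shape."""
    root = ET.fromstring(xml_text)
    shapes = {shape(rec) for rec in root.iter(record_tag)}
    return len(shapes) <= 1
```

For a namespaced document, record_tag would need ElementTree's {uri}name form, for example "{uri:foo/a}Profile".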

The configuration file defines how XML elements and attributes map to CSV fields. As with CSV2XML, constructing the configuration file is the most difficult and error-prone part of the conversion process. This file must conform to xmltocsv.dtd, which specifies the following elements.

Element Name | Description |
Fields | Topmost container element of the configuration file. It encapsulates a Parent element and one or more Field elements. |
Parent | Context information. It defines the beginning of the repeating content. Note that the current implementation is not optimized, and there is an overlap in defining a Parent node and a Field node. |
Field | Specifies the CSV field. It must have the name attribute, whose value specifies the column name in the output CSV file. The Field element contains the Value element. |
Value | A container of one or more Element and/or Attribute elements. Together they specify the "path" to the XML source element or attribute to be used as input for the CSV field. |
Element | Specifies the source XML node. It requires the name attribute, in which the node's name is specified. |
Attribute | Specifies the source XML attribute. It requires the name attribute, in which the attribute's name is specified. |
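
The "path" idea behind the Value element can be sketched as a walk from a record element down to the node or attribute that supplies the field. The Python code below is an assumption-laden illustration, not the converter's logic: the names qname and extract_field are hypothetical, a path is modeled as a list of (kind, name) steps whose first step names the record element itself, and prefixes are resolved through a plain prefix-to-URI dictionary. How the real converter resolves prefixed attribute names is not specified here, so the usage example uses an unprefixed attribute.

```python
import xml.etree.ElementTree as ET


def qname(name, ns):
    """Expand 'prefix:local' to ElementTree's '{uri}local' form."""
    if ":" in name:
        prefix, local = name.split(":", 1)
        return "{%s}%s" % (ns[prefix], local)
    return name


def extract_field(record, steps, ns):
    """Follow (kind, name) steps starting at record; return a string or None."""
    kind, name = steps[0]
    if kind != "element" or record.tag != qname(name, ns):
        return None
    node = record
    for kind, name in steps[1:]:
        if kind == "attribute":
            return node.get(qname(name, ns))   # attribute ends the path
        node = node.find(qname(name, ns))      # first matching child
        if node is None:
            return None
    return (node.text or "").strip()


ns = {"a": "uri:foo/a"}
doc = ET.fromstring(
    '<Profiles xmlns="uri:foo/a"><Profile>'
    '<FirstName state="ca">Sam</FirstName><LastName>Smith</LastName>'
    "</Profile></Profiles>"
)
record = doc.find(qname("a:Profile", ns))
fname = extract_field(
    record, [("element", "a:Profile"), ("element", "a:FirstName")], ns
)
state = extract_field(
    record,
    [("element", "a:Profile"), ("element", "a:FirstName"), ("attribute", "state")],
    ns,
)
```

Each Field in the configuration would correspond to one such list of steps and one column in the output CSV file.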

XML2CSV supports the following command-line parameters. (See Samples for examples of various command-line options in use.)

Parameter | Description |
-config config_file | Specifies the configuration file to be used for conversion. |
-src xml_source_file | Identifies the source XML file to be converted. |
-out out_csv_file | Specifies the output CSV file. This is the result of the conversion. |
-delimiter delimiter_value | Optional parameter to specify the delimiter to be used in the CSV file. Default behavior is to use a comma (,). |
-tab replacement_value | Optional parameter to replace \t. Default behavior is to convert tabs to spaces. |
-profile | Optional parameter to turn profiling on. The following profile information is provided: File Size: input file size in bytes; Number of Records: number of records in the input file; Time to process file: time taken for conversion, in the format d:h:m:s:m |

To run the converter, execute the batch file or shell script for your environment.
- Windows run.bat [options]
- UNIX run.sh [options]

NOTE: Please remember to escape and/or double-quote special/reserved characters when using them as parameters in the batch or shell script.

Namespace Support

The XML2CSV converter supports XML namespaces. The configuration file declares all XML namespaces used in the input document. See the sample below.

Input.xml

<?xml version="1.0" encoding="UTF-8" ?>
<Profiles xmlns="uri:foo/a" xmlns:x="uri:foo/x">
  <Profile>
    <FirstName state="ca">SamHa</FirstName>
    <LastName>Smith</LastName>
    <EmailAddress xmlns="http://foo/y">ssmith@hotmail.com
      <Server>hotmail</Server>
    </EmailAddress>
    <Company>Google
      <Server>google</Server>
    </Company>
  </Profile>
  <Profile>
    <FirstName state="va">Joe</FirstName>
    <LastName>Mon"dave</LastName>
    <EmailAddress xmlns="http://foo/y">joemondave@yahoo.com</EmailAddress>
    <Company>joe"mondave</Company>
  </Profile>
  <Profile>
    <FirstName state="wi" />
    <LastName>LaMonde</LastName>
    <EmailAddress xmlns="http://foo/y">cheese@infoseek.com</EmailAddress>
    <Company>ace</Company>
  </Profile>
</Profiles>

Config.xml

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE xmltocsv SYSTEM ".\xmltocsv.dtd">
<xmltocsv xmlns="uri:xml-to-csv-v1" xmlns:a="uri:foo/a" xmlns:b="uri:foo/x" xmlns:c="http://foo/y" xmlns:d="http://foo/z">
  <fields>
    <parent>
      <element name="a:Profiles" />
      <element name="a:Profile" />
    </parent>
    <field name="email">
      <value>
        <element name="a:Profile" />
        <element name="c:EmailAddress" />
      </value>
    </field>
    <field name="state">
      <value>
        <element name="a:Profile" />
        <element name="a:FirstName"/>
        <attribute name="a:state" />
      </value>
    </field>
    <field name="fname">
      <value>
        <element name="a:Profile" />
        <element name="a:FirstName"/>
      </value>
    </field>
    <field name="lname">
      <value>
        <element name="a:Profile" />
        <element name="a:LastName" />
      </value>
    </field>
  </fields>
</xmltocsv>

XML2CSV
- Convert XML distribution list to CSV.
Configuration File: configdistributionlist.xml
Source File: srcdistributionlist.xml
Output File: outdistributionlist.csv
Command Line
Windows: run -config configdistributionlist.xml -src srcdistributionlist.xml -out outdistributionlist.csv
UNIX:
- Convert XML profile list to CSV.
Configuration File: configprofile.xml
Source File: srcprofile.xml
Output File: outprofile.csv
Command Line
Windows: run -config configprofile.xml -src srcprofile.xml -out outprofile.csv
UNIX:

CSV2XML
- Convert CSV file into XML file.
Configuration File: configcampaignlog.xml
Source File: srccampaign.csv
Output File: outcampaignlog.xml
Command Line
Windows: run -config configcampaignlog.xml -src srccampaignlog.csv -out outcampaignlog.xml
UNIX:
- Convert CSV distribution list into XML file.
Configuration File: configdistributionlist.xml
Source File: typesofemailnamelist.csv
Output File: outdistributionlist.xml
Command Line
Windows: run -config configdistributionlist.xml -src TypesOfEmailNameList.csv -out outdistributionlist.xml
UNIX:

CSV2XML
- Only ANSI files are supported for input. Output XML is encoded in UTF-8.

XML2CSV
- xmltocsv.dtd defines seven namespaces. If the source document uses more than seven namespaces, the DTD must be modified to define them.
CSV2XML XML2CSV - New line characters \r and \n are being lost in conversion.
|