The Extensible Markup Language (XML) is a general-purpose specification for creating custom markup languages. It is classified as an extensible language, because it allows the user to define the mark-up elements. XML's purpose is to aid information systems in sharing structured data, especially via the Internet, to encode documents, and to serialize data; in the last context, it compares with text-based serialization languages such as JSON and YAML.
XML began as a simplified subset of the Standard Generalized Markup Language (SGML), meant to be readable by people via semantic constraints; application languages can be implemented in XML. These include XHTML, RSS, MathML, GraphML, Scalable Vector Graphics, MusicXML, and others. Moreover, XML is sometimes used as the specification language for such application languages.
XML is recommended by the World Wide Web Consortium (W3C). It is a fee-free open standard. The recommendation specifies lexical grammar and parsing requirements.
An XML document has two correctness levels:
- Well-formed. A well-formed document conforms to the XML syntax rules; e.g. if a start-tag (< >) appears without a corresponding end-tag (>), it is not well-formed. A document not well-formed is not in XML; a conforming parser is disallowed from processing it.
- Valid. A valid document additionally conforms to semantic rules, either user-defined or in an XML schema, especially DTD; e.g. if a document contains an undefined element, then it is not valid; a validating parser is disallowed from processing it.