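Parsing "unknown" XML with the DOM parser
Here we iterate over every element present in the XML document tree.
That way, as soon as we come across the information we need while walking the tree, we can use it.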
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class ParseUnknownXMLStructure {
    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
        // Get the document builder
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Build the document
        Document document = builder.parse(new File("employees.xml"));

        // Normalize the tree: merges adjacent text nodes and removes empty ones
        document.getDocumentElement().normalize();

        // Here comes the root node
        Element root = document.getDocumentElement();
        System.out.println(root.getNodeName());

        // Get all employees
        NodeList nList = document.getElementsByTagName("employee");
        System.out.println("============================");

        visitChildNodes(nList);
    }

    // This function is called recursively
    private static void visitChildNodes(NodeList nList) {
        for (int temp = 0; temp < nList.getLength(); temp++) {
            Node node = nList.item(temp);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                System.out.println("Node Name = " + node.getNodeName() + "; Value = " + node.getTextContent());

                // Check all attributes
                if (node.hasAttributes()) {
                    // Get attribute names and values
                    NamedNodeMap nodeMap = node.getAttributes();
                    for (int i = 0; i < nodeMap.getLength(); i++) {
                        Node tempNode = nodeMap.item(i);
                        System.out.println("Attr name : " + tempNode.getNodeName() + "; Value = " + tempNode.getNodeValue());
                    }
                }

                // Visit children whether or not the element has attributes
                if (node.hasChildNodes()) {
                    visitChildNodes(node.getChildNodes());
                }
            }
        }
    }
}
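To experiment with this kind of traversal without a file on disk, the following is a minimal sketch that walks an XML string and records the path of every element it meets. The class name `ElementPathCollector` and the method `collectPaths` are hypothetical names chosen for illustration; only the standard JAXP and DOM calls come from the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original article.
class ElementPathCollector {

    // Parses an XML string and returns the slash-separated path of every element.
    static List<String> collectPaths(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        List<String> paths = new ArrayList<>();
        walk(document.getDocumentElement(), "", paths);
        return paths;
    }

    // Depth-first walk; text, comment and other non-element nodes are skipped.
    private static void walk(Node node, String parentPath, List<String> paths) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        String path = parentPath + "/" + node.getNodeName();
        paths.add(path);
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), path, paths);
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees><employee id=\"111\">"
                + "<firstName>JackLi</firstName></employee></employees>";
        // Prints /employees, /employees/employee and /employees/employee/firstName
        collectPaths(xml).forEach(System.out::println);
    }
}
```

Carrying the parent path down the recursion is one possible design; it lets the caller recognise an element by where it sits in the tree even when the structure is not known in advance.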
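A DOM parser is designed to work with XML as an in-memory object graph (a tree-like structure), the so-called Document Object Model (DOM).
First, the parser traverses the input XML file and creates DOM objects corresponding to the nodes in the XML file.
These DOM objects are linked together in a tree-like structure.
Once the parser has finished parsing, it hands us this tree of DOM objects.
We can then traverse the DOM structure back and forth to read, update, or delete data as needed.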
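As a rough illustration of updating and deleting data in the parsed tree, the sketch below changes one employee's location and removes another employee. The class `DomEditSketch` and its methods are hypothetical, and the XML is supplied inline as an assumption so the example runs on its own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original article.
class DomEditSketch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Sets the <location> text of the employee with the given id; false if not found.
    static boolean updateLocation(Document document, String id, String newLocation) {
        NodeList employees = document.getElementsByTagName("employee");
        for (int i = 0; i < employees.getLength(); i++) {
            Element employee = (Element) employees.item(i);
            if (id.equals(employee.getAttribute("id"))) {
                NodeList locations = employee.getElementsByTagName("location");
                if (locations.getLength() == 0) {
                    return false;
                }
                locations.item(0).setTextContent(newLocation);
                return true;
            }
        }
        return false;
    }

    // Detaches the employee with the given id from its parent; false if not found.
    static boolean removeEmployee(Document document, String id) {
        NodeList employees = document.getElementsByTagName("employee");
        for (int i = 0; i < employees.getLength(); i++) {
            Element employee = (Element) employees.item(i);
            if (id.equals(employee.getAttribute("id"))) {
                employee.getParentNode().removeChild(employee);
                return true; // stop here: the NodeList is live and has just changed
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        Document document = parse("<employees>"
                + "<employee id=\"111\"><location>Netherlands</location></employee>"
                + "<employee id=\"222\"><location>Russia</location></employee>"
                + "</employees>");
        updateLocation(document, "111", "Germany");
        removeEmployee(document, "222");
        System.out.println(document.getElementsByTagName("employee").getLength());
    }
}
```

These changes affect only the in-memory tree; writing the result back to a file would need a separate serialization step, which is outside the scope of this sketch.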
Reading XML with the DOM parser
Test XML file employees.xml
<employees>
    <employee id="111">
        <firstName>JackLi</firstName>
        <lastName>Gupta</lastName>
        <location>Netherlands</location>
    </employee>
    <employee id="222">
        <firstName>JackLi</firstName>
        <lastName>Gussin</lastName>
        <location>Russia</location>
    </employee>
    <employee id="333">
        <firstName>Tomm</firstName>
        <lastName>Feezor</lastName>
        <location>USA</location>
    </employee>
</employees>
// Get the document builder
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();

// Build the document
Document document = builder.parse(new File("employees.xml"));

// Normalize the tree: merges adjacent text nodes and removes empty ones
document.getDocumentElement().normalize();

// Here comes the root node
Element root = document.getDocumentElement();
System.out.println(root.getNodeName());

// Get all employees
NodeList nList = document.getElementsByTagName("employee");
System.out.println("============================");

for (int temp = 0; temp < nList.getLength(); temp++) {
    Node node = nList.item(temp);
    System.out.println(""); // Just a separator
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        // Print each employee's details
        Element eElement = (Element) node;
        System.out.println("Employee id : " + eElement.getAttribute("id"));
        System.out.println("First Name : " + eElement.getElementsByTagName("firstName").item(0).getTextContent());
        System.out.println("Last Name : " + eElement.getElementsByTagName("lastName").item(0).getTextContent());
        System.out.println("Location : " + eElement.getElementsByTagName("location").item(0).getTextContent());
    }
}
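The code above calls `item(0).getTextContent()` directly, which throws a `NullPointerException` if an employee lacks one of the expected child elements. A minimal sketch of a more defensive lookup, assuming a hypothetical helper named `childText`, could look like this:

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original article.
class ChildTextSketch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the trimmed text of the first descendant element named tagName,
    // or an empty Optional when no such element exists.
    static Optional<String> childText(Element parent, String tagName) {
        NodeList matches = parent.getElementsByTagName(tagName);
        if (matches.getLength() == 0) {
            return Optional.empty();
        }
        return Optional.of(matches.item(0).getTextContent().trim());
    }

    public static void main(String[] args) throws Exception {
        // This employee deliberately has no <location> element.
        Document document = parse(
                "<employee id=\"111\"><firstName>JackLi</firstName></employee>");
        Element employee = document.getDocumentElement();
        System.out.println("First Name : " + childText(employee, "firstName").orElse("(missing)"));
        System.out.println("Location : " + childText(employee, "location").orElse("(missing)"));
    }
}
```

Returning an `Optional` is one design choice among several; returning a default string or throwing a descriptive exception would work just as well, depending on how strict the input format is meant to be.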
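Reading data into POJO objects
Another common requirement in real-life applications is to populate DTO objects with the information fetched in the sample code above.
I wrote a simple program to show how easily this can be done.
Suppose we have to populate Employee objects defined as follows.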
public class Employee {
    private Integer id;
    private String firstName;
    private String lastName;
    private String location;

    //Setters and Getters

    @Override
    public String toString() {
        return "Employee [id=" + id + ", firstName=" + firstName + ", lastName=" + lastName + ", location=" + location + "]";
    }
}
Java program that reads the XML file with the DOM parser and populates a list of Employee objects.
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class PopulateDTOExamplesWithParsedXML {
    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
        List<Employee> employees = parseEmployeesXML();
        System.out.println(employees);
    }

    private static List<Employee> parseEmployeesXML() throws ParserConfigurationException, SAXException, IOException {
        // Initialize a list of employees
        List<Employee> employees = new ArrayList<>();

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new File("employees.xml"));
        document.getDocumentElement().normalize();

        NodeList nList = document.getElementsByTagName("employee");
        for (int temp = 0; temp < nList.getLength(); temp++) {
            Node node = nList.item(temp);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                Element eElement = (Element) node;

                // Create a new Employee object and copy the parsed values into it
                Employee employee = new Employee();
                employee.setId(Integer.parseInt(eElement.getAttribute("id")));
                employee.setFirstName(eElement.getElementsByTagName("firstName").item(0).getTextContent());
                employee.setLastName(eElement.getElementsByTagName("lastName").item(0).getTextContent());
                employee.setLocation(eElement.getElementsByTagName("location").item(0).getTextContent());

                // Add the employee to the list
                employees.add(employee);
            }
        }
        return employees;
    }
}
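DOM parser API
These are the steps to create and use a DOM parser to parse an XML file in Java.
Import the DOM parser packages
First, we need to import the DOM parser packages into our application.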
import org.w3c.dom.*;
import javax.xml.parsers.*;
import java.io.*;
Create the DocumentBuilder
The next step is to create a DocumentBuilder object.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
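When the XML may come from an untrusted source, the factory is commonly configured before the builder is created. The following is only a sketch: `SafeBuilderSketch` and `newSafeBuilder` are hypothetical names, and the `disallow-doctype-decl` feature URI is specific to Xerces-based parsers such as the one bundled with the JDK, so other implementations may reject it.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original article.
class SafeBuilderSketch {

    // Creates a DocumentBuilder that refuses documents containing a DOCTYPE.
    static DocumentBuilder newSafeBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Assumption: a Xerces-based parser (the JDK default) that knows this feature
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder();
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder = newSafeBuilder();
        String withDoctype = "<!DOCTYPE a [<!ENTITY x \"y\">]><a>&x;</a>";
        try {
            builder.parse(new InputSource(new StringReader(withDoctype)));
            System.out.println("parsed");
        } catch (SAXException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting DOCTYPE declarations outright is the bluntest option; it is only suitable when the documents being parsed are not expected to carry a DTD.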
Create a Document object from the XML file
Read the XML file into a Document object.
Document document = builder.parse(new File(file));
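`DocumentBuilder.parse` also accepts sources other than a `File`. As a small sketch, the hypothetical class `ParseSourcesSketch` below parses the same content from a `String` (through an `InputSource`) and from an `InputStream`; the inline XML is an assumption made so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original article.
class ParseSourcesSketch {

    // Parses XML held in memory as a String.
    static Document fromString(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Parses XML from any byte stream (network response, classpath resource, ...).
    static Document fromStream(InputStream in) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(in);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees><employee id=\"111\"/></employees>";
        Document a = fromString(xml);
        Document b = fromStream(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(a.getDocumentElement().getNodeName());
        System.out.println(b.getDocumentElement().getNodeName());
    }
}
```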
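Validate the document structure
XML validation is optional, but it is good practice to validate the document before you start extracting data from it.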
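// Requires javax.xml.XMLConstants, javax.xml.validation.* and javax.xml.transform.dom.DOMSource
try {
    String language = XMLConstants.W3C_XML_SCHEMA_NS_URI;
    SchemaFactory schemaFactory = SchemaFactory.newInstance(language);
    Schema schema = schemaFactory.newSchema(new File(name));

    // Validate only once the schema has loaded successfully
    Validator validator = schema.newValidator();
    validator.validate(new DOMSource(document));
} catch (SAXException | IOException e) {
    // The schema could not be read, or the document does not conform to it
    e.printStackTrace();
}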
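To try validation without separate files, here is a minimal sketch that validates a parsed document against a small inline schema. The schema itself, the class `ValidateSketch` and the method `isValid` are all assumptions made for illustration. The factory is made namespace-aware in this sketch because validating a `DOMSource` built by a parser that is not namespace-aware tends to fail with "cannot find the declaration of element" errors.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the original article.
class ValidateSketch {

    // Assumed schema: <employees> holds any number of <employee id="int">,
    // each with exactly one <firstName>.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"employees\"><xs:complexType><xs:sequence>"
          + "<xs:element name=\"employee\" minOccurs=\"0\" maxOccurs=\"unbounded\">"
          + "<xs:complexType><xs:sequence>"
          + "<xs:element name=\"firstName\" type=\"xs:string\"/>"
          + "</xs:sequence>"
          + "<xs:attribute name=\"id\" type=\"xs:int\" use=\"required\"/>"
          + "</xs:complexType></xs:element>"
          + "</xs:sequence></xs:complexType></xs:element>"
          + "</xs:schema>";

    // Returns true when the XML parses and conforms to XSD, false when validation fails.
    static boolean isValid(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for DOM-based schema validation
        Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new DOMSource(document));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(
                "<employees><employee id=\"1\"><firstName>A</firstName></employee></employees>"));
        System.out.println(isValid(
                "<employees><employee><firstName>A</firstName></employee></employees>"));
    }
}
```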
Extract the root element
We can get the root element of the XML document with the following code.
Element root = document.getDocumentElement();
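Examine attributes
We can examine the attributes of an XML element with the following methods.
element.getAttribute("attributeName"); // returns the value of the named attribute as a String ("" if absent)
element.getAttributes();               // returns a NamedNodeMap of all the element's attributes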
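As a short sketch of both calls in use, the hypothetical class `AttributeSketch` below copies an element's attributes into a sorted map and shows how a missing attribute behaves. The `dept` attribute is invented for the example and does not appear in employees.xml.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original article.
class AttributeSketch {

    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Copies an element's attributes into a map sorted by attribute name.
    static Map<String, String> attributesOf(Element element) {
        Map<String, String> result = new TreeMap<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            result.put(attribute.getNodeName(), attribute.getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Element employee = parseRoot("<employee id=\"111\" dept=\"IT\"/>");
        System.out.println(attributesOf(employee));              // {dept=IT, id=111}
        System.out.println(employee.hasAttribute("missing"));    // false
        System.out.println(employee.getAttribute("missing").isEmpty()); // true
    }
}
```

Because `getAttribute` returns an empty string rather than `null` for an absent attribute, `hasAttribute` is the call to use when "absent" and "present but empty" need to be told apart.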
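Examine child elements
Child elements can be queried in the following ways.
element.getElementsByTagName("subElementName"); // returns a NodeList of all descendant elements with the given name
node.getChildNodes();                           // returns a NodeList of all direct child nodes (elements, text, comments)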
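The two calls differ in scope: `getElementsByTagName` searches all descendants, while `getChildNodes` returns only direct children and includes whitespace text nodes. The sketch below illustrates the difference; `ChildQuerySketch` and `childElements` are hypothetical names, and the tiny XML document is made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original article.
class ChildQuerySketch {

    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Returns only the direct children of parent that are elements.
    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // <c> appears once nested inside <b> and once as a direct child of <a>.
        Element root = parseRoot("<a>\n  <b><c/></b>\n  <c/>\n</a>");
        System.out.println("all child nodes     : " + root.getChildNodes().getLength());
        System.out.println("direct elements     : " + childElements(root).size());
        System.out.println("descendants named c : " + root.getElementsByTagName("c").getLength());
    }
}
```

Filtering on `Node.ELEMENT_NODE` is what keeps the indentation between tags from being mistaken for data.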
Date: 2020-09-17 00:10:13 Source: oir Author: oir