Processing XML Files in C++
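Writing unmanaged code has become a rather miserable business in the .NET era, and the feeling gets especially strong when you have to handle XML files. System.Xml in the FCL is so simple that even Steve Ballmer knows how to use it.
Things are not always that ideal, though. What do you do when you have to process XML in a C/C++ program?
Option 1: There are a few XML libraries around, the best known being libxml. I used it a year ago and it was quite good. I even wrote a short tutorial for it, though I no longer know where I put it.
Option 2: Microsoft's MSXML, which is the one I am going to introduce here.
First, where to find the documentation in MSDN, so you have a reference as you read on. In the Index, go to: Windows Media Services 9 Series SDK=>Programming Reference=>Programming Reference (C++)=>XML DOM Interfaces (C++). What? Windows Media? Yes, really. I find this guide the clearest one; if you look up MSXML directly, the results are not as good in my opinion.
Calling MSXML from a C/C++ program basically means working with a pile of COM interfaces, but in Visual Studio a little setup is needed first:
In your project, choose Add References=>COM tab=>Microsoft XML v4.0. Version 5.0 exists too, but it ships with Office, which feels a bit odd to me, so I would rather not use it. I am unlikely to need any exotic features anyway, so 4.0 is enough.
Then add these two lines: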
#include <msxml2.h>
#import <msxml4.dll>
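That is the header file and the DLL. What? Where do they go? In a header or a c/cpp file, wherever fits best.
Now the programming starts. First define two variables you will always need: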
IXMLDOMDocumentPtr xmlFile = NULL;
IXMLDOMElement* xmlRoot = NULL;
Why are they always needed? Sigh...
The first step, of course, is to initialize COM:
if(FAILED(CoInitialize(NULL))) ....
Next, initialize the xmlFile object:
if(FAILED(xmlFile.CreateInstance("Msxml2.DOMDocument.4.0"))) ...
Then you can load the XML file:
_variant_t varXml(L"C:\\test.xml"); //L for unicode
VARIANT_BOOL varOut;
xmlFile->load(varXml, &varOut);
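One detail worth spelling out here: when the file is missing or fails to parse, load returns S_FALSE and sets varOut to VARIANT_FALSE, and S_FALSE still counts as success as far as FAILED() is concerned. The portable sketch below mirrors those semantics with stand-in definitions so that it builds without the Windows SDK. The names HResult, kSOk, kSFalse, kEFail, Failed and LoadSucceeded are made up for illustration; only the numeric values follow winerror.h.

```cpp
#include <cstdint>

// Stand-ins for the Windows definitions (assumed names, real values).
using HResult = std::int32_t;
constexpr HResult kSOk    = 0;            // S_OK
constexpr HResult kSFalse = 1;            // S_FALSE
constexpr HResult kEFail  = -2147467259;  // E_FAIL, i.e. 0x80004005

// Same test the FAILED() macro performs: only negative values are failures.
constexpr bool Failed(HResult hr) { return hr < 0; }

// Hypothetical helper: a load counts as successful only when the call did
// not fail AND the VARIANT_BOOL out-parameter came back true.
constexpr bool LoadSucceeded(HResult hr, bool isSuccessful)
{
    return !Failed(hr) && isSuccessful;
}
```

In the real code this amounts to checking varOut == VARIANT_TRUE after the FAILED() test on load.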
Get the root element:
xmlFile->get_documentElement(&xmlRoot);
Get the first-level elements:
IXMLDOMNodeList* xmlChildNodes = NULL;
xmlRoot->get_childNodes(&xmlChildNodes);
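Iterate over all first-level elements:
IXMLDOMNode* currentNode = NULL;
while(!FAILED(xmlChildNodes->nextNode(&currentNode)) && currentNode != NULL)
{
    //do something
    currentNode->Release(); //release each node before fetching the next one
    currentNode = NULL;
}
Get the name of the current element (inside the loop, while currentNode is still valid):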
BSTR nodeName;
currentNode->get_nodeName(&nodeName);
Get the value of an attribute of the current element (assume the attribute is called type):
IXMLDOMNamedNodeMap* attributes = NULL;
IXMLDOMNode* attributeName = NULL;
_bstr_t bstrAttributeName = "type";
BSTR nameVal;
currentNode->get_attributes(&attributes);
attributes->getNamedItem(bstrAttributeName, &attributeName);
attributeName->get_text(&nameVal);
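Note that you must remember to release every interface with IXMLDOM***->Release(). This is not .NET, where someone does the GC for you; you have to call Release() yourself to decrement the reference count. It's COM, remember? The same goes for the BSTR values returned by get_nodeName and get_text: free them with SysFreeString().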
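If the manual Release() calls get tedious, a small scope guard can take over the bookkeeping. What follows is only a sketch of the idea, not anything MSXML provides: ScopedRelease, Receive, FakeNode, GetFakeNode and DemoScopedRelease are all hypothetical names, and FakeNode is a mock standing in for an interface such as IXMLDOMNode so that the code builds without the Windows SDK.

```cpp
// Hypothetical guard: holds one reference and calls Release() when it goes
// out of scope. T can be any type with a Release() member.
template <typename T>
class ScopedRelease
{
public:
    ScopedRelease() : ptr_(nullptr) {}
    ~ScopedRelease() { Reset(); }

    ScopedRelease(const ScopedRelease&) = delete;
    ScopedRelease& operator=(const ScopedRelease&) = delete;

    // Drops any held pointer, then exposes the slot for a get_xxx(&out) call.
    T** Receive() { Reset(); return &ptr_; }

    T* Get() const { return ptr_; }
    T* operator->() const { return ptr_; }

    void Reset()
    {
        if (ptr_ != nullptr) { ptr_->Release(); ptr_ = nullptr; }
    }

private:
    T* ptr_;
};

// Mock interface used only to show the guard working.
struct FakeNode
{
    int refs = 1;
    unsigned long Release() { return static_cast<unsigned long>(--refs); }
};

// Mimics a COM getter that hands out a reference through an out-parameter.
inline void GetFakeNode(FakeNode* source, FakeNode** out) { *out = source; }

// Returns the mock's reference count after the guard has left scope.
inline int DemoScopedRelease()
{
    FakeNode node;  // starts with one outstanding reference
    {
        ScopedRelease<FakeNode> guard;
        GetFakeNode(&node, guard.Receive());
    }               // the guard's destructor calls Release() here
    return node.refs;
}
```

With real interfaces the call would look like currentNode->get_attributes(guard.Receive()), and the early-exit paths no longer need their own cleanup.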
OK, that is roughly it. While I am at it, a word about XPath:
_bstr_t bstrXmlQuery = L"/books/book[@type='scifi' and @author='fox']"; //string values must be quoted
IXMLDOMNodeList* nodes = NULL;
long length = 0;
if(FAILED(xmlRoot->selectNodes(bstrXmlQuery, &nodes)) || FAILED(nodes->get_length(&length)) || length == 0)
{
    //no match found or something went wrong
}
else
{
    //match found
}
The query above finds nodes like this:
<books>
<book type="scifi" author="fox">....
</book>
....
</books>
For the details of XPath syntax, check a reference; they are everywhere.
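One syntax point that is easy to trip over: a string value inside a predicate has to be quoted, because a bare word such as scifi is read as the name of a child element rather than as text. Here is a minimal sketch of building such a query from plain strings. The helper names XPathLiteral, AttrEquals and BookQuery are my own inventions, not part of MSXML, and values that contain both quote characters are deliberately not handled.

```cpp
#include <string>

// Hypothetical helper: wraps a value in quotes so XPath treats it as a
// string literal. XPath 1.0 has no escape character, so use whichever quote
// the value does not contain. A value with both kinds would need concat(),
// which this sketch does not cover.
inline std::wstring XPathLiteral(const std::wstring& value)
{
    if (value.find(L'\'') == std::wstring::npos)
        return L"'" + value + L"'";
    return L"\"" + value + L"\"";
}

// Builds an attribute test such as @type='scifi'.
inline std::wstring AttrEquals(const std::wstring& name, const std::wstring& value)
{
    return L"@" + name + L"=" + XPathLiteral(value);
}

// Builds the book query used in this post from two attribute values.
inline std::wstring BookQuery(const std::wstring& type, const std::wstring& author)
{
    return L"/books/book[" + AttrEquals(L"type", type) +
           L" and " + AttrEquals(L"author", author) + L"]";
}
```

The result of BookQuery(L"scifi", L"fox").c_str() can be used to construct the _bstr_t that is passed to selectNodes.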
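Oh, right, one thing I forgot to mention: if you use the COM smart pointer classes throughout (the _com_ptr_t typedefs; ATL's CComPtr works along the same lines), calling the interfaces gets a little simpler, and converting is easy. For example:
IXMLDOMDocument* corresponds to IXMLDOMDocumentPtr (which I used here); the others mostly just add Ptr as well, so I will not ramble on.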
Finally, here is a sample I threw together. Code I wrote at work obviously cannot be posted, heh. The sample walks the whole XML file and reports its structure; for every node that has an attribute called id, it also prints the value of id. If you want the complete VS project, shoot me an email. But I guess no one really needs it anyway, right? : )
#include "stdafx.h"
#include <windows.h>
#include <msxml2.h>
#import <msxml4.dll>
HANDLE logFile = NULL;
#define INDENT 4
#define TESTHR(hr) \
{ \
if(FAILED(hr)) goto fail; \
}
void PrintChild(IXMLDOMNodeList* nodeList, int level)
{
    if(nodeList == NULL)
        return;
    IXMLDOMNode* currentNode = NULL;
    IXMLDOMNodeList* childNodes = NULL;
    IXMLDOMNamedNodeMap* attributes = NULL;
    IXMLDOMNode* attributeID = NULL;
    BSTR nodeName = NULL;
    BSTR idVal = NULL;
    DWORD dwBytesWritten;
    //nextNode returns S_FALSE and leaves currentNode NULL at the end of the list
    while(!FAILED(nodeList->nextNode(&currentNode)) && currentNode != NULL)
    {
        TESTHR(currentNode->get_nodeName(&nodeName));
        //indent according to the depth of the node
        for(int i=0; i<level*INDENT; i++)
            WriteFile(logFile, L" ", (DWORD)(sizeof(WCHAR)), &dwBytesWritten, NULL);
        WriteFile(logFile, nodeName, (DWORD)(wcslen(nodeName)*sizeof(WCHAR)), &dwBytesWritten, NULL);
        SysFreeString(nodeName); nodeName = NULL;
        //attributes stays NULL for nodes that cannot have any, e.g. text nodes
        TESTHR(currentNode->get_attributes(&attributes));
        if(attributes != NULL)
        {
            _bstr_t bstrAttributeName = "id";
            TESTHR(attributes->getNamedItem(bstrAttributeName, &attributeID));
            if(attributeID != NULL)
            {
                TESTHR(attributeID->get_text(&idVal));
                WriteFile(logFile, L" ", (DWORD)(sizeof(WCHAR)), &dwBytesWritten, NULL);
                WriteFile(logFile, idVal, (DWORD)(wcslen(idVal)*sizeof(WCHAR)), &dwBytesWritten, NULL);
                SysFreeString(idVal); idVal = NULL;
                attributeID->Release(); attributeID = NULL;
            }
            attributes->Release(); attributes = NULL;
        }
        WriteFile(logFile, L"\r\n", (DWORD)(2*sizeof(WCHAR)), &dwBytesWritten, NULL);
        //recurse into the children, then release what this iteration acquired
        TESTHR(currentNode->get_childNodes(&childNodes));
        PrintChild(childNodes, level+1);
        if(childNodes != NULL)
        {
            childNodes->Release(); childNodes = NULL;
        }
        currentNode->Release(); currentNode = NULL;
    }
fail:
    //reached on normal exit too; everything is NULL by then unless a call failed
    SysFreeString(nodeName); //SysFreeString accepts NULL
    SysFreeString(idVal);
    if(childNodes != NULL)
        childNodes->Release();
    if(attributeID != NULL)
        attributeID->Release();
    if(attributes != NULL)
        attributes->Release();
    if(currentNode != NULL)
        currentNode->Release();
}
int _tmain(int argc, _TCHAR* argv[])
{
    //declare everything before the first goto: C++ does not allow a jump past an initialization
    IXMLDOMDocumentPtr xmlFile = NULL;
    IXMLDOMElement* xmlRoot = NULL;
    IXMLDOMNodeList* xmlChildNodes = NULL;
    _variant_t varXml(L"C:\\demo1.xml");
    VARIANT_BOOL varOut = VARIANT_FALSE;
    BSTR rootName = NULL;
    DWORD dwBytesWritten;
    logFile = CreateFile(L"log.txt", GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if(logFile == INVALID_HANDLE_VALUE)
        goto fail;
    TESTHR(CoInitialize(NULL));
    TESTHR(xmlFile.CreateInstance("Msxml2.DOMDocument.4.0"));
    TESTHR(xmlFile->load(varXml, &varOut));
    //load returns S_FALSE (not a failure code) when the file cannot be loaded or parsed
    if(varOut != VARIANT_TRUE)
        goto fail;
    TESTHR(xmlFile->get_documentElement(&xmlRoot));
    TESTHR(xmlRoot->get_nodeName(&rootName));
    WriteFile(logFile, rootName, (DWORD)(wcslen(rootName)*sizeof(WCHAR)), &dwBytesWritten, NULL);
    WriteFile(logFile, L"\r\n", (DWORD)(2*sizeof(WCHAR)), &dwBytesWritten, NULL);
    TESTHR(xmlRoot->get_childNodes(&xmlChildNodes));
    PrintChild(xmlChildNodes, 2);
fail:
    SysFreeString(rootName); //SysFreeString accepts NULL
    if(logFile != INVALID_HANDLE_VALUE)
        CloseHandle(logFile);
    if(xmlChildNodes != NULL)
        xmlChildNodes->Release();
    if(xmlRoot != NULL)
        xmlRoot->Release();
    return 0;
}