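Introduction
This article supplements earlier notes on garbled text in QString. It covers two main points: a QChar holds a single 16-bit Unicode code unit, and a QString stores its text as a sequence of QChars. The Qt source code is used to trace both claims back to their roots.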
QChar
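A QChar represents one 16-bit Unicode character, that is, one UTF-16 code unit. The Qt documentation describes it as follows: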
The QChar class provides a 16-bit Unicode character.
In Qt, Unicode characters are 16-bit entities without any markup or structure. This class represents such an entity. It is lightweight, so it can be used everywhere. Most compilers treat it like an unsigned short.
// Source: QChar.h
class Q_CORE_EXPORT QChar
{
public:
    friend Q_DECL_CONSTEXPR bool operator==(QChar, QChar) Q_DECL_NOTHROW;
    friend Q_DECL_CONSTEXPR bool operator< (QChar, QChar) Q_DECL_NOTHROW;
    ushort ucs; // a QChar is just a single ushort value
};
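To make the "a QChar is just a ushort" point concrete, here is a minimal sketch in standard C++ that does not use Qt. `MiniChar` is a hypothetical stand-in invented for illustration, not Qt's actual class; it only mirrors the single 16-bit member and the two comparison operators shown in the excerpt.

```cpp
#include <cstdint>

// Hypothetical stand-in for QChar: one 16-bit code unit and nothing else.
struct MiniChar {
    std::uint16_t ucs;  // plays the role of the `ushort ucs` member
};

// Comparisons simply compare the underlying 16-bit values.
constexpr bool operator==(MiniChar a, MiniChar b) noexcept { return a.ucs == b.ucs; }
constexpr bool operator<(MiniChar a, MiniChar b) noexcept { return a.ucs < b.ucs; }

// The wrapper adds no storage beyond the 16-bit value itself.
static_assert(sizeof(MiniChar) == sizeof(std::uint16_t), "exactly one code unit");
static_assert(MiniChar{0x6768} == MiniChar{0x6768}, "same code unit compares equal");
static_assert(MiniChar{0x5976} < MiniChar{0x6768}, "ordering follows the numeric value");
```

Because the checks are `static_assert`s, compiling the snippet is enough to confirm them.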
QString
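A QString stores QChars, and each QChar is a 16-bit (2-byte) Unicode code unit. A Unicode character whose code point is greater than 65535 therefore does not fit in one QChar and is stored in two consecutive QChars (a surrogate pair). The Qt documentation says: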
The QString class provides a Unicode character string.
QString stores a string of 16-bit QChars, where each QChar corresponds to one Unicode 4.0 character. (Unicode characters with code values above 65535 are stored using surrogate pairs, i.e., two consecutive QChars.)
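The surrogate-pair rule can be sketched in standard C++ without Qt. `encodeUtf16` is a hypothetical helper written for this illustration; it follows the standard UTF-16 encoding and assumes its input is a valid code point (at most 0x10FFFF and not itself in the surrogate range).

```cpp
#include <string>

// Encode one Unicode code point as UTF-16 code units.
// Assumption: cp is a valid scalar value (<= 0x10FFFF, not a surrogate).
inline std::u16string encodeUtf16(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        // Fits in one 16-bit unit, like a single QChar.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Above 65535: split the remaining 20 bits across two units.
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
    }
    return out;
}
```

For example, `encodeUtf16(0x6768)` yields the single unit 0x6768, while `encodeUtf16(0x1F600)` yields the two units 0xD83D and 0xDE00.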
// Source: QString.h
typedef QTypedArrayData<ushort> QStringData;
class Q_CORE_EXPORT QString
{
public:
    typedef QStringData Data;
    // ...
public:
    typedef Data * DataPtr;
    inline DataPtr &data_ptr() { return d; }
};
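Since the underlying data is an array of 16-bit values, the number of stored units can be larger than the number of Unicode characters. The sketch below uses `std::u16string` as a stand-in for that array so it runs without Qt; the helper names are hypothetical and chosen only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Range checks for the two halves of a surrogate pair.
constexpr bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
constexpr bool isLowSurrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Count Unicode code points in a sequence of UTF-16 code units.
// A well-formed surrogate pair counts once; any other unit counts once by itself.
inline std::size_t countCodePoints(const std::u16string &s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (isHighSurrogate(s[i]) && i + 1 < s.size() && isLowSurrogate(s[i + 1])) {
            ++i;  // skip the low half of the pair
        }
        ++count;
    }
    return count;
}
```

A string holding U+6768 followed by U+1F600 occupies three 16-bit units but contains two code points.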
Source path
The source files are in the Qt installation directory. On my machine the path is: C:\Qt\Qt5.6.3.32\5.6.3\Src\qtbase\src\corelib\tools
Analysis by debugging
The following program verifies that a Chinese string is stored in a QString as Unicode code units.
#include <QtCore/QCoreApplication>
#include <QString>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    QString str;
    str = QString::fromStdWString(L"杨奶粉");
    auto data = str.data(); // QChar* to the stored 16-bit units; inspect it in the debugger
    return a.exec();
}
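1. Store the Chinese string "杨奶粉" in a QString.
2. Get the pointer to the QChar array from str.
3. Compare the QChars with the Unicode code points. Note: in the debugger's memory view, the high byte of each 16-bit value sits at the higher address and the low byte at the lower address (little-endian).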
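The byte order described in step 3 can be reproduced without Qt or a debugger. In the sketch below, `toLittleEndianBytes` is a hypothetical helper, and the code points U+6768, U+5976 and U+7C89 for 杨, 奶 and 粉 are supplied here rather than taken from the original screenshots. It lays the bytes out explicitly in little-endian order, which is what a memory view shows on a little-endian machine such as x86.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Lay out UTF-16 code units as bytes in little-endian order:
// low byte first (lower address), high byte second (higher address).
inline std::vector<std::uint8_t> toLittleEndianBytes(const std::u16string &s) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(s.size() * 2);
    for (char16_t u : s) {
        bytes.push_back(static_cast<std::uint8_t>(u & 0xFF));  // low byte
        bytes.push_back(static_cast<std::uint8_t>(u >> 8));    // high byte
    }
    return bytes;
}
```

For `u"\u6768\u5976\u7C89"` the result is the byte sequence 68 67 76 59 89 7C, so each pair of bytes appears swapped relative to the code point as it is written.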
References:
1. "About the Unicode Character Database"
2. QString Class | Qt Core 6.5.0