Exetools

maybe a bug of Windows API PathMatchSpecW() ? (https://forum.exetools.com/showthread.php?t=21058)

WhoCares 07-23-2024 23:47

Maybe a bug in the Windows API PathMatchSpecW()?
 
Consider the following code. PathMatchSpecW() is expected to match case-insensitively (MS never seems to document this, but it holds in practice), so the variable "matched" should be TRUE.

static const WCHAR pattern[] = L"的f";
const auto matched = PathMatchSpecW(L"的F", pattern);

But it is actually FALSE. Why?

I stepped through its asm code. It calls IsDBCSLeadByte(0x84) on the low byte of the first character, the Chinese UTF-16 WCHAR '的' (0x7684), and IsDBCSLeadByte(0x84) returns TRUE under a Chinese code page. It then compares 'f' and 'F' case-sensitively and finally returns FALSE.
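
To make the byte arithmetic concrete, here is a minimal portable sketch. It does not call the real API: isGbkLeadByte() is a hypothetical stand-in for IsDBCSLeadByte() that assumes code page 936 (GBK), whose lead bytes span 0x81 to 0xFE, and lowByte() is an illustrative helper.

```cpp
#include <cstdint>

// Hypothetical stand-in for IsDBCSLeadByte() under code page 936 (GBK).
// Assumption: lead bytes are 0x81..0xFE. This is NOT the Windows API.
constexpr bool isGbkLeadByte(std::uint8_t b) {
    return b >= 0x81 && b <= 0xFE;
}

// Low byte of one UTF-16 code unit, i.e. the value that ends up
// being handed to the lead-byte check in the behavior described above.
constexpr std::uint8_t lowByte(char16_t unit) {
    return static_cast<std::uint8_t>(unit & 0xFF);
}

// U+7684 is the character used in the test case above.
constexpr char16_t kDe = u'\u7684';

// Compile-time checks: the low byte is 0x84, which falls in the GBK
// lead-byte range even though it is only half of a UTF-16 code unit.
static_assert(lowByte(kDe) == 0x84, "low byte of U+7684");
static_assert(isGbkLeadByte(lowByte(kDe)), "0x84 looks like a lead byte");
static_assert(!isGbkLeadByte(lowByte(u'f')), "ASCII is never a lead byte");
```

The point is that 0x84 is not a byte of any DBCS string here, so a lead-byte test on it says nothing about the text.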

I think it's WRONG to call IsDBCSLeadByte() on any byte of a UTF-16 string.
Perhaps the MS guys added this logic for compatibility with old, oddly encoded strings?

sendersu 07-23-2024 23:53

How about digging under the WinAPI hood?
Using your lovely debugger!

WhoCares 07-24-2024 00:00

I already debugged it :D

It's unexpected to use IsDBCSLeadByte() on a UTF-16 string.


Quote:

Originally Posted by sendersu (Post 131367)
How about digging under the WinAPI hood?
Using your lovely debugger!


WhoCares 07-25-2024 12:33

IsDBCSLeadByte() should only be used with multi-byte CJK encodings such as GBK, GB2312, or GB18030, not with UTF-16.

Here is a good replacement for PathMatchSpecW():
https://github.com/kirkjkrauss/MatchingWildcards

This article is also interesting:
https://www.codeproject.com/Articles/5163931/Fast-String-Matching-with-Wildcards-Globs-and-Giti
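
For the simple case in the first post, a matcher that compares whole wide characters avoids the problem. The following is a minimal sketch, not the code from the linked repository: wildMatchNoCase() is a hypothetical name, it handles only '*' and '?', and it relies on std::towlower(), whose case folding depends on the current C locale (ASCII letters fold in the default locale).

```cpp
#include <cwctype>

// Case-insensitive wildcard match supporting '*' (any run of characters)
// and '?' (exactly one wchar_t). Hypothetical helper for illustration.
// It always compares whole wchar_t units and never inspects single bytes.
bool wildMatchNoCase(const wchar_t* text, const wchar_t* pattern) {
    const wchar_t* starPattern = nullptr;  // pattern position after last '*'
    const wchar_t* starText = nullptr;     // text position to retry from

    while (*text) {
        if (*pattern == L'*') {
            // Remember where to resume if a later comparison fails.
            starPattern = ++pattern;
            starText = text;
        } else if (*pattern == L'?' ||
                   std::towlower(*pattern) == std::towlower(*text)) {
            ++pattern;
            ++text;
        } else if (starPattern) {
            // Backtrack: let the last '*' absorb one more character.
            pattern = starPattern;
            text = ++starText;
        } else {
            return false;
        }
    }
    // Text is exhausted; only trailing '*' may remain in the pattern.
    while (*pattern == L'*') ++pattern;
    return *pattern == L'\0';
}
```

With this sketch, wildMatchNoCase(L"\u7684F", L"\u7684f") returns true, which is the result expected from the original test case.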

