It's not so much non-'English' characters but rather non-'Latin/Roman' characters (the script in which English is written) which are the issue. There are plenty of languages other than English that are written in Latin/Roman characters, and Lua-based pattern-matching will work fine with these. The particular issue with Cyrillic is that its alphabet derives from Greek rather than Latin letters.
However, a wider issue exists, in that you're going to run into problems with any language that uses characters whose value can't be stored in UTF-8 format by AMS. Arabic is a prime example: although AMS can handle Arabic input, it can't store the value of those characters in UTF-8 (as far as I know). Which is why I directed you to that thread at: https://forums.indigorose.com/forum/...rabic-language I thought it might be possible, with a bit of 'thinking & tinkering', to find a workaround using those plugins of Ulrich's which are mentioned there.
Interestingly enough, I believe it is actually possible to use POSIX regular expressions with Cyrillic but unfortunately (as previously mentioned), Lua does not use regex for pattern-matching. The reason (I think) has something to do with sheer size: a typical implementation of POSIX regexp takes 'thousands' of lines of code, compared to the mere 'hundreds' required for pattern-matching in Lua.
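For a feel of how Lua patterns differ from regex in day-to-day use, here is a tiny sketch in plain Lua (not AMS-specific; the sample strings are just borrowed from this thread). Lua uses % rather than a backslash for classes and escapes, and it has no alternation operator, so a | in a pattern is simply a literal pipe.

```lua
-- %d is Lua's digit class; a POSIX regex would typically use [0-9] or [[:digit:]].
local year = ("CAR-2015"):match("%-(%d+)$")
print(year)

-- '|' has no special meaning in a Lua pattern, so it matches a literal pipe.
local left, right = ("TOYOTA|VIOS"):match("^(%a+)|(%a+)$")
print(left, right)
```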
That's the basic gist of the problem here, at least to my understanding anyway. There may actually be some kind of simple workaround that I'm just overlooking (my grasp of working with non-Latin/Roman based characters in AMS is tenuous at best) but someone like Ulrich or IP would probably have a much better handle on the whole issue. My suggestion would be to have a poke around with those aforementioned plugins of Ulrich's first though - to see what might be feasible/achievable.
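One possible workaround, offered only as a sketch: because the delimiters (-, * and |) are plain ASCII, the pattern can capture "everything that is not the delimiter" instead of relying on %w. This assumes the text is UTF-8 (or another encoding where the delimiter bytes never occur inside a multi-byte character), and it has only been reasoned through in plain Lua, not confirmed inside AMS. The name split_fields is made up for illustration.

```lua
-- Hypothetical helper (not part of AMS): split "TYPE-YEAR*BRAND|NAME"
-- by capturing everything up to each delimiter rather than using %w.
local function split_fields(input)
    -- ([^%-]+) : one or more bytes that are not '-'
    -- (%d+)    : the year
    -- ([^|]+)  : one or more bytes that are not '|'
    -- (.+)     : whatever remains
    return input:match("^([^%-]+)%-(%d+)%*([^|]+)|(.+)$")
end

local a, b, c, d = split_fields("сфк-2015*ещнщкф|VIOS")
print(a, b, c, d)
```

The trade-off is that this version no longer checks what the words consist of; it accepts any characters between the delimiters.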
Divide 2 words and get First letter of each word HELP
-
Originally posted by charliechaps:
Do some research or google - it is simple to see that you are trying to match %w

Hi charliechaps, BioHazard is correct: using %w on non-English characters will result in a nil value.
-
Just an afterthought ...
You might actually find Ulrich's Encoding & Language Actions plugins useful if working with Cyrillic. Have a look thru this thread:
https://forums.indigorose.com/forum/...rabic-language
Forgot all about those plugins.
-
@telco,
IP provided a solution based on standard "pattern-matching" (which Lua uses in place of POSIX regular expressions, i.e. regex). It makes use of character classes to identify matches to characters typically found on a standard Western keyboard (i.e. a-z, A-Z, 0-9, control characters, etc.) and so cannot match characters from Cyrillic script.
You've already acknowledged that the pattern-matching function which IP provided is 'very helpful' (and yes, it is) but if you first take some time to read up on the basics of pattern-matching (https://www.lua.org/pil/20.2.html) you'll understand WHY his example is helpful. And at the same time come to understand WHY this will not work for languages that do not use the Latin/Roman alphabet as the basis of their scripting.
The bottom line is that in order to be able to use this kind of 'matching' for non-Latin/Roman based languages (such as the logographic script used for Cantonese or the Cyrillic script used for Russian), it would first require an intermediary library of functions to draw upon which could act as an 'interpreter' for these kinds of foreign characters. Not impossible, but most likely way beyond the scope of this forum.
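To make the reason a bit more concrete, here is a small sketch in plain Lua. It assumes the strings are stored as UTF-8 and that the default "C" locale is active; both are assumptions about the environment, not something confirmed for AMS. Under those assumptions each Cyrillic letter occupies two bytes, and neither byte counts as alphanumeric for %w.

```lua
-- Assumption: UTF-8 text and the default "C" locale.
local latin    = "CAR"
local cyrillic = "сфк"

-- Lua's # operator counts bytes, not letters.
print(#latin)     -- 3 bytes for 3 letters
print(#cyrillic)  -- 6 bytes for 3 letters when encoded as UTF-8

-- %w is tested byte by byte, so the Cyrillic bytes are not matched.
local latin_hit    = latin:match("^[%w%s]+")
local cyrillic_hit = cyrillic:match("^[%w%s]+")
print(latin_hit)
print(cyrillic_hit)
```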
-
Do some research or google - it is simple to see that you are trying to match %w
-
local input = "сфк-2015*ещнщкф|VIOS"
local first, second, third, fourth = input:match("([%w%s]+)%-([0-9]+)*([%w%s]+)%|([%w%s]+)");
result = Dialog.Message("Notice", tostring(first), MB_OK, MB_ICONINFORMATION, MB_DEFBUTTON1);
In that example the match results in nil, but with English characters it works fine.
Thanks
-
Originally posted by Imagine Programming:
If your input is always in the "AAAA-YEAR*AAAA|AAAA" format, allowing for spaces in the words, you could try:
Code:
local input = "CAR-2015*TOYOTA|VIOS"
local first, second, third, fourth = input:match("([%w%s]+)%-([0-9]+)*([%w%s]+)%|([%w%s]+)");
Where ([%w%s]+) matches a word with optional spaces, %- matches a '-' (dash) character, ([0-9]+) matches a number (the year), * the '*' character, ([%w%s]+) matches another word and you catch the drift...

Hello Imagine Programming, I am here again because I have encountered an issue with this script. It works really well with English characters, but if I use Chinese, Russian or Ukrainian characters the match returns a nil value.
Can you help me? Thank you so much and happy Valentine's Day.
-
Originally posted by telco:
Hello again, I have this code to separate the string. Is there any other way to make it shorter or simpler?
example string is: x = "CAR-2015*TOYOTA|VIOS"
....
I am very limited with the code above; if you guys have a simpler way, I'd very much appreciate it if you could share it with me. Thank you.

If your input is always in the "AAAA-YEAR*AAAA|AAAA" format, allowing for spaces in the words, you could try:
Code:
local input = "CAR-2015*TOYOTA|VIOS"
local first, second, third, fourth = input:match("([%w%s]+)%-([0-9]+)*([%w%s]+)%|([%w%s]+)");
Where ([%w%s]+) matches a word with optional spaces, %- matches a '-' (dash) character, ([0-9]+) matches a number (the year), * the '*' character, ([%w%s]+) matches another word and you catch the drift...
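As a rough sketch of how the captures might be used safely: string.match returns nil when the input does not fit the pattern, so it is worth checking the result before using it. The function name parse_record and the layout of the returned table are made up for this example. The pattern is the same one with anchors added and the * written as %* for readability (an assumption about style; the unescaped form matches a literal '*' in that position too).

```lua
-- Hypothetical helper: returns a table of fields, or nil plus a message.
local function parse_record(input)
    local kind, year, brand, name =
        input:match("^([%w%s]+)%-([0-9]+)%*([%w%s]+)%|([%w%s]+)$")
    if not kind then
        return nil, "input does not match AAAA-YEAR*AAAA|AAAA"
    end
    return { kind = kind, year = tonumber(year), brand = brand, name = name }
end

local rec = parse_record("CAR-2015*TOYOTA|VIOS")
print(rec.kind, rec.year, rec.brand, rec.name)

-- Input without the expected delimiters yields nil and a message.
local bad, err = parse_record("CAR 2015 TOYOTA VIOS")
print(bad, err)
```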
-
Hello again, I have this code to separate the string. Is there any other way to make it shorter or simpler?
example string is: x = "CAR-2015*TOYOTA|VIOS"
nPipe = String.Find(x, "-", 1, false);
local sTemp = String.Mid(x, nPipe+1, -1);
local Type = String.Left(x, nPipe-1);
nPipe1 = String.Find(sTemp, "*", 1, false);
local sTemp1 = String.Mid(sTemp, nPipe1+1, -1);
local yModel = String.Left(sTemp, nPipe1-1);
local nTemp = String.Find(sTemp1, "|", 1, false);
local carBrand = String.Mid(sTemp1, nTemp+1, -1);
local carName = String.Left(sTemp1, nTemp-1);
Input.SetText("Input1", Type)
Input.SetText("Input2", yModel)
Input.SetText("Input3", carName)
Input.SetText("Input4", carBrand)
Output:
Input1 = CAR
Input2 = 2015
Input3 = TOYOTA
Input4 = VIOS
I am very limited with the code above; if you guys have a simpler way, I'd very much appreciate it if you could share it with me. Thank you.
-
Originally posted by telco:
... yeah, I didn't try what IP shared because I don't know where to put it...

Trying is the most important part of software development; you could've put my first example pretty much anywhere.
-
Hello, I just tried it and it works for my needs. Thank you so much, everyone, even though I don't understand the code in global.
telco
-
Hello everyone, sorry for the late response, I just got home for a vacation. Yeah, I didn't try what IP shared because I don't know where to put it.
charliechaps, herrin and Imagine Programming, thank you for your contributions.