String comparison
Lexicographical comparison
- JB compares strings lexicographically, like most other languages. Lexicographical comparison is one possible way to order strings, including strings of different lengths, in a meaningful manner.
- Lexicographical comparison goes like this:
- The two strings are compared from the first character, letter by letter.
- If the characters are equal, we move on to the next pair.
- If one of the strings ends during this comparison, that string is considered "less than" the longer one.
- If both strings end at the same time, every character of one string is the same as the corresponding character of the other. In that case the strings are considered equal.
- If we encounter different characters, the order of the strings ("<" or ">") is determined by the order of those two characters.
- As you can see, the result of comparing strings is determined by the order of characters.
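- The steps above can be sketched as a small function. This is only an illustration of the procedure: LexCompare is a hypothetical name, not a built-in, and it compares single characters with the native operators. It is not guaranteed to match the built-in string comparison for every input.

```basic
' Sketch: lexicographical comparison written out step by step.
' LexCompare is a hypothetical helper, not part of JB/LB.
' Returns -1 if s1$ < s2$, 0 if s1$ = s2$, 1 if s1$ > s2$.
print LexCompare("abc", "abd")   ' -1: "c" comes before "d"
print LexCompare("abc", "ab")    ' 1: the second string ended first
print LexCompare("abc", "abc")   ' 0: both ended at the same time
end

function LexCompare(s1$, s2$)
    i = 1
    while i <= len(s1$) and i <= len(s2$)
        c1$ = mid$(s1$, i, 1)
        c2$ = mid$(s2$, i, 1)
        ' different characters decide the result
        if c1$ < c2$ then
            LexCompare = -1
            exit function
        end if
        if c1$ > c2$ then
            LexCompare = 1
            exit function
        end if
        ' equal characters: take the next pair
        i = i + 1
    wend
    ' at least one string ended: the shorter one is "less"
    ' (the function value stays 0 when the lengths are equal)
    if len(s1$) < len(s2$) then LexCompare = -1
    if len(s1$) > len(s2$) then LexCompare = 1
end function
```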
Character order
- From the explanation above, it should be clear that any character order could be chosen. You could pick one that puts digits after letters, or consonants after vowels, though that would have little practical value.
- One character order is already available: each character has a unique ASCII code, and many languages use this ordering. The ASCII character order looks like this:
! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z [ \ ] ^ _
` a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~
- JB/LB use another character order:
' - ! " # $ % & ( ) * , . /
: ; ? @ [ \ ] ^ _ ` { | } ~ + <
= > 0 1 2 3 4 5 6 7 8 9 A a B b
C c D d E e F f G g H h I i J j
K k L l M m N n O o P p Q q R r
S s T t U u V v W w X x Y y Z z
- Let's call this order Alphabetic.
- You should be aware of the difference.
- Note also that the order used is locale specific (it could differ on a computer configured for a different language). You can get your own order table by running the code in the Example section (if your language uses symbols with codes above 127, uncomment the "n = 255" line).
- Once you are aware of the difference, you will understand why, for example, the condition
If a$>"a" and a$<"z" then
- in JB also accepts capital letters such as "B" (in languages using ASCII order, only strings that begin with a lowercase letter pass).
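- A minimal sketch of the difference, assuming a single-character string in a$: the first test uses the native comparison, the second compares ASCII codes obtained with asc(), so only a..z can pass it. The result of the first test depends on the order table on your machine, so no output is claimed here.

```basic
' Sketch: native range test versus a test on character codes.
a$ = "B"

' Native comparison: uses the JB/LB character order.
if a$ > "a" and a$ < "z" then
    print a$; " passes the native range test"
else
    print a$; " fails the native range test"
end if

' Comparison by ASCII code: only the codes of a..z pass.
code = asc(a$)
if code >= asc("a") and code <= asc("z") then
    print a$; " is a lowercase letter by ASCII code"
else
    print a$; " is not a lowercase letter by ASCII code"
end if
end
```

- Note that the second test includes "a" and "z" themselves, while the original condition with ">" and "<" excludes them.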
Example
n = 127
'n = 255
DIM a$(n)
' fill the table with characters in code order
FOR i = 1 TO n
    a$(i) = CHR$(i)
NEXT
gosub [printTable]
PRINT
INPUT "press Enter"; aa
nSpace = asc(" ") '32
' bubble sort the characters from space upwards using native string comparison
FOR i = n TO nSpace+1 STEP -1
    FOR j = nSpace TO i - 1
        IF a$(j) > a$(j + 1) THEN
            tmp$ = a$(j)
            a$(j) = a$(j + 1)
            a$(j + 1) = tmp$
        END IF
    NEXT
NEXT
gosub [printTable]
END

' print the table in rows of 16, starting at code 32
[printTable]
FOR i = 2 TO int(n/16)
    FOR j = 0 TO 15
        c = j + i * 16
        PRINT a$(c); " ";
    NEXT
    PRINT
NEXT
PRINT
RETURN
Useful Procedures
- In the (unlikely?) case that you need an exact ASCII comparison, you can use this function. Of course, it will be slower than the native string comparison.
'Ascii string compare
'returns -1 if s1$<s2$, 0 if s1$=s2$, 1 if s1$>s2$
function AsciiCompare(s1$,s2$)
    if s1$=s2$ then AsciiCompare=0: exit function
    minLen=len(s1$)
    if minLen>len(s2$) then minLen=len(s2$)
    for i=1 to minLen
        c1$=mid$(s1$,i,1)
        c2$=mid$(s2$,i,1)
        if c1$<>c2$ then 'sign of asc(c1$)-asc(c2$)
            AsciiCompare = (asc(c1$)>asc(c2$))*2-1
            exit function
        end if
    next
    'if we are here then one string ended, and we already checked for equal - so
    AsciiCompare=(len(s1$)>len(s2$))*2-1
end function